Module data: functions for working with datasets
Package teneva, module data: functions for working with datasets.
This module contains functions for working with datasets, including the “accuracy_on_data” function.
- teneva.data.accuracy_on_data(Y, I_data, y_data, e_trunc=None)
Compute the relative error of the TT-tensor on the dataset.
- Parameters:
Y (list) – TT-tensor.
I_data (np.ndarray) – multi-indices of the dataset items, as an array of shape [samples, d].
y_data (np.ndarray) – values of the dataset items corresponding to I_data, as an array of shape [samples].
e_trunc (float) – optional truncation accuracy. If this parameter is set, the values are sampled from the rounded TT-tensor.
- Returns:
the relative error.
- Return type:
float
Note
If I_data or y_data is not provided, the function will return -1.
Examples:
    import numpy as np
    import teneva

    m = 100       # Size of the dataset
    n = [5] * 10  # Shape of the tensor

    # Random TT-tensor with TT-rank 2:
    Y = teneva.rand(n, 2)

    # Let us build a toy dataset:
    I_data = teneva.sample_lhs(n, m)
    y_data = [teneva.get(Y, i) for i in I_data]
    y_data = np.array(y_data)

    # Add some noise:
    y_data = y_data + 1.E-3*np.random.randn(m)

    # Compute the accuracy:
    eps = teneva.accuracy_on_data(Y, I_data, y_data)

    print(f'Accuracy : {eps:-8.2e}')

    # >>> ----------------------------------------
    # >>> Output:

    # Accuracy : 3.09e-03
    #
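The quantity reported above can be illustrated with a small standalone sketch. It assumes that the relative error is the ratio of Euclidean norms, ||y_pred - y_data|| / ||y_data||, which this page does not state explicitly. The helper name relative_error_sketch is hypothetical and is not part of teneva. Plain Python lists are used so that the sketch runs without any dependencies.

```python
import math


def relative_error_sketch(y_pred, y_data):
    """Hypothetical helper (not part of teneva).

    Assumption: the relative error is ||y_pred - y_data||_2 / ||y_data||_2.
    Returns -1 if either input is missing, mirroring the note above.
    """
    if y_pred is None or y_data is None:
        return -1.0
    # Euclidean norm of the difference between predictions and data:
    diff = math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_data)))
    # Euclidean norm of the reference data:
    norm = math.sqrt(sum(t ** 2 for t in y_data))
    return diff / norm


y_data = [3.0, 4.0]  # Reference values, with norm 5
y_pred = [3.0, 4.5]  # Approximate values, off by 0.5 in one entry
eps = relative_error_sketch(y_pred, y_data)
print(f'Accuracy : {eps:-8.2e}')
```

In this toy case the difference has norm 0.5 and the data has norm 5, so the sketch yields a relative error of 0.1.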
- teneva.data.cache_to_data(cache={})
Transform the cache of the TT-cross into the data arrays I_data and y_data.
- Parameters:
cache (dict) – cache of the TT-cross (see the “cross” function), which contains the requested function values and the related tensor multi-indices.
- Returns:
tensor multi-indices (I_data; an array of shape [samples, dimension]) and the related function values (y_data; an array of shape [samples]).
- Return type:
(np.ndarray, np.ndarray)
Examples:
Let us apply TT-cross to a benchmark function:
    import teneva
    from scipy.optimize import rosen

    a = [-5., -4., -3., -2., -1.]  # Lower bounds for spatial grid
    b = [+6., +3., +3., +1., +2.]  # Upper bounds for spatial grid
    n = [ 20,  18,  16,  14,  12]  # Shape of the tensor
    m = 8.E+3                      # Number of calls to function
    r = 3                          # TT-rank of the initial tensor

    def func(I):
        # Map the multi-indices to spatial grid points and evaluate:
        X = teneva.ind_to_poi(I, a, b, n)
        return rosen(X.T)

    cache = {}
    Y = teneva.rand(n, r)
    Y = teneva.cross(func, Y, m, cache=cache)
Now cache contains the requested function values and related tensor multi-indices:
    I_trn, y_trn = teneva.cache_to_data(cache)

    print(I_trn.shape)
    print(y_trn.shape)

    i = I_trn[0, :]  # The first multi-index
    y = y_trn[0]     # Saved value in cache

    print(i)
    print(y)
    print(func(i))

    # >>> ----------------------------------------
    # >>> Output:

    # (7956, 5)
    # (7956,)
    # [0 0 0 4 3]
    # 130615.73557017733
    # 130615.73557017733
    #
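The transformation itself can be pictured with a minimal sketch in plain Python. It assumes that the cache is a dict whose keys are multi-index tuples and whose values are the corresponding function values; the actual key format used by teneva is not stated on this page. The name cache_to_lists_sketch is hypothetical, and the sketch returns nested lists rather than np.ndarray objects.

```python
def cache_to_lists_sketch(cache):
    """Hypothetical helper (not part of teneva).

    Assumption: cache maps multi-index tuples to function values.
    Returns the multi-indices and the values as two aligned lists.
    """
    # One row per cached multi-index, in insertion order:
    I_data = [list(i) for i in cache.keys()]
    # Function values in the same order as the rows of I_data:
    y_data = [cache[i] for i in cache.keys()]
    return I_data, y_data


# Toy cache with two entries for a 3-dimensional tensor:
cache = {(0, 0, 1): 2.5, (1, 2, 0): -1.0}
I_data, y_data = cache_to_lists_sketch(cache)

print(I_data)
print(y_data)
```

Each row of I_data lines up with the entry of y_data at the same position, which is the property the example above checks by comparing y with func(i).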