Module data: functions for working with datasets

Package teneva, module data: functions for working with datasets.

This module contains functions for working with datasets, including the “accuracy_on_data” function.




teneva.data.accuracy_on_data(Y, I_data, y_data, e_trunc=None)[source]

Compute the relative error of the TT-tensor on the dataset.

Parameters:
  • Y (list) – TT-tensor.

  • I_data (np.ndarray) – multi-indices for the items of the dataset, given as an array of shape [samples, d], where d is the dimension of the tensor.

  • y_data (np.ndarray) – values for the dataset items corresponding to I_data, given as an array of shape [samples].

  • e_trunc (float) – optional truncation accuracy. If this parameter is set, then sampling will be performed from the rounded TT-tensor.

Returns:

the relative error.

Return type:

float

Note

If I_data or y_data is not provided, the function will return -1.
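
The text above says only that the function returns "the relative error". As a minimal sketch, assuming this means the Euclidean norm of the difference divided by the Euclidean norm of the reference values (an assumption, not confirmed by this page), the quantity can be computed in plain Python as follows. The helper name relative_error is hypothetical and is not part of teneva.

```python
import math

def relative_error(y_pred, y_data):
    """Return ||y_pred - y_data|| / ||y_data|| using Euclidean norms.

    Hypothetical helper: the norm-based formula is an assumption about
    what "relative error" means here.
    """
    if y_data is None:
        # Mirrors the documented convention of returning -1 for missing data
        return -1.0
    diff = math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y_data)))
    norm = math.sqrt(sum(t ** 2 for t in y_data))
    return diff / norm

y_data = [3.0, 4.0]  # Reference values, norm is 5
y_pred = [3.0, 4.5]  # Approximated values, difference norm is 0.5
eps = relative_error(y_pred, y_data)
print(f'{eps:.2e}')  # 1.00e-01
```

Under this assumption, a result of about 1e-3 would mean that the TT-tensor reproduces the dataset values to roughly three significant digits.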

Examples:

import numpy as np
import teneva

m = 100       # Size of the dataset
n = [5] * 10  # Shape of the tensor

# Random TT-tensor with TT-rank 2:
Y = teneva.rand(n, 2)

# Let us build a toy dataset:
I_data = teneva.sample_lhs(n, m)
y_data = [teneva.get(Y, i) for i in I_data]
y_data = np.array(y_data)

# Add some noise:
y_data = y_data + 1.E-3*np.random.randn(m)

# Compute the accuracy:
eps = teneva.accuracy_on_data(Y, I_data, y_data)

print(f'Accuracy     : {eps:-8.2e}')

# >>> ----------------------------------------
# >>> Output:

# Accuracy     : 3.09e-03
#


teneva.data.cache_to_data(cache={})[source]

Transform the cache of the TT-cross into the I_data and y_data arrays.

Parameters:

cache (dict) – cache of the TT-cross (see the “cross” function), which contains the requested function values and the related tensor multi-indices.

Returns:

the tensor multi-indices (I_data, an array of shape [samples, dimension]) and the related function values (y_data, an array of shape [samples]).

Return type:

(np.ndarray, np.ndarray)
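
To illustrate the transformation described above, here is a minimal sketch in plain Python. It assumes the cache is a dict that maps multi-index tuples to function values; this layout is an assumption for illustration, and the helper name cache_to_lists is hypothetical. It returns plain lists rather than np.ndarray objects to stay dependency-free.

```python
def cache_to_lists(cache):
    """Split a {multi-index tuple: value} dict into two aligned lists.

    Hypothetical helper: the cache layout (tuple keys, scalar values)
    is an assumption, not taken from the teneva source.
    """
    keys = list(cache.keys())
    I_data = [list(i) for i in keys]   # One row per sample, one column per dimension
    y_data = [cache[i] for i in keys]  # Values in the same order as the rows
    return I_data, y_data

cache = {(0, 0, 1): 2.5, (3, 1, 0): -1.0}
I_data, y_data = cache_to_lists(cache)
print(I_data)  # [[0, 0, 1], [3, 1, 0]]
print(y_data)  # [2.5, -1.0]
```

The key point is that row k of the index list and entry k of the value list describe the same sample, which is the pairing that a function such as accuracy_on_data expects.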

Examples:

Let us apply TT-cross to a benchmark function:

a = [-5., -4., -3., -2., -1.] # Lower bounds for spatial grid
b = [+6., +3., +3., +1., +2.] # Upper bounds for spatial grid
n = [ 20,  18,  16,  14,  12] # Shape of the tensor
m = 8.E+3                     # Number of calls to function
r = 3                         # TT-rank of the initial tensor

from scipy.optimize import rosen
def func(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return rosen(X.T)

cache = {}
Y = teneva.rand(n, r)
Y = teneva.cross(func, Y, m, cache=cache)

Now the cache contains the requested function values and the related tensor multi-indices:

I_trn, y_trn = teneva.cache_to_data(cache)

print(I_trn.shape)
print(y_trn.shape)

i = I_trn[0, :] # The first multi-index
y = y_trn[0]    # Saved value in cache

print(i)
print(y)
print(func(i))

# >>> ----------------------------------------
# >>> Output:

# (7956, 5)
# (7956,)
# [0 0 0 4 3]
# 130615.73557017733
# 130615.73557017733
#