Module anova: construct TT-tensor by TT-ANOVA

Package teneva, module anova: ANOVA decomposition in the TT-format.

This module contains the function “anova”, which computes a TT-approximation of a tensor from given random samples.




teneva.anova.anova(I_trn, y_trn, r=2, order=1, noise=1e-10, seed=None, fpath=None)[source]

Build TT-tensor by TT-ANOVA from the given random tensor samples.

Parameters:
  • I_trn (np.ndarray) – multi-indices of the tensor in the form of an array of shape [samples, d].

  • y_trn (np.ndarray) – values of the tensor for the multi-indices I_trn in the form of an array of shape [samples].

  • r (int) – rank of the constructed TT-tensor.

  • order (int) – order of the ANOVA decomposition; only 1 or 2 are supported (see the note on the ANOVA expansion below).

  • noise (float) – noise added to the formally zero elements of the TT-cores.

  • seed (int or np.random.Generator) – random seed; either an integer or a numpy Generator instance.

  • fpath (str) – optional path for the train data (I_trn, y_trn).

Returns:

TT-tensor (list of TT-cores), which represents the TT-approximation of the tensor.

Return type:

list
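
The “order” argument corresponds to the truncation level of the classical ANOVA expansion of a d-variate function: order 1 keeps the constant and the univariate terms, order 2 additionally keeps the pairwise terms. Schematically (a standard formula, not specific to teneva):

f(x_1, \dots, x_d) \approx f_0 + \sum_{i=1}^{d} f_i(x_i) + \sum_{1 \le i < j \le d} f_{i,j}(x_i, x_j)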

Note

A class “ANOVA”, which provides a wider set of methods for working with this decomposition, is also available; see “anova.py” for more details (detailed documentation for this class will be prepared later). This function is just a wrapper around the “ANOVA” class; the class may later be replaced by the function.
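
For orientation, the wrapper relation is roughly the following (a minimal sketch with hypothetical toy data; the assumption that the class exposes a “cores” method returning the list of TT-cores should be checked against “anova.py”):

import numpy as np
import teneva

# Hypothetical toy data (for illustration only):
n = [10, 10, 10]                    # Shape of the tensor
I_trn = teneva.sample_lhs(n, 1000)  # Random multi-indices [samples, d]
y_trn = np.sum(I_trn, axis=1) * 1.  # Toy tensor values [samples]

# The function call and the explicit use of the class are expected to be
# roughly equivalent (the "cores" method here is an assumption):
Y1 = teneva.anova(I_trn, y_trn, r=2, order=1, seed=42)
ano = teneva.ANOVA(I_trn, y_trn, 1, seed=42)
Y2 = ano.cores(2)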

Examples:

import numpy as np
import teneva
from time import perf_counter as tpc  # Simple timer used below

d = 5                           # Dimension of the function
a = [-5., -4., -3., -2., -1.]   # Lower bounds for spatial grid
b = [+6., +3., +3., +1., +2.]   # Upper bounds for spatial grid
n = [ 20,  18,  16,  14,  12]   # Shape of the tensor
m     = 1.E+4  # Number of calls to target function
order = 1      # Order of ANOVA decomposition (1 or 2)
r     = 2      # TT-rank of the resulting tensor

We set the target function (the function takes as input a set of tensor multi-indices I of shape [samples, dimension], which are transformed into points X of the uniform spatial grid using the function “ind_to_poi”):

from scipy.optimize import rosen
def func(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return rosen(X.T)

We prepare the train data using LHS (Latin hypercube sampling):

I_trn = teneva.sample_lhs(n, m)
y_trn = func(I_trn)

We prepare test data from random tensor multi-indices:

# Number of test points:
m_tst = int(1.E+4)

# Random multi-indices for the test points:
I_tst = np.vstack([np.random.choice(k, m_tst) for k in n]).T

# Function values for the test points:
y_tst = func(I_tst)

We build the TT-tensor, which approximates the target function:

t = tpc()
Y = teneva.anova(I_trn, y_trn, r, order, seed=12345)
t = tpc() - t

print(f'Build time     : {t:-10.2f}')

# >>> ----------------------------------------
# >>> Output:

# Build time     :       0.01
#
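
We can also inspect the structure of the constructed TT-tensor (its dimension, mode sizes, and TT-ranks) with the “show” helper (output omitted here):

# Print dimension, mode sizes and TT-ranks of the constructed TT-tensor:
teneva.show(Y)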

And now we can check the result:

# Compute approximation in train points:
y_our = teneva.get_many(Y, I_trn)

# Accuracy of the result for train points:
e_trn = np.linalg.norm(y_our - y_trn)
e_trn /= np.linalg.norm(y_trn)

# Compute approximation in test points:
y_our = teneva.get_many(Y, I_tst)

# Accuracy of the result for test points:
e_tst = np.linalg.norm(y_our - y_tst)
e_tst /= np.linalg.norm(y_tst)

print(f'Error on train : {e_trn:-10.2e}')
print(f'Error on test  : {e_tst:-10.2e}')

# >>> ----------------------------------------
# >>> Output:

# Error on train :   1.08e-01
# Error on test  :   1.11e-01
#
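
Note that teneva also provides a helper that computes the same relative error on a dataset in a single call (assuming the function “accuracy_on_data” is available in your version of the package):

# Relative error on the test data in one call:
e_tst = teneva.accuracy_on_data(Y, I_tst, y_tst)
print(f'Error on test  : {e_tst:-10.2e}')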

We can also build the approximation using the second-order ANOVA decomposition:

t = tpc()
Y = teneva.anova(I_trn, y_trn, r, order=2, seed=12345)
t = tpc() - t

y_our = teneva.get_many(Y, I_trn)
e_trn = np.linalg.norm(y_our - y_trn)
e_trn /= np.linalg.norm(y_trn)

y_our = teneva.get_many(Y, I_tst)
e_tst = np.linalg.norm(y_our - y_tst)
e_tst /= np.linalg.norm(y_tst)

print(f'Build time     : {t:-10.2f}')
print(f'Error on train : {e_trn:-10.2e}')
print(f'Error on test  : {e_tst:-10.2e}')

# >>> ----------------------------------------
# >>> Output:

# Build time     :       0.09
# Error on train :   8.41e-02
# Error on test  :   8.51e-02
#

Let’s look at the quality of the approximation for a linear function. Such a function is additive (a sum of univariate terms), which is exactly the structure captured by the first-order ANOVA decomposition:

d = 4
a = -2.
b = +3.
n = [10] * d
r = 3
m_trn = int(1.E+5)
m_tst = int(1.E+4)
def func(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return 5. + 0.1 * X[:, 0] + 0.2 * X[:, 1] + 0.3 * X[:, 2] + 0.4 * X[:, 3]
I_trn = teneva.sample_lhs(n, m_trn)
y_trn = func(I_trn)

I_tst = np.vstack([np.random.choice(n[i], m_tst) for i in range(d)]).T
y_tst = func(I_tst)
t = tpc()
Y = teneva.anova(I_trn, y_trn, r, order=1, seed=12345)
t = tpc() - t

y_our = teneva.get_many(Y, I_trn)
e_trn = np.linalg.norm(y_our - y_trn)
e_trn /= np.linalg.norm(y_trn)

y_our = teneva.get_many(Y, I_tst)
e_tst = np.linalg.norm(y_our - y_tst)
e_tst /= np.linalg.norm(y_tst)

print(f'Build time     : {t:-10.2f}')
print(f'Error on train : {e_trn:-10.2e}')
print(f'Error on test  : {e_tst:-10.2e}')

# >>> ----------------------------------------
# >>> Output:

# Build time     :       0.03
# Error on train :   2.70e-03
# Error on test  :   2.72e-03
#

Let’s look at the quality of the approximation for a quadratic function. It is still additive (a sum of univariate terms), so the first-order decomposition again works well:

d = 4
a = -2.
b = +3.
n = [10] * d
r = 3
m_trn = int(1.E+5)
m_tst = int(1.E+4)
def func(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return 5. + 0.1 * X[:, 0]**2 + 0.2 * X[:, 1]**2 + 0.3 * X[:, 2]**2 + 0.4 * X[:, 3]**2
I_trn = teneva.sample_lhs(n, m_trn)
y_trn = func(I_trn)

I_tst = np.vstack([np.random.choice(n[i], m_tst) for i in range(d)]).T
y_tst = func(I_tst)
t = tpc()
Y = teneva.anova(I_trn, y_trn, r, order=1, seed=12345)
t = tpc() - t

y_our = teneva.get_many(Y, I_trn)
e_trn = np.linalg.norm(y_our - y_trn)
e_trn /= np.linalg.norm(y_trn)

y_our = teneva.get_many(Y, I_tst)
e_tst = np.linalg.norm(y_our - y_tst)
e_tst /= np.linalg.norm(y_tst)

print(f'Build time     : {t:-10.2f}')
print(f'Error on train : {e_trn:-10.2e}')
print(f'Error on test  : {e_tst:-10.2e}')

# >>> ----------------------------------------
# >>> Output:

# Build time     :       0.03
# Error on train :   3.49e-03
# Error on test  :   3.51e-03
#
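
As a further sketch (a hypothetical example; outputs not shown), consider a function with a pairwise interaction term such as x0 * x1. The first-order decomposition cannot represent the cross term, so one would expect order=2 to give a noticeably smaller error:

# Hypothetical target function with a pairwise interaction term:
def func_pair(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return 5. + X[:, 0] * X[:, 1] + 0.3 * X[:, 2]

y_trn_pair = func_pair(I_trn)
y_tst_pair = func_pair(I_tst)

for order_cur in [1, 2]:
    Y = teneva.anova(I_trn, y_trn_pair, r, order=order_cur, seed=12345)
    y_our = teneva.get_many(Y, I_tst)
    e_tst = np.linalg.norm(y_our - y_tst_pair) / np.linalg.norm(y_tst_pair)
    print(f'Order {order_cur} | error on test : {e_tst:-10.2e}')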

[Draft] We can also sample multi-indices using the ANOVA decomposition:

d         = 5                           # Dimension of the function
a         = [-5., -4., -3., -2., -1.]   # Lower bounds for spatial grid
b         = [+6., +3., +3., +1., +2.]   # Upper bounds for spatial grid
n         = [ 20,  18,  16,  14,  12]   # Shape of the tensor
m         = 1.E+4  # Number of calls to target function
order     = 2      # Order of ANOVA decomposition (1 or 2)
from scipy.optimize import rosen
def func(I):
    X = teneva.ind_to_poi(I, a, b, n)
    return rosen(X.T)
I_trn = teneva.sample_lhs(n, m)
y_trn = func(I_trn)
t = tpc()
ano = teneva.ANOVA(I_trn, y_trn, order, seed=12345)
t = tpc() - t

print(f'Build time     : {t:-10.2f}')

# >>> ----------------------------------------
# >>> Output:

# Build time     :       0.07
#
for _ in range(10):
    print(ano.sample())

# >>> ----------------------------------------
# >>> Output:

# [2, 3, 12, 9, 4]
# [8, 5, 1, 9, 11]
# [3, 16, 8, 2, 4]
# [19, 11, 5, 10, 2]
# [0, 1, 5, 6, 3]
# [19, 2, 2, 1, 7]
# [19, 9, 14, 10, 10]
# [19, 8, 15, 6, 3]
# [15, 9, 4, 12, 3]
# [9, 1, 4, 0, 7]
#
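
A possible follow-up (a sketch): collect a batch of sampled multi-indices and evaluate the target function at the corresponding grid points, for example to use them as candidate points:

# Collect a batch of samples and evaluate the target function on them:
I_smp = np.array([ano.sample() for _ in range(100)])
y_smp = func(I_smp)
print(I_smp.shape, y_smp.shape)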