Tensor rank decomposition


In multilinear algebra, the tensor rank decomposition[1] or rank-[math]\displaystyle{ R }[/math] decomposition of a tensor is the decomposition of a tensor as a sum of a minimum number [math]\displaystyle{ R }[/math] of rank-1 tensors. Determining this minimal [math]\displaystyle{ R }[/math] is, in general, an open problem. Canonical polyadic decomposition (CPD) is a variant of the rank decomposition which computes the best fitting [math]\displaystyle{ K }[/math] rank-1 terms for a user-specified [math]\displaystyle{ K }[/math]. The CP decomposition has found applications in fields such as linguistics and chemometrics. The CP rank was introduced by Frank Lauren Hitchcock in 1927[2] and later rediscovered several times, notably in psychometrics. The CANDECOMP/PARAFAC decomposition is a variation of the CP decomposition.

Another popular generalization of the matrix SVD, known as the higher-order singular value decomposition, computes orthonormal mode matrices and has found applications in econometrics, signal processing, computer vision, computer graphics, and psychometrics.

Notation

A scalar variable is denoted by a lower case italic letter, [math]\displaystyle{ a }[/math], and an upper bound scalar is denoted by an upper case italic letter, [math]\displaystyle{ A }[/math].

Indices are denoted by a combination of lower case and upper case italic letters, [math]\displaystyle{ 1 \le i \le I }[/math]. Multiple indices that one might encounter when referring to the multiple modes of a tensor are conveniently denoted by [math]\displaystyle{ 1\le i_m \le I_m }[/math] where [math]\displaystyle{ 1\le m \le M }[/math].

A vector is denoted by a lower case bold letter, [math]\displaystyle{ \mathbf a }[/math], and a matrix is denoted by an upper case bold letter, [math]\displaystyle{ \mathbf A }[/math].

A higher-order tensor is denoted by a calligraphic letter, [math]\displaystyle{ \mathcal A }[/math]. An element of an order-[math]\displaystyle{ M }[/math] tensor [math]\displaystyle{ \mathcal A \in \mathbb C^{I_1 \times I_2 \times \dots \times I_m \times \dots \times I_M} }[/math] is denoted by [math]\displaystyle{ a_{i_1, i_2,\dots,i_m,\dots, i_M} }[/math] or [math]\displaystyle{ \mathcal A_{i_1, i_2,\dots,i_m,\dots, i_M} }[/math].

Definition

A data tensor [math]\displaystyle{ {\mathcal A}\in {\mathbb F}^{I_0 \times I_1 \times \ldots \times I_C} }[/math] is a collection of multivariate observations organized into an [math]\displaystyle{ M }[/math]-way array, where [math]\displaystyle{ M=C+1 }[/math]. Every tensor may be represented, for a suitably large [math]\displaystyle{ R }[/math], as a linear combination of [math]\displaystyle{ R }[/math] rank-1 tensors:

[math]\displaystyle{ \mathcal{A} = \sum_{r=1}^{R} \lambda_r \mathbf{a}_{0,r} \otimes\mathbf{a}_{1,r} \otimes \mathbf{a}_{2,r} \dots \otimes \mathbf{a}_{c,r}\otimes \cdots \otimes \mathbf{a}_{C,r}, }[/math]

where [math]\displaystyle{ \lambda_r \in {\mathbb R} }[/math] and [math]\displaystyle{ \mathbf{a}_{m,r} \in {\mathbb F}^{I_m} }[/math] for [math]\displaystyle{ 0\le m\le C }[/math]. When the number of terms [math]\displaystyle{ R }[/math] is minimal in the above expression, [math]\displaystyle{ R }[/math] is called the rank of the tensor, and the decomposition is often referred to as a (tensor) rank decomposition, minimal CP decomposition, or canonical polyadic decomposition (CPD). If the number of terms is not minimal, the above decomposition is often referred to as CANDECOMP/PARAFAC or polyadic decomposition.
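
As a concrete illustration of this definition, the following sketch (in Python with NumPy; the sizes, weights, and factor vectors are arbitrary choices for the example) assembles an order-3 tensor from [math]\displaystyle{ R }[/math] rank-1 outer-product terms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Example sizes for a 3-way tensor A of shape (I0, I1, I2) built from R rank-1 terms.
I0, I1, I2, R = 4, 5, 3, 2

weights = rng.standard_normal(R)                               # the lambda_r
factors = [rng.standard_normal((I, R)) for I in (I0, I1, I2)]  # columns are the a_{m,r}

# A = sum_r lambda_r * a_{0,r} (x) a_{1,r} (x) a_{2,r}
A = np.zeros((I0, I1, I2))
for r in range(R):
    A += weights[r] * np.einsum('i,j,k->ijk',
                                factors[0][:, r], factors[1][:, r], factors[2][:, r])

# By construction, rank(A) <= R.
print(A.shape)  # (4, 5, 3)
```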

Tensor rank

Contrary to the case of matrices, computing the rank of a tensor is NP-hard.[3] The only notable well-understood case consists of tensors in [math]\displaystyle{ F^{I_m} \otimes F^{I_n} \otimes F^2 }[/math], whose rank can be obtained from the Kronecker–Weierstrass normal form of the linear matrix pencil that the tensor represents.[4] A simple polynomial-time algorithm exists for certifying that a tensor is of rank 1, namely the higher-order singular value decomposition.

The rank of the zero tensor is zero by convention. The rank of a tensor [math]\displaystyle{ \mathbf{a}_1 \otimes \cdots \otimes \mathbf{a}_M }[/math] is one, provided that [math]\displaystyle{ \mathbf{a}_m \in F^{I_m}\setminus\{0\} }[/math] for all [math]\displaystyle{ m }[/math].

Field dependence

The rank of a tensor depends on the field over which the tensor is decomposed. It is known that some real tensors may admit a complex decomposition whose rank is strictly less than the rank of a real decomposition of the same tensor. As an example,[5] consider the following real tensor

[math]\displaystyle{ \mathcal{A} = \mathbf{x}_1 \otimes \mathbf{x}_2 \otimes \mathbf{x}_3 + \mathbf{x}_1 \otimes \mathbf{y}_2 \otimes \mathbf{y}_3 - \mathbf{y}_1 \otimes \mathbf{x}_2 \otimes \mathbf{y}_3 + \mathbf{y}_1 \otimes \mathbf{y}_2 \otimes \mathbf{x}_3, }[/math]

where [math]\displaystyle{ \mathbf{x}_i, \mathbf{y}_j \in \mathbb{R}^2 }[/math]. The rank of this tensor over the reals is known to be 3, while its complex rank is only 2 because it is the sum of a complex rank-1 tensor with its complex conjugate, namely

[math]\displaystyle{ \mathcal{A} = \frac{1}{2}( \bar{\mathbf{z}}_1 \otimes \mathbf{z}_2 \otimes \bar{\mathbf{z}}_3 + \mathbf{z}_1 \otimes \bar{\mathbf{z}}_2 \otimes \mathbf{z}_3), }[/math]

where [math]\displaystyle{ \mathbf{z}_k=\mathbf{x}_k + i\mathbf{y}_k }[/math].
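
The identity is straightforward to verify numerically. The sketch below (Python with NumPy; the vectors [math]\displaystyle{ \mathbf{x}_k, \mathbf{y}_k }[/math] are drawn at random, and outer3 is a helper defined inline) checks that the real rank-3 expression and the complex rank-2 expression produce the same tensor.

```python
import numpy as np

rng = np.random.default_rng(1)
x = [rng.standard_normal(2) for _ in range(3)]  # x_1, x_2, x_3
y = [rng.standard_normal(2) for _ in range(3)]  # y_1, y_2, y_3

outer3 = lambda a, b, c: np.einsum('i,j,k->ijk', a, b, c)

# Real rank-3 decomposition of A.
A_real = (outer3(x[0], x[1], x[2]) + outer3(x[0], y[1], y[2])
          - outer3(y[0], x[1], y[2]) + outer3(y[0], y[1], x[2]))

# Complex rank-2 decomposition: A = (conj(z1)(x)z2(x)conj(z3) + z1(x)conj(z2)(x)z3) / 2.
z = [x[k] + 1j * y[k] for k in range(3)]
A_cplx = 0.5 * (outer3(np.conj(z[0]), z[1], np.conj(z[2]))
                + outer3(z[0], np.conj(z[1]), z[2]))

print(np.allclose(A_real, A_cplx.real), np.allclose(A_cplx.imag, 0))  # True True
```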

In contrast, the rank of real matrices will never decrease under a field extension to [math]\displaystyle{ \mathbb{C} }[/math]: real matrix rank and complex matrix rank coincide for real matrices.

Generic rank

The generic rank [math]\displaystyle{ r(I_1,\ldots,I_M) }[/math] is defined as the least rank [math]\displaystyle{ r }[/math] such that the closure in the Zariski topology of the set of tensors of rank at most [math]\displaystyle{ r }[/math] is the entire space [math]\displaystyle{ F^{I_1} \otimes \cdots \otimes F^{I_M} }[/math]. In the case of complex tensors, tensors of rank at most [math]\displaystyle{ r(I_1,\ldots,I_M) }[/math] form a dense set [math]\displaystyle{ S }[/math]: every tensor in the aforementioned space is either of rank less than the generic rank, or it is the limit in the Euclidean topology of a sequence of tensors from [math]\displaystyle{ S }[/math]. In the case of real tensors, the set of tensors of rank at most [math]\displaystyle{ r(I_1,\ldots,I_M) }[/math] only forms an open set of positive measure in the Euclidean topology. There may exist Euclidean-open sets of tensors of rank strictly higher than the generic rank. All ranks appearing on open sets in the Euclidean topology are called typical ranks. The smallest typical rank is called the generic rank; this definition applies to both complex and real tensors. The generic rank of tensor spaces was initially studied in 1983 by Volker Strassen.[6]

As an illustration of the above concepts, it is known that both 2 and 3 are typical ranks of [math]\displaystyle{ \mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2 }[/math], while the generic rank of [math]\displaystyle{ \mathbb{C}^2 \otimes \mathbb{C}^2 \otimes \mathbb{C}^2 }[/math] is 2. Practically, this means that a randomly sampled real tensor (from a continuous probability measure on the space of tensors) of size [math]\displaystyle{ 2 \times 2 \times 2 }[/math] will be a rank-1 tensor with probability zero, a rank-2 tensor with positive probability, and a rank-3 tensor with positive probability. On the other hand, a randomly sampled complex tensor of the same size will be a rank-1 tensor with probability zero, a rank-2 tensor with probability one, and a rank-3 tensor with probability zero. It is even known that the generic rank-3 real tensor in [math]\displaystyle{ \mathbb{R}^2 \otimes \mathbb{R}^2 \otimes \mathbb{R}^2 }[/math] will be of complex rank equal to 2.

The generic rank of tensor spaces depends on the distinction between balanced and unbalanced tensor spaces. A tensor space [math]\displaystyle{ F^{I_1} \otimes \cdots \otimes F^{I_M} }[/math], where [math]\displaystyle{ I_1 \ge I_2 \ge \cdots \ge I_M }[/math], is called unbalanced whenever

[math]\displaystyle{ I_1 \gt 1 + \prod_{m=2}^M I_m - \sum_{m=2}^M (I_m-1), }[/math]

and it is called balanced otherwise.

Unbalanced tensor spaces

When the first factor is very large with respect to the other factors in the tensor product, the tensor space essentially behaves as a matrix space. The generic rank of tensors living in an unbalanced tensor space is known to equal

[math]\displaystyle{ r(I_1,\ldots,I_M) = \min\left\{ I_1, \prod_{m=2}^M I_m \right\} }[/math]

almost everywhere. More precisely, the rank of every tensor in an unbalanced tensor space [math]\displaystyle{ F^{I_1 \times \cdots \times I_M} \setminus Z }[/math], where [math]\displaystyle{ Z }[/math] is some indeterminate closed set in the Zariski topology, equals the above value.[7]

Balanced tensor spaces

The expected generic rank of tensors living in a balanced tensor space is equal to

[math]\displaystyle{ r_E(I_1,\ldots,I_M) = \left\lceil \frac{\Pi}{\Sigma+1} \right\rceil }[/math]

almost everywhere for complex tensors and on a Euclidean-open set for real tensors, where

[math]\displaystyle{ \Pi = \prod_{m=1}^{M} I_m \quad\text{and}\quad \Sigma = \sum_{m=1}^{M} (I_m - 1). }[/math]

More precisely, the rank of every tensor in [math]\displaystyle{ \mathbb{C}^{I_1 \times \cdots \times I_M} \setminus Z }[/math], where [math]\displaystyle{ Z }[/math] is some indeterminate closed set in the Zariski topology, is expected to equal the above value.[8] For real tensors, [math]\displaystyle{ r_E(I_1,\ldots,I_M) }[/math] is the least rank that is expected to occur on a set of positive Euclidean measure. The value [math]\displaystyle{ r_E(I_1,\ldots,I_M) }[/math] is often referred to as the expected generic rank of the tensor space [math]\displaystyle{ F^{I_1 \times \cdots \times I_M} }[/math] because it is only conjecturally correct. It is known that the true generic rank always satisfies

[math]\displaystyle{ r(I_1, \ldots, I_M) \ge r_E(I_1, \ldots, I_M). }[/math]

The Abo–Ottaviani–Peterson conjecture[8] states that equality is expected, i.e., [math]\displaystyle{ r(I_1,\ldots,I_M) = r_E(I_1,\ldots,I_M) }[/math], with the following exceptional cases:

  • [math]\displaystyle{ F^{(2m+1) \times (2m+1) \times 3} \text{ with } m = 1, 2, \ldots }[/math]
  • [math]\displaystyle{ F^{(m+1) \times (m+1) \times 2 \times 2} \text{ with } m = 2,3, \ldots }[/math]

In each of these exceptional cases, the generic rank is known to be [math]\displaystyle{ r(I_1,\ldots,I_M) = r_E(I_1,\ldots,I_M)+1 }[/math]. Note that while the set of tensors of rank 3 in [math]\displaystyle{ F^{2 \times 2 \times 2 \times 2} }[/math] is defective (of dimension 13 rather than the expected 14), the generic rank in that space is still the expected one, 4. Similarly, the set of tensors of rank 5 in [math]\displaystyle{ F^{4 \times 4 \times 3} }[/math] is defective (of dimension 44 rather than the expected 45), but the generic rank in that space is still the expected 6.
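
These formulas are simple to evaluate. The helper below (a direct Python transcription of the balance criterion and of [math]\displaystyle{ r_E }[/math]; the function names are illustrative) reproduces the values quoted in this section.

```python
from math import prod, ceil

def is_unbalanced(dims):
    """Balance criterion: I_1 > 1 + prod_{m>=2} I_m - sum_{m>=2} (I_m - 1)."""
    dims = sorted(dims, reverse=True)          # ensure I_1 >= I_2 >= ... >= I_M
    tail_prod = prod(dims[1:])
    tail_sum = sum(d - 1 for d in dims[1:])
    return dims[0] > 1 + tail_prod - tail_sum

def expected_generic_rank(dims):
    """r_E = ceil(Pi / (Sigma + 1)), with Pi = prod(I_m) and Sigma = sum(I_m - 1)."""
    Pi = prod(dims)
    Sigma = sum(d - 1 for d in dims)
    return ceil(Pi / (Sigma + 1))

print(is_unbalanced((2, 2, 2)))             # False: the space is balanced
print(expected_generic_rank((2, 2, 2)))     # 2, matching the generic rank over C
print(expected_generic_rank((4, 4, 3)))     # 6, as quoted above
print(expected_generic_rank((2, 2, 2, 2)))  # 4
```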

The AOP conjecture has been proved completely in a number of special cases. Lickteig showed already in 1985 that [math]\displaystyle{ r(n,n,n) = r_E(n,n,n) }[/math], provided that [math]\displaystyle{ n \ne 3 }[/math].[9] In 2011, a major breakthrough was established by Catalisano, Geramita, and Gimigliano, who proved that the dimension of the set of rank-[math]\displaystyle{ s }[/math] tensors of format [math]\displaystyle{ 2\times 2\times \cdots \times 2 }[/math] is the expected one, except for rank-3 tensors in the 4-factor case, where the generic rank is nevertheless still 4. As a consequence, [math]\displaystyle{ r(2,2,\ldots,2) = r_E(2,2,\ldots,2) }[/math] for all binary tensors.[10]

Maximum rank

The maximum rank that can be admitted by any of the tensors in a tensor space is unknown in general; even a conjecture about this maximum rank is missing. Presently, the best general upper bound states that the maximum rank [math]\displaystyle{ r_\mbox{max}(I_1,\ldots,I_M) }[/math] of [math]\displaystyle{ F^{I_1} \otimes \cdots \otimes F^{I_M} }[/math], where [math]\displaystyle{ I_1 \ge I_2 \ge \cdots \ge I_M }[/math], satisfies

[math]\displaystyle{ r_\mbox{max}(I_1,\ldots,I_M) \le \min\left\{ \prod_{m=2}^M I_m, 2 \cdot r(I_1,\ldots,I_M) \right\}, }[/math]

where [math]\displaystyle{ r(I_1,\ldots,I_M) }[/math] is the (least) generic rank of [math]\displaystyle{ F^{I_1} \otimes \cdots \otimes F^{I_M} }[/math].[11] It is well-known that the foregoing inequality may be strict. For instance, the generic rank of tensors in [math]\displaystyle{ \mathbb{R}^{2\times 2 \times 2} }[/math] is two, so that the above bound yields [math]\displaystyle{ r_\mbox{max}(2,2,2) \le 4 }[/math], while it is known that the maximum rank equals 3.[5]
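
As a quick numerical check of the bound (a small Python helper; the generic rank must be supplied separately, e.g. from the expected-rank formula above, and the function name is illustrative):

```python
from math import prod

def max_rank_upper_bound(dims, generic_rank):
    """Evaluates min{ prod_{m=2}^M I_m, 2 * r(I_1,...,I_M) } from the text."""
    dims = sorted(dims, reverse=True)  # ensure I_1 >= I_2 >= ... >= I_M
    return min(prod(dims[1:]), 2 * generic_rank)

# Generic rank of R^{2x2x2} is 2, so the bound gives 4; the true maximum rank is 3.
print(max_rank_upper_bound((2, 2, 2), generic_rank=2))  # 4
```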

Border rank

A rank-[math]\displaystyle{ s }[/math] tensor [math]\displaystyle{ \mathcal{A} }[/math] is called a border tensor if there exists a sequence of tensors of rank at most [math]\displaystyle{ r \lt s }[/math] whose limit is [math]\displaystyle{ \mathcal{A} }[/math]. If [math]\displaystyle{ r }[/math] is the least value for which such a convergent sequence exists, then it is called the border rank of [math]\displaystyle{ \mathcal{A} }[/math]. For order-2 tensors, i.e., matrices, rank and border rank always coincide; however, for tensors of order [math]\displaystyle{ \ge3 }[/math] they may differ. Border tensors were first studied in the context of fast approximate matrix multiplication algorithms by Bini, Lotti, and Romani in 1980.[12]

A classic example of a border tensor is the rank-3 tensor

[math]\displaystyle{ \mathcal{A} = \mathbf{u} \otimes \mathbf{u} \otimes \mathbf{v} + \mathbf{u} \otimes \mathbf{v} \otimes \mathbf{u} + \mathbf{v} \otimes \mathbf{u} \otimes \mathbf{u}, \quad \text{with } \|\mathbf{u}\| = \|\mathbf{v}\| = 1 \text{ and } \langle \mathbf{u}, \mathbf{v}\rangle \ne 1. }[/math]

It can be approximated arbitrarily well by the following sequence of rank-2 tensors

[math]\displaystyle{ \begin{align} \mathcal{A}_m &= m (\mathbf{u} + \frac{1}{m} \mathbf{v}) \otimes (\mathbf{u} + \frac{1}{m} \mathbf{v}) \otimes (\mathbf{u} + \frac{1}{m} \mathbf{v}) - m \mathbf{u}\otimes\mathbf{u}\otimes\mathbf{u} \\ &= \mathbf{u} \otimes \mathbf{u} \otimes \mathbf{v} + \mathbf{u} \otimes \mathbf{v} \otimes \mathbf{u} + \mathbf{v} \otimes \mathbf{u} \otimes \mathbf{u} + \frac{1}{m} (\mathbf{u}\otimes\mathbf{v}\otimes\mathbf{v} + \mathbf{v}\otimes\mathbf{u}\otimes\mathbf{v} + \mathbf{v}\otimes\mathbf{v}\otimes\mathbf{u}) + \frac{1}{m^2} \mathbf{v}\otimes\mathbf{v}\otimes\mathbf{v} \end{align} }[/math]

as [math]\displaystyle{ m \to \infty }[/math]. Therefore, its border rank is 2, which is strictly less than its rank. When the two vectors are orthogonal, this example is also known as a W state.
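
The convergence [math]\displaystyle{ \mathcal{A}_m \to \mathcal{A} }[/math] can be observed directly. The sketch below (Python with NumPy, using the orthogonal W-state instance) measures the approximation error, which decays like [math]\displaystyle{ O(1/m) }[/math].

```python
import numpy as np

outer3 = lambda a, b, c: np.einsum('i,j,k->ijk', a, b, c)

# Orthogonal unit vectors, so <u, v> = 0 != 1 (the W-state instance).
u = np.array([1.0, 0.0])
v = np.array([0.0, 1.0])

A = outer3(u, u, v) + outer3(u, v, u) + outer3(v, u, u)  # rank 3, border rank 2

for m in [1, 10, 100, 1000]:
    w = u + v / m
    A_m = m * outer3(w, w, w) - m * outer3(u, u, u)  # rank at most 2 for every m
    # The error decays like O(1/m), while the two rank-1 terms individually
    # have Frobenius norms growing like m ("diverging components"; see below).
    print(m, np.linalg.norm(A - A_m))
```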

Properties

Identifiability

It follows from the definition of a pure tensor that [math]\displaystyle{ \mathcal{A} = \mathbf{a}_1 \otimes \mathbf{a}_2 \otimes \cdots \otimes \mathbf{a}_M = \mathbf{b}_1 \otimes \mathbf{b}_2 \otimes \cdots \otimes \mathbf{b}_M }[/math] if and only if there exist [math]\displaystyle{ \lambda_m }[/math] such that [math]\displaystyle{ \lambda_1 \lambda_2 \cdots \lambda_M = 1 }[/math] and [math]\displaystyle{ \mathbf{a}_m = \lambda_m \mathbf{b}_m }[/math] for all [math]\displaystyle{ m }[/math]. For this reason, the parameters [math]\displaystyle{ \{ \mathbf{a}_m \}_{m=1}^M }[/math] of a rank-1 tensor [math]\displaystyle{ \mathcal{A} }[/math] are called identifiable or essentially unique. A rank-[math]\displaystyle{ r }[/math] tensor [math]\displaystyle{ \mathcal{A} \in F^{I_1} \otimes F^{I_2} \otimes \cdots \otimes F^{I_M} }[/math] is called identifiable if each of its tensor rank decompositions is the sum of the same set of [math]\displaystyle{ r }[/math] distinct rank-1 tensors [math]\displaystyle{ \{ \mathcal{A}_1, \mathcal{A}_2, \ldots, \mathcal{A}_r \} }[/math]. An identifiable rank-[math]\displaystyle{ r }[/math] tensor thus has only one essentially unique decomposition [math]\displaystyle{ \mathcal{A} = \sum_{i=1}^r \mathcal{A}_i, }[/math] and all [math]\displaystyle{ r! }[/math] tensor rank decompositions of [math]\displaystyle{ \mathcal{A} }[/math] can be obtained by permuting the order of the summands. Observe that in a tensor rank decomposition all the [math]\displaystyle{ \mathcal{A}_i }[/math]'s are distinct, for otherwise the rank of [math]\displaystyle{ \mathcal{A} }[/math] would be at most [math]\displaystyle{ r-1 }[/math].
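
The scaling ambiguity for rank-1 tensors can be checked in one line (a tiny Python sketch with hand-picked vectors; the scalars satisfy [math]\displaystyle{ \lambda_1 \lambda_2 \lambda_3 = 1 }[/math]):

```python
import numpy as np

a1, a2, a3 = np.array([1., 2.]), np.array([3., 4.]), np.array([5., 6.])
T1 = np.einsum('i,j,k->ijk', a1, a2, a3)

# Rescale the factors with lambda_1 * lambda_2 * lambda_3 = 2 * 0.25 * 2 = 1:
# the pure tensor is unchanged.
l = (2.0, 0.25, 2.0)
T2 = np.einsum('i,j,k->ijk', l[0] * a1, l[1] * a2, l[2] * a3)
print(np.allclose(T1, T2))  # True
```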

Generic identifiability

Order-2 tensors in [math]\displaystyle{ F^{I_1} \otimes F^{I_2} \simeq F^{I_1 \times I_2} }[/math], i.e., matrices, are not identifiable for [math]\displaystyle{ r \gt 1 }[/math]. This follows essentially from the observation [math]\displaystyle{ \mathcal{A} = \sum_{i=1}^r \mathbf{a}_i \otimes \mathbf{b}_i = \sum_{i=1}^r \mathbf{a}_i \mathbf{b}_i^T = A B^T = (A X^{-1}) (B X^T)^T = \sum_{i=1}^r \mathbf{c}_i \mathbf{d}_i^T = \sum_{i=1}^r \mathbf{c}_i \otimes \mathbf{d}_i, }[/math] where [math]\displaystyle{ X \in \mathrm{GL}_{r}(F) }[/math] is an invertible [math]\displaystyle{ r \times r }[/math] matrix, [math]\displaystyle{ A = [\mathbf{a}_i]_{i=1}^r }[/math], [math]\displaystyle{ B = [\mathbf{b}_i]_{i=1}^r }[/math], [math]\displaystyle{ A X^{-1} = [\mathbf{c}_i]_{i=1}^r }[/math] and [math]\displaystyle{ B X^T = [\mathbf{d}_i]_{i=1}^r }[/math]. It can be shown[13] that for every [math]\displaystyle{ X \in \mathrm{GL}_r(F)\setminus Z }[/math], where [math]\displaystyle{ Z }[/math] is a closed set in the Zariski topology, the decomposition on the right-hand side is a sum of a different set of rank-1 tensors than the decomposition on the left-hand side, entailing that order-2 tensors of rank [math]\displaystyle{ r\gt 1 }[/math] are generically not identifiable.
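
This construction is easy to reproduce numerically. The following sketch (Python with NumPy, with a randomly chosen, hence generically invertible, [math]\displaystyle{ X }[/math]) exhibits two different rank-1 decompositions of the same matrix.

```python
import numpy as np

rng = np.random.default_rng(2)
r, I1, I2 = 2, 4, 3

A = rng.standard_normal((I1, r))
B = rng.standard_normal((I2, r))
M = A @ B.T                      # a rank-2 matrix, M = sum_i a_i b_i^T

X = rng.standard_normal((r, r))  # generically invertible
C = A @ np.linalg.inv(X)         # columns c_i
D = B @ X.T                      # columns d_i

# Same matrix, different rank-1 summands: M = sum_i c_i d_i^T as well.
print(np.allclose(M, C @ D.T))   # True
```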

The situation changes completely for higher-order tensors in [math]\displaystyle{ F^{I_1} \otimes F^{I_2} \otimes \cdots \otimes F^{I_M} }[/math] with [math]\displaystyle{ M \gt 2 }[/math] and all [math]\displaystyle{ I_m \ge 2 }[/math]. For simplicity in notation, assume without loss of generality that the factors are ordered such that [math]\displaystyle{ I_1 \ge I_2 \ge \cdots \ge I_M \ge 2 }[/math]. Let [math]\displaystyle{ S_r \subset F^{I_1} \otimes \cdots \otimes F^{I_M} }[/math] denote the set of tensors of rank bounded by [math]\displaystyle{ r }[/math]. Then the following statement was proved to be correct using a computer-assisted proof for all spaces of dimension [math]\displaystyle{ \Pi \lt 15000 }[/math],[14] and it is conjectured to be valid in general:[14][15][16]

There exists a closed set [math]\displaystyle{ Z_r }[/math] in the Zariski topology such that every tensor [math]\displaystyle{ \mathcal{A} \in S_r\setminus Z_r }[/math] is identifiable ([math]\displaystyle{ S_r }[/math] is called generically identifiable in this case), unless one of the following exceptional cases holds:

  1. The rank is too large: [math]\displaystyle{ r \gt r_E(I_1, I_2, \ldots, I_M) }[/math];
  2. The space is identifiability-unbalanced, i.e., [math]\displaystyle{ I_1 \gt \prod_{m=2}^M I_m - \sum_{m=2}^M (I_m - 1) }[/math], and the rank is too large: [math]\displaystyle{ r \ge \prod_{m=2}^M I_m - \sum_{m=2}^M (I_m-1) }[/math];
  3. The space is the defective case [math]\displaystyle{ F^4 \otimes F^4 \otimes F^3 }[/math] and the rank is [math]\displaystyle{ r=5 }[/math];
  4. The space is the defective case [math]\displaystyle{ F^n \otimes F^n \otimes F^2 \otimes F^2 }[/math], where [math]\displaystyle{ n \ge 2 }[/math], and the rank is [math]\displaystyle{ r = 2n-1 }[/math];
  5. The space is [math]\displaystyle{ F^4 \otimes F^4 \otimes F^4 }[/math] and the rank is [math]\displaystyle{ r=6 }[/math];
  6. The space is [math]\displaystyle{ F^6 \otimes F^6 \otimes F^3 }[/math] and the rank is [math]\displaystyle{ r = 8 }[/math];
  7. The space is [math]\displaystyle{ F^2 \otimes F^2 \otimes F^2 \otimes F^2 \otimes F^2 }[/math] and the rank is [math]\displaystyle{ r=5 }[/math]; or
  8. The space is perfect, i.e., [math]\displaystyle{ r_E(I_1,I_2,\ldots,I_M) = \frac{\Pi}{\Sigma+1} }[/math] is an integer, and the rank is [math]\displaystyle{ r = r_E(I_1,I_2,\ldots,I_M) }[/math].

In these exceptional cases, the generic (and also minimum) number of complex decompositions is

  • proved to be [math]\displaystyle{ \infty }[/math] in the first four cases;
  • proved to be two in case 5;[17]
  • expected[18] to be six in case 6;
  • proved to be two in case 7;[19] and
  • expected[18] to be at least two in case 8, with the exception of the two identifiable cases [math]\displaystyle{ F^5 \otimes F^4 \otimes F^3 }[/math] and [math]\displaystyle{ F^3 \otimes F^2 \otimes F^2 \otimes F^2 }[/math].

In summary, the generic tensor of order [math]\displaystyle{ M \gt 2 }[/math] and rank [math]\displaystyle{ r \lt \frac{\Pi}{\Sigma+1} }[/math] that is not identifiability-unbalanced is expected to be identifiable (modulo the exceptional cases in small spaces).

Ill-posedness of the standard approximation problem

The rank approximation problem asks for the rank-[math]\displaystyle{ r }[/math] decomposition closest (in the usual Euclidean topology) to some rank-[math]\displaystyle{ s }[/math] tensor [math]\displaystyle{ \mathcal{A} }[/math], where [math]\displaystyle{ r \lt s }[/math]. That is, one seeks to solve

[math]\displaystyle{ \min_{\mathbf{a}_i^m \in F^{I_m}} \| \mathcal{A} - \sum_{i=1}^{r} \mathbf{a}_i^1 \otimes \mathbf{a}_i^2 \otimes \cdots \otimes \mathbf{a}_i^M \|_F, }[/math]

where [math]\displaystyle{ \|\cdot\|_F }[/math] is the Frobenius norm.

It was shown in a 2008 paper by de Silva and Lim[5] that the above standard approximation problem may be ill-posed. A solution to the aforementioned problem may sometimes not exist because the set over which one optimizes is not closed. As such, a minimizer may not exist, even though an infimum does. In particular, it is known that certain so-called border tensors may be approximated arbitrarily well by a sequence of tensors of rank at most [math]\displaystyle{ r }[/math], even though that sequence converges to a tensor of rank strictly higher than [math]\displaystyle{ r }[/math]. The rank-3 tensor

[math]\displaystyle{ \mathcal{A} = \mathbf{u} \otimes \mathbf{u} \otimes \mathbf{v} + \mathbf{u} \otimes \mathbf{v} \otimes \mathbf{u} + \mathbf{v} \otimes \mathbf{u} \otimes \mathbf{u}, \quad \text{with } \|\mathbf{u}\| = \|\mathbf{v}\| = 1 \text{ and } \langle \mathbf{u}, \mathbf{v}\rangle \ne 1 }[/math]

can be approximated arbitrarily well by the following sequence of rank-2 tensors

[math]\displaystyle{ \mathcal{A}_n = n (\mathbf{u} + \frac{1}{n} \mathbf{v}) \otimes (\mathbf{u} + \frac{1}{n} \mathbf{v}) \otimes (\mathbf{u} + \frac{1}{n} \mathbf{v}) - n \mathbf{u}\otimes\mathbf{u}\otimes\mathbf{u} }[/math]

as [math]\displaystyle{ n \to \infty }[/math]. This example neatly illustrates the general principle that a sequence of rank-[math]\displaystyle{ r }[/math] tensors that converges to a tensor of strictly higher rank needs to admit at least two individual rank-1 terms whose norms become unbounded. Stated formally, whenever a sequence

[math]\displaystyle{ \mathcal{A}_n = \sum_{i=1}^r \mathbf{a}^1_{i,n} \otimes \mathbf{a}^2_{i,n} \otimes \cdots \otimes \mathbf{a}^M_{i,n} }[/math]

has the property that [math]\displaystyle{ \mathcal{A}_n \to \mathcal{A} }[/math] (in the Euclidean topology) as [math]\displaystyle{ n\to\infty }[/math], then there should exist at least [math]\displaystyle{ 1 \le i \ne j \le r }[/math] such that

[math]\displaystyle{ \| \mathbf{a}^1_{i,n} \otimes \mathbf{a}^2_{i,n} \otimes \cdots \otimes \mathbf{a}^M_{i,n} \|_F \to \infty \text{ and } \| \mathbf{a}^1_{j,n} \otimes \mathbf{a}^2_{j,n} \otimes \cdots \otimes \mathbf{a}^M_{j,n} \|_F \to \infty }[/math]

as [math]\displaystyle{ n\to\infty }[/math]. This phenomenon is often encountered when attempting to approximate a tensor using numerical optimization algorithms. It is sometimes called the problem of diverging components. It was, in addition, shown that a random low-rank tensor over the reals may not admit a rank-2 approximation with positive probability, leading to the understanding that the ill-posedness problem is an important consideration when employing the tensor rank decomposition.

A common partial solution to the ill-posedness problem consists of imposing an additional inequality constraint that bounds the norm of the individual rank-1 terms by some constant. Other constraints that result in a closed set, and, thus, well-posed optimization problem, include imposing positivity or a bounded inner product strictly less than unity between the rank-1 terms appearing in the sought decomposition.

Calculating the CPD

Alternating algorithms:

  • alternating least squares (ALS; a minimal sketch follows this list)
  • alternating slice-wise diagonalisation (ASD)
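
As an illustration of the alternating approach, the following is a minimal, hedged sketch of CP-ALS for an order-3 tensor (Python with NumPy; cp_als and khatri_rao are illustrative names, not a particular library's API, and the sketch omits normalization, convergence tests, and other safeguards). Each inner step fixes two factor matrices and solves a linear least-squares problem for the third.

```python
import numpy as np

def khatri_rao(U, V):
    """Column-wise Kronecker product: maps (I, R) and (J, R) to (I*J, R)."""
    I, R = U.shape
    J, _ = V.shape
    return np.einsum('ir,jr->ijr', U, V).reshape(I * J, R)

def cp_als(A, R, n_iter=200, seed=0):
    """Minimal CP-ALS sketch computing a rank-R approximation of a 3-way tensor A."""
    rng = np.random.default_rng(seed)
    F = [rng.standard_normal((dim, R)) for dim in A.shape]
    for _ in range(n_iter):
        for n in range(3):
            # Fix the other two factors and solve least squares for factor n,
            # using the mode-n unfolding of A.
            others = [F[m] for m in range(3) if m != n]  # kept in increasing mode order
            KR = khatri_rao(others[0], others[1])
            unfold = np.moveaxis(A, n, 0).reshape(A.shape[n], -1)
            gram = (others[0].T @ others[0]) * (others[1].T @ others[1])  # = KR.T @ KR
            F[n] = unfold @ KR @ np.linalg.pinv(gram)
    return F

# Usage: fit a random rank-2 tensor (ALS typically, but not provably, recovers it).
rng = np.random.default_rng(1)
G = [rng.standard_normal((d, 2)) for d in (4, 5, 3)]
A = np.einsum('ir,jr,kr->ijk', G[0], G[1], G[2])
F = cp_als(A, R=2)
A_hat = np.einsum('ir,jr,kr->ijk', F[0], F[1], F[2])
print(np.linalg.norm(A - A_hat) / np.linalg.norm(A))  # small relative residual
```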

Direct algorithms:

General optimization algorithms:

  • simultaneous diagonalization (SD)
  • simultaneous generalized Schur decomposition (SGSD)
  • Levenberg–Marquardt (LM)
  • nonlinear conjugate gradient (NCG)
  • limited memory BFGS (L-BFGS)

General polynomial system solving algorithms:

Applications

In machine learning, the CP decomposition is the central ingredient in learning probabilistic latent variable models via the technique of moment matching. For example, consider the multi-view model,[29] which is a probabilistic latent variable model. In this model, the generation of samples is posited as follows: there exists a hidden random variable that is not observed directly, given which there are several conditionally independent random variables known as the different "views" of the hidden variable. For simplicity, assume there are three symmetric views [math]\displaystyle{ x }[/math] of a [math]\displaystyle{ k }[/math]-state categorical hidden variable [math]\displaystyle{ h }[/math]. Then the empirical third moment of this latent variable model can be written as [math]\displaystyle{ T=\sum_{i=1}^{k}Pr(h=i) E[x|h=i]^{\otimes 3} }[/math].

In applications such as topic modeling, this can be interpreted as the co-occurrence of words in a document. Then the eigenvalues of this empirical moment tensor can be interpreted as the probabilities of choosing a specific topic, and each column of the factor matrix [math]\displaystyle{ E[x|h=k] }[/math] corresponds to the probabilities of words in the vocabulary for the corresponding topic.
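
The structure of this moment tensor can be made concrete with a small sketch (Python with NumPy; the topic proportions and per-topic word distributions below are synthetic and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
k, d = 3, 6                              # k hidden states, d-word vocabulary

pr_h = rng.dirichlet(np.ones(k))         # Pr(h = i)
mu = rng.dirichlet(np.ones(d), size=k)   # row i is E[x | h = i], a word distribution

# T = sum_i Pr(h=i) * E[x|h=i]^{(x)3}: a symmetric k-term CP decomposition.
T = np.einsum('i,ia,ib,ic->abc', pr_h, mu, mu, mu)

print(T.shape)  # (6, 6, 6)
print(T.sum())  # ~1.0: the entries form a joint distribution over word triples
```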

References

  1. Papalexakis, Evangelos. "Automatic Unsupervised Tensor Mining with Quality Assessment". https://www.cs.ucr.edu/~epapalex/papers/sdm16-autoten.pdf. 
  2. F. L. Hitchcock (1927). "The expression of a tensor or a polyadic as a sum of products". Journal of Mathematics and Physics 6 (1–4): 164–189. doi:10.1002/sapm192761164. 
  3. Hillar, C. J.; Lim, L. (2013). "Most tensor problems are NP-Hard". Journal of the ACM 60 (6): 1–39. doi:10.1145/2512329. 
  4. Landsberg, J. M. (2012). Tensors: Geometry and Applications. AMS. 
  5. de Silva, Vin; Lim, Lek-Heng (2008). "Tensor Rank and the Ill-Posedness of the Best Low-Rank Approximation Problem". SIAM Journal on Matrix Analysis and Applications 30 (3): 1084–1127. doi:10.1137/06066518x. 
  6. Strassen, V. (1983). "Rank and optimal computation of generic tensors". Linear Algebra and Its Applications 52/53: 645–685. doi:10.1016/0024-3795(83)80041-x. 
  7. "Ranks of tensors, secant varieties of Segre varieties and fat points". Linear Algebra and Its Applications 355 (1–3): 263–285. 2002. doi:10.1016/s0024-3795(02)00352-x. 
  8. Abo, H.; Ottaviani, G.; Peterson, C. (2009). "Induction for secant varieties of Segre varieties". Transactions of the American Mathematical Society 361 (2): 767–792. doi:10.1090/s0002-9947-08-04725-9. 
  9. Lickteig, Thomas (1985). "Typical tensorial rank". Linear Algebra and Its Applications 69: 95–120. doi:10.1016/0024-3795(85)90070-9. 
  10. Catalisano, M. V.; Geramita, A. V.; Gimigliano, A. (2011). "Secant varieties of [math]\displaystyle{ \mathbb{P}^1 \times \cdots \times \mathbb{P}^1 }[/math] (n-times) are not defective for n ≥ 5". Journal of Algebraic Geometry 20 (2): 295–327. doi:10.1090/s1056-3911-10-00537-0. 
  11. "On maximum, typical and generic ranks". Mathematische Annalen 362 (3–4): 1–11. 2015. doi:10.1007/s00208-014-1150-3. 
  12. Bini, D.; Lotti, G.; Romani, F. (1980). "Approximate solutions for the bilinear form computational problem". SIAM Journal on Computing 9 (4): 692–697. doi:10.1137/0209053. 
  13. Harris, Joe (1992). Algebraic Geometry: A First Course. Graduate Texts in Mathematics. 133. doi:10.1007/978-1-4757-2189-8. ISBN 978-1-4419-3099-6. 
  14. Chiantini, L.; Ottaviani, G.; Vannieuwenhoven, N. (2014-01-01). "An Algorithm For Generic and Low-Rank Specific Identifiability of Complex Tensors". SIAM Journal on Matrix Analysis and Applications 35 (4): 1265–1287. doi:10.1137/140961389. ISSN 0895-4798. 
  15. Bocci, Cristiano; Chiantini, Luca; Ottaviani, Giorgio (2014-12-01). "Refined methods for the identifiability of tensors". Annali di Matematica Pura ed Applicata 193 (6): 1691–1702. doi:10.1007/s10231-013-0352-8. ISSN 0373-3114. 
  16. Chiantini, L.; Ottaviani, G.; Vannieuwenhoven, N. (2017-01-01). "Effective Criteria for Specific Identifiability of Tensors and Forms". SIAM Journal on Matrix Analysis and Applications 38 (2): 656–681. doi:10.1137/16m1090132. ISSN 0895-4798. 
  17. Chiantini, L.; Ottaviani, G. (2012-01-01). "On Generic Identifiability of 3-Tensors of Small Rank". SIAM Journal on Matrix Analysis and Applications 33 (3): 1018–1037. doi:10.1137/110829180. ISSN 0895-4798. 
  18. Hauenstein, J. D.; Oeding, L.; Ottaviani, G.; Sommese, A. J. (2016). "Homotopy techniques for tensor decomposition and perfect identifiability". J. Reine Angew. Math. 2019 (753): 1–22. doi:10.1515/crelle-2016-0067. 
  19. Bocci, Cristiano; Chiantini, Luca (2013). "On the identifiability of binary Segre products". Journal of Algebraic Geometry 22 (1): 1–11. doi:10.1090/s1056-3911-2011-00592-4. ISSN 1056-3911. http://www.ams.org/jag/2013-22-01/S1056-3911-2011-00592-4/. 
  20. Domanov, Ignat; Lathauwer, Lieven De (January 2014). "Canonical Polyadic Decomposition of Third-Order Tensors: Reduction to Generalized Eigenvalue Decomposition". SIAM Journal on Matrix Analysis and Applications 35 (2): 636–660. doi:10.1137/130916084. ISSN 0895-4798. 
  21. Domanov, Ignat; De Lathauwer, Lieven (January 2017). "Canonical polyadic decomposition of third-order tensors: Relaxed uniqueness conditions and algebraic algorithm". Linear Algebra and Its Applications 513: 342–375. doi:10.1016/j.laa.2016.10.019. ISSN 0024-3795. 
  22. Faber, Nicolaas (Klaas) M.; Ferré, Joan; Boqué, Ricard (January 2001). "Iteratively reweighted generalized rank annihilation method". Chemometrics and Intelligent Laboratory Systems 55 (1–2): 67–90. doi:10.1016/s0169-7439(00)00117-9. ISSN 0169-7439. 
  23. Leurgans, S. E.; Ross, R. T.; Abel, R. B. (October 1993). "A Decomposition for Three-Way Arrays". SIAM Journal on Matrix Analysis and Applications 14 (4): 1064–1083. doi:10.1137/0614071. ISSN 0895-4798. 
  24. Lorber, Avraham. (October 1985). "Features of quantifying chemical composition from two-dimensional data array by the rank annihilation factor analysis method". Analytical Chemistry 57 (12): 2395–2397. doi:10.1021/ac00289a052. ISSN 0003-2700. 
  25. Sanchez, Eugenio; Kowalski, Bruce R. (January 1990). "Tensorial resolution: A direct trilinear decomposition". Journal of Chemometrics 4 (1): 29–45. doi:10.1002/cem.1180040105. ISSN 0886-9383. 
  26. Sands, Richard; Young, Forrest W. (March 1980). "Component models for three-way data: An alternating least squares algorithm with optimal scaling features". Psychometrika 45 (1): 39–67. doi:10.1007/bf02293598. ISSN 0033-3123. 
  27. Bernardi, A.; Brachat, J.; Comon, P.; Mourrain, B. (May 2013). "General tensor decomposition, moment matrices and applications". Journal of Symbolic Computation 52: 51–71. doi:10.1016/j.jsc.2012.05.012. ISSN 0747-7171. 
  28. Bernardi, Alessandra; Daleo, Noah S.; Hauenstein, Jonathan D.; Mourrain, Bernard (December 2017). "Tensor decomposition and homotopy continuation". Differential Geometry and Its Applications 55: 78–105. doi:10.1016/j.difgeo.2017.07.009. ISSN 0926-2245. 
  29. Anandkumar, Animashree; Ge, Rong; Hsu, Daniel; Kakade, Sham M; Telgarsky, Matus (2014). "Tensor decompositions for learning latent variable models". The Journal of Machine Learning Research 15 (1): 2773–2832. 

Further reading

  • Kolda, Tamara G.; Bader, Brett W. (2009). "Tensor Decompositions and Applications". SIAM Rev. 51 (3): 455–500. doi:10.1137/07070111X. Bibcode: 2009SIAMR..51..455K. 
  • Landsberg, Joseph M. (2012). Tensors: Geometry and Applications. AMS. 
