Tensors for Data Processing: Theory, Methods, and Applications

About this ebook

Tensors for Data Processing: Theory, Methods and Applications presents both classical and state-of-the-art methods on tensor computation for data processing, covering computation theories, processing methods, computing and engineering applications, with an emphasis on techniques for data processing. This reference is ideal for students, researchers and industry developers who want to understand and use tensor-based data processing theories and methods.

As a higher-order generalization of a matrix, tensor-based processing can avoid the multilinear data structure loss that occurs in classical matrix-based data processing methods. This move from matrices to tensors is beneficial for many diverse application areas, including signal processing, computer science, acoustics, neuroscience, communication, medical engineering, seismology, psychometrics, chemometrics, biometrics, quantum physics, and quantum chemistry.

  • Provides a complete reference on classical and state-of-the-art tensor-based methods for data processing
  • Includes a wide range of applications from different disciplines
  • Gives guidance for their application
Language: English
Release date: Oct 21, 2021
ISBN: 9780323859653

    Book preview

    Tensors for Data Processing - Yipeng Liu

    Preface

    Yipeng Liu     Chengdu, China

    This book provides an overview of tensors for data processing, covering computing theories, processing methods, and engineering applications. The tensor extensions of a series of classical multidimensional data processing techniques are discussed in this book. Students can read this book to get an overall understanding, researchers can update their knowledge on the recent research advances in the field, and engineers can refer to the implementations for various applications. Many thanks go to all the contributors.

    The first chapter is an introduction to tensor decomposition. The following chapters provide variants of tensor decompositions with their efficient and effective solutions, including some parallel algorithms, Riemannian algorithms, and generalized thresholding algorithms. Some tensor-based machine learning methods are summarized in detail, including tensor completion, tensor principal component analysis, support tensor machines, tensor-based kernel learning, tensor-based deep learning, etc. To demonstrate that tensors can effectively and systematically enhance performance in practical engineering problems, this book gives implementation details of many applications, such as signal recovery, recommender systems, climate forecasting, image clustering, image classification, network compression, data fusion, image enhancement, neuroimaging, and remote sensing.

    I sincerely hope this book can serve to introduce tensors to more data scientists and engineers. As a natural representation of multidimensional data, tensors can substantially avoid the information loss that arises in matrix representations of multiway data, and tensor operators can model more connections than their matrix counterparts. The related advances in applied mathematics allow us to move from matrices to tensors for data processing. I hope this book will motivate novel tensor theories and new data processing methods, and stimulate the development of a wide range of practical applications.

    Aug. 10, 2021

    Chapter 1: Tensor decompositions: computations, applications, and challenges

    Yingyue Bi; Yingcong Lu; Zhen Long; Ce Zhu; Yipeng Liu    School of Information and Communication Engineering, University of Electronic Science and Technology of China (UESTC), Chengdu, China

    Abstract

    Many classical data processing techniques rely on the representation and computation of vector and matrix forms, where vectorization or matricization is often employed on multidimensional data. However, during this process, vital underlying structure information can be lost, leading to suboptimal processing performance. As a natural representation for multidimensional data, the tensor has drawn a great deal of attention over the past several years. Tensor decompositions are effective tools for tensor analysis. They have been intensively investigated in a number of areas, such as signal processing, machine learning, neuroscience, communication, psychometrics, chemometrics, biometrics, quantum physics, and quantum chemistry. In this chapter, we give a brief introduction to the basic operations of tensors and illustrate them with a few examples in data processing. To better manipulate tensor data, a number of tensor operators are defined, especially different tensor products. In addition, we elaborate on a series of important tensor decompositions and their properties, including CANDECOMP/PARAFAC decomposition, Tucker decomposition, tensor singular value decomposition, block term decomposition, tensor tree decomposition, tensor train decomposition, and tensor ring decomposition. We further present a list of machine learning techniques based on tensor decompositions, such as tensor dictionary learning, tensor completion, robust tensor principal component analysis, tensor regression, statistical tensor classification, coupled tensor fusion, and deep tensor neural networks. Finally, we discuss the challenges and opportunities for data processing under a tensor scheme.

    Keywords

    Tensor computation; Tensor norm; Tensor product; CANDECOMP/PARAFAC decomposition; Tucker decomposition; Tensor singular value decomposition; Block term decomposition; Tensor network; Higher-order tensor; Multilinear subspace analysis

    Chapter Outline

    1.1  Introduction

    1.1.1  What is a tensor?

    1.1.2  Why do we need tensors?

    1.2  Tensor operations

    1.2.1  Tensor notations

    1.2.2  Matrix operators

    1.2.3  Tensor transformations

    1.2.4  Tensor products

    1.2.5  Structural tensors

    1.2.6  Summary

    1.3  Tensor decompositions

    1.3.1  Tucker decomposition

    1.3.2  Canonical polyadic decomposition

    1.3.3  Block term decomposition

    1.3.4  Tensor singular value decomposition

    1.3.5  Tensor network

    1.4  Tensor processing techniques

    1.5  Challenges

    References

    1.1 Introduction

    1.1.1 What is a tensor?

    The tensor can be seen as a higher-order generalization of vector and matrix, which normally has three or more modes (ways) [1]. For example, a color image is a third-order tensor. It has two spatial modes and one channel mode. Similarly, a color video is a fourth-order tensor; its extra mode denotes time.

    As special forms of tensors, a vector $\mathbf{a} \in \mathbb{R}^{I}$ is a first-order tensor whose $i$-th entry (scalar) is $a_i$, and a matrix $\mathbf{A} \in \mathbb{R}^{I_1 \times I_2}$ is a second-order tensor whose $(i_1, i_2)$-th element is $a_{i_1 i_2}$. A general $N$-th-order tensor can be mathematically denoted as $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ and its $(i_1, i_2, \cdots, i_N)$-th entry is $a_{i_1 i_2 \cdots i_N}$. For example, a third-order tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is illustrated in Fig. 1.1.

    Figure 1.1 A third-order tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$.

    1.1.2 Why do we need tensors?

    Tensors play important roles in a number of applications, such as signal processing, machine learning, biomedical engineering, neuroscience, computer vision, communication, psychometrics, and chemometrics. They can provide a concise mathematical framework for formulating and solving problems in those fields.

    Here are a few cases involving tensor frameworks:

    •  Many spatial-temporal signals in speech and image processing are multidimensional. Tensor factorization-based techniques can effectively extract features for enhancement, classification, regression, etc. For example, nonnegative canonical polyadic (CP) decomposition can be used for speech signal separation where the first two components of CP decomposition represent frequency and time structure of the signal and the last component is the coefficient matrix [2].

    •  The fluorescence excitation–emission data, commonly used in chemistry, medicine, and food science, has several chemical components with different concentrations. It can be denoted as a third-order tensor; its three modes represent sample, excitation, and emission. Taking advantage of CP decomposition, the tensor can be factorized into three factor matrices: relative excitation spectral matrix, relative emission spectral matrix, and relative concentration matrix. In this way, tensor decomposition can be applied to analyze the components and corresponding concentrations in each sample [3].

    •  Social data often have multidimensional structures, which can be exploited by tensor-based techniques for data mining. For example, the three modes of chat data are user, keyword, and time. Tensor analysis can reveal the communication patterns and the hidden structures in social networks, and this can benefit tasks like recommender systems [4].

    1.2 Tensor operations

    In this section, we first introduce tensor notations, i.e., fibers and slices, and then demonstrate how to represent tensors in a graphical way. Before we discuss tensor operations, several matrix operations are reviewed.

    1.2.1 Tensor notations

    Subtensors, such as fibers and slices, can be formed from the original tensor. A fiber is defined by fixing all the indices but one, and a slice is defined by fixing all but two indices. For a third-order tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, its mode-1, mode-2, and mode-3 fibers are denoted by $\mathbf{a}_{: i_2 i_3}$, $\mathbf{a}_{i_1 : i_3}$, and $\mathbf{a}_{i_1 i_2 :}$, where $i_1 = 1, \cdots, I_1$, $i_2 = 1, \cdots, I_2$, and $i_3 = 1, \cdots, I_3$, as illustrated in Fig. 1.2. Its horizontal slices $\mathbf{A}_{i_1 : :}$, lateral slices $\mathbf{A}_{: i_2 :}$, and frontal slices $\mathbf{A}_{: : i_3}$ are shown in Fig. 1.3. For ease of notation, we refer to the frontal slice $\mathbf{A}_{: : i_3}$ of $\mathcal{A}$ as $\mathbf{A}^{(i_3)}$ in some formulas.

    Figure 1.2 The illustration of mode-1 fibers $\mathbf{a}_{: i_2 i_3}$, mode-2 fibers $\mathbf{a}_{i_1 : i_3}$, and mode-3 fibers $\mathbf{a}_{i_1 i_2 :}$ with $i_1 = 1, \cdots, I_1$, $i_2 = 1, \cdots, I_2$, and $i_3 = 1, \cdots, I_3$.

    Figure 1.3 The illustration of horizontal slices $\mathbf{A}_{i_1 : :}$, lateral slices $\mathbf{A}_{: i_2 :}$, and frontal slices $\mathbf{A}_{: : i_3}$ with $i_1 = 1, \cdots, I_1$, $i_2 = 1, \cdots, I_2$, and $i_3 = 1, \cdots, I_3$.
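
    As a concrete illustration of these notations, the following minimal numpy sketch (not taken from the book; the array sizes are arbitrary assumptions) extracts fibers and slices simply by fixing indices:

```python
import numpy as np

A = np.arange(24).reshape(3, 4, 2)      # I1 = 3, I2 = 4, I3 = 2

# Fibers fix all indices but one.
mode1_fiber = A[:, 1, 0]                # length I1
mode2_fiber = A[0, :, 1]                # length I2
mode3_fiber = A[2, 3, :]                # length I3

# Slices fix all indices but two.
horizontal_slice = A[0, :, :]           # size I2 x I3
lateral_slice    = A[:, 2, :]           # size I1 x I3
frontal_slice    = A[:, :, 1]           # size I1 x I2

print(mode1_fiber.shape, horizontal_slice.shape, frontal_slice.shape)
```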

    Other than the aforementioned notations, there is another way to denote tensors and their operations [5]. Taking advantage of graphical representations, tensors can be denoted by nodes and edges in a straightforward way. Graphical representations for scalars, vectors, matrices, and tensors are shown in Fig. 1.4. The number next to the edge represents the indices of the corresponding mode.

    Figure 1.4 Graphical representations of scalar, vector, matrix and tensor.

    1.2.2 Matrix operators

    Definition 1.2.1

    (Matrix trace [6]) The trace of a matrix $\mathbf{A} \in \mathbb{R}^{I \times I}$ is obtained by summing all its diagonal entries, i.e., $\operatorname{tr}(\mathbf{A}) = \sum_{i=1}^{I} a_{ii}$.

    Definition 1.2.2

    ( -norm [6]) For matrix , its -norm is defined as

    (1.1)

    Definition 1.2.3

    (Matrix nuclear norm [7]) The nuclear norm of matrix $\mathbf{A}$ is denoted as $\| \mathbf{A} \|_* = \sum_{i} \sigma_i(\mathbf{A})$, where $\sigma_i(\mathbf{A})$ is the $i$-th largest singular value of $\mathbf{A}$.

    Definition 1.2.4

    (Hadamard product [8]) The Hadamard product for matrices $\mathbf{A} \in \mathbb{R}^{I_1 \times I_2}$ and $\mathbf{B} \in \mathbb{R}^{I_1 \times I_2}$ is defined as $\mathbf{A} \circledast \mathbf{B} \in \mathbb{R}^{I_1 \times I_2}$ with

    $(\mathbf{A} \circledast \mathbf{B})_{i_1 i_2} = a_{i_1 i_2} b_{i_1 i_2}.$  (1.2)

    Definition 1.2.5

    (Kronecker product [9]) The Kronecker product of matrices $\mathbf{A} \in \mathbb{R}^{I_1 \times I_2}$ and $\mathbf{B} \in \mathbb{R}^{J_1 \times J_2}$ is defined as $\mathbf{A} \otimes \mathbf{B} \in \mathbb{R}^{I_1 J_1 \times I_2 J_2}$, which can be written mathematically as

    $\mathbf{A} \otimes \mathbf{B} = \begin{bmatrix} a_{11} \mathbf{B} & a_{12} \mathbf{B} & \cdots & a_{1 I_2} \mathbf{B} \\ a_{21} \mathbf{B} & a_{22} \mathbf{B} & \cdots & a_{2 I_2} \mathbf{B} \\ \vdots & \vdots & \ddots & \vdots \\ a_{I_1 1} \mathbf{B} & a_{I_1 2} \mathbf{B} & \cdots & a_{I_1 I_2} \mathbf{B} \end{bmatrix}.$  (1.3)

    Based on the Kronecker product, a lot of useful properties can be derived. Given matrices $\mathbf{A}$, $\mathbf{B}$, $\mathbf{C}$, $\mathbf{D}$, we have

    $(\mathbf{A} \otimes \mathbf{B})(\mathbf{C} \otimes \mathbf{D}) = (\mathbf{A}\mathbf{C}) \otimes (\mathbf{B}\mathbf{D}), \quad (\mathbf{A} \otimes \mathbf{B})^{\mathrm{T}} = \mathbf{A}^{\mathrm{T}} \otimes \mathbf{B}^{\mathrm{T}}, \quad (\mathbf{A} \otimes \mathbf{B})^{\dagger} = \mathbf{A}^{\dagger} \otimes \mathbf{B}^{\dagger},$  (1.4)

    where $\mathbf{A}^{\mathrm{T}}$ and $\mathbf{A}^{\dagger}$ represent the transpose and Moore–Penrose inverse of matrix $\mathbf{A}$, respectively.

    Definition 1.2.6

    (Khatri–Rao product [10]) The Khatri–Rao product of matrices $\mathbf{A} = [\mathbf{a}_1, \cdots, \mathbf{a}_R] \in \mathbb{R}^{I \times R}$ and $\mathbf{B} = [\mathbf{b}_1, \cdots, \mathbf{b}_R] \in \mathbb{R}^{J \times R}$ is defined as

    $\mathbf{A} \odot \mathbf{B} = [\mathbf{a}_1 \otimes \mathbf{b}_1, \mathbf{a}_2 \otimes \mathbf{b}_2, \cdots, \mathbf{a}_R \otimes \mathbf{b}_R] \in \mathbb{R}^{IJ \times R}.$  (1.5)

    Similar to the Kronecker product, the Khatri–Rao product also has some convenient properties, such as

    $(\mathbf{A} \odot \mathbf{B})^{\mathrm{T}} (\mathbf{A} \odot \mathbf{B}) = (\mathbf{A}^{\mathrm{T}} \mathbf{A}) \circledast (\mathbf{B}^{\mathrm{T}} \mathbf{B}), \quad (\mathbf{A} \odot \mathbf{B})^{\dagger} = \left( (\mathbf{A}^{\mathrm{T}} \mathbf{A}) \circledast (\mathbf{B}^{\mathrm{T}} \mathbf{B}) \right)^{\dagger} (\mathbf{A} \odot \mathbf{B})^{\mathrm{T}}.$  (1.6)
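
    The following short numpy sketch (an illustrative assumption, not code from the book) computes the Hadamard, Kronecker, and Khatri–Rao products for small random matrices and verifies the first property in Eq. (1.6):

```python
import numpy as np

I, J, R = 4, 3, 2
A = np.random.randn(I, R)
B = np.random.randn(J, R)
C = np.random.randn(I, R)

hadamard = A * C                                   # element-wise product of same-size matrices
kron = np.kron(A, B)                               # Kronecker product, size (I*J) x (R*R)

# Khatri-Rao product: column-wise Kronecker product, size (I*J) x R.
khatri_rao = np.column_stack(
    [np.kron(A[:, r], B[:, r]) for r in range(R)]
)

# Check (A ⊙ B)^T (A ⊙ B) = (A^T A) ⊛ (B^T B).
lhs = khatri_rao.T @ khatri_rao
rhs = (A.T @ A) * (B.T @ B)
assert np.allclose(lhs, rhs)
```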

    1.2.3 Tensor transformations

    Definition 1.2.7

    (Tensor transpose [11]) Given a tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, whose frontal slices are $\mathbf{A}^{(i_3)}$, $i_3 = 1, \cdots, I_3$, its transpose $\mathcal{A}^{\mathrm{T}} \in \mathbb{R}^{I_2 \times I_1 \times I_3}$ is acquired by first transposing each of the frontal slices and then placing them in the order of $\mathbf{A}^{(1)\mathrm{T}}$, $\mathbf{A}^{(I_3)\mathrm{T}}$, $\mathbf{A}^{(I_3 - 1)\mathrm{T}}$, ⋯, $\mathbf{A}^{(2)\mathrm{T}}$ along the third mode.

    Fig. 1.5 demonstrates the tensor transpose of $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$.

    Figure 1.5 A graphical illustration of the tensor transpose on $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$.

    Definition 1.2.8

    (Tensor mode-n matricization [1]) For tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, its matricization along the $n$-th mode is denoted as $\mathbf{A}_{(n)} \in \mathbb{R}^{I_n \times I_1 \cdots I_{n-1} I_{n+1} \cdots I_N}$, as shown in Fig. 1.6. It rearranges the mode-$n$ fibers of $\mathcal{A}$ to form the columns of $\mathbf{A}_{(n)}$. For instance, there exists a third-order tensor whose frontal slices are

    (1.7)

    Thus, its mode-1, mode-2, and mode-3 matricizations can be written as

    (1.8)

    (1.9)

    (1.10)

    Figure 1.6 A graphical illustration of tensor mode- n matricization for .
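
    A small numpy sketch of mode-n matricization and its inverse is given below (an illustrative assumption, not the book's code). Note that different references order the columns of the unfolding differently; here the remaining modes are flattened with the earlier modes varying fastest, which agrees with the convention in [1]:

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n matricization A_(n) (0-based mode); columns are mode-n fibers."""
    return np.reshape(np.moveaxis(tensor, mode, 0),
                      (tensor.shape[mode], -1), order='F')

def fold(matrix, mode, shape):
    """Inverse of `unfold`: rebuild a tensor of the given shape."""
    full_shape = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(np.reshape(matrix, full_shape, order='F'), 0, mode)

A = np.arange(24).reshape(3, 4, 2)
A1 = unfold(A, 0)                          # size I1 x (I2*I3)
assert A1.shape == (3, 8)
assert np.array_equal(fold(A1, 0, A.shape), A)
```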

    Definition 1.2.9

    (Tensor n-th canonical matricization [12]) For a fixed index $n \in \{1, 2, \cdots, N\}$, the $n$-th canonical matricization of tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ can be defined as

    $\left( \mathbf{A}_{\langle n \rangle} \right)_{\overline{i_1 i_2 \cdots i_n}, \, \overline{i_{n+1} \cdots i_N}} = a_{i_1 i_2 \cdots i_N},$  (1.11)

    where $\overline{i_1 i_2 \cdots i_n}$, $\overline{i_{n+1} \cdots i_N}$ are multiindices and $\mathbf{A}_{\langle n \rangle} \in \mathbb{R}^{I_1 I_2 \cdots I_n \times I_{n+1} \cdots I_N}$.

    Take the multiindex $\overline{i_1 i_2 \cdots i_N}$ as an example, where $i_n = 1, \cdots, I_n$, $n = 1, \cdots, N$. It can either be defined using the little-endian convention (reverse lexicographic ordering) [13]

    $\overline{i_1 i_2 \cdots i_N} = i_1 + (i_2 - 1) I_1 + (i_3 - 1) I_1 I_2 + \cdots + (i_N - 1) I_1 I_2 \cdots I_{N-1}$  (1.12)

    or the big-endian convention (colexicographic ordering)

    $\overline{i_1 i_2 \cdots i_N} = i_N + (i_{N-1} - 1) I_N + \cdots + (i_1 - 1) I_2 I_3 \cdots I_N.$  (1.13)
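
    In numpy terms, the two conventions simply correspond to Fortran-style and C-style index linearization; the tiny sketch below (an assumption for illustration, using 0-based indices) makes the correspondence explicit:

```python
import numpy as np

dims = (3, 4, 2)                  # I1, I2, I3
idx = (1, 2, 0)                   # (i1, i2, i3), 0-based

little_endian = np.ravel_multi_index(idx, dims, order='F')  # i1 varies fastest
big_endian    = np.ravel_multi_index(idx, dims, order='C')  # iN varies fastest
print(little_endian, big_endian)
```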

    1.2.4 Tensor products

    Definition 1.2.10

    (Tensor inner product [1]) The inner product of two tensors $\mathcal{A}, \mathcal{B} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, shown in Fig. 1.7, is expressed as

    $\langle \mathcal{A}, \mathcal{B} \rangle = \sum_{i_1 = 1}^{I_1} \sum_{i_2 = 1}^{I_2} \cdots \sum_{i_N = 1}^{I_N} a_{i_1 i_2 \cdots i_N} b_{i_1 i_2 \cdots i_N}.$  (1.14)

    Figure 1.7 A graphical illustration of the tensor inner product.

    Definition 1.2.11

    (Tensor norm [1]) The norm of a tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is the square root of the summation over the squares of all its elements, which can be expressed as

    $\| \mathcal{A} \| = \sqrt{\langle \mathcal{A}, \mathcal{A} \rangle} = \sqrt{\sum_{i_1 = 1}^{I_1} \sum_{i_2 = 1}^{I_2} \cdots \sum_{i_N = 1}^{I_N} a_{i_1 i_2 \cdots i_N}^2}.$  (1.15)

    Definition 1.2.12

    (Tensor mode-n product with a matrix [1]) The tensor mode-n product of $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ and matrix $\mathbf{U} \in \mathbb{R}^{J \times I_n}$ is denoted as

    $\mathcal{B} = \mathcal{A} \times_n \mathbf{U} \in \mathbb{R}^{I_1 \times \cdots \times I_{n-1} \times J \times I_{n+1} \times \cdots \times I_N},$  (1.16)

    or element-wisely,

    $b_{i_1 \cdots i_{n-1} \, j \, i_{n+1} \cdots i_N} = \sum_{i_n = 1}^{I_n} a_{i_1 i_2 \cdots i_N} u_{j i_n}.$  (1.17)

    A visual illustration is shown in Fig. 1.8.

    Figure 1.8 A graphical illustration of the tensor mode- n product.

    Taking advantage of tensor matricization, Eq. (1.16) can also be expressed in an unfolded form as

    $\mathbf{B}_{(n)} = \mathbf{U} \mathbf{A}_{(n)}.$  (1.18)

    For example, given tensor (Eq. (1.7)) and matrix , the mode-n product will yield a tensor , whose frontal slices are

    (1.19)
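
    A brief numpy sketch of the mode-n product (an illustrative assumption with arbitrary sizes, not the book's code) is given below; the last two lines preview the mode-n product with a vector defined next in Definition 1.2.13:

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """Compute B = A x_n U for a matrix U of size J x I_n (0-based mode)."""
    # Contract the selected tensor mode with the columns of the matrix,
    # then move the resulting axis back to position `mode`.
    return np.moveaxis(np.tensordot(tensor, matrix, axes=(mode, 1)), -1, mode)

A = np.random.randn(3, 4, 2)
U = np.random.randn(5, 4)                     # J x I_2
B = mode_n_product(A, U, 1)                   # size 3 x 5 x 2
assert B.shape == (3, 5, 2)

# Mode-n product with a vector removes the mode and lowers the order by one.
b = np.random.randn(4)
C = np.tensordot(A, b, axes=(1, 0))           # size 3 x 2
assert C.shape == (3, 2)
```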

    Definition 1.2.13

    (Tensor mode-n product with a vector [1]) The tensor mode-n product of the tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ and vector $\mathbf{b} \in \mathbb{R}^{I_n}$ is denoted as

    $\mathcal{C} = \mathcal{A} \bar{\times}_n \mathbf{b} \in \mathbb{R}^{I_1 \times \cdots \times I_{n-1} \times I_{n+1} \times \cdots \times I_N},$  (1.20)

    with entries

    $c_{i_1 \cdots i_{n-1} i_{n+1} \cdots i_N} = \sum_{i_n = 1}^{I_n} a_{i_1 i_2 \cdots i_N} b_{i_n}.$  (1.21)

    For example, given tensor in Eq. (1.7) and vector , we have

    (1.22)

    It can be clearly seen that the operation of multiplying a tensor by a matrix will not change the number of ways of the tensor. However, if a tensor is multiplied by a vector, the number of ways will decrease.

    Definition 1.2.14

    (t-product [11]) The t-product of $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ and $\mathcal{B} \in \mathbb{R}^{I_2 \times J \times I_3}$ is defined as

    $\mathcal{C} = \mathcal{A} * \mathcal{B} = \operatorname{fold} \left( \operatorname{circ}(\mathcal{A}) \cdot \operatorname{MatVec}(\mathcal{B}) \right),$  (1.23)

    where $\mathcal{C} \in \mathbb{R}^{I_1 \times J \times I_3}$,

    $\operatorname{MatVec}(\mathcal{B}) = \begin{bmatrix} \mathbf{B}^{(1)} \\ \mathbf{B}^{(2)} \\ \vdots \\ \mathbf{B}^{(I_3)} \end{bmatrix}$

    represents the block matrix [11] of $\mathcal{B}$, and

    $\operatorname{circ}(\mathcal{A}) = \begin{bmatrix} \mathbf{A}^{(1)} & \mathbf{A}^{(I_3)} & \cdots & \mathbf{A}^{(2)} \\ \mathbf{A}^{(2)} & \mathbf{A}^{(1)} & \cdots & \mathbf{A}^{(3)} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{A}^{(I_3)} & \mathbf{A}^{(I_3 - 1)} & \cdots & \mathbf{A}^{(1)} \end{bmatrix}$

    is the block-circulant matrix [11] of $\mathcal{A}$, where $\mathbf{A}^{(i_3)}$ and $\mathbf{B}^{(i_3)}$, $i_3 = 1, \cdots, I_3$, represent the $i_3$-th frontal slice of $\mathcal{A}$ and $\mathcal{B}$, respectively.
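
    The following compact numpy sketch (an illustrative assumption, not the book's implementation) builds circ(⋅) and MatVec(⋅) explicitly and evaluates the t-product as in Eq. (1.23); Section 1.3.4 shows that the same product can be computed more efficiently in the Fourier domain:

```python
import numpy as np

def t_product(A, B):
    """t-product of A (I1 x I2 x I3) and B (I2 x J x I3) via circ(.) and MatVec(.)."""
    I1, I2, I3 = A.shape
    _,  J,  _  = B.shape
    # Block-circulant matrix of A: block (p, q) is the frontal slice with index (p - q) mod I3.
    circ = np.block([[A[:, :, (p - q) % I3] for q in range(I3)] for p in range(I3)])
    # MatVec(B): frontal slices of B stacked vertically.
    matvec = B.transpose(2, 0, 1).reshape(I3 * I2, J)
    C = circ @ matvec                                  # size (I3*I1) x J
    # fold(.): reshape the stacked result back into an I1 x J x I3 tensor.
    return C.reshape(I3, I1, J).transpose(1, 2, 0)

A = np.random.randn(3, 4, 5)
B = np.random.randn(4, 2, 5)
C = t_product(A, B)
assert C.shape == (3, 2, 5)
```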

    Definition 1.2.15

    (Tensor contraction [5]) Given two tensors $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_M}$ and $\mathcal{B} \in \mathbb{R}^{J_1 \times J_2 \times \cdots \times J_N}$, suppose they have $L$ equal indices $\{K_1, K_2, \cdots, K_L\}$ in $\{I_1, I_2, \cdots, I_M\}$ and $\{J_1, J_2, \cdots, J_N\}$. The contraction of these two tensors yields an $(M + N - 2L)$-th-order tensor $\mathcal{C}$, whose entries can be calculated by summing over the shared indices. Assuming (after a suitable permutation of modes) that the shared indices are the last $L$ modes of $\mathcal{A}$ and the first $L$ modes of $\mathcal{B}$,

    $c_{i_1 \cdots i_{M-L} \, j_{L+1} \cdots j_N} = \sum_{k_1 = 1}^{K_1} \cdots \sum_{k_L = 1}^{K_L} a_{i_1 \cdots i_{M-L} \, k_1 \cdots k_L} \, b_{k_1 \cdots k_L \, j_{L+1} \cdots j_N}.$  (1.24)

    A graphical illustration of tensor contraction is shown in Fig. 1.9.

    Figure 1.9 Graphical representation of contraction of two tensors, and , where { K 1 , K 2 ,⋯, K L } denotes the L equal indices in { I 1 , I 2 ,⋯, I M } and { J 1 , J 2 ,⋯, J N }.

    For example, given tensors $\mathcal{A} \in \mathbb{R}^{3 \times 4 \times 2 \times 6 \times 7}$ and $\mathcal{B} \in \mathbb{R}^{2 \times 5 \times 7 \times 8 \times 4}$, based on the aforementioned definition, we can conclude that $M = N = 5$, $L = 3$, $K_1 = I_2 = J_5 = 4$, $K_2 = I_3 = J_1 = 2$, and $K_3 = I_5 = J_3 = 7$. As shown in Fig. 1.10, the result of tensor contraction $\mathcal{C}$ is of the size of $3 \times 6 \times 5 \times 8$, and its entries are

    $c_{i_1 i_4 j_2 j_4} = \sum_{k_1 = 1}^{4} \sum_{k_2 = 1}^{2} \sum_{k_3 = 1}^{7} a_{i_1 k_1 k_2 i_4 k_3} \, b_{k_2 j_2 k_3 j_4 k_1}.$  (1.25)

    Figure 1.10 The contraction of two tensors, and , where K 1  =  I 2  =  J 5  = 4, K 2  =  I 3  =  J 1  = 2, K 3  =  I 5  =  J 3  = 7, I 1  = 3, I 4  = 6, J 2  = 5, and J 4  = 8.

    Consider a special case when $L = 1$ and $K_1 = I_m = J_n$, as demonstrated in Fig. 1.11. The contraction of tensors $\mathcal{A}$ and $\mathcal{B}$ results in an $(M + N - 2)$-th-order tensor $\mathcal{C}$, whose entries can be calculated by

    $c_{i_1 \cdots i_{m-1} i_{m+1} \cdots i_M \, j_1 \cdots j_{n-1} j_{n+1} \cdots j_N} = \sum_{k = 1}^{K_1} a_{i_1 \cdots i_{m-1} \, k \, i_{m+1} \cdots i_M} \, b_{j_1 \cdots j_{n-1} \, k \, j_{n+1} \cdots j_N}.$  (1.26)

    Figure 1.11 A graphical representation of contraction over two tensors, and , where K 1  =  I m  =  J n .
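
    In numpy, tensor contraction is provided by np.tensordot. The sketch below (an assumption for illustration) reuses the sizes from Fig. 1.10 and the single-shared-mode case of Fig. 1.11:

```python
import numpy as np

A = np.random.randn(3, 4, 2, 6, 7)      # modes I1..I5
B = np.random.randn(2, 5, 7, 8, 4)      # modes J1..J5

# Contract I2 with J5, I3 with J1, and I5 with J3 (0-based axis numbers).
C = np.tensordot(A, B, axes=([1, 2, 4], [4, 0, 2]))
assert C.shape == (3, 6, 5, 8)          # remaining free modes I1, I4, J2, J4

# Special case with a single shared mode (L = 1), here I3 = J1 = 2.
D = np.tensordot(A, B, axes=([2], [0]))
assert D.shape == (3, 4, 6, 7, 5, 7, 8, 4)
```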

    1.2.5 Structural tensors

    Definition 1.2.16

    (Identity tensor [11]) An identity tensor is a tensor whose first frontal slice is an identity matrix and the rest are zero matrices.

    Definition 1.2.17

    (Orthogonal tensor [14]) Using the t-product, an orthogonal tensor $\mathcal{Q} \in \mathbb{R}^{I \times I \times I_3}$ is defined as one satisfying

    $\mathcal{Q}^{\mathrm{T}} * \mathcal{Q} = \mathcal{Q} * \mathcal{Q}^{\mathrm{T}} = \mathcal{I}.$  (1.27)

    Definition 1.2.18

    (Rank-1 tensor [1]) A rank-1 tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is formed by the outer product of vectors, as shown in Fig. 1.12. Its mathematical formulation can be written as

    $\mathcal{X} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c},$  (1.28)

    where ∘ means the outer product. Therefore, the entries of $\mathcal{X}$ can be written as $x_{i_1 i_2 i_3} = a_{i_1} b_{i_2} c_{i_3}$. Generalizing it to the N-th-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, we have

    $\mathcal{X} = \mathbf{a}^{(1)} \circ \mathbf{a}^{(2)} \circ \cdots \circ \mathbf{a}^{(N)}.$  (1.29)

    Figure 1.12 A rank-1 tensor $\mathcal{X} = \mathbf{a} \circ \mathbf{b} \circ \mathbf{c}$.

    Definition 1.2.19

    (Diagonal tensor [1]) Tensor $\mathcal{D} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is a diagonal tensor if and only if all its nonzero elements are on the superdiagonal line. Specifically, if $\mathcal{D}$ is a diagonal tensor, then we have $d_{i_1 i_2 \cdots i_N} \neq 0$ if and only if $i_1 = i_2 = \cdots = i_N$. A graphical illustration of a third-order diagonal tensor is demonstrated in Fig. 1.13.

    Figure 1.13 A third-order diagonal tensor .

    Definition 1.2.20

    (f-diagonal tensor [14]) An f-diagonal tensor is a tensor with diagonal frontal slices. A third-order f-diagonal tensor is visualized in Fig. 1.14.

    Figure 1.14 An f -diagonal tensor .

    1.2.6 Summary

    In this section, we first briefly described some notations of tensor representations. Then by giving basic operations of matrices, we discussed several common tensor operations, including tensor transformations and tensor products. Concepts of structural tensors such as orthogonal tensor, diagonal tensor, and f-diagonal tensor are also given. It is worth noting that we only focus on the most commonly used definitions; for more information, please refer to [1], [5], and [6].

    1.3 Tensor decompositions

    The idea of tensor decomposition was first put forward by Hitchcock in 1927 and has been developed by many scholars since then. Traditionally, it was applied in psychometrics and chemometrics. With the growing prosperity of tensor decomposition [15–18], it began to draw attention in other fields, including signal processing [19–21], numerical linear algebra [22,23], computer vision [24], numerical analysis [25,26], and data mining [27–29]. Meanwhile, different decomposition approaches were developed to meet various requirements.

    In this section, we first discuss two cornerstones, Tucker decomposition and CP decomposition, and go through some other methods like block term decomposition (BTD), tensor singular value decomposition (t-SVD), and tensor networks (TNs).

    1.3.1 Tucker decomposition

    In 1963, Tucker decomposition was first proposed by Tucker [30] and later refined by Levin and Tucker. In 2000, the name higher-order singular value decomposition (HOSVD) was put forward by De Lathauwer [31]. Nowadays, the terms Tucker decomposition and HOSVD are used interchangeably to refer to this decomposition.

    Taking advantage of the mode-n product, Tucker decomposition can be defined as the product of a core tensor and a factor matrix along each mode.

    Definition 1.3.1

    (Tucker decomposition) Given a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, its Tucker decomposition is

    $\mathcal{X} = \mathcal{G} \times_1 \mathbf{U}^{(1)} \times_2 \mathbf{U}^{(2)} \times_3 \cdots \times_N \mathbf{U}^{(N)},$  (1.30)

    where $\mathbf{U}^{(n)} \in \mathbb{R}^{I_n \times R_n}$, $n = 1, \cdots, N$, are semi-orthogonal factor matrices that satisfy $\mathbf{U}^{(n)\mathrm{T}} \mathbf{U}^{(n)} = \mathbf{I}$ and $\mathcal{G} \in \mathbb{R}^{R_1 \times R_2 \times \cdots \times R_N}$ is the core tensor. Even though the core tensor is usually dense, it is generally much smaller than $\mathcal{X}$, i.e., $R_n \ll I_n$.

    We can also write Tucker decomposition in an element-wise style as

    $x_{i_1 i_2 \cdots i_N} = \sum_{r_1 = 1}^{R_1} \sum_{r_2 = 1}^{R_2} \cdots \sum_{r_N = 1}^{R_N} g_{r_1 r_2 \cdots r_N} \, u^{(1)}_{i_1 r_1} u^{(2)}_{i_2 r_2} \cdots u^{(N)}_{i_N r_N},$  (1.31)

    where $i_n = 1, \cdots, I_n$.

    Fig. 1.15 is an illustration of Tucker decomposition on a third-order tensor, i.e., $\mathcal{X} = \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C}$.

    Figure 1.15 An illustration of Tucker decomposition on a third-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$. The core tensor is $\mathcal{G} \in \mathbb{R}^{R_1 \times R_2 \times R_3}$ and the factor matrices are $\mathbf{A} \in \mathbb{R}^{I_1 \times R_1}$, $\mathbf{B} \in \mathbb{R}^{I_2 \times R_2}$, and $\mathbf{C} \in \mathbb{R}^{I_3 \times R_3}$.
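
    As a concrete illustration, the following numpy sketch computes a truncated HOSVD, one common way to obtain a Tucker decomposition (an illustrative assumption: the factor matrices are the leading left singular vectors of the mode-n unfoldings, with no iterative refinement such as HOOI):

```python
import numpy as np

def unfold(T, mode):
    """Mode-n matricization (0-based mode)."""
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order='F')

def mode_n_product(T, M, mode):
    """Mode-n product with a matrix M of size J x I_n."""
    return np.moveaxis(np.tensordot(T, M, axes=(mode, 1)), -1, mode)

def hosvd(X, ranks):
    """Return core tensor G and factors U[n] with X ≈ G x_1 U[0] ... x_N U[N-1]."""
    factors = []
    for n, r in enumerate(ranks):
        # Leading left singular vectors of the mode-n unfolding.
        Un, _, _ = np.linalg.svd(unfold(X, n), full_matrices=False)
        factors.append(Un[:, :r])
    G = X
    for n, Un in enumerate(factors):
        G = mode_n_product(G, Un.T, n)       # project onto the factor subspaces
    return G, factors

X = np.random.randn(6, 7, 8)
G, U = hosvd(X, (3, 3, 3))
X_hat = G
for n, Un in enumerate(U):
    X_hat = mode_n_product(X_hat, Un, n)
print(np.linalg.norm(X - X_hat) / np.linalg.norm(X))   # relative approximation error
```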

    Based on Tucker decomposition, we can derive the definition of Tucker rank.

    Definition 1.3.2

    (Tucker rank) The Tucker rank of a given tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is defined as an N-tuple $(R_1, R_2, \cdots, R_N)$ comprised of the n-ranks $R_n$. The n-rank $R_n$, a.k.a. the multilinear rank, is the dimension of the vector space spanned by the mode-n fibers. In other words, the n-rank is the column rank of $\mathbf{X}_{(n)}$.

    Note that Tucker decomposition is not unique; if we transform the core tensor $\mathcal{G}$ along each mode by nonsingular matrices $\mathbf{R}$, $\mathbf{P}$, and $\mathbf{Q}$, we can always find the corresponding factor matrices by applying the reverse operations on them. Specifically, taking a third-order tensor as an example, we have

    $\mathcal{X} = \mathcal{G} \times_1 \mathbf{A} \times_2 \mathbf{B} \times_3 \mathbf{C} = \left( \mathcal{G} \times_1 \mathbf{R} \times_2 \mathbf{P} \times_3 \mathbf{Q} \right) \times_1 \left( \mathbf{A} \mathbf{R}^{-1} \right) \times_2 \left( \mathbf{B} \mathbf{P}^{-1} \right) \times_3 \left( \mathbf{C} \mathbf{Q}^{-1} \right),$

    where $\mathbf{R}^{-1}$, $\mathbf{P}^{-1}$, $\mathbf{Q}^{-1}$ represent the inverse matrices of $\mathbf{R}$, $\mathbf{P}$, $\mathbf{Q}$.

    This property of Tucker decomposition enables us to select the appropriate core tensor according to the situations.

    1.3.2 Canonical polyadic decomposition

    CP decomposition was invented in 1927 as polyadic decomposition by Hitchcock [32], whose idea is to express a tensor as the summation of rank-1 tensors. In 1970, polyadic decomposition was renamed by Carroll and Chang [33] as canonical decomposition (CANDECOMP), while in the meantime Harshman [34] named it parallel factor analysis (PARAFAC). For a long time, different articles used different names to refer to this decomposition, which was quite confusing. In 2000, Kiers [35] suggested calling it CP decomposition uniformly. Today the abbreviation CP can be read as either CANDECOMP/PARAFAC decomposition or canonical polyadic decomposition.

    Definition 1.3.3

    (CP decomposition) CP decomposition of an N-th-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is represented as the summation of rank-1 tensors

    $\mathcal{X} = \sum_{r = 1}^{R} \mathbf{a}^{(1)}_r \circ \mathbf{a}^{(2)}_r \circ \cdots \circ \mathbf{a}^{(N)}_r,$  (1.32)

    where $\mathbf{A}^{(n)} = [\mathbf{a}^{(n)}_1, \mathbf{a}^{(n)}_2, \cdots, \mathbf{a}^{(n)}_R] \in \mathbb{R}^{I_n \times R}$, $n = 1, \cdots, N$, are factor matrices and R denotes the number of rank-1 components. The entries in $\mathcal{X}$ can be computed individually as

    $x_{i_1 i_2 \cdots i_N} = \sum_{r = 1}^{R} a^{(1)}_{i_1 r} a^{(2)}_{i_2 r} \cdots a^{(N)}_{i_N r}.$  (1.33)

    Similar to Tucker decomposition, there is a tensor rank in terms of CP decomposition.

    Definition 1.3.4

    (Tensor rank) Tensor rank is defined as the minimal number of rank-1 components that ensure Eq. (1.32) holds.

    Taking a third-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ as an example, it can be decomposed as

    $\mathcal{X} = \sum_{r = 1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r,$  (1.34)

    where each component $\mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$ is a third-order tensor of rank-1. If R is the smallest number among all the possible values that fit in Eq. (1.34), then we say R is the tensor rank of $\mathcal{X}$. Fig. 1.16 illustrates the CP decomposition for $\mathcal{X}$ intuitively.

    Figure 1.16 A demonstration of CP decomposition of a third-order tensor. Each dotted rectangle represents a rank-1 tensor $\mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$, $r = 1, \cdots, R$.
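
    In practice, CP decomposition is commonly computed with alternating least squares (ALS). The following bare-bones sketch for a third-order tensor is an illustrative assumption (fixed number of sweeps, no column normalization or convergence checks), not the book's algorithm:

```python
import numpy as np

def unfold(T, mode):
    return np.reshape(np.moveaxis(T, mode, 0), (T.shape[mode], -1), order='F')

def khatri_rao(A, B):
    return np.column_stack([np.kron(A[:, r], B[:, r]) for r in range(A.shape[1])])

def cp_als(X, R, n_iter=50):
    """Return factor matrices A, B, C with X ≈ sum_r a_r ∘ b_r ∘ c_r."""
    I1, I2, I3 = X.shape
    A, B, C = (np.random.randn(I, R) for I in (I1, I2, I3))
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem in one factor.
        A = unfold(X, 0) @ np.linalg.pinv(khatri_rao(C, B)).T
        B = unfold(X, 1) @ np.linalg.pinv(khatri_rao(C, A)).T
        C = unfold(X, 2) @ np.linalg.pinv(khatri_rao(B, A)).T
    return A, B, C

A, B, C = cp_als(np.random.randn(5, 6, 7), R=3)
```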

    CP decomposition and Tucker decomposition are not independent. According to Eq. (1.30), when the core tensor $\mathcal{G}$ is an identity tensor $\mathcal{I} \in \mathbb{R}^{R \times R \times \cdots \times R}$, i.e., a diagonal tensor whose superdiagonal entries all equal one, Eq. (1.30) is indeed a CP decomposition. Specifically,

    $\mathcal{X} = \mathcal{I} \times_1 \mathbf{A}^{(1)} \times_2 \mathbf{A}^{(2)} \times_3 \cdots \times_N \mathbf{A}^{(N)} = \sum_{r = 1}^{R} \mathbf{a}^{(1)}_r \circ \mathbf{a}^{(2)}_r \circ \cdots \circ \mathbf{a}^{(N)}_r.$

    In this sense, CP decomposition can be regarded as a special case of Tucker decomposition. However, different from Tucker decomposition, CP decomposition is unique under permutation indeterminacy and scaling indeterminacy. The permutation indeterminacy means we can arbitrarily change the order of the rank-1 tensors in Eq. (1.32), and the decomposition still holds. Taking a third-order tensor $\mathcal{X} = \sum_{r=1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r$ as an example, the mathematical form of permutation is as follows:

    $\mathcal{X} = \sum_{r = 1}^{R} \mathbf{a}_r \circ \mathbf{b}_r \circ \mathbf{c}_r = \sum_{r = 1}^{R} (\mathbf{A} \boldsymbol{\Pi})_{:, r} \circ (\mathbf{B} \boldsymbol{\Pi})_{:, r} \circ (\mathbf{C} \boldsymbol{\Pi})_{:, r},$  (1.35)

    where $\boldsymbol{\Pi} \in \mathbb{R}^{R \times R}$ is a permutation matrix. The scaling indeterminacy means that we can scale the column vectors by different parameters as long as the product of those parameters equals one. That is to say,

    $\mathcal{X} = \sum_{r = 1}^{R} (\alpha_r \mathbf{a}_r) \circ (\beta_r \mathbf{b}_r) \circ (\gamma_r \mathbf{c}_r),$  (1.36)

    where the factors satisfy $\alpha_r \beta_r \gamma_r = 1$ for $r = 1, \cdots, R$.

    The uniqueness of CP decomposition means that, for a given tensor $\mathcal{X}$, there is only one set of rank-1 tensors that satisfies Eq. (1.32).

    1.3.3 Block term decomposition

    As introduced above, CP decomposition and Tucker decomposition have different ranks. By unifying the ideas of these two ranks, BTD was proposed [36,37].

    Definition 1.3.5

    (Block term decomposition) For a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, its BTD is denoted as

    $\mathcal{X} = \sum_{r = 1}^{R} \mathcal{G}_r \times_1 \mathbf{A}^{(1)}_r \times_2 \mathbf{A}^{(2)}_r \times_3 \cdots \times_N \mathbf{A}^{(N)}_r,$  (1.37)

    where $\mathbf{A}^{(n)}_r \in \mathbb{R}^{I_n \times R^{(r)}_n}$ represents the n-th factor in the r-th term and $\mathcal{G}_r \in \mathbb{R}^{R^{(r)}_1 \times R^{(r)}_2 \times \cdots \times R^{(r)}_N}$ represents the corresponding core tensor.

    Considering BTD on third-order tensors, we can give the definition of (L, M, N)-decomposition.

    Definition 1.3.6

    (Rank-(L, M, N)) A third-order tensor is of rank-(L, M, N) if its mode-1 rank, mode-2 rank, and mode-3 rank equal L, M, and N, respectively.

    Definition 1.3.7

    ((L, M, N)-decomposition) Given a third-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, the (L, M, N)-decomposition is the summation of rank-(L, M, N) terms as follows:

    $\mathcal{X} = \sum_{r = 1}^{R} \mathcal{G}_r \times_1 \mathbf{A}_r \times_2 \mathbf{B}_r \times_3 \mathbf{C}_r,$  (1.38)

    where $\mathcal{G}_r \in \mathbb{R}^{L \times M \times N}$ is of full rank-(L, M, N) and $\mathbf{A}_r \in \mathbb{R}^{I_1 \times L}$, $\mathbf{B}_r \in \mathbb{R}^{I_2 \times M}$, and $\mathbf{C}_r \in \mathbb{R}^{I_3 \times N}$ are of full column rank ($L \leq I_1$, $M \leq I_2$, $N \leq I_3$, $r = 1, \cdots, R$).

    A graphical illustration of ( )-decomposition is shown in Fig. 1.17.

    Figure 1.17 The illustration of block term decomposition for a third-order tensor.

    Following this idea, we give three special cases of (L, M, N)-decomposition: (L, L, 1)-decomposition, ($L_r$, $L_r$, 1)-decomposition, and (L, M, ⋅)-decomposition.

    Definition 1.3.8

    ((L, L, 1)-decomposition) For a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, its (L, L, 1)-decomposition is the summation of rank-(L, L, 1) terms as follows:

    $\mathcal{X} = \sum_{r = 1}^{R} (\mathbf{A}_r \mathbf{B}_r^{\mathrm{T}}) \circ \mathbf{c}_r,$  (1.39)

    where $\mathbf{A}_r \in \mathbb{R}^{I_1 \times L}$ and $\mathbf{B}_r \in \mathbb{R}^{I_2 \times L}$ ($r = 1, \cdots, R$) are rank-L matrices and $\mathbf{c}_r \in \mathbb{R}^{I_3}$.

    However, the matrices $\mathbf{A}_r \mathbf{B}_r^{\mathrm{T}}$ may not all have the same rank L on some occasions. Therefore, the ($L_r$, $L_r$, 1)-decomposition was proposed, as shown in Fig. 1.18.

    Definition 1.3.9

    (($L_r$, $L_r$, 1)-decomposition) For a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, its ($L_r$, $L_r$, 1)-decomposition is the summation of rank-($L_r$, $L_r$, 1) terms, which can be defined as

    $\mathcal{X} = \sum_{r = 1}^{R} (\mathbf{A}_r \mathbf{B}_r^{\mathrm{T}}) \circ \mathbf{c}_r,$  (1.40)

    where $\mathbf{A}_r \in \mathbb{R}^{I_1 \times L_r}$ and $\mathbf{B}_r \in \mathbb{R}^{I_2 \times L_r}$ ($r = 1, \cdots, R$) are rank-$L_r$ matrices.

    Figure 1.18 The illustration of ( L r , L r ,1)-decomposition for a third-order tensor.

    Definition 1.3.10

    ((L, M, ⋅)-decomposition) For a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, its (L, M, ⋅)-decomposition can be denoted as

    $\mathcal{X} = \sum_{r = 1}^{R} \mathcal{G}_r \times_1 \mathbf{A}_r \times_2 \mathbf{B}_r,$  (1.41)

    where $\mathcal{G}_r \in \mathbb{R}^{L \times M \times I_3}$ is the core tensor and $\mathbf{A}_r \in \mathbb{R}^{I_1 \times L}$, $\mathbf{B}_r \in \mathbb{R}^{I_2 \times M}$ are of full column rank ($L \leq I_1$, $M \leq I_2$, $r = 1, \cdots, R$).

    A visual representation of -decomposition is shown in Fig. 1.19.

    Figure 1.19 The illustration of ( L , M ,⋅)-decomposition for a third-order tensor.

    1.3.4 Tensor singular value decomposition

    In addition to some of the decomposition methods mentioned above, Kilmer et al. [14] proposed a new type of decomposition for tensors, named t-SVD. It is the generalization of matrix singular value decomposition (SVD) and can be applied to multiple applications such as data compression.

    Definition 1.3.11

    (t-SVD) For tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, the t-SVD is defined as

    $\mathcal{A} = \mathcal{U} * \mathcal{S} * \mathcal{V}^{\mathrm{T}},$  (1.42)

    where $\mathcal{U} \in \mathbb{R}^{I_1 \times I_1 \times I_3}$, $\mathcal{V} \in \mathbb{R}^{I_2 \times I_2 \times I_3}$ are orthogonal tensors and $\mathcal{S} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is an f-diagonal tensor.

    An illustration of t-SVD for tensor $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$ is shown in Fig. 1.20.

    Definition 1.3.12

    (Tensor tubal rank [38]) The tubal rank of a third-order tensor $\mathcal{A}$ is defined as the number of nonzero tubes in the f-diagonal tensor $\mathcal{S}$ from its t-SVD.

    Figure 1.20 The illustration of t-SVD for a third-order tensor.

    It is well known that block-circulant matrices can be block-diagonalized by the Fourier transform [39]. Mathematically, for $\mathcal{A} \in \mathbb{R}^{I_1 \times I_2 \times I_3}$, we have

    $(\mathbf{F}_{I_3} \otimes \mathbf{I}_{I_1}) \cdot \operatorname{circ}(\mathcal{A}) \cdot (\mathbf{F}_{I_3}^{-1} \otimes \mathbf{I}_{I_2}) = \operatorname{blockdiag} \left( \hat{\mathbf{A}}^{(1)}, \hat{\mathbf{A}}^{(2)}, \cdots, \hat{\mathbf{A}}^{(I_3)} \right),$  (1.43)

    where $\operatorname{circ}(\mathcal{A})$ is the block-circulant matrix (see Eq. (1.23)) of $\mathcal{A}$, $\mathbf{F}_{I_3}$ is a normalized discrete Fourier transform (DFT) matrix, $\hat{\mathcal{A}}$ is the fast Fourier transform (FFT) of $\mathcal{A}$ along the third mode, and $\hat{\mathbf{A}}^{(i_3)}$ ($i_3 = 1, \cdots, I_3$) represents its $i_3$-th frontal slice.

    In this way, instead of directly calculating circ(⋅), we are able to employ the FFT to solve Eq. (1.42). Specifically, we apply SVD on every frontal slice of $\hat{\mathcal{A}}$ to acquire $\hat{\mathbf{U}}^{(i_3)}$, $\hat{\mathbf{S}}^{(i_3)}$, and $\hat{\mathbf{V}}^{(i_3)}$, $i_3 = 1, \cdots, I_3$. Then by simply performing the inverse FFT along the third mode on $\hat{\mathcal{U}}$, $\hat{\mathcal{S}}$, and $\hat{\mathcal{V}}$, the t-SVD of $\mathcal{A}$ is obtained. From this point of view, t-SVD can be regarded as performing matrix SVD on each frontal slice of a tensor in the frequency domain.
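
    A compact numpy sketch of this frequency-domain procedure is given below (an illustrative assumption, not the book's implementation); for a real input tensor, the SVDs are computed only for the first half of the Fourier slices and the remaining slices are filled by conjugate symmetry so that the returned factors are real:

```python
import numpy as np

def t_svd(A):
    """Return U, S, V with A = U * S * V^T (t-product), for real A of size I1 x I2 x I3."""
    I1, I2, I3 = A.shape
    m = min(I1, I2)
    A_hat = np.fft.fft(A, axis=2)
    U_hat = np.zeros((I1, I1, I3), dtype=complex)
    S_hat = np.zeros((I1, I2, I3), dtype=complex)
    V_hat = np.zeros((I2, I2, I3), dtype=complex)
    for k in range(I3 // 2 + 1):                       # first half of the spectrum
        u, s, vh = np.linalg.svd(A_hat[:, :, k])
        U_hat[:, :, k] = u
        S_hat[np.arange(m), np.arange(m), k] = s
        V_hat[:, :, k] = vh.conj().T
        if 0 < k < (I3 + 1) // 2:                      # conjugate symmetry for real A
            U_hat[:, :, I3 - k] = u.conj()
            S_hat[np.arange(m), np.arange(m), I3 - k] = s
            V_hat[:, :, I3 - k] = vh.T
    ifft_real = lambda T: np.real(np.fft.ifft(T, axis=2))
    return ifft_real(U_hat), ifft_real(S_hat), ifft_real(V_hat)

A = np.random.randn(4, 3, 5)
U, S, V = t_svd(A)
# Sanity check: A equals the slice-wise product U_hat S_hat V_hat^H in the Fourier domain.
U_hat, S_hat, V_hat = (np.fft.fft(T, axis=2) for T in (U, S, V))
rec = np.real(np.fft.ifft(np.einsum('ijk,jlk,mlk->imk', U_hat, S_hat, V_hat.conj()), axis=2))
assert np.allclose(rec, A)
```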

    1.3.5 Tensor network

    In recent years, data collected from various fields often have high dimensions. This gives rise to a tricky problem named the curse of dimensionality.

    In Tucker decomposition, if we assume the dimension of the core tensor $\mathcal{G}$ is $R \times R \times \cdots \times R$, then $\mathcal{G}$ has $R^N$ entries. The number of entries scales exponentially with the number of tensor ways, which results in a tremendous computational burden that we may not be able to handle by standard numerical algorithms.

    To ease this issue, we can resort to TNs, which are able to decompose a high-order tensor into a set of sparsely interconnected lower-order core tensors and matrices. The core tensors are connected through tensor contractions. Apart from the aforementioned decompositions, there are also many other kinds of decompositions under the TN framework. Here we mainly introduce hierarchical Tucker (HT) decomposition and its special case, tensor train (TT) decomposition.

    1.3.5.1 Hierarchical Tucker decomposition

    HT decomposition is also known as the tensor tree network (TTN) with rank-3 tensors in quantum physics. It was originally proposed in [40,41].

    The main idea of HT is to decompose a tensor in a hierarchical way according to a binary tree (dimension tree) whose nodes indicate subsets of modes in the original tensor and whose root node contains all the modes. Fig. 1.21 demonstrates a possible dimension tree for a fourth-order tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times I_3 \times I_4}$, where the three interior nodes denote mode subsets containing more than one mode and the four leaf nodes denote the individual modes. Grelier et al. discussed how to select the optimal structure for a given tensor [42].

    Figure 1.21 A possible binary tree of a fourth-order tensor. Each node corresponds to a mode set of the tensor .

    Definition 1.3.13

    (Matricization for dimension tree) Given a tensor , dimension indices , and its complement in the corresponding binary tree (q represents the ordinal label assigned to the node), the matricization is

    where

    For example, given tensor and dimension indices in Fig. 1.21, the matricization is of the size of .

    Definition 1.3.14

    (Hierarchical rank) The hierarchical rank of tensor is defined as

    (1.44)

    where refers to the dimension index in the related binary tree and is the matricization according to .

    Let denote a set of tensors whose hierarchical rank is no more than , i.e.,

    . Using the nestedness property, we can define HT decomposition properly.

    Lemma 1.3.1

    (Nestedness property) Suppose . Then we can have a corresponding subspace for every and its complement

    where . For each interior node with two successors, their dimension indices are , , , respectively, and the space of naturally decouples into

    where denotes the Kronecker product.

    Definition 1.3.15

    (HT decomposition) Suppose . We can represent as

    where , , is the hierarchical rank of , and . Therefore, for , the column vectors of satisfy the nestedness property when , that is,

    (1.45)

    where represents the entry in . It is the coefficient in the linear combination of vectors. Here and are the column vectors of and . Therefore, and are the factors of HT decomposition of . According to the dimension tree in Fig. 1.21, the HT decomposition for a fourth-order tensor is illustrated in Fig. 1.22.

    Figure 1.22 Hierarchical Tucker decomposition of a fourth-order tensor , where , , , , , , and are the factors of hierarchical Tucker decomposition.

    1.3.5.2 Tensor train decomposition

    TT decomposition was proposed in [43] and is also known as matrix product state (MPS) in the area of quantum physics. Since it avoids the recursive computation of binary trees and is mathematically easy to handle due to its compact form, it has attracted a lot of attention in recent years.

    The main idea of TT decomposition is to factorize a tensor into N factors, where head and tail factors are matrices and the rest are smaller third-order tensors.

    Definition 1.3.16

    (TT decomposition) Given $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$, the TT decomposition is of the form

    $\mathcal{X} = \sum_{r_1 = 1}^{R_1} \sum_{r_2 = 1}^{R_2} \cdots \sum_{r_{N-1} = 1}^{R_{N-1}} \mathbf{g}^{(1)}_{1, :, r_1} \circ \mathbf{g}^{(2)}_{r_1, :, r_2} \circ \cdots \circ \mathbf{g}^{(N)}_{r_{N-1}, :, 1},$  (1.46)

    or in an element-wise style,

    $x_{i_1 i_2 \cdots i_N} = \mathbf{G}^{(1)}_{i_1} \mathbf{G}^{(2)}_{i_2} \cdots \mathbf{G}^{(N)}_{i_N},$  (1.47)

    where $\mathcal{G}^{(n)} \in \mathbb{R}^{R_{n-1} \times I_n \times R_n}$ ($n = 1, \cdots, N$) are the core factors and $\mathbf{G}^{(n)}_{i_n} \in \mathbb{R}^{R_{n-1} \times R_n}$ denotes the $i_n$-th lateral slice of $\mathcal{G}^{(n)}$. Note that we set $R_0 = R_N = 1$.

    A graphical illustration of TT decomposition on an N-th-order tensor is shown in Fig. 1.23.

    Figure 1.23 Top: Tensor train decomposition on an N th-order tensor . The yellow dot (light gray in print version) is a core tensor, leaf components (not drawn) are identities in this case and thus no need to be stored. Bottom: Corresponding core tensors.
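
    A bare-bones TT-SVD sketch is given below (an illustrative assumption: sequential truncated SVDs with a single prescribed maximal rank and no error-based truncation; this is not the book's algorithm). With a sufficiently large maximal rank the decomposition is exact, and an entry is recovered as the product of core slices as in Eq. (1.47):

```python
import numpy as np

def tt_svd(X, max_rank):
    """Decompose X (size I1 x ... x IN) into TT cores G[n] of size (R_{n-1}, I_n, R_n)."""
    dims = X.shape
    cores, r_prev = [], 1
    C = X.reshape(r_prev * dims[0], -1)
    for n in range(len(dims) - 1):
        U, s, Vh = np.linalg.svd(C, full_matrices=False)
        r = min(max_rank, len(s))
        cores.append(U[:, :r].reshape(r_prev, dims[n], r))
        C = (s[:r, None] * Vh[:r]).reshape(r * dims[n + 1], -1)
        r_prev = r
    cores.append(C.reshape(r_prev, dims[-1], 1))
    return cores

X = np.random.randn(4, 5, 6, 7)
cores = tt_svd(X, max_rank=20)                 # rank large enough: exact decomposition
i = (1, 2, 3, 4)
entry = np.linalg.multi_dot([cores[n][:, i[n], :] for n in range(4)]).item()
assert np.isclose(entry, X[i])
```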

    Similar to other decompositions, TT decomposition also has its rank.

    Definition 1.3.17

    (Tensor train rank) The TT rank of $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is an (N+1)-tuple

    $\operatorname{rank}_{\mathrm{TT}}(\mathcal{X}) = (R_0, R_1, \cdots, R_N),$  (1.48)

    where $R_n = \operatorname{rank}(\mathbf{X}_{\langle n \rangle})$ and $R_0 = R_N = 1$.

    Different from the aforementioned ranks, TT rank can be affected by the permutation of tensor modes. For instance, given a tensor , if we generate by swapping the first and the third mode of , the TT rank can be different. That leads to the question of how to choose a desirable permutation for a given tensor.

    1.3.5.3 Tensor ring decomposition

    Zhao et al. [12] generalized TT to tensor ring (TR) decomposition (a.k.a. tensor chain [TC]). It employs the trace operation to create a symmetrical structure and can be treated as TT with periodic boundary conditions (PBC). An illustration of TR decomposition is shown in Fig. 1.24.

    Figure 1.24 Tensor ring decomposition of an N -th-order tensor , where are the core tensors.

    From Fig. 1.24, we can see that TR decomposes an N-th-order tensor into N third-order factor tensors whose dimensions are much smaller than those of the original one. The constraint $R_0 = R_N$ still exists, but it does not necessarily equal one. Therefore, the definition of TR decomposition can be given.

    Definition 1.3.18

    (TR decomposition) The TR decomposition of $\mathcal{X} \in \mathbb{R}^{I_1 \times I_2 \times \cdots \times I_N}$ is

    $\mathcal{X} = \sum_{r_0 = 1}^{R_0} \sum_{r_1 = 1}^{R_1} \cdots \sum_{r_{N-1} = 1}^{R_{N-1}} \mathbf{g}^{(1)}_{r_0, :, r_1} \circ \mathbf{g}^{(2)}_{r_1, :, r_2} \circ \cdots \circ \mathbf{g}^{(N)}_{r_{N-1}, :, r_0}.$  (1.49)

    Using the trace operation, it can be written element-wisely as follows:

    $x_{i_1 i_2 \cdots i_N} = \operatorname{tr} \left( \mathbf{G}^{(1)}_{i_1} \mathbf{G}^{(2)}_{i_2} \cdots \mathbf{G}^{(N)}_{i_N} \right),$  (1.50)

    where $\mathcal{G}^{(n)} \in \mathbb{R}^{R_{n-1} \times I_n \times R_n}$, $n = 1, \cdots, N$ (with $R_0 = R_N$), are core tensors and $\mathbf{G}^{(n)}_{i_n} \in \mathbb{R}^{R_{n-1} \times R_n}$ is the $i_n$-th lateral slice of $\mathcal{G}^{(n)}$.

    Similar to TT rank, TR rank can be defined as a tuple, i.e., $\operatorname{rank}_{\mathrm{TR}}(\mathcal{X}) = (R_1, R_2, \cdots, R_N)$.
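
    As a small illustration of Eq. (1.50), the numpy sketch below (an assumption with arbitrary sizes, not the book's code) recovers one entry of a tensor stored in TR format from its cores:

```python
import numpy as np

N, I, R = 4, 5, 3                                       # assumed order, mode size, TR rank
cores = [np.random.randn(R, I, R) for _ in range(N)]    # G^(n) of size R_{n-1} x I_n x R_n

def tr_entry(cores, idx):
    """x_{i1...iN} = tr( G^(1)[i1] G^(2)[i2] ... G^(N)[iN] )."""
    M = cores[0][:, idx[0], :]
    for n in range(1, len(cores)):
        M = M @ cores[n][:, idx[n], :]
    return np.trace(M)

# The full tensor can be obtained entry by entry (or more efficiently via contractions).
x = tr_entry(cores, (0, 1, 2, 3))
```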

    1.3.5.4 Other variants

    Even though TNs are designed to substantially reduce the computational cost, HT and TT may fail to achieve outstanding performance when the data size increases. TR was proposed to solve the ordering issue induced by TT.

    Other than TR, projected entangled pair states (PEPS) [44] was also introduced for the same reason. PEPS is a hierarchical 2D TT model employing fifth/sixth-order tensors as cores. This strategy can be taken as a trade-off between rank and computational complexity. However, when high estimation accuracy is required, the rank of PEPS also grows rapidly.

    Honeycomb lattice (HCL) [45] and multiscale entanglement renormalization ansatz (MERA) [46] were put forward to resolve this dilemma. Similar to TR and PEPS, HCL and MERA also contain loop structures that cause a high computational burden in the contraction. However, due to the fact that core tensors for HCL and MERA are only of third order and third/fourth order, respectively, they are still able to attain a desirable effect.

    Recently, a new idea has been proposed by combining traditional decomposition approaches. For example, TR and t-SVD are merged in a hierarchical way in [47].

    1.4 Tensor processing techniques

    Tensor decomposition plays a key role in machine learning, signal processing, and data mining. A series of tensor-based data processing techniques are listed as follows:

    •  Tensor dictionary learning [48,49]

    Tensor dictionary learning aims to sparsely represent a high-order tensor in a tensor factorization form, where some factors can be regarded as learned dictionaries with respect to the remaining sparse factors. Benefiting from its data-driven characteristic and high-order sparse representation ability, it has been widely used in various data processing applications, such as image denoising, enhancement and classification, texture synthesis, unsupervised clustering, and biomedical data analysis.

    •  Tensor completion [50–54]

    Tensor completion fills in the missing entries of a partially observed tensor, which is popularly applied in recommender systems, image recovery, knowledge graph completion, and traffic flow prediction.

    •  Tensor robust principal component analysis (TRPCA) [55–57]

    TRPCA separates additive low-rank and sparse components from multiway data. It can be used for dimensionality reduction, background extraction, small target detection, and anomaly detection.

    •  Tensor independent component analysis (TICA) [58–60]

    TICA separates multivariate tensor data into additive, statistically independent non-Gaussian subcomponents. It is widely used for blind source separation in speech, image, and biomedical signals.

    •  Tensor regression [61–63]

    Tensor regression sets up a multilinear mapping from a tensor input to a tensor output with a tensor system. As a tensor extension for linear regression, it has wide applications in various multidimensional data processing areas, such as weather forecasting, recommender systems, image prediction, brain activation, and connectivity analysis.

    •  Tensor classification [64–66]

    By replacing the linear function with a tensor function for a hyperplane that separates positive samples from negative samples, we can obtain generalized statistical tensor classification methods including the support tensor machine (STM), logistic tensor regression, and discriminative tensor component analysis. They find useful applications in image classification of handwriting, frailty detection from multisensor recordings, and breast cancer classification based on histopathology images.

    •  Coupled tensor factorization [67–69]

    Data collected from multiple sources often share latent components. Coupled tensor factorization can perform multiple decompositions based on those shared factors, which is very useful in data fusion applications.

    •  Deep tensor neural networks [70,71]

    Many layers in deep neural networks have linear combinations. By generalizing these linear operators into multilinear ones, deep tensor neural networks can be obtained. They can process multidimensional data in its original form and can be implemented in speech analysis, image classification, and natural language processing.

    1.5 Challenges

    With the recent advances in applied mathematics, tensors have been introduced into a number of data processing applications, both in theory and in methodology. However, to really make tensor computation as practical as matrix computation, there are still some challenges. We list a few of them as follows:

    •  There may be multiple decompositions for a given tensor, but no solid theoretical guidance is established on how to select a desirable decomposition for certain multiway data processing tasks. Besides, TNs are reported to have outstanding performances for higher-order data processing, but how to derive a suitable network is still an open question.

    For data collected from heterogeneous networks, such as social networks, banking data, and traffic patterns, graphical representations can be used to model the local signals and their irregular connections by nodes and edges, respectively. However, the uniform connections among tensor modes prevent full utilization of the topological structure encoded in those networks.

    •  Tensor computation leads to a heavy computational burden in some cases. Therefore, high-performance tensor computation algorithms and efficient software are called for.

    •  Apart from the data processing community, tensors are intensively investigated in physics as well. It is quite important to make use of the results from other areas to improve tensor decomposition and its data processing performance. For example, quantum computing may highly accelerate tensor-based machine learning.

    •  As a linear representation, tensor decomposition will encounter difficulties in highly nonlinear feature extraction. Motivated by the success of deep neural networks, deep tensor factorization can be formulated by applying multilayer decompositions. However, problems with interpretability and generalization would ensue.

    •  With the development of the Internet of Things, many data processing tasks are carried out by edge computing. It would be interesting but not easy to develop distributed tensor decomposition, block tensor decomposition, and online tensor decomposition for this group of applications.

    References

    [1] T.G. Kolda, B.W. Bader, Tensor decompositions and applications, SIAM Review 2009;51(3):455–500.

    [2] L. He, W. Zhang, M. Shi, Non-negative tensor factorization for speech enhancement, 2016 International Conference on Artificial Intelligence: Technologies and Applications. Atlantis Press; 2016:1–5.

    [3] Y.-Y. Yuan, S.-T. Wang, Q. Cheng, D.-M. Kong, X.-G. Che, Simultaneous determination of carbendazim and chlorothalonil pesticide residues in peanut oil using excitation-emission matrix fluorescence coupled with three-way calibration method, Spectrochimica Acta. Part A: Molecular and Biomolecular Spectroscopy 2019;220, 117088.

    [4] K. Kawabata, Y. Matsubara, T. Honda, Y. Sakurai, Non-linear mining of social activities in tensor streams, Proceedings of the 26th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. 2020:2093–2102.

    [5] A. Cichocki, N. Lee, I.V. Oseledets, A.-H. Phan, Q. Zhao, D. Mandic, Low-rank tensor networks for dimensionality reduction and large-scale optimization problems: perspectives and challenges part 1, arXiv preprint arXiv:1609.00893; 2016.

    [6] D.S. Watkins, Fundamentals of Matrix Computations, vol. 64. John Wiley & Sons; 2004.

    [7] M. Fazel, H. Hindi, S.P. Boyd, A rank minimization heuristic with application to minimum order system approximation, Proceedings of the 2001 American Control Conference (Cat. No. 01CH37148). IEEE; 2001;vol. 6:4734–4739.

    [8] N.D. Sidiropoulos, L. De Lathauwer, X. Fu, K. Huang, E.E. Papalexakis, C. Faloutsos, Tensor decomposition for signal processing and machine learning, IEEE Transactions on Signal Processing 2017;65(13):3551–3582.

    [9] C.F. Van Loan, The ubiquitous Kronecker product, Journal of Computational and Applied Mathematics 2000;123(1–2):85–100.

    [10] A. Smilde, R. Bro, P. Geladi, Multi-Way Analysis: Applications in the Chemical Sciences. John Wiley & Sons; 2005.

    [11] M.E. Kilmer, C.D. Martin, Factorization strategies for third-order tensors, Linear Algebra and Its Applications 2011;435(3):641–658.

    [12] Q. Zhao, G. Zhou, S. Xie, L. Zhang, A. Cichocki, Tensor ring decomposition, arXiv preprint arXiv:1606.05535; 2016.

    [13] S.V. Dolgov, D.V. Savostyanov, Alternating minimal energy methods for linear systems in higher dimensions, SIAM Journal on Scientific Computing 2014;36(5):A2248–A2271.

    [14] M.E. Kilmer, C.D. Martin, L. Perrone, A third-order generalization of the matrix svd as a product of third-order tensors. [Tech. Rep. TR-2008-4] Tufts University, Department of Computer Science; 2008.

    [15] C.M. Andersen, R. Bro, Practical aspects of parafac modeling of fluorescence excitation-emission data, Journal of Chemometrics: A Journal of the Chemometrics Society 2003;17(4):200–215.

    [16] R. Bro, Review on multiway analysis in chemistry—2000–2005, Critical Reviews in Analytical Chemistry 2006;36(3–4):279–293.

    [17] R. Bro, et al., Parafac. tutorial and applications, Chemometrics and Intelligent Laboratory Systems 1997;38(2):149–172.

    [18] R. Bro, C.A. Andersson, H.A. Kiers, Parafac2—part ii. Modeling chromatographic data with retention time shifts, Journal of Chemometrics: A Journal of the Chemometrics Society 1999;13(3–4):295–309.

    [19] L. De Lathauwer, J. Castaing, J.-F. Cardoso, Fourth-order cumulant-based blind identification of underdetermined mixtures, IEEE Transactions on Signal Processing 2007;55(6):2965–2973.

    [20] L. De Lathauwer, A. de Baynast, Blind deconvolution of ds-cdma signals by means of decomposition in rank-(1, l, l) terms, IEEE Transactions on Signal Processing 2008;56(4):1562–1571.

    [21] D. Muti, S. Bourennane, Multidimensional filtering based on a tensor approach, Signal Processing 2005;85(12):2338–2353.

    [22] L. De Lathauwer, B. De Moor, J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications 2000;21(4):1253–1278.

    [23] L. De Lathauwer, B. De Moor, J. Vandewalle, On the best rank-1 and rank-(r1, r2,..., rn) approximation of higher-order tensors, SIAM Journal on Matrix Analysis and Applications 2000;21(4):1324–1342.

    [24] M.A.O. Vasilescu, D. Terzopoulos, Multilinear analysis of image ensembles: tensorfaces, European Conference on Computer Vision. Springer; 2002:447–460.

    [25] W. Hackbusch, B.N. Khoromskij, Tensor-product approximation to operators and functions in high dimensions, Journal of Complexity 2007;23(4–6):697–714.

    [26] B. Khoromskij, V. Khoromskaia, Low rank Tucker-type tensor approximation to classical potentials, Open Mathematics 2007;5(3):523–550.

    [27] B.W. Bader, M.W. Berry, M. Browne, Discussion tracking in enron email using parafac, Survey of Text Mining II. Springer; 2008:147–163.

    [28] E. Acar, S.A. Çamtepe, M.S. Krishnamoorthy, B. Yener, Modeling and multiway analysis of chatroom tensors, International Conference on Intelligence and Security Informatics. Springer; 2005:256–268.

    [29] N. Liu, B. Zhang, J. Yan, Z. Chen, W. Liu, F. Bai, L. Chien, Text representation: from vector to tensor, Fifth IEEE International Conference on Data Mining (ICDM'05). IEEE; 2005:725–728.

    [30] L. Tucker, Some mathematical notes on three-mode factor analysis, Psychometrika 1966;31(3):279–311.

    [31] L. De Lathauwer, B. De Moor, J. Vandewalle, A multilinear singular value decomposition, SIAM Journal on Matrix Analysis and Applications 2000;21(4):1253–1278.

    [32] F.L. Hitchcock, The expression of a tensor or a polyadic as a sum of products, Journal of Mathematical Physics 1927;6(1):164–189.

    [33] J.D. Carroll, J.J. Chang, Analysis of individual differences in multidimensional scaling via an n-way generalization of eckart-young decomposition, Psychometrika 1970;35(3):283–319.

    [34] R.A. Harshman, et al., Foundations of the PARAFAC procedure: models and conditions for an explanatory multimodal factor analysis, UCLA Working Papers in Phonetics 1970;16:1–84.

    [35] H.A.L. Kiers, Towards a standardized notation and terminology in multiway analysis, Journal of Chemometrics 2000;14.

    [36] L. De Lathauwer, Decompositions of a higher-order tensor in block terms—part I: lemmas for partitioned matrices, SIAM Journal on Matrix Analysis and Applications 2008;30(3):1022–1032.

    [37] L. De Lathauwer, Decompositions of a higher-order tensor in block terms—part ii: definitions and uniqueness, SIAM Journal on Matrix Analysis and Applications 2008;30(3):1033–1066.

    [38] Z. Zhang, G. Ely, S. Aeron, N. Hao, M. Kilmer, Novel methods for multilinear data completion and de-noising based on tensor-svd, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014:3842–3849.

    [39] R.H. Chan, M.K. Ng, Conjugate gradient methods for Toeplitz systems, SIAM Review 1996;38(3):427–482.

    [40] L. Grasedyck, Hierarchical singular value decomposition of tensors, SIAM Journal on Matrix Analysis and Applications 2010;31(4):2029–2054.

    [41] W. Hackbusch, S. Kühn, A new scheme for the tensor representation, The Journal of Fourier Analysis and Applications 2009;15(5):706–722 10.1007/s00041-009-9094-9.

    [42] E. Grelier, A. Nouy, M. Chevreuil, Learning with tree-based tensor formats, arXiv:1811.04455; 2019.

    [43] I.V. Oseledets, Tensor-train decomposition, SIAM Journal on Scientific Computing 2011;33(5):2295–2317.

    [44] F. Verstraete, V. Murg, J.I. Cirac, Matrix product states, projected entangled pair states, and variational renormalization group methods for quantum spin systems, Advances in Physics 2008;57(2):143–224.

    [45] M. Ablowitz, S. Nixon, Y. Zhu, Conical diffraction in honeycomb lattices, Physical Review A 2009;79(5), 1744.

    [46] L. Cincio, J. Dziarmaga, M.M. Rams, Multi-scale entanglement renormalization ansatz in two dimensions, Physical Review Letters 2007;100(24), 240603.

    [47] A. Ahad, Z. Long, C. Zhu, Y. Liu, Hierarchical tensor ring completion, arXiv preprint arXiv:2004.11720; 2020.

    [48] S. Zubair, W. Wang, Tensor dictionary learning with sparse Tucker decomposition, 2013 18th International Conference on Digital Signal Processing (DSP). IEEE; 2013:1–6.

    [49] Y. Peng, D. Meng, Z. Xu, C. Gao, Y. Yang, B. Zhang, Decomposable nonlocal tensor dictionary learning for multispectral image denoising, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2014:2949–2956.

    [50] J. Liu, P. Musialski, P. Wonka, J. Ye, Tensor completion for estimating missing values in visual data, IEEE Transactions on Pattern Analysis and Machine Intelligence
