Tensor Framework / TensorFaces
Multilinear tensor modeling methods are particularly well suited for mathematically representing the cause-and-effect of multimodal data, where an observation, such as an image, is the result of several constituent factors, the causal factors of data formation.
For example, natural images are the compositional consequence of multiple factors related to scene structure, illumination, and imaging. The appearance of a person in an image (i.e. its pixel values) is the result of the facial geometry of a person, camera location/parameters, lighting conditions, expression, etc. While we can directly observe and measure the gray (or color) values in an image, we are often more interested in the information associated with the causal factors that determine the pixel values in an image, such as the person's identity, the viewing direction, or expression, which may be inferred, but not directly measured. The causal factors are represented by the latent variables in a computational model.
Data tensor modeling was first employed in computer vision, computer graphics, and machine learning to represent cause-and-effect and demonstrably disentangle the causal factors of observable data, recognizing people from the way they move (Human Motion Signatures in 2001) and from their facial images (TensorFaces in 2002), but it may be used to recognize any objects or object attributes.
There are two classes of data tensor modeling techniques, which stem from (1) rank-K tensor decompositions (the CANDECOMP/PARAFAC decomposition) and (2) rank-(R1,R2,...,RM) tensor decompositions (the Tucker decomposition). Variants on these decompositions employ various constraints. Kernel variants apply a kernel preprocessing step.
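As a point of reference, the two decomposition families can be written in their standard textbook form. The symbols below (data tensor D with M modes, core tensor Z, mode matrices U_m, rank-1 factor vectors u_k) are notation assumed here for illustration and are not taken from the text above.

```latex
\begin{align}
% CP (CANDECOMP/PARAFAC), rank-K: a sum of K rank-1 tensors
\mathcal{D} &\approx \sum_{k=1}^{K}
   \mathbf{u}^{(1)}_k \circ \mathbf{u}^{(2)}_k \circ \cdots \circ \mathbf{u}^{(M)}_k \\
% Tucker, rank-(R_1, ..., R_M): a core tensor multiplied by one matrix per mode
\mathcal{D} &\approx \mathcal{Z} \times_1 \mathbf{U}_1 \times_2 \mathbf{U}_2 \cdots \times_M \mathbf{U}_M,
   \qquad \mathcal{Z} \in \mathbb{R}^{R_1 \times \cdots \times R_M},\;
   \mathbf{U}_m \in \mathbb{R}^{I_m \times R_m}
\end{align}
```

Here the circle denotes the outer product and the subscripted times symbol denotes the mode-m product. The CP form is the special case of the Tucker form in which the core tensor is superdiagonal.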
Recent theoretical evidence shows that a deep neural network is equivalent to a multilinear tensor decomposition, while a shallow network corresponds to a CP tensor factorization (a.k.a. linear tensor factorization).
TensorFaces is based on the insight that multilinear tensor methods can explicitly model and decompose a facial image in terms of the causal factors of data formation, where each causal factor is represented according to its second-order statistics by employing the Tucker tensor decomposition. We refer to this approach more generally as Multilinear PCA in order to better differentiate it from our Multilinear ICA approach.
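A minimal NumPy sketch of a Tucker decomposition computed by taking an SVD of each mode unfolding may help make this concrete. The tensor sizes, the random stand-in data, and the helper names (`unfold`, `mode_product`, `m_mode_svd`) are all assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-m matricization: the chosen mode indexes the rows."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def mode_product(tensor, matrix, mode):
    """Mode-m product: multiply every mode-m fiber of `tensor` by `matrix`."""
    moved = np.moveaxis(tensor, mode, -1)   # bring the chosen mode to the last axis
    result = moved @ matrix.T               # contract over that mode
    return np.moveaxis(result, -1, mode)    # put the transformed mode back in place

def m_mode_svd(data):
    """Return a core tensor and one orthonormal mode matrix per mode."""
    mode_matrices = []
    for m in range(data.ndim):
        u, _, _ = np.linalg.svd(unfold(data, m), full_matrices=False)
        mode_matrices.append(u)
    core = data
    for m, u in enumerate(mode_matrices):
        core = mode_product(core, u.T, m)   # project each mode onto its basis
    return core, mode_matrices

# Hypothetical image ensemble: people x views x illuminations x pixels
rng = np.random.default_rng(0)
data = rng.standard_normal((5, 3, 4, 64))

core, mode_matrices = m_mode_svd(data)

# Multiplying the core by every mode matrix recovers the data tensor
reconstruction = core
for m, u in enumerate(mode_matrices):
    reconstruction = mode_product(reconstruction, u, m)

print(core.shape, np.allclose(reconstruction, data))
```

In this sketch each mode matrix spans the second-order statistics of one factor (people, views, illuminations, pixels), while the core tensor governs how the factors interact.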
Multilinear (tensor) ICA is a more sophisticated model of cause-and-effect based on the higher-order statistics associated with each causal factor. Similarly, one can employ our kernel variants (pg. 43) to model cause-and-effect. By comparison, matrix decompositions, such as PCA or ICA, capture the overall statistical information (variance, kurtosis) without differentiating among the causal factors.
Subspace multilinear learning demonstrably disentangles the causal factors of data formation through strategic dimensionality reduction. For example, in the case of facial images (or bidirectional texture functions), we suppress illumination effects such as shadows and highlights without blurring the edges associated with the person's identity that are important for recognition (or the edges associated with structural information that are important for texture synthesis; see the TensorTextures video below).
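The idea of reducing dimensionality in one mode only can be sketched as follows. The example truncates the illumination mode of a small random tensor and leaves the other modes untouched. The tensor sizes, the random data, and the number of retained components are illustrative assumptions, not values from the work described above.

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical ensemble: people x illuminations x pixels
data = rng.standard_normal((6, 5, 40))
illum_mode = 1

# SVD of the illumination-mode unfolding (illumination indexes the rows)
unfolded = np.moveaxis(data, illum_mode, 0).reshape(data.shape[illum_mode], -1)
u, singular_values, _ = np.linalg.svd(unfolded, full_matrices=False)

keep = 2                                  # retained illumination components (assumed)
u_kept = u[:, :keep]
projector = u_kept @ u_kept.T             # projects onto the dominant illumination subspace

# Apply the projector along the illumination mode only
reduced = np.moveaxis(
    np.moveaxis(data, illum_mode, -1) @ projector.T, -1, illum_mode
)

# The squared error equals the energy of the discarded illumination components
error_sq = np.sum((data - reduced) ** 2)
discarded_sq = np.sum(singular_values[keep:] ** 2)
print(np.isclose(error_sq, discarded_sq))
```

Because the projection acts on a single mode, variation along the people and pixel modes is altered only through its interaction with the discarded illumination components.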
Next important question: While TensorFaces is a handy moniker for an approach that learns and represents the interaction of various causal factors from a set of training images, with Multilinear (Tensor) ICA and kernel variants as more sophisticated approaches, none of these interaction models prescribes a solution for how one might determine the multiple causal factors of a single unlabeled test image.
Multilinear Projection (FG 2011, ICCV 2007, briefly summarized in the 2005 MICA paper) addresses the question of how one might determine from a single unlabeled test image all the unknown causal factors of data formation, i.e., how does one solve for multiple unknowns from a single image equation? In the course of addressing this question, several concepts from linear (matrix) algebra were generalized in order to develop the multilinear projection algorithm, such as the mode-m identity tensor (which is also an algebraic operator that reshapes a matrix into a tensor and back again to a matrix), the mode-m pseudo-inverse tensor, and the mode-m product. (Note: The mode-m pseudo-inverse tensor is not a tensor pseudo-inverse.) Multilinear projection simultaneously projects one or more unlabeled test images into multiple constituent mode spaces, associated with image formation, in order to infer the mode labels.
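For orientation, the standard mode-m product of a tensor with a matrix, which is the starting point for the generalizations mentioned above, can be written as follows. The symbols (tensor A of size I_1 x ... x I_M, matrix B of size J_m x I_m, and the bracketed subscript for the mode-m unfolding) are assumed notation; the generalized definitions in the cited papers may differ in detail.

```latex
\begin{align}
% Elementwise definition: contract mode m of the tensor with the columns of B
\left( \mathcal{A} \times_m \mathbf{B} \right)_{i_1 \ldots i_{m-1}\, j_m\, i_{m+1} \ldots i_M}
  &= \sum_{i_m = 1}^{I_m} a_{i_1 \ldots i_{m-1}\, i_m\, i_{m+1} \ldots i_M}\; b_{j_m i_m} \\
% Equivalent matrix form in terms of mode-m unfoldings
\mathcal{C} = \mathcal{A} \times_m \mathbf{B}
  \quad &\Longleftrightarrow \quad
  \mathbf{C}_{[m]} = \mathbf{B}\, \mathbf{A}_{[m]}
\end{align}
```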

"Compositional Hierarchical Tensor Factorization: Representing Hierarchical Intrinsic and Extrinsic Causal Factors", M.A.O. Vasilescu, E. Kim, In The 25th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD’19): Tensor Methods for Emerging Data Science Challenges, August 04-08, 2019, Anchorage, AK. ACM, New York, NY, USA. Paper (pdf)

"Face Tracking with Multilinear (Tensor) Active Appearance Models", Weiguang Si, Kota Yamaguchi, M. A. O. Vasilescu, June, 2013.
Paper (pdf)

"Multilinear Projection for Face Recognition via Canonical Decomposition", M.A.O. Vasilescu, In Proc. Face and Gesture Conf. (FG'11), 476-483. Paper (pdf)

"Multilinear Projection for Face Recognition via Rank-1 Analysis", M.A.O. Vasilescu, CVPR, IEEE Computer Society and IEEE Biometrics Council Workshop on Biometrics, June 18, 2010.

"Multilinear Projection for Appearance-Based Recognition in the Tensor Framework", M.A.O. Vasilescu and D. Terzopoulos, Proc. Eleventh IEEE International Conf. on Computer Vision (ICCV'07), Rio de Janeiro, Brazil, October, 2007, 1-8.
Paper (1,027 KB - .pdf)

“Multilinear Independent Components Analysis and Multilinear Projection Operator for Face Recognition”, M.A.O. Vasilescu, D. Terzopoulos, in Workshop on Tensor Decompositions and Applications, CIRM, Luminy, Marseille, France, August 2005.

"Multilinear (Tensor) ICA and Dimensionality Reduction", M.A.O. Vasilescu, D. Terzopoulos, Proc. 7th International Conference on Independent Component Analysis and Signal Separation (ICA07), London, UK, September, 2007. In Lecture Notes in Computer Science, 4666, Springer-Verlag, New York, 2007, 818–826.

"Multilinear Independent Components Analysis", M. A. O. Vasilescu and D. Terzopoulos, Proc. Computer Vision and Pattern Recognition Conf. (CVPR '05), San Diego, CA, June 2005, vol.1, 547-553.
Paper (1,027 KB - .pdf)

"Multilinear Independent Component Analysis", M. A. O. Vasilescu and D. Terzopoulos, Learning 2004 Snowbird, UT, April, 2004.

"Multilinear Subspace Analysis for Image Ensembles," M. A. O. Vasilescu, D. Terzopoulos, Proc. Computer Vision and Pattern Recognition Conf. (CVPR '03), Vol.2, Madison, WI, June, 2003, 93-99.
Paper (1,657KB - .pdf)

"Multilinear Image Analysis for Facial Recognition," M. A. O. Vasilescu, D. Terzopoulos, Proceedings of International Conference on Pattern Recognition (ICPR 2002), Vol. 2, Quebec City, Canada, Aug, 2002, 511-514.
Paper (439KB - .pdf)

"Multilinear Analysis of Image Ensembles: TensorFaces," M. A. O. Vasilescu, D. Terzopoulos, Proc. 7th European Conference on Computer Vision (ECCV'02), Copenhagen, Denmark, May, 2002, in Computer Vision - ECCV 2002, Lecture Notes in Computer Science, Vol. 2350, A. Heyden et al. (Eds.), Springer-Verlag, Berlin, 2002, 447-460.
Full Article in PDF (882KB)
Human Motion Signatures and Style Transfer:
Given motion-capture samples of Charlie Chaplin’s walk, is it possible to synthesize other motions (say, ascending or descending stairs) in his distinctive style? More generally, in analogy with handwritten signatures, do people have characteristic motion signatures that individualize their movements? If so, can these signatures be extracted from example motions? Can they be disentangled from other causal factors?
We have developed an algorithm that extracts motion signatures and uses them in the animation of graphical characters. The mathematical basis of our algorithm is a statistical numerical technique known as M-mode data tensor analysis. For example, given a corpus of walking, stair-ascending, and stair-descending motion data collected over a group of subjects, plus a sample walking motion for a new subject, our algorithm can synthesize never-before-seen ascending and descending motions in the distinctive style of this new individual.
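The following simplified sketch illustrates the idea of recovering a signature from one observed action and reusing it for others. It is not the published algorithm: it assumes synthetic, exactly multilinear data, factors only the people mode, and uses hypothetical sizes and names (`style`, `action_basis`, `signature`).

```python
import numpy as np

rng = np.random.default_rng(2)
n_people, n_actions, n_samples, n_styles = 5, 3, 30, 3
WALK, STAIRS_UP, STAIRS_DOWN = 0, 1, 2

# Synthetic training corpus (assumption): each motion is a person-specific
# mixture of action-dependent basis motions, i.e. exactly multilinear data.
style = rng.standard_normal((n_people, n_styles))
action_basis = rng.standard_normal((n_styles, n_actions, n_samples))
train = np.einsum("pr,ras->pas", style, action_basis)   # people x actions x samples

# Learn an orthonormal people-mode basis from the people-mode unfolding
u_people, _, _ = np.linalg.svd(train.reshape(n_people, -1), full_matrices=False)
u_people = u_people[:, :n_styles]

# Fold the people basis into the data: what remains depends on action only
basis = np.einsum("pr,pas->ras", u_people, train)       # styles x actions x samples

# New subject: only the walking motion is observed
new_style = rng.standard_normal(n_styles)
new_motions = np.einsum("r,ras->as", new_style, action_basis)
observed_walk = new_motions[WALK]

# Signature: least-squares solution of observed_walk = signature @ basis[:, WALK, :]
signature, *_ = np.linalg.lstsq(basis[:, WALK, :].T, observed_walk, rcond=None)

# Reuse the signature to synthesize actions never observed for this subject
synth_up = signature @ basis[:, STAIRS_UP, :]
synth_down = signature @ basis[:, STAIRS_DOWN, :]
print(np.allclose(synth_up, new_motions[STAIRS_UP]))
```

On real motion-capture data the multilinear model holds only approximately, so the synthesized motions would be estimates rather than exact reconstructions.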

"Human Motion Signatures: Analysis, Synthesis, Recognition," M. A. O. Vasilescu, Proceedings of International Conference on Pattern Recognition (ICPR 2002), Vol. 3, Quebec City, Canada, Aug, 2002, 456-460.
Paper (439KB - .pdf)

"An Algorithm for Extracting Human Motion Signatures", M. A. O. Vasilescu, Computer Vision and Pattern Recognition CVPR 2001 Technical Sketches, Lihue, HI, December, 2001.

"Human Motion Signatures for Character Animations", M. A. O. Vasilescu, Sketch and Applications SIGGRAPH 2001 Los Angeles, CA, August, 2001.
Sketch (141KB - .pdf)

"Recognizing Action Events from Multiple View Points," Tanveer Sayed-Mahmood, Alex Vasilescu, Saratendu Sethi, in IEEE Workshop on Detection and Recognition of Events in Video, International Conference on Computer Vision (ICCV 2001), Vancouver, Canada, July 8, 2001, 64-72.
Listening in 3D
A head-related transfer function (HRTF) characterizes how an individual's anatomy and the location of a sound source affect that individual's perception of sound. The size, shape, and density of the head, the shape of the ears and ear canals, and the distance between the ears all transform sound by amplifying some frequencies and attenuating others. Learning how sound is perceived is important in:

pinpointing the location of sound that is vital for safe navigation in traffic,

achieving a realistic acoustic environment in gaming and home cinema setups.
To measure an HRTF, one places a loudspeaker at various locations in space and a microphone at the ear. To recreate an authentic sound experience, slightly differently synthesized sounds are sent to each ear in accordance with a person's HRTF.
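As a rough illustration of sending differently synthesized sounds to each ear, the sketch below filters a mono signal with a separate impulse response per ear (the time-domain counterpart of an HRTF). The toy impulse responses and the function name `render_binaural` are hypothetical; measured responses would be far longer and specific to the listener and source direction.

```python
import numpy as np

def render_binaural(mono, hrir_left, hrir_right):
    """Filter one mono signal with a per-ear impulse response."""
    left = np.convolve(mono, hrir_left)     # signal reaching the left ear
    right = np.convolve(mono, hrir_right)   # signal reaching the right ear
    return np.stack([left, right])          # rows: left channel, right channel

# Toy responses (assumed) for a source on the listener's left:
# the right ear hears the sound two samples later and at half amplitude.
hrir_left = np.array([1.0, 0.0, 0.0])
hrir_right = np.array([0.0, 0.0, 0.5])

mono = np.array([1.0, -1.0, 0.5])
stereo = render_binaural(mono, hrir_left, hrir_right)
print(stereo)
```

The interaural delay and level difference in the two output rows are the kind of cues that let a listener localize the source.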
This is not surround sound, which uses multiple speakers to provide 360-degree sound.

"A Multilinear (Tensor) Framework for HRTF Analysis and Synthesis", G. Grindlay, M.A.O. Vasilescu, IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Honolulu, Hawaii, April, 2007
Paper (439KB - .pdf)
TensorTextures: Image-Based Rendering
One of the goals of computer graphics is photorealistic rendering, the synthesis of images of virtual scenes visually indistinguishable from those of natural scenes. Unlike traditional model-based rendering, whose photorealism is limited by model complexity, an emerging and highly active research area known as image-based rendering eschews complex geometric models in favor of representing scenes by ensembles of example images. These are used to render novel photoreal images of the scene from arbitrary viewpoints and illuminations, thus decoupling rendering from scene complexity. The challenge is to develop structured representations in high-dimensional image spaces that are rich enough to capture important information for synthesizing new images, including details such as self-occlusion, self-shadowing, interreflections, and subsurface scattering.
TensorTextures, a new image-based texture mapping technique, is a rich generative model that, from a sparse set of example images, learns the interaction between viewpoint, illumination, and geometry that determines detailed surface appearance. Mathematically, TensorTextures is a nonlinear model of texture image ensembles that exploits tensor algebra and the N-mode SVD to learn a representation of the bidirectional texture function (BTF) in which the multiple constituent factors, or modes (viewpoints and illuminations), are disentangled and represented explicitly.

"TensorTextures: Multilinear Image-Based Rendering", M. A. O. Vasilescu and D. Terzopoulos, Proc. ACM SIGGRAPH 2004 Conference Los Angeles, CA, August, 2004, in Computer Graphics Proceedings, Annual Conference Series, 2004, 336-342.
Paper (5,104 KB - .pdf)
Animations:
TensorTextures - AVI (54,225 KB)

TensorTextures Strategic Dimensionality Reduction - AVI (19,650 KB)

TensorTextures Trailer - AVI (17,605 KB)


"TensorTextures", M. A. O. Vasilescu and D. Terzopoulos, Sketches and Applications SIGGRAPH 2003 San Diego, CA, July, 2003.
Sketch (6MB - .pdf)
Adaptive Meshes: Physically Based Modeling
We develop adaptive mesh models for the nonuniform sampling and reconstruction of visual data. Adaptive meshes are dynamic models assembled from nodal masses connected by adjustable springs. Acting as mobile sampling sites, the nodes observe interesting properties of the input data, such as intensities, depths, gradients, and curvatures. The springs automatically adjust their stiffnesses based on the locally sampled information in order to concentrate nodes near rapid variations in the input data. The representational power of an adaptive mesh is enhanced by its ability to optimally distribute the available degrees of freedom of the reconstructed model in accordance with the local complexity of the data.
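A one-dimensional toy version of this mechanism can be sketched as follows. It is a simplified illustration under stated assumptions, not the published model: the input `signal`, the stiffness rule (one plus a gain times the observed gradient magnitude), the zero-rest-length springs, and the relaxation scheme are all choices made here for brevity.

```python
import math

def signal(x):
    """Hypothetical 1-D input with a sharp transition at x = 0.5."""
    return math.tanh(20.0 * (x - 0.5))

def relax_mesh(n_nodes=21, sweeps=500, gain=1.0):
    """Relax a chain of nodes joined by springs whose stiffness tracks the data."""
    nodes = [i / (n_nodes - 1) for i in range(n_nodes)]   # start uniformly spaced
    for _ in range(sweeps):
        # Each spring observes the local gradient magnitude and stiffens with it
        stiffness = [
            1.0 + gain * abs(signal(b) - signal(a)) / (b - a)
            for a, b in zip(nodes, nodes[1:])
        ]
        # Move each interior node to the equilibrium of its two springs;
        # the two end nodes stay fixed.
        for i in range(1, n_nodes - 1):
            k_left, k_right = stiffness[i - 1], stiffness[i]
            nodes[i] = (k_left * nodes[i - 1] + k_right * nodes[i + 1]) / (k_left + k_right)
    return nodes

mesh = relax_mesh()
spacing = [b - a for a, b in zip(mesh, mesh[1:])]
print(min(spacing), max(spacing))
```

Stiffer springs pull their end nodes closer together, so the node spacing shrinks around the transition at x = 0.5 and grows in the flat regions.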
We develop open adaptive mesh and closed adaptive shell surfaces based on triangular or rectangular elements. We propose techniques for hierarchically subdividing polygonal elements in adaptive meshes and shells. We also devise a discontinuity detection and preservation algorithm suitable for the model. Finally, motivated by (nonlinear, continuous dynamics, discrete observation) Kalman filtering theory, we generalize our model to the dynamic recursive estimation of nonrigidly moving surfaces.

"Adaptive meshes and shells: Irregular triangulation, discontinuities, and hierarchical subdivision," M. Vasilescu, D. Terzopoulos, in Proc. Computer Vision and Pattern Recognition Conf. (CVPR '92), Champaign, IL, June, 1992, pages 829-832.
Paper (652KB - .pdf)

"Sampling and Reconstruction with Adaptive Meshes," D. Terzopoulos, M. Vasilescu, in Proc. Computer Vision and Pattern Recognition Conf. (CVPR '91), Lahaina, HI, June, 1991, pages 70-75.
Paper (438KB - .pdf)