Decomposition

Multiview ICA

class mvlearn.decomposition.MultiviewICA(n_components=None, noise=1.0, max_iter=1000, init='permica', random_state=None, tol=0.001, verbose=False, n_jobs=30)[source]

Multiview ICA for which views share a common source but separate mixing matrices.

Parameters:

n_components : int, optional

Number of components to extract. If None, no dimension reduction is performed and all views must have the same number of features.

noise : float, default=1.0

Gaussian noise level

max_iter : int, default=1000

Maximum number of iterations to perform

init : {'permica', 'groupica'} or np array of shape (n_groups, n_components, n_components), default='permica'

If 'permica', initialize with PermICA; if 'groupica', initialize with GroupICA. Otherwise, use the provided array as the initialization.

random_state : int, RandomState instance or None, default=None

Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

tol : float, default=1e-3

A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged.

verbose : bool, default=False

Print information

n_jobs : int (positive), default=None

The number of jobs to run in parallel. None means 1 job, -1 means using all processors.

Attributes

components_ : np array of shape (n_groups, n_features, n_components)

The projection matrices that project each view's data into the reduced space

unmixings_ : np array of shape (n_groups, n_components, n_components)

Estimated un-mixing matrices

source_ : np array of shape (n_samples, n_components)

Estimated source

See also

GroupICA, PermICA

Notes

Given each view \(X_i\), the algorithm optimizes:

\[l(W) = \frac{1}{T} \sum_{t=1}^T \left[\sum_k \log(\cosh(Y_{avg,k,t})) + \sum_i l_i(X_{i,.,t})\right]\]

where

\[l_i(X_{i,.,t}) = - \log(|W_i|) + \frac{1}{2 \sigma} \|X_{i,.,t}W_i - Y_{avg,.,t}\|^2,\]

\(W_i\) is the un-mixing matrix for view \(i\), \(Y_{avg} = \frac{1}{n} \sum_{i=1}^n X_i W_i\), and \(\sigma\) is the noise level.
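To make the objective concrete, the sketch below evaluates \(l(W)\) with plain NumPy for a given set of candidate un-mixing matrices. This is an illustrative re-implementation, not mvlearn's internal code; the helper name multiview_ica_loss is hypothetical, and the views are assumed to already share a common number of components.

```python
import numpy as np

def multiview_ica_loss(Xs, Ws, sigma=1.0):
    """Evaluate the MultiView ICA objective from the Notes section.

    Xs : array (n_views, n_components, n_samples), reduced data
    Ws : array (n_views, n_components, n_components), un-mixing matrices
    sigma : float, Gaussian noise level (the `noise` parameter)
    """
    n_views, _, T = Xs.shape
    # Unmixed views Y_i = W_i X_i and their average Y_avg
    Ys = np.array([W @ X for W, X in zip(Ws, Xs)])
    Y_avg = Ys.mean(axis=0)
    loss = 0.0
    for W, Y in zip(Ws, Ys):
        # Per-view term: -log|W_i| + ||W_i X_i - Y_avg||^2 / (2 sigma),
        # averaged over the T samples
        loss += -np.linalg.slogdet(W)[1]
        loss += np.sum((Y - Y_avg) ** 2) / (2 * sigma * T)
    # Shared non-Gaussianity term: sum_k log cosh(Y_avg[k, t]), averaged over t
    loss += np.sum(np.log(np.cosh(Y_avg))) / T
    return loss
```

Lower values correspond to un-mixing matrices whose per-view sources agree with one another while the shared source looks non-Gaussian.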

References

[1] Hugo Richard, Luigi Gresele, Aapo Hyvärinen, Bertrand Thirion, Alexandre Gramfort, Pierre Ablin. Modeling Shared Responses in Neuroimaging Studies through MultiView ICA. arXiv, 2020.

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import MultiviewICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = MultiviewICA(n_components=3, max_iter=10)
>>> sources = ica.fit_transform(Xs)
>>> print(sources.shape)
(6, 2000, 3)
fit(Xs, y=None)[source]

Fits the model to the views Xs.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

self : returns an instance of itself.

fit_transform(Xs, y=None)

Fit to the data and transform the data

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

y : array, shape (n_samples,), optional

Returns:

X_transformed : list of array-likes

  • out length: n_views
  • out[i] shape: (n_samples, n_components_i)
inverse_transform()

Transforms the sources back to the mixed data for each view by applying each view's mixing matrix.

Parameters:

None

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The mixed sources from the single source and per-view unmixings.

transform(Xs, y=None)

Recovers the sources from each view by applying each view's un-mixing matrix.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The recovered sources for each view.

Permutation ICA

class mvlearn.decomposition.PermICA(n_components=None, max_iter=1000, random_state=None, tol=1e-07, n_jobs=None)[source]

Performs one ICA per view (e.g., per subject) and aligns the sources across views using the Hungarian algorithm.
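The idea can be sketched with scikit-learn and SciPy: fit one FastICA per view, then permute each view's sources to best match a reference view via the Hungarian algorithm on absolute correlations. This is a simplified illustration with a hypothetical helper name (perm_align_sources), not mvlearn's internal implementation, and it uses the first view as the reference.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.decomposition import FastICA

def perm_align_sources(Xs, n_components, random_state=0):
    # One ICA per view
    sources = [
        FastICA(n_components=n_components, random_state=random_state).fit_transform(X)
        for X in Xs
    ]
    ref = sources[0]
    aligned = [ref]
    for S in sources[1:]:
        # Cost matrix: negative absolute correlation between reference
        # sources (rows) and this view's sources (columns)
        corr = np.abs(np.corrcoef(ref.T, S.T)[:n_components, n_components:])
        _, cols = linear_sum_assignment(-corr)
        # Reorder this view's sources to match the reference ordering
        aligned.append(S[:, cols])
    return np.array(aligned)  # shape (n_views, n_samples, n_components)
```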

Parameters:

n_components : int, optional

Number of components to extract. If None, no dimension reduction is performed and all views must have the same number of features.

max_iter : int, default=1000

Maximum number of iterations to perform

random_state : int, RandomState instance or None, default=None

Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

tol : float, default=1e-7

A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged.

n_jobs : int (positive), default=None

The number of jobs to run in parallel. None means 1 job, -1 means using all processors.

Attributes

components_ : np array of shape (n_groups, n_features, n_components)

The projection matrices that project each view's data into the reduced space

unmixings_ : np array of shape (n_groups, n_components, n_components)

Estimated un-mixing matrices

source_ : np array of shape (n_samples, n_components)

Estimated source

See also

GroupICA, MultiviewICA

References

[2] Hugo Richard, Luigi Gresele, Aapo Hyvärinen, Bertrand Thirion, Alexandre Gramfort, Pierre Ablin. Modeling Shared Responses in Neuroimaging Studies through MultiView ICA. arXiv, 2020.

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import PermICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = PermICA(n_components=3)
>>> sources = ica.fit_transform(Xs)
>>> print(sources.shape)
(6, 2000, 3)
fit(Xs, y=None)[source]

Fits the model to the views Xs.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

self : returns an instance of itself.

fit_transform(Xs, y=None)

Fit to the data and transform the data

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

y : array, shape (n_samples,), optional

Returns:

X_transformed : list of array-likes

  • out length: n_views
  • out[i] shape: (n_samples, n_components_i)
inverse_transform()

Transforms the sources back to the mixed data for each view by applying each view's mixing matrix.

Parameters:

None

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The mixed sources from the single source and per-view unmixings.

transform(Xs, y=None)

Recovers the sources from each view by applying each view's un-mixing matrix.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The recovered sources for each view.

Group ICA

class mvlearn.decomposition.GroupICA(n_components=None, max_iter=1000, random_state=None, tol=1e-07, n_jobs=None)[source]

Performs PCA on the data concatenated across groups (e.g., subjects) and applies ICA to the reduced data.
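The pipeline can be sketched in a few lines with scikit-learn. This is a simplified illustration with a hypothetical helper name (group_ica_sketch), not mvlearn's internal implementation, which also recovers per-view unmixings.

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA

def group_ica_sketch(Xs, n_components, random_state=0):
    # Concatenate the views feature-wise: (n_samples, sum_i n_features_i)
    X_concat = np.hstack(Xs)
    # PCA reduction of the concatenated data
    reduced = PCA(n_components=n_components,
                  random_state=random_state).fit_transform(X_concat)
    # A single ICA on the reduced data recovers the shared sources
    sources = FastICA(random_state=random_state).fit_transform(reduced)
    return sources  # (n_samples, n_components)
```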

Parameters:

n_components : int, optional

Number of components to extract. If None, no dimension reduction is performed and all views must have the same number of features.

max_iter : int, default=1000

Maximum number of iterations to perform

random_state : int, RandomState instance or None, default=None

Used to perform a random initialization. If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.random.

tol : float, default=1e-7

A positive scalar giving the tolerance at which the un-mixing matrices are considered to have converged.

n_jobs : int (positive), default=None

The number of jobs to run in parallel. None means 1 job, -1 means using all processors.

Attributes

components_ : np array of shape (n_groups, n_features, n_components)

The projection matrices that project each view's data into the reduced space

unmixings_ : np array of shape (n_groups, n_components, n_components)

Estimated un-mixing matrices

source_ : np array of shape (n_samples, n_components)

Estimated source

See also

PermICA, MultiviewICA

References

[3] Vince D. Calhoun, Tülay Adali, Godfrey D. Pearlson, and James J. Pekar. A method for making group inferences from functional MRI data using independent component analysis. Human Brain Mapping, 14(3):140–151, 2001.

Examples

>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.decomposition import GroupICA
>>> Xs, _ = load_UCImultifeature()
>>> ica = GroupICA(n_components=3)
>>> sources = ica.fit_transform(Xs)
>>> print(sources.shape)
(6, 2000, 3)
fit(Xs, y=None)[source]

Fits the model to the views Xs.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

self : returns an instance of itself.

fit_transform(Xs, y=None)

Fit to the data and transform the data

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

y : array, shape (n_samples,), optional

Returns:

X_transformed : list of array-likes

  • out length: n_views
  • out[i] shape: (n_samples, n_components_i)
inverse_transform()

Transforms the sources back to the mixed data for each view by applying each view's mixing matrix.

Parameters:

None

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The mixed sources from the single source and per-view unmixings.

transform(Xs, y=None)

Recovers the sources from each view by applying each view's un-mixing matrix.

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

Training data to recover a source and unmixing matrices from.

y : ignored

Returns:

Xs_new : numpy.ndarray, shape (n_views, n_samples, n_components)

The recovered sources for each view.

Angle-Based Joint and Individual Variation Explained (AJIVE)

AJIVE

class mvlearn.decomposition.AJIVE(init_signal_ranks=None, joint_rank=None, indiv_ranks=None, n_elbows=None, center=True, reconsider_joint_components=True, wedin_percentile=5, n_wedin_samples=1000, precomp_wedin_samples=None, randdir_percentile=95, n_randdir_samples=1000, precomp_randdir_samples=None, store_full=True)[source]

An implementation of Angle-Based Joint and Individual Variation Explained [4]. This algorithm takes multiple views and decomposes each one into 3 distinct matrices representing:

  • Low rank approximation of individual variation within each view
  • Low rank approximation of joint variation between views
  • Residual noise

Note that AJIVE can handle any number of views, not just two.

Parameters:

init_signal_ranks : list, default = None

Initial guesses as to the rank of each view's signal.

joint_rank : int, default = None

Rank of the joint variation matrix. If None, will estimate the joint rank. Otherwise, will use provided joint rank.

indiv_ranks : list, default = None

Ranks of individual variation matrices. If None, will estimate the individual ranks. Otherwise, will use provided individual ranks.

n_elbows : int, optional, default: 2

If init_signal_ranks=None, then computes the initial signal rank guess using mvlearn.embed.utils.select_dimension() with n_elbows for each view.

center : bool, default = True

Boolean for centering matrices.

reconsider_joint_components : bool, default = True

Triggers the _reconsider_joint_components function, which removes columns of the joint scores according to the identifiability constraint.

wedin_percentile : int, default = 5

Percentile used for wedin (lower) bound cutoff for squared singular values used to estimate joint rank.

n_wedin_samples : int, default = 1000

Number of wedin bound samples to draw.

precomp_wedin_samples : list of array-like

Wedin samples that are precomputed for each view.

randdir_percentile : int, default = 95

Percentile for random direction (lower) bound cutoff for squared singular values used to estimate joint rank.

n_randdir_samples : int, default = 1000

Number of random direction samples to draw.

precomp_randdir_samples : array-like, default = None

Precomputed random direction samples.

Attributes

common_ (mvlearn.factorization.ajive_utils.pca.pca) The common joint space, found using the pca class.
blocks_ (dict) The block-specific results.
centers_ (dict) The centering vectors computed for each matrix.
sv_threshold_ (dict) The singular value thresholds computed for each block from the initial SVD; eventually used to estimate the individual ranks.
all_joint_svals_ (list) All singular values from the concatenated joint matrix.
random_sv_samples_ (list) Random singular value samples from the random direction bound.
rand_cutoff_ (float) Squared singular value cutoff for the random direction bound.
wedin_samples_ (dict) The wedin samples for each view.
wedin_cutoff_ (float) Squared singular value cutoff for the wedin bound.
svalsq_cutoff_ (float) max(rand_cutoff_, wedin_cutoff_)
joint_rank_wedin_est_ (int) The joint rank estimated using the wedin/random direction bound.
init_signal_rank_ (dict) init_signal_rank as a dictionary with an entry for each view.
joint_rank_ (int) The rank of the joint matrix.
indiv_ranks_ (dict of ints) Ranks of the individual matrices for each view.
center_ (dict) center as a dictionary with an entry for each view.
is_fit_ (bool, default = False) Whether the model has been fit.

Notes

Angle-Based Joint and Individual Variation Explained (AJIVE) is a specific variation of the Joint and Individual Variation Explained (JIVE) algorithm. The algorithm takes \(k\) different views with \(n\) observations and \(d\) variables and finds a basis that represents the joint variation, together with \(k\) bases, each with its own rank, representing the individual variation of each view. Each individual basis is orthonormal to the joint basis. These bases are then used to form the following decomposition for each of the \(k\) views:

\[X^{(i)}= J^{(i)} + I^{(i)} + E^{(i)}\]

where \(X^{(i)}\) represents the i-th view of data and \(J^{(i)}\), \(I^{(i)}\), and \(E^{(i)}\) represent its joint, individual, and noise signal estimates respectively.

The AJIVE algorithm can be split into three separate steps:
  • Signal Space Initial Extraction
  • Score Space Segmentation
  • Final Decomposition and Outputs

In the Signal Space Initial Extraction step we compute a rank-\(r_{initial}^{(i)}\) singular value decomposition for each \(X^{(i)}\). The value of \(r_{initial}^{(i)}\) can be chosen by inspecting the scree plot of each view or by thresholding on the singular values. From each singular value decomposition, the first \(r_{initial}^{(i)}\) columns of the scores matrix (\(U^{(i)}\)) are taken to form \(\widetilde{U}^{(i)}\).
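This first step amounts to one truncated SVD per centered view. A minimal NumPy sketch follows; the helper name initial_signal_extraction is hypothetical and this is not mvlearn's internal code.

```python
import numpy as np

def initial_signal_extraction(Xs, init_signal_ranks):
    """Keep the first r_i left singular vectors of each centered view."""
    U_tildes = []
    for X, r in zip(Xs, init_signal_ranks):
        Xc = X - X.mean(axis=0)          # column-center the view
        U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
        U_tildes.append(U[:, :r])        # scores U~_i, shape (n_samples, r_i)
    return U_tildes
```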

After this, the Score Space Segmentation step concatenates the \(k\) \(\widetilde{U}^{(i)}\) matrices found in the first step as follows:

\[M = [\widetilde{U}^{(1)}, \dots, \widetilde{U}^{(k)}]\]

From here, a rank-\(r_{joint}\) singular value decomposition of \(M\) is taken. \(r_{joint}\) is either specified by the user or estimated using the wedin bound, which quantifies how much noise perturbs the theoretical singular subspaces, thereby bounding the distance between the original input's rank and the estimate. The first \(r_{joint}\) columns of the scores of this decomposition form the basis \(U_{joint}\). The joint signal estimate \(J^{(i)}\) is then found by projecting \(X^{(i)}\) onto \(U_{joint}\).
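The segmentation step is a second SVD on the concatenated score matrices. A minimal sketch with a hypothetical helper name (not mvlearn's internal code), assuming \(r_{joint}\) is already known:

```python
import numpy as np

def score_space_segmentation(U_tildes, joint_rank):
    """Concatenate M = [U~_1, ..., U~_k] and keep the first r_joint
    left singular vectors as the joint basis U_joint."""
    M = np.hstack(U_tildes)
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U[:, :joint_rank]
```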

In the Final Decomposition and Outputs step, we project each \(X^{(i)}\) matrix onto the orthogonal complement of \(U_{joint}\):

\[X^{(i), orthog} = (I - U_{joint}U_{joint}^T)X^{(i)}\]

\(I\) in the above equation represents the identity matrix. From here, the individual signal estimate \(I^{(i)}\) can be found by performing a rank-\(r_{individual}^{(i)}\) singular value decomposition of \(X^{(i), orthog}\). \(r_{individual}^{(i)}\) can be chosen using the aforementioned singular value thresholding method.

Finally, we can solve for the noise estimate \(E^{(i)}\) using the equation:

\[E^{(i)}= X^{(i)} - (J^{(i)} + I^{(i)})\]
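The whole final step for a single view can be sketched as follows (a hypothetical helper name, not mvlearn's internal code); note that by construction the three estimates sum back to \(X^{(i)}\).

```python
import numpy as np

def final_decomposition(X, U_joint, indiv_rank):
    """Split one view into joint, individual, and noise estimates."""
    # Joint estimate: projection of X onto the joint basis
    J = U_joint @ (U_joint.T @ X)
    # Orthogonal complement: (I - U_joint U_joint^T) X
    X_orthog = X - J
    # Individual estimate: rank-r truncated SVD of the complement
    U, s, Vt = np.linalg.svd(X_orthog, full_matrices=False)
    I_mat = U[:, :indiv_rank] @ np.diag(s[:indiv_rank]) @ Vt[:indiv_rank]
    # Noise estimate: the residual
    E = X - (J + I_mat)
    return J, I_mat, E
```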

Much of this implementation has been adapted from jive, the pip-installable package by Iain Carmichael, Ph.D.

References

[4] Feng, Qing, et al. “Angle-Based Joint and Individual Variation Explained.” Journal of Multivariate Analysis, vol. 166, 2018, pp. 241–265. doi:10.1016/j.jmva.2018.03.008.

Examples

>>> from mvlearn.factorization.ajive import AJIVE
>>> from mvlearn.datasets import load_UCImultifeature
>>> Xs, _ = load_UCImultifeature()
>>> print(len(Xs)) # number of views
6
>>> print(Xs[0].shape, Xs[1].shape) # samples and features of the first two views
(2000, 76) (2000, 216)
>>> ajive = AJIVE(init_signal_ranks=[2,2])
>>> b = ajive.fit(Xs).predict()
>>> print(len(b))
6
>>> print(b[0][0].shape,b[1][0].shape)  # (V1 joint mat, V2 joint mat)
(2000, 76) (2000, 216)
>>> print(b[0][1].shape,b[1][1].shape)  # (V1 indiv mat, V2 indiv mat)
(2000, 76) (2000, 216)
>>> print(b[0][2].shape,b[1][2].shape)  # (V1 noise mat, V2 noise mat)
(2000, 76) (2000, 216)
fit(Xs, view_names=None, precomp_init_svd=None)[source]

Learns the AJIVE decomposition from Xs.

Parameters:

Xs: list of array-likes

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

The different views, input as data matrices.

precomp_init_svd: dict or list

Precomputed initial SVD. Must have one entry for each data block. The SVD should be an ordered list of 3 matrices (scores, svals, loadings), see output of .ajive_utils/utils/svd_wrapper for formatting details.

view_names: array-like, default = None

Optional. The names of the views. If None, the views are named 1, 2, ..., n_views.

Returns:

self : returns an instance of self.

block_names
Returns:

block_names: list

The names of the views.

transform(Xs=None, return_dict=False)[source]

Returns the joint, individual, and noise components of each view from the fitted decomposition. Only operates on the data passed to fit.

Parameters:

Xs : ignored

Not used but included for API consistency. Predictions come from the fitted data.

return_dict: bool, default = False

If True, return is in dictionary format, if False, return is in list format.

Returns:

full: list of lists of np.arrays, or dict of dicts of np.arrays

The full joint, individual, and noise estimates for each block. In list format, the initial indices represent each view; within a view, index 0 is the full joint estimate, index 1 is the full individual estimate, and index 2 is the noise matrix. The dictionary format returns the same matrices as dictionaries, with the top-level keys representing views and each view's keys representing the type of matrix ('joint', 'individual', 'noise').

results_dict()[source]

Returns a summary of the fitted results in a dictionary.

Returns:

results: dict of dict of dict of np.arrays

Returns n+1 dicts, where n is the number of input views. The first dict is named 'common' and contains the common scores, loadings, and rank of the views. The next n dicts represent each view. They each have the following keys:

  • 'joint'
  • 'individual'
  • 'noise'

The 'joint' and 'individual' keys are dicts with the following keys referencing their respective estimates:

  • 'scores'
  • 'svals'
  • 'loadings'
  • 'rank'
  • 'full'

The 'noise' key is the full noise matrix estimate of the view.

get_ranks()[source]

Returns the joint and individual ranks of the decomposition.

Returns:

joint_rank: int

The joint rank

indiv_ranks: dict

The individual ranks.

fit_transform(Xs, y=None)

Fit to the data and transform the data

Parameters:

Xs : list of array-likes or numpy.ndarray

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

y : array, shape (n_samples,), optional

Returns:

X_transformed : list of array-likes

  • out length: n_views
  • out[i] shape: (n_samples, n_components_i)

AJIVE Plotting Functions

mvlearn.decomposition.data_block_heatmaps(Xs)[source]

Plots n_views heatmaps in a single row. It is recommended to set a figure size, as shown in the tutorial.

Parameters:

Xs : dict or list of array-likes

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

The different views to plot.

mvlearn.decomposition.ajive_full_estimate_heatmaps(Xs, full_block_estimates, names=None)[source]
Plots four heatmaps for each of the views:
  • Full initial data
  • Full joint signal estimate
  • Full individual signal estimate
  • Full noise estimate

It is recommended to set a figure size as shown in the tutorial.

Parameters:

Xs: list of array-likes

  • Xs length: n_views
  • Xs[i] shape: (n_samples, n_features_i)

The different views, input as data matrices.

full_block_estimates: dict

The dict returned by the ajive.predict() function.

names: list

The names of the views.