Clustering
Multiview Spectral Clustering

class mvlearn.cluster.MultiviewSpectralClustering(n_clusters=2, random_state=None, info_view=None, max_iter=10, n_init=10, affinity='rbf', gamma=None, n_neighbors=10)
An implementation of multiview spectral clustering using the basic co-training framework described in [1]. It can also be effective when a dataset naturally contains features of two different data types, such as continuous and categorical features [4]; the original features are then split into two views along that division.
This algorithm can handle 2 or more views of data.
Parameters: n_clusters : int
The number of clusters
random_state : int, optional, default=None
Determines random number generation for k-means.
info_view : int, optional, default=None
The most informative view. Must be between 0 and n_views - 1. If given, the final clustering is performed on the designated view alone. Otherwise, the algorithm concatenates the spectral embeddings of all views and clusters the result.
max_iter : int, optional, default=10
The maximum number of iterations to run the clustering algorithm.
n_init : int, optional, default=10
The number of random initializations to use for k-means clustering.
affinity : string, optional, default='rbf'
The affinity metric used to construct the affinity matrix. Options include 'rbf' (radial basis function), 'nearest_neighbors', and 'poly' (polynomial).
gamma : float, optional, default=None
Kernel coefficient for rbf and polynomial kernels. If None, gamma is computed as 1 / (2 * median(pair_wise_distances(X))^2) for each data view X.
n_neighbors : int, optional, default=10
The number of neighbors to use for the nearest neighbors kernel. Only used if affinity is 'nearest_neighbors'.
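The default gamma rule described above can be written out in a few lines of NumPy. This is a minimal sketch rather than the library's own code: the helper name median_heuristic_gamma is hypothetical, and taking the median over the full distance matrix (zero diagonal included) is an assumption.

```python
import numpy as np


def median_heuristic_gamma(X):
    """Hypothetical helper: gamma = 1 / (2 * median(pairwise distances)^2)."""
    X = np.asarray(X, dtype=float)
    # Dense Euclidean distance matrix of shape (n_samples, n_samples).
    diffs = X[:, None, :] - X[None, :, :]
    dists = np.sqrt(np.sum(diffs ** 2, axis=-1))
    # Assumption: the median is taken over every entry of the matrix.
    return 1.0 / (2.0 * np.median(dists) ** 2)


# Three points on a line: the nine pairwise distances are
# 0, 0, 0, 1, 1, 2, 2, 3, 3, so the median is 1.
X_demo = np.array([[0.0], [1.0], [3.0]])
gamma_demo = median_heuristic_gamma(X_demo)  # 0.5
```

Because gamma scales with the inverse squared median distance, each view gets a kernel width matched to its own feature scale.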
Attributes
labels_ : array-like, shape (n_samples,)
Cluster labels for each sample in the fitted data.
embedding_ : array-like, shape (n_samples, n_clusters)
The final spectral representation of the data, used as input to the k-means clustering step.
Notes
Multiview spectral clustering adapts the spectral clustering algorithm to applications where more than one view of data is available. This algorithm relies on the basic assumptions of co-training, which are: (a) Sufficiency: each view is sufficient for classification on its own, (b) Compatibility: the target functions in both views predict the same labels for co-occurring features with high probability, and (c) Conditional independence: the views are conditionally independent given the class labels. In contrast to multiview k-means clustering, multiview spectral clustering performs well on arbitrarily shaped clusters, and can therefore be readily used in applications where clusters are not expected to be convex. However, multiview spectral clustering tends to be computationally expensive unless the similarity graph for the data is sparse.
Multiview spectral clustering works by using the spectral embedding from one view to constrain the similarity graph in the other view. By iteratively applying this procedure, the clusterings of the two views tend toward each other. The algorithm for 2 views is outlined here.
Multiview Spectral Clustering Algorithm (for 2 views)
Input: Similarity matrices for both views: \(\mathbf{K}_1, \mathbf{K}_2\)
Output: Assignments to k clusters
Initialize: \(\mathbf{L}_v = \mathbf{D}_v^{-1/2} \mathbf{K}_v \mathbf{D}_v^{-1/2}\) for \(v = 1, 2\), where \(\mathbf{D}_v\) is the diagonal degree matrix of \(\mathbf{K}_v\)
\(\mathbf{U}_v^0\) is an \(n \times k\) matrix with the top k eigenvectors of \(\mathbf{L}_v\) for \(v = 1, 2\)
For \(i = 1\) to iter:
 \(\mathbf{S}_1 = sym(\mathbf{U}_2^{i-1} {\mathbf{U}_2^{i-1}}^T \mathbf{K}_1)\)
 \(\mathbf{S}_2 = sym(\mathbf{U}_1^{i-1} {\mathbf{U}_1^{i-1}}^T \mathbf{K}_2)\)
 Use \(\mathbf{S}_1\) and \(\mathbf{S}_2\) as the new graph similarities and compute the Laplacians. Solve for the largest k eigenvectors to obtain \(\mathbf{U}_1^i\) and \(\mathbf{U}_2^i\).
Row-normalize \(\mathbf{U}_1^i\) and \(\mathbf{U}_2^i\).
Form matrix \(\mathbf{V} = \mathbf{U}_v^i\), where \(v\) is believed to be the most informative view a priori. If there is no prior knowledge of view informativeness, \(\mathbf{V}\) can instead be set to the column-wise concatenation of the two \(\mathbf{U}_v^i\).
Assign example j to cluster c if the j-th row of \(\mathbf{V}\) is assigned to cluster c by the k-means algorithm.
Here \(sym(\mathbf{A}) = (\mathbf{A} + \mathbf{A}^T) / 2\).
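The steps above can be sketched in plain NumPy. This is an illustrative reading of the two-view procedure, not the library's implementation: the function names are hypothetical, the RBF affinity with a fixed gamma is an assumption, absolute row sums are used as degrees purely as a numerical safeguard, and the final k-means step is left out so the sketch stops at the matrix V.

```python
import numpy as np


def rbf_affinity(X, gamma):
    """Dense RBF affinity K[i, j] = exp(-gamma * ||x_i - x_j||^2)."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def sym(A):
    """Symmetrize a square matrix."""
    return (A + A.T) / 2.0


def top_k_eigenvectors(S, k):
    """Top-k eigenvectors of D^{-1/2} S D^{-1/2}."""
    # Assumption: absolute degrees guard against negative row sums,
    # which the projection step can in principle produce.
    d_inv_sqrt = 1.0 / np.sqrt(np.abs(S.sum(axis=1)))
    L = sym(d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :])
    _, vecs = np.linalg.eigh(L)  # eigenvalues come back in ascending order
    return vecs[:, -k:]


def cotrain_spectral_embedding(K1, K2, k, n_iter=5):
    """Return V, the column-wise concatenation of both views' embeddings."""
    U1 = top_k_eigenvectors(K1, k)
    U2 = top_k_eigenvectors(K2, k)
    for _ in range(n_iter):
        # Each view's graph is projected onto the other view's embedding.
        S1 = sym(U2 @ U2.T @ K1)
        S2 = sym(U1 @ U1.T @ K2)
        U1 = top_k_eigenvectors(S1, k)
        U2 = top_k_eigenvectors(S2, k)
    # Row-normalize once the iterations are finished.
    U1 = U1 / np.linalg.norm(U1, axis=1, keepdims=True)
    U2 = U2 / np.linalg.norm(U2, axis=1, keepdims=True)
    return np.hstack([U1, U2])


# Toy data: two well-separated groups of 20 samples, seen in two views.
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 10.0], 20)[:, None]
X1 = offsets + 0.3 * rng.standard_normal((40, 2))
X2 = offsets + 0.3 * rng.standard_normal((40, 3))
V = cotrain_spectral_embedding(rbf_affinity(X1, 0.5), rbf_affinity(X2, 0.5), k=2)
```

On data this cleanly separated, rows of V that belong to the same group point in nearly the same direction, which is what makes the subsequent k-means step easy.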
References
[1] Abhishek Kumar and Hal Daumé. A Co-training Approach for Multi-view Spectral Clustering. In International Conference on Machine Learning, 2011.
Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.cluster import MultiviewSpectralClustering
>>> from sklearn.metrics import normalized_mutual_info_score as nmi_score
>>> # Get 5-class data
>>> data, labels = load_UCImultifeature(select_labeled = list(range(5)))
>>> mv_data = data[:2] # first 2 views only
>>> mv_spectral = MultiviewSpectralClustering(n_clusters=5,
...     random_state=10, n_init=100)
>>> mv_clusters = mv_spectral.fit_predict(mv_data)
>>> nmi = nmi_score(labels, mv_clusters)
>>> print('{0:.3f}'.format(nmi))
0.872

fit(Xs, y=None)
Performs clustering on the multiple views of data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size n_views, corresponding to the number of views of data. Each view can have a different number of features, but all views must have the same number of samples.
y : Ignored
Not used; present for API consistency by convention.
Returns: self : returns an instance of self.
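The shape requirements on Xs can be illustrated with a small check. This is only a sketch: check_views is a hypothetical helper written for this example, not a function provided by the library.

```python
import numpy as np


def check_views(Xs):
    """Hypothetical helper: return (n_views, n_samples) or raise ValueError."""
    views = [np.asarray(X) for X in Xs]
    if len(views) < 2:
        raise ValueError("expected at least two views")
    n_samples = views[0].shape[0]
    for i, X in enumerate(views):
        if X.ndim != 2:
            raise ValueError(f"view {i} must be 2-dimensional")
        if X.shape[0] != n_samples:
            raise ValueError(f"view {i} has a different number of samples")
    return len(views), n_samples


# Two views of the same 6 samples with 3 and 10 features respectively.
views = [np.zeros((6, 3)), np.zeros((6, 10))]
summary = check_views(views)  # (2, 6)
```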

fit_predict(Xs, y=None)
A method for fitting then predicting cluster assignments.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
A list of different views to fit the model on.
y : array-like, shape (n_samples,)
Labels for each sample. Only used by supervised algorithms.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.

predict(Xs)
A method to predict cluster labels of multiview data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
A list of different views to cluster.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.
Co-Regularized Multiview Spectral Clustering

class mvlearn.cluster.MultiviewCoRegSpectralClustering(n_clusters=2, v_lambda=2, random_state=None, info_view=None, max_iter=10, n_init=10, affinity='rbf', gamma=None, n_neighbors=10)
An implementation of co-regularized multiview spectral clustering based on an unsupervised version of the co-training framework. This algorithm uses the pairwise co-regularization scheme described in [2]. This algorithm can handle 2 or more views of data.
Parameters: n_clusters : int
The number of clusters
v_lambda : float, optional, default=2
The regularization parameter. This parameter trades off the spectral clustering objectives against the degree of agreement between each pair of views in the new representation. Must be a positive value.
random_state : int, optional, default=None
Determines random number generation for k-means.
info_view : int, optional, default=None
The most informative view. Must be between 0 and n_views - 1. If given, the final clustering is performed on the designated view alone. Otherwise, the algorithm concatenates across all views and clusters the result.
max_iter : int, optional, default=10
The maximum number of iterations to run the clustering algorithm.
n_init : int, optional, default=10
The number of random initializations to use for k-means clustering.
affinity : string, optional, default='rbf'
The affinity metric used to construct the affinity matrix. Options include 'rbf' (radial basis function), 'nearest_neighbors', and 'poly' (polynomial).
gamma : float, optional, default=None
Kernel coefficient for rbf and polynomial kernels. If None, gamma is computed as 1 / (2 * median(pair_wise_distances(X))^2) for each data view X.
n_neighbors : int, optional, default=10
The number of neighbors to use for the nearest neighbors kernel. Only used if affinity is 'nearest_neighbors'.
Attributes
labels_ : array-like, shape (n_samples,)
Cluster labels for each point.
embedding_ : array-like, shape (n_samples, n_clusters)
The final spectral representation of the data, used as input to the k-means clustering step.
objective_ : array-like, shape (n_views, n_iterations)
The value of the spectral clustering objective for each view at the end of each iteration.
Notes
In standard spectral clustering, the eigenvector matrix U for a given view is the new data representation used in the subsequent k-means clustering stage. In this algorithm, the objective function is altered to encourage the pairwise similarities of examples under the new representation to be similar across all views.
The modified spectral clustering objective for the case of two views is shown and derived in [2]. In the clustering objective, the hyperparameter lambda trades off the spectral clustering objectives against the disagreement term.
For a fixed lambda and n, the objective function is bounded from above and non-decreasing. As such, the algorithm is guaranteed to converge.
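The non-decreasing behaviour can be checked numerically with an alternating update for two views. The sketch below assumes the two-view objective tr(U1' L1 U1) + tr(U2' L2 U2) + lambda * tr(U1 U1' U2 U2') from [2], where each update takes the top k eigenvectors of L_v + lambda * U_w U_w'. All function names are hypothetical, the toy data is random, and this is not the library's code.

```python
import numpy as np


def normalized_affinity(X, gamma=0.5):
    """D^{-1/2} K D^{-1/2} for a dense RBF affinity K."""
    sq = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * sq_dists)
    d_inv_sqrt = 1.0 / np.sqrt(K.sum(axis=1))
    return d_inv_sqrt[:, None] * K * d_inv_sqrt[None, :]


def top_k(M, k):
    """Eigenvectors for the k largest eigenvalues of a symmetric matrix."""
    _, vecs = np.linalg.eigh((M + M.T) / 2.0)
    return vecs[:, -k:]


def objective(L1, L2, U1, U2, lam):
    """Sum of both spectral objectives plus the weighted agreement term."""
    agreement = np.trace(U1 @ U1.T @ U2 @ U2.T)
    return np.trace(U1.T @ L1 @ U1) + np.trace(U2.T @ L2 @ U2) + lam * agreement


def coreg_two_views(L1, L2, k, lam=2.0, n_iter=10):
    """Alternate between the views, recording the objective after each pass."""
    U1, U2 = top_k(L1, k), top_k(L2, k)
    history = [objective(L1, L2, U1, U2, lam)]
    for _ in range(n_iter):
        # With U2 fixed, the best U1 spans the top eigenvectors of this matrix.
        U1 = top_k(L1 + lam * (U2 @ U2.T), k)
        U2 = top_k(L2 + lam * (U1 @ U1.T), k)
        history.append(objective(L1, L2, U1, U2, lam))
    return U1, U2, np.array(history)


rng = np.random.default_rng(0)
L1 = normalized_affinity(rng.standard_normal((30, 4)))
L2 = normalized_affinity(rng.standard_normal((30, 6)))
U1, U2, history = coreg_two_views(L1, L2, k=3)
```

Each update maximizes the objective over one view with the other held fixed, so the recorded values cannot decrease, and they stay below k * (2 + lambda) because each trace term is at most k.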
References
[2] Kumar A, Rai P, Daumé H (2011) Co-regularized multi-view spectral clustering. Adv Neural Inform Process Syst 24:1413–1421
Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.cluster import MultiviewCoRegSpectralClustering
>>> from sklearn.metrics import normalized_mutual_info_score as nmi_score
>>> # Get 5-class data
>>> data, labels = load_UCImultifeature(select_labeled = list(range(5)))
>>> mv_data = data[:2] # first 2 views only
>>> mv_spectral = MultiviewCoRegSpectralClustering(n_clusters=5,
...     random_state=10, n_init=100)
>>> mv_clusters = mv_spectral.fit_predict(mv_data)
>>> nmi = nmi_score(labels, mv_clusters, average_method='arithmetic')
>>> print('{0:.3f}'.format(nmi))
0.663

fit(Xs)
Performs clustering on the multiple views of data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size n_views, corresponding to the number of views of data. Each view can have a different number of features, but all views must have the same number of samples.
Returns: self : returns an instance of self.

fit_predict(Xs, y=None)
Performs clustering on the multiple views of data and returns the cluster labels.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size n_views, corresponding to the number of views of data. Each view can have a different number of features, but all views must have the same number of samples.
y : Ignored
Included for API compliance.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.

predict(Xs)
A method to predict cluster labels of multiview data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
A list of different views to cluster.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.

Multiview K Means

class mvlearn.cluster.MultiviewKMeans(n_clusters=2, random_state=None, init='k-means++', patience=5, max_iter=300, n_init=5, tol=0.0001, n_jobs=None)
This class implements multiview k-means using the co-EM framework described in [3]. This algorithm is most suitable for cases in which the different views of data are conditionally independent. It can also be effective when a dataset naturally contains features of two different data types, such as continuous and categorical features [4]; the original features are then split into two views along that division.
This algorithm currently handles two views of data.
Parameters: n_clusters : int, optional, default=2
The number of clusters
random_state : int, optional, default=None
Determines random number generation for initializing centroids. Can seed the random number generator with an int.
init : {'k-means++', 'random'} or list of array-likes, default='k-means++'
Method of initializing centroids.
'k-means++': selects initial cluster centers for k-means clustering via a method that speeds up convergence.
'random': chooses n_clusters samples from the data as the initial centroids.
If a list of array-likes is passed, the list should have a length equal to the number of views. Each array-like should have the shape (n_clusters, n_features_i) for the i-th view, where n_features_i is the number of features in the i-th view of the input data.
patience : int, optional, default=5
The number of EM iterations with no decrease in the objective function after which the algorithm will terminate.
max_iter : int, optional, default=300
The maximum number of EM iterations to run before termination.
n_init : int, optional, default=5
Number of times the k-means algorithm will run on different centroid seeds. The final result will be the best output of n_init runs with respect to total inertia across all views.
tol : float, default=1e-4
Relative tolerance with regard to inertia to declare convergence.
n_jobs : int, default=None
The number of jobs to use for computation. This works by computing each of the n_init runs in parallel. None means 1. -1 means using all processors.
Attributes
labels_ : array-like, shape (n_samples,)
Cluster labels for each sample in the fitted data.
centroids_ : list of array-likes
 centroids_ length: n_views
 centroids_[i] shape: (n_clusters, n_features_i)
The cluster centroids for each of the two views. centroids_[0] corresponds to the centroids of view 1 and centroids_[1] corresponds to the centroids of view 2.
Notes
Multiview k-means clustering adapts the traditional k-means clustering algorithm to handle two views of data. This algorithm requires that a conditional independence assumption between views holds true. In cases where both views are informative and conditionally independent, multiview k-means clustering can outperform its single-view analog run on a concatenated version of the two views of data. This is useful for applications where you wish to cluster data from two different modalities, or data with features that naturally fall into two different partitions.
Multiview k-means works by iteratively performing the maximization and expectation steps of traditional EM in one view, and then using the computed hidden variables as the input for the maximization step in the other view. This algorithm, referred to as Co-EM, is described below.
Co-EM Algorithm
Input: Unlabeled data D with 2 views
Initialize \(\Theta_0^{(2)}\), T, \(t = 0\).
E step for view 2: compute expectation for hidden variables given the model parameters \(\Theta_0^{(2)}\)
Loop until stopping criterion is true:
 For v = 1 ... 2:
  \(t = t + 1\)
  M step view \(v\): Find model parameters \(\Theta_t^{(v)}\) that maximize the likelihood for the data given the expected values for hidden variables of view \(\overline{v}\) of iteration \(t - 1\)
  E step view \(v\): compute expectation for hidden variables given the model parameters \(\Theta_t^{(v)}\)
return combined \(\hat{\Theta} = \Theta_{t-1}^{(1)} \cup \Theta_t^{(2)}\)
The final assignment of examples to partitions is performed by assigning each example to the cluster with the largest averaged posterior probability over both views.
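For hard k-means assignments, the Co-EM loop above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: the function names are hypothetical, a fixed number of passes replaces the patience and tolerance stopping rules, and the final step uses the smallest summed squared distance over both views as a hard-assignment stand-in for the largest averaged posterior.

```python
import numpy as np


def sq_dists(X, centroids):
    """Squared Euclidean distance from every sample to every centroid."""
    return ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)


def update_centroids(X, labels, old_centroids):
    """M step: mean of the samples assigned to each cluster."""
    new = old_centroids.copy()
    for c in range(len(old_centroids)):
        members = X[labels == c]
        if len(members) > 0:  # keep the old centroid if a cluster is empty
            new[c] = members.mean(axis=0)
    return new


def coem_kmeans(Xs, init, n_iter=10):
    """Two-view co-EM with hard assignments (illustrative only)."""
    cents = [np.array(c, dtype=float) for c in init]
    # Initial E step in view 2 (index 1).
    labels = sq_dists(Xs[1], cents[1]).argmin(axis=1)
    for _ in range(n_iter):
        for v in (0, 1):
            # M step in view v uses the labels produced by the other view.
            cents[v] = update_centroids(Xs[v], labels, cents[v])
            # E step in view v.
            labels = sq_dists(Xs[v], cents[v]).argmin(axis=1)
    # Final assignment: smallest summed distance over both views.
    total = sq_dists(Xs[0], cents[0]) + sq_dists(Xs[1], cents[1])
    return total.argmin(axis=1), cents


# Toy data: two well-separated groups of 20 samples, seen in two views.
rng = np.random.default_rng(0)
offsets = np.repeat([0.0, 10.0], 20)[:, None]
X1 = offsets + 0.3 * rng.standard_normal((40, 2))
X2 = offsets + 0.3 * rng.standard_normal((40, 3))
labels, centroids = coem_kmeans([X1, X2], init=[X1[[0, 20]], X2[[0, 20]]])
```

The init argument mirrors the list-of-array-likes form of the init parameter: one (n_clusters, n_features_i) array per view.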
References
[3] Bickel S, Scheffer T (2004) Multi-view clustering. Proceedings of the 4th IEEE International Conference on Data Mining, pp. 19–26
[4] Chao, Guoqing, Shiliang Sun, and Jinbo Bi. "A survey on multi-view clustering." arXiv preprint arXiv:1712.06246 (2017).
Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.cluster import MultiviewKMeans
>>> from sklearn.metrics import normalized_mutual_info_score as nmi_score
>>> # Get 5-class data
>>> data, labels = load_UCImultifeature(select_labeled = list(range(5)))
>>> mv_data = data[:2] # first 2 views only
>>> mv_kmeans = MultiviewKMeans(n_clusters=5, random_state=10)
>>> mv_clusters = mv_kmeans.fit_predict(mv_data)
>>> nmi = nmi_score(labels, mv_clusters)
>>> print('{0:.3f}'.format(nmi))
0.770

fit(Xs, y=None)
Fit the cluster centroids to the data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size 2, corresponding to the two views of the data. The two views can each have a different number of features, but they must have the same number of samples.
y : Ignored
Not used; present for API consistency by convention.
Returns: self : returns an instance of self.

predict(Xs)
Predict the cluster labels for the data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size 2, corresponding to the two views of the data. The two views can each have a different number of features, but they must have the same number of samples.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.

fit_predict(Xs, y=None)
A method for fitting then predicting cluster assignments.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
A list of different views to fit the model on.
y : array-like, shape (n_samples,)
Labels for each sample. Only used by supervised algorithms.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.
Multiview Spherical K Means

class mvlearn.cluster.MultiviewSphericalKMeans(n_clusters=2, random_state=None, init='k-means++', patience=5, max_iter=None, n_init=5, tol=0.0001, n_jobs=None)
An implementation of multiview spherical k-means using the co-EM framework described in [3]. This algorithm is most suitable for cases in which the different views of data are conditionally independent. It can also be effective when a dataset naturally contains features of two different data types, such as continuous and categorical features [4]; the original features are then split into two views along that division.
This algorithm currently handles two views of data.
Parameters: n_clusters : int, optional, default=2
The number of clusters
random_state : int, optional, default=None
Determines random number generation for initializing centroids. Can seed the random number generator with an int.
init : {'k-means++', 'random'} or list of array-likes, default='k-means++'
Method of initializing centroids.
'k-means++': selects initial cluster centers for k-means clustering via a method that speeds up convergence.
'random': chooses n_clusters samples from the data as the initial centroids.
If a list of array-likes is passed, the list should have a length equal to the number of views. Each array-like should have the shape (n_clusters, n_features_i) for the i-th view, where n_features_i is the number of features in the i-th view of the input data.
patience : int, optional, default=5
The number of EM iterations with no decrease in the objective function after which the algorithm will terminate.
max_iter : int, optional, default=None
The maximum number of EM iterations to run before termination.
n_init : int, optional, default=5
Number of times the k-means algorithm will run on different centroid seeds. The final result will be the best output of n_init runs with respect to total inertia across all views.
tol : float, default=1e-4
Relative tolerance with regard to inertia to declare convergence.
n_jobs : int, default=None
The number of jobs to use for computation. This works by computing each of the n_init runs in parallel. None means 1. -1 means using all processors.
Attributes
labels_ : array-like, shape (n_samples,)
Cluster labels for each sample in the fitted data.
centroids_ : list of array-likes
 centroids_ length: n_views
 centroids_[i] shape: (n_clusters, n_features_i)
The cluster centroids for each of the two views. centroids_[0] corresponds to the centroids of view 1 and centroids_[1] corresponds to the centroids of view 2.
Notes
Multiview spherical k-means clustering adapts the traditional spherical k-means clustering algorithm to handle two views of data. This algorithm is similar to the multiview k-means algorithm, except that it uses cosine distance instead of Euclidean distance for computing the optimization objective and making assignments. This algorithm requires that a conditional independence assumption between views holds true. In cases where both views are informative and conditionally independent, multiview spherical k-means clustering can outperform its single-view analog run on a concatenated version of the two views of data. This is useful for applications where you wish to cluster data from two different modalities, or data with features that naturally fall into two different partitions.
Multiview spherical k-means works by iteratively performing the maximization and expectation steps of traditional EM in one view, and then using the computed hidden variables as the input for the maximization step in the other view. This algorithm is described in the section for multiview k-means clustering.
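The effect of swapping Euclidean distance for cosine distance can be seen on a single point. The sketch below is only an illustration with hypothetical helper names and made-up centroids; it is not the library's assignment code.

```python
import numpy as np


def cosine_assign(X, centroids):
    """Assign each sample to the centroid with the smallest cosine distance."""
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Cn = centroids / np.linalg.norm(centroids, axis=1, keepdims=True)
    cos_dist = 1.0 - Xn @ Cn.T
    return cos_dist.argmin(axis=1)


def euclidean_assign(X, centroids):
    """Assign each sample to the centroid with the smallest Euclidean distance."""
    sq = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    return sq.argmin(axis=1)


centroids_demo = np.array([[5.0, 0.0], [0.0, 1.0]])
point = np.array([[1.0, 0.5]])

# The point lies close to centroid 1 in space but points along centroid 0.
cosine_label = cosine_assign(point, centroids_demo)[0]        # 0
euclidean_label = euclidean_assign(point, centroids_demo)[0]  # 1
```

Cosine distance ignores vector length, so the spherical variant groups samples by direction, which tends to matter for data such as term-frequency vectors where magnitude carries little meaning.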
Examples
>>> from mvlearn.datasets import load_UCImultifeature
>>> from mvlearn.cluster import MultiviewSphericalKMeans
>>> from sklearn.metrics import normalized_mutual_info_score as nmi_score
>>> # Get 5-class data
>>> data, labels = load_UCImultifeature(select_labeled = list(range(5)))
>>> mv_data = data[:2] # first 2 views only
>>> mv_kmeans = MultiviewSphericalKMeans(n_clusters=5, random_state=5)
>>> mv_clusters = mv_kmeans.fit_predict(mv_data)
>>> # Compute nmi between true class labels and multiview cluster labels
>>> nmi = nmi_score(labels, mv_clusters)
>>> print('{0:.3f}'.format(nmi))
0.823

fit(Xs, y=None)
Fit the cluster centroids to the data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size 2, corresponding to the two views of the data. The two views can each have a different number of features, but they must have the same number of samples.
y : Ignored
Not used; present for API consistency by convention.
Returns: self : returns an instance of self.

fit_predict(Xs, y=None)
A method for fitting then predicting cluster assignments.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
A list of different views to fit the model on.
y : array-like, shape (n_samples,)
Labels for each sample. Only used by supervised algorithms.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.

predict(Xs)
Predict the cluster labels for the data.
Parameters: Xs : list of array-likes or numpy.ndarray
 Xs length: n_views
 Xs[i] shape: (n_samples, n_features_i)
This list must be of size 2, corresponding to the two views of the data. The two views can each have a different number of features, but they must have the same number of samples.
Returns: labels : array-like, shape (n_samples,)
The predicted cluster labels for each sample.
