localbiplot package¶

local biplot¶

class localbiplot.GMDOutput[source]¶: Bases: object

class localbiplot.LocalBiplot(X, labels=None, perplexity=None, red='tsne', sca='minmax', random_seed=123)[source]¶

Bases: object

Object for data analysis using linear and non-linear biplots obtained by singular value decomposition (SVD) and a generalized SVD.

This class implements a set of functions for data analysis, including scaling, dimensionality reduction, kernel calculation, and biplot computation and display.

X¶

Input matrix of shape N x P.

Type:: pd.DataFrame

labels¶

Labels for the samples (default is None).

Type:: array-like, optional

perplexity¶

Perplexity for t-SNE (default is calculated as the square root of N).

Type:: int or None, optional

red¶

Dimensionality reduction method (‘tsne’ by default).

Type:: {‘tsne’, ‘pca’, ‘umap’}, default is ‘tsne’

sca¶

Data scaling method (‘minmax’ by default).

Type:: {‘minmax’}, default is ‘minmax’

random_seed¶

Seed for result reproducibility.

Type:: int, default is 123

data_scaler(X, feature_range=(0, 1))[source]¶: Scale the data using MinMaxScaler if ‘sca’ is set to ‘minmax’.

reduce_dimensions(X)[source]¶: Reduce the dimensionality of the data using t-SNE, PCA, or UMAP.

krbf(X)¶: Calculate the Radial Basis Function (RBF) kernel matrix for the input data.

center_kernel(K)¶: Center a given kernel matrix using the Kernel Centering method.

laplacian_score(X, K, tol=1e-10)¶: Calculate the Laplacian score for a given dataset and kernel matrix.

lnkbp_()¶: Process and analyze the data through steps such as scaling, dimensionality reduction, kernel calculations, and Laplacian Score computation.

localbp_(X_)¶: Perform a local biplot operation on the scaled data (currently commented out).

GMD(X, H, Q, K)¶: Generalized Matrix Decomposition method (power method) for a given dataset and kernel matrices.

biplot_gmd_body(fit, index=None, names=None, sample_col='grey50', sample_pch=19, arrow_col='orange', arrow_cex=1)¶: Generate a GMD-biplot based on generalized matrix decomposition results.

plot_lnkbp_(hue, c, figsize=(25, 10))¶: Plot various visualizations, including scatter plots, kernel matrices, and feature relevance.

affine_transformM(parameters, array_A)[source]¶: Apply an affine transformation to the input array using the given parameters.

registration_errorM(parameters, array_A, array_B)[source]¶: Compute the registration error between two sets of 2D points after applying an affine transformation.

…
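The kernel-related methods listed above (`krbf`, `center_kernel`, `laplacian_score`) can be illustrated with a minimal NumPy sketch. It is not the package's implementation. The function names, the median-heuristic bandwidth `sigma`, the use of the uncentered kernel as the graph affinity, and the `nan` returned for degenerate features are all assumptions made for illustration. The Laplacian score follows the standard formulation, in which lower scores indicate features that better preserve local structure.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma=None):
    """RBF kernel K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X**2, axis=1)
    # Squared Euclidean distances, clipped at zero against round-off.
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    if sigma is None:
        # Assumption: median heuristic for the bandwidth.
        sigma = np.sqrt(np.median(d2[np.triu_indices_from(d2, k=1)]))
    return np.exp(-d2 / (2.0 * sigma**2))

def center_kernel_matrix(K):
    """Double centering: K_c = H K H with H = I - (1/N) 1 1^T."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def laplacian_scores(X, K, tol=1e-10):
    """Laplacian score per feature, using K as the affinity matrix."""
    X = np.asarray(X, dtype=float)
    d = K.sum(axis=1)        # node degrees
    L = np.diag(d) - K       # graph Laplacian
    scores = np.empty(X.shape[1])
    for r in range(X.shape[1]):
        f = X[:, r]
        f_t = f - (f @ d) / d.sum()   # remove the degree-weighted mean
        den = f_t @ (d * f_t)
        # Assumption: a (near-)constant feature has an undefined score.
        scores[r] = (f_t @ L @ f_t) / den if den > tol else np.nan
    return scores

rng = np.random.default_rng(123)
X_demo = rng.normal(size=(20, 4))
K = rbf_kernel_matrix(X_demo)
Kc = center_kernel_matrix(K)
scores = laplacian_scores(X_demo, K)
```
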

LocalBiplot_()[source]¶

Process and analyze the data using a series of steps, including scaling, dimensionality reduction, kernel calculations, and Laplacian score computation.

Returns:¶

LocalBiplot instance: The modified instance with processed and analyzed data.

affine_transformM(parameters, array_A)[source]¶

Apply an affine transformation to the input array using the given parameters.

Parameters:¶

parameters (array-like): Affine transformation parameters.
- parameters[0]: Scaling factor
- parameters[1]: Rotation angle (in radians)
- parameters[2:]: Translation along x and y axes
array_A (array-like): Input array to be transformed.

Returns:¶

array-like: Transformed array after applying the affine transformation.
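As an illustration of the parameter layout described above, the following sketch applies a scale, a rotation, and a translation to an N x 2 array. The function name `affine_transform_sketch`, the row-vector convention, and the order of operations (rotate, scale, then translate) are assumptions. The package may compose the steps differently.

```python
import numpy as np

def affine_transform_sketch(parameters, array_A):
    """Apply scale, rotation (radians), and x/y translation to N x 2 points."""
    s, theta, tx, ty = parameters
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    A = np.asarray(array_A, dtype=float)
    # Points are rows, so rotating each point by R is a right-multiplication by R^T.
    return s * (A @ R.T) + np.array([tx, ty])

pts = np.array([[1.0, 0.0], [0.0, 1.0]])
# Scale by 2, rotate 90 degrees counter-clockwise, shift by (1, -1).
out = affine_transform_sketch([2.0, np.pi / 2, 1.0, -1.0], pts)
# out is approximately [[1, 1], [-1, -1]]
```
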

clustering(Z, eps_=None, per_=5)[source]¶

Perform clustering on the given 2D data using DBSCAN algorithm.

Parameters:¶

Z (array-like): N x 2 list | np.ndarray representing the data points.
eps_ (float, optional): The maximum distance between two samples for one to be considered as in the neighborhood of the other. Defaults to None.
per_ (float, optional): The percentile value used to set the eps parameter if it is not provided. Defaults to 5.

Returns:¶

list | np.ndarray : An array of cluster labels assigned by the DBSCAN algorithm.

Notes:¶

If eps_ is not provided, it is calculated as a percentile of the pairwise Euclidean distances between points in the input data Z.

DBSCAN (Density-Based Spatial Clustering of Applications with Noise) is a clustering algorithm that groups together data points that are close to each other and marks outliers as noise.
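The percentile rule for `eps_` described in the notes can be sketched with SciPy and scikit-learn as follows. The helper name `cluster_2d` is hypothetical, and `min_samples` is left at scikit-learn's default because the source does not say which value the package uses.

```python
import numpy as np
from scipy.spatial.distance import pdist
from sklearn.cluster import DBSCAN

def cluster_2d(Z, eps=None, per=5):
    """Cluster N x 2 points with DBSCAN; derive eps from a distance percentile."""
    Z = np.asarray(Z, dtype=float)
    if eps is None:
        # Percentile of all pairwise Euclidean distances.
        eps = np.percentile(pdist(Z, metric="euclidean"), per)
    # min_samples keeps the scikit-learn default (assumption).
    return DBSCAN(eps=eps).fit_predict(Z)

rng = np.random.default_rng(123)
blob_a = rng.normal(loc=(0.0, 0.0), scale=0.1, size=(50, 2))
blob_b = rng.normal(loc=(5.0, 5.0), scale=0.1, size=(50, 2))
Z_demo = np.vstack([blob_a, blob_b])
labels = cluster_2d(Z_demo)   # label -1 marks points treated as noise
```

A small percentile gives a tight neighborhood, so points on the sparse edge of a group may be labeled as noise. Raising `per` merges more points into clusters.
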

compute_variance_ratio(Sc)[source]¶

Compute eigenvalues, total variance, and explained variance ratio by principal component.

Parameters:¶

Sc: Array of singular values from SVD.

Returns:¶

explained_variance_ratio: Array of explained variance ratios.
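A minimal sketch of the computation, assuming the eigenvalues are taken as the squared singular values. Any constant normalization, such as dividing by N - 1, cancels in the ratio. The function name is hypothetical.

```python
import numpy as np

def explained_variance_ratio_sketch(Sc):
    """Explained variance ratio per component from singular values."""
    Sc = np.asarray(Sc, dtype=float)
    eigenvalues = Sc**2                 # proportional to component variances
    total_variance = eigenvalues.sum()
    return eigenvalues / total_variance

ratio = explained_variance_ratio_sketch([3.0, 2.0, 1.0])
# ratio is [9/14, 4/14, 1/14]
```
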

data_scaler(X, feature_range=(0, 1))[source]¶

Scale the input data using MinMaxScaler if ‘sca’ is set to ‘minmax’.

Parameters:¶

X (array-like): Input data to be scaled.
feature_range (tuple, optional): Tuple specifying the minimum and maximum values of the feature range. Defaults to (0, 1).

Returns:¶

An N x P scaled data matrix.
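For reference, min-max scaling with scikit-learn's `MinMaxScaler` looks like the sketch below. It calls the scaler directly rather than through the class, and the sample matrix is made up for illustration.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_raw = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 40.0]])

# Each column is mapped independently onto the requested range.
scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X_raw)
# Column 0 becomes [0, 0.5, 1]; column 1 becomes [0, 1/3, 1].
```
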

get_localbp_(tar_, Ck, databp)[source]¶

optimize_affine_transform(Zc, B, Sc, ind_)[source]¶

Optimize the parameters for the affine transformation.

Parameters:¶

Zc (array-like): Cluster data points (N x 2 array).
B (array-like): Matrix of vectors (2 x P) representing the original basis.
Sc (array-like): Singular values of the original basis.
ind_ (array-like): Boolean array indicating the indices of the cluster.

Returns:¶

Tuple: A tuple containing the optimized parameters and the transformed cluster points.

Notes:¶

This function performs optimization to find the best affine transformation parameters using the Nelder-Mead method. It then applies the optimized transformation to the cluster points.
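The optimization step can be sketched with `scipy.optimize.minimize` and the Nelder-Mead method. This is a generic illustration, not the package's code: the helper names, the synthetic data, the starting point, and the solver options are assumptions, and the sketch does not show how `B`, `Sc`, or `ind_` are used to build the target points.

```python
import numpy as np
from scipy.optimize import minimize

def transform(params, A):
    """Scale, rotate (radians), and translate N x 2 points (assumed layout)."""
    s, theta, tx, ty = params
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return s * (A @ R.T) + np.array([tx, ty])

def registration_error(params, A, B):
    """Frobenius norm between transformed source A and target B."""
    return np.linalg.norm(transform(params, A) - B)

rng = np.random.default_rng(123)
source = rng.normal(size=(30, 2))
target = transform(np.array([1.5, 0.3, 0.5, -0.2]), source)  # synthetic target

x0 = np.array([1.0, 0.0, 0.0, 0.0])  # identity transform as starting point
result = minimize(registration_error, x0, args=(source, target),
                  method="Nelder-Mead",
                  options={"maxiter": 2000, "xatol": 1e-8, "fatol": 1e-8})
aligned = transform(result.x, source)  # transformed cluster points
```

Nelder-Mead is derivative-free, which suits this objective because the Frobenius norm is not differentiable where the error reaches zero.
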

pca_by_SVD(X)[source]¶

Perform PCA by singular value decomposition (SVD).

Parameters:¶

X (list | np.ndarray): Input data of shape N x P.

Returns:¶

U, S, VT, S_, A, B

Details:¶

Singular Value Decomposition

$\mathbf{X} = \mathbf{U}\mathbf{S}\mathbf{V}^\top = \mathbf{U}\mathbf{S}^{0.5}\mathbf{S}^{0.5}\mathbf{V}^\top = \mathbf{A}\mathbf{B}^\top$

$\mathbf{X} \in \mathbb{R}^{N \times P}$

$\mathbf{U} \in \mathbb{R}^{N \times M}$

$\mathbf{V} \in \mathbb{R}^{P \times M}$

$\mathbf{S} \in \mathbb{R}^{M \times M}$

$\mathbf{A} = \mathbf{U}\mathbf{S}^{0.5} \in \mathbb{R}^{N \times M}$

$\mathbf{B} = \mathbf{V}\mathbf{S}^{0.5} \in \mathbb{R}^{P \times M}$

$M = \min(N, P)$
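The factorization above can be checked numerically with NumPy. In this sketch the column centering of the input is an assumption (usual for PCA, but not stated in the source), and the variable names simply mirror the symbols in the formulas.

```python
import numpy as np

rng = np.random.default_rng(123)
X_demo = rng.normal(size=(6, 4))          # N = 6, P = 4, so M = 4
Xc = X_demo - X_demo.mean(axis=0)         # assumption: center the columns

# Thin SVD: U is N x M, s holds the M singular values, VT is M x P.
U, s, VT = np.linalg.svd(Xc, full_matrices=False)

S_half = np.diag(np.sqrt(s))              # S^{0.5}
A = U @ S_half                            # sample coordinates, N x M
B = VT.T @ S_half                         # variable coordinates, P x M

reconstruction = A @ B.T                  # equals Xc up to round-off
```

Splitting the singular values evenly between `A` and `B` keeps samples and variables on comparable scales in the biplot.
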

plot_transformed_clusters(ax, ZcA, VA, cmap, arrow_size=0.05)[source]¶

Plot the non-linear local-Biplot SVD.

Parameters:¶

ax (matplotlib.axes._subplots.AxesSubplot): Axes on which to plot.
ZcA (numpy.ndarray): Transformed points of the cluster.
VA (numpy.ndarray): Transformed vector arrows of the cluster.
cmap: Color map for the scatter plot.
arrow_size (float, optional): Arrow size. Defaults to 0.05.

Returns:¶

None

reduce_dimensions(X)[source]¶

Reduce the dimensionality of the input data using t-SNE, PCA, or UMAP.

Parameters:¶

X (array-like): Input data matrix of shape N x P.

Returns:¶

array-like: Dimensionality-reduced data of shape N x 2.
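A sketch of the dispatch between reduction methods using scikit-learn. The function name `reduce_to_2d` is hypothetical. The default perplexity of sqrt(N) follows the attribute description of the class, and UMAP is omitted here because it requires the separate `umap-learn` package.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE

def reduce_to_2d(X, red="tsne", perplexity=None, random_seed=123):
    """Project N x P data to N x 2 with PCA or t-SNE."""
    X = np.asarray(X, dtype=float)
    if red == "pca":
        return PCA(n_components=2, random_state=random_seed).fit_transform(X)
    if red == "tsne":
        if perplexity is None:
            perplexity = np.sqrt(X.shape[0])   # default: square root of N
        tsne = TSNE(n_components=2, perplexity=perplexity,
                    random_state=random_seed)
        return tsne.fit_transform(X)
    raise ValueError(f"unsupported reduction method in this sketch: {red!r}")

rng = np.random.default_rng(123)
X_demo = rng.normal(size=(40, 5))
Z_pca = reduce_to_2d(X_demo, red="pca")
Z_tsne = reduce_to_2d(X_demo, red="tsne")
```
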

registration_errorM(parameters, array_A, array_B)[source]¶

Compute the registration error between two sets of 2D points after applying an affine transformation.

Parameters:¶

parameters (array-like): Affine transformation parameters.
array_A (array-like): Source set of 2D points (N x 2 array).
array_B (array-like): Target set of 2D points (N x 2 array).

Returns:¶

float: Registration error, calculated as the Frobenius norm of the difference
between the transformed source points and the target points.
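A small worked example of the error measure, assuming the parameter layout (scale, rotation, x and y translation) documented for `affine_transformM`. The function name and the sample points are illustrative only.

```python
import numpy as np

def registration_error_sketch(parameters, array_A, array_B):
    """Frobenius norm of (transformed A - B) for N x 2 point sets."""
    s, theta, tx, ty = parameters
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    A = np.asarray(array_A, dtype=float)
    B = np.asarray(array_B, dtype=float)
    transformed = s * (A @ R.T) + np.array([tx, ty])
    return np.linalg.norm(transformed - B)   # Frobenius norm for 2-D arrays

origin = np.array([[0.0, 0.0]])
# A pure translation by (3, 4) moves the origin 5 units from the target.
err = registration_error_sketch([1.0, 0.0, 3.0, 4.0], origin, origin)
```
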
