Metrics

The evaluation system for manyLatents: a three-level architecture for measuring embedding quality, dataset properties, and algorithm internals.

Embedding Metrics

Evaluate the quality of low-dimensional embeddings. Compare high-dimensional input to low-dimensional output.

| metric | config | defaults | description |
| --- | --- | --- | --- |
| AlignmentScore | metrics/embedding=alignment_score | k=20, method=jaccard | Compute composite per-variant alignment score. |
| Anisotropy | metrics/embedding=anisotropy | -- | Anisotropy of embedding space |
| AUC | metrics/embedding=auc | -- | Area Under ROC Curve for binary classification |
| CKA | metrics/embedding=cka | kernel=linear | Centered Kernel Alignment with linear kernel |
| Continuity | metrics/embedding=continuity | return_per_sample=True | Continuity of embedding (preservation of original neighborhoods) |
| CrossModalJaccard | metrics/embedding=cross_modal_jaccard | k=20, metric=euclidean | Cross-modal k-NN neighborhood Jaccard overlap |
| DiffusionCondensation | metrics/embedding=diffusion_condensation | scale=1.025, granularity=0.1, knn=5, decay=40, n_pca=50, output_mode=stable | Diffusion condensation score |
| DiffusionCurvature | metrics/embedding=diffusion_curvature | t=3, percentile=5 | Diffusion curvature of embedding manifold |
| DiffusionSpectralEntropy | metrics/embedding=diffusion_spectral_entropy | t=1, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | metrics/embedding=dse_dense | output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=dense, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | metrics/embedding=dse_knn | output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=knn, k=${neighborhood_size}, alpha=1.0 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | metrics/embedding=dse_t_sweep | output_mode=eigenvalue_count_sweep, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| FractalDimension | metrics/embedding=fractal_dimension | n_box_sizes=10 | Correlation fractal dimension of embedding |
| KNNPreservation | metrics/embedding=knn_preservation | n_neighbors=10, metric=euclidean | k-nearest neighbor preservation between original and embedded spaces |
| LocalIntrinsicDimensionality | metrics/embedding=local_intrinsic_dimensionality | k=20 | Mean local intrinsic dimensionality of the embedding |
| MagnitudeDimension | metrics/embedding=magnitude_dimension | n_ts=50, log_scale=False, scale_finding=convergence, target_prop=0.95, metric=euclidean, p=2, n_neighbors=12, method=cholesky, one_point_property=True, perturb_singularities=True, positive_magnitude=False, exact=False | Magnitude-based effective dimensionality |
| NoOp | metrics/embedding=noop | k=25 | -- |
| OutlierScore | metrics/embedding=outlier_score | k=20, return_scores=False | Outlier scores using Local Outlier Factor |
| ParticipationRatio | metrics/embedding=participation_ratio | n_neighbors=25, return_per_sample=True | Local participation ratio measuring effective dimensionality |
| PearsonCorrelation | metrics/embedding=pearson_correlation | return_per_sample=False, num_dists=100 | Pearson correlation between pairwise distances |
| PersistentHomology | metrics/embedding=persistent_homology | homology_dim=1, persistence_threshold=0.1, max_N=2000, random_seed=0 | Count of loops/cycles (H1 Betti number) |
| PersistentHomology | metrics/embedding=persistent_homology_beta0 | homology_dim=0, persistence_threshold=3.0 | Count of connected components (H0 Betti number) |
| RankAgreement | metrics/embedding=rank_agreement | k=20, metric_fn=lid | Rank-based agreement of LID/PR across modalities |
| SilhouetteScore | metrics/embedding=silhouette | metric=euclidean | Silhouette score for cluster separation in embedding |
| TangentSpaceApproximation | metrics/embedding=tangent_space | n_neighbors=25, variance_threshold=0.95, return_per_sample=True | Tangent space alignment between original and embedded spaces |
| Trustworthiness | metrics/embedding=trustworthiness | n_neighbors=5, metric=euclidean | Compute the trustworthiness of an embedding. |
| Trustworthiness | metrics/embedding=trustworthiness_k | n_neighbors=[15, 25, 50, 100, 250], metric=euclidean | Compute the trustworthiness of an embedding. |

Config pattern: metrics/embedding=<name>

Module Metrics

Evaluate algorithm-specific internal components. Require a fitted module exposing affinity_matrix() or kernel_matrix().

| metric | config | defaults | description |
| --- | --- | --- | --- |
| AffinitySpectrum | metrics/module=affinity_spectrum | -- | Top-k eigenvalues of the affinity matrix |
| ConnectedComponents | metrics/module=connected_components | -- | Number of connected components in the kNN graph |
| DatasetTopologyDescriptor | metrics/module=dataset_topology_descriptor | -- | Topological descriptor of the dataset structure |
| DiffusionMapCorrelation | metrics/module=diffusion_map_correlation | dm_components=2, alpha=1.0, correlation_type=pearson | Correlation between diffusion map and embedding distances |
| KernelMatrixDensity | metrics/module=kernel_matrix_density | threshold=1e-10 | Density of the kernel/affinity matrix |
| KernelMatrixSparsity | metrics/module=kernel_matrix_sparsity | threshold=1e-10 | Sparsity of the kernel/affinity matrix |
| NoOp | metrics/module=noop | k=25 | -- |
| SpectralDecayRate | metrics/module=spectral_decay_rate | top_k=20 | Fit exponential decay to the eigenvalue spectrum. |
| SpectralGapRatio | metrics/module=spectral_gap_ratio | -- | Ratio of first to second eigenvalue of the diffusion operator |

Config pattern: metrics/module=<name>

Dataset Metrics

Evaluate properties of the original high-dimensional data, independent of the DR algorithm.

| metric | config | defaults | description |
| --- | --- | --- | --- |
| GroundTruthPreservation | metrics/dataset=admixture_laplacian | -- | Admixture Laplacian preservation score |
| GeodesicDistanceCorrelation | metrics/dataset=geodesic_distance_correlation | correlation_type=spearman | Correlation between geodesic and embedded distances |
| NoOp | metrics/dataset=noop | k=25 | -- |
| kmeans_stratification | metrics/dataset=stratification | random_state=${seed} | K-means stratification score for population structure |

Config pattern: metrics/dataset=<name>


Metric Protocol

All metrics must match the Metric protocol (manylatents/metrics/metric.py):

def __call__(
    self,
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    cache=None,
) -> float | tuple[float, np.ndarray] | dict[str, Any]

Return Types

| Type | Use Case | Example |
| --- | --- | --- |
| float | Simple scalar | Trustworthiness: 0.95 |
| tuple[float, ndarray] | Scalar + per-sample | Continuity with return_per_sample=True |
| dict[str, Any] | Structured output | Persistent homology: {'beta_0': ..., 'beta_1': ...} |
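
Code that consumes metric results has to cope with all three return shapes. The sketch below shows one way to flatten them into a single dictionary. `normalize_result` and its key layout (`<name>.per_sample`, `<name>.<key>`) are hypothetical and not part of manyLatents; the library may store results differently.

```python
from typing import Any

import numpy as np


def normalize_result(name: str, result: Any) -> dict[str, Any]:
    """Flatten any of the three documented return shapes into one dict.

    Hypothetical helper for illustration; not part of manyLatents.
    """
    if isinstance(result, dict):
        # Structured output: prefix each key with the metric name.
        return {f"{name}.{key}": value for key, value in result.items()}
    if isinstance(result, tuple):
        # Scalar plus per-sample array.
        scalar, per_sample = result
        return {name: float(scalar), f"{name}.per_sample": np.asarray(per_sample)}
    # Plain scalar.
    return {name: float(result)}
```

Checking for `dict` and `tuple` before falling back to the scalar case keeps the helper tolerant of NumPy scalar types, which `float()` converts cleanly.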

Configuration

Metrics use Hydra's _partial_: True for deferred parameter binding:

# configs/metrics/embedding/trustworthiness.yaml
_target_: manylatents.metrics.trustworthiness.Trustworthiness
_partial_: true
n_neighbors: 5
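
When a config sets `_partial_: true`, Hydra's `instantiate` returns a `functools.partial` with the configured keyword arguments already bound; the data arguments are supplied later. The sketch below reproduces that behavior with `functools.partial` directly so it runs without Hydra. `toy_metric` is a hypothetical stand-in, not the real Trustworthiness implementation.

```python
from functools import partial

import numpy as np


def toy_metric(embeddings: np.ndarray, dataset=None, module=None,
               cache=None, n_neighbors: int = 5) -> float:
    """Hypothetical stand-in metric: fraction of samples covered by n_neighbors."""
    return min(n_neighbors, len(embeddings)) / len(embeddings)


# Roughly what instantiating a config with `_partial_: true` and
# `n_neighbors: 5` yields: parameters bound now, data supplied later.
metric_fn = partial(toy_metric, n_neighbors=5)

embeddings = np.zeros((10, 2))
score = metric_fn(embeddings=embeddings)  # deferred call with the actual data
```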

Multi-Scale Expansion

List-valued parameters expand via Cartesian product through flatten_and_unroll_metrics():

n_neighbors: [5, 10, 20]  # Produces 3 separate evaluations

Naming convention: embedding.trustworthiness__n_neighbors_5, embedding.trustworthiness__n_neighbors_10, etc.
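
The expansion can be pictured as a Cartesian product over every list-valued parameter. The following is a minimal sketch, not the actual `flatten_and_unroll_metrics()`: `expand_metric` is a hypothetical name, and the suffix scheme for more than one swept parameter (one `__<param>_<value>` segment each) is an assumption extrapolated from the single-parameter names above.

```python
from itertools import product
from typing import Any


def expand_metric(name: str, params: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Expand list-valued parameters into one config per combination.

    Hypothetical illustration; not the actual flatten_and_unroll_metrics().
    """
    swept = {k: v for k, v in params.items() if isinstance(v, list)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, list)}
    if not swept:
        return {name: dict(fixed)}

    expanded = {}
    for combo in product(*swept.values()):
        bound = dict(zip(swept.keys(), combo))
        # Assumed suffix scheme: one `__<param>_<value>` segment per swept parameter.
        suffix = "".join(f"__{k}_{v}" for k, v in bound.items())
        expanded[f"{name}{suffix}"] = {**fixed, **bound}
    return expanded


configs = expand_metric(
    "embedding.trustworthiness",
    {"n_neighbors": [5, 10, 20], "metric": "euclidean"},
)
```

Two list-valued parameters with three and two values respectively would yield six entries under this scheme.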

Shared kNN Cache

Metrics that need kNN graphs share a cache computed once with max(k) across all metrics, avoiding redundant computation.
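
The idea is that neighbors sorted by distance for the largest k already contain the answer for every smaller k. A minimal NumPy sketch of that pattern follows; `build_knn_cache`, `neighbors`, and the `indices`/`distances` layout are hypothetical, and the real cache in manyLatents may be structured differently.

```python
import numpy as np


def build_knn_cache(X: np.ndarray, ks: list[int]) -> dict[str, np.ndarray]:
    """Compute neighbors once at max(ks) using brute-force distances.

    Hypothetical sketch; the real cache layout in manyLatents may differ.
    """
    k_max = max(ks)
    # Pairwise Euclidean distances; fine for small n, O(n^2) memory.
    dists = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude each point from its own neighbors
    order = np.argsort(dists, axis=1)[:, :k_max]
    return {
        "indices": order,
        "distances": np.take_along_axis(dists, order, axis=1),
    }


def neighbors(cache: dict[str, np.ndarray], k: int) -> np.ndarray:
    """Slice the first k neighbors out of the shared cache."""
    return cache["indices"][:, :k]


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
cache = build_knn_cache(X, ks=[5, 10, 20])
idx_5 = neighbors(cache, 5)    # shape (50, 5)
idx_20 = neighbors(cache, 20)  # shape (50, 20)
```

Each metric then slices the columns it needs instead of rebuilding the neighbor graph.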

Writing a New Metric

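import numpy as np

def YourMetric(
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    k: int = 10,
    cache=None,
) -> float:
    """Template metric: replace the placeholder with your computation."""
    # Placeholder so the template runs as written; compute your real score here.
    score = 0.0
    return score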
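
As a concrete instance of the template, the sketch below computes the mean distance from each point to its k nearest neighbors and optionally returns the per-sample values, matching the `tuple[float, ndarray]` return type. `MeanNeighborDistance` and its `return_per_sample` flag are hypothetical and do not exist in manyLatents; for simplicity the example ignores `cache` and recomputes distances by brute force.

```python
import numpy as np


def MeanNeighborDistance(
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    k: int = 10,
    cache=None,
    return_per_sample: bool = False,
):
    """Mean distance from each point to its k nearest neighbors (hypothetical metric)."""
    n = len(embeddings)
    k = min(k, n - 1)  # cannot have more neighbors than other points
    dists = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # ignore self-distances
    nearest = np.sort(dists, axis=1)[:, :k]
    per_sample = nearest.mean(axis=1)
    score = float(per_sample.mean())
    return (score, per_sample) if return_per_sample else score
```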

Choosing the Right Level

  • Only needs original data? → metrics/dataset/
  • Compares original vs. reduced? → metrics/embedding/
  • Needs algorithm internals? → metrics/module/

Config

# configs/metrics/embedding/your_metric.yaml
_target_: manylatents.metrics.your_metric.YourMetric
_partial_: true
k: 10

Testing

Use metrics=noop to verify integration:

uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop

Null Metrics Support

manyLatents supports running experiments without computing metrics, which is useful for fast debugging, exploratory analysis, or workflows where metrics are computed separately.

Usage

CLI (Default)

Metrics are null by default. Just don't specify them:

# No metrics (default)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca

# With metrics (explicit opt-in)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop

Experiment Configs

# configs/experiment/my_experiment.yaml
# @package _global_
defaults:
  - override /algorithms/latent: pca
  - override /data: swissroll
  - override /callbacks/embedding: default
  # No metrics override - stays null

Python API

from manylatents.api import run

result = run(
    data="swissroll",
    algorithms={'latent': 'pca'},
    metrics=None  # Explicitly disable
)

Expected Behavior

When metrics=null:

  • Generates embeddings
  • Saves embeddings to files
  • Creates plots (if callbacks configured)
  • Logs to wandb (if configured)
  • Does NOT compute evaluation metrics
  • Shows warning: "No scores found"

Design: Opt-In by Default

The base config (configs/config.yaml) sets metrics to null. Experiment configs opt in:

# configs/experiment/single_algorithm.yaml
defaults:
  - override /metrics: noop  # Opt in for this experiment

Hydra Limitation

The Hydra CLI does not support null as an override value. Passing metrics=null on the command line fails: Hydra's parser converts "null" to Python None, which its override validator rejects.

Workarounds:

  • Use experiment configs without metrics specified
  • Use the Python API with metrics=None (our code handles this)
  • Set metrics: null in a config file (the base config already does this)

The API intercepts None values before Hydra sees them and sets them after config composition via OmegaConf.update().
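
The interception pattern can be sketched as follows: None-valued arguments are held back from the override list and applied once the config has been composed. `split_null_overrides` is a hypothetical helper, and a plain dict stands in for the composed config so the example runs without Hydra; per the description above, the real code applies the deferred values with `OmegaConf.update()`.

```python
from typing import Any


def split_null_overrides(overrides: dict[str, Any]) -> tuple[list[str], list[str]]:
    """Separate None-valued overrides from those safe to pass to Hydra.

    Hypothetical helper; illustrates the pattern only.
    """
    hydra_overrides = [f"{k}={v}" for k, v in overrides.items() if v is not None]
    null_keys = [k for k, v in overrides.items() if v is None]
    return hydra_overrides, null_keys


hydra_overrides, null_keys = split_null_overrides(
    {"data": "swissroll", "algorithms/latent": "pca", "metrics": None}
)

# Stand-in for the composed config; the real code would compose with Hydra
# and then call OmegaConf.update(cfg, key, None) for each deferred key.
cfg = {"data": "swissroll", "metrics": "noop"}
for key in null_keys:
    cfg[key] = None
```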

Troubleshooting

"Could not find 'metrics/none'"

You passed metrics=none as a CLI override. Hydra treats the value as a config-group choice and looks for metrics/none.yaml, which does not exist.

Fix: Use an experiment config, or the API with metrics=None.

Metrics Still Being Computed

Check that:

  1. Your experiment config doesn't have - override /metrics: ... in defaults
  2. You're not passing metrics=... on the command line
  3. The final config shows metrics: null