Metrics

The evaluation system for manyLatents: a three-level architecture for measuring embedding quality, dataset properties, and algorithm internals.

Embedding Metrics

Evaluate the quality of low-dimensional embeddings. Compare high-dimensional input to low-dimensional output.

metric	config	defaults	description
AlignmentScore	`metrics/embedding=alignment_score`	k=20, method=jaccard	Compute composite per-variant alignment score.
Anisotropy	`metrics/embedding=anisotropy`	--	Anisotropy of embedding space
AUC	`metrics/embedding=auc`	--	Area Under ROC Curve for binary classification
CKA	`metrics/embedding=cka`	kernel=linear	Centered Kernel Alignment with linear kernel
Continuity	`metrics/embedding=continuity`	return_per_sample=True	Continuity of embedding (preservation of original neighborhoods)
CrossModalJaccard	`metrics/embedding=cross_modal_jaccard`	k=20, metric=euclidean	Cross-modal k-NN neighborhood Jaccard overlap
DiffusionCondensation	`metrics/embedding=diffusion_condensation`	scale=1.025, granularity=0.1, knn=5, decay=40, n_pca=50, output_mode=stable	Diffusion condensation score
DiffusionCurvature	`metrics/embedding=diffusion_curvature`	t=3, percentile=5	Diffusion curvature of embedding manifold
DiffusionSpectralEntropy	`metrics/embedding=diffusion_spectral_entropy`	t=1, gaussian_kernel_sigma=10	Diffusion spectral entropy (eigenvalue count at diffusion time t)
DiffusionSpectralEntropy	`metrics/embedding=dse_dense`	output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=dense, gaussian_kernel_sigma=10	Diffusion spectral entropy (eigenvalue count at diffusion time t)
DiffusionSpectralEntropy	`metrics/embedding=dse_knn`	output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=knn, k=${neighborhood_size}, alpha=1.0	Diffusion spectral entropy (eigenvalue count at diffusion time t)
DiffusionSpectralEntropy	`metrics/embedding=dse_t_sweep`	output_mode=eigenvalue_count_sweep, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, gaussian_kernel_sigma=10	Diffusion spectral entropy (eigenvalue count at diffusion time t)
FractalDimension	`metrics/embedding=fractal_dimension`	n_box_sizes=10	Correlation fractal dimension of embedding
KNNPreservation	`metrics/embedding=knn_preservation`	n_neighbors=10, metric=euclidean	k-nearest neighbor preservation between original and embedded spaces
LocalIntrinsicDimensionality	`metrics/embedding=local_intrinsic_dimensionality`	k=20	Mean local intrinsic dimensionality of the embedding
MagnitudeDimension	`metrics/embedding=magnitude_dimension`	n_ts=50, log_scale=False, scale_finding=convergence, target_prop=0.95, metric=euclidean, p=2, n_neighbors=12, method=cholesky, one_point_property=True, perturb_singularities=True, positive_magnitude=False, exact=False	Magnitude-based effective dimensionality
NoOp	`metrics/embedding=noop`	k=25	--
OutlierScore	`metrics/embedding=outlier_score`	k=20, return_scores=False	Outlier scores using Local Outlier Factor
ParticipationRatio	`metrics/embedding=participation_ratio`	n_neighbors=25, return_per_sample=True	Local participation ratio measuring effective dimensionality
PearsonCorrelation	`metrics/embedding=pearson_correlation`	return_per_sample=False, num_dists=100	Pearson correlation between pairwise distances
PersistentHomology	`metrics/embedding=persistent_homology`	homology_dim=1, persistence_threshold=0.1, max_N=2000, random_seed=0	Count of loops/cycles (H1 Betti number)
PersistentHomology	`metrics/embedding=persistent_homology_beta0`	homology_dim=0, persistence_threshold=3.0	Count of connected components (H0 Betti number)
RankAgreement	`metrics/embedding=rank_agreement`	k=20, metric_fn=lid	Rank-based agreement of LID/PR across modalities
SilhouetteScore	`metrics/embedding=silhouette`	metric=euclidean	Silhouette score for cluster separation in embedding
TangentSpaceApproximation	`metrics/embedding=tangent_space`	n_neighbors=25, variance_threshold=0.95, return_per_sample=True	Tangent space alignment between original and embedded spaces
Trustworthiness	`metrics/embedding=trustworthiness`	n_neighbors=5, metric=euclidean	Compute the trustworthiness of an embedding.
Trustworthiness	`metrics/embedding=trustworthiness_k`	n_neighbors=[15, 25, 50, 100, 250], metric=euclidean	Compute the trustworthiness of an embedding.

Config pattern: metrics/embedding=<name>
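
To make the "compare high-dimensional input to low-dimensional output" idea concrete, here is a minimal sketch of a neighborhood-overlap score in the spirit of KNNPreservation. It is not the library's implementation: the function names are hypothetical, the neighbor search is brute force, and the score is assumed to be the mean fraction of shared neighbors.

```python
import numpy as np


def knn_indices(x: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k nearest neighbors of each row (self excluded), brute force."""
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # never count a point as its own neighbor
    return np.argsort(d, axis=1)[:, :k]


def knn_preservation_sketch(original: np.ndarray, embedded: np.ndarray, k: int = 10) -> float:
    """Mean fraction of each point's k neighbors shared between the two spaces."""
    nn_orig = knn_indices(original, k)
    nn_emb = knn_indices(embedded, k)
    overlap = [len(set(a) & set(b)) / k for a, b in zip(nn_orig, nn_emb)]
    return float(np.mean(overlap))


rng = np.random.default_rng(0)
x = rng.normal(size=(50, 5))

# An exact copy preserves every neighborhood
same = knn_preservation_sketch(x, x.copy(), k=10)
# An unrelated 2-D point cloud preserves only chance-level overlap
unrelated = knn_preservation_sketch(x, rng.normal(size=(50, 2)), k=10)
```

A score of 1.0 means every point keeps the same k neighbors after reduction; values near k/n indicate chance-level agreement.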

Module Metrics

Evaluate algorithm-specific internal components. Require a fitted module exposing affinity_matrix() or kernel_matrix().

metric	config	defaults	description
AffinitySpectrum	`metrics/module=affinity_spectrum`	--	Top-k eigenvalues of the affinity matrix
ConnectedComponents	`metrics/module=connected_components`	--	Number of connected components in the kNN graph
DatasetTopologyDescriptor	`metrics/module=dataset_topology_descriptor`	--	Topological descriptor of the dataset structure
DiffusionMapCorrelation	`metrics/module=diffusion_map_correlation`	dm_components=2, alpha=1.0, correlation_type=pearson	Correlation between diffusion map and embedding distances
KernelMatrixDensity	`metrics/module=kernel_matrix_density`	threshold=1e-10	Density of the kernel/affinity matrix
KernelMatrixSparsity	`metrics/module=kernel_matrix_sparsity`	threshold=1e-10	Sparsity of the kernel/affinity matrix
NoOp	`metrics/module=noop`	k=25	--
SpectralDecayRate	`metrics/module=spectral_decay_rate`	top_k=20	Fit exponential decay to the eigenvalue spectrum.
SpectralGapRatio	`metrics/module=spectral_gap_ratio`	--	Ratio of first to second eigenvalue of the diffusion operator

Config pattern: metrics/module=<name>
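
A module-level metric reads from the fitted module rather than from the embedding alone. The sketch below shows the shape of such a metric under two assumptions: density is taken to be the fraction of matrix entries whose magnitude exceeds the threshold, and ToyModule is a hypothetical stand-in for a fitted algorithm. Neither is taken from the manyLatents source.

```python
import numpy as np


class ToyModule:
    """Hypothetical stand-in for a fitted module; only affinity_matrix() matters here."""

    def __init__(self, affinity: np.ndarray):
        self._affinity = affinity

    def affinity_matrix(self) -> np.ndarray:
        return self._affinity


def kernel_density_sketch(embeddings, dataset=None, module=None, threshold=1e-10, cache=None) -> float:
    """Fraction of affinity entries with magnitude above threshold (assumed definition)."""
    if module is None or not hasattr(module, "affinity_matrix"):
        raise ValueError("module metrics need a fitted module exposing affinity_matrix()")
    a = np.asarray(module.affinity_matrix())
    return float(np.mean(np.abs(a) > threshold))


affinity = np.array(
    [
        [1.0, 0.5, 0.0],
        [0.5, 1.0, 0.0],
        [0.0, 0.0, 1.0],
    ]
)
density = kernel_density_sketch(np.zeros((3, 2)), module=ToyModule(affinity))
sparsity = 1.0 - density  # the complementary quantity under the same assumption
```

Failing early when the module is missing mirrors the requirement stated above: these metrics cannot run on embeddings alone.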

Dataset Metrics

Evaluate properties of the original high-dimensional data, independent of the DR algorithm.

metric	config	defaults	description
GroundTruthPreservation	`metrics/dataset=admixture_laplacian`	--	Admixture Laplacian preservation score
GeodesicDistanceCorrelation	`metrics/dataset=geodesic_distance_correlation`	correlation_type=spearman	Correlation between geodesic and embedded distances
NoOp	`metrics/dataset=noop`	k=25	--
kmeans_stratification	`metrics/dataset=stratification`	random_state=${seed}	K-means stratification score for population structure

Config pattern: metrics/dataset=<name>


Metric Protocol

All metrics must match the Metric protocol (manylatents/metrics/metric.py):

def __call__(
    self,
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    cache=None,
) -> float | tuple[float, np.ndarray] | dict[str, Any]

Return Types

Type	Use Case	Example
`float`	Simple scalar	Trustworthiness: `0.95`
`tuple[float, ndarray]`	Scalar + per-sample	Continuity with `return_per_sample=True`
`dict[str, Any]`	Structured output	Persistent homology: `{'beta_0': ..., 'beta_1': ...}`
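
Code that consumes metric results has to handle all three return shapes. The helper below is one possible way to flatten them into a single dictionary; normalize_result and its key naming are hypothetical, not part of manyLatents.

```python
from typing import Any

import numpy as np


def normalize_result(name: str, result) -> dict[str, Any]:
    """Flatten a metric's return value into a {key: value} mapping."""
    if isinstance(result, dict):
        # Structured output: prefix every entry with the metric name
        return {f"{name}.{key}": value for key, value in result.items()}
    if isinstance(result, tuple):
        # Scalar plus per-sample array
        score, per_sample = result
        return {name: float(score), f"{name}.per_sample": np.asarray(per_sample)}
    # Plain scalar
    return {name: float(result)}


scalar = normalize_result("trustworthiness", 0.95)
paired = normalize_result("continuity", (0.9, np.array([0.8, 1.0])))
structured = normalize_result("persistent_homology", {"beta_0": 1, "beta_1": 2})
```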

Configuration

Metrics use Hydra's _partial_: True for deferred parameter binding:

# configs/metrics/embedding/trustworthiness.yaml
_target_: manylatents.metrics.trustworthiness.Trustworthiness
_partial_: true
n_neighbors: 5
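
With `_partial_: true`, Hydra's instantiate returns a functools.partial instead of calling the target, so the configured parameters are bound now and the data arguments are supplied later. The sketch below reproduces that effect by hand; toy_trustworthiness is a hypothetical placeholder, not the real metric.

```python
from functools import partial

import numpy as np


def toy_trustworthiness(embeddings, dataset=None, module=None, n_neighbors=5, cache=None) -> float:
    """Hypothetical placeholder: returns the effective neighbor count so the binding is visible."""
    return float(min(n_neighbors, len(embeddings)))


# Equivalent of instantiating the YAML above: parameters bound, data still missing
metric_fn = partial(toy_trustworthiness, n_neighbors=5)

# Later, once embeddings exist, the evaluation code supplies the rest
embeddings = np.zeros((100, 2))
score = metric_fn(embeddings)
```

Deferred binding lets one config describe a metric before any embedding has been computed.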

Multi-Scale Expansion

List-valued parameters expand via Cartesian product through flatten_and_unroll_metrics():

n_neighbors: [5, 10, 20]  # Produces 3 separate evaluations

Naming convention: embedding.trustworthiness__n_neighbors_5, embedding.trustworthiness__n_neighbors_10, etc.
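
The expansion can be pictured with itertools.product. This is an illustrative sketch, not the body of flatten_and_unroll_metrics(): expand_metric_params is a hypothetical name, and joining several list-valued parameters with a double underscore is an assumption extrapolated from the single-parameter names above.

```python
from itertools import product


def expand_metric_params(name: str, params: dict) -> dict[str, dict]:
    """Expand list-valued parameters into one named parameter set per combination."""
    list_keys = [k for k, v in params.items() if isinstance(v, (list, tuple))]
    if not list_keys:
        return {name: dict(params)}

    expanded = {}
    for combo in product(*(params[k] for k in list_keys)):
        bound = dict(params)
        bound.update(zip(list_keys, combo))  # replace each list with one value
        suffix = "__".join(f"{k}_{v}" for k, v in zip(list_keys, combo))
        expanded[f"{name}__{suffix}"] = bound
    return expanded


runs = expand_metric_params(
    "embedding.trustworthiness",
    {"n_neighbors": [5, 10, 20], "metric": "euclidean"},
)
```

Scalar parameters such as metric are copied unchanged into every expanded run.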

Shared kNN Cache

Metrics that need kNN graphs share a cache computed once with max(k) across all metrics, avoiding redundant computation.
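
The idea works because neighbors sorted by distance for the largest k contain the answer for every smaller k as a prefix. A minimal sketch, assuming a brute-force search and a dictionary-shaped cache (both hypothetical; the real cache layout is not documented here):

```python
import numpy as np


def build_knn_cache(x: np.ndarray, ks: list[int]) -> dict:
    """Compute sorted neighbors once, for the largest k any metric requested."""
    k_max = max(ks)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)  # exclude self-matches
    idx = np.argsort(d, axis=1)[:, :k_max]
    dist = np.take_along_axis(d, idx, axis=1)
    return {"k": k_max, "indices": idx, "distances": dist}


def neighbors_from_cache(cache: dict, k: int) -> np.ndarray:
    """Slice the first k columns; valid because columns are sorted by distance."""
    if k > cache["k"]:
        raise ValueError(f"cache holds k={cache['k']}, but k={k} was requested")
    return cache["indices"][:, :k]


rng = np.random.default_rng(0)
x = rng.normal(size=(30, 4))

cache = build_knn_cache(x, ks=[5, 10, 20])  # one search covers all three metrics
nn5 = neighbors_from_cache(cache, 5)
nn10 = neighbors_from_cache(cache, 10)
```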

Writing a New Metric

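import numpy as np


def YourMetric(
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    k: int = 10,  # bound from the config; unused by the placeholder below
    cache=None,
) -> float:
    """Template metric; replace the placeholder body with your computation."""
    # Placeholder: mean distance of each point to the embedding centroid
    centroid = embeddings.mean(axis=0)
    score = float(np.linalg.norm(embeddings - centroid, axis=1).mean())
    return score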

Choosing the Right Level

Only needs original data? → metrics/dataset/
Compares original vs. reduced? → metrics/embedding/
Needs algorithm internals? → metrics/module/

Config

# configs/metrics/embedding/your_metric.yaml
_target_: manylatents.metrics.your_metric.YourMetric
_partial_: true
k: 10

Testing

Use metrics=noop to verify integration:

uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop

Null Metrics Support

manyLatents supports running experiments without metrics computation — useful for fast debugging, exploratory analysis, or workflows where metrics are computed separately.

Usage

CLI (Default)

Metrics are null by default. Just don't specify them:

# No metrics (default)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca

# With metrics (explicit opt-in)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop

Experiment Configs

# configs/experiment/my_experiment.yaml
# @package _global_
defaults:
  - override /algorithms/latent: pca
  - override /data: swissroll
  - override /callbacks/embedding: default
  # No metrics override - stays null

Python API

from manylatents.api import run

result = run(
    data="swissroll",
    algorithms={'latent': 'pca'},
    metrics=None  # Explicitly disable
)

Expected Behavior

When metrics=null:

Generates embeddings
Saves embeddings to files
Creates plots (if callbacks configured)
Logs to wandb (if configured)
Does NOT compute evaluation metrics
Shows warning: "No scores found"

Design: Opt-In by Default

The base config (configs/config.yaml) sets metrics to null. Experiment configs opt in:

# configs/experiment/single_algorithm.yaml
defaults:
  - override /metrics: noop  # Opt in for this experiment

Hydra Limitation

Hydra CLI does not support null as an override value. You cannot do metrics=null on the command line — Hydra's parser converts "null" to Python None, which its override validator rejects.

Workarounds:

Use experiment configs without metrics specified
Use the Python API with metrics=None (our code handles this)
Set metrics: null in a config file (the base config already does this)

The API intercepts None values before Hydra sees them and sets them after config composition via OmegaConf.update().
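
The interception pattern can be sketched without Hydra. In the code below, fake_compose is a hypothetical stand-in for config composition, a plain dict stands in for the composed config, and the base deliberately has metrics set so the effect is visible. The final dict update plays the role the text assigns to OmegaConf.update().

```python
def split_none_overrides(overrides: dict) -> tuple[dict, dict]:
    """Separate overrides safe to pass to composition from those that must be applied after."""
    to_compose = {k: v for k, v in overrides.items() if v is not None}
    to_null = {k: None for k, v in overrides.items() if v is None}
    return to_compose, to_null


def fake_compose(overrides: dict) -> dict:
    """Hypothetical stand-in for Hydra composition: base values plus overrides."""
    cfg = {"data": "default", "algorithms": {}, "metrics": "noop"}
    cfg.update(overrides)
    return cfg


requested = {"data": "swissroll", "algorithms": {"latent": "pca"}, "metrics": None}

to_compose, to_null = split_none_overrides(requested)
cfg = fake_compose(to_compose)  # composition never sees a None value
cfg.update(to_null)             # None values are written in afterwards
```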

Troubleshooting

"Could not find 'metrics/none'"

You passed metrics=none as a CLI override. Hydra treats none as a config name and looks for metrics/none.yaml, which does not exist.

Fix: Use an experiment config, or the API with metrics=None.

Metrics Still Being Computed

Check that:

Your experiment config doesn't have - override /metrics: ... in defaults
You're not passing metrics=... on the command line
The final config shows metrics: null