# Metrics

The evaluation system for manyLatents: a three-level architecture for measuring embedding quality, dataset properties, and algorithm internals.
## Embedding Metrics

Evaluate the quality of low-dimensional embeddings by comparing the high-dimensional input to the low-dimensional output.
| metric | config | defaults | description |
|---|---|---|---|
| AlignmentScore | `metrics/embedding=alignment_score` | k=20, method=jaccard | Composite per-variant alignment score |
| Anisotropy | `metrics/embedding=anisotropy` | -- | Anisotropy of the embedding space |
| AUC | `metrics/embedding=auc` | -- | Area under the ROC curve for binary classification |
| CKA | `metrics/embedding=cka` | kernel=linear | Centered Kernel Alignment with a linear kernel |
| Continuity | `metrics/embedding=continuity` | return_per_sample=True | Continuity of the embedding (preservation of original neighborhoods) |
| CrossModalJaccard | `metrics/embedding=cross_modal_jaccard` | k=20, metric=euclidean | Cross-modal k-NN neighborhood Jaccard overlap |
| DiffusionCondensation | `metrics/embedding=diffusion_condensation` | scale=1.025, granularity=0.1, knn=5, decay=40, n_pca=50, output_mode=stable | Diffusion condensation score |
| DiffusionCurvature | `metrics/embedding=diffusion_curvature` | t=3, percentile=5 | Diffusion curvature of the embedding manifold |
| DiffusionSpectralEntropy | `metrics/embedding=diffusion_spectral_entropy` | t=1, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | `metrics/embedding=dse_dense` | output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=dense, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | `metrics/embedding=dse_knn` | output_mode=eigenvalue_count, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, kernel=knn, k=${neighborhood_size}, alpha=1.0 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| DiffusionSpectralEntropy | `metrics/embedding=dse_t_sweep` | output_mode=eigenvalue_count_sweep, t_high=[10, 50, 100, 200, 500], numerical_floor=1e-06, gaussian_kernel_sigma=10 | Diffusion spectral entropy (eigenvalue count at diffusion time t) |
| FractalDimension | `metrics/embedding=fractal_dimension` | n_box_sizes=10 | Correlation fractal dimension of the embedding |
| KNNPreservation | `metrics/embedding=knn_preservation` | n_neighbors=10, metric=euclidean | k-nearest-neighbor preservation between original and embedded spaces |
| LocalIntrinsicDimensionality | `metrics/embedding=local_intrinsic_dimensionality` | k=20 | Mean local intrinsic dimensionality of the embedding |
| MagnitudeDimension | `metrics/embedding=magnitude_dimension` | n_ts=50, log_scale=False, scale_finding=convergence, target_prop=0.95, metric=euclidean, p=2, n_neighbors=12, method=cholesky, one_point_property=True, perturb_singularities=True, positive_magnitude=False, exact=False | Magnitude-based effective dimensionality |
| NoOp | `metrics/embedding=noop` | k=25 | |
| OutlierScore | `metrics/embedding=outlier_score` | k=20, return_scores=False | Outlier scores using Local Outlier Factor |
| ParticipationRatio | `metrics/embedding=participation_ratio` | n_neighbors=25, return_per_sample=True | Local participation ratio measuring effective dimensionality |
| PearsonCorrelation | `metrics/embedding=pearson_correlation` | return_per_sample=False, num_dists=100 | Pearson correlation between pairwise distances |
| PersistentHomology | `metrics/embedding=persistent_homology` | homology_dim=1, persistence_threshold=0.1, max_N=2000, random_seed=0 | Count of loops/cycles (H1 Betti number) |
| PersistentHomology | `metrics/embedding=persistent_homology_beta0` | homology_dim=0, persistence_threshold=3.0 | Count of connected components (H0 Betti number) |
| RankAgreement | `metrics/embedding=rank_agreement` | k=20, metric_fn=lid | Rank-based agreement of LID/PR across modalities |
| SilhouetteScore | `metrics/embedding=silhouette` | metric=euclidean | Silhouette score for cluster separation in the embedding |
| TangentSpaceApproximation | `metrics/embedding=tangent_space` | n_neighbors=25, variance_threshold=0.95, return_per_sample=True | Tangent space alignment between original and embedded spaces |
| Trustworthiness | `metrics/embedding=trustworthiness` | n_neighbors=5, metric=euclidean | Trustworthiness of the embedding |
| Trustworthiness | `metrics/embedding=trustworthiness_k` | n_neighbors=[15, 25, 50, 100, 250], metric=euclidean | Trustworthiness of the embedding at multiple neighborhood sizes |
Config pattern: `metrics/embedding=<name>`
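To make concrete what an embedding metric computes, here is a numpy-only sketch of k-NN preservation. The function name and exact tie-breaking are illustrative, not the library's implementation:

```python
import numpy as np

def knn_preservation_sketch(X: np.ndarray, Y: np.ndarray, k: int = 10) -> float:
    """Fraction of each point's k nearest neighbors (excluding itself)
    that are the same in the original space X and the embedding Y."""
    def knn_indices(Z):
        d = np.linalg.norm(Z[:, None, :] - Z[None, :, :], axis=-1)
        np.fill_diagonal(d, np.inf)          # exclude self-neighbors
        return np.argsort(d, axis=1)[:, :k]  # indices of the k nearest points

    a, b = knn_indices(X), knn_indices(Y)
    overlaps = [len(set(a[i]) & set(b[i])) / k for i in range(len(X))]
    return float(np.mean(overlaps))
```

An embedding that preserves all neighborhoods scores 1.0; an unrelated embedding scores near chance level.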
## Module Metrics

Evaluate algorithm-specific internal components. These metrics require a fitted module exposing `affinity_matrix()` or `kernel_matrix()`.
| metric | config | defaults | description |
|---|---|---|---|
| AffinitySpectrum | `metrics/module=affinity_spectrum` | -- | Top-k eigenvalues of the affinity matrix |
| ConnectedComponents | `metrics/module=connected_components` | -- | Number of connected components in the kNN graph |
| DatasetTopologyDescriptor | `metrics/module=dataset_topology_descriptor` | -- | Topological descriptor of the dataset structure |
| DiffusionMapCorrelation | `metrics/module=diffusion_map_correlation` | dm_components=2, alpha=1.0, correlation_type=pearson | Correlation between diffusion map and embedding distances |
| KernelMatrixDensity | `metrics/module=kernel_matrix_density` | threshold=1e-10 | Density of the kernel/affinity matrix |
| KernelMatrixSparsity | `metrics/module=kernel_matrix_sparsity` | threshold=1e-10 | Sparsity of the kernel/affinity matrix |
| NoOp | `metrics/module=noop` | k=25 | |
| SpectralDecayRate | `metrics/module=spectral_decay_rate` | top_k=20 | Exponential decay fit to the eigenvalue spectrum |
| SpectralGapRatio | `metrics/module=spectral_gap_ratio` | -- | Ratio of first to second eigenvalue of the diffusion operator |
Config pattern: `metrics/module=<name>`
## Dataset Metrics

Evaluate properties of the original high-dimensional data, independent of the DR algorithm.
| metric | config | defaults | description |
|---|---|---|---|
| GroundTruthPreservation | `metrics/dataset=admixture_laplacian` | -- | Admixture Laplacian preservation score |
| GeodesicDistanceCorrelation | `metrics/dataset=geodesic_distance_correlation` | correlation_type=spearman | Correlation between geodesic and embedded distances |
| NoOp | `metrics/dataset=noop` | k=25 | |
| kmeans_stratification | `metrics/dataset=stratification` | random_state=${seed} | K-means stratification score for population structure |
Config pattern: `metrics/dataset=<name>`
## Metric Protocol

All metrics must match the `Metric` protocol (`manylatents/metrics/metric.py`):
```python
def __call__(
    self,
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    cache=None,
) -> float | tuple[float, np.ndarray] | dict[str, Any]: ...
```
### Return Types

| Type | Use Case | Example |
|---|---|---|
| `float` | Simple scalar | Trustworthiness: 0.95 |
| `tuple[float, ndarray]` | Scalar + per-sample | Continuity with `return_per_sample=True` |
| `dict[str, Any]` | Structured output | Persistent homology: `{'beta_0': ..., 'beta_1': ...}` |
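A toy metric illustrating the first two return types; the function name and computation are hypothetical, chosen only to show the shape of the return values:

```python
import numpy as np

def mean_norm_metric(embeddings: np.ndarray, return_per_sample: bool = False):
    """Per-sample embedding norms, aggregated to a scalar mean.

    Returns a float by default, or a (float, per-sample ndarray)
    tuple when return_per_sample=True, mirroring the protocol.
    """
    per_sample = np.linalg.norm(embeddings, axis=1)
    score = float(per_sample.mean())
    return (score, per_sample) if return_per_sample else score
```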
## Configuration

Metrics use Hydra's `_partial_: true` for deferred parameter binding:

```yaml
# configs/metrics/embedding/trustworthiness.yaml
_target_: manylatents.metrics.trustworthiness.Trustworthiness
_partial_: true
n_neighbors: 5
```
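The effect of `_partial_: true` is that instantiation yields a partially bound callable: config parameters are fixed, while data arguments stay open until the evaluation loop supplies them. A minimal sketch of that behavior using `functools.partial` (the metric function here is a placeholder, not the library's Trustworthiness):

```python
from functools import partial

# A metric callable with a tunable parameter, as Hydra would target it.
def trustworthiness_like(embeddings, n_neighbors: int = 5) -> float:
    # Placeholder computation; a real metric would compare neighborhoods.
    return 1.0 / n_neighbors

# With _partial_: true, instantiation returns a callable with config
# parameters bound but data arguments still open, mimicked here:
metric = partial(trustworthiness_like, n_neighbors=5)

# The evaluation loop later calls it with the embeddings:
score = metric(embeddings=[[0.0, 1.0]])
```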
### Multi-Scale Expansion

List-valued parameters expand via Cartesian product through `flatten_and_unroll_metrics()`:

```yaml
n_neighbors: [5, 10, 20]  # produces 3 separate evaluations
```

Naming convention: `embedding.trustworthiness__n_neighbors_5`, `embedding.trustworthiness__n_neighbors_10`, etc.
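The expansion and naming convention can be sketched as follows; the helper name `unroll` and its signature are illustrative, not the actual `flatten_and_unroll_metrics()` interface:

```python
from itertools import product

def unroll(name: str, params: dict) -> dict:
    """Expand list-valued params into one config per combination,
    suffixing the metric name with __<param>_<value> per swept value."""
    swept = {k: v for k, v in params.items() if isinstance(v, list)}
    fixed = {k: v for k, v in params.items() if not isinstance(v, list)}
    if not swept:
        return {name: params}
    out = {}
    for combo in product(*swept.values()):
        bound = dict(zip(swept.keys(), combo))
        key = name + "".join(f"__{k}_{v}" for k, v in bound.items())
        out[key] = {**fixed, **bound}
    return out

configs = unroll("embedding.trustworthiness",
                 {"n_neighbors": [5, 10, 20], "metric": "euclidean"})
# Three entries: embedding.trustworthiness__n_neighbors_5, ..._10, ..._20
```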
### Shared kNN Cache

Metrics that need kNN graphs share a cache computed once with the maximum k across all metrics, avoiding redundant computation.
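The idea, sketched with plain numpy (the cache structure here is an assumption, not the library's actual cache object): sort neighbors once at the largest requested k, and let each metric take a prefix slice instead of recomputing the graph.

```python
import numpy as np

def build_knn_cache(X: np.ndarray, ks: list[int]) -> np.ndarray:
    """Neighbor indices computed once at max(ks); metrics needing a
    smaller k slice the first k columns of the cached array."""
    k_max = max(ks)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)               # exclude self-neighbors
    return np.argsort(d, axis=1)[:, :k_max]   # (n, k_max) neighbor indices

cache = build_knn_cache(np.random.default_rng(0).normal(size=(50, 3)), ks=[5, 20])
neighbors_k5 = cache[:, :5]  # slice for a metric that needs k=5
```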
Writing a New Metric
import numpy as np
from typing import Optional
def YourMetric(
embeddings: np.ndarray,
dataset=None,
module=None,
k: int = 10,
cache=None,
) -> float:
# Your computation
return score
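A concrete (and hypothetical) example following that template: a mean k-NN distance metric that uses only the embeddings and the `k` parameter. The name and computation are illustrative, not a metric shipped with manyLatents:

```python
import numpy as np

def MeanKnnDistance(
    embeddings: np.ndarray,
    dataset=None,
    module=None,
    k: int = 10,
    cache=None,
) -> float:
    """Average distance from each point to its k nearest neighbors;
    smaller values indicate denser local structure."""
    d = np.linalg.norm(embeddings[:, None, :] - embeddings[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)          # ignore self-distances
    nearest = np.sort(d, axis=1)[:, :k]  # k smallest distances per row
    return float(nearest.mean())
```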
### Choosing the Right Level

- Only needs original data? → `metrics/dataset/`
- Compares original vs. reduced? → `metrics/embedding/`
- Needs algorithm internals? → `metrics/module/`
### Config

```yaml
# configs/metrics/embedding/your_metric.yaml
_target_: manylatents.metrics.your_metric.YourMetric
_partial_: true
k: 10
```
### Testing

Use `metrics=noop` to verify integration:

```shell
uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop
```
## Null Metrics Support

manyLatents supports running experiments without metrics computation. This is useful for fast debugging, exploratory analysis, or workflows where metrics are computed separately.
### Usage

#### CLI (Default)

Metrics are null by default; just don't specify them:

```shell
# No metrics (default)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca

# With metrics (explicit opt-in)
uv run python -m manylatents.main data=swissroll algorithms/latent=pca metrics=noop
```
#### Experiment Configs

```yaml
# configs/experiment/my_experiment.yaml
# @package _global_
defaults:
  - override /algorithms/latent: pca
  - override /data: swissroll
  - override /callbacks/embedding: default
# No metrics override - stays null
```
#### Python API

```python
from manylatents.api import run

result = run(
    data="swissroll",
    algorithms={"latent": "pca"},
    metrics=None,  # explicitly disable
)
```
### Expected Behavior

When `metrics=null`, a run:

- Generates embeddings
- Saves embeddings to files
- Creates plots (if callbacks are configured)
- Logs to wandb (if configured)
- Does NOT compute evaluation metrics
- Shows the warning "No scores found"
### Design: Opt-In by Default

The base config (`configs/config.yaml`) sets metrics to null. Experiment configs opt in:

```yaml
# configs/experiment/single_algorithm.yaml
defaults:
  - override /metrics: noop  # opt in for this experiment
```
### Hydra Limitation

The Hydra CLI does not support `null` as an override value. You cannot pass `metrics=null` on the command line: Hydra's parser converts "null" to Python None, which its override validator rejects.
Workarounds:

- Use experiment configs without metrics specified
- Use the Python API with `metrics=None` (our code handles this)
- Use config files that set `metrics: null` (e.g., the base config already does this)
The API intercepts `None` values before Hydra sees them and sets them after config composition via `OmegaConf.update()`.
### Troubleshooting

#### "Could not find 'metrics/none'"

You're trying `metrics=none` as a CLI override; Hydra interprets this as looking for `metrics/none.yaml`.

Fix: use an experiment config, or the Python API with `metrics=None`.
#### Metrics Still Being Computed

Check that:

- Your experiment config doesn't have `- override /metrics: ...` in defaults
- You're not passing `metrics=...` on the command line
- The final config shows `metrics: null`