Shrinkr - Covariance matrix shrinkage and LDA
Shrinkr is a Python package for covariance matrix shrinkage and Linear Discriminant Analysis. Methods are implemented in C for performance and exposed through a clean Python interface.
Installation
The package is currently available only on GitHub. Install the most recent release with:
pip install git+https://github.com/ZetrextJG/shrinkr@latest
PyPI release coming soon.
Usage example
This example is also available as a ready-to-run script.
from shrinkr import CovarianceEstimator
from shrinkr import LinearDiscriminantAnalysis as LDA
from shrinkr.functional import accuracy
from shrinkr.monte_carlo import get_guassian_lda_samples
# Generate Gaussian data for covariance estimation and LDA
X, y = get_guassian_lda_samples(p=20, n_per_class=200, seed=1)
# Shrunk covariance estimation:
# Methods like LW Linear, OAS, LW Analytical.
total_covariance = CovarianceEstimator(method="lw_linear").fit_predict(X)
assert total_covariance.shape == (20, 20)
# Linear Discriminant Analysis with Shrunk covariance estimation:
# Supports all methods from CovarianceEstimator
# but also LDA specialized shrinkages like DEAL.
classifier = LDA(method="deal")
classifier.fit(X, y)
y_pred = classifier.predict(X)
print(accuracy(y, y_pred)) # 1.0, quite a simple task
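To make the idea behind shrinkage LDA concrete, here is a minimal NumPy sketch, independent of Shrinkr. It assumes a pooled within-class covariance shrunk toward a scaled identity with a fixed, hand-picked intensity rho. The names fit_shrunk_lda and predict_lda and the rho parameter are hypothetical and are not part of the Shrinkr API; methods such as DEAL choose the shrinkage from the data rather than fixing it.

```python
import numpy as np


def fit_shrunk_lda(X, y, rho=0.1):
    """Fit LDA with a pooled covariance shrunk toward a scaled identity.

    rho is a fixed shrinkage intensity in [0, 1] (an illustrative assumption).
    """
    classes, idx = np.unique(y, return_inverse=True)
    n, p = X.shape
    # Per-class means, shape (n_classes, p)
    means = np.stack([X[idx == k].mean(axis=0) for k in range(len(classes))])
    # Pooled within-class covariance from class-centered samples
    centered = X - means[idx]
    S = centered.T @ centered / n
    # Linear shrinkage toward (trace(S) / p) * I
    sigma = (1.0 - rho) * S + rho * (np.trace(S) / p) * np.eye(p)
    # Discriminant for class k: x @ W[:, k] + b[k]
    W = np.linalg.solve(sigma, means.T)
    priors = np.bincount(idx) / n
    b = -0.5 * np.sum(means.T * W, axis=0) + np.log(priors)
    return classes, W, b


def predict_lda(model, X):
    """Assign each row of X to the class with the largest discriminant score."""
    classes, W, b = model
    return classes[np.argmax(X @ W + b, axis=1)]


# Two well-separated Gaussian classes in 5 dimensions
rng = np.random.default_rng(0)
X_demo = np.vstack([
    rng.standard_normal((100, 5)),
    rng.standard_normal((100, 5)) + 2.0,
])
y_demo = np.repeat([0, 1], 100)

model = fit_shrunk_lda(X_demo, y_demo, rho=0.1)
y_hat = predict_lda(model, X_demo)
train_acc = float(np.mean(y_hat == y_demo))
print(train_acc)
```

Solving the linear system with np.linalg.solve avoids forming the inverse covariance explicitly, and the identity term keeps the system well conditioned when the number of features approaches the number of samples.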
Documentation
The documentation site is hosted on GitHub Pages. It is built with MkDocs for the Python API and Doxygen for the C API reference.
Structure
The main classes, CovarianceEstimator and LinearDiscriminantAnalysis, are importable
directly from the package root shrinkr.*.
All shrinkage methods are implemented functionally in the shrinkr.functional module,
with reference Python/NumPy implementations in shrinkr.reference.
Additionally, the Monte Carlo utilities used for tests (and more)
are located in the shrinkr.monte_carlo module.
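For orientation, a reference-style NumPy implementation of a linear shrinkage estimator could look like the sketch below. It follows the Ledoit-Wolf (2004) estimator that shrinks the sample covariance toward a scaled identity. The function name lw_linear_shrinkage is hypothetical, and whether the package's lw_linear method uses exactly this normalization is an assumption; this is not code from shrinkr.reference.

```python
import numpy as np


def lw_linear_shrinkage(X):
    """Ledoit-Wolf style linear shrinkage toward a scaled identity.

    Returns the shrunk covariance and the estimated shrinkage intensity.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    # Biased sample covariance (divides by n)
    S = Xc.T @ Xc / n
    # Scale of the identity target: average sample variance
    mu = np.trace(S) / p
    # d2: normalized squared Frobenius distance between S and the target
    d2 = np.sum((S - mu * np.eye(p)) ** 2) / p
    # b2_bar: estimated sampling error of S, using
    # sum_k ||x_k x_k^T - S||_F^2 = sum_k ||x_k||^4 - n * ||S||_F^2
    row_norms_sq = np.sum(Xc ** 2, axis=1)
    b2_bar = (np.sum(row_norms_sq ** 2) / n - np.sum(S ** 2)) / (n * p)
    b2 = min(b2_bar, d2)
    # Shrinkage intensity in [0, 1]
    rho = 0.0 if d2 == 0 else b2 / d2
    sigma = (1.0 - rho) * S + rho * mu * np.eye(p)
    return sigma, rho


rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 20))
sigma_hat, rho = lw_linear_shrinkage(X_demo)
print(sigma_hat.shape, rho)
```

The estimate keeps the trace of the sample covariance unchanged, which makes it a convenient sanity check when comparing a compiled implementation against a reference one.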
Development
The project is set up with uv:
uv sync --dev
The pure C code lives in ./src. The Python bindings are in
./shrinkr/bindings.c and are exposed via the shrinkr._native
module, with its type interface in ./shrinkr/_native.pyi.
Testing
All tests are in ./.devel/tests and are handled with pytest.
To run the unit test suite:
uv run pytest -m unit
To run the property-based test suite:
uv run pytest -m prop
Styling
Styling is handled entirely with ruff and enforced on every commit by pre-commit. Docstrings must follow the NumPy docstring format, which is also enforced by ruff.
Benchmarking
Benchmarking tools can be found in ./.devel/bench.
They use pytest-benchmark to benchmark
the Python wrappers and Google's benchmark
to benchmark the pure C implementation.
Benchmark results
Benchmarks were run on a Lenovo ThinkSystem SR665 with 2x AMD EPYC 7413 48 Core Processors and sufficient RAM. The number of cores was restricted to 16. NumPy was installed with uv.