API Reference
Core Classes
shrinkr
Shrinkr package.
CovarianceEstimator
Bases: BaseEstimator
Covariance matrix estimator with optional shrinkage.
Wraps several shrinkage methods behind a scikit-learn-compatible
fit / predict interface.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | Shrinkage method to apply. One of | 'empirical' |
| tol | float | Eigenvalue threshold passed to eigenvalue-based methods. Default is 1e-8. | 1e-08 |
See Also
shrinkr.functional.lw_linear
Ledoit-Wolf Linear Shrinkage
shrinkr.functional.lw_analytical
Ledoit-Wolf Analytical Shrinkage
shrinkr.functional.oas
Oracle Approximating Shrinkage
Attributes:
| Name | Type | Description |
|---|---|---|
| is_fitted_ | bool | True after |
| data_ | ndarray | The data passed to |
| cov_ | ndarray or None | Raw sample covariance matrix. Set only for methods that require it. |
| shrunk_cov_ | ndarray or None | Shrinkage-regularized covariance matrix produced by |
Source code in shrinkr/cov.py
fit(X, y=None)
Compute the (shrunk) covariance matrix from data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Data matrix of shape (n_samples, n_features). | required |
| y | ignored | Present for API compatibility. | None |

Returns:

| Name | Type | Description |
|---|---|---|
| self | CovarianceEstimator | Fitted estimator. |
Source code in shrinkr/cov.py
fit_predict(X, y=None)
Fit and immediately return the covariance matrix.
Equivalent to calling fit followed by predict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Data matrix of shape (n_samples, n_features). | required |
| y | ignored | Present for API compatibility. | None |

Returns:

| Type | Description |
|---|---|
| ndarray | The shrinkage-regularized covariance matrix. |
Source code in shrinkr/cov.py
predict(X)
Return the fitted covariance matrix.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ignored | Present for API compatibility. | required |

Returns:

| Type | Description |
|---|---|
| ndarray | The shrinkage-regularized covariance matrix, or the raw sample covariance if no shrinkage method produced a result. |
Source code in shrinkr/cov.py
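The fit / predict flow documented for CovarianceEstimator can be illustrated with a small stand-in class. This is a minimal sketch, assuming NumPy; `SketchCovarianceEstimator` is a hypothetical name, it supports only the 'empirical' method, and it is not the shrinkr implementation.

```python
import numpy as np


class SketchCovarianceEstimator:
    """Hypothetical stand-in mimicking the documented fit / predict flow."""

    def __init__(self, method="empirical"):
        self.method = method
        self.is_fitted_ = False

    def fit(self, X, y=None):
        if self.method != "empirical":
            raise ValueError("this sketch only implements 'empirical'")
        X = np.asarray(X, dtype=float)
        # rowvar=False: rows are samples, columns are features.
        self.cov_ = np.cov(X, rowvar=False)
        self.is_fitted_ = True
        return self  # returning self allows call chaining

    def predict(self, X=None):
        # X is ignored, mirroring the documented signature.
        if not self.is_fitted_:
            raise RuntimeError("call fit before predict")
        return self.cov_

    def fit_predict(self, X, y=None):
        return self.fit(X).predict(X)


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
cov = SketchCovarianceEstimator().fit_predict(X)
```

Returning `self` from `fit` is what makes `fit(X).predict(X)` and `fit_predict(X)` interchangeable.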
LinearDiscriminantAnalysis
Bases: BaseEstimator
Binary Linear Discriminant Analysis with pluggable covariance shrinkage.
Fits a two-class LDA model and classifies by the
log-posterior ratio. The pooled covariance can be estimated with any
method supported by CovarianceEstimator
or with the specialized LDA shrinkage method deal.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| method | str | Estimator used to shrink (compute) the pooled within-class covariance. See | 'empirical' |
See Also
shrinkr.CovarianceEstimator
CovarianceEstimator class with non-directional shrinkage methods
shrinkr.functional.lw_analytical
Deterministic Equivalent Adjusted LDA (directional shrinkage)
Attributes:
| Name | Type | Description |
|---|---|---|
| covariance_estimator_ | CovarianceEstimator | Instance of the Covariance Estimator for shrinkage LDA. |
| classes_ | np.ndarray of shape (2,) | The two class labels seen during |
| priors_ | np.ndarray of shape (2,) | Class prior probabilities estimated from the training data. |
| means_ | np.ndarray of shape (2, n_features) | Per-class sample means. |
| covariance_ | np.ndarray of shape (n_features, n_features) | Pooled within-class covariance matrix (after shrinkage if applicable). |
| precision_ | np.ndarray of shape (n_features, n_features) | Pseudo-inverse of |
| coef_ | np.ndarray of shape (1, n_features) | Linear discriminant direction. |
| intercept_ | np.ndarray of shape (1,) | Decision boundary offset. |
| is_fitted_ | bool | True after |
Source code in shrinkr/lda.py
decision_function(X)
Compute the log-odds score for the positive class.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray of shape (n_samples, n_features) | Samples to score. | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray of shape (n_samples,) | Log-odds of the positive class for each sample. |
Source code in shrinkr/lda.py
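For orientation, the log-odds score of a binary LDA model can be written in terms of the fitted means, pooled covariance and priors. The sketch below uses the textbook formula and assumes NumPy; `lda_log_odds` is a hypothetical helper, and treating the second row of `means` as the positive class is an assumption, not confirmed shrinkr behaviour.

```python
import numpy as np


def lda_log_odds(X, means, covariance, priors):
    """Textbook binary LDA score; positive values favour class index 1."""
    precision = np.linalg.pinv(covariance)       # pseudo-inverse of the pooled covariance
    coef = precision @ (means[1] - means[0])     # linear discriminant direction
    intercept = (-0.5 * (means[0] + means[1]) @ coef
                 + np.log(priors[1] / priors[0]))  # decision boundary offset
    return X @ coef + intercept


means = np.array([[0.0, 0.0], [2.0, 0.0]])
covariance = np.eye(2)
priors = np.array([0.5, 0.5])
X = np.array([[1.0, 5.0], [3.0, 0.0], [-1.0, 0.0]])
scores = lda_log_odds(X, means, covariance, priors)
```

With equal priors and identity covariance the boundary is the midpoint between the two means, so the first sample scores exactly zero.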
fit(X, y)
Fit the LDA model on labelled training data.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray of shape (n_samples, n_features) | Training data. | required |
| y | np.ndarray of shape (n_samples,) | Binary class labels. Exactly two distinct values must be present. | required |

Returns:

| Name | Type | Description |
|---|---|---|
| self | LinearDiscriminantAnalysis | Fitted estimator. |

Raises:

| Type | Description |
|---|---|
| ValueError | If |
Source code in shrinkr/lda.py
fit_predict(X, y)
Fit the model and predict class labels.
Equivalent to calling fit followed by predict.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray of shape (n_samples, n_features) | Training data. | required |
| y | np.ndarray of shape (n_samples,) | Binary class labels. Exactly two distinct values must be present. | required |
Returns:
| Type | Description |
|---|---|
| np.ndarray of shape (n_samples,) | Predicted class label for each sample. |
Source code in shrinkr/lda.py
predict(X)
Predict binary class labels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray of shape (n_samples, n_features) | Samples to classify. | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray of shape (n_samples,) | Predicted class label for each sample. |
Source code in shrinkr/lda.py
predict_proba(X)
Estimate class probabilities using the logistic sigmoid.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | np.ndarray of shape (n_samples, n_features) | Samples to score. | required |

Returns:

| Type | Description |
|---|---|
| np.ndarray of shape (n_samples, 2) | Columns are |
Source code in shrinkr/lda.py
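Turning log-odds into class probabilities with the logistic sigmoid can be sketched in plain Python. The helper names are hypothetical, and the column order (negative class first, positive class second) is an assumption of this sketch rather than something the reference states.

```python
import math


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def proba_from_log_odds(scores):
    # Each row is [P(negative class), P(positive class)];
    # this column order is an assumption of the sketch.
    return [[1.0 - sigmoid(s), sigmoid(s)] for s in scores]


probs = proba_from_log_odds([0.0, 4.0, -4.0])
```

A log-odds of zero maps to equal probabilities, and scores of opposite sign give mirrored rows.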
Functional API
shrinkr.functional
Functional C implementation of shrinkages and more.
accuracy(y, y_pred)
Classification accuracy.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| y | ndarray | True class labels (1D integer array). | required |
| y_pred | ndarray | Predicted class labels (1D integer array). | required |

Returns:

| Type | Description |
|---|---|
| float | Fraction of correctly classified samples, in the range [0, 1]. |
Source code in shrinkr/functional/_losses.py
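The quantity described here is simply the fraction of matching labels. A minimal pure-Python sketch follows; `accuracy_sketch` is a hypothetical name, and the input checks are assumptions of the sketch.

```python
def accuracy_sketch(y, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    if len(y) != len(y_pred):
        raise ValueError("y and y_pred must have the same length")
    if len(y) == 0:
        raise ValueError("at least one sample is required")
    return sum(int(a == b) for a, b in zip(y, y_pred)) / len(y)


score = accuracy_sketch([0, 1, 1, 0], [0, 1, 0, 0])  # three of four labels match
```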
deal(evals, z_vec, n_eff, gamma_min=0.02, gamma_max=100, base_shrinkage='lw_analytical', surrogate_shrinkage='lw_analytical', eps=1e-08, **kwargs)
DEAL (Deterministic Equivalents for Adaptive LDA) shrinkage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| evals | ndarray | Eigenvalues of the empirical covariance matrix. | required |
| z_vec | ndarray | Vector of interest projected into the eigenvector space. | required |
| n_eff | int | Effective number of samples used to compute the empirical covariance matrix. | required |
| gamma_min | float | Minimum value for the gamma bounded search. Default is 0.02. | 0.02 |
| gamma_max | float | Maximum value for the gamma bounded search. Default is 100. | 100 |
| base_shrinkage | (lw_analytical, empirical) | Shrinkage method for the base eigenvalue estimation. Default is 'lw_analytical'. | 'lw_analytical' |
| surrogate_shrinkage | (lw_analytical, empirical) | Shrinkage method for the surrogate eigenvalue estimation. Default is 'lw_analytical'. | 'lw_analytical' |
| eps | float | Epsilon for numerical stability. Default is 1e-8. | 1e-08 |
Notes
The DEAL method uses Random Matrix Theory (RMT) [1] to estimate and optimize an objective defined as the expectation, over the data distribution, of \(\| \hat\Sigma^{-1} \mu - \Sigma^{-1} \mu \|_\Sigma^2\), where \(\hat\Sigma\) is an optimal linear correction to any non-directional shrinkage, \(\Sigma\) is the true population covariance, \(\mu\) is a constant vector of interest, and \(\| \cdot \|_\Sigma\) is the Mahalanobis distance based on the matrix \(\Sigma\). More details will appear in a future paper.
Returns:
| Type | Description |
|---|---|
| ndarray | Shrinkage-adjusted eigenvalues. |

References
- [1] Hachem, W., Loubaton, P., Najim, J., & Vallet, P. (2013). On bilinear forms based on the resolvent of large random matrices. Annales de l'IHP Probabilités et statistiques, 49(1), 36-63. https://www.numdam.org/article/AIHPB_2013__49_1_36_0.pdf
Source code in shrinkr/functional/_deal.py
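Both `evals` and `z_vec` are expressed in the eigenbasis of the empirical covariance. The sketch below shows one way such inputs could be prepared with NumPy; it does not call shrinkr, `mu` is a hypothetical vector of interest, and taking `n_eff = n - 1` is an assumption about how the effective sample size is counted.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
mu = np.ones(5)                      # hypothetical vector of interest

sample_cov = np.cov(X, rowvar=False)
# eigh returns ascending eigenvalues; eigenvectors are the columns of evecs.
evals, evecs = np.linalg.eigh(sample_cov)
z_vec = evecs.T @ mu                 # coordinates of mu in the eigenvector space
n_eff = X.shape[0] - 1               # assumption: one sample lost to mean-centering

# In this basis, applying the inverse covariance is an elementwise division,
# which is why eigenvalue-level shrinkage only needs evals and z_vec.
direction = evecs @ (z_vec / evals)
```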
deal_objective(base_evals, surrogate_evals, z_vec, gamma, n, start_value=1.0)
Objective function of DEAL.
Computes the optimization objective using deterministic equivalents. Requires solving a fixed-point equation for delta at a given gamma.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_evals | ndarray | First 1-D array of eigenvalues for the objective (these are the values that will be shrunk). | required |
| surrogate_evals | ndarray | Second 1-D array of eigenvalues for the objective (used to compute the shrinkage parameters). | required |
| z_vec | ndarray | Vector of interest projected into the eigenvector space. | required |
| gamma | float | The value of gamma to evaluate. During optimization only this value changes. | required |
| n | int | Effective number of samples used to compute the empirical covariance matrix. | required |
| start_value | float | Starting value of delta for the fixed-point iteration used to evaluate the objective. | 1.0 |
See Also
shrinkr.functional.deal
function for more information about the DEAL method.
Returns:
| Type | Description |
|---|---|
| float | The DEAL objective estimate. |
Source code in shrinkr/functional/_deal.py
loss_fm(v, sigma, mu)
Fisher Margin (FM) loss.
Defined as \(FM(v) = -(v^T \mu)^2 / (v^T \Sigma v)\), where \(\mu\) is the true difference-in-means vector, \(\Sigma\) is the true population covariance matrix, and \(v\) is the LDA vector under consideration.
Minimizing this loss yields the Bayes-optimal classifier on data that satisfy the LDA assumptions. The loss is scale invariant.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| v | ndarray | LDA vector computed from data. | required |
| sigma | ndarray | True covariance matrix. | required |
| mu | ndarray | True difference-in-means vector. | required |
Notes
In practice, the Fisher Margin is defined
without the minus sign. The sign is added only to turn
the maximization task into a minimization one,
making it a loss.
Returns:
| Type | Description |
|---|---|
| float | Value of the FM loss. |
Source code in shrinkr/functional/_losses.py
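The formula \(FM(v) = -(v^T \mu)^2 / (v^T \Sigma v)\) translates directly into a few lines of NumPy. This is a minimal sketch; `fisher_margin_loss` is a hypothetical name and the example matrices are made up for illustration.

```python
import numpy as np


def fisher_margin_loss(v, sigma, mu):
    """FM(v) = -(v^T mu)^2 / (v^T Sigma v)."""
    return -float((v @ mu) ** 2 / (v @ sigma @ v))


sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
mu = np.array([1.0, -1.0])

v_opt = np.linalg.solve(sigma, mu)   # Sigma^{-1} mu, the population LDA direction
v_other = np.array([1.0, 0.0])       # an arbitrary competing direction

loss_opt = fisher_margin_loss(v_opt, sigma, mu)
loss_scaled = fisher_margin_loss(3.0 * v_opt, sigma, mu)  # rescaling v leaves the loss unchanged
loss_other = fisher_margin_loss(v_other, sigma, mu)
```

At \(v = \Sigma^{-1}\mu\) the loss equals \(-\mu^T \Sigma^{-1} \mu\), and by the Cauchy-Schwarz inequality no other direction can go lower.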
loss_fr(matrixA, matrixB)
Frobenius distance between two matrices.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| matrixA | ndarray | First matrix. | required |
| matrixB | ndarray | Second matrix. | required |

Returns:

| Type | Description |
|---|---|
| float | Scaled squared Frobenius distance between the matrices. |
Source code in shrinkr/functional/_losses.py
loss_mv(sigma_hat, sigma)
Minimal Variance (MV) loss [1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sigma_hat | ndarray | Estimated covariance matrix. | required |
| sigma | ndarray | True covariance matrix. | required |

Returns:

| Type | Description |
|---|---|
| float | Value of the MV loss. |

References
- [1] Ledoit, O., & Wolf, M. (2020). Analytical nonlinear shrinkage of large-dimensional covariance matrices. The Annals of Statistics, 48(5), 3043-3065. http://www.ledoit.net/Analytical_AoS_2020.pdf
Source code in shrinkr/functional/_losses.py
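As a rough illustration, the sketch below implements the minimum-variance loss in the form I recall from the cited paper, \(\frac{\operatorname{Tr}(\hat\Sigma^{-1} \Sigma \hat\Sigma^{-1})/p}{[\operatorname{Tr}(\hat\Sigma^{-1})/p]^2} - \frac{1}{\operatorname{Tr}(\Sigma^{-1})/p}\). Whether shrinkr uses exactly this normalization is an assumption; `mv_loss_sketch` is a hypothetical name.

```python
import numpy as np


def mv_loss_sketch(sigma_hat, sigma):
    """Minimum-variance loss; zero when sigma_hat is proportional to sigma."""
    p = sigma.shape[0]
    inv_hat = np.linalg.inv(sigma_hat)
    numerator = np.trace(inv_hat @ sigma @ inv_hat) / p
    denominator = (np.trace(inv_hat) / p) ** 2
    baseline = 1.0 / (np.trace(np.linalg.inv(sigma)) / p)
    return float(numerator / denominator - baseline)


sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
loss_exact = mv_loss_sketch(sigma, sigma)           # perfect estimate
loss_scaled = mv_loss_sketch(2.0 * sigma, sigma)    # the loss ignores overall scale
loss_identity = mv_loss_sketch(np.eye(2), sigma)    # identity as a crude estimate
```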
loss_prial(sample_cov, sigma_hat, sigma)
Percentage Relative Improvement in Average Loss (PRIAL) [1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample_cov | ndarray | Sample covariance matrix. | required |
| sigma_hat | ndarray | Estimated covariance matrix. | required |
| sigma | ndarray | True covariance matrix. | required |

Returns:

| Type | Description |
|---|---|
| float | Percentage improvement relative to the oracle, in the range [0, 1]. |

References
- [1] Ledoit, O., & Péché, S. (2011). Eigenvectors of some large sample covariance matrix ensembles. Probability Theory and Related Fields, 151(1), 233-264. https://link.springer.com/article/10.1007/s00440-010-0298-3
Source code in shrinkr/functional/_losses.py
lw_analytical(eigenvalues, n, p=None, eps=1e-08)
Ledoit-Wolf Analytical (nonlinear) shrinkage of eigenvalues.
Based on an optimization-free formula from Ledoit and Wolf (2020) [1]. Also handles the high-dimensional setting where \(p>n\).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| eigenvalues | ndarray | 1-D array of eigenvalues of the sample covariance matrix. | required |
| n | int | Number of observations used to compute the sample covariance. | required |
| p | int | Number of variables. If None, inferred as | None |
| eps | float | Threshold below which eigenvalues are treated as numerically zero. Default is 1e-8. | 1e-08 |

Returns:

| Type | Description |
|---|---|
| ndarray | Analytically shrunk eigenvalues of the same shape as |

References
- [1] Ledoit, O., & Wolf, M. (2020). Analytical nonlinear shrinkage of large-dimensional covariance matrices. The Annals of Statistics, 48(5), 3043-3065. http://www.ledoit.net/Analytical_AoS_2020.pdf
Source code in shrinkr/functional/_lw_analytical.py
lw_linear(X, assume_centered=False)
Ledoit-Wolf linear shrinkage estimator.
The shrinkage value is constructed from Theorem 3.2 and Lemmata 3.2-3.5 of [1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Data matrix of shape (n_samples, n_features). | required |
| assume_centered | bool | If True, data is not mean-centered before computing the covariance. Default is False. | False |

Notes
The regularized covariance is \((1 - s) S_c + s \mu I\), where \(\mu = \operatorname{Tr}(S_c) / n_\text{features}\), \(s\) is the shrinkage value, and \(S_c\) is the sample covariance matrix.
Returns:

| Name | Type | Description |
|---|---|---|
| sample_cov_star | ndarray | Shrinkage-regularized covariance matrix of shape (n_features, n_features). |
| shrinkage | float | Optimal shrinkage coefficient. |

References
- [1] Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88(2), 365-411. http://www.ledoit.net/ole1a.pdf
Source code in shrinkr/functional/_lw_linear.py
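The convex combination in the notes above can be sketched for a given shrinkage value. This example assumes NumPy and takes the shrinkage coefficient as an input; it does not reproduce how the optimal coefficient is estimated, and `shrink_towards_identity` is a hypothetical name.

```python
import numpy as np


def shrink_towards_identity(sample_cov, shrinkage):
    """Return (1 - s) * S_c + s * mu * I with mu = Tr(S_c) / n_features."""
    p = sample_cov.shape[0]
    mu = np.trace(sample_cov) / p
    return (1.0 - shrinkage) * sample_cov + shrinkage * mu * np.eye(p)


S = np.array([[4.0, 1.0], [1.0, 2.0]])
S_half = shrink_towards_identity(S, 0.5)   # halfway between S and 3 * I
S_full = shrink_towards_identity(S, 1.0)   # fully shrunk to the scaled identity
```

Because the target \(\mu I\) has the same trace as \(S_c\), the trace is preserved for every shrinkage value.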
mv_opt_cov(sample_cov, sigma)
Minimal variance optimal rotation-equivariant estimator.
Oracle estimator derived in [1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample_cov | ndarray | Sample covariance matrix. | required |
| sigma | ndarray | True covariance matrix. | required |

Returns:

| Type | Description |
|---|---|
| ndarray | Oracle optimal rotation equivariant estimator under the MV loss. |

References
- [1] Ledoit, O., & Wolf, M. (2020). Analytical nonlinear shrinkage of large-dimensional covariance matrices. The Annals of Statistics, 48(5), 3043-3065. http://www.ledoit.net/Analytical_AoS_2020.pdf
Source code in shrinkr/functional/_losses.py
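A rotation-equivariant estimator keeps the sample eigenvectors and only replaces the eigenvalues. The sketch below uses the oracle values \(d_i = u_i^T \Sigma u_i\) that I recall from the cited paper; it assumes NumPy, `oracle_rotation_equivariant` is a hypothetical name, and it may differ in detail from what shrinkr computes.

```python
import numpy as np


def oracle_rotation_equivariant(sample_cov, sigma):
    """Keep the sample eigenvectors U, use d_i = u_i^T Sigma u_i as eigenvalues."""
    _, U = np.linalg.eigh(sample_cov)            # columns of U are eigenvectors
    d = np.einsum("ji,jk,ki->i", U, sigma, U)    # d[i] = U[:, i] @ sigma @ U[:, i]
    return (U * d) @ U.T                         # U @ diag(d) @ U.T


sample_cov = np.diag([1.0, 2.0, 3.0])
sigma = np.diag([0.5, 1.5, 4.0])
oracle = oracle_rotation_equivariant(sample_cov, sigma)
```

When the sample and population matrices share eigenvectors, as in this toy case, the oracle recovers the population covariance exactly.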
oas(sample_cov, n, p=None)
Oracle Approximating Shrinkage (OAS) covariance estimator.
The formulation is based on [1].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample_cov | ndarray | Sample covariance matrix of shape (p, p). | required |
| n | int | Number of observations used to compute the sample covariance. | required |
| p | int | Number of variables. If None, inferred from | None |

Returns:

| Name | Type | Description |
|---|---|---|
| sample_cov_star | ndarray | Shrinkage-regularized covariance matrix of shape (p, p). |
| shrinkage | float | Optimal shrinkage coefficient. |

References
- [1] Chen, Y., Wiesel, A., Eldar, Y. C., & Hero, A. O. (2010). Shrinkage algorithms for MMSE covariance estimation. IEEE Transactions on Signal Processing, 58(10), 5016-5029. https://arxiv.org/pdf/0907.4698.pdf
Source code in shrinkr/functional/_oas.py
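For illustration, the closed-form OAS coefficient can be sketched as follows. The formula is the one I recall from [1], \(\rho = \min\{[(1 - 2/p)\operatorname{Tr}(S^2) + \operatorname{Tr}^2(S)] / [(n + 1 - 2/p)(\operatorname{Tr}(S^2) - \operatorname{Tr}^2(S)/p)], 1\}\); whether shrinkr uses exactly this variant is an assumption, and `oas_sketch` is a hypothetical name.

```python
import numpy as np


def oas_sketch(sample_cov, n):
    """Shrink sample_cov towards (Tr(S) / p) * I with the OAS coefficient."""
    p = sample_cov.shape[0]
    tr_s = np.trace(sample_cov)
    tr_s2 = np.sum(sample_cov * sample_cov)      # equals Tr(S @ S) for symmetric S
    numerator = (1.0 - 2.0 / p) * tr_s2 + tr_s ** 2
    denominator = (n + 1.0 - 2.0 / p) * (tr_s2 - tr_s ** 2 / p)
    # The denominator vanishes when S is already a multiple of the identity.
    shrinkage = 1.0 if denominator <= 0 else min(numerator / denominator, 1.0)
    target = (tr_s / p) * np.eye(p)
    return (1.0 - shrinkage) * sample_cov + shrinkage * target, shrinkage


S = np.array([[4.0, 1.0], [1.0, 2.0]])
S_star, rho = oas_sketch(S, n=10)
```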
Reference implementations
shrinkr.reference
Shrinkage reference implementations in numpy.
ref_deal(evals, z_vec, n_eff, gamma_min=0.02, gamma_max=100, base_shrinkage='lw_analytical', surrogate_shrinkage='lw_analytical', eps=1e-08, **kwargs)
DEAL (Deterministic Equivalents for Adaptive LDA) shrinkage (reference implementation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| evals | ndarray | Eigenvalues of the empirical covariance matrix. | required |
| z_vec | ndarray | Vector of interest projected into the eigenvector space. | required |
| n_eff | int | Effective number of samples used to compute the empirical covariance matrix. | required |
| gamma_min | float | Minimum value for the gamma bounded search. Default is 0.02. | 0.02 |
| gamma_max | float | Maximum value for the gamma bounded search. Default is 100. | 100 |
| base_shrinkage | (lw_analytical, empirical) | Shrinkage method for the base eigenvalue estimation. Default is 'lw_analytical'. | 'lw_analytical' |
| surrogate_shrinkage | (lw_analytical, empirical) | Shrinkage method for the surrogate eigenvalue estimation. Default is 'lw_analytical'. | 'lw_analytical' |
| eps | float | Epsilon for numerical stability. Default is 1e-8. | 1e-08 |

Returns:

| Type | Description |
|---|---|
| ndarray | Shrinkage-adjusted eigenvalues. |
See Also
shrinkr.functional.deal
Optimized implementation of this method.
Go there for additional notes and references.
Functions ref_* are reference implementations intended for validation.
Source code in shrinkr/reference/_deal.py
ref_deal_objective(base_evals, surrogate_evals, z_vec, gamma, n, start_value=1, max_iters=200, eps=1e-08)
Objective function of DEAL (reference implementation).
Computes the optimization objective using deterministic equivalents. Requires solving a fixed-point equation for delta at a given gamma.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| base_evals | ndarray | Eigenvalue estimates for the base covariance matrix (those that will be shrunk). | required |
| surrogate_evals | ndarray | Eigenvalue estimates used to compute the shrinkage parameters. | required |
| z_vec | ndarray | Vector of interest projected into the eigenvector space. | required |
| gamma | float | Scalar resolvent parameter for the matrix S. | required |
| n | int | Effective number of observations used to compute the sample covariance matrix. | required |
| start_value | float | Starting value for the fixed-point iteration (supports warm-starts). Default is 1. | 1 |
| max_iters | int | Maximum number of fixed-point iterations. Default is 200. | 200 |
| eps | float | Convergence tolerance for early stopping. Default is 1e-8. | 1e-08 |

Returns:

| Name | Type | Description |
|---|---|---|
| obj | float | The DEAL risk objective estimate. |
| delta | float | The converged delta value from the fixed-point iteration. |
| delta_prime | float | The derivative of delta with respect to gamma. |
See Also
shrinkr.functional.deal_objective
Optimized implementation of this method.
Go there for additional notes and references.
Functions ref_* are reference implementations intended for validation.
Source code in shrinkr/reference/_deal.py
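The roles of `start_value`, `max_iters` and `eps` can be illustrated with a generic fixed-point loop. The DEAL update equation is not given in this reference, so the sketch substitutes a toy map (`math.cos`); `fixed_point` is a hypothetical helper and only the stopping logic is meant to carry over.

```python
import math


def fixed_point(update, start_value=1.0, max_iters=200, eps=1e-8):
    """Iterate delta <- update(delta) until successive values differ by less than eps."""
    delta = start_value
    for _ in range(max_iters):
        new_delta = update(delta)
        if abs(new_delta - delta) < eps:   # early stopping on convergence
            return new_delta
        delta = new_delta
    return delta                           # iteration budget exhausted


root = fixed_point(math.cos)                           # cold start from 1.0
warm_root = fixed_point(math.cos, start_value=root)    # warm start near the solution
```

A warm start from a previous solution typically ends after one or two iterations, which is useful when the same equation is solved repeatedly for nearby parameter values.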
ref_lw_analytical(lam, n, eps=1e-08)
Ledoit-Wolf Analytical (nonlinear) shrinkage of eigenvalues (reference implementation).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lam | ndarray | 1-D array of empirical eigenvalues. | required |
| n | int | Effective sample size. | required |
| eps | float | Threshold below which eigenvalues are treated as numerically zero. Default is 1e-8. | 1e-08 |

Returns:

| Type | Description |
|---|---|
| ndarray | Analytically shrunk eigenvalues. |
See Also
shrinkr.functional.lw_analytical
Optimized implementation of this method. This is a reference
implementation intended for validation; prefer the functional version
for performance-critical use.
Source code in shrinkr/reference/_lw_analytical.py
ref_lw_analytical_unstable(lam, n, eps=1e-08)
Ledoit-Wolf Analytical (nonlinear) shrinkage — numerically unstable variant.
Identical in formula to ref_lw_analytical but without the numerical
stability fixes for singularities and large tails. Kept for reference only;
use ref_lw_analytical for proper reference.
Reference implementation from https://github.com/matzhaugen/analytic_shrinkage.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| lam | ndarray | 1-D array of empirical eigenvalues. | required |
| n | int | Effective sample size. | required |
| eps | float | Threshold below which eigenvalues are treated as numerically zero. Default is 1e-8. | 1e-08 |

Returns:

| Type | Description |
|---|---|
| ndarray | Analytically shrunk eigenvalues. |
See Also
shrinkr.functional.lw_analytical
Optimized, numerically stable implementation. This is a reference
implementation intended for validation.
Source code in shrinkr/reference/_lw_analytical.py
ref_lw_linear(X, assume_centered=False, block_size=1000)
Ledoit-Wolf linear shrinkage estimator (reference implementation).
Reference implementation from scikit-learn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| X | ndarray | Data matrix of shape (n_samples, n_features). | required |
| assume_centered | bool | If True, data is not mean-centered before computing the covariance. Default is False. | False |
| block_size | int | Size of blocks into which the covariance matrix is split for computation. Default is 1000. | 1000 |

Returns:

| Name | Type | Description |
|---|---|---|
| sample_cov_star | ndarray | Shrinkage-regularized covariance matrix of shape (n_features, n_features). |
| shrinkage | float | Optimal shrinkage coefficient. |
See Also
shrinkr.functional.lw_linear
Optimized implementation of this method.
Go there for additional notes and references.
Functions ref_* are reference implementations intended for validation.
Source code in shrinkr/reference/_lw_linear.py
ref_oas(sample_cov, n, p=None)
Oracle Approximating Shrinkage (OAS) covariance estimator (reference implementation).
Reference implementation from scikit-learn.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| sample_cov | ndarray | Sample covariance matrix of shape (p, p). | required |
| n | int | Number of observations used to compute the sample covariance. | required |
| p | int | Number of variables. If None, inferred from | None |

Returns:

| Name | Type | Description |
|---|---|---|
| sample_cov_star | ndarray | Shrinkage-regularized covariance matrix of shape (p, p). |
| shrinkage | float | Optimal shrinkage coefficient. |
See Also
shrinkr.functional.oas
Optimized implementation of this method.
Go there for additional notes and references.
Functions ref_* are reference implementations intended for validation.
Source code in shrinkr/reference/_oas.py
Monte Carlo module
shrinkr.monte_carlo
Submodule with simple Monte Carlo methods for generating test data.
get_guassian_lda_samples(p=20, n_per_class=100, seed=0)
Generate data for Gaussian LDA with two classes. Balanced dataset.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| p | int | Number of features (dimension). Default is 20. | 20 |
| n_per_class | int | Number of samples per class. Default is 100. | 100 |
| seed | int | Seed for | 0 |
Source code in shrinkr/monte_carlo.py
get_large_sample_cov(p=20, n=200, seed=0, add_diagonal=0.1)
Generate Gaussian data with a random positive semi-definite covariance matrix.
The population covariance is constructed as A @ A.T (normalized to unit
trace) plus a small ridge add_diagonal for numerical stability.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| p | int | Number of features (dimension). Default is 20. | 20 |
| n | int | Number of samples. Default is 200. | 200 |
| seed | int | Seed for | 0 |
| add_diagonal | float | Value added to the diagonal of the true covariance matrix. Default is 1e-1. | 0.1 |

Returns:

| Name | Type | Description |
|---|---|---|
| X | np.ndarray of shape (n, p) | Simulated data matrix. |
| sample_cov | np.ndarray of shape (p, p) | Sample covariance matrix computed from |
| real_cov | np.ndarray of shape (p, p) | Population (true) covariance matrix. |
Source code in shrinkr/monte_carlo.py
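The construction described for this generator (a Gram matrix normalized to unit trace, plus a diagonal ridge) can be sketched as follows. This assumes NumPy's `default_rng` as the random source and a standard normal `A`, neither of which the reference confirms; `large_sample_cov_sketch` is a hypothetical name.

```python
import numpy as np


def large_sample_cov_sketch(p=20, n=200, seed=0, add_diagonal=0.1):
    """Draw Gaussian data from a random covariance A @ A.T / trace + ridge."""
    rng = np.random.default_rng(seed)
    A = rng.normal(size=(p, p))                 # assumption: standard normal entries
    real_cov = A @ A.T
    real_cov /= np.trace(real_cov)              # normalize to unit trace
    real_cov += add_diagonal * np.eye(p)        # ridge keeps every eigenvalue >= add_diagonal
    X = rng.multivariate_normal(np.zeros(p), real_cov, size=n)
    sample_cov = np.cov(X, rowvar=False)
    return X, sample_cov, real_cov


X, sample_cov, real_cov = large_sample_cov_sketch()
```

After the ridge is added, the trace of the population covariance is `1 + p * add_diagonal`, which gives a quick sanity check on the output.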
get_small_sample_cov(n=50, seed=0)
Generate a small sample from a fixed 2-D multivariate normal distribution.
The population covariance is [[0.4, 0.2], [0.2, 0.8]].
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| n | int | Number of samples to generate. Default is 50. | 50 |
| seed | int | Seed for | 0 |

Returns:

| Name | Type | Description |
|---|---|---|
| X | np.ndarray of shape (n, 2) | Simulated data matrix. |
| sample_cov | np.ndarray of shape (2, 2) | Sample covariance matrix computed from |
| real_cov | np.ndarray of shape (2, 2) | Population (true) covariance matrix. |
Source code in shrinkr/monte_carlo.py