scikitplot.impute

Transformers for missing value imputation.

This submodule re-exports the standard scikit-learn imputers and exposes additional experimental imputers via the scikitplot.experimental namespace.

In particular, ANNImputer is an approximate nearest-neighbours imputer built on top of Spotify's Annoy library. To follow scikit-learn’s experimental API conventions, it is gated behind an experimental switch, which is enabled by importing:

from scikitplot.experimental import enable_ann_imputer
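As a minimal sketch of how the switch is typically used, the snippet below imports the enabling module first and the imputer second, mirroring the pattern scikit-learn uses for its own experimental estimators. The import path `scikitplot.impute.ANNImputer` is an assumption inferred from this page, and the `try`/`except` guard is only there so the snippet also runs where scikit-plots is not installed.

```python
# Hedged sketch: the import order follows scikit-learn's experimental
# convention; the `scikitplot.impute.ANNImputer` path is an assumption.
try:
    # Importing the switch first is what makes the experimental imputer available.
    from scikitplot.experimental import enable_ann_imputer  # noqa: F401
    from scikitplot.impute import ANNImputer
except ImportError:
    # scikit-plots (or a backend it depends on) is not installed here.
    ANNImputer = None

print("ANNImputer available:", ANNImputer is not None)
```

The `# noqa: F401` comment silences linters that would otherwise flag the switch as an unused import; the import is kept purely for its side effect.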

User guide. See the Impute section for further details.

Approximate K-nearest-neighbours (KNN) imputation

ANNImputer performs vector-based approximate K-nearest-neighbours (KNN) [1] imputation.

Annoy [2] (Approximate Nearest Neighbors Oh Yeah) is a C++ library with Python bindings to search for points in space that are close to a given query point.

Voyager [3] is an HNSW-based approximate nearest-neighbor index with a Python API.

Both libraries create large read-only file-based data structures that can be memory-mapped so that many processes may share the same data.
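The idea of a read-only, memory-mapped index file can be illustrated with the Python standard library alone. The sketch below is not the on-disk format of Annoy or Voyager; the record layout (`DIM` float32 values per vector) and the helper `read_vector` are hypothetical and chosen only to show how a process can read one record from a mapped file without loading the whole file.

```python
import mmap
import os
import struct
import tempfile

DIM = 3  # vector dimensionality (illustrative)
ITEM = struct.Struct(f"<{DIM}f")  # one little-endian float32 vector per record

vectors = [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0), (7.0, 8.0, 9.0)]

# Build step: write all vectors once into a flat binary file.
fd, path = tempfile.mkstemp(suffix=".idx")
with os.fdopen(fd, "wb") as f:
    for v in vectors:
        f.write(ITEM.pack(*v))


def read_vector(path, i):
    """Map the file read-only and decode record i straight from the mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return ITEM.unpack_from(mm, i * ITEM.size)


second = read_vector(path, 1)
print(second)
os.remove(path)  # clean up the temporary file
```

Because the mapping is read-only, several processes can open the same file this way and the operating system can back them with the same cached pages instead of one copy per process.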

References

User guide. See the ANNImputer section for further details.

_ann.ANNImputer

Approximate K-nearest-neighbours (KNN) imputer with pluggable ANN backends.
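To make the imputation step concrete, here is a minimal pure-Python sketch of KNN imputation with a swappable neighbour search. It is not ANNImputer's implementation: the functions `knn_impute` and `brute_force_neighbors` are hypothetical, and the search is exact brute force standing in for an approximate backend such as Annoy or Voyager. Each missing entry is filled with the mean of that feature over the k nearest complete rows, where distance is computed on the features observed in the query row.

```python
import math


def brute_force_neighbors(rows, query, k):
    """Exact stand-in for an ANN backend: indices of the k closest complete rows.

    Distance uses only the features that are observed (non-NaN) in `query`.
    """
    observed = [j for j, v in enumerate(query) if not math.isnan(v)]
    candidates = []
    for i, row in enumerate(rows):
        if any(math.isnan(v) for v in row):
            continue  # only complete rows serve as donors in this sketch
        dist = math.sqrt(sum((row[j] - query[j]) ** 2 for j in observed))
        candidates.append((dist, i))
    candidates.sort()
    return [i for _, i in candidates[:k]]


def knn_impute(rows, k=2, find_neighbors=brute_force_neighbors):
    """Fill each NaN with the mean of that feature over the k nearest donors."""
    imputed = []
    for row in rows:
        filled = list(row)
        if any(math.isnan(v) for v in row):
            idx = find_neighbors(rows, row, k)
            for j, v in enumerate(row):
                if math.isnan(v) and idx:
                    filled[j] = sum(rows[i][j] for i in idx) / len(idx)
        imputed.append(filled)
    return imputed


X = [
    [1.0, 2.0],
    [1.0, 4.0],
    [10.0, 20.0],
    [1.5, math.nan],  # second feature is missing
]
result = knn_impute(X, k=2)
print(result[3])  # [1.5, 3.0]
```

Passing a different callable as `find_neighbors` is how an approximate index could be plugged in under this sketch; the averaging step stays the same regardless of how the neighbours are found.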