VectorOpsMixin
- class scikitplot.annoy.VectorOpsMixin
User-facing neighbor queries for Annoy-like backends.
This mixin exposes explicit per-query helpers (query_by_item, query_by_vector) and scikit-learn style batch helpers (kneighbors, kneighbors_graph).
Notes
The output ordering for kneighbors is (neighbors, distances) when include_distances=True (neighbors first). This differs from sklearn.neighbors.NearestNeighbors.kneighbors, which returns distances first. The order is intentional.
- kneighbors(X, n_neighbors=5, *, search_k=-1, include_distances=True, exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False, output_type='vector')
Find k nearest neighbors for one or more query vectors.
This is a sklearn-like convenience wrapper that returns rectangular arrays.
- Parameters:
- X : array-like of shape (f,) or (n_queries, f)
Query vector(s).
- n_neighbors : int, default=5
Number of neighbors to return per query.
- search_k : int, default=-1
Search parameter forwarded to the backend.
- include_distances : bool, default=True
If True, return (neighbors, distances). Otherwise return neighbors.
- exclude_self : bool, default=False
If True, apply the same deterministic self-exclusion rule as query_by_vector for each query row.
- exclude_item_ids : iterable of int, optional
Exclude these ids for every query.
- ensure_all_finite : bool or 'allow-nan', default=True
Input validation option forwarded to scikit-learn.
- copy : bool, default=False
Input validation option forwarded to scikit-learn.
- output_type : {'item', 'vector'}, default='vector'
If 'item', return neighbor ids. If 'vector', return neighbor vectors.
- Returns:
- neighbors : numpy.ndarray
If output_type='item', shape is (n_queries, n_neighbors). If output_type='vector', shape is (n_queries, n_neighbors, f).
- distances : numpy.ndarray of shape (n_queries, n_neighbors)
Neighbor distances. Returned when include_distances=True.
- Raises:
- sklearn.exceptions.NotFittedError
If the backend reports that the index is unbuilt.
- ValueError
If n_neighbors <= 0 or any query yields too few neighbors after exclusions.
See also
query_by_vector : Per-query 1D interface.
kneighbors_graph : CSR kNN graph.
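The return layout described for kneighbors can be illustrated without the library. The sketch below is not the mixin's implementation: `brute_force_kneighbors` is a hypothetical NumPy stand-in that uses exact Euclidean distances and ignores `search_k` and the exclusion options. It only shows the neighbors-first ordering and how the shape of `neighbors` depends on `output_type`.

```python
import numpy as np


def brute_force_kneighbors(index_vectors, X, n_neighbors=5, output_type="vector"):
    """Hypothetical stand-in that mimics the documented return layout."""
    index_vectors = np.asarray(index_vectors, dtype=np.float32)
    X = np.atleast_2d(np.asarray(X, dtype=np.float32))  # (f,) becomes (1, f)
    if n_neighbors <= 0:
        raise ValueError("n_neighbors must be positive")
    if n_neighbors > len(index_vectors):
        raise ValueError("not enough items for n_neighbors")
    # Pairwise Euclidean distances, shape (n_queries, n_items).
    diffs = X[:, None, :] - index_vectors[None, :, :]
    all_distances = np.sqrt((diffs ** 2).sum(axis=-1))
    # Ids of the n_neighbors closest items per query, nearest first.
    ids = np.argsort(all_distances, axis=1, kind="stable")[:, :n_neighbors]
    distances = np.take_along_axis(all_distances, ids, axis=1)
    # 'item' keeps the ids; 'vector' gathers the stored vectors for those ids.
    neighbors = ids if output_type == "item" else index_vectors[ids]
    return neighbors, distances  # neighbors first, as documented above


items = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
queries = np.array([[0.1, 0.0], [2.9, 3.0]])

ids, dist = brute_force_kneighbors(items, queries, n_neighbors=2, output_type="item")
vecs, _ = brute_force_kneighbors(items, queries, n_neighbors=2, output_type="vector")
print(ids.shape, vecs.shape, dist.shape)

# Code written for scikit-learn expects (distances, indices), so swap explicitly.
sk_distances, sk_indices = dist, ids
```

When porting code from sklearn.neighbors.NearestNeighbors, unpacking the tuple in the wrong order is the most likely mistake, so naming both values at the call site is safer than indexing the result.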
- kneighbors_graph(X, n_neighbors=5, *, search_k=-1, mode='connectivity', exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False, output_type='item')
Compute the k-neighbors graph (CSR) for query vectors.
- Parameters:
- X : array-like of shape (f,) or (n_queries, f)
Query vector(s).
- n_neighbors : int, default=5
Number of neighbors per query.
- search_k : int, default=-1
Search parameter forwarded to the backend.
- mode : {'connectivity', 'distance'}, default='connectivity'
If 'connectivity', graph entries are 1. If 'distance', entries are backend distances.
- exclude_self : bool, default=False
If True, apply the same deterministic self-exclusion rule as kneighbors for each query row.
- exclude_item_ids : iterable of int, optional
Exclude these ids for every query.
- ensure_all_finite : bool or 'allow-nan', default=True
Input validation option forwarded to scikit-learn.
- copy : bool, default=False
Input validation option forwarded to scikit-learn.
- output_type : {'item'}, default='item'
Must be 'item' for CSR construction.
- Returns:
- graph : scipy.sparse.csr_matrix
CSR matrix of shape (n_queries, n_items).
- Raises:
- ImportError
If SciPy is not installed.
- ValueError
If mode is invalid or output_type != 'item'.
- RuntimeError
If the backend returns an out-of-range neighbor id.
See also
kneighbors : Dense kNN results.
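To make the graph layout concrete, the following sketch packs dense neighbor ids and distances into a CSR matrix of shape (n_queries, n_items). `neighbors_to_csr` is a hypothetical helper written for illustration and is not claimed to match the library's internals. It assumes every query row has exactly the same number of neighbors, and it mirrors the documented ValueError for an invalid mode and RuntimeError for an out-of-range id.

```python
import numpy as np
from scipy.sparse import csr_matrix


def neighbors_to_csr(ids, distances, n_items, mode="connectivity"):
    """Hypothetical helper: pack dense kNN results into a CSR graph."""
    if mode not in ("connectivity", "distance"):
        raise ValueError(f"invalid mode: {mode!r}")
    ids = np.asarray(ids, dtype=np.int64)
    n_queries, k = ids.shape
    if ids.size and (ids.min() < 0 or ids.max() >= n_items):
        raise RuntimeError("neighbor id out of range")
    if mode == "connectivity":
        data = np.ones(ids.size, dtype=np.float32)
    else:
        data = np.asarray(distances, dtype=np.float32).ravel()
    # Every row holds exactly k entries, so row pointers advance in steps of k.
    indptr = np.arange(0, n_queries * k + 1, k)
    return csr_matrix((data, ids.ravel(), indptr), shape=(n_queries, n_items))


ids = np.array([[1, 2], [0, 2]])
distances = np.array([[0.5, 1.5], [0.5, 2.0]])

graph = neighbors_to_csr(ids, distances, n_items=4, mode="distance")
connectivity = neighbors_to_csr(ids, distances, n_items=4)
print(graph.toarray())
```

Row i of the result describes query i, and column j refers to stored item j, which is why only item ids (not vectors) can be used to build it.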
- query_by_item(item, n_neighbors, *, search_k=-1, include_distances=False, exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False)
Query neighbors by stored item id.
- Parameters:
- item : int
Stored item id.
- n_neighbors : int
Number of neighbors to return after applying exclusions.
- search_k : int, default=-1
Search parameter forwarded to the backend.
- include_distances : bool, default=False
If True, also return distances.
- exclude_self : bool, default=False
If True, exclude item from the returned neighbors.
- exclude_item_ids : iterable of int, optional
Additional item ids to exclude.
- ensure_all_finite : bool or 'allow-nan', default=True
Input validation option forwarded to scikit-learn.
- copy : bool, default=False
Input validation option forwarded to scikit-learn.
- Returns:
- indices : numpy.ndarray of shape (n_neighbors,)
Neighbor ids.
- (indices, distances) : tuple of numpy.ndarray
Returned when include_distances=True.
- Raises:
- sklearn.exceptions.NotFittedError
If the backend reports that the index is unbuilt.
- ValueError
If n_neighbors <= 0 or not enough neighbors remain after exclusions.
See also
query_by_vector : Query neighbors by an explicit vector.
kneighbors : Batch neighbor queries (sklearn-like).
Notes
Exclusions are applied deterministically in the order returned by the backend.
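One way to read "applied deterministically in the order returned by the backend" is as an order-preserving filter followed by truncation. The sketch below assumes that reading. `apply_exclusions` is a hypothetical helper, and the number of candidates the library actually requests from the backend before filtering is not stated in this page.

```python
def apply_exclusions(candidate_ids, n_neighbors, exclude_ids=()):
    """Hypothetical helper: keep backend order, drop excluded ids, truncate."""
    if n_neighbors <= 0:
        raise ValueError("n_neighbors must be positive")
    excluded = set(exclude_ids)
    # Preserve the backend's ordering; only remove excluded ids.
    kept = [i for i in candidate_ids if i not in excluded]
    if len(kept) < n_neighbors:
        raise ValueError("not enough neighbors remain after exclusions")
    return kept[:n_neighbors]


# Candidates as a backend might return them for item 7, nearest first.
candidates = [7, 3, 9, 1, 4]

# Excluding the query item itself (7) and one extra id (9).
result = apply_exclusions(candidates, 3, exclude_ids={7, 9})
print(result)  # [3, 1, 4]
```

Because the filter never reorders candidates, the same backend output and the same exclusion set always give the same result.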
- query_by_vector(vector, n_neighbors, *, search_k=-1, include_distances=False, exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False)
Query neighbors by an explicit vector.
- Parameters:
- vector : array-like of shape (f,)
Query vector.
- n_neighbors : int
Number of neighbors to return after exclusions.
- search_k : int, default=-1
Search parameter forwarded to the backend.
- include_distances : bool, default=False
If True, also return distances.
- exclude_self : bool, default=False
If True, exclude the first returned candidate whose distance is exactly 0.0. This is intended for queries where vector comes from the index itself.
- exclude_item_ids : iterable of int, optional
Additional item ids to exclude.
- ensure_all_finite : bool or 'allow-nan', default=True
Input validation option forwarded to scikit-learn.
- copy : bool, default=False
Input validation option forwarded to scikit-learn.
- Returns:
- indices : numpy.ndarray of shape (n_neighbors,)
Neighbor ids.
- (indices, distances) : tuple of numpy.ndarray
Returned when include_distances=True.
- Raises:
- sklearn.exceptions.NotFittedError
If the backend reports that the index is unbuilt.
- ValueError
If n_neighbors <= 0, the vector dimension does not match f, or not enough neighbors remain after exclusions.
See also
query_by_item : Query neighbors by stored item id.
kneighbors : Batch neighbor queries (sklearn-like).
Notes
Exclusions are applied deterministically in the order returned by the backend. If exclude_self=True and the first returned candidate does not have a distance of exactly 0.0, no self-exclusion is applied.
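The self-exclusion rule for vector queries can be sketched as a check on the leading candidate only. This is an interpretation of the note above, not the library's code: `drop_self_candidate` is a hypothetical helper that removes the first candidate when its distance is exactly 0.0 and otherwise leaves the result untouched.

```python
import numpy as np


def drop_self_candidate(ids, distances):
    """Hypothetical helper: drop the leading candidate only if its distance is exactly 0.0."""
    ids = np.asarray(ids)
    distances = np.asarray(distances, dtype=float)
    if distances.size and distances[0] == 0.0:
        return ids[1:], distances[1:]
    # No exact-zero hit in first position: nothing is treated as "self".
    return ids, distances


# Vector taken from the index: the first hit is the item itself.
ids_a, dist_a = drop_self_candidate([4, 8, 2], [0.0, 0.3, 0.7])

# Vector that is not stored in the index: nothing is dropped.
ids_b, dist_b = drop_self_candidate([8, 2, 5], [0.3, 0.7, 0.9])

print(ids_a, ids_b)
```

The exact comparison with 0.0 means a query vector that differs from a stored vector by rounding error is not treated as a self-match, so exclude_item_ids is the more reliable option when the id to skip is known.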
- query_vectors_by_item(item, n_neighbors, *, search_k=-1, include_distances=False, exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False, dtype=numpy.float32, output_type='vector')
Query neighbor vectors by stored item id.
This is a convenience wrapper over query_by_item that materializes vectors using the backend's get_item_vector.
- Parameters:
- item, n_neighbors, search_k, include_distances, exclude_self, exclude_item_ids
See query_by_item.
- ensure_all_finite, copy
See query_by_vector.
- dtype : numpy dtype, default=numpy.float32
Output dtype for the returned vectors.
- output_type : {'item', 'vector'}, default='vector'
If 'vector', return neighbor vectors. If 'item', return neighbor ids.
- Returns:
- vectors : numpy.ndarray of shape (n_neighbors, f)
Neighbor vectors.
- (vectors, distances) : tuple
Returned when include_distances=True.
See also
query_vectors_by_vector : Vector query returning vectors (or ids).
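Materializing vectors from neighbor ids amounts to one get_item_vector lookup per id, stacked into an (n_neighbors, f) array of the requested dtype. The sketch below assumes only that the backend exposes get_item_vector(item), as stated above. `ToyBackend` and `materialize_vectors` are hypothetical names used for illustration.

```python
import numpy as np


class ToyBackend:
    """Hypothetical stand-in that exposes only get_item_vector."""

    def __init__(self, vectors):
        self._vectors = vectors

    def get_item_vector(self, item):
        return self._vectors[item]


def materialize_vectors(backend, ids, dtype=np.float32):
    """Stack the stored vector of each neighbor id into an (n_neighbors, f) array."""
    return np.asarray([backend.get_item_vector(int(i)) for i in ids], dtype=dtype)


backend = ToyBackend({0: [0.0, 1.0], 1: [1.0, 0.0], 2: [0.5, 0.5]})

# Ids as an id-returning query might produce them, nearest first.
vectors = materialize_vectors(backend, [2, 0])
print(vectors.shape, vectors.dtype)
```

Row order in the result follows the order of the ids, so any distances returned alongside stay aligned with the vectors.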
- query_vectors_by_vector(vector, n_neighbors, *, search_k=-1, include_distances=False, exclude_self=False, exclude_item_ids=None, ensure_all_finite=True, copy=False, dtype=numpy.float32, output_type='vector')
Query neighbor vectors by an explicit vector.
Convenience wrapper over query_by_vector. By default it returns vectors; set output_type='item' to return neighbor ids instead.
- Parameters:
- vector, n_neighbors, search_k, include_distances, exclude_self, exclude_item_ids
See query_by_vector.
- ensure_all_finite, copy
See query_by_vector.
- dtype : numpy dtype, default=numpy.float32
Output dtype for the returned vectors.
- output_type : {'item', 'vector'}, default='vector'
If 'vector', return neighbor vectors. If 'item', return neighbor ids.
- Returns:
- neighbors : numpy.ndarray
If output_type='vector', an array of shape (n_neighbors, f). If output_type='item', an array of shape (n_neighbors,).
- (neighbors, distances) : tuple
Returned when include_distances=True.
See also
query_vectors_by_item : Item id query returning vectors.
query_by_vector : Per-query id interface.