NDArrayMixin
- class scikitplot.annoy.NDArrayMixin
NumPy / SciPy / pandas interoperability for Annoy-like indexes.
- add_items(X, ids=None, *, start_id=None, accept_sparse='error', ensure_all_finite=True, copy=False, dtype=numpy.float32, order='C', check_unique_ids=True)
Add many vectors to the index.
- Parameters:
- X : array-like of shape (n_samples, n_features)
Vectors to add.
- ids : array-like of shape (n_samples,), optional
Explicit integer ids. If omitted, ids are allocated as a contiguous range starting at start_id (or get_n_items() at call time).
- start_id : int, optional
Starting id used when ids is None. If None, defaults to backend.get_n_items() at call time.
- accept_sparse : {'error', 'toarray'}, default='error'
Sparse input handling. 'toarray' densifies SciPy sparse inputs explicitly. Any other sparse behavior raises.
- ensure_all_finite : bool or 'allow-nan', default=True
Finiteness validation policy.
- copy : bool, default=False
If True, copy the validated dense array before adding.
- dtype : numpy dtype, default=numpy.float32
Dtype passed to the backend.
- order : {'C', 'F', 'A', 'K'}, default='C'
Memory order used when coercing X.
- check_unique_ids : bool, default=True
If True, require ids to be unique.
- Returns:
- ids_out : numpy.ndarray of shape (n_samples,)
The ids that were added, as int64.
- Raises:
- RuntimeError
If the backend indicates the index is already built.
- TypeError
If sparse input is given while accept_sparse='error'.
- ValueError
If X is not 2D, its feature dimension does not match f, ids are invalid, or the finiteness policy is violated.
See also
get_item_vectors : Fetch vectors by id selection.
to_numpy : Export vectors as a dense NumPy array.
Notes
This method is deterministic: ids are generated predictably and vectors are added in row order.
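The id rules described for add_items can be illustrated with a small standalone sketch. The helper name allocate_ids and its arguments are hypothetical and written only for this illustration; it mirrors the documented behavior (contiguous range from start_id or the current item count, optional uniqueness check, int64 output) and is not the library's implementation.

```python
import numpy as np


def allocate_ids(n_samples, n_items, ids=None, start_id=None, check_unique_ids=True):
    """Hypothetical helper mirroring the documented id rules (not library code)."""
    if ids is None:
        # Contiguous range starting at start_id, or at the current item count.
        first = n_items if start_id is None else int(start_id)
        return np.arange(first, first + n_samples, dtype=np.int64)
    ids_out = np.asarray(ids, dtype=np.int64)
    if ids_out.shape != (n_samples,):
        raise ValueError("ids must have shape (n_samples,)")
    if check_unique_ids and np.unique(ids_out).size != ids_out.size:
        raise ValueError("ids must be unique")
    return ids_out


X = np.zeros((3, 4), dtype=np.float32)

# No ids given: allocation continues from the current item count (assumed 10 here).
auto_ids = allocate_ids(len(X), n_items=10)

# Explicit ids are kept in row order.
explicit_ids = allocate_ids(len(X), n_items=10, ids=[7, 2, 5])
```

Because allocation depends only on the item count at call time and the row order of X, repeating the same sequence of calls yields the same ids.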
- get_item_vectors(ids=None, *, dtype=numpy.float32, start=0, stop=None, n_rows=None, return_ids=False, validate_vector_len=True)
Fetch many vectors as a dense NumPy array.
- Parameters:
- ids : sequence of int or iterable of int, optional
Ids to fetch. If None, selects range(start, stop or n_items).
- dtype : numpy dtype, default=numpy.float32
Output dtype.
- start, stop : int, optional
Range selection used when ids is None.
- n_rows : int, optional
Required when ids is a non-sized iterable (e.g., a generator).
- return_ids : bool, default=False
If True, also return the realized ids (int64) in row order.
- validate_vector_len : bool, default=True
If True, verify that every fetched vector has length f.
- Returns:
- X : numpy.ndarray of shape (n_rows, f)
Dense matrix of vectors.
- ids_out : numpy.ndarray of shape (n_rows,), optional
Returned when return_ids=True.
- Raises:
- ValueError
If the id selection is inconsistent or vectors have unexpected length.
- TypeError
If ids is a non-sized iterable and n_rows is not provided.
See also
to_numpy : Dense NumPy export alias.
iter_item_vectors : Streaming export without allocating a dense matrix.
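The selection rules for get_item_vectors (a range when ids is None, and a required n_rows for non-sized iterables) can be sketched as follows. The helper realize_ids and its n_items argument are hypothetical names for this illustration only; the sketch assumes the output row count must be known before the dense matrix is allocated, which would explain why a generator needs n_rows.

```python
from collections.abc import Sized

import numpy as np


def realize_ids(ids=None, *, start=0, stop=None, n_items=0, n_rows=None):
    """Hypothetical helper: resolve an id selection to an int64 array."""
    if ids is None:
        # Documented default selection: range(start, stop or n_items).
        return np.arange(start, stop or n_items, dtype=np.int64)
    if isinstance(ids, Sized):
        # Lists, tuples, ranges and arrays already know their length.
        return np.asarray(list(ids), dtype=np.int64)
    if n_rows is None:
        # A generator has no len(), so the caller must state the row count.
        raise TypeError("n_rows is required when ids is a non-sized iterable")
    return np.fromiter(ids, dtype=np.int64, count=n_rows)


all_ids = realize_ids(n_items=5)
window_ids = realize_ids(start=2, stop=4, n_items=5)
gen_ids = realize_ids((i * 2 for i in range(3)), n_rows=3)
```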
- iter_item_vectors(ids=None, *, start=0, stop=None, with_ids=True, dtype=None)
Iterate vectors without allocating a dense matrix.
- Parameters:
- ids, start, stop
Selection controls. See get_item_vectors.
- with_ids : bool, default=True
If True, yield (id, vector). If False, yield vectors only.
- dtype : numpy dtype, optional
If provided, cast output vectors to this dtype.
- Yields:
- (id, vector) or vector
Each vector is returned as a 1D NumPy array.
See also
get_item_vectors : Dense export.
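A streaming export along these lines might look like the sketch below. TinyIndex is a stand-in backend invented for this example; it assumes Annoy-style accessors get_n_items() and get_item_vector(i). The generator iter_vectors is likewise illustrative and is not the mixin's actual code.

```python
import numpy as np


class TinyIndex:
    """Stand-in backend with Annoy-style accessors (an assumption of this sketch)."""

    def __init__(self, rows):
        self._rows = [list(r) for r in rows]

    def get_n_items(self):
        return len(self._rows)

    def get_item_vector(self, i):
        return self._rows[i]


def iter_vectors(index, ids=None, *, start=0, stop=None, with_ids=True, dtype=None):
    """Yield one 1D array per id instead of building a dense matrix."""
    selected = range(start, stop or index.get_n_items()) if ids is None else ids
    for i in selected:
        vec = np.asarray(index.get_item_vector(i), dtype=dtype)
        yield (int(i), vec) if with_ids else vec


index = TinyIndex([[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]])

# (id, vector) pairs for ids 1 and 2.
pairs = list(iter_vectors(index, start=1))

# Reduce over all vectors while holding only one row at a time.
column_sums = sum(vec for vec in iter_vectors(index, with_ids=False, dtype=np.float32))
```

Streaming suits reductions such as sums or norms over a large index, where only one row needs to be in memory at a time.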
- to_numpy(ids=None, *, dtype=numpy.float32, start=0, stop=None, n_rows=None, validate_vector_len=True)
Export vectors to a dense NumPy array.
See also
get_item_vectors : Dense export with optional id output.
iter_item_vectors : Streaming export.
to_scipy_csr : Export as SciPy CSR.
to_pandas : Export as pandas DataFrame.
Notes
This is an alias of get_item_vectors with return_ids=False.
- to_pandas(ids=None, *, dtype=numpy.float32, start=0, stop=None, n_rows=None, id_location='index', id_name='id', columns=None, validate_vector_len=True)
Export vectors to a pandas DataFrame.
- Parameters:
- ids, start, stop, n_rows
Selection controls. See get_item_vectors.
- dtype : numpy dtype, default=numpy.float32
Output dtype.
- id_location : {'index', 'column', 'both', 'none'}, default='index'
Where to place ids in the output.
- id_name : str, default='id'
Name used for the id column / index.
- columns : sequence of str, optional
Column names for vector dimensions. If None, uses feature_names_in_ when present and its length matches f; otherwise uses feature_0..feature_{f-1}.
- validate_vector_len : bool, default=True
If True, verify that every fetched vector has length f.
- Returns:
- df : pandas.DataFrame
DataFrame with shape (n_rows, f) plus optional id metadata.
- Raises:
- ImportError
If pandas is not installed.
- ValueError
If id_location is invalid or the length of columns does not match f.
See also
to_numpy : Dense NumPy export.
to_scipy_csr : Export as SciPy CSR.
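The column-name fallback described for to_pandas can be sketched in plain Python. The helper resolve_columns is a hypothetical name used only here; it follows the documented order (explicit columns, then feature_names_in_ when its length matches f, then generated feature_j names) and does not cover id placement.

```python
def resolve_columns(f, columns=None, feature_names_in=None):
    """Hypothetical helper following the documented column-name fallback."""
    if columns is not None:
        columns = list(columns)
        if len(columns) != f:
            raise ValueError(f"columns has length {len(columns)}, expected {f}")
        return columns
    if feature_names_in is not None and len(feature_names_in) == f:
        # Reuse stored feature names only when they match the vector length.
        return list(feature_names_in)
    # Fallback: feature_0 .. feature_{f-1}.
    return [f"feature_{j}" for j in range(f)]


explicit = resolve_columns(2, columns=["x", "y"])
from_fit = resolve_columns(2, feature_names_in=["height", "width"])
generated = resolve_columns(3, feature_names_in=["height", "width"])
```

In the last call the stored names have length 2 while f is 3, so the generated names are used instead.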