documents_to_polars

scikitplot.corpus.documents_to_polars(docs, *, include_embedding=False)

Convert a list of CorpusDocument instances to a polars.DataFrame.

Parameters:
docs : list of CorpusDocument

Documents to convert. Must be non-empty.

include_embedding : bool, optional

When True, include a column "embedding" with list-of-float values. Default: False.

Returns:
polars.DataFrame

One row per document. Metadata fields are promoted to columns.

Raises:
ImportError

If polars is not installed.

ValueError

If docs is empty.

Return type:

pl.DataFrame

Examples

>>> docs = [CorpusDocument.create("f.txt", i, f"Sentence {i}.") for i in range(3)]
>>> df = documents_to_polars(docs)
>>> len(df)
3
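
The behavior described above (one row per document, metadata fields promoted to columns, an optional "embedding" column, and a ValueError on empty input) can be sketched in plain Python. This is an illustration under stated assumptions, not scikitplot's implementation: the `Doc` dataclass, its field names, and the `docs_to_rows` helper are hypothetical stand-ins for `CorpusDocument` and the real conversion.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any


@dataclass
class Doc:
    """Hypothetical stand-in for CorpusDocument; field names are assumptions."""

    text: str
    metadata: dict[str, Any] = field(default_factory=dict)
    embedding: list[float] | None = None


def docs_to_rows(docs: list[Doc], *, include_embedding: bool = False) -> list[dict[str, Any]]:
    """Flatten documents into one dict per document (hypothetical helper)."""
    if not docs:
        # Mirrors the documented contract: an empty list is rejected.
        raise ValueError("docs must be non-empty")
    rows = []
    for doc in docs:
        row: dict[str, Any] = {"text": doc.text}
        # Promote each metadata field to a top-level key, i.e. its own column.
        row.update(doc.metadata)
        if include_embedding:
            row["embedding"] = doc.embedding
        rows.append(row)
    return rows


docs = [Doc(f"Sentence {i}.", {"source": "f.txt", "index": i}) for i in range(3)]
rows = docs_to_rows(docs)
```

A list of dicts shaped like `rows` is the kind of input that `polars.DataFrame(rows)` accepts, which is why the sketch stops at that point and does not import polars itself.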