documents_to_pandas

scikitplot.corpus.documents_to_pandas(docs, *, include_embedding=False)

Convert a list of CorpusDocument instances to a pandas.DataFrame.

Parameters:
docs : list of CorpusDocument

Documents to convert. Must be non-empty.

include_embedding : bool, optional

When True, include an "embedding" column containing NumPy arrays. Default: False.

Returns:
pandas.DataFrame

One row per document. Metadata fields are promoted to columns.

Raises:
ImportError

If pandas is not installed.

ValueError

If docs is empty.

Examples

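>>> # The CorpusDocument import path is assumed to match documents_to_pandas.
>>> from scikitplot.corpus import CorpusDocument, documents_to_pandas
>>> docs = [CorpusDocument.create("f.txt", i, f"Sentence {i}.") for i in range(3)]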
>>> df = documents_to_pandas(docs)
>>> len(df)
3
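
To illustrate what "metadata fields are promoted to columns" can mean in practice, here is a minimal standalone sketch. It is not the library's implementation: StubDocument, its field names, and documents_to_records are hypothetical stand-ins chosen for illustration, and the real CorpusDocument may expose different attributes.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class StubDocument:
    """Hypothetical stand-in for CorpusDocument; field names are assumptions."""

    source: str
    index: int
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)
    embedding: Optional[List[float]] = None


def documents_to_records(docs, *, include_embedding=False):
    """Flatten documents into row dicts, one per document (illustrative only)."""
    if not docs:
        # Mirrors the documented behaviour: an empty input is an error.
        raise ValueError("docs must be non-empty")
    records = []
    for doc in docs:
        row = {"source": doc.source, "index": doc.index, "text": doc.text}
        # Promote each metadata key to its own top-level column.
        # In this sketch a metadata key that clashes with a core field wins.
        row.update(doc.metadata)
        if include_embedding:
            row["embedding"] = doc.embedding
        records.append(row)
    return records


docs = [StubDocument("f.txt", i, f"Sentence {i}.", {"lang": "en"}) for i in range(3)]
records = documents_to_records(docs)
print(len(records))
print(sorted(records[0]))
```

A list of row dicts like this can be passed directly to pandas.DataFrame(records), which yields one row per document and one column per key.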