PipelineResult
- class scikitplot.corpus.PipelineResult(source, documents, output_path, n_read, n_omitted, n_embedded, elapsed_seconds, export_format)
Immutable summary of a single pipeline run.
- Parameters:
- source : str
Input source identifier (file path, URL, or batch label).
- documents : list of CorpusDocument
All documents produced (after chunking, filtering, and optional embedding). Empty list if the source yielded no usable text.
- output_path : pathlib.Path or None
Path to the exported file, or None when no export was requested (output_path=None in the pipeline call).
- n_read : int
Total raw chunks yielded by the reader before filtering.
- n_omitted : int
Chunks dropped by the filter.
- n_embedded : int
Documents that received an embedding vector (0 when embedding is disabled).
- elapsed_seconds : float
Wall-clock time for the entire run, in seconds.
- export_format : ExportFormat or None
Format used for export, or None when no export was done.
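The fields listed above can be mirrored in a small stand-in to show what an immutable summary of this shape might look like. This is a minimal sketch, not the scikitplot implementation: PipelineResultSketch is a hypothetical name, the documents are plain strings rather than CorpusDocument objects, and export_format is typed as a plain string rather than ExportFormat.

```python
from __future__ import annotations

import dataclasses
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class PipelineResultSketch:
    """Hypothetical stand-in mirroring the documented fields (not the real class)."""

    source: str
    documents: list  # list of CorpusDocument in the real API; plain strings here
    output_path: Path | None
    n_read: int
    n_omitted: int
    n_embedded: int
    elapsed_seconds: float
    export_format: str | None  # ExportFormat in the real API; a string here


result = PipelineResultSketch(
    source="report.txt",
    documents=["chunk a", "chunk b", "chunk c"],
    output_path=None,  # no export requested
    n_read=4,
    n_omitted=1,
    n_embedded=0,  # embedding disabled
    elapsed_seconds=0.02,
    export_format=None,
)

# Frozen dataclasses reject attribute assignment after construction.
try:
    result.n_read = 10
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False
```

In this sketch, freezing only prevents rebinding the attributes; the documents list itself can still be mutated in place. Whether the real class guards against that is not stated here.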
Notes
n_read - n_omitted == len(documents) is an invariant maintained by the pipeline.
Examples
>>> result.n_read
512
>>> result.elapsed_seconds
3.14
>>> len(result.documents)
487
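The invariant from the Notes can be checked against the example values with a small helper. This is a sketch under stated assumptions: check_counts is a hypothetical helper, not part of scikitplot, and a SimpleNamespace stands in for a real PipelineResult. The n_omitted value of 25 is inferred from the invariant (512 read, 487 documents kept) rather than taken from the example output.

```python
from types import SimpleNamespace


def check_counts(result):
    """Verify the documented invariant, then return the fraction of chunks dropped."""
    kept = len(result.documents)
    if result.n_read - result.n_omitted != kept:
        raise ValueError(
            f"invariant violated: {result.n_read} - {result.n_omitted} != {kept}"
        )
    # Guard against division by zero when the source yielded nothing.
    return result.n_omitted / result.n_read if result.n_read else 0.0


# Stand-in object; counts follow the Examples, with n_omitted inferred.
result = SimpleNamespace(n_read=512, n_omitted=25, documents=[None] * 487)
omitted_fraction = check_counts(result)
```

Any object exposing n_read, n_omitted, and documents would work with this helper, so it could be pointed at a real result object if those attributes behave as documented.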