compute_stats
- scikitplot.corpus.compute_stats(docs)
Compute aggregate statistics over a document collection.
- Parameters:
- docs : sequence of CorpusDocument
Documents to analyse. May be empty.
- Returns:
- CorpusStats
Frozen statistics object.
Notes
This is a pure function: same input → same output, no I/O, no mutation. Safe to call from multiple threads concurrently.
Median is computed without NumPy using a sort-based O(n log n) algorithm so that this module has zero optional dependencies.
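A sort-based median of the kind described above can be sketched with the standard library alone. This is an illustrative sketch, not the module's actual code: the helper name `sort_based_median` and the choice to return 0.0 for empty input are assumptions.

```python
from typing import Sequence


def sort_based_median(values: Sequence[float]) -> float:
    """Return the median of ``values`` using only the standard library.

    Hypothetical helper for illustration; not part of scikitplot.corpus.
    """
    if not values:
        # Assumption: empty input yields 0.0. The real function may behave differently.
        return 0.0
    # sorted() copies the data, so the caller's sequence is not mutated.
    ordered = sorted(values)  # O(n log n)
    mid = len(ordered) // 2
    if len(ordered) % 2 == 1:
        # Odd length: the single middle element is the median.
        return float(ordered[mid])
    # Even length: average the two middle elements.
    return (ordered[mid - 1] + ordered[mid]) / 2.0
```

Sorting a copy rather than sorting in place keeps the helper free of side effects, which matches the pure-function guarantee stated above.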
Examples
>>> stats = compute_stats(docs)
>>> print(stats.summary())
CorpusStats
Documents : 487
Tokens : 42,310 total (mean 86.9, ...)