NormalizationPipeline
- class scikitplot.corpus.NormalizationPipeline(steps)
Apply a sequence of normalisers in order.
Each normaliser in the pipeline receives the output of the previous one. Normalisers that have no effect return the document unchanged, so only modified documents incur a replace() call.
- Parameters:
- steps : sequence of NormalizerBase
Ordered list of normalisers to apply.
- Raises:
- ValueError
If steps is empty.
- Parameters:
steps (Sequence[NormalizerBase])
Examples
>>> pipeline = NormalizationPipeline(
...     [
...         UnicodeNormalizer(form="NFKC"),
...         HTMLStripNormalizer(),
...         WhitespaceNormalizer(),
...     ]
... )
>>> result = pipeline.normalize_doc(doc)
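The chaining behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use or reproduce the scikitplot implementation: Doc, LowercaseNormalizer, CollapseWhitespaceNormalizer and SketchPipeline are hypothetical stand-ins, and dataclasses.replace is assumed here as a plausible analogue of the replace() call mentioned in the description.

```python
from dataclasses import dataclass, replace
from typing import Sequence


@dataclass(frozen=True)
class Doc:
    """Hypothetical stand-in for the library's document type."""
    text: str


class LowercaseNormalizer:
    """Hypothetical normaliser: lowercases the text."""

    def normalize_doc(self, doc: Doc) -> Doc:
        new_text = doc.text.lower()
        # Return the same object when nothing changed, so no copy is made.
        return doc if new_text == doc.text else replace(doc, text=new_text)


class CollapseWhitespaceNormalizer:
    """Hypothetical normaliser: trims and collapses runs of whitespace."""

    def normalize_doc(self, doc: Doc) -> Doc:
        new_text = " ".join(doc.text.split())
        return doc if new_text == doc.text else replace(doc, text=new_text)


class SketchPipeline:
    """Illustrative pipeline: applies each step to the previous step's output."""

    def __init__(self, steps: Sequence) -> None:
        if not steps:
            # Mirrors the documented ValueError for an empty sequence.
            raise ValueError("steps must not be empty")
        self.steps = list(steps)

    def normalize_doc(self, doc: Doc) -> Doc:
        for step in self.steps:
            doc = step.normalize_doc(doc)
        return doc


pipeline = SketchPipeline([LowercaseNormalizer(), CollapseWhitespaceNormalizer()])
result = pipeline.normalize_doc(Doc("  Hello   WORLD "))
print(result.text)  # hello world
```

In this sketch a document that is already normalised passes through every step as the same object, which is the property the description attributes to no-op normalisers.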