CustomNLPEnricher
- class scikitplot.corpus.CustomNLPEnricher(config=None, *, custom_config=None)
NLPEnricher extended with fully replaceable NLP backends.
Wraps a standard NLPEnricher and intercepts each processing stage when the corresponding custom callable is set in custom_config. Built-in backends are used as a fallback for any stage without a custom override.
- Parameters:
- config EnricherConfig or None, optional
Standard enrichment configuration. None uses defaults.
- custom_config CustomEnricherConfig or None, optional
Custom backend callables. None disables all custom overrides (equivalent to using plain NLPEnricher).
- Parameters:
config (Any | None)
custom_config (CustomEnricherConfig | None)
See also
scikitplot.corpus._enrichers.NLPEnricher
Built-in enricher.
CustomEnricherConfig
Custom backend callables configuration.
Notes
User note: Drop-in replacement for NLPEnricher. The same enrich_documents() interface is preserved.
Developer note: Delegation order per stage:
- If custom_config.<stage> is set → call the custom callable.
- Otherwise → delegate to the wrapped NLPEnricher method.
This keeps the built-in lazy-loading cache (spaCy, NLTK, stemmer) intact for any stage that does not have a custom override.
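The delegation order described in the developer note can be sketched in plain Python. This only illustrates the pattern and is not scikitplot's actual implementation: `BuiltinEnricher`, `StageOverrides`, `DelegatingEnricher`, and the `tokenize`/`lemmatize` stage names are hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


class BuiltinEnricher:
    """Hypothetical stand-in for the wrapped built-in enricher."""

    def tokenize(self, text: str) -> List[str]:
        return text.split()

    def lemmatize(self, tokens: List[str]) -> List[str]:
        return [t.lower() for t in tokens]


@dataclass
class StageOverrides:
    """Hypothetical config: one optional callable per stage."""

    tokenize: Optional[Callable[[str], List[str]]] = None
    lemmatize: Optional[Callable[[List[str]], List[str]]] = None


class DelegatingEnricher:
    def __init__(self, overrides: Optional[StageOverrides] = None):
        self._builtin = BuiltinEnricher()
        self._overrides = overrides or StageOverrides()

    def _run_stage(self, stage: str, arg):
        # Use the custom callable if one is set for this stage ...
        custom = getattr(self._overrides, stage)
        if custom is not None:
            return custom(arg)
        # ... otherwise fall back to the built-in method of the same name.
        return getattr(self._builtin, stage)(arg)

    def enrich(self, text: str) -> List[str]:
        tokens = self._run_stage("tokenize", text)
        return self._run_stage("lemmatize", tokens)


# Only the tokenize stage is overridden; lemmatize stays built-in.
overrides = StageOverrides(tokenize=lambda text: text.split("-"))
result = DelegatingEnricher(overrides).enrich("Foo-Bar baz")
default_result = DelegatingEnricher().enrich("Foo-Bar baz")
```

Because the fallback is resolved per stage, overriding one callable leaves every other stage on the built-in code path.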
Examples
Integrate a custom tokenizer (e.g. SentencePiece):
import sentencepiece as spm

sp = spm.SentencePieceProcessor()
sp.load("bpe.model")

def sp_tokenize(text):
    return sp.encode(text, out_type=str)

ccfg = CustomEnricherConfig(custom_tokenizer=sp_tokenize)
enricher = CustomNLPEnricher(custom_config=ccfg)
docs = enricher.enrich_documents(corpus_docs)
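If SentencePiece is not available, a dependency-free callable of the same shape may serve as a quick substitute. This is a minimal sketch, assuming (as the SentencePiece example suggests) that a custom tokenizer takes a string and returns a list of string tokens. `regex_tokenize` is a hypothetical name, and the exact contract expected by `CustomEnricherConfig` is not confirmed here.

```python
import re

# Words (letters, digits, underscore) or single punctuation characters.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def regex_tokenize(text):
    """Split text into word and punctuation tokens, dropping whitespace."""
    return _TOKEN_RE.findall(text)


tokens = regex_tokenize("Hello, world!")
```

Under that assumption, it would be passed as `custom_tokenizer=regex_tokenize` in the same way as `sp_tokenize`.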