CustomEnricherConfig
- class scikitplot.corpus.CustomEnricherConfig(custom_tokenizer=None, custom_lemmatizer=None, custom_stemmer=None, custom_keyword_extractor=None, custom_stopwords=None)
Custom backend callables for CustomNLPEnricher.
Every field is optional. When set, it replaces the corresponding built-in backend in NLPEnricher. None means “use the built-in backend from EnricherConfig”.
- Parameters:
- custom_tokenizer : callable or None, optional
Replaces the built-in tokenizer. Signature:
def custom_tokenizer(text: str) -> list[str]: ...
- custom_lemmatizer : callable or None, optional
Replaces the built-in lemmatizer. Signature:
def custom_lemmatizer(tokens: list[str]) -> list[str]: ...
- custom_stemmer : callable or None, optional
Replaces the built-in stemmer. Signature:
def custom_stemmer(tokens: list[str]) -> list[str]: ...
- custom_keyword_extractor : callable or None, optional
Replaces the built-in keyword extractor. Signature:
def custom_keyword_extractor(text: str, tokens: list[str]) -> list[str]: ...
- custom_stopwords : frozenset[str] or None, optional
Replaces the built-in stopword set used by _filter_tokens. When None, the built-in NLTK / fallback set is used.
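
The signatures above can be illustrated with a small, dependency-free sketch. The names regex_tokenizer, suffix_stemmer and MY_STOPWORDS, the regular expression, and the suffix list are all hypothetical and chosen only for illustration; they are not part of scikitplot. The final commented lines assume scikitplot is installed and only use the keyword arguments documented in the class signature.

```python
import re


def regex_tokenizer(text: str) -> list[str]:
    """Hypothetical tokenizer: lowercase runs of ASCII letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def suffix_stemmer(tokens: list[str]) -> list[str]:
    """Hypothetical stemmer: strip one common English suffix per token."""
    suffixes = ("ing", "ed", "s")
    stemmed = []
    for tok in tokens:
        for suf in suffixes:
            # Only strip when at least three characters of stem remain.
            if tok.endswith(suf) and len(tok) - len(suf) >= 3:
                tok = tok[: -len(suf)]
                break
        stemmed.append(tok)
    return stemmed


# Hypothetical stopword set; the documented type is frozenset[str].
MY_STOPWORDS = frozenset({"the", "a", "an", "of", "and"})

tokens = regex_tokenizer("The cats jumped over walls")
stems = suffix_stemmer(tokens)
print(tokens)
print(stems)

# Assumed usage, with scikitplot installed:
# ccfg = CustomEnricherConfig(
#     custom_tokenizer=regex_tokenizer,
#     custom_stemmer=suffix_stemmer,
#     custom_stopwords=MY_STOPWORDS,
# )
```

Fields left unset (here custom_lemmatizer and custom_keyword_extractor) stay None, so those stages fall back to the built-in backends as described above.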
Notes
User note: Pass a CustomEnricherConfig together with the standard EnricherConfig to CustomNLPEnricher. Built-in fields (tokenizer, lemmatizer, etc.) in EnricherConfig are still honoured for any stage that has no custom callable.
Examples
Replace keyword extraction with a KeyBERT-based extractor:
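from keybert import KeyBERT

# Load the KeyBERT model once at import time and reuse it across calls.
_kb = KeyBERT()

def kb_extractor(text, tokens):
    # extract_keywords returns (keyword, score) pairs; keep only the keywords.
    # `tokens` is unused here but is part of the required signature.
    return [kw for kw, _ in _kb.extract_keywords(text, top_n=10)]

ccfg = CustomEnricherConfig(custom_keyword_extractor=kb_extractor)
enricher = CustomNLPEnricher(custom_config=ccfg)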
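
Where KeyBERT is not available, a callable with the same (text, tokens) signature can be sketched with the standard library alone. The frequency_extractor name, the stopword set, the minimum token length, and the limit of five keywords are hypothetical choices for illustration, not scikitplot defaults.

```python
from collections import Counter

# Hypothetical stopword set used only by this sketch.
_SKIP = frozenset({"the", "a", "an", "of", "and", "to", "in"})


def frequency_extractor(text: str, tokens: list[str]) -> list[str]:
    """Return up to five of the most frequent non-stopword tokens.

    `text` is unused; it is accepted only to match the documented
    custom_keyword_extractor signature.
    """
    counts = Counter(t for t in tokens if t not in _SKIP and len(t) > 2)
    return [tok for tok, _ in counts.most_common(5)]


sample_tokens = ["the", "model", "fits", "the", "data", "model"]
keywords = frequency_extractor("the model fits the data model", sample_tokens)
print(keywords)

# Assumed usage, with scikitplot installed:
# ccfg = CustomEnricherConfig(custom_keyword_extractor=frequency_extractor)
```

Unlike the KeyBERT example, this variant relies on the tokens argument rather than the raw text, so its output depends on whichever tokenizer runs before it.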