resolve_stopwords
- scikitplot.corpus.resolve_stopwords(lang, *, default='english', extra=None)
Return a frozenset of stopwords for one or more languages.
Looks up each language in BUILTIN_LANG_STOPWORDS. Languages that are in NLTK’s stopwords corpus but absent from the built-in table are silently skipped (callers should use NLTK directly for those).
- Parameters:
- lang : str or list[str] or None
Language specifier. Accepts the same forms as coerce_language.
- default : str, optional
Fallback language when lang is None.
- extra : frozenset[str] or None, optional
Additional custom stopwords to union with the result.
- Returns:
- frozenset[str]
Union of stopwords across all requested languages plus extra.
Examples
>>> "the" in resolve_stopwords("english")
True
>>> words = resolve_stopwords(["en", "hi"])
>>> "और" in words and "the" in words
True
>>> resolve_stopwords(None, extra=frozenset(["foo"]))
frozenset({'foo', ...})
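The lookup-and-union behavior described above can be approximated with a small standalone sketch. It is not the library's implementation: TOY_STOPWORDS and resolve_stopwords_sketch are hypothetical names invented for illustration, the table holds only a few words, and the language-code normalization that coerce_language performs (for example mapping "en" to "english") is deliberately left out.

```python
# Hypothetical stand-in for BUILTIN_LANG_STOPWORDS; the real table is
# larger and its exact keys are an assumption here.
TOY_STOPWORDS = {
    "english": frozenset({"the", "and", "of"}),
    "hindi": frozenset({"और", "का", "है"}),
}


def resolve_stopwords_sketch(lang, *, default="english", extra=None):
    """Union the stopword sets of the requested languages (sketch only)."""
    if lang is None:
        langs = [default]      # fall back when no language is given
    elif isinstance(lang, str):
        langs = [lang]         # a single language name
    else:
        langs = list(lang)     # several language names

    words = set()
    for name in langs:
        # Languages missing from the table contribute nothing and raise no error.
        words |= TOY_STOPWORDS.get(name, frozenset())
    if extra:
        words |= extra         # add caller-supplied custom stopwords
    return frozenset(words)


# "klingon" is not in the toy table, so it is silently skipped.
result = resolve_stopwords_sketch(
    ["english", "hindi", "klingon"], extra=frozenset({"foo"})
)
print(sorted(result))
```

Returning a frozenset keeps the result hashable and immutable, which matches the documented return type and makes it safe to share between tokenizers.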