resolve_stopwords

scikitplot.corpus.resolve_stopwords(lang, *, default='english', extra=None)

Return a frozenset of stopwords for one or more languages.

Looks up each language in BUILTIN_LANG_STOPWORDS. Languages that are in NLTK’s stopwords corpus but absent from the built-in table are silently skipped (callers should use NLTK directly for those).
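The lookup-and-union behavior described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `TOY_STOPWORDS` and `sketch_resolve` are hypothetical names, the table contents are made up, and alias normalization (whatever coerce_language does with forms such as "en") is left out.

```python
# Hypothetical stand-in for BUILTIN_LANG_STOPWORDS; the real table is larger.
TOY_STOPWORDS = {
    "english": frozenset({"the", "a", "and"}),
    "hindi": frozenset({"और", "का"}),
}


def sketch_resolve(lang, *, default="english", extra=None, table=TOY_STOPWORDS):
    """Union the stopwords of every requested language found in `table`."""
    # Normalize the specifier to a list of language names.
    if lang is None:
        langs = [default]
    elif isinstance(lang, str):
        langs = [lang]
    else:
        langs = list(lang)

    words = set()
    for name in langs:
        # Languages missing from the table contribute nothing (silently skipped).
        words |= table.get(name, frozenset())

    # Custom stopwords are added on top of the per-language sets.
    if extra:
        words |= extra
    return frozenset(words)


# "klingon" is not in the toy table, so only the English words come back.
result = sketch_resolve(["english", "klingon"])
```

Because unknown languages are skipped rather than raising an error, a caller who needs to detect a missing language would have to check the table itself.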

Parameters:
lang : str or list[str] or None

Language specifier. Accepts the same forms as coerce_language.

default : str, optional

Fallback language when lang is None.

extra : frozenset[str] or None, optional

Additional custom stopwords to union with the result.

Returns:
frozenset[str]

Union of stopwords across all requested languages plus extra.

Examples

>>> "the" in resolve_stopwords("english")
True
>>> words = resolve_stopwords(["en", "hi"])
>>> "और" in words and "the" in words
True
>>> "foo" in resolve_stopwords(None, extra=frozenset(["foo"]))
True
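
A typical use of the returned frozenset is filtering a token list. The sketch below is illustrative only: `drop_stopwords` is a hypothetical helper, and the literal frozenset stands in for a call such as `resolve_stopwords("english", extra=frozenset({"foo"}))` so that the snippet runs without scikitplot installed. It also assumes the stopwords are stored in lowercase.

```python
def drop_stopwords(tokens, stopwords):
    """Return the tokens whose lowercase form is not a stopword, in order."""
    return [tok for tok in tokens if tok.lower() not in stopwords]


# Stand-in for the frozenset returned by resolve_stopwords(...).
stopwords = frozenset({"the", "on", "foo"})

filtered = drop_stopwords(["The", "cat", "sat", "on", "foo", "mat"], stopwords)
print(filtered)  # ['cat', 'sat', 'mat']
```

Membership tests on a frozenset take constant time on average, so the filter stays linear in the number of tokens.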