iso_to_nltk
- scikitplot.corpus.iso_to_nltk(code)
Resolve an ISO 639-1/639-3 code to a canonical NLTK language name.
- Parameters:
- code : str
ISO 639-1 two-letter code (e.g. "en", "ar"), ISO 639-3 three-letter code (e.g. "grc" for Ancient Greek), or an already-canonical NLTK name (e.g. "english"). Case-insensitive.
- Returns:
- str
Canonical lowercase NLTK-compatible language name. Falls back to code itself if the code is not found in the registry (so passing "english" returns "english" unchanged).
Examples
>>> iso_to_nltk("en")
'english'
>>> iso_to_nltk("ar")
'arabic'
>>> iso_to_nltk("english")
'english'
>>> iso_to_nltk("grc")
'ancient_greek'
>>> iso_to_nltk("zz")  # unknown → returned as-is
'zz'
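
The lookup-with-fallback behavior described above can be approximated with a plain dictionary. The sketch below is not scikitplot's implementation: the name `iso_to_nltk_sketch` and the `_ISO_TO_NLTK` table are hypothetical, the table holds only a few illustrative entries, and lowercasing the fallback value is an assumption drawn from the "Case-insensitive" note.

```python
# Hypothetical registry for illustration only; the real registry in
# scikitplot.corpus is not shown in this page and is likely much larger.
_ISO_TO_NLTK = {
    "en": "english",
    "eng": "english",
    "ar": "arabic",
    "ara": "arabic",
    "grc": "ancient_greek",
}


def iso_to_nltk_sketch(code: str) -> str:
    """Map an ISO 639 code to a language name, falling back to the input."""
    # Assumption: normalize case before the lookup so "EN" and "en" match.
    key = code.lower()
    # Unknown codes (and names that are already canonical) pass through.
    return _ISO_TO_NLTK.get(key, key)


print(iso_to_nltk_sketch("EN"))       # english
print(iso_to_nltk_sketch("english"))  # english
print(iso_to_nltk_sketch("zz"))       # zz
```

Because unknown input is returned rather than rejected, a caller that needs to detect unsupported languages would have to check the result against its own list of supported names.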