CustomTokenizerRegistry

class scikitplot.corpus.CustomTokenizerRegistry(kind)

Thread-safe module-level registry for named custom components.

Each registry holds a dict[str, Protocol] accessible via module-level helpers (register_tokenizer, get_tokenizer, etc.).

Every register, get, names, and __contains__ call acquires an internal threading.RLock, so the registry can be read and written safely from worker threads (e.g. ThreadPoolExecutor batch jobs).
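The locking pattern described above can be sketched as follows. This is a minimal stand-in, not the scikitplot implementation: the class name SketchRegistry, its private attributes, and the helper register_one are hypothetical and exist only to show how an RLock around a dict keeps concurrent registration consistent.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from typing import Any


class SketchRegistry:
    """Hypothetical stand-in; NOT the scikitplot implementation."""

    def __init__(self, kind: str) -> None:
        self.kind = kind
        self._items: dict[str, Any] = {}
        self._lock = threading.RLock()  # reentrant, as the docs describe

    def register(self, name: str, instance: Any) -> None:
        with self._lock:
            self._items[name] = instance

    def get(self, name: str) -> Any:
        with self._lock:
            return self._items[name]  # raises KeyError if name is missing

    def names(self) -> list[str]:
        with self._lock:
            return sorted(self._items)

    def __contains__(self, name: object) -> bool:
        with self._lock:
            return name in self._items


registry = SketchRegistry("tokenizer")


def register_one(i: int) -> None:
    # str.split stands in for a real tokenizer component.
    registry.register(f"tok_{i:02d}", str.split)


# Register from several worker threads at once.
with ThreadPoolExecutor(max_workers=4) as pool:
    list(pool.map(register_one, range(20)))

registered = registry.names()
```

Each method holds the lock only for the duration of a single dict operation, which keeps contention low even when registration happens from a pool.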

Raises:
KeyError

get raises KeyError when the name is not registered.

Parameters:

kind (str)

Notes

User note: Register all custom components at application startup, before spawning worker threads, to avoid any lock contention in hot paths.

Developer note: threading.RLock (reentrant) is used instead of threading.Lock so that the same thread can call register from within a get callback without deadlocking. This is safe on CPython, PyPy, and GraalPy.
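The difference between the two lock types can be seen with the standard library alone. The sketch below uses a hypothetical helper, nested_acquire, that tries to take a lock a second time from the thread that already holds it; the non-blocking acquire lets a plain Lock report failure instead of hanging.

```python
import threading


def nested_acquire(lock) -> bool:
    """Return True if the holding thread can acquire the lock a second time."""
    with lock:
        # Non-blocking, so a non-reentrant lock returns False instead of deadlocking.
        acquired_again = lock.acquire(blocking=False)
        if acquired_again:
            lock.release()
        return acquired_again


rlock_reentrant = nested_acquire(threading.RLock())  # same thread may re-enter
plain_reentrant = nested_acquire(threading.Lock())   # second acquire is refused
```

With a blocking acquire, the plain Lock case would wait forever, which is the deadlock the developer note refers to.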

get(name)

Retrieve the component registered under name.

Parameters:
name : str

Registry key.

Returns:
object

The registered component.

Raises:
KeyError

If name has not been registered.

Parameters:

name (str)

Return type:

Any
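Because get raises KeyError for unknown names, callers that want a fallback need to catch it. The helper below is one possible pattern, not part of the library: get_or_default and _StubRegistry are hypothetical names, and the stub only mimics the documented get and register behavior so the example runs on its own.

```python
from typing import Any


def get_or_default(registry: Any, name: str, default: Any = None) -> Any:
    """Return registry.get(name), or default when the name is unregistered."""
    try:
        return registry.get(name)
    except KeyError:
        return default


class _StubRegistry:
    """Hypothetical stand-in that mimics the documented get/register behavior."""

    def __init__(self) -> None:
        self._items: dict[str, Any] = {}

    def register(self, name: str, instance: Any) -> None:
        self._items[name] = instance

    def get(self, name: str) -> Any:
        return self._items[name]  # raises KeyError if name is missing


stub = _StubRegistry()
stub.register("whitespace", str.split)

found = get_or_default(stub, "whitespace")
fallback = get_or_default(stub, "no_such_name", default=str.split)
```

The helper relies only on get raising KeyError, so under that assumption it should work unchanged with any object that follows the interface documented here.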

names()

Return all registered names.

Returns:
list[str]

Sorted list of registered keys.

Return type:

list[str]

register(name, instance)

Register instance under name.

Parameters:
name : str

Registry key. Must be a non-empty string.

instance : object

The component instance to register.

Raises:
TypeError

If name is not a str.

ValueError

If name is empty.

Return type:

None
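The two documented failure modes of register can be illustrated with a small validation function. This is a sketch under assumptions: check_name is a hypothetical helper, the error messages are invented, and the order of checks (type first, then emptiness) is a guess. Whether the library also rejects whitespace-only names is not stated, so the sketch rejects only the empty string.

```python
def check_name(name: object) -> str:
    """Validate a registry key the way the documented exceptions suggest."""
    if not isinstance(name, str):
        # Documented: TypeError if name is not a str.
        raise TypeError(f"name must be a str, got {type(name).__name__}")
    if not name:
        # Documented: ValueError if name is empty.
        raise ValueError("name must be a non-empty string")
    return name


checked = check_name("whitespace")
```

Checking the type before the emptiness test means a non-string such as None is reported as a TypeError rather than being mistaken for an empty value.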