ChunkerBridge
- class scikitplot.corpus.ChunkerBridge(inner)
Adapter that wraps a new-style chunker as a ChunkerBase-compatible object.
- Parameters:
- inner (object)
The new-style chunker instance (SentenceChunker, ParagraphChunker, FixedWindowChunker, or WordChunker).
- Attributes:
- strategy (ChunkingStrategy)
Required by _base.py:get_documents(), line 739.
- inner (object)
The wrapped chunker, retained for direct access to the richer ChunkResult API when needed.
- Parameters:
inner (Any)
Notes
Developer note: _base.py calls exactly two things on a chunker:
- self.chunker.strategy: a ChunkingStrategy enum value.
- self.chunker.chunk(text, metadata=raw_chunk) → list[tuple[int, str]], where int is char_start and str is the chunk text.
This bridge satisfies both without touching ChunkerBase or the new chunkers.
- chunk(text, metadata=None)
Chunk text and return (char_start, chunk_text) pairs.
- Parameters:
- text (str)
Raw text to chunk.
- metadata (dict[str, Any] or None, optional)
Raw-chunk metadata dict passed by get_documents(). Forwarded as extra_metadata to the inner chunker where supported.
- Returns:
- list[tuple[int, str]]
Each element is (char_offset, chunk_text). If the inner chunker does not provide offsets, a forward-cursor scan computes them.
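The following is a minimal, self-contained sketch of the adapter idea and of one way a forward-cursor scan could work. It does not reproduce the library's implementation: `ChunkingStrategy`, `FakeChunk`, `FakeSentenceChunker`, `MiniBridge`, and `forward_cursor_offsets` are hypothetical stand-ins defined here for illustration only, and the fallback used when a piece cannot be located is an assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Optional


class ChunkingStrategy(Enum):
    """Hypothetical stand-in for the library's ChunkingStrategy enum."""
    SENTENCE = "sentence"


@dataclass
class FakeChunk:
    """Hypothetical new-style chunk: carries text but no character offset."""
    text: str


class FakeSentenceChunker:
    """Hypothetical new-style chunker that splits on '. '."""

    def chunk(self, text: str, extra_metadata: Optional[dict] = None) -> list[FakeChunk]:
        parts = [part.strip() for part in text.split(". ")]
        return [FakeChunk(part) for part in parts if part]


def forward_cursor_offsets(text: str, pieces: list[str]) -> list[tuple[int, str]]:
    """Find each piece at or after the end of the previous match."""
    cursor = 0
    pairs = []
    for piece in pieces:
        idx = text.find(piece, cursor)
        if idx == -1:
            # Assumption: if the piece was altered and cannot be found,
            # fall back to the current cursor position.
            idx = cursor
        else:
            cursor = idx + len(piece)
        pairs.append((idx, piece))
    return pairs


class MiniBridge:
    """Hypothetical adapter exposing only `.strategy` and `.chunk(...)`."""

    def __init__(self, inner: Any, strategy: ChunkingStrategy) -> None:
        self.inner = inner          # kept for direct access to the richer API
        self.strategy = strategy    # read by the caller as an enum value

    def chunk(self, text: str, metadata: Optional[dict] = None) -> list[tuple[int, str]]:
        # Forward the raw-chunk metadata under the inner chunker's own keyword.
        results = self.inner.chunk(text, extra_metadata=metadata)
        return forward_cursor_offsets(text, [r.text for r in results])


bridge = MiniBridge(FakeSentenceChunker(), ChunkingStrategy.SENTENCE)
pairs = bridge.chunk("One fish. Two fish. One fish.", metadata={"source": "demo"})
print(pairs)  # [(0, 'One fish'), (10, 'Two fish'), (20, 'One fish.')]
```

The repeated sentence shows why the cursor matters: a plain `text.find("One fish.")` from position 0 would return 0 for the last piece, whereas searching from the end of the previous match yields 20.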