ChunkerBridge
- class scikitplot.corpus.ChunkerBridge(inner)
Adapter that wraps a new-style chunker as a ChunkerBase-compatible object.
- Parameters:
- inner : object
The new-style chunker instance (SentenceChunker, ParagraphChunker, FixedWindowChunker, or WordChunker).
- Attributes:
- strategy : ChunkingStrategy
Required by _base.py:get_documents() (line 739).
- inner : object
The wrapped chunker, retained for direct access to the richer ChunkResult API when needed.
- Parameters:
inner (Any)
Notes
Developer note: _base.py calls exactly two things on a chunker:
- self.chunker.strategy: a ChunkingStrategy enum value.
- self.chunker.chunk(text, metadata=raw_chunk) → list[tuple[int, str]], where int is char_start and str is the chunk text.
This bridge satisfies both without touching ChunkerBase or the new chunkers.
- chunk(text, metadata=None)
Chunk text and return a ChunkResult.
CRITICAL-02 (Phase 2): Returns ChunkResult directly. DocumentReader.get_documents now iterates chunk_result.chunks instead of (char_start, chunk_text) tuples.
- Parameters:
- text : str
Raw text to chunk.
- metadata : dict[str, Any] or None, optional
Raw-chunk metadata dict passed by get_documents(). Forwarded as extra_metadata to the inner chunker.
- Returns:
- ChunkResult
Ordered list of Chunk objects with text, start_char, end_char, and metadata.
- Return type:
ChunkResult
Notes
Use _to_tuples to convert to the legacy list[tuple[int, str]] format if needed for backward compatibility.
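The conversion to the legacy format can be illustrated with a minimal sketch. The Chunk and ChunkResult classes below are stand-ins built only from the fields named in the Returns section, and to_legacy_tuples is a hypothetical helper; the actual signature and location of _to_tuples are not shown on this page, so this is an approximation rather than the library's implementation.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Chunk:
    """Stand-in for the library's Chunk (fields from the Returns section)."""
    text: str
    start_char: int
    end_char: int
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ChunkResult:
    """Stand-in for the library's ChunkResult: an ordered list of chunks."""
    chunks: list[Chunk]


def to_legacy_tuples(result: ChunkResult) -> list[tuple[int, str]]:
    """Hypothetical helper: map each chunk to (char_start, chunk_text)."""
    return [(chunk.start_char, chunk.text) for chunk in result.chunks]


result = ChunkResult(
    chunks=[
        Chunk(text="First sentence.", start_char=0, end_char=15),
        Chunk(text="Second sentence.", start_char=16, end_char=32),
    ]
)
legacy = to_legacy_tuples(result)
```

Note that the legacy tuples drop end_char and metadata, so the conversion is lossy and only suits callers that still expect the old format.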
- strategy: ClassVar[ChunkingStrategy]#
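The adapter contract described above (a strategy attribute plus a chunk(text, metadata=...) method that forwards metadata as extra_metadata) can be sketched roughly as follows. Everything here is illustrative: ChunkingStrategy.CUSTOM, ToyParagraphChunker, and SketchBridge are hypothetical names, and the inner chunker's chunk(text, extra_metadata=...) method is an assumption inferred from the parameter description, not a confirmed API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, ClassVar, Optional


class ChunkingStrategy(Enum):
    """Stand-in enum; the real members are not listed on this page."""
    CUSTOM = "custom"


@dataclass
class Chunk:
    text: str
    start_char: int
    end_char: int
    metadata: dict[str, Any] = field(default_factory=dict)


@dataclass
class ChunkResult:
    chunks: list[Chunk]


class ToyParagraphChunker:
    """Hypothetical new-style chunker: splits text on blank lines."""

    def chunk(self, text: str, extra_metadata: Optional[dict] = None) -> ChunkResult:
        chunks = []
        pos = 0
        for part in text.split("\n\n"):
            if part.strip():
                # Copy the metadata so chunks do not share one dict.
                meta = dict(extra_metadata or {})
                chunks.append(Chunk(part, pos, pos + len(part), meta))
            pos += len(part) + 2  # skip the two-newline separator
        return ChunkResult(chunks)


class SketchBridge:
    """Sketch of the adapter: exposes strategy and chunk(text, metadata)."""

    strategy: ClassVar[ChunkingStrategy] = ChunkingStrategy.CUSTOM

    def __init__(self, inner: Any) -> None:
        self.inner = inner  # kept for direct access to the richer API

    def chunk(self, text: str, metadata: Optional[dict] = None) -> ChunkResult:
        # Forward the raw-chunk metadata under the inner chunker's keyword.
        return self.inner.chunk(text, extra_metadata=metadata)


bridge = SketchBridge(ToyParagraphChunker())
result = bridge.chunk("Alpha.\n\nBeta.", metadata={"source": "demo.txt"})
```

Because the bridge only delegates, neither the base class nor the wrapped chunker needs to change, which matches the intent stated in the Notes.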