WordChunkerBridge
- class scikitplot.corpus.WordChunkerBridge(inner)
Bridge for the WordChunker → ChunkerBase contract.
Notes
WordChunker splits text at the word-token level, which does not correspond to any named ChunkingStrategy value. CUSTOM is used as the closest approximation: it signals that user-supplied or non-standard logic was applied, and downstream consumers should not assume standard segment boundaries.
- Parameters:
inner (Any)
- chunk(text, metadata=None)
Chunk text and return (char_start, chunk_text) pairs.
- Parameters:
- text : str
Raw text to chunk.
- metadata : dict[str, Any] or None, optional
Raw-chunk metadata dict passed by get_documents(). Forwarded as extra_metadata to the inner chunker where supported.
- Returns:
- list[tuple[int, str]]
Each element is (char_offset, chunk_text). If the inner chunker does not provide offsets, a forward-cursor scan computes them.
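The forward-cursor scan mentioned in the return description can be illustrated with a minimal sketch. This is not the library's implementation: `attach_offsets` and `ToyWordChunker` are hypothetical names introduced here, and the fallback used when a chunk cannot be found verbatim is an assumption.

```python
def attach_offsets(text: str, chunks: list[str]) -> list[tuple[int, str]]:
    """Pair each chunk with a character offset using a forward-only scan.

    Hypothetical helper, not part of scikitplot.
    """
    pairs: list[tuple[int, str]] = []
    cursor = 0  # only ever advances, so a repeated chunk maps to a later position
    for chunk in chunks:
        found = text.find(chunk, cursor)
        if found == -1:
            # Assumption: if the chunk no longer appears verbatim (for example,
            # whitespace was normalised), report the current cursor position.
            found = cursor
        else:
            cursor = found + len(chunk)
        pairs.append((found, chunk))
    return pairs


class ToyWordChunker:
    """Hypothetical stand-in for an inner chunker that returns plain strings."""

    def __init__(self, words_per_chunk: int = 2) -> None:
        self.words_per_chunk = words_per_chunk

    def chunk(self, text: str) -> list[str]:
        words = text.split()
        step = self.words_per_chunk
        return [" ".join(words[i:i + step]) for i in range(0, len(words), step)]


text = "the cat saw the cat run"
pairs = attach_offsets(text, ToyWordChunker().chunk(text))
print(pairs)
```

Searching from the cursor rather than from the start of the text is what keeps duplicate chunks from all resolving to the first occurrence.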