FixedWindowChunkerConfig
- class scikitplot.corpus.FixedWindowChunkerConfig(window_size=512, step_size=256, unit=WindowUnit.CHARS, min_length=10, include_offsets=True, strip_whitespace=True)
Configuration for FixedWindowChunker.
- Parameters:
- window_size : int
Size of each chunk, in the units selected by unit.
- step_size : int
Stride between consecutive chunk starts. step_size == window_size gives non-overlapping chunks; step_size < window_size gives sliding-window overlap.
- unit : WindowUnit
Measurement unit: CHARS (default) or TOKENS.
- min_length : int
Minimum character length required to keep the last (possibly partial) chunk.
- include_offsets : bool
Compute and store character offsets.
- strip_whitespace : bool
Strip leading/trailing whitespace from each chunk.
- Parameters:
- unit: WindowUnit = 'chars'
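
The interaction between window_size, step_size, and min_length can be illustrated with a small standalone sketch. This is not scikitplot's implementation: the function name fixed_window_chunks is hypothetical, it works on characters only, and it assumes that a final chunk shorter than min_length is dropped and that chunking stops once a window reaches the end of the text.

```python
def fixed_window_chunks(text, window_size=512, step_size=256, min_length=10):
    """Return (start, end, chunk) triples for a fixed-window pass over text.

    Hypothetical helper for illustration; character units only.
    """
    if window_size <= 0 or step_size <= 0:
        raise ValueError("window_size and step_size must be positive")

    chunks = []
    start = 0
    while start < len(text):
        end = min(start + window_size, len(text))
        is_last = end == len(text)
        # Assumed rule: the last (possibly partial) chunk is kept only
        # when it is at least min_length characters long.
        if not is_last or end - start >= min_length:
            chunks.append((start, end, text[start:end]))
        if is_last:
            break
        start += step_size  # stride between consecutive chunk starts
    return chunks


text = "abcdefghijklmnopqrstuvwxyz"  # 26 characters

# step_size == window_size: non-overlapping chunks
non_overlapping = fixed_window_chunks(text, window_size=10, step_size=10, min_length=3)

# step_size < window_size: sliding-window overlap
overlapping = fixed_window_chunks(text, window_size=10, step_size=5, min_length=3)

# A larger min_length drops the 6-character tail
trimmed = fixed_window_chunks(text, window_size=10, step_size=10, min_length=10)
```

Under these assumptions, the non-overlapping call yields spans (0, 10), (10, 20), (20, 26); the overlapping call starts a chunk every 5 characters; and the last call discards the short trailing chunk.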