_MultiSourceReader
- class scikitplot.corpus._MultiSourceReader(readers)
Chains multiple DocumentReader instances into one stream.
Returned by DocumentReader.create when more than one source is supplied, and by DocumentReader.from_manifest.
Also acts as a context manager, ensuring temporary directories from from_url() downloads are cleaned up on exit.
- Parameters:
- readers : list[DocumentReader]
Ordered list of sub-readers. Documents are yielded in order.
Notes
Context manager usage (ensures temp-file cleanup):
    with DocumentReader.create(
        "https://iris.who.int/.../content",
        Path("report.pdf"),
    ) as reader:
        docs = list(reader.get_documents())
Duck-typed interface: exposes get_documents(), matching DocumentReader, so it works anywhere a single reader is accepted.
Examples
>>> from pathlib import Path
>>> import scikitplot.corpus._readers
>>> reader = DocumentReader.create(Path("a.txt"), Path("b.pdf"))
>>> type(reader).__name__
'_MultiSourceReader'
>>> docs = list(reader.get_documents())
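The chaining and duck-typing described above can be illustrated with a small standalone sketch. It does not use scikitplot and is not the library's implementation: `ListReader`, `ChainedReader`, and `count_documents` are hypothetical names, and plain strings stand in for documents. It only assumes what the docs state, namely that sub-readers are consumed in list order and that the combined object exposes `get_documents()` like a single reader.

```python
from itertools import chain
from typing import Iterator, List


class ListReader:
    """Hypothetical stand-in for a DocumentReader: yields a fixed list."""

    def __init__(self, docs: List[str]) -> None:
        self._docs = docs

    def get_documents(self) -> Iterator[str]:
        yield from self._docs


class ChainedReader:
    """Illustrative chain of sub-readers (assumed design, not scikitplot's)."""

    def __init__(self, readers: List[ListReader]) -> None:
        self.readers = list(readers)

    def get_documents(self) -> Iterator[str]:
        # Exhaust each sub-reader in turn, preserving the order of `readers`.
        return chain.from_iterable(r.get_documents() for r in self.readers)


def count_documents(reader) -> int:
    """Accepts any object exposing get_documents(), single or chained."""
    return sum(1 for _ in reader.get_documents())


single = ListReader(["a1", "a2"])
multi = ChainedReader([single, ListReader(["b1"])])
chained_docs = list(multi.get_documents())
```

Because `count_documents` only calls `get_documents()`, it works unchanged for `single` and `multi`, which is the property the duck-typing note describes.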
- close()
Release temporary directories created by from_url() downloads.
Each sub-reader that downloaded a file has a _from_url_tmp_dir attribute set by DocumentReader.from_url. This method deletes those directories. Called automatically when used as a context manager; call manually otherwise.
- Return type:
None
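A minimal sketch of the cleanup behaviour described for close(), written without scikitplot. Only the `_from_url_tmp_dir` attribute name comes from the documentation above; `FakeSubReader` and `CleanupChain` are hypothetical, and the use of `shutil.rmtree` is an assumption about how such directories might be removed, not a statement about the library's code.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List, Optional


class FakeSubReader:
    """Hypothetical sub-reader that may own a temp directory from a download."""

    def __init__(self, tmp_dir: Optional[Path] = None) -> None:
        # Attribute name taken from the docs; everything else is assumed.
        self._from_url_tmp_dir = tmp_dir


class CleanupChain:
    """Illustrative only: mimics the documented close()/context-manager pairing."""

    def __init__(self, readers: List[FakeSubReader]) -> None:
        self.readers = list(readers)

    def close(self) -> None:
        for reader in self.readers:
            tmp_dir = getattr(reader, "_from_url_tmp_dir", None)
            if tmp_dir is not None:
                # ignore_errors keeps close() safe to call more than once.
                shutil.rmtree(tmp_dir, ignore_errors=True)
                reader._from_url_tmp_dir = None

    def __enter__(self) -> "CleanupChain":
        return self

    def __exit__(self, exc_type, exc, tb) -> None:
        # Returning None lets any exception from the with-body propagate.
        self.close()


# A temp directory stands in for one created by a from_url() download.
downloaded = Path(tempfile.mkdtemp())
with CleanupChain([FakeSubReader(downloaded), FakeSubReader()]) as chain_reader:
    existed_inside = downloaded.exists()
removed_after = not downloaded.exists()

# Without a with-block, pair the reader with try/finally and call close().
manual_dir = Path(tempfile.mkdtemp())
manual = CleanupChain([FakeSubReader(manual_dir)])
try:
    pass  # read documents here
finally:
    manual.close()
```

The try/finally form corresponds to the "call manually otherwise" case: the directory is removed even if reading raises.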
- readers: list[DocumentReader]