SQLiteStorage

class scikitplot.corpus.SQLiteStorage(db_path=':memory:')

SQLite-backed corpus store with FTS5 full-text search.

Uses the standard-library sqlite3 module, so there are no external dependencies. Full-text search is available via FTS5 through StorageQuery.full_text.

The database uses WAL (Write-Ahead Logging) mode for better concurrent read throughput. Each SQLiteStorage instance holds a single connection; when several threads need access, give each thread its own instance.
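
As a rough illustration of the WAL and one-connection-per-thread advice above, the sketch below uses only the standard-library sqlite3 module. It does not show SQLiteStorage internals; the `docs` table, its columns, and the helper `open_wal` are hypothetical names chosen for the example.

```python
import sqlite3
import tempfile
import threading
from pathlib import Path


def open_wal(db_path):
    """Open a connection and switch the database file to WAL journaling."""
    conn = sqlite3.connect(db_path)
    # The pragma returns the journal mode that is now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    return conn, mode


with tempfile.TemporaryDirectory() as tmp:
    db_path = Path(tmp) / "corpus.db"

    writer, mode = open_wal(db_path)
    # Hypothetical schema, for illustration only.
    writer.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, text TEXT)")
    writer.execute("INSERT INTO docs VALUES ('a', 'hello')")
    writer.commit()

    counts = []

    def reader():
        # Each thread opens its own connection instead of sharing one.
        # WAL mode is stored in the file, so readers need not set it again.
        conn = sqlite3.connect(db_path)
        counts.append(conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0])
        conn.close()

    threads = [threading.Thread(target=reader) for _ in range(3)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    writer.close()
```

With SQLiteStorage itself, the equivalent pattern would presumably be to construct one instance per thread, all pointing at the same `db_path`.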

Parameters:

db_path (pathlib.Path | str)

Path to the SQLite database file. Created if absent. Pass ":memory:" for a purely in-memory database.
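
To make the ":memory:" behaviour concrete, here is a small sketch with plain sqlite3 rather than SQLiteStorage. It shows that every ":memory:" connection gets its own private database, so data stored through one connection is not visible through another. The `docs` table is a hypothetical name.

```python
import sqlite3

# Two independent in-memory databases, one per connection.
first = sqlite3.connect(":memory:")
first.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY)")
first.execute("INSERT INTO docs VALUES ('a')")
first.commit()

second = sqlite3.connect(":memory:")
# The second connection sees no tables at all.
tables_in_second = second.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"
).fetchall()

first.close()   # the data in the first database is discarded here
second.close()
```

An in-memory store is therefore best suited to tests and short-lived sessions; pass a file path when the corpus must outlive the process.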

Examples

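>>> from pathlib import Path
>>> from scikitplot.corpus import SQLiteStorage, StorageQuery  # StorageQuery import path assumed
>>> store = SQLiteStorage(Path("corpus.db"))
>>> store.save_batch(docs)  # docs: a sequence of CorpusDocument built elsewhere
>>> result = store.query(StorageQuery(source_type="book", limit=10))
>>> result.total
487
>>> store.close()

close()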

Close the database connection.

Return type:

None

count()

Return the total number of stored documents using a SQL COUNT query.

Return type:

int

get(doc_id)

Retrieve a document by doc_id.

Parameters:

doc_id (str)

Return type:

CorpusDocument | None

query(q)

Query documents with optional full-text search (FTS5).

Parameters:

q (StorageQuery)

Return type:

QueryResult
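
For readers unfamiliar with FTS5, the following sketch shows the kind of SQL a full-text query can translate to. It uses sqlite3 directly and assumes the local SQLite build includes FTS5. The table `docs_fts`, its columns, and the `search` helper are hypothetical and are not taken from the SQLiteStorage schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# doc_id is stored but not tokenized; only the text column is searchable.
conn.execute("CREATE VIRTUAL TABLE docs_fts USING fts5(doc_id UNINDEXED, text)")
conn.executemany(
    "INSERT INTO docs_fts (doc_id, text) VALUES (?, ?)",
    [
        ("a", "the whale surfaced near the ship"),
        ("b", "a quiet harbour at dawn"),
        ("c", "the ship left the harbour"),
    ],
)


def search(term, limit=10):
    """Return doc_ids whose text matches the FTS5 query, best match first."""
    rows = conn.execute(
        "SELECT doc_id FROM docs_fts WHERE docs_fts MATCH ? ORDER BY rank LIMIT ?",
        (term, limit),
    ).fetchall()
    return [row[0] for row in rows]


hits = search("harbour")
```

In the real API the search string would be supplied through StorageQuery.full_text, and the matching documents come back wrapped in a QueryResult.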

save(doc)

Persist a single document (upsert by doc_id).

Parameters:

doc (CorpusDocument)

Return type:

None
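
"Upsert by doc_id" means that saving a document whose doc_id already exists overwrites the stored row instead of raising or creating a duplicate. A minimal sketch of that behaviour in plain sqlite3 follows; the `docs` table and the use of INSERT OR REPLACE are assumptions for illustration, since the statement SQLiteStorage actually issues is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: the primary key is what makes the upsert possible.
conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, text TEXT)")


def save(doc_id, text):
    """Insert the row, or replace the existing row with the same doc_id."""
    with conn:  # commit on success, roll back on error
        conn.execute(
            "INSERT OR REPLACE INTO docs (doc_id, text) VALUES (?, ?)",
            (doc_id, text),
        )


save("a", "first draft")
save("a", "second draft")  # same doc_id: overwrites rather than duplicates

rows = conn.execute("SELECT doc_id, text FROM docs").fetchall()
```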

save_batch(docs)

Persist a batch of documents in a single transaction.

Parameters:

docs (Sequence[CorpusDocument])

Return type:

None
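
Writing a batch in a single transaction has two practical effects: one commit instead of one per document, and all-or-nothing behaviour if a row fails. The sketch below demonstrates both with plain sqlite3. The `docs` table is hypothetical, and a plain INSERT is used deliberately (instead of an upsert) so that a duplicate key triggers a visible failure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (doc_id TEXT PRIMARY KEY, text TEXT)")


def save_batch(rows):
    """Insert all rows in one transaction; nothing is kept if any row fails."""
    with conn:  # commits once at the end, or rolls back if an exception escapes
        conn.executemany("INSERT INTO docs (doc_id, text) VALUES (?, ?)", rows)


def count():
    return conn.execute("SELECT COUNT(*) FROM docs").fetchone()[0]


save_batch([("a", "one"), ("b", "two")])
count_after_good = count()

try:
    # "a" already exists, so the second row violates the primary key.
    save_batch([("c", "three"), ("a", "duplicate")])
except sqlite3.IntegrityError:
    pass

# The rollback also discarded "c", which was inserted before the failure.
count_after_bad = count()
```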