SearchConfig
- class scikitplot.corpus.SearchConfig(top_k=10, match_mode='semantic', semantic_threshold=0.0, keyword_threshold=0.0, hybrid_alpha=0.5, rrf_k=60, use_normalized_text=True, case_sensitive=False)
Configuration for similarity search.
- Parameters:
- top_k : int
Maximum results to return.
- match_mode : str
One of "strict", "keyword", "semantic", "hybrid".
- semantic_threshold : float
Minimum cosine similarity for SEMANTIC results.
- keyword_threshold : float
Minimum keyword overlap for KEYWORD results.
- hybrid_alpha : float
Weight for semantic scores in HYBRID mode (0 = pure keyword, 1 = pure semantic). Default 0.5 (equal weight).
- rrf_k : int
Reciprocal rank fusion constant. Default 60 (standard).
- use_normalized_text : bool
Use normalized_text for matching when available.
- case_sensitive : bool
Case-sensitive matching in STRICT mode.
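The hybrid_alpha and rrf_k parameters can be illustrated with a minimal sketch. It is not taken from the scikitplot source: it assumes that hybrid_alpha is applied as a linear blend of the two scores and that rrf_k enters the usual reciprocal rank fusion sum 1 / (k + rank) with ranks starting at 1. The function names blend_scores and rrf_scores are hypothetical.

```python
from collections import defaultdict


def blend_scores(semantic, keyword, alpha=0.5):
    """Assumed linear blend: alpha weights semantic, (1 - alpha) weights keyword."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    doc_ids = set(semantic) | set(keyword)
    # A document missing from one score map contributes 0.0 for that side.
    return {
        doc_id: alpha * semantic.get(doc_id, 0.0)
        + (1.0 - alpha) * keyword.get(doc_id, 0.0)
        for doc_id in doc_ids
    }


def rrf_scores(rankings, k=60):
    """Reciprocal rank fusion: sum 1 / (k + rank) over all rankings (rank from 1)."""
    fused = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            fused[doc_id] += 1.0 / (k + rank)
    return dict(fused)


# Toy scores keyed by hypothetical document ids.
semantic = {"a": 0.9, "b": 0.4}
keyword = {"b": 0.8, "c": 0.6}

blended = blend_scores(semantic, keyword, alpha=0.5)

# One ranked list per retriever, best match first.
fused = rrf_scores([["a", "b"], ["b", "c"]], k=60)
best = max(fused, key=fused.get)
```

With alpha=0 the blend reduces to the keyword scores and with alpha=1 to the semantic scores, matching the parameter description. A larger k flattens the difference between adjacent ranks in the fused score.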
Notes
User note: For RAG pipelines, match_mode="hybrid" with default settings provides a good balance. For exact citation matching, use match_mode="strict".
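The two recommendations in the note can be sketched as configuration objects. To keep the example runnable without the library installed, it uses a hypothetical stand-in dataclass, SearchConfigSketch, that mirrors the documented signature and defaults. The match_mode validation and the choice of case_sensitive=True for citations are assumptions, not documented behavior.

```python
from dataclasses import dataclass

# With the library installed, the real class would presumably be imported as:
# from scikitplot.corpus import SearchConfig

_ALLOWED_MODES = {"strict", "keyword", "semantic", "hybrid"}


@dataclass(frozen=True)
class SearchConfigSketch:
    """Hypothetical stand-in mirroring the documented SearchConfig defaults."""

    top_k: int = 10
    match_mode: str = "semantic"
    semantic_threshold: float = 0.0
    keyword_threshold: float = 0.0
    hybrid_alpha: float = 0.5
    rrf_k: int = 60
    use_normalized_text: bool = True
    case_sensitive: bool = False

    def __post_init__(self):
        # Assumed validation; the real class may behave differently.
        if self.match_mode not in _ALLOWED_MODES:
            raise ValueError(f"unknown match_mode: {self.match_mode!r}")


# RAG pipeline: hybrid mode, all other settings left at their defaults.
rag_config = SearchConfigSketch(match_mode="hybrid")

# Exact citation matching: strict mode, here also made case-sensitive.
citation_config = SearchConfigSketch(match_mode="strict", case_sensitive=True)
```

Swapping SearchConfigSketch for the real SearchConfig should need only the import change, since the keyword arguments follow the documented signature.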