scikitplot._externals#

User guide. See the Externals (experimental) section for further details.

Sphinx Extensions#

scikitplot._externals._sphinx_ext#

Private namespace for vendored Sphinx extensions.

All child submodules are loaded lazily: importing this package does not pull in Sphinx, BeautifulSoup, markdownify, or any other heavy dependency.
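One common way to get this kind of lazy loading is a module-level `__getattr__` (PEP 562). The block below is a minimal sketch of that technique, not the package's actual implementation: `make_lazy_namespace` and the `demo_lazy` namespace are hypothetical names, and standard-library modules stand in for the heavy dependencies.

```python
import importlib
import types


def make_lazy_namespace(name, targets):
    """Return a module object whose attributes are imported on first access.

    ``targets`` maps an attribute name to the dotted name of the module
    that should be imported when that attribute is first requested.
    """
    module = types.ModuleType(name)

    def __getattr__(attr):
        # Called only when normal attribute lookup on the module fails.
        if attr not in targets:
            raise AttributeError(f"module {name!r} has no attribute {attr!r}")
        loaded = importlib.import_module(targets[attr])
        setattr(module, attr, loaded)  # cache: later lookups skip __getattr__
        return loaded

    module.__getattr__ = __getattr__
    return module


# Hypothetical namespace; "json" plays the role of a heavy dependency.
lazy = make_lazy_namespace("demo_lazy", {"json_tools": "json"})

loaded_before = "json_tools" in vars(lazy)   # nothing bound yet
encoded = lazy.json_tools.dumps([1, 2])      # import happens here
loaded_after = "json_tools" in vars(lazy)    # now cached on the namespace
```

Caching the imported module on the namespace means the import cost is paid once, on first use, and never at package import time.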

Submodules#

_sphinx_ai_assistant

AI-assistant Sphinx extension (markdown export, llms.txt, AI chat links). Copied and adapted from mlazag/sphinx-ai-assistant (MIT licence). Requires Sphinx ≥ 5 at call time, not at import time.

Notes

To register the AI-assistant extension in a Sphinx project, add the full dotted path to extensions in conf.py:

extensions = [
    "scikitplot._externals._sphinx_ext._sphinx_ai_assistant",
]

Examples

>>> # Safe: no Sphinx needed yet
>>> from scikitplot._externals import _sphinx_ext
>>> # Sphinx imported here, on demand:
>>> ai = _sphinx_ext._sphinx_ai_assistant

User guide. See the Sphinx Ext (experimental) section for further details.

_sphinx_ext._sphinx_ai_assistant

Sphinx AI Assistant Extension

Sphinx AI Extension#

Sphinx AI Assistant Extension#

A Sphinx extension that adds AI-assistant features to documentation pages, including one-click Markdown export, AI chat deep-links, MCP tool integration, and automated llms.txt generation.

The module has two distinct layers:

Core layer (Sphinx-free)

Importable without Sphinx, BeautifulSoup, or markdownify. This layer contains all security helpers, the HTML→Markdown converter, the multi-process HTML walker, and the standalone directory processor.

Sphinx layer

setup() is the Sphinx extension entry point. It wires generate_markdown_files(), generate_llms_txt(), and add_ai_assistant_context() to Sphinx build events, and these hooks delegate to the core layer internally.

All heavy optional dependencies (sphinx, bs4, markdownify, IPython) are imported lazily, only when a feature is actually invoked. Importing this module at the top level is always safe and has no side effects.

Public API (standalone / non-Sphinx)#

process_html_directory : callable

Walk any HTML directory tree, convert pages to Markdown, optionally produce llms.txt. Works with Sphinx, MkDocs, Jekyll, plain HTML, or any other static-site generator.

generate_llms_txt_standalone : callable

Write llms.txt from an existing set of .md files without requiring a Sphinx build.

html_to_markdown : callable

Convert an HTML string to Markdown.
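To make the idea of writing llms.txt from existing .md files concrete, here is a minimal sketch of such a step. It is an illustration under stated assumptions, not the behaviour of generate_llms_txt_standalone: the function name `write_llms_txt_sketch`, its parameters, and the exact file layout (a title heading followed by one link per page) are all hypothetical.

```python
from pathlib import Path
import tempfile


def write_llms_txt_sketch(md_root, base_url, title="Documentation"):
    """Write an llms.txt index linking every .md file found under md_root."""
    md_root = Path(md_root)
    prefix = base_url.rstrip("/")
    lines = [f"# {title}", "", "## Pages", ""]
    for md_file in sorted(md_root.rglob("*.md")):
        # Use POSIX-style relative paths so URLs look the same on every OS.
        rel = md_file.relative_to(md_root).as_posix()
        lines.append(f"- [{rel}]({prefix}/{rel})")
    out_path = md_root / "llms.txt"
    out_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return out_path


# Build a tiny throwaway site with two Markdown pages and index it.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "api").mkdir()
    (root / "index.md").write_text("# Home\n", encoding="utf-8")
    (root / "api" / "core.md").write_text("# Core\n", encoding="utf-8")
    llms_path = write_llms_txt_sketch(root, "https://example.com/")
    llms_text = llms_path.read_text(encoding="utf-8")
```

Because the sketch only reads files from disk, it works on the output of any static-site generator, which is the property the standalone API above is described as having.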

Public API (Sphinx extension)#

setup : callable

Sphinx extension entry point.

Notes

Developer note — import discipline:

Every import of sphinx.*, bs4, and markdownify lives inside the function or class body that needs it, guarded by a try/except where appropriate. Nothing is imported at module scope except the standard library. This keeps import-time cost near zero and avoids ImportError at load time when optional packages are absent.
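The pattern described above can be sketched as follows. This is an illustrative example only: `require_optional` and `count_tags` are hypothetical names that do not come from the module, and the standard-library `html.parser` stands in for an optional third-party package.

```python
import importlib


def require_optional(module_name, feature):
    """Import module_name on demand, or fail with a message naming feature."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"{feature} requires the optional package {module_name!r}"
        ) from exc


def count_tags(html):
    """Toy feature: the parser is imported only when this function runs."""
    parser_mod = require_optional("html.parser", "count_tags")

    class _Counter(parser_mod.HTMLParser):
        def __init__(self):
            super().__init__()
            self.count = 0

        def handle_starttag(self, tag, attrs):
            # Count every opening tag the parser encounters.
            self.count += 1

    counter = _Counter()
    counter.feed(html)
    return counter.count


tag_count = count_tags("<p>Hi <b>there</b></p>")
```

With this shape, a missing optional package surfaces as a clear error at the call site of the feature that needs it, rather than when the module is first imported.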

Security notes:

  • _safe_json_for_script escapes </script> sequences to prevent script-injection attacks when config is serialised into an HTML page.

  • _is_path_within prevents path-traversal attacks in the multi-process HTML walker.

  • _validate_base_url rejects non-HTTP(S) schemes in the base URL configuration value.

  • _validate_position rejects unknown widget-position strings.

  • _validate_provider_url_template rejects non-HTTP(S) schemes in AI-provider URL templates (javascript:, data:, ftp:, …).

  • _validate_css_selector rejects selectors containing HTML-injection characters (< or >).

  • _validate_provider checks every required field of a provider dict before it is serialised into a page or widget.

  • Ollama api_base_url is validated to allow only http://localhost or http://127.0.0.1 origins, preventing exfiltration to remote hosts.
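The checks listed above follow well-known patterns. The sketch below shows one plausible way to write four of them with the standard library; the function names are hypothetical, and the bodies are assumptions about the general technique, not the module's private helpers.

```python
import json
from pathlib import Path
from urllib.parse import urlparse


def escape_json_for_script(value):
    """Serialise value to JSON that is safe inside a <script> element."""
    text = json.dumps(value)
    # "<", ">" and "&" become \uXXXX escapes, so "</script>" cannot appear.
    return (
        text.replace("<", "\\u003c")
        .replace(">", "\\u003e")
        .replace("&", "\\u0026")
    )


def is_path_within(candidate, root):
    """Return True if candidate resolves to root or to a location below it."""
    root_resolved = Path(root).resolve()
    candidate_resolved = Path(candidate).resolve()
    return (
        candidate_resolved == root_resolved
        or root_resolved in candidate_resolved.parents
    )


def validate_http_url(url):
    """Return url unchanged if it is an http(s) URL, else raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"unsupported URL: {url!r}")
    return url


def is_local_ollama_url(url):
    """Return True only for http://localhost or http://127.0.0.1 origins."""
    parsed = urlparse(url)
    return parsed.scheme == "http" and parsed.hostname in ("localhost", "127.0.0.1")


escaped = escape_json_for_script({"note": "</script><script>alert(1)"})
```

Resolving both paths before comparing them is what defeats `..` segments, and comparing the parsed hostname rather than a string prefix is what rejects look-alike hosts such as `localhost.example.com`.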

Examples

Register with Sphinx in conf.py:

extensions = [
    "scikitplot._externals._sphinx_ext._sphinx_ai_assistant",
]
html_theme = "pydata_sphinx_theme"  # scikit-learn / NumPy style
ai_assistant_enabled = True
ai_assistant_theme_preset = "pydata_sphinx_theme"  # auto-selects CSS selectors
ai_assistant_generate_markdown = True
ai_assistant_generate_llms_txt = True
html_baseurl = "https://docs.example.com"

Standalone (non-Sphinx):

from scikitplot._externals._sphinx_ext._sphinx_ai_assistant import (
    process_html_directory,
)

stats = process_html_directory(
    "/path/to/site/_site",
    theme_preset="jekyll",
    generate_llms=True,
    base_url="https://example.com",
)
print(stats)  # {"generated": 42, "skipped": 3, "errors": 0}

User guide. See the Sphinx AI Extensions (experimental) section for further details.

_sphinx_ai_assistant.add_ai_assistant_context

Inject AI-assistant configuration into each HTML page's template context.

_sphinx_ai_assistant.generate_llms_txt

Post-build hook: write llms.txt listing all generated .md URLs.

_sphinx_ai_assistant.generate_llms_txt_standalone

Write llms.txt from an existing set of .md files.

_sphinx_ai_assistant.generate_markdown_files

Post-build hook: generate .md companions for every .html file.

_sphinx_ai_assistant.html_to_markdown

Convert an HTML string to Markdown using the Sphinx-tuned converter.

_sphinx_ai_assistant.html_to_markdown_converter

Convert an HTML string to Markdown using the Sphinx-tuned converter.

_sphinx_ai_assistant.process_html_directory

Walk any HTML directory tree and convert pages to Markdown.