# Agent-Native Philosophy

## The Shift from Human-Centric to Agent-Native
Traditional documentation is designed for biological processing. It prioritizes visual hierarchy, aesthetic navigation, and discovery-based UX. For an Autonomous Agent or a Large Language Model (LLM), these features are "noise" that lead to context fragmentation and hallucinations.
Moltext operates on the principle that agents do not "browse"—they "ingest."
## Why Human Documentation Fails Agents
- Context Fragmentation: A tool's logic might be spread across 50 HTML pages. An agent attempting to crawl these individually often loses the "global state" of the library, leading to inconsistent code generation.
- Semantic Noise: CSS classes, JavaScript boilerplate, and complex header/footer navigation dilute the signal-to-noise ratio. This wastes tokens and confuses the model's attention mechanism.
- Non-Deterministic Retrieval: When an agent "searches" a live doc site, it relies on the site's internal search engine or an erratic browsing tool. This introduces non-determinism into the agent's reasoning loop.
## The Moltext Solution: Deterministic Context
Moltext transforms chaotic web documentation into a Single Source of Truth. By compiling an entire documentation suite into a high-density context.md file, Moltext provides:
- Structural Compression: Converts HTML to clean Markdown and strips conversational filler (e.g., "In this tutorial, we will learn how to...") so the output focuses purely on API signatures and technical constraints.
- Token Efficiency: Normalizes content via LLM processing (or raw structural parsing) to reduce the documentation's footprint, allowing more logic to fit within the agent's context window.
- Vector Readiness: Formats the output with clear, hierarchical headers and explicit source citations, making it ideal for RAG (Retrieval-Augmented Generation) ingestion.
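The "Vector Readiness" property pays off at ingestion time: a file with clean hierarchical headers can be chunked mechanically. A minimal sketch of that pre-embedding step, assuming a compiled file whose headings and citation format are invented here for illustration (Moltext's actual output shape may differ):

```shell
# Illustrative only: a stand-in for a compiled context file. The section
# headings and "Source:" citation format are assumptions, not Moltext output.
cat > context.md <<'EOF'
## fetch(url, options)
Performs an HTTP request. Source: https://docs.example.com/api/fetch
## parse(html)
Returns a normalized document tree. Source: https://docs.example.com/api/parse
EOF

# Split on "## " headers into one file per section -- a typical
# pre-processing step before embedding chunks into a vector store.
awk '/^## /{n++} {print > ("chunk_" n ".md")}' context.md

ls chunk_*.md   # one chunk file per API section
```

Because each chunk starts at a header and carries its own source citation, a retrieved chunk remains attributable on its own.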
## Implementation Modes
Moltext provides two primary pathways to achieve agent-native context, depending on your resource constraints and accuracy requirements.
### 1. The Raw Pipeline (Deterministic)
Use the --raw flag to bypass LLM processing. This is the fastest, most cost-effective method for agents that prefer to do their own reasoning over "ground truth" Markdown.
```shell
# Pure structural normalization
moltext https://docs.example.com --raw --output tool_context.md
```
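A quick follow-up check keeps the raw file inside the agent's context budget. The ~4 characters-per-token ratio below is a common rule of thumb, not a Moltext guarantee, and the file contents are invented for the sketch:

```shell
# Stand-in for a file produced by the --raw pipeline; contents are invented.
printf '%s\n' "## install" "pip install example-tool" > tool_context.md

# Approximate token count, assuming roughly 4 characters per token.
chars=$(wc -c < tool_context.md)
tokens=$((chars / 4))
echo "approx tokens: $tokens"

# Fail fast if the context would overflow a 128k-token window.
[ "$tokens" -lt 128000 ] && echo "fits in context window"
```

Running this gate before every agent invocation makes overflow a loud, deterministic failure instead of silent truncation.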
### 2. The Compiled Pipeline (Refined)
In this mode, Moltext uses a "Compiler Model" (e.g., gpt-4o-mini or a local llama3) to rewrite the documentation. The compiler follows a strict system prompt to:
- Preserve all code blocks and type signatures.
- Remove redundant phrasing.
- Optimize for keyword density.
```shell
# AI-enhanced compilation
moltext https://docs.example.com --model gpt-4o-mini --key YOUR_API_KEY
```
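Moltext's actual internal compiler prompt is not shown here; the sketch below is a hypothetical restatement of the three rules above, useful if you want to replicate the compilation step with your own model:

```shell
# Hypothetical compiler system prompt -- NOT Moltext's published prompt,
# just the three documented rules restated as instructions.
cat > compiler_prompt.txt <<'EOF'
You are a documentation compiler. Rewrite the input page as dense Markdown.
Rules:
1. Preserve every code block and type signature verbatim.
2. Remove redundant or conversational phrasing.
3. Optimize for keyword density; keep API names and parameters explicit.
EOF

wc -l compiler_prompt.txt
```

Keeping the prompt in a versioned file makes the compilation step reproducible across model swaps (e.g., gpt-4o-mini today, a local llama3 tomorrow).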
## From "Browsing" to "Knowing"
By integrating Moltext into an agent's workflow, you move the agent from a state of uncertain exploration to grounded execution. Instead of guessing an API's parameters, the agent reads the compiled context.md into its memory, providing it with the complete technical surface area of the tool it is tasked to use.