# Local Inference Setup
Moltext is designed to be provider-agnostic. While it defaults to OpenAI, you can route the compilation process through local inference servers like Ollama or LM Studio. This is the preferred "Shared Brain" flow for users prioritizing privacy, local-first agentic workflows, or cost reduction.
## Why Use Local Inference?
- Data Privacy: Documentation content never leaves your local machine.
- Zero Cost: Process large documentation sets without paying for API tokens.
- Agent Sovereignty: Keep your agent's learning infrastructure entirely self-hosted.
## Configuration Overview
To use a local provider, you must override the default OpenAI endpoint and specify your local model name using the following flags:
| Flag | Description | Default |
| :--- | :--- | :--- |
| `-u`, `--base-url` | The URL of your local inference server (OpenAI-compatible) | `https://api.openai.com/v1` |
| `-m`, `--model` | The specific model tag/name installed on your local server | `gpt-4o-mini` |
> [!NOTE]
> When a custom `--base-url` is detected that does not point to OpenAI, Moltext automatically bypasses the API key requirement by injecting a dummy key. You do not need to provide the `-k` flag for local setups.
## Provider Examples
### 1. Ollama
Ollama provides an OpenAI-compatible API on port 11434.
- Ensure Ollama is running and you have pulled a model (e.g., `llama3` or `mistral`).
- Run Moltext pointing at the local endpoint:

```shell
moltext https://docs.example.com \
  --base-url http://localhost:11434/v1 \
  --model llama3
```
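Before running the command above, you can confirm the model is actually installed. A minimal sketch using the `ollama` CLI (assumes `llama3` as in the example; substitute any tag you use):

```shell
MODEL="llama3"

# Verify the model is pulled; pull it if missing.
# Skip gracefully when the ollama CLI is not on PATH.
if command -v ollama > /dev/null 2>&1; then
  ollama list | grep -q "$MODEL" || ollama pull "$MODEL"
  ollama list
else
  echo "ollama CLI not found on PATH"
fi
```

The tag printed by `ollama list` is exactly what you pass to `--model`.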
### 2. LM Studio
LM Studio allows you to host local LLMs with a one-click local server.
- Open LM Studio and start the Local Server (typically on port `1234`).
- Ensure that "Cross-Origin Resource Sharing (CORS)" is enabled, if applicable.
- Run Moltext:

```shell
moltext https://docs.example.com \
  --base-url http://localhost:1234/v1 \
  --model <your-loaded-model-id>
```
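If you are unsure what identifier to use for `--model`, an OpenAI-compatible server reports its loaded models at `GET /v1/models`. A sketch, assuming LM Studio's default port 1234 and `curl`:

```shell
BASE_URL="http://localhost:1234/v1"

# Ask the server which models it is serving; the "id" fields in the
# JSON response are the exact strings to pass to --model.
MODELS_JSON=$(curl -sf --max-time 2 "$BASE_URL/models" 2> /dev/null) || MODELS_JSON=""

if [ -n "$MODELS_JSON" ]; then
  echo "$MODELS_JSON"
else
  echo "No response from $BASE_URL (is the LM Studio local server started?)"
fi
```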
## Advanced Usage: The Hybrid "Raw" Fallback
If your local model struggles with complex HTML-to-Markdown normalization, you can pass the `--raw` flag to skip LLM processing entirely. This bypasses all inference and relies on Moltext's internal deterministic parsing engine.
```shell
moltext https://docs.example.com --raw
```
## Troubleshooting Local Connections
- Connection Refused: Ensure your local server is actually running the API service and that the port matches your `--base-url`.
- Model Not Found: Verify the model name with `ollama list` or the LM Studio dashboard. The string passed to `--model` must match the provider's identifier exactly.
- Timeouts: Compiling large documentation sites locally depends on your hardware. For massive sites, consider increasing `--limit` gradually.
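The first two checks above can be scripted before a long compile run. A minimal sketch, assuming an Ollama-style endpoint on port 11434 and `curl` (adjust `BASE_URL` and `MODEL` to your setup):

```shell
BASE_URL="http://localhost:11434/v1"
MODEL="llama3"

# Connection check: a failed request here usually means the server
# process is not running or the port in --base-url is wrong.
if ! curl -sf --max-time 2 "$BASE_URL/models" > /dev/null 2>&1; then
  echo "cannot reach $BASE_URL"
  RESULT="connection-failed"
# Model check: the id passed to --model must appear in the server's list.
elif curl -sf --max-time 2 "$BASE_URL/models" | grep -q "\"$MODEL\""; then
  RESULT="ok"
else
  echo "server is up, but model '$MODEL' was not found"
  RESULT="model-missing"
fi
```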