Recursive Language Models (RLM) - Research

Focus: Original MIT paper research and algorithm
Tachikoma Implementation: See docs/capabilities/rlm.md
Full Documentation: See docs/rlm.md


The Problem

Standard LLMs have fixed context windows (128K-200K tokens). How can they handle:

  • Entire codebases
  • Large documentation sets
  • Multi-file refactoring
  • 10M+ token contexts

MIT Paper: "Recursive Language Models"

Paper: "Recursive Language Models: Efficient Long-Context Processing with Limited Computation" arXiv: https://arxiv.org/abs/2512.24601

Key Findings

  • Process inputs up to two orders of magnitude beyond context windows
  • RLM-Qwen3-8B outperforms base Qwen3-8B by 28.3% on average
  • Approaches GPT-5 quality on long-context tasks

Key Innovations

1. Symbolic Handle to Prompt

Concept: Prompt lives in REPL (external to LLM)

How it works:

  • Large context stored externally (not in LLM context)
  • LLM maintains symbolic reference to context
  • Only metadata in LLM context (constant size)
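
A minimal TypeScript sketch of the idea (ContextHandle, registerContext, and the token math are illustrative, not Tachikoma's actual API):

typescript
// Hypothetical sketch: the LLM's context carries only this constant-size
// handle; the full prompt text lives in an external store it can query.
interface ContextHandle {
  id: string;          // symbolic reference the LLM can use in generated code
  totalTokens: number; // metadata only, never the content itself
  numChunks: number;
}

const store = new Map<string, string>(); // external storage, outside the LLM

function registerContext(id: string, content: string): ContextHandle {
  store.set(id, content);
  const totalTokens = Math.ceil(content.length / 4); // rough token estimate
  return { id, totalTokens, numChunks: Math.ceil(totalTokens / 50_000) };
}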

2. Symbolic Recursion

Concept: LLM writes code that calls sub_LLM() in loops

How it works:

python
# The LLM generates this Python code inside the REPL
chunks = chunk_indices(size=50000)  # REPL helper: chunk boundaries over the external context
results = []
for start, end in chunks:
    chunk = peek(start, end)  # REPL helper: read one slice of the stored prompt
    result = sub_LLM("Analyze", chunk=chunk)  # RECURSION!
    if result["success"]:
        results.append(result["result"])

Why this is recursion: the LLM is calling itself through the sub_LLM() function.

3. Output via Variables

Concept: Results stored in REPL variables (Final)

How it works:

  • Intermediate results stored in REPL
  • Final synthesis in variable Final
  • LLM only sees variable names, not actual results
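
A minimal sketch with hypothetical names: the REPL owns a variable table, the synthesis lands in Final, and only names and sizes ever reach the LLM's history.

typescript
// Hypothetical sketch: results live in the REPL's variable table; the LLM
// references names like "Final" without the values entering its context.
const replVars = new Map<string, unknown>();

replVars.set("results", ["summary of chunk 0", "summary of chunk 1"]);
replVars.set("Final", "Synthesis built from the entries in `results`.");

// What the LLM sees in its history: names and sizes, never contents.
for (const [name, value] of replVars) {
  console.log(`${name}: ${JSON.stringify(value).length} chars`);
}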

4. Metadata-Only History

Concept: Only constant-size metadata in LLM context

How it works:

  • No actual context chunks in LLM history
  • Only metadata: chunk IDs, processing status
  • Context window stays fixed regardless of input size
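
A hypothetical sketch of what such a history record might look like; the shape is illustrative, not the paper's exact schema:

typescript
// Hypothetical sketch: each processing step collapses to a fixed-size record,
// so history grows with the number of steps, not with input tokens.
interface HistoryEntry {
  chunkId: number;
  status: "pending" | "done" | "failed";
  resultVar: string; // name of the REPL variable holding the output
}

const history: HistoryEntry[] = [
  { chunkId: 0, status: "done", resultVar: "results[0]" },
  { chunkId: 1, status: "pending", resultVar: "results[1]" },
];
// A 10M-token input and a 100K-token input yield records of the same size.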

5. Sub-LLM Calls

Concept: LLM calls itself via subagent

How it works:

  • Subagent acts as "sub-LLM"
  • Processes individual chunks
  • Returns structured results
  • Main LLM synthesizes from results
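
A minimal sketch, assuming a generic callSubagent helper (hypothetical; it stands in for whatever dispatches a single LLM request): each sub-LLM call covers one chunk and returns a structured result the main LLM can synthesize from.

typescript
// Hypothetical sketch: `callSubagent` is assumed, not a real Tachikoma API.
interface SubLLMResult {
  success: boolean;
  result?: string;
  error?: string;
}

declare function callSubagent(opts: { prompt: string }): Promise<string>;

async function subLLM(task: string, chunk: string): Promise<SubLLMResult> {
  try {
    const result = await callSubagent({ prompt: `${task}\n\n${chunk}` });
    return { success: true, result };
  } catch (err) {
    return { success: false, error: String(err) };
  }
}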

Performance Results

Metric                 Result
Context scaling        100x beyond context windows
Accuracy improvement   28.3% over base model
Quality                Approaches GPT-5 on long-context tasks
Computation            Limited; scales efficiently

Limitations from Paper

  1. Sequential processing - Chunks processed one at a time
  2. Manual chunking - No semantic boundary detection
  3. Fixed chunk size - No adaptive sizing based on content

Related Insights

  • Position Bias: LLMs pay more attention to tokens at the start and end of the context
  • Tool-Augmented LLMs: Tools add latency but improve accuracy
  • Modularity: Smaller, focused components work better than large monolithic prompts
  • Verification Loops: Reflection after execution improves quality

Implementation Notes

Tachikoma's RLM implementation extends the MIT paper with:

  1. Adaptive chunking - Semantic boundary detection (JSON objects, Markdown headings, code functions)
  2. Parallel processing - Process 5 chunks concurrently in waves (see the sketch after this list)
  3. Plugin system - Native opencode integration for tool discovery
  4. Environment variables - Testing and control
  5. MCP integration - Uses tachikoma-mcp_enhanced_rlm_process with local fallback
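
A minimal sketch of the wave-based parallelism from item 2, assuming a generic per-chunk worker (names are illustrative):

typescript
// Hypothetical sketch: process chunks in waves of `maxConcurrentChunks`,
// awaiting each wave before starting the next.
async function processInWaves(
  chunks: string[],
  worker: (chunk: string) => Promise<string>,
  maxConcurrentChunks = 5,
): Promise<string[]> {
  const results: string[] = [];
  for (let i = 0; i < chunks.length; i += maxConcurrentChunks) {
    const wave = chunks.slice(i, i + maxConcurrentChunks);
    results.push(...(await Promise.all(wave.map(worker))));
  }
  return results;
}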

MCP-Enhanced Processing

When tachikoma-mcp_enhanced_rlm_process is available:

typescript
// MCP handles processing with hierarchical indexing
const mcpResult = await globalThis.mcpTools.enhancedRLMProcess({
  content: largeContext,
  query: userRequest,
  use_hierarchical_indexing: true,
  chunk_strategy: "semantic",
});

// Fallback to local processing if MCP unavailable

Performance:

  • With MCP: O(log N) retrieval, semantic chunking
  • Without MCP: Fixed-size chunking, linear scan
  • Improvement: Better semantic coherence, faster queries
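
To make the O(log N) claim concrete, here is a hypothetical sketch of hierarchical indexing (not Tachikoma's actual data structures): chunk summaries form a tree, and a query descends one branch per level instead of scanning every chunk.

typescript
// Hypothetical sketch: a query visits one node per level (O(log N) for N
// chunks) instead of scanning all chunks linearly.
interface IndexNode {
  summary: string;
  children: IndexNode[]; // empty at leaves, which point at raw chunks
  chunkId?: number;
}

function descend(
  root: IndexNode,
  pick: (children: IndexNode[]) => IndexNode, // e.g. most relevant to the query
): number | undefined {
  let node = root;
  while (node.children.length > 0) {
    node = pick(node.children);
  }
  return node.chunkId;
}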

Configuration

RLM handler configuration in src/plugin/tachikoma/rlm-handler.ts:

typescript
interface RLMConfig {
  chunkSize: number;             // Target tokens per chunk (50K)
  maxConcurrentChunks: number;   // Parallel waves (5)
  semanticBoundaries: string[];  // What defines chunk boundaries
  enableAdaptiveChunking: boolean;
  enableParallelProcessing: boolean;
  recursionDepth: number;
}
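
For illustration, a config object with plausible values (the 50K and 5 come from the comments above; the boundary identifiers and the recursionDepth value are assumptions, not the strings rlm-handler.ts actually expects):

typescript
// Example values only; boundary identifiers are illustrative.
const config: RLMConfig = {
  chunkSize: 50_000,
  maxConcurrentChunks: 5,
  semanticBoundaries: ["json_object", "markdown_heading", "code_function"],
  enableAdaptiveChunking: true,
  enableParallelProcessing: true,
  recursionDepth: 1, // assumed default: one level of sub-LLM calls
};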

See: docs/capabilities/rlm.md for Tachikoma implementation details
See: docs/internals/mcp-integration.md for MCP server integration


Last Updated: 2026-02-20
