FIND-20260324-017 · 2026-03-24 · Innovation Watch
RLM (Recursive Language Models) — Data Scientist Agent Embedded in Programs via DSPy SandboxSerializable
tool
MEDIUM
Kevin Madura (@kmad) shares his March 22, 2026 blog post introducing a pattern for embedding LLM-based data analysis agents directly into programs using Recursive Language Models (RLMs). The approach extends DSPy's SandboxSerializable protocol to expose DataFrames to a persistent, REPL-based LLM loop: the model iterates, writes code, inspects the results, and recurses until the analysis is complete. On DABench (257 questions, 68 CSVs), Qwen 3.5 397B reaches 86.8% accuracy with an average of 2.8 iterations. The upstream library (alexzhang13/rlm; 3,177 stars, MIT, Python) provides plug-and-play inference with Docker and Modal sandbox support.
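The write-code, inspect, repeat loop described above can be sketched in a few lines. This is a minimal illustration, not the blog post's or the rlm library's implementation: `scripted_model` is a hypothetical stand-in for the LLM call, a list of row dicts stands in for a DataFrame, the `FINAL` variable convention is an assumption, and plain `exec` is used where a real deployment would run the code in a Docker or Modal sandbox. Recursive sub-calls to another model are omitted.

```python
import contextlib
import io

# Stand-in for a DataFrame: a list of row dicts (stdlib only, sample data).
ROWS = [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": 80.0},
    {"region": "north", "revenue": 100.0},
]


def scripted_model(question, history):
    """Hypothetical stand-in for an LLM call: returns the next code snippet."""
    if not history:
        # First turn: inspect the data before computing anything.
        return "print(sorted(rows[0].keys()))"
    # Second turn: compute the answer and signal completion by setting FINAL.
    return "FINAL = sum(r['revenue'] for r in rows if r['region'] == 'north')"


def run_repl_loop(question, rows, model, max_iterations=5):
    """Run model-written snippets in one persistent namespace until FINAL is set."""
    namespace = {"rows": rows}  # survives across iterations, like a REPL session
    history = []  # (code, observation) pairs fed back to the model
    for iteration in range(1, max_iterations + 1):
        code = model(question, history)
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, namespace)  # NOT isolated; a real setup uses a sandbox
            observation = buffer.getvalue()
        except Exception as exc:
            # Errors are returned to the model as observations, not raised.
            observation = f"{type(exc).__name__}: {exc}"
        history.append((code, observation))
        if "FINAL" in namespace:
            return namespace["FINAL"], iteration
    raise RuntimeError("no final answer within the iteration budget")


answer, iterations = run_repl_loop(
    "Total revenue for the north region?", ROWS, scripted_model
)
print(answer, iterations)
```

The persistent `namespace` is what lets a later snippet build on variables defined by an earlier one, and the iteration counter corresponds to the "average iterations" figure reported in the benchmark.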
Source
https://x.com/kmad/status/2035790703180005507?s=46
ODS Impact
Relevant to the ODS Data Platform Zero-ETL layer and to any future analytics automation within the platform. The RLM pattern, which embeds a REPL-looping LLM into data workflows, could serve as the intelligence layer above ClickHouse for ad-hoc cohort analysis, anomaly detection, or report generation without custom ETL code. The DSPy SandboxSerializable protocol is directly applicable to any service that produces DataFrames from ClickHouse queries (Billing Engine analytics, Metabase supplement). The upstream RLM library (MIT) is production-ready with Docker sandbox isolation. Not an immediate P0/P1 dependency, but a strong candidate for the P2 ClickHouse + Metabase phase.
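To make the serialization idea concrete, here is a rough sketch of how a query result could be handed to a sandboxed REPL. It does not reproduce DSPy's actual SandboxSerializable interface: the `SandboxPayload` protocol, the `QueryResult` class, and the method names `to_payload` and `setup_code` are all hypothetical, and the sample rows are made up. A fresh dict passed to `exec` stands in for the sandbox namespace.

```python
import json
from typing import Protocol


class SandboxPayload(Protocol):
    """Hypothetical interface; NOT DSPy's SandboxSerializable definition."""

    def to_payload(self) -> str: ...

    def setup_code(self, variable_name: str) -> str: ...


class QueryResult:
    """Hypothetical wrapper around tabular rows, e.g. from a ClickHouse query."""

    def __init__(self, columns, rows):
        self.columns = list(columns)
        self.rows = [list(row) for row in rows]

    def to_payload(self) -> str:
        # Plain JSON keeps the payload independent of the host process.
        return json.dumps({"columns": self.columns, "rows": self.rows})

    def setup_code(self, variable_name: str) -> str:
        # Code the sandbox runs once to rebuild the data under a known name.
        return (
            "import json\n"
            f"_payload = json.loads({self.to_payload()!r})\n"
            f"{variable_name} = [dict(zip(_payload['columns'], row)) "
            "for row in _payload['rows']]\n"
        )


result: SandboxPayload = QueryResult(
    columns=["customer", "amount"],
    rows=[["acme", 42.5], ["globex", 17.0]],
)

sandbox_namespace = {}  # stands in for the isolated REPL's globals
exec(result.setup_code("billing"), sandbox_namespace)
print(sandbox_namespace["billing"])
```

Shipping the data as a serialized payload plus setup code, rather than sharing live objects, is what allows the REPL to run in a separate container while the model still sees the data under a stable variable name.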
Security Review
License: MIT | Maintenance: ACTIVE | Risk: MEDIUM | Recommendation: USE_WITH_CAUTION
Tags
ai-agents
rlm
dspy
llm
data-analysis
clickhouse
python
sandbox
repl
adhoc
analytics
zero-etl