Probably raises $9 million seed round from Andreessen Horowitz for data tool that validates LLM outputs
The startup’s first product is a data science tool that pairs LLM answers with citations, audit trails, and a deterministic validator to reduce hallucinations.
Probably raised $9 million in seed funding from Andreessen Horowitz for a data science tool designed to limit hallucinations and factual errors in large-language-model outputs. The company’s system runs an LLM’s first-pass answers through a deterministic validator that checks results against the underlying dataset and returns mismatched outputs for correction.
The LLM itself has been trained against this validator, and the full harness is optimized for speed and accuracy, TechCrunch reported.
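The validate-and-correct loop described above can be illustrated with a minimal sketch. Probably has not published its implementation, so everything here is an assumption made for illustration: the toy `SALES` dataset, the function names `validate_total` and `answer_with_validation`, the retry limit, and the `stub_model` that stands in for a real LLM call.

```python
from typing import Callable, Optional, Tuple

# Hypothetical toy dataset standing in for the user's underlying data.
SALES = [
    {"region": "east", "amount": 120},
    {"region": "west", "amount": 80},
    {"region": "east", "amount": 50},
]


def validate_total(region: str, claimed: int) -> Tuple[bool, int]:
    """Deterministic check: recompute the total directly from the dataset."""
    actual = sum(row["amount"] for row in SALES if row["region"] == region)
    return claimed == actual, actual


def answer_with_validation(
    region: str,
    model: Callable[[str, Optional[str]], int],
    max_attempts: int = 3,
) -> Tuple[int, int]:
    """Ask the model, validate the answer, and send mismatches back for correction.

    Returns the accepted answer and the number of attempts it took.
    """
    feedback: Optional[str] = None
    for attempt in range(1, max_attempts + 1):
        claimed = model(region, feedback)
        ok, _actual = validate_total(region, claimed)
        if ok:
            return claimed, attempt
        # The mismatch is returned to the model rather than shown to the user.
        feedback = f"claimed total {claimed} does not match the dataset"
    raise ValueError(f"no validated answer after {max_attempts} attempts")


def stub_model(region: str, feedback: Optional[str]) -> int:
    """Stand-in for an LLM call: wrong on the first pass, right after feedback."""
    return 170 if feedback else 160


print(answer_with_validation("east", stub_model))  # (170, 2)
```

In this sketch the validator never trusts the model's arithmetic. It recomputes the figure from the data, so a wrong first-pass answer is caught before it reaches the user.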
Founder Peter Elias said the approach allows Probably to use a model four classes weaker than current frontier systems. The current version runs on local hardware such as a desktop computer, cutting token costs that have risen for many users. Elias described the engineering as a “data science mech suit” that reduces ambiguity so the model does not have to work as hard.
“What we learned building this was that the better your harness engineering is, the weaker the model can be,” he said. Each answer produced by the tool includes a citation and an audit trail showing how the result was generated. TechCrunch reported that the same engine can be extended to other precision-sensitive fields such as accounting or medical services.
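One way to picture an answer that carries its own citation and audit trail is a small record type that travels with the result. This is a hedged sketch, not Probably's actual data model: the `AuditedAnswer` class, its field names, and the `mean_with_audit` helper are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AuditedAnswer:
    value: float
    citation: List[int]  # indices of the source rows the answer relies on
    audit_trail: List[str] = field(default_factory=list)  # steps taken, in order


def mean_with_audit(rows: List[dict], column: str) -> AuditedAnswer:
    """Compute a column mean while recording which rows were used and how."""
    used = [i for i, row in enumerate(rows) if row.get(column) is not None]
    if not used:
        raise ValueError(f"no rows contain a value for '{column}'")
    trail = [f"selected {len(used)} of {len(rows)} rows with a value for '{column}'"]
    total = sum(rows[i][column] for i in used)
    trail.append(f"summed '{column}' to {total}")
    value = total / len(used)
    trail.append(f"divided by {len(used)} to get {value}")
    return AuditedAnswer(value=value, citation=used, audit_trail=trail)


result = mean_with_audit([{"x": 2}, {"x": None}, {"x": 4}], "x")
print(result.value)     # 3.0
print(result.citation)  # [0, 2]
```

Because the cited row indices and the recorded steps are produced by the same code that computes the value, a reviewer can retrace the result without rerunning the model.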
Elias noted that major AI labs have not pursued similar systems. “They’re incentivized not to, because they make money the more times you have to correct the model,” he said.

