Researchers grow a hypothesis tree for AI coding agents

Further, progress should not depend on human overseers regularly stepping in to dictate logical next steps or interpret the meaning of previous trials, they noted. To be truly autonomous, agentic research frameworks must maintain connections between experiments, data, results, and failures over time.

Arbor is built to fulfill three system requirements. First, it must be able to branch as sub-trees test out competing hypotheses that are all potentially plausible. At the same time, unrestricted branching can degenerate the whole framework, so that must be controlled to remain organized. The researchers call this “branching with coherence.”

Second, the infrastructure must separate local execution from overarching strategy. Testing out single hypotheses requires short-horizon tasks like editing, debugging, and evaluation. But these should not “obscure” the larger tree making decisions based on evidence gathered across the whole run.

Source link