Pre-Earnings Research: 160 Hours to Automation
Earnings prep takes 160 analyst-hours per quarter. Brooker Belcourt's four-level framework cuts it to one hour of review. Here is what Level 5 looks like.
TL;DR
- Earnings prep for a 30-position fund takes roughly 160 analyst-hours per quarter
- Brooker Belcourt’s published framework (Every Inc., March 2026) compresses that to ~1 hour of review at Level 4
- The jump from Level 2 to Level 4 requires ~100 hours of coding skill investment per analyst
- Level 4 has real limitations: fragile scheduling, ephemeral output, single-user access
- Level 5 (production infrastructure) solves these with persistent overnight runs, multi-analyst access, and parallel processing
- Data accuracy with structured sources (Daloopa benchmark): 89-91%. Web-sourced: 20-71%. The data source matters more than the model.
160 Hours Per Quarter on Assembly, Not Analysis
Earnings prep at a fund covering 30 positions takes roughly 160 hours per quarter. That is one analyst, full time, for a month, doing nothing but assembling revenue trends, margin evolution, segment data, and management commentary.
Brooker Belcourt, who ran finance at Perplexity and now leads financial services consulting at Every, compressed that to about an hour of review. His published framework for how he got there reveals where most teams get stuck.
After reaching what he calls Level 4, Belcourt says the 160 hours compressed to “the time it takes Claude Code to run, plus the hour or so I spend reviewing and adding perspective.”
The pre-earnings research automation framework is worth understanding in detail, because it maps the path most funds will need to follow.
The Four Levels of Earnings Automation
Level 1: ChatGPT with a custom prompt. Most analysts start here. Upload an earnings release or a 10-Q, give it a detailed prompt asking for beat/miss analysis, margin trends, guidance changes, and key transcript quotes. The output is decent for a first pass. The limitation: an 8,000-character prompt ceiling, no connection to external data, and every analysis trapped inside a chat window that disappears when you close the tab.
Level 2: Claude with skills and data connections. The analyst connects Claude to external data via MCP (Model Context Protocol). Daloopa provides structured financials hyperlinked to source documents. Multiple analysis templates run simultaneously: an earnings preview skill, an investment philosophy skill, and a dashboard formatting skill. The output improves dramatically because the model works with real financial data instead of whatever the analyst pasted into the prompt. The ceiling: still a manual workflow. Someone has to trigger each analysis.
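For reference, MCP data sources in Claude Desktop are declared in a JSON config file. The sketch below shows the general shape only; the server name, package, and environment variable are illustrative placeholders, not Daloopa's actual connector:

```json
{
  "mcpServers": {
    "fundamentals": {
      "command": "npx",
      "args": ["-y", "example-fundamentals-mcp"],
      "env": { "PROVIDER_API_KEY": "<your key>" }
    }
  }
}
```

Once a server like this is registered, the model can query structured financials directly instead of relying on pasted text.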
Level 3: Claude Cowork with local files. This is where the analyst’s own data enters the picture. Excel models, internal notes, prior research memos, call transcripts. Claude reads across these local files and connects them to the external data sources. You can ask it to compare your projections against consensus estimates, or flag where management commentary contradicts your thesis from last quarter. The ceiling: each task produces a separate output. No unified view.
Level 4: Claude Code with a custom dashboard. A single command triggers the entire workflow. Claude Code runs continuously, pulling data from multiple MCPs, reading internal files, generating analysis, and assembling everything into an interactive Streamlit dashboard with charts, tables, and formatted commentary. Revenue trajectories, margin evolution, segment breakdowns, management guidance comparisons. All from one command.
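Stripped of the Streamlit layer, the one-command workflow at Level 4 has a simple shape: pull structured fundamentals, compute the comparisons, and hand the results to a renderer. A minimal sketch in plain Python, where `fetch_quarters` and its fields are stand-ins for an MCP-backed data pull, not any provider's real API:

```python
from dataclasses import dataclass

@dataclass
class Quarter:
    revenue: float          # USD millions
    operating_income: float

def fetch_quarters(ticker: str) -> dict[str, Quarter]:
    """Stand-in for a structured-data pull (e.g. via an MCP connector)."""
    # Hardcoded sample data; a real run would query the provider.
    return {
        "Q4-2024": Quarter(revenue=12_000, operating_income=2_400),
        "Q4-2025": Quarter(revenue=14_400, operating_income=3_240),
    }

def earnings_brief(ticker: str) -> dict:
    """Compute the comparisons a dashboard would render for one position."""
    q = fetch_quarters(ticker)
    prior, latest = q["Q4-2024"], q["Q4-2025"]
    return {
        "ticker": ticker,
        "revenue_yoy_pct": round(100 * (latest.revenue / prior.revenue - 1), 1),
        "op_margin_pct": round(100 * latest.operating_income / latest.revenue, 1),
        "op_margin_change_pts": round(
            100 * (latest.operating_income / latest.revenue
                   - prior.operating_income / prior.revenue), 1),
    }

print(earnings_brief("EXMP"))
```

The point of the sketch is the division of labor: the model orchestrates and comments, while the arithmetic comes deterministically from structured data.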
At Level 4, Belcourt claims quarterly earnings prep compresses from 160 hours to roughly the compute time plus one hour of review: a 99% reduction in analyst labor.
Where Most Teams Get Stuck
The jump from Level 1 to Level 2 is straightforward. Connect a data source, write better prompts. Most technically curious analysts can get there in a few days.
The jump from Level 2 to Level 4 is where most teams stall. Belcourt himself puts the learning curve at roughly 100 hours to feel confident with the coding tools. His $5,000 one-day workshop exists precisely because that 100-hour barrier stops most people.
And even at Level 4, there are practical limitations that matter for institutional deployment:
Scheduling is fragile. Belcourt’s own Twitter activity from March 2026 shows him posting screenshots of errors while trying to get Perplexity scheduled tasks to work. He tagged @perplexity_ai directly. Cloud-based scheduled tasks on Anthropic’s platform are limited to three daily sessions on the highest plan tier, and known MCP connector bugs prevent some data sources from loading in scheduled runs. Desktop scheduling requires the analyst’s machine to be awake and Claude Desktop to be open. Close the laptop lid and the run stops.
Output is ephemeral. Streamlit dashboards generated by Claude Code do not persist between sessions. Each run starts from scratch. There is no database storing yesterday’s analysis for comparison with today’s.
It is single-user. The dashboard runs on localhost. One analyst, one machine. Sharing the output means taking a screenshot or exporting a PDF. There is no URL a colleague can open to see the same view.
The Accuracy Question
Before automating earnings research, a team needs to know whether the output can be trusted. A February 2026 benchmark study by Daloopa tested three frontier AI models (Claude Opus 4.5, GPT-5.2, and Gemini 3 Pro) on 500 financial questions.
| Data Source | Accuracy Range |
|---|---|
| Web search | 20% to 71% (varies by model and question type) |
| Structured data (Daloopa) | 89% to 91% |
The takeaway is not that AI models are unreliable. The data source determines the quality of the output. An agent pulling structured fundamentals from a provider like Daloopa, Fiscal AI, or FactSet produces dramatically more accurate analysis than an agent searching the web for financial data. The model matters less than what you feed it.
For practical purposes: any automated earnings research system should ground itself in structured data feeds, not web scraping. The 89-91% accuracy range on structured data is high enough to be useful as a first pass that an analyst reviews and refines. The 20-71% range on web-sourced data is too unreliable to trust, even as a starting point.
What Level 5 Looks Like
Belcourt’s framework stops at Level 4: a single analyst running Claude Code with a custom dashboard on their own machine. Institutional requirements go further.
A production pre-earnings research system for a fund with 20 to 50 positions needs:
Persistent overnight scheduling. Agents run at 3 AM, pull data from multiple sources, generate analysis for every position in the portfolio, and have formatted briefs waiting before market open. Not three runs per day. Not dependent on a laptop being awake. Every position, every morning, without exception.
Multi-company, multi-analyst access. The same analysis available to every team member through a web URL. No local installation. No 100-hour learning curve. Open the page, read the research, start working.
Parallel processing. Analyze 25 companies simultaneously, not sequentially. Earnings season is dense. Running positions one at a time means the early analyses are stale before the later ones finish.
Grounded analysis. Every number in the output links to a structured data source. Not “revenue grew approximately 48%.” Instead: “Google Cloud revenue was $17.7 billion in Q4, per Fiscal AI, representing 48% year-over-year growth.” The analyst can verify any data point in seconds.
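Two of these requirements, parallel processing and grounded analysis, can be sketched together in a few lines. Assuming a per-ticker analysis function (stubbed here; a real one would call the data provider, and the URL shown is hypothetical), a thread pool fans out across the book while every figure carries its source:

```python
from concurrent.futures import ThreadPoolExecutor

def analyze(ticker: str) -> dict:
    """Stub for one position's earnings prep; a real version hits the data feed."""
    # Every figure is stored alongside its source so an analyst can verify it.
    return {
        "ticker": ticker,
        "metric": "cloud_revenue_usd_bn",
        "value": 17.7,                                           # illustrative number
        "source": f"https://example-provider.test/{ticker}/q4",  # hypothetical URL
    }

def run_book(tickers: list[str], workers: int = 8) -> list[dict]:
    # Fan out across the whole portfolio instead of looping sequentially,
    # so position 25 is as fresh as position 1.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, tickers))

briefs = run_book(["AAA", "BBB", "CCC"])
print(len(briefs), "briefs ready")
```

Because the work is I/O-bound (waiting on data feeds and model calls), a thread pool is the natural fit; the same shape scales from 3 tickers to 50.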
The infrastructure for this exists today. MCP connects to institutional data providers. Cron scheduling on dedicated servers runs without laptop dependencies. The 160-hour quarterly workflow compresses not just for one analyst on one machine, but for an entire research team across every position they cover.
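On the scheduling side, a dedicated server removes the laptop dependency entirely. As a rough sketch (script path and log location are placeholders), a single crontab entry covers the overnight run:

```shell
# Kick off the full-portfolio earnings-prep pipeline at 03:00, Monday-Friday,
# on an always-on server: no laptop lid, no desktop app, no session caps.
0 3 * * 1-5  /usr/bin/python3 /opt/research/run_book.py >> /var/log/earnings_prep.log 2>&1
```

This is the contrast with Level 4's fragile scheduling: standard server tooling, unchanged for decades, with no dependency on a consumer app staying open.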
The Real Question
The technology works. The data accuracy with structured sources is high enough. The scheduling, multi-company, and web-access layers are engineering problems with known solutions.
The question for most investment teams is not whether this is possible. It is whether they want to spend 100 hours per analyst teaching everyone to build it themselves, or whether they want the infrastructure delivered and maintained while their analysts focus on what they were hired to do: make investment decisions.
Belcourt’s four-level framework is the right map. But the destination is not “every analyst becomes a Claude Code expert.” The destination is “every analyst arrives at 7 AM and the research is already done.”
FAQ
How long does earnings prep take without automation?
Roughly 160 analyst-hours per quarter for a fund covering 30 positions. That covers assembling revenue trends, margin evolution, segment data, and management commentary for each position. It is one analyst working full-time for a month.
What is Brooker Belcourt’s four-level framework?
A progression from basic ChatGPT prompts (Level 1) through data-connected analysis (Level 2), local file integration (Level 3), to a fully automated Claude Code dashboard (Level 4). Published by Belcourt through Every Inc. in March 2026, based on his experience running finance at Perplexity.
How accurate is AI-generated financial analysis?
Depends entirely on the data source. A February 2026 Daloopa benchmark of 500 financial questions across three frontier models showed 89-91% accuracy with structured data feeds and only 20-71% accuracy with web search. Structured data sources (Daloopa, FactSet, Fiscal AI) produce dramatically better results.
What are the limitations of Level 4 (Claude Code dashboard)?
Three main constraints: scheduling depends on the analyst’s machine being awake, output does not persist between sessions, and dashboards run on localhost with no multi-user access. These are the gaps that institutional (Level 5) infrastructure addresses.
What does a production Level 5 system require?
Persistent overnight scheduling (not laptop-dependent), multi-analyst web access (no local setup needed), parallel processing across the full portfolio, and grounded analysis with source-linked data points. The underlying technology and connectors exist; the gap is the integration and deployment engineering.
Sources:
- Every Inc., "Build Your Own Bloomberg Terminal With AI" (Brooker Belcourt, March 2026)
- Daloopa AI Accuracy Benchmark (February 2026; 500 questions, 3 frontier models)
- Brooker Belcourt on Twitter, @BrookerBelcourt (March 16-17, 2026; scheduling issues)
- Claude Help Center, "Schedule recurring tasks" (plan limitations)
- Anthropic Claude Code documentation (cloud scheduling limits)
Last updated: April 14, 2026
By BetterAI | We build custom AI research infrastructure for European investment firms.