This is the dotrepo-side synthesis of the interview round. The companion essay on MaxwellSantoro.com covers the broader framing and why this experiment was worth running at all.

The Experiment

dotrepo is an open metadata protocol for software repositories. It is designed for three audiences: maintainers, human users, and AI agents. Since AI agents are a first-class consumer of the protocol, it made sense to ask them directly what they want from it.

I sent the same 10-question prompt to 12 AI model sessions across 8 providers. Fresh conversations, no priming, no pep talk. The goal was not affirmation. The goal was pressure testing: what feels obviously useful, what is missing, what looks risky, and what would actually make an agent check dotrepo first.

The answers converged much harder than expected. That convergence is the signal.

Models Interviewed

| Model | Provider | Notes |
| --- | --- | --- |
| ChatGPT 5.4 Thinking | OpenAI | Logged-in session |
| ChatGPT 5.4 Thinking | OpenAI | Incognito session |
| Claude Opus 4.6 Extended | Anthropic | Logged-in session |
| Claude Opus 4.6 Extended | Anthropic | Incognito session |
| Gemini Pro 3.1 | Google | Fresh conversation |
| Gemini Thinking 3.1 | Google | Fresh conversation |
| Grok Expert 4.20 | xAI | Logged-in session |
| Grok Expert 4.20 | xAI | Incognito session |
| GLM-5 | Zhipu AI | Fresh conversation |
| Hunter Alpha | OpenRouter | Fresh conversation |
| MiniMax M2.5 | MiniMax | Fresh conversation |
| Nemotron 3 Super | NVIDIA | Fresh conversation |

Where possible, ChatGPT and Grok were grounded against the live repo and public site, and ChatGPT, Claude, and Grok were each tested in more than one session shape to check whether the takeaways were stable.

Consensus Findings

1. Build and test commands are the sharpest pain point

This was unanimous. Every model described “how do I actually build and test this repo?” as the most expensive reasoning step in unfamiliar codebases. The hard part is not inferring the language. It is getting from the presence of build files to the exact, correct command with the right flags, prerequisites, and side effects.

The gap between “I see a build file” and “I know the correct command including flags and prerequisites” is where most wasted effort lives.

The implication for dotrepo is direct: structured build and test metadata is not ornamental. It is the highest-value field family in the protocol.
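
To make that concrete, here is a minimal sketch, in TypeScript, of what structured build and test metadata could look like from the consumer side. The field names and record shape are illustrative assumptions, not the published dotrepo schema.

```typescript
// Hypothetical record shape for illustration; not the published dotrepo schema.
interface CommandSpec {
  command: string;          // exact invocation, e.g. "cargo test --workspace"
  cwd?: string;             // directory to run from, relative to the repo root
  prerequisites?: string[]; // e.g. ["rust >= 1.75", "protoc"]
}

interface RepoBuildMetadata {
  build?: CommandSpec;
  test?: CommandSpec;
}

// The expensive step this removes: going from "I see a Cargo.toml" to the
// exact, correct command. With structured metadata it becomes a lookup.
function resolveTestCommand(record: RepoBuildMetadata): string | undefined {
  return record.test?.command; // no heuristics, no guessed flags
}
```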

2. The overlay index is the wedge

All 12 models independently identified the public overlay index as dotrepo’s smartest near-term design move. It breaks the adoption trap that kills most metadata standards by making the protocol useful before maintainers opt in.

That view was not abstract. Several models gave concrete coverage thresholds for when dotrepo would flip from “nice to check” to “I check this first.” The shared theme: the protocol is already coherent; the missing ingredient is enough reviewed data that checking dotrepo is usually cheaper than not checking it.
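
Mechanically, "check dotrepo first" could be as small as the sketch below. The /v0/repos/index.json path matches the public surface described later in this piece; the origin constant and the response shape are assumptions.

```typescript
// "Check the overlay index first" as a concrete step. The index path is the
// one on dotrepo's public surface; the origin and response shape are assumed.
const DOTREPO_ORIGIN = "https://dotrepo.example"; // placeholder origin

async function hasReviewedOverlay(repoUrl: string): Promise<boolean> {
  const res = await fetch(`${DOTREPO_ORIGIN}/v0/repos/index.json`);
  if (!res.ok) return false; // index unavailable: fall back to raw inspection
  const index: { repos?: { url: string }[] } = await res.json();
  return (index.repos ?? []).some((entry) => entry.url === repoUrl);
}
```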

3. Trust and provenance is the moat

The distinction between maintainer-declared facts, imported facts, and inferred facts was universally praised as the genuinely differentiated part of the project. Models described using that metadata to change both their language and their behavior depending on how a fact was sourced.

That is exactly the behavior dotrepo is trying to induce, and it is already visible on the live public surface at /v0/repos/index.json and in trust-aware queries such as this repository field query.
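
As a sketch of what trust-aware behavior might look like in practice: the three provenance levels below come from the protocol as described here, while the policy an agent attaches to each level is an illustrative assumption.

```typescript
// The three provenance levels are dotrepo's; the reaction policy below is an
// illustrative agent-side choice, not part of the protocol.
type Provenance = "maintainer-declared" | "imported" | "inferred";

function executionPolicy(p: Provenance): "run" | "verify-then-run" | "treat-as-hint" {
  switch (p) {
    case "maintainer-declared":
      return "run"; // highest trust: execute as stated
    case "imported":
      return "verify-then-run"; // cross-check against the repo first
    case "inferred":
      return "treat-as-hint"; // keep the skepticism a guess deserves
  }
}
```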

4. Stale metadata is the most dangerous failure mode

This was also unanimous. Every model made some version of the same argument: stale trusted metadata is worse than no metadata, because it suppresses the skepticism that would otherwise push an agent back toward the source materials.

That is why freshness is first-class on dotrepo’s public surface. Every response carries snapshot freshness and digest metadata, and the top-level meta document at /v0/meta.json exists specifically so agents and operators can reason about staleness instead of pretending it away.
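
A minimal freshness gate might look like the following sketch. The /v0/meta.json endpoint is the one named above; the snapshot field names and the 30-day threshold are assumptions for illustration.

```typescript
// Freshness gate sketch. The endpoint is real; the fields are assumed.
const MAX_SNAPSHOT_AGE_DAYS = 30;

async function snapshotIsFresh(origin: string): Promise<boolean> {
  const res = await fetch(`${origin}/v0/meta.json`);
  if (!res.ok) return false;
  const meta: { snapshot?: { generatedAt?: string } } = await res.json();
  const generatedAt = meta.snapshot?.generatedAt;
  if (!generatedAt) return false; // no freshness signal: treat as stale
  const ageDays = (Date.now() - Date.parse(generatedAt)) / 86_400_000;
  return ageDays <= MAX_SNAPSHOT_AGE_DAYS; // false means: go back to the source
}
```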

5. Keep the core schema brutally small

Every model warned against schema bloat. The useful framing was not “small is elegant.” It was “small is how this survives.” The winning version of dotrepo answers a short list of high-value questions reliably: what this repo is, how it builds, how it tests, where the real docs are, who owns it, and what trust level attaches to each answer.
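
Encoded as a sketch, that short list fits in one record type. The six questions are the ones above; this particular interface is an assumption, not the published schema.

```typescript
// One possible encoding of the minimal core; not the published dotrepo schema.
type Provenance = "maintainer-declared" | "imported" | "inferred";

interface CoreRecord {
  description: string;               // what this repo is
  build?: { command: string };       // how it builds
  test?: { command: string };        // how it tests
  docs?: { url: string }[];          // where the real docs are
  owners?: string[];                 // who owns it
  trust: Record<string, Provenance>; // trust level attached to each answer
}
```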

Strongest Criticisms

  1. The project scope is still very ambitious for one repo. The protocol, toolchain, public API, claim workflows, and deployment story are all real now. That is impressive, but it also means the ratio of infrastructure to adoption is something to watch closely.
  2. Plain-string build commands are not enough. Multiple models wanted prerequisites, environment requirements, platform constraints, and an explicit “safe for agent execution?” shape; one possible encoding is sketched after this list.
  3. Monorepo and workspace semantics remain an obvious gap. The repos where metadata is most painful are often exactly the repos where workspace structure matters most.
  4. Record-level trust is not always granular enough. Several models argued that identity may be maintainer-declared while build commands are imported and docs topology is inferred. That pressure toward field-level provenance is real even if it does not need to land immediately.
  5. The MCP server still lacks remote lookup. The hosted HTTP surface already supports predictable repo-first lookup, but the MCP layer still requires local context for most workflows.
  6. The index is still too small to change behavior by default. Five reviewed overlays proves the architecture. It does not yet create the habit loop where an agent expects dotrepo coverage on arbitrary open-source repos.
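
Picking up criticism 2 from the list above, one possible richer command shape might look like this. Every field name is an assumption layered on the categories the models asked for.

```typescript
// A richer command shape for criticism 2; all field names are assumptions.
interface ExecutableCommand {
  command: string;                               // exact invocation
  prerequisites?: string[];                      // tools that must exist first
  env?: Record<string, string>;                  // required environment variables
  platforms?: ("linux" | "macos" | "windows")[]; // platform constraints
  safeForAgentExecution?: boolean;               // explicit opt-in for unattended runs
  sideEffects?: string[];                        // e.g. "writes ./dist", "needs network"
}
```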

Missing MCP Operations

| Operation | Description | Models Requesting |
| --- | --- | --- |
| dotrepo.lookup | Remote query by repository URL without a local clone | 6 / 12 |
| dotrepo.diff / dotrepo.staleness | Compare overlay expectations against current repo state | 6 / 12 |
| dotrepo.batch_query | Resolve multiple fields or repositories in one call | 5 / 12 |
| dotrepo.suggest | Propose fields for incomplete or newly imported records | 4 / 12 |
| dotrepo.evidence | Show why a specific field has the value it has | 3 / 12 |

The clear front-runner is remote lookup. The public origin already supports the lookup pattern structurally. What is missing is the MCP operation that makes that path zero-friction inside agent tooling.
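
For a sense of how small that gap is, here is a sketch of dotrepo.lookup as an MCP tool using the public MCP TypeScript SDK. Only the operation name comes from the table above; the hosted endpoint path and response handling are assumptions.

```typescript
// dotrepo.lookup sketched as an MCP tool. The endpoint path is assumed; the
// tool name uses an underscore because some clients restrict names to [a-zA-Z0-9_-].
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";

const server = new McpServer({ name: "dotrepo", version: "0.0.1" });

server.tool(
  "dotrepo_lookup",
  { repoUrl: z.string().url() },
  async ({ repoUrl }) => {
    // Hypothetical repo-first lookup against the hosted HTTP surface.
    const url = `https://dotrepo.example/v0/repos/lookup?url=${encodeURIComponent(repoUrl)}`;
    const res = await fetch(url);
    const text = res.ok ? await res.text() : `no overlay found for ${repoUrl}`;
    return { content: [{ type: "text" as const, text }] };
  }
);

await server.connect(new StdioServerTransport());
```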

Risk Warnings

Synthesis: The Three Things That Matter Most

1. Seed the index. The protocol and hosting surface are ahead of the data. The near-term job is not more architecture. It is more reviewed overlays covering the repos agents actually encounter.

2. Build remote lookup. The hosted HTTP layer already proves the contract. The MCP gap is now the highest-leverage toolchain gap.

3. Protect the minimal core. dotrepo should answer a short list of essential repo questions with explicit provenance and freshness. Everything else should face a very high bar for inclusion.

The trust model is the moat. The overlay index is the wedge. The small schema is the survival constraint.

Methodology Notes

Where This Feeds Back Into dotrepo

The repo-side synthesis and backlog changes live in docs/ai-tool-interviews.md and the post-v1 backlog. The public site now carries this write-up because it is not just internal planning context. It is one of the clearest pieces of product evidence behind the current roadmap.

If you are building an AI coding tool and want to integrate dotrepo, or if you maintain a popular open-source project and want to correct or replace your overlay record, start at github.com/maxwellsantoro/dotrepo.