Three modes. One endpoint.
Pass depth as 1, 2, or 3. The same seven-layer pipeline runs at each depth; deeper modes loop more, decompose more, and synthesize at the end. Pick by question shape, not vendor preference.
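As a rough illustration of passing depth, the sketch below builds and validates a JSON request body. Only the `depth` parameter and its three allowed values come from the text above; the `query` field name and the `build_request_body` helper are assumptions made for the example, not the documented request schema.

```python
import json

# The three depths named in the docs.
VALID_DEPTHS = (1, 2, 3)


def build_request_body(query: str, depth: int = 1) -> str:
    """Serialize a request body for the single endpoint.

    `query` is a hypothetical field name; only `depth` is named by the docs.
    """
    # Reject bools explicitly, since True == 1 in Python.
    if not isinstance(depth, int) or isinstance(depth, bool) or depth not in VALID_DEPTHS:
        raise ValueError(f"depth must be 1, 2, or 3, got {depth!r}")
    return json.dumps({"query": query, "depth": depth})


if __name__ == "__main__":
    print(build_request_body("What is the boiling point of water at sea level?", depth=1))
```

Validating on the client side keeps an out-of-range depth from ever reaching the endpoint; how the service itself responds to an invalid depth is not stated here.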
How to pick
- If the answer is a single fact, use depth 1. Fast, cheap, cache-friendly.
- If the answer is multi-faceted but bounded, use depth 2. The reflection loop catches gaps a single pass would miss.
- If the deliverable is a report, use depth 3. The stopping AI decides when the question is answered; synthesis produces a coherent narrative across sources.
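The three rules above amount to a lookup from question shape to depth. A minimal sketch, assuming you classify the question yourself before calling the endpoint; `QuestionShape` and `pick_depth` are hypothetical names introduced for illustration.

```python
from enum import Enum


class QuestionShape(Enum):
    """Hypothetical labels for the three question shapes described above."""
    SINGLE_FACT = "single_fact"      # the answer is one fact
    MULTI_FACETED = "multi_faceted"  # several facets, but bounded
    REPORT = "report"                # the deliverable is a report


# Mapping taken directly from the "How to pick" rules.
DEPTH_FOR_SHAPE = {
    QuestionShape.SINGLE_FACT: 1,
    QuestionShape.MULTI_FACETED: 2,
    QuestionShape.REPORT: 3,
}


def pick_depth(shape: QuestionShape) -> int:
    """Return the depth recommended for a given question shape."""
    return DEPTH_FOR_SHAPE[shape]


if __name__ == "__main__":
    print(pick_depth(QuestionShape.MULTI_FACETED))
```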
Depth and the cache
Every depth benefits from the fact cache. Depth 1 checks the cache on the initial query. Depths 2 and 3 also check the cache for each decomposed sub-question inside the loop, so a tenant with a warm cache pays only cache-hit costs for the parts of a question it has already answered. See Fact Cache.
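The following toy model shows which lookups each depth makes and how a warm cache splits them into hits and misses. It is a sketch under stated assumptions, not the service's implementation: the cache is modeled as a plain set keyed on exact question strings, and `cache_lookups` and `split_hits` are hypothetical helpers.

```python
def cache_lookups(query: str, sub_questions: list[str], depth: int) -> list[str]:
    """Return the keys checked against the cache at a given depth.

    Depth 1 checks only the initial query; depths 2 and 3 also check
    each decomposed sub-question.
    """
    if depth == 1:
        return [query]
    return [query, *sub_questions]


def split_hits(lookups: list[str], cache: set[str]) -> tuple[list[str], list[str]]:
    """Partition lookups into cache hits and misses (exact-match keys assumed)."""
    hits = [key for key in lookups if key in cache]
    misses = [key for key in lookups if key not in cache]
    return hits, misses


if __name__ == "__main__":
    # A tenant that has already answered one of the sub-questions.
    warm_cache = {"capital of France"}
    query = "Compare the capitals of France and Spain"
    subs = ["capital of France", "capital of Spain"]

    hits, misses = split_hits(cache_lookups(query, subs, depth=2), warm_cache)
    print("hits:", hits)      # sub-question already in the cache
    print("misses:", misses)  # the rest still has to be researched
```

In this model, only the misses incur full cost, which is why a warm cache makes deeper modes cheaper on questions that overlap with earlier ones.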