Why seven, not one

A monolithic retrieve-then-rerank stage is fragile. Failures in extraction poison rerank. Injection slips into the cache. Decomposition gets squeezed into a system prompt and pollutes the answer. Splitting the pipeline lets each layer fail independently and lets operators reason about cost, latency, and integrity per layer instead of for the whole black box.

Where depth meets the pipeline

Every depth runs every layer. Depth 1 runs the loop once. Depth 2 runs depth 1, then re-decomposes the gaps and loops once more. Depth 3 streams the loop and lets a stopping AI judge saturation. The pipeline shape stays constant; the loop count and the decomposition aggressiveness change. See Depths.