CI Pipeline Speed: The Three-Minute Rule
- Pipeline duration is developer experience. A 3-minute pipeline keeps the engineer engaged; a 12-minute pipeline does not.
- Test parallelism is the cheapest speedup; most CI runners support it natively.
- Cached dependencies cut cold-start time by 30 to 70 percent on most stacks.
- Some tests do not belong in CI. Move them to nightly or pre-merge windows.
A team had a CI pipeline that took 14 minutes per PR. Nobody had complained loudly because the team had grown into the slowness over months. We measured the impact: each engineer pushed roughly 8 PRs per week. With each PR running CI 2 to 3 times before merge (push, address review comments, push again), the team was spending roughly 14 minutes * 2.5 runs * 8 PRs * 24 engineers = 112 hours per week of human attention waiting on or context-switching around CI. That was three full-time engineers’ worth of attention burned weekly.
Six weeks of work brought the pipeline to 2.8 minutes. The same calculation: 2.8 * 2.5 * 8 * 24 = 22 hours per week. Roughly 90 hours, more than two full-time engineers' worth of attention, recovered weekly. Same team, same tests, same coverage; only the architecture of the pipeline changed.
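The arithmetic above generalises into a quick back-of-the-envelope function. The inputs below are the numbers from this engagement, not constants; substitute your own.

```python
def weekly_ci_attention_hours(pipeline_minutes: float,
                              runs_per_pr: float,
                              prs_per_engineer_per_week: float,
                              engineers: int) -> float:
    """Hours of engineer attention spent per week waiting on
    (or context-switching around) CI."""
    minutes = (pipeline_minutes * runs_per_pr
               * prs_per_engineer_per_week * engineers)
    return minutes / 60

# The team from the example: 14-minute pipeline, 2.5 CI runs per PR,
# 8 PRs per engineer per week, 24 engineers.
before = weekly_ci_attention_hours(14, 2.5, 8, 24)   # 112 hours/week
after = weekly_ci_attention_hours(2.8, 2.5, 8, 24)   # ~22 hours/week
```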
This piece is about the four levers that consistently produce that kind of speedup.
Why three minutes
The empirical threshold is the time past which a typical engineer context-switches away from the PR they opened. Below three minutes, they wait. Above three, they open Slack, read an email, start another task. The real cost is the context-switch back after CI completes: studies on developer productivity put recovery at 5 to 25 minutes per switch.
Three minutes is not a hard rule. Some teams sustain longer pipelines with disciplined batching. But for most teams shipping at any cadence, three minutes is the threshold beyond which CI duration starts costing more than the test runtime suggests.
Lever 1: parallel test execution
The single largest speedup on most pipelines. Tests rarely need to run sequentially; modern test runners can split them across multiple workers.
For a typical Python project:
```yaml
# pytest with xdist
test:
  command: pytest -n 8 --dist=loadscope tests/
```
For Jest:
```json
{
  "scripts": {
    "test": "jest --maxWorkers=8"
  }
}
```
Most CI providers (GitHub Actions, GitLab, Buildkite, CircleCI) support job-level parallelism on top of in-test parallelism. A well-configured pipeline runs tests across multiple machines and within each machine across multiple cores.
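Layering the two levels can look like this on GitHub Actions. The four-shard count is arbitrary, and the pytest-split plugin (which divides the suite using a recorded `.test_durations` file) is one of several ways to shard; treat this as a sketch, not a prescription.

```yaml
# Sketch: job-level sharding via a matrix, in-process parallelism via xdist.
test:
  runs-on: ubuntu-latest
  strategy:
    matrix:
      shard: [1, 2, 3, 4]
  steps:
    - uses: actions/checkout@v4
    - run: pip install -r requirements.txt pytest-xdist pytest-split
    # Each of the 4 machines runs its quarter of the suite across 8 cores.
    - run: pytest -n 8 --splits 4 --group ${{ matrix.shard }} tests/
```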
The constraint: test isolation. Tests that share state (a database, files on disk, ports) need to be properly isolated before they can parallelise. A common pre-parallelisation refactor: ensure each test creates and tears down its own resources, no shared mutable state.
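One common shape for that refactor, sketched with pytest fixtures. The invoice schema here is illustrative; the point is that each test gets its own database via `tmp_path`, which is unique per test and per worker.

```python
import sqlite3

import pytest


def make_scratch_db(path):
    """Create an isolated database with the schema the tests need."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, total REAL)")
    return conn


@pytest.fixture
def scratch_db(tmp_path):
    # tmp_path is unique per test, so parallel workers never collide
    conn = make_scratch_db(tmp_path / "test.db")
    yield conn
    conn.close()  # teardown: nothing leaks into the next test


def test_invoice_insert(scratch_db):
    scratch_db.execute("INSERT INTO invoices (total) VALUES (42.0)")
    assert scratch_db.execute("SELECT COUNT(*) FROM invoices").fetchone()[0] == 1
```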
Realistic gains: 4x to 8x speedup on most test suites that were not already parallelised. The effort: one to three engineer-days for the configuration, plus whatever isolation work the existing tests need.
Lever 2: dependency caching
Cold start (downloading dependencies, compiling) often dominates CI time. Caching dependencies between runs cuts this dramatically.
| Stack | Cache target |
|---|---|
| Node.js | node_modules and ~/.npm or ~/.pnpm-store |
| Python | virtualenv or ~/.cache/pip |
| Rust | ~/.cargo and target/ (release builds) |
| Java | ~/.m2/repository or ~/.gradle/caches |
| Docker | layer cache (BuildKit) |
GitHub Actions has actions/cache built in; other providers have equivalents. The cache key is typically a hash of the lockfile (package-lock.json, requirements.txt, Cargo.lock).
```yaml
# GitHub Actions example
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ hashFiles('package-lock.json') }}
- run: npm ci
```
Realistic gains: 30 to 70 percent reduction in cold-start time. On a 14-minute pipeline where 4 minutes was dependency installation, this might save 2 to 3 minutes per run.
The trap: invalidating the cache too aggressively. A change to one dependency should not throw away the cache for all dependencies. Most cache strategies handle this with restore-keys (fall back to a partial match when exact match misses).
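In GitHub Actions terms, the fallback looks like this: an exact match on the lockfile hash, plus a partial-match prefix so a one-dependency change restores most of the cache instead of none of it.

```yaml
- uses: actions/cache@v4
  with:
    path: ~/.npm
    key: npm-${{ hashFiles('package-lock.json') }}
    # Fall back to the most recent npm-* cache when the exact key misses.
    restore-keys: |
      npm-
```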
Lever 3: test selection by changed paths
If a PR touches only the billing module, do you need to run the entire test suite? For most projects, no. Path-based test selection runs only the tests relevant to the changed files.
Approaches:
File-pattern based. A simple mapping: services/billing/** triggers tests/billing/**. Easy to set up, occasionally misses cross-cutting concerns.
Coverage-based. Tools like pytest-testmon or coverage analysis identify which tests cover which files. PRs only run the tests that cover changed code. More accurate; more setup.
Build-tool integration. Bazel, Pants and other build systems compute precise dependency graphs and run only affected tests. The most precise option; requires committing to the build tool.
For most teams, file-pattern based selection is the right starting point. Coverage-based or build-tool-based selection becomes worthwhile as the codebase grows past roughly 100k lines.
The complementary safety: full test suite still runs on main branch post-merge, plus nightly. Path-based selection accelerates PRs; the full suite catches anything path selection missed.
Lever 4: remove tests from the merge gate
Some tests do not belong in the merge-gating pipeline:
End-to-end tests over external services. A test that hits a real third-party API is slow and flaky. Move it to a nightly run or a pre-release smoke test.
Performance regression tests. A test that measures latency over 10,000 iterations is slow. Run it nightly with alerting; do not gate merge on it.
Cross-browser or cross-OS tests. Each additional environment multiplies CI time. Run a curated subset on PR; full matrix on merge to main or nightly.
Data-heavy tests. Tests that load gigabytes of fixtures are slow. Move them to a separate job; sample them on PR.
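In pytest terms, moving a test out of the merge gate can be as small as a marker. The `nightly` marker name here is an assumption; register whatever name you choose in your pytest configuration.

```python
import pytest

# Register the marker in pytest.ini or pyproject.toml, e.g.:
#   [tool.pytest.ini_options]
#   markers = ["nightly: slow tests excluded from the merge gate"]


@pytest.mark.nightly
def test_latency_over_10k_iterations():
    ...  # the slow measurement lives here

# Merge-gating CI runs:  pytest -m "not nightly"
# The nightly job runs:  pytest -m nightly
```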
The honest test for whether a test belongs in CI: does its failure require the PR to NOT MERGE, today? If yes, it belongs. If “we’ll know within 24 hours and can fix forward”, it does not belong on the critical path.
The push-back to expect: “but what if a regression slips in?” The answer: post-merge alerting catches it within hours, and the cost of one regression caught post-merge is significantly less than the daily cost of a slow gating pipeline paid on every PR.
Measuring and protecting CI speed
A team that does not measure CI speed lets it degrade silently. New tests get added; nothing gets removed; pipeline duration grows by 10 to 30 percent per quarter.
Minimum measurement:
- 95th percentile pipeline duration over the last 7 days
- Average pipeline duration trend over the last quarter
- Count of new tests added per week
- Top 10 slowest tests in the suite
Surface these on a dashboard the team sees. Set targets (P95 < 5 minutes). Treat regressions in CI duration as performance bugs.
What we install on engagements
Standard CI optimisation:
- Profile the existing pipeline (where is the time going).
- Parallelise test execution within and across jobs.
- Cache dependencies and build artifacts.
- Select tests by changed paths for PRs.
- Move non-merge-gating tests to post-merge or nightly.
- Measure with a dashboard the team owns.
Total: typically two to four engineer-weeks for a team that has not optimised CI before. The result is consistently a pipeline that fits inside three minutes for typical PRs and inside five minutes for almost all PRs.
The engineering investment pays back in developer attention recovered. The cost of NOT doing this work is paid every day, by every engineer, in context-switches that nobody measures but everyone feels.
Questions teams ask
Why three minutes specifically?
Three minutes is the empirical threshold past which engineers context-switch. Below it, they wait at the PR. Above it, they open a new tab. Once they context-switch, the cost of returning to the PR after CI completes is significant.
What if our test suite genuinely takes 30 minutes?
Split it. Fast tests gate merge (3 minutes). Slow tests run in a separate post-merge or nightly pipeline. The merge-blocking tests are a curated subset; the comprehensive suite still runs but does not block iteration.
Should CI speed be in the team's KPIs?
It can be. A target like '95th percentile CI duration under 5 minutes' is measurable and creates the right incentive. Without a number, CI speed degrades silently as new tests get added.