perf(coverage): speed up report generation (bounded-memory merge + precompiled globs)#10506
perf(coverage): speed up report generation (bounded-memory merge + precompiled globs)#10506toxik wants to merge 4 commits into
Conversation
|
Hello @toxik. Your PR has been labeled To keep your PR open, please follow these steps:
Please, do not generate or format the response with AI. If you do not speak English, reply in your native language or use translation software like Google Translate or Deepl. If the response is generated, the PR will be closed automatically. These measures help us reduce maintenance burden and keep the team's work efficient. See our AI contributions policy for more context. |
✅ Deploy Preview for vitest-dev ready!Built without sensitive environment variables
To edit notification comments on pull requests, go to your Netlify project configuration. |
Hello, I am indeed a human and there was indeed coding agent help with this PR from Claude Code. |
|
📝 Ran ecosystem CI: Open
|
AriPerkkio
left a comment
There was a problem hiding this comment.
The early coverage map merging is there intentionally. We cannot hold all coverage results in memory at once, as it would lead to out-of-memory crashes. That's the reason why we do file-system round-trip in the first place, instead of just holding everything in memory.
Related issue:
|
|
||
| // mergeProcessCovs sometimes loses autoAttachSubprocess | ||
| const fromExtendedContext = autoAttachSubprocess ? coverage.result.filter(r => r.isExtendedContext) : [] | ||
| coverages.push(coverage) |
There was a problem hiding this comment.
We cannot hold these in-memory as it would cause out-of-memory issues in large projects.
There was a problem hiding this comment.
Thanks for the review!
I'll look into what can be done to reduce the memory footprint.
In the meantime, would it make sense to open a separate PR for the regex optimization?
There was a problem hiding this comment.
Yup sounds good! Also if it makes sense to optimize the startOffset and autoAttachSubprocess handling separately, feel free to include them there.
There was a problem hiding this comment.
left things in the same PR though 😅
batched the files into ~16MB chunks to still keep O(n) instead of O(n^2)
01c2c27
|
|
||
| this._globMatchers = { | ||
| matchExclude: exclude.length ? pm(exclude, { dot: true }) : () => false, | ||
| matchInclude: pm(include, { dot: true, ignore: exclude }), |
There was a problem hiding this comment.
This could probably also check this.options.include, and default to () => true when it's falsy. 🤔
There was a problem hiding this comment.
addressed in 43c3903f76e14c51b578dab4a9090e05c21be04a
generateCoverage() folded mergeProcessCovs([merged, coverage]) once per coverage file and re-scanned the growing merged.result on every file to repair startOffset/isExtendedContext, making the read phase O(n^2) in the number of coverage files. Collect the raw coverages and merge them in a single pass, restoring the dropped fields via lookup maps built while reading. Profiled on a real 1124-test-file app: the read+merge phase dropped from ~24.5s to ~0.3s (single merge ~95ms). Together with the isIncluded fix, total coverage generation went from 52.8s to 7.1s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
isIncluded() called picomatch.isMatch(file, patterns, options) per file, which recompiles every glob into a regex on each call. The globCache does not help during coverage filtering because each result entry has a unique filename, so every call paid the full compile cost. Compile the include/exclude matchers once and reuse them. Profiled on a real 1124-test-file app: the include/exclude filtering step dropped from ~23.3s to ~0.38s (~61x), with identical results. Together with the single-pass merge, total coverage generation went from 52.8s to 7.1s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
14a9977 to
65dd2ad
Compare
Co-authored-by: Ari Perkkiö <ari.perkkio@gmail.com>
65dd2ad to
43c3903
Compare
Description
Two independent, self-contained performance fixes to coverage report generation (
generateCoverage/BaseCoverageProvider), measured on a real 1124-test-file app (~39 MB of raw V8 coverage across the run). Numbers below are best-of-3 runs, baseline vs this PR on the same machine:isIncludedfiltering)generateCoveragePeak heap during coverage generation drops from ~2.5 GB → ~1.15 GB, and resident raw coverage is now bounded to ≤16 MB regardless of suite size.
Tests
No behavior change. Both fixes are verified against the existing
test/coverage-testsuites (v8 + istanbul), which assert coverage output and the set of included files.pnpm typecheckandpnpm lintpass. A backport to thev4release line will follow.🤖 Generated with Claude Code