feat: add MacOS support #81
Merging this PR will degrade performance by 93.19%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | WallTime | iterative fibo 15 | 312 ns | 369,169 ns | -99.92% |
| ❌ | WallTime | iterative fibo 20 | 360 ns | 367,274 ns | -99.9% |
| ❌ | WallTime | short body | 1.7 µs | 377 µs | -99.55% |
| ❌ | WallTime | short body | 1.7 µs | 379.6 µs | -99.55% |
| ❌ | WallTime | short body | 1.7 µs | 373.4 µs | -99.54% |
| ❌ | WallTime | short body | 1.8 µs | 373.7 µs | -99.53% |
| ❌ | WallTime | short body | 1.8 µs | 373.8 µs | -99.53% |
| ❌ | WallTime | short body | 1.8 µs | 376.5 µs | -99.52% |
| ❌ | WallTime | fibo 10 | 2.1 µs | 372.6 µs | -99.43% |
| ❌ | WallTime | fibo 15 | 21.5 µs | 392.1 µs | -94.51% |
| ❌ | WallTime | recursive fibo 15 | 22.2 µs | 396.5 µs | -94.41% |
| ❌ | WallTime | end | 35.5 µs | 432 µs | -91.79% |
| ❌ | WallTime | long body | 214.3 µs | 604.3 µs | -64.53% |
| ❌ | WallTime | long body | 211.7 µs | 596.8 µs | -64.52% |
| ❌ | WallTime | long body | 217.2 µs | 601.9 µs | -63.91% |
| ❌ | WallTime | long body | 217.3 µs | 601.5 µs | -63.87% |
| ❌ | WallTime | recursive fibo 20 | 239.5 µs | 613.1 µs | -60.94% |
| ❌ | Simulation | test_iterative_fibo_10 | 299.9 µs | 547.2 µs | -45.2% |
| ❌ | WallTime | wait 1ms | 1 ms | 1.4 ms | -28.18% |
| ⚡ | Memory | wait 1ms | 16 B | 10 B | +60% |
| ... | ... | ... | ... | ... | ... |
ℹ️ Only the first 20 benchmarks are displayed. Go to the app to view all benchmarks.
Tip
Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.
Comparing cod-2720-bump-instrument-hooks-in-codspeed-node-to-support-macos (ad2b2e7) with main (4dae798)
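The Efficiency column in the benchmark table appears to be computed relative to the HEAD value, i.e. `(BASE - HEAD) / HEAD`. This is an inference from the displayed rows, not a documented CodSpeed formula, and the function name below is hypothetical. A minimal sketch that reproduces two rows whose values are shown unrounded:

```typescript
// Assumed formula, inferred from the table rows: (base - head) / head.
// Negative results are regressions, positive results are improvements.
function efficiencyPercent(baseValue: number, headValue: number): number {
  return ((baseValue - headValue) / headValue) * 100;
}

// "iterative fibo 15": 312 ns on BASE, 369,169 ns on HEAD.
const fibo15 = efficiencyPercent(312, 369_169);

// Memory for "wait 1ms": 16 B on BASE, 10 B on HEAD.
const waitMemory = efficiencyPercent(16, 10);

console.log(fibo15.toFixed(2)); // -99.92
console.log(waitMemory.toFixed(2)); // 60.00
```

Rows with rounded inputs (for example 1 ms against 1.4 ms) will not reproduce the displayed percentage exactly, because the table rounds BASE and HEAD before showing them.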
f0b1349 to 3982c34
Add benches/macos.bench.ts guarded with describe.skipIf(!darwin) so it only runs on macOS, and wire the walltime-macos-test CI job to execute it via a direct `node vitest.mjs bench --run macos` invocation.
32c3b31 to 1d75a61
Greptile Summary
This PR adds macOS walltime support to the vitest and tinybench plugins by introducing per-iteration

Confidence Score: 4/5
Safe to merge for the macOS-only path; the existing Linux walltime benchmarks now silently emit per-iteration FIFO markers that were not there before, matching a pattern the tinybench plugin explicitly removed to prevent CI timeouts.

The finishRound change in WalltimeRunner adds two blocking FIFO round-trips per tinybench iteration on every walltime run, including the four existing bench files on the Linux codspeed-walltime job. The tinybench plugin's own inline comment documents that this identical per-iteration pattern previously overwhelmed the runner and timed out CI, and that plugin was specifically refactored to avoid it. The macOS job uses a purpose-built runner branch, but the Linux job uses the existing runner with no changes and has no guard to skip the new markers.

packages/vitest-plugin/src/walltime/index.ts: the finishRound call inside codspeed_root_frame affects all walltime benchmark runs, not only the new macOS benchmark.

Important Files Changed
    if (
      result !== null &&
      typeof result === "object" &&
      typeof (result as PromiseLike<unknown>).then === "function"
    ) {
The multi-clause thenable check should be extracted into a named const, per the project's rule that non-obvious or multi-clause boolean conditions must be named with an is*/has* prefix.
    - if (
    -   result !== null &&
    -   typeof result === "object" &&
    -   typeof (result as PromiseLike<unknown>).then === "function"
    - ) {
    + const isThenable =
    +   result !== null &&
    +   typeof result === "object" &&
    +   typeof (result as PromiseLike<unknown>).then === "function";
    + if (isThenable) {
Rule Used: Extract any non-obvious or multi-clause boolean co... (source)
Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!
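One possible way to go a step further than the suggested const is a reusable type guard. This is only a sketch: the function form and its placement are assumptions, not something the PR or the reviewer proposes. It keeps the same three clauses as the original condition, so functions that happen to carry a `then` property are still rejected.

```typescript
// Hypothetical helper; only the name isThenable comes from the review suggestion.
// The clauses mirror the original condition exactly.
function isThenable(value: unknown): value is PromiseLike<unknown> {
  return (
    value !== null &&
    typeof value === "object" &&
    typeof (value as PromiseLike<unknown>).then === "function"
  );
}

// Values a benchmark function might return.
const syncResult: unknown = 42;
const asyncResult: unknown = Promise.resolve(42);
const customThenable: unknown = { then: (resolve: (v: number) => void) => resolve(1) };

console.log(isThenable(syncResult)); // false
console.log(isThenable(asyncResult)); // true
console.log(isThenable(customThenable)); // true
console.log(isThenable(null)); // false
```

Because the return type is a type predicate, a caller can await or chain the value inside the `if` branch without repeating the cast.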
    - name: Diagnose resign+exec on this runner
      run: |
        sw_vers; uname -m; sysctl -n hw.memsize hw.ncpu
        cp /bin/sh /tmp/sh_copy
        /usr/bin/codesign -s - -f /tmp/sh_copy
        lipo -info /tmp/sh_copy
        /tmp/sh_copy -c 'echo RESIGNED_ARM64E_OK; uname -m'
        echo "exit=$?"
Diagnostic step left in production workflow
The "Diagnose resign+exec on this runner" step copies /bin/sh, codesigns it, and re-executes it — this is clearly debugging scaffolding for the SIP/resign investigation. Leaving it in the permanent workflow adds noise to every future CI run and exposes internal implementation details. It should be removed once the resign-exec behaviour is confirmed working.
    working-directory: packages/vitest-plugin
    run: pnpm turbo run bench --env-mode=loose --filter=@codspeed/vitest-plugin
    mode: walltime
    runner-version: branch:sip-resign-exec-redirect
Mutable branch reference for runner-version
runner-version: branch:sip-resign-exec-redirect points to a live branch that can be force-pushed or rebased at any time. If the branch tip changes between two benchmark runs, the profiling behaviour changes silently, making measurement results non-comparable across commits. Consider pinning to a specific commit SHA or tag once the feature branch is stable, e.g. runner-version: sha:<commit>.
888eaad to 9925d1a
Per-iteration markers made marker volume proportional to the iteration count (millions for fast benches): every addMarker call is a synchronous FIFO round-trip to the runner, which overwhelmed it and timed out the walltime CI job. Buffer round timestamps in JS instead and flush one BENCHMARK_START/BENCHMARK_END pair bracketing the measured rounds after the run.

Also exclude warmup from markers (the wrapper previously emitted markers during warmup rounds) and fix the sync path to use warmupSync(): the un-awaited async warmup() never actually ran before runSync().

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Move marker emission from per-iteration (inside the task fn) to a single BENCHMARK_START/BENCHMARK_END pair bracketing the whole bench run in setupBenchMethods.

Each addMarker call is a synchronous FIFO round-trip to the runner, so per-iteration markers scaled with iteration count (millions for fast benches), overwhelmed the runner, and timed out the walltime CI job, while also inflating the measured samples. Emission is gated to walltime mode so the analysis runner sharing setupBenchMethods is unaffected.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
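The difference between the two emission strategies described in these commit messages can be illustrated with a small model. This is a sketch under stated assumptions, not the plugin's code: `RecordingRunner`, `runWithPerIterationMarkers`, and `runWithBracketingMarkers` are hypothetical names, and the stand-in runner records markers in memory instead of performing a FIFO round-trip. Only the marker names BENCHMARK_START and BENCHMARK_END come from the commits.

```typescript
type MarkerKind = "BENCHMARK_START" | "BENCHMARK_END";

interface Marker {
  kind: MarkerKind;
  timestampMs: number;
}

// Hypothetical stand-in for the runner: each addMarker call here is cheap,
// whereas the commits describe the real call as a synchronous FIFO round-trip.
class RecordingRunner {
  readonly markers: Marker[] = [];

  addMarker(kind: MarkerKind, timestampMs: number): void {
    this.markers.push({ kind, timestampMs });
  }
}

// Old strategy: two markers per iteration, so volume grows with iteration count.
function runWithPerIterationMarkers(
  runner: RecordingRunner,
  task: () => void,
  iterations: number,
): void {
  for (let i = 0; i < iterations; i++) {
    runner.addMarker("BENCHMARK_START", Date.now());
    task();
    runner.addMarker("BENCHMARK_END", Date.now());
  }
}

// New strategy: keep timestamps in JS and flush one pair after the run.
function runWithBracketingMarkers(
  runner: RecordingRunner,
  task: () => void,
  iterations: number,
): void {
  const startMs = Date.now();
  for (let i = 0; i < iterations; i++) {
    task();
  }
  const endMs = Date.now();
  runner.addMarker("BENCHMARK_START", startMs);
  runner.addMarker("BENCHMARK_END", endMs);
}

const perIteration = new RecordingRunner();
const bracketed = new RecordingRunner();
const noop = (): void => {};

runWithPerIterationMarkers(perIteration, noop, 1000);
runWithBracketingMarkers(bracketed, noop, 1000);

console.log(perIteration.markers.length); // 2000
console.log(bracketed.markers.length); // 2
```

In the bracketing variant no runner call happens inside the loop, which is consistent with the commit's point that per-iteration markers also inflated the measured samples.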
Pass the vitest file filter through turbo so the macOS job runs just
macos.bench.ts. The full vitest-plugin suite already runs on the linux
walltime job, and uploading the same benchmark twice for one commit
fails CodSpeed processing ("Found the same benchmark multiple times").
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
No description provided.