Skip to content

fix(schedules): count usage lim error schedule as failed run#4853

Merged
icecrasher321 merged 2 commits into
stagingfrom
fix/schedules-402
Jun 2, 2026
Merged

fix(schedules): count usage lim error schedule as failed run#4853
icecrasher321 merged 2 commits into
stagingfrom
fix/schedules-402

Conversation

@icecrasher321
Copy link
Copy Markdown
Collaborator

Summary

402s should be treated as failed runs to prevent runaway schedules.

Type of Change

  • Bug fix

Testing

N/A

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel
Copy link
Copy Markdown

vercel Bot commented Jun 2, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Jun 2, 2026 8:51pm

Request Review

@cursor
Copy link
Copy Markdown

cursor Bot commented Jun 2, 2026

PR Summary

Medium Risk
Touches schedule state machine and auto-disable logic across cron recovery and background execution; behavior change for 402 paths could disable schedules that previously retried forever.

Overview
Introduces buildScheduleFailureUpdate so every schedule path that records a failed run applies the same DB fields: release the claim, set nextRunAt, bump failedCount, stamp lastFailedAt, reset infra retries, and auto-disable at MAX_CONSECUTIVE_FAILURES. The cron execute route and background execution now call this helper instead of duplicating inline update objects.

The important behavior change is HTTP 402 (usage limit) during preprocessing: those attempts are now counted as failed runs (while still scheduling the next normal cadence) so a schedule stuck over quota can eventually auto-disable instead of retrying indefinitely. Successful runs still reset failedCount, so short-lived overages can recover.

Reviewed by Cursor Bugbot for commit 0d5e813. Configure here.

@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps Bot commented Jun 2, 2026

Greptile Summary

This PR fixes a runaway-schedule bug by treating HTTP 402 (usage-limit) responses as failed runs, so the shared MAX_CONSECUTIVE_FAILURES auto-disable threshold applies. It also refactors all five inline failure-update objects into a single buildScheduleFailureUpdate helper, eliminating drift risk across failure branches.

  • buildScheduleFailureUpdate extraction — all failure paths (preprocessing, execution, infra-retry exhaustion, 402) now share a single implementation that increments failedCount, stamps lastFailedAt, and evaluates auto-disable; the 429 (rate limit) path is deliberately excluded since transient rate limits are not billing-state failures.
  • 402 behavioural change — previously a usage-limit hit only reset infraRetryCount and advanced nextRunAt; it now also increments failedCount and can set status = 'disabled' after 100 consecutive failures, preventing abandoned over-limit schedules from retrying indefinitely.
  • Test mock updateroute.test.ts adds a stub for the new export to keep existing route-level tests compiling; the 402 execution path itself is exercised within schedule-execution.ts unit tests.

Confidence Score: 5/5

Safe to merge — the change is a targeted fix with a clean refactor and no regressions introduced.

The behavioural change is small and intentional: a 402 response now increments the consecutive-failure counter, matching how every other non-retryable failure is handled. The refactoring that centralises buildScheduleFailureUpdate is mechanical — each replaced inline object contained exactly the same fields with the same expressions. The 429 path is correctly left unchanged.

No files require special attention.

Important Files Changed

Filename Overview
apps/sim/background/schedule-execution.ts Extracts buildScheduleFailureUpdate helper and routes the 402 path through it so usage-limit hits now increment failedCount and can trigger auto-disable — previously 402 only reset infraRetryCount.
apps/sim/app/api/schedules/execute/route.ts Two inline failure-update objects replaced with calls to buildScheduleFailureUpdate; logic is identical to what was inlined.
apps/sim/app/api/schedules/execute/route.test.ts Mock for the newly exported buildScheduleFailureUpdate added to the schedule-execution module mock; no new test cases for the 402 behaviour.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[executeScheduleJob] --> B{Preprocess result\nstatus code}
    B -->|404| C[Disable schedule immediately]
    B -->|429| D[Delay 5 min, reset infraRetryCount\nfailCount unchanged]
    B -->|402| E[buildScheduleFailureUpdate\nnextRunAt = cron cadence or +1h]
    B -->|5xx retryable| F[retryScheduleAfterInfraFailure]
    B -->|other error| G[buildScheduleFailureUpdate\nnextRunAt = determineNextRunAfterError]
    B -->|success| H[Run workflow]
    E --> I[failedCount++\nlastFailedAt = now\nstatus = disabled if >= MAX_CONSECUTIVE_FAILURES]
    G --> I
    F -->|retries exhausted| I
    H -->|failure result| J[buildScheduleFailureUpdate]
    H -->|execution error| K[buildScheduleFailureUpdate]
    J --> I
    K --> I
    H -->|success| L[Reset failedCount = 0\nstatus = active]
Loading

Reviews (2): Last reviewed commit: "remove backoff logic" | Re-trigger Greptile

@icecrasher321
Copy link
Copy Markdown
Collaborator Author

@greptile

@icecrasher321
Copy link
Copy Markdown
Collaborator Author

bugbot run

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 0d5e813. Configure here.

@icecrasher321 icecrasher321 merged commit f56a0e4 into staging Jun 2, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the fix/schedules-402 branch June 3, 2026 19:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant