v0.6.103: readme updates, tables lifecycle improvements, new connectors, clickhouse integration by waleedlatif1 · Pull Request #4892 · simstudioai/sim

waleedlatif1 · 2026-06-05T07:51:48Z

feat(tables): background import for large CSVs with live progress (feat(tables): background import for large CSVs with live progress #4861)
chore(api-key): remove legacy scan+decrypt auth fallback (chore(api-key): remove legacy scan+decrypt auth fallback #4876)
fix(security): chat attachment XSS, MCP OAuth SSRF guards, Teams clientState verification (fix(security): chat attachment XSS, MCP OAuth SSRF guards, Teams clientState verification #4877)
chore(readme): refresh demo GIFs from docs, lead with Mothership (chore(readme): refresh demo GIFs from docs, lead with Mothership #4878)
fix(mcp): enforce tool name validation in deploy modal (fix(mcp): enforce tool name validation in deploy modal #4879)
chore(skills): mirror model/enrichment/hosted-key/council skills into .agents/skills and expand add-model touchpoints (chore(skills): mirror model/enrichment/hosted-key/council skills into .agents/skills and expand add-model touchpoints #4882)
fix(polling-tools): pass plan execution timeout to internal polling tool routes (fix(polling-tools): pass plan execution timeout to internal polling tool routes #4884)
chore(db): drop legacy copilot_chats.messages JSONB column (chore(db): drop legacy copilot_chats.messages JSONB column #4886)
feat(connectors): add 7 knowledge base connectors (Google Forms, Typeform, Azure DevOps, YouTube, JSM, S3, Sentry) (feat(connectors): add 7 knowledge base connectors (Google Forms, Typeform, Azure DevOps, YouTube, JSM, S3, Sentry) #4880)
feat(integrations): add ClickHouse block and expand Dagster + Tinybird tools (feat(integrations): add ClickHouse block and expand Dagster + Tinybird tools #4883)
fix(autolayout): relocate notes that overlap blocks after layout (fix(autolayout): relocate notes that overlap blocks after layout #4888)
feat(metrics): emit hosted-key metrics to Grafana via OTel (feat(metrics): emit hosted-key metrics to Grafana via OTel #4885)
refactor(tables): consolidate row data-access in service.ts (refactor(tables): consolidate row data-access in service.ts #4881)
feat(tables): workflow version selection (live/deployed) and not-found/no-output badges (feat(tables): workflow version selection (live/deployed) and not-found/no-output badges #4889)
fix(otel): make service.instance.id unique per process (fix(otel): make service.instance.id unique per process #4891)

) * feat(tables): background import for large CSVs with live progress * fix(tables): address review — import heartbeat, overlap guard, column/empty validation * fix(tables): guard sync import overlap, scope fileKey to workspace, delete-on-replace after download * fix(tables): stream large CSV imports from storage instead of buffering the whole file * test(tables): fix async-import route tests for workspace-scoped fileKey + name uniquification * fix(tables): append imports start after existing rows; reconcile missed import failures in the tray * fix(tables): delete the uploaded CSV from storage after the import finishes * fix(tables): validate replace before deleting rows; ignore stale replayed import events by importId * fix(tables): bind import worker to its importId (no stale-worker clobber/overlap) and destroy storage stream on failure * feat(tables): byte-based import progress, cancel support, and a start toast that opens the import view * fix(tables): don't emit ready after cancel; honor cancel during the upload phase * improvement(tables): use a stop (square) icon for canceling an active import * fix(tables): make markTableImporting an atomic claim to close the concurrent-import TOCTOU race * improvement(tables): preview CSV import from a slice, drop client row-count warning The import dialog parsed the entire file in the browser to show an exact row count and a row-limit warning. That holds the whole file in memory, blocks the main thread, and hits V8's ~512MB string ceiling — so the dialog capped the effective import size well below what the streaming importer handles. Parse only the first 512KB (headers + sample for the mapping); drop the exact count and the "would exceed the row limit by N" gate. The DB row-count trigger already enforces max_rows server-side, so an over-limit import fails fast during the run with a clear message instead of being blocked by an expensive parse. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(tables): gate import ownership every batch and stop canceled imports reappearing - Worker checked run ownership only at the progress cadence (~every 5k rows), so a canceled/superseded import could insert several more batches (incl. the final partial batch) before stopping. Move the updateImportProgress ownership gate to the top of every flush — a run that lost the table stops within one batch. - A list/dialog import canceled mid-upload left the server row `importing` until the in-flight server cancel landed; hydration re-seeded it from useTablesList, so the dismissed import flickered back. Flag the real table id canceled on the mid-upload cancel path, skip re-seeding flagged tables in hydration, and clear the flag once the server import is terminal. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * refactor(tables): drive import tray by polling derived from server, not SSE Import progress no longer holds an SSE connection per importing table. The tray now derives its importing rows live from the table list (React Query), polled only while an import is in flight; the table detail page keeps its own cell-state SSE for grid refresh. - store holds only client-only state now: optimistic uploads, which terminal completions to surface this session, canceled ids, menu open — no copied importStatus/rowsProcessed. - useWorkspaceImports is the single source: polls via a data-predicate refetchInterval, derives rows, and fires completion toasts on the importing -> terminal transition. - kickoff handlers use startUpload/setUploadPercent/endUpload; the invalidated list refetch surfaces the server row and polling takes over. - removes use-hydrate-import-tray + use-import-progress-tracker (folded in). - trims over-verbose comments across the import paths. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(tables): ignore superseded-run import events in the detail SSE cache applyImport applied every replayed import payload to the detail cache. The SSE buffer can replay a prior import's terminal event for the same table, stomping a newer in-flight import's UI. Lock to the active run's importId (and ignore a replayed terminal before the id is known), matching the guard the header tracker used to have. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(tables): close sync-import TOCTOU by claiming the atomic import gate The sync import route checked importStatus from a checkAccess snapshot, then parsed/validated/wrote seconds later without taking the atomic claim. A concurrent async kickoff (markTableImporting) could slip into that window and both writers would run together — for replace mode, two delete+insert passes leave the table indeterminate. Claim the same atomic gate (markTableImporting) right before the write and release it in the finally (before the response returns, so a client refetch never sees the transient status). A row-level FOR UPDATE was avoided on purpose: it would invert lock order against the position advisory lock / row-count trigger and risk a deadlock — markTableImporting is the established gate. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> * fix(multipart): keep abort wired after resolve so a mid-upload disconnect tears down the stream readMultipart resolves on the file-part header and hands the caller an un-drained stream, but settle() ran cleanup() and detached the abort listener on that path too. A client disconnect mid-upload then destroyed nothing — busboy never saw EOF, the file stream stalled, and the route's `for await` held a request slot until maxDuration (300s). Re-arm an abort handler scoped to the file stream on resolve, detached when the stream closes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

…ntState verification (#4877) * fix(chat): prevent XSS in attachment preview via filename/data URL escaping Replace document.write with an escaped blob URL preview: HTML-entity encode the user-controlled filename and data URL, open with noopener,noreferrer, and revoke the blob URL after navigation. * fix(mcp): guard OAuth discovery and token revocation against SSRF Route discoverOAuthServerInfo and the RFC 7009 revocation POST through an SSRF-guarded fetch that validates every request URL via validateMcpServerSsrf (blocking private/reserved/loopback targets, honoring ALLOWED_MCP_DOMAINS and self-hosted localhost rules) and pins the connection to the resolved IP to prevent DNS-rebinding TOCTOU. Previously these fetches used unvalidated global fetch against URLs taken verbatim from attacker-controllable authorization-server metadata. * fix(webhooks): verify Graph clientState on Teams chat-subscription notifications The microsoftteams_chat_subscription trigger set clientState=webhook.id when creating the Graph subscription but never validated it on inbound change notifications, so any request to the webhook path with a crafted notification body was treated as authentic (CWE-345). verifyAuth now requires every notification in the value array to carry a clientState matching the stored webhook id (constant-time compare) and rejects payloads without notifications. Validation handshakes (validationToken) are handled before auth and remain unaffected; outgoing-webhook HMAC auth is unchanged. * fix(webhooks): fail closed when Teams chat-subscription webhook id is unavailable Hardens the clientState check so a missing webhook id (theoretically unreachable, since the row is looked up by primary key) can never collapse the expected value to an empty string that a forged clientState could match. * docs(mcp): note AbortSignal does not bound SSRF-guard DNS lookup * improvement(chat): hoist HTML escape map to module-level constant

* fix(mcp): enforce tool name validation in deploy modal * fix(mcp): correct cn import path to fix build * fix(mcp): align tool-name regex with server sanitization, add disabled-combobox hint

… .agents/skills and expand add-model touchpoints (#4882) * chore(skills): mirror model/enrichment/hosted-key/council skills into .agents/skills and expand add-model touchpoints * chore(skills): document council yaml omission and disambiguate validate-model cross-ref

…ool routes (#4884) * fix(polling-tools): pass plan execution timeout to internal polling tool routes * address comments

Reads and writes are fully cut over to the normalized copilot_messages table (verified in production: no writes to the column in 24h, recently-active chats have empty JSONB while copilot_messages holds the transcript). Drop the dead column via drizzle migration 0225 and re-type CopilotChatDetailRow.messages as an assembled (non-column) field. Deploy notes: reconcile any chats where the JSONB still leads copilot_messages before applying, and pg_repack copilot_chats afterward to reclaim the ~5.7GB TOAST storage (DROP COLUMN is metadata-only).

…form, Azure DevOps, YouTube, JSM, S3, Sentry) (#4880) * feat(connectors): add 7 knowledge base connectors (Google Forms, Typeform, Azure DevOps, YouTube, JSM, S3, Sentry) * fix(connectors): tighten listingCapped semantics per review (WIQL cap, batch omissions, cap-vs-exhaustion) * fix(connectors): google-forms listingCapped must fire on slice regardless of hitLimit (404-null-filter gap) * fix(connectors): s3 streaming size cap for chunked responses without content-length * fix(connectors): ado byte-exact file content fetch, google-forms hash-poisoning on listing failure * fix(connectors): ado auth-failure deletion guard, jsm last-page slice flag, google-forms response cap in hash * fix(connectors): shared streaming size-cap reader for ado file hydration (promote from s3) * fix(knowledge): flag incomplete listings at engine level when pagination is truncated * fix(connectors): ado flags listing incomplete when a non-empty repo has no resolvable branch * fix(knowledge): engine truncation flag is an absolute deletion block (fullSync cannot override); s3 byte-exact size fallback; ado tsdoc accuracy * improvement(knowledge): extract shouldReconcileDeletions gate as tested pure function, tighten engine comments * test(connectors): mapTags coverage for the 7 new connectors * fix(connectors): ado probes past the wiql 20k cap before flagging; document custom-wiql full-listing behavior * fix(connectors): ado flags partial repo trees when items listing emits a continuation token * fix(connectors): ado discards foreign-phase cursors; google-forms scans all response pages for change detection * fix(connectors): audit fixes across new connectors - registry: register x connector (was dead code, never wired in) - google-docs/google-drive/google-forms: gate deletion reconciliation on Drive incompleteSearch; google-docs also now sets listingCapped on its maxDocs cap path - jsm: add read:jira-user scope so reporter resolves on requests - gong: only set listingCapped on genuine truncation, not exact-cap source exhaustion - gitlab: issues phase switched to keyset pagination (removes ~50k offset ceiling), matching the repo-tree phase - grain: parallelize recording + transcript fetch in getDocument - ashby: document updatedAt-based content-hash limitation for notes/feedback change detection - tests: mapTags coverage for x, granola, greenhouse, fathom, rootly

…d tools (#4883) * feat(integrations): add ClickHouse block and expand Dagster + Tinybird tools * fix(tinybird): fail loudly on invalid query_pipe parameters JSON parsePipeParameters previously returned {} on any JSON parse error, so a mistyped 'parameters' input produced a successful pipe call with the dynamic filters silently dropped. Throw a clear error for non-empty, non-object input instead; an omitted/empty value still means 'no parameters'. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(dagster): guard NaN numeric coercions and bound list_assets pagination Address PR review: - Route all block numeric coercions (list_runs limit/createdAfter/createdBefore, get_run_logs logsLimit, list_assets assetsLimit) through a toFiniteNumber() guard so invalid/wand-generated text becomes undefined instead of NaN. - list_assets now applies a default page size (100) when no limit is given, so paging stays bounded and hasMore is meaningful even when limit is omitted. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(dagster): make list_assets hasMore exact via fetch N+1 Address PR review (hasMore true on exact page): request one extra row (pageSize + 1), use its presence as the authoritative hasMore, slice it off, and derive the returned cursor from the last RETURNED asset's key path (JSON-serialized; Dagster normalizes JS/Python whitespace on the way in). This removes the false-positive hasMore when the final page is exactly full. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(clickhouse): enforce read-only query operation and harden WHERE-clause guard * fix(dagster): make list_runs hasMore exact via fetch N+1 Address PR review (list runs false hasMore): request one extra row (pageSize + 1), use its presence as the authoritative hasMore, and slice it off before mapping. Removes the false-positive hasMore (and misleading cursor) when the final page is exactly `limit` runs long. Mirrors the list_assets fix. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(clickhouse): restrict DROP PARTITION to literal values to prevent SQL injection * fix(clickhouse): reject chained statements in read-only query operation * fix(clickhouse): force JSON output on query path and ignore comments when detecting chained statements * fix(tinybird): encode datasource/pipe names in URL paths to prevent traversal A user-or-llm datasource/pipe name interpolated raw into the URL path (e.g. 'real_ds/../../other') is normalized by the WHATWG URL parser and can target a different endpoint. Wrap the path segment with encodeURIComponent in the truncate, delete, and query_pipe URLs. Events/append pass the name via URLSearchParams, which already encodes, so they were unaffected. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * fix(clickhouse): block WITH-led writes/DDL in read-only query operation * fix(clickhouse): validate column types structurally and normalize FORMAT around SETTINGS * fix(clickhouse): balance-check ORDER BY/PARTITION BY and skip leading comments in read-only guard --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com>

* fix(autolayout): relocate notes that overlap blocks after layout * fix(autolayout): harden note overlap resolution against resize and non-finite positions

* feat(metrics): emit hosted-key metrics to Grafana via OTel Replace the dropped platform.hosted_key.* spans with OTel counters/histograms for usage, cost, failures, throttles, and queue waits. Wire a MeterProvider into the Next.js OTel SDK (trigger.dev already exports metrics). Per-key attribution via a key label (env var name). * fix(metrics): correct hosted-key failure attribution - Re-point used/cost/failed labels at the freshly acquired key after reacquire - Classify quota-style 401/403 as rate_limited (mirror isRateLimitError) - Count returned success:false runs (e.g. deep_research polling) as failed * fix(metrics): label hosted_key.throttled with real provider on exhausted retries * fix(metrics): parse OTLP metrics URL via URL/pathname, not string suffix Handles query strings and trailing slashes so the /v1/traces->/v1/metrics swap can't produce a malformed endpoint, matching normalizeOtlpTracesUrl.

- Route both row-GET endpoints (internal + v1) and the copilot tool through the single service.queryRows instead of three inline query copies; add a withExecutions option so the public v1 route still omits executions. - Run COUNT(*) and the page fetch concurrently in queryRows. - Move CSV-import transaction ownership out of the API route into importAppendRows / importReplaceRows so routes never hold a trx. - Extract row position mechanics (reserve / shift / compact) into named private helpers in service.ts; no separate table-wrapper module.

…d/no-output badges (#4889) * feat(tables): workflow version selection (live/deployed) and not-found/no-output badges * fix(tables): draw row-selection left edge as checkbox cell border so it cannot be cut off * fix(tables): per-group version in cascade, accurate deploy error, skip not-found for deployed groups * fix(tables): render selection left edge as continuous strip overlapping row gridlines * feat(tables): not-found column icon, optional workflow inputs, mothership deploymentMode --------- Co-authored-by: waleed <walif6@gmail.com>

All app replicas shared a hardcoded service.instance.id ("mothership-sim"), so OTel metrics from every process collapsed into one Prometheus series. Their independent cumulative counters then interleaved, producing phantom counter resets that corrupt rate()/increase() — staging hosted-key cost inflated to ~$0.72 from a few cents, while no-`key` metrics (cost_charged, throttled, queue_wait_*) were affected fleet-wide. Append the hostname (the container id under ECS, unique per task) so each replica gets its own series and sum(rate(...)) / sum(increase(...)) aggregate correctly. The mothership-sim prefix is kept so Jaeger's clock-skew adjuster still separates Sim from Go. Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

vercel · 2026-06-05T07:51:54Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
docs	Ready	Preview, Comment	Jun 5, 2026 9:22pm

cursor · 2026-06-05T07:51:59Z

PR Summary

Medium Risk
Table import changes affect concurrent writes and large-file handling; incorrect import claim release could block imports. Connector listingCapped omissions elsewhere remain a data-loss risk class documented but not fully shown in this diff.

Overview
v0.6.103 bundles product/docs polish with substantial Tables import and workflow behavior, plus integration surface area.

Tables & CSV import adds background CSV import (import-async, cancel, markTableImporting claim, detached runTableImport) and exposes import progress fields on table GET. The sync import path now uses streaming multipart parsing, atomic import claims (409 when another import runs), and consolidated importAppendRows / importReplaceRows instead of ad-hoc transactions. A cron job marks stale importing tables as failed after worker timeouts.

Table workflow columns can set deploymentMode when updating workflow groups (live vs deployed — aligned with PR theme).

Integrations & docs: new ClickHouse tool docs, landing integrations catalog entry, and icon wiring. Dagster and Tinybird docs/JSON expand operations (assets, pipe endpoints, datasource management). Knowledge connector docs bump to 49 connectors and broaden auth examples. README is reframed around Mothership, files, tables, and updated demo assets.

Agent/dev workflows: mirrors add-enrichment, add-hosted-key, add-model, validate-model, and council into .agents/skills/; add-model / validate-model commands now stress hosted billing (getHostedModels) and non–data-driven touchpoints. Connector add/validate commands require syncContext.listingCapped when listings are truncated so sync reconciliation does not mass-delete unseen documents.

Minor docs UI fix: search dialog z-index above the sticky navbar.

^{Reviewed by Cursor Bugbot for commit 24a6086. Bugbot is set up for automated code reviews on this repo. Configure here.}

waleedlatif1 · 2026-06-05T16:43:13Z

@greptile

waleedlatif1 · 2026-06-05T16:43:18Z

@cursor review

cursor

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

^{Reviewed by Cursor Bugbot for commit f7f7840. Configure here.}

greptile-apps · 2026-06-05T16:50:10Z

Greptile Summary

This rollup PR delivers several security hardening patches alongside new features: background CSV import with live progress for large files, seven new knowledge-base connectors (Google Forms, Typeform, Azure DevOps, YouTube, JSM, S3, Sentry), a full ClickHouse integration block with 25+ operations, and expanded Dagster/Tinybird tooling. Security changes include fixing chat-attachment XSS via blob URLs, adding SSRF guards to MCP OAuth discovery/revocation, adding Microsoft Teams clientState webhook verification, removing the legacy API-key scan+decrypt fallback, and enforcing tool-name validation in the MCP deploy modal.

Background CSV import: Large files (above the threshold) are uploaded to object storage and imported in a detached server process, with SSE progress streaming and a cancellation path. Client-side row-count capacity checks are removed since the full count is unknown until server-side processing completes; the DB trigger still enforces limits.
ClickHouse block: 25 operations over the ClickHouse HTTP interface with identifier sanitization, read-only enforcement, and a format-normalizer; user-supplied WHERE clauses go through a blocklist validator before being interpolated into mutating statements.
Security patches: XSS fixed via HTML entity escaping before writing to a blob URL; MCP OAuth flows now run through createSsrfGuardedMcpFetch per request; Teams webhook verifies clientState against the stored webhook ID.

Confidence Score: 4/5

Safe to merge; the three security patches are sound and the background-import architecture is well-guarded. The two flagged items are non-blocking quality gaps.

The ClickHouse WHERE-clause blocklist misses OR 1 (truthy integer), which could be exploited via prompt injection against the user's own database. The client-side row-limit warning was removed for all import sizes, not just large files, leaving small-file users without upfront feedback before a failed import. Neither issue affects core platform security or data integrity for Sim itself.

apps/sim/app/api/tools/clickhouse/utils.ts — WHERE-clause blocklist gap; apps/sim/app/workspace/[workspaceId]/tables/components/import-csv-dialog/import-csv-dialog.tsx — row-limit UX regression for small files

Important Files Changed

Filename	Overview
apps/sim/app/api/tools/clickhouse/utils.ts	Core ClickHouse execution utility: identifier sanitization, read-only enforcement, WHERE-clause blocklist validator, format normalizer. Blocklist has a gap: `OR 1` (truthy integer) is not caught, unlike `OR TRUE`/`OR FALSE`.
apps/sim/app/chat/components/message/message.tsx	XSS fix: replaces direct document.write of user-controlled filename/dataUrl with HTML-escaped content written into a blob URL opened in a new tab.
apps/sim/lib/mcp/pinned-fetch.ts	Adds `createSsrfGuardedMcpFetch` that validates every outbound OAuth/revocation URL against the MCP SSRF policy before issuing the request, preventing SSRF via attacker-controlled metadata endpoints.
apps/sim/lib/webhooks/providers/microsoft-teams.ts	Adds clientState verification for `microsoftteams_chat_subscription` webhooks — compares each notification's clientState against the stored webhook ID using safeCompare.
apps/sim/lib/api-key/service.ts	Removes legacy scan+decrypt fallback; authentication now goes exclusively through the hash lookup path, simplifying the flow and removing a slow full-table scan.
apps/sim/lib/table/import-runner.ts	New background import worker: streams CSV from storage, infers schema, inserts in batches, emits progress events, and handles superseded-import detection and cleanup on failure.
apps/sim/app/workspace/[workspaceId]/tables/components/import-csv-dialog/import-csv-dialog.tsx	Routes large files through the async upload+import path; removes client-side capacity checks since total row count is only known after server-side streaming.

Sequence Diagram

sequenceDiagram
    participant Client
    participant ImportAPI as /api/table/import-async
    participant Storage as Object Storage
    participant Runner as TableImportRunner (detached)
    participant DB as PostgreSQL
    participant SSE as /api/table/[id]/events

    Client->>Storage: Upload CSV (presigned URL)
    Client->>ImportAPI: "POST {workspaceId, fileKey, fileName, mode, mapping}"
    ImportAPI->>DB: checkAccess + markTableImporting (claim importId)
    ImportAPI-->>Client: "202 {tableId, importId}"
    ImportAPI->>Runner: runDetached()

    Runner->>Storage: downloadFileStream(fileKey)
    Runner->>DB: nextImportStartPosition / setTableSchemaForImport
    loop batch (CSV_MAX_BATCH_SIZE rows)
        Runner->>DB: bulkInsertImportBatch
        Runner->>SSE: appendTableEvent (progress %)
        Runner->>DB: updateImportProgress
    end
    Runner->>DB: markImportReady / deleteFile(fileKey)

    Client->>SSE: EventSource stream
    SSE-->>Client: progress events (live %)

_{Reviews (1): Last reviewed commit: "fix(otel): make service.instance.id uniq..." | Re-trigger Greptile}

greptile-apps · 2026-06-05T16:50:14Z

+    /\bor\s+true\b/i,
+    /\bor\s+false\b/i,


The validateWhereClause blocklist catches OR TRUE and OR FALSE but not OR 1 (and similar truthy integer literals). In ClickHouse, WHERE id = 1 OR 1 is equivalent to WHERE 1, affecting every row. A prompt-injection attack that steers the LLM toward a WHERE clause like status = 'active' OR 1 would silently bypass the guard and delete or update all rows in the target table.

Suggested change

/\bor\s+true\b/i,

/\bor\s+false\b/i,

/\bor\s+true\b/i,

/\bor\s+false\b/i,

/\bor\s+\d+\b/i,

/\band\s+\d+\b/i,

greptile-apps · 2026-06-05T16:50:15Z

    duplicateTargets.length === 0 &&
-    mappedCount + createCount > 0 &&
-    appendCapacityDeficit === 0 &&
-    replaceCapacityDeficit === 0
+    mappedCount + createCount > 0

  async function handleSubmit() {
    if (!parsed || !canSubmit) return
    setSubmitError(null)
+    const createColumns = createHeaders.size > 0 ? [...createHeaders] : undefined
+
+    // Large files can't be POSTed through the server (request-body cap) — upload them
+    // straight to storage and import in the background instead. Seed the header tray and


Client-side row-limit guard removed for all imports

The previous code blocked the submit button and showed an error when a file's row count would exceed table.maxRows for append or replace mode. This guard was removed entirely rather than keeping it only for the synchronous (small-file) path. For files below CSV_ASYNC_IMPORT_THRESHOLD_BYTES, the total row count is still known from the preview parse (parsedCsvBuffer returns all rows for small files), so the client could still surface the warning for the synchronous path. Without it, users learn about the row-limit violation only after the upload completes, and the error message comes from the server rather than appearing before the import begins.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

…lag-gated, default off) (#4890) * feat(tables): add order_key column, fractional-indexing util, and ordering flag (off) * feat(tables): write order_key on insert, flag-gate delete reindex + query ordering, add backfill Flag off (default) = identical behavior. Single-insert assigns a fractional order_key; queryRows orders by order_key when the flag is on; deletes skip the O(N) reindex when on. Per-table-atomic backfill script populates existing rows. * feat(tables): write order_key on all insert paths (batch, upsert, replace, import, create, copilot) Completes the always-write-keys prerequisite: every row insert now assigns a fractional order_key consistent with position order, so the flag can be flipped safely after backfill. Flag off (default) still = identical behavior. * feat(tables): insert-by-neighbor-id + orderKey on wire + client order-by-key Inserts express intent as afterRowId/beforeRowId (O(1) key mint via the (table_id,order_key,id) index); orderKey is returned on every row; client reconcile/undo place by orderKey (no neighbor bump) with position fallback. Flag off = unchanged. 205 table tests pass. * feat(tables): resolve position-based inserts by key ordinal under the flag Position-based callers (mothership tool, v1 API, undo fallback, transient old clients) resolve their insert neighbor by order_key ordinal (OFFSET) when the flag is on — positions are gappy then, so WHERE position=N would miss. Flag off keeps the indexed position lookup. The mothership tool itself is unchanged. * test(tables): flag-on coverage — delete skips reindex, insert mints key + no shift * fix lint * chore(db): regenerate order_key migration with default drizzle name * fix(tables): address review — guard neighbor insert + mutual-exclusion + safe reconcile - resolveInsertByNeighbor throws when the anchor row is missing (was silently inserting at the front) and when its order_key is null under the flag. - insert contract: afterRowId/beforeRowId are mutually exclusive (refine). - reconcileCreatedRow only key-sorts when every cached row is keyed, so mid- backfill un-keyed rows aren't yanked to the front. * fix(kb): restore non-null guard in storage-key filter (unsafe-lint regression) * refactor(tables): extract maxOrderKey + thread import append key - Extract maxOrderKey(executor, tableId) helper; replaces three identical max(order_key) selects (single/batch insert append + import). - Import: read the append anchor once up front and thread each batch's last key forward (nextImportStartOrderKey + afterOrderKey) instead of re-scanning max(order_key) per batch over a growing table — one scan per import, not one per 1k-row batch. * fix(tables): keep insert body base omittable for v1 contract The afterRowId/beforeRowId mutual-exclusion .refine() turned the schema into a ZodEffects, which Zod forbids .omit() on — v1's insertTableRowBodySchema.omit({ position }) threw at module load (runtime-only; tsc misses it). Split the plain object base out, apply the shared refine on top, and have v1 omit from the base then re-apply it. * fix(tables): chunk backfill order-key writes A single UPDATE … FROM (VALUES …) over a whole large table overflows the JS call stack while drizzle assembles the VALUES list (and would blow past Postgres's 65535 bound-param limit at ~32k rows) — large tables failed with 'Maximum call stack size exceeded'. Write in 1000-row chunks inside the same per-table transaction so keying stays atomic. * fix(tables): emit orderKey in insert responses The single-row and batch insert handlers dropped orderKey from the JSON response even though the service returns it, so reconcileCreatedRow always fell back to position-sorting and could place neighbor inserts wrong under the fractional-ordering flag. Serialize orderKey alongside position. * fix(tables): restore by orderKey, not position, under fractional flag A saved position is the gappy column value, but under the flag insert reads position as a visual rank (OFFSET) — so position-based restore misplaces rows. - create-row redo now goes through the batch path carrying the saved orderKey (the single-insert API has no orderKey field); drop the now-unused single create mutation. - resolveBatchInsertOrderKeys appends under the flag instead of feeding gappy positions to resolveInsertOrderKey; positions remain the flag-off path. * perf(tables): backfill writes 5000 rows/chunk (was 1000) 5x fewer round-trips per table; ~10k bound params stays well under Postgres's 65535 ceiling and far below the single-statement size that overflows the stack. * fix(tables): drop rowNumber from table trigger payload position is gappy under the fractional-ordering flag, so rowNumber (= row.position) no longer reflects a contiguous visual rank. Rather than compute-on-read, remove it from the trigger payload, output schema, and column-execution input. Also pin isTablesFractionalOrderingEnabled=false in update-row.test.ts so its flag-off position-shift assertions are deterministic regardless of local env. * chore(db): format generated 0226 migration metadata biome check . flagged the drizzle-generated _journal.json and 0226_snapshot.json; apply the formatter so packages/db lint:check passes in CI. * docs(triggers): drop rowNumber from table trigger outputs rowNumber was removed from the table trigger payload; remove it from the documented output fields to match. * test(tables): remove flag-on fractional-ordering unit suite Flag-on behavior is covered by manual large-table verification; the heavily- mocked DB-chain suite added little signal.

TheodoreSpeaks and others added 15 commits June 3, 2026 21:26

chore(api-key): remove legacy scan+decrypt auth fallback (#4876)

4076d76

chore(readme): refresh demo GIFs from docs, lead with Mothership (#4878)

402472e

fix(mcp): enforce tool name validation in deploy modal (#4879)

d80f0c6

* fix(mcp): enforce tool name validation in deploy modal * fix(mcp): correct cn import path to fix build * fix(mcp): align tool-name regex with server sanitization, add disabled-combobox hint

fix(polling-tools): pass plan execution timeout to internal polling t…

71f6936

…ool routes (#4884) * fix(polling-tools): pass plan execution timeout to internal polling tool routes * address comments

fix(autolayout): relocate notes that overlap blocks after layout (#4888)

80ea0dd

* fix(autolayout): relocate notes that overlap blocks after layout * fix(autolayout): harden note overlap resolution against resize and non-finite positions

cursor Bot reviewed Jun 5, 2026

View reviewed changes

Comment thread apps/sim/app/api/cron/cleanup-stale-executions/route.ts

Comment thread apps/sim/lib/table/service.ts

cursor Bot reviewed Jun 5, 2026

View reviewed changes

greptile-apps Bot reviewed Jun 5, 2026

View reviewed changes

vercel Bot deployed to Preview June 5, 2026 21:22 View deployment

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v0.6.103: readme updates, tables lifecycle improvements, new connectors, clickhouse integration #4892

v0.6.103: readme updates, tables lifecycle improvements, new connectors, clickhouse integration #4892
waleedlatif1 wants to merge 16 commits into
mainfrom
staging

waleedlatif1 commented Jun 5, 2026

Uh oh!

vercel Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

cursor Bot commented Jun 5, 2026 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

waleedlatif1 commented Jun 5, 2026

Uh oh!

waleedlatif1 commented Jun 5, 2026

Uh oh!

cursor Bot left a comment

Uh oh!

greptile-apps Bot commented Jun 5, 2026

Uh oh!

greptile-apps Bot Jun 5, 2026

Uh oh!

greptile-apps Bot Jun 5, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

waleedlatif1 commented Jun 5, 2026

Uh oh!

vercel Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

cursor Bot commented Jun 5, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Summary

Uh oh!

Uh oh!

Uh oh!

waleedlatif1 commented Jun 5, 2026

Uh oh!

waleedlatif1 commented Jun 5, 2026

Uh oh!

cursor Bot left a comment

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot commented Jun 5, 2026

Greptile Summary

Confidence Score: 4/5

Important Files Changed

Sequence Diagram

Uh oh!

greptile-apps Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps Bot Jun 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

vercel Bot commented Jun 5, 2026 •

edited

Loading

cursor Bot commented Jun 5, 2026 •

edited

Loading