feat(integrations): add ClickHouse block and expand Dagster + Tinybird tools#4883

Merged
waleedlatif1 merged 13 commits into staging from waleedlatif1/validate-live-apis
Jun 4, 2026

Conversation

@waleedlatif1
Collaborator

Summary

  • Add ClickHouse integration: new block + 26 tools (query/insert/DDL/partitions/mutations/system introspection) over the ClickHouse HTTP interface
  • Expand Dagster tools with asset operations (get/list/materialize/report/wipe)
  • Expand Tinybird tools with datasource + pipe operations (append, delete rows, truncate, query pipe)

Type of Change

  • New feature

Testing

Tested manually; type-check, lint, and API-contract audit pass for the included changes

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Jun 4, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Jun 4, 2026 9:37pm |

Request Review

@cursor

cursor Bot commented Jun 4, 2026

PR Summary

Medium Risk
ClickHouse and expanded Dagster/Tinybird tools expose DDL, data mutation, and destructive operations (e.g. drop/truncate, wipe asset) against user-supplied endpoints; mitigations include internal auth and host/SQL validation, but misconfiguration or weak remote credentials still carry real blast radius.

Overview
This PR adds a ClickHouse workflow integration end-to-end: block, tool configs, Zod API contracts, authenticated HTTP routes, docs, and catalog entries for 26 operations (read-only query vs raw execute, inserts, async update/delete mutations, schema introspection, DDL, partitions, mutations, running queries, kill query, clusters). Server-side logic talks to ClickHouse over HTTP with host SSRF checks, read-only guards on the query path, and validation on identifiers, WHERE clauses, and DDL fragments.

Dagster gains five asset-focused tools (list/get, materialize, external materialization report, destructive wipe), richer run metadata on get-run, and list runs filters plus cursor pagination. Tinybird grows from two to six operations: published pipe endpoints, append-from-URL, truncate, and conditional row delete, with matching block UI and docs.

Docs and the integrations landing page are updated for all three; a ClickHouse icon is wired through shared icon mappings.

Reviewed by Cursor Bugbot for commit dd3c37d. Configure here.

Comment thread apps/sim/blocks/blocks/dagster.ts Outdated
Comment thread apps/sim/tools/dagster/list_assets.ts
Comment thread apps/sim/tools/tinybird/query_pipe.ts Outdated
Comment thread apps/sim/blocks/blocks/dagster.ts
parsePipeParameters previously returned {} on any JSON parse error, so a
mistyped 'parameters' input produced a successful pipe call with the dynamic
filters silently dropped. Throw a clear error for non-empty, non-object input
instead; an omitted/empty value still means 'no parameters'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
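
The behavior described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in query_pipe.ts: the `PipeParameters` type, the `assertPlainObject` helper, and the error text are assumptions made for the example.

```typescript
// Hypothetical sketch; the real parsePipeParameters may differ in shape and messages.
type PipeParameters = Record<string, unknown>

// Helper assumed for illustration: accept only plain (non-array, non-null) objects.
function assertPlainObject(value: unknown): PipeParameters {
  if (typeof value !== 'object' || value === null || Array.isArray(value)) {
    throw new Error('Invalid parameters: expected a JSON object')
  }
  return value as PipeParameters
}

function parsePipeParameters(input: unknown): PipeParameters {
  // An omitted or empty value still means "no parameters".
  if (input === undefined || input === null) return {}
  if (typeof input === 'string') {
    if (input.trim() === '') return {}
    let parsed: unknown
    try {
      parsed = JSON.parse(input)
    } catch {
      // Previously this case returned {} and silently dropped the filters.
      throw new Error('Invalid parameters: expected a JSON object')
    }
    return assertPlainObject(parsed)
  }
  return assertPlainObject(input)
}
```

Throwing on malformed input surfaces the typo to the workflow author instead of letting the pipe run unfiltered.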
…ation

Address PR review:
- Route all block numeric coercions (list_runs limit/createdAfter/createdBefore,
  get_run_logs logsLimit, list_assets assetsLimit) through a toFiniteNumber()
  guard so invalid/wand-generated text becomes undefined instead of NaN.
- list_assets now applies a default page size (100) when no limit is given, so
  paging stays bounded and hasMore is meaningful even when limit is omitted.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
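
A guard like the `toFiniteNumber()` mentioned above might look like the sketch below. The function body, the `resolvePageSize` helper, and the constant name are assumptions for illustration; only the default page size of 100 comes from the commit message.

```typescript
// Hypothetical sketch of a numeric-coercion guard; the real helper may differ.
function toFiniteNumber(value: unknown): number | undefined {
  if (typeof value === 'number') return Number.isFinite(value) ? value : undefined
  // Number('') is 0, so empty or whitespace-only strings are rejected explicitly.
  if (typeof value !== 'string' || value.trim() === '') return undefined
  const parsed = Number(value)
  return Number.isFinite(parsed) ? parsed : undefined
}

// Assumed name; the commit message only states that the default is 100.
const DEFAULT_ASSETS_PAGE_SIZE = 100

// Falls back to the default so paging stays bounded when limit is omitted or invalid.
function resolvePageSize(limit: unknown): number {
  return toFiniteNumber(limit) ?? DEFAULT_ASSETS_PAGE_SIZE
}
```

Returning `undefined` rather than `NaN` lets downstream code treat an invalid value the same way as a missing one.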
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/clickhouse/query/route.ts Outdated
Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
Comment thread apps/sim/tools/dagster/list_assets.ts
Address PR review (hasMore true on exact page): request one extra row
(pageSize + 1), use its presence as the authoritative hasMore, slice it off,
and derive the returned cursor from the last RETURNED asset's key path
(JSON-serialized; Dagster normalizes JS/Python whitespace on the way in).
This removes the false-positive hasMore when the final page is exactly full.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
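
The overfetch-by-one approach described above can be sketched as a pure function over the rows returned by the upstream query. The `Asset` and `Page` shapes and the `paginate` name are assumptions for this example, as is returning a `null` cursor on the last page; the caller is assumed to have requested `pageSize + 1` rows.

```typescript
// Hypothetical shapes for illustration; the real tool's types may differ.
interface Asset {
  keyPath: string[]
}

interface Page {
  assets: Asset[]
  hasMore: boolean
  cursor: string | null
}

// `fetched` is assumed to be the result of requesting pageSize + 1 rows.
function paginate(fetched: Asset[], pageSize: number): Page {
  // The extra row's presence is the authoritative signal that another page exists.
  const hasMore = fetched.length > pageSize
  // Slice the extra row off so the caller only sees pageSize rows.
  const assets = hasMore ? fetched.slice(0, pageSize) : fetched
  // The cursor comes from the last RETURNED asset, not the discarded extra row.
  const last = assets[assets.length - 1]
  const cursor = hasMore && last ? JSON.stringify(last.keyPath) : null
  return { assets, hasMore, cursor }
}
```

With this shape, a final page that is exactly `pageSize` long yields `hasMore: false`, because the extra row never arrives.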
@greptile-apps
Contributor

greptile-apps Bot commented Jun 4, 2026

Greptile Summary

This PR adds a full ClickHouse integration (26 tools + API routes + UI block), expands Dagster with asset operations (get, list, materialize, report, wipe), and adds four new Tinybird datasource/pipe tools. Credential visibility follows user-only conventions throughout, ClickHouse operations include SQL injection defenses (identifier sanitization, WHERE clause denylist, read-only enforcement), and the prior Tinybird path-traversal fixes are already applied.

  • ClickHouse (26 tools): HTTP-interface client with timeout, AbortController, Zod contract validation on every route, and layered SQL safety.
  • Dagster asset tools: Five new GraphQL operations using the existing parseDagsterGraphqlResponse plumbing with overfetch-by-one pagination.
  • Tinybird tools: Four new tools (query pipe, append, delete rows, truncate) with encodeURIComponent on all path segments.

Confidence Score: 5/5

This PR is safe to merge. All credential fields use user-only visibility, auth checks are applied on every new API route, and the ClickHouse HTTP client includes identifier sanitization, WHERE-clause filtering, and a read-only enforcement layer.

The integration is well-structured with consistent patterns across all 26 ClickHouse tools. The two observations are minor usability and robustness improvements that do not affect correctness of the happy path.

No files require special attention for merge readiness. apps/sim/app/api/tools/clickhouse/utils.ts has the identifier-regex limitation worth addressing before users with hyphenated table names hit it.
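
To illustrate the identifier-regex limitation noted above, here is a rough sketch contrasting a strict allowlist with a quoting approach. Neither function is taken from utils.ts: the regex, both function names, and the decision to reject backticks, backslashes, and NUL bytes outright are assumptions for this example.

```typescript
// Assumed strict pattern: letters, digits, underscore; rejects hyphenated names.
const STRICT_IDENTIFIER = /^[A-Za-z_][A-Za-z0-9_]*$/

function sanitizeIdentifierStrict(name: string): string {
  if (!STRICT_IDENTIFIER.test(name)) {
    throw new Error(`Invalid identifier: ${name}`)
  }
  return name
}

// Hypothetical alternative: wrap the name in backticks, refusing any character
// that could terminate or escape the quoted identifier.
function quoteIdentifier(name: string): string {
  if (name.length === 0 || /[`\\\0]/.test(name)) {
    throw new Error(`Invalid identifier: ${name}`)
  }
  return `\`${name}\``
}
```

Under these assumptions a table named `my-table` is rejected by the strict version but accepted by the quoting version, at the cost of having to emit quoted identifiers everywhere the name is used.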

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/app/api/tools/clickhouse/utils.ts | Core ClickHouse utility with HTTP requests, query validation, identifier sanitization, WHERE clause denylist, and DDL generation. sanitizeIdentifier regex is overly restrictive for hyphenated table names. |
| apps/sim/lib/api/contracts/tools/databases/clickhouse.ts | Zod validation contracts for all 26 ClickHouse API routes; thorough input validation with sensible defaults. |
| apps/sim/blocks/blocks/clickhouse.ts | ClickHouse UI block definition with all 26 operations; correct user-only visibility for credentials. |
| apps/sim/tools/dagster/list_assets.ts | Overfetch-by-one pagination; cursor derived from JSON.stringify of last asset path rather than Dagster API cursor. |
| apps/sim/tools/dagster/materialize_assets.ts | Asset materialization via dynamic GraphQL mutation; tags JSON parsing defers array validation to GraphQL layer. |
| apps/sim/tools/dagster/wipe_asset.ts | Destructive wipe-asset mutation; correct response handling with optional chaining for wiped key. |
| apps/sim/tools/tinybird/query_pipe.ts | Calls Tinybird Pipe endpoint with URL-encoded parameters; correctly guards reserved q key. |
| apps/sim/tools/tinybird/delete_datasource_rows.ts | Conditional row delete with encodeURIComponent on datasource path segment. |

Sequence Diagram

sequenceDiagram
    participant UI as Sim UI Block
    participant Route as Next.js API Route
    participant Utils as clickhouse/utils.ts
    participant CH as ClickHouse HTTP Interface

    UI->>Route: "POST /api/tools/clickhouse/<op>"
    Route->>Route: checkInternalAuth()
    Route->>Route: parseToolRequest Zod validation
    Route->>Utils: "executeClickHouse*()"
    Utils->>Utils: validateDatabaseHost()
    Utils->>Utils: sanitizeIdentifier / validateWhereClause / enforceReadOnly
    Utils->>CH: POST with X-ClickHouse-User and SQL body
    CH-->>Utils: JSON body + X-ClickHouse-Summary header
    Utils-->>Route: rows and rowCount
    Route-->>UI: message rows rowCount
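
The final hop in the diagram above (Utils to the ClickHouse HTTP interface) might be assembled along these lines. This is a sketch under assumptions: `buildClickHouseRequest` and its option names are hypothetical, the request is assumed to use HTTPS, and the `X-ClickHouse-Key` header plus the `database` and `default_format` query parameters are taken from the ClickHouse HTTP interface rather than from this PR. It only builds the request and does not send it.

```typescript
// Hypothetical request shape; the real utils.ts helpers may differ.
interface ClickHouseRequest {
  url: string
  init: { method: 'POST'; headers: Record<string, string>; body: string }
}

interface ClickHouseOptions {
  host: string
  port: number
  user: string
  password: string
  database: string
  sql: string
}

function buildClickHouseRequest(opts: ClickHouseOptions): ClickHouseRequest {
  const url = new URL(`https://${opts.host}:${opts.port}/`)
  // Select the database and ask for JSON output via query parameters.
  url.searchParams.set('database', opts.database)
  url.searchParams.set('default_format', 'JSON')
  return {
    url: url.toString(),
    init: {
      method: 'POST',
      // Credentials travel in headers rather than in the URL.
      headers: {
        'X-ClickHouse-User': opts.user,
        'X-ClickHouse-Key': opts.password,
      },
      // The SQL statement is sent as the request body.
      body: opts.sql,
    },
  }
}
```

A caller would pass `url` and `init` to `fetch` after the host and SQL checks shown in the diagram have run.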

Reviews (9): Last reviewed commit: "fix(clickhouse): balance-check ORDER BY/..." | Re-trigger Greptile

@waleedlatif1
Collaborator Author

@greptile

Comment thread apps/sim/tools/dagster/list_assets.ts
Comment thread apps/sim/tools/dagster/materialize_assets.ts
Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/tools/dagster/list_runs.ts
Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
Address PR review (list runs false hasMore): request one extra row
(pageSize + 1), use its presence as the authoritative hasMore, and slice it
off before mapping. Removes the false-positive hasMore (and misleading cursor)
when the final page is exactly `limit` runs long. Mirrors the list_assets fix.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
Comment thread apps/docs/components/icons.tsx
Comment thread apps/sim/tools/tinybird/truncate_datasource.ts
…raversal

A user-or-llm datasource/pipe name interpolated raw into the URL path (e.g.
'real_ds/../../other') is normalized by the WHATWG URL parser and can target a
different endpoint. Wrap the path segment with encodeURIComponent in the
truncate, delete, and query_pipe URLs. Events/append pass the name via
URLSearchParams, which already encodes, so they were unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
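
The normalization issue described above can be reproduced with the standard `URL` class. In this sketch the function names, the base URL, and the `/v0/datasources/<name>/truncate` path are assumptions for illustration and may not match the tool's actual URLs.

```typescript
// Unsafe variant: the name is interpolated raw, so "../" segments get normalized away.
function buildTruncateUrlUnsafe(baseUrl: string, datasource: string): string {
  return new URL(`/v0/datasources/${datasource}/truncate`, baseUrl).toString()
}

// Safer variant: encodeURIComponent turns "/" into "%2F", so the name stays one segment.
function buildTruncateUrl(baseUrl: string, datasource: string): string {
  return new URL(
    `/v0/datasources/${encodeURIComponent(datasource)}/truncate`,
    baseUrl
  ).toString()
}

const malicious = 'real_ds/../../other'
const unsafePath = new URL(buildTruncateUrlUnsafe('https://api.example.com', malicious)).pathname
const safePath = new URL(buildTruncateUrl('https://api.example.com', malicious)).pathname
```

With the raw name, the parser resolves the dot segments and the request targets a different path. With the encoded name, the slashes are escaped and the path keeps its intended prefix. Note that `encodeURIComponent` leaves dots alone, so a name that is exactly `..` would still be normalized; this sketch does not handle that case.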
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 9a8fc9c. Configure here.

Comment thread apps/sim/tools/dagster/list_runs.ts
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
Comment thread apps/sim/app/api/tools/clickhouse/utils.ts
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit dd3c37d. Configure here.

@waleedlatif1 waleedlatif1 merged commit 16954e4 into staging Jun 4, 2026
14 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/validate-live-apis branch June 4, 2026 21:49