From 59748bbdfe158190871fbec47be68d7ce14823ac Mon Sep 17 00:00:00 2001 From: Waleed Latif Date: Wed, 3 Jun 2026 10:25:22 -0700 Subject: [PATCH 1/4] feat(storage): support S3-compatible endpoints (R2, MinIO, B2) for file storage Add S3_ENDPOINT and S3_FORCE_PATH_STYLE env vars, wired into the shared upload S3 client so Cloudflare R2, MinIO, Backblaze B2, and other S3-compatible stores work for self-hosted file storage. The endpoint is trusted operator config (no SSRF/HTTPS gate). Makes the multipart Location fallback endpoint-aware, extends the S3 client unit tests, and documents the new vars in Helm values, .env.example, and the English self-hosting docs (incl. browser-reachability + CORS guidance). --- .../en/self-hosting/environment-variables.mdx | 13 + .../content/docs/en/self-hosting/meta.json | 1 + .../docs/en/self-hosting/object-storage.mdx | 272 ++++++++++++++++++ apps/sim/.env.example | 15 + apps/sim/lib/core/config/env.ts | 4 +- apps/sim/lib/uploads/config.ts | 11 +- apps/sim/lib/uploads/core/setup.server.ts | 8 +- .../lib/uploads/providers/s3/client.test.ts | 43 ++- apps/sim/lib/uploads/providers/s3/client.ts | 18 +- helm/sim/examples/values-aws.yaml | 3 + helm/sim/values.yaml | 2 + 11 files changed, 381 insertions(+), 9 deletions(-) create mode 100644 apps/docs/content/docs/en/self-hosting/object-storage.mdx diff --git a/apps/docs/content/docs/en/self-hosting/environment-variables.mdx b/apps/docs/content/docs/en/self-hosting/environment-variables.mdx index 8d481b0d62f..0a11464a8f1 100644 --- a/apps/docs/content/docs/en/self-hosting/environment-variables.mdx +++ b/apps/docs/content/docs/en/self-hosting/environment-variables.mdx @@ -70,6 +70,19 @@ import { Callout } from 'fumadocs-ui/components/callout' | `ALLOWED_LOGIN_EMAILS` | Restrict signups to specific emails (comma-separated) | | `DISABLE_REGISTRATION` | Set to `true` to disable new user signups | +## File Storage + +By default Sim writes uploads to local disk. 
For production, point it at AWS S3 or Azure Blob. See [Object Storage](/self-hosting/object-storage) for the full setup, bucket layout, and IAM policy. + +| Variable | Description | +|----------|-------------| +| `AWS_REGION` | AWS region — set with `S3_BUCKET_NAME` to enable S3 | +| `AWS_ACCESS_KEY_ID` | AWS access key. Omit to use the instance/IRSA credential chain | +| `AWS_SECRET_ACCESS_KEY` | AWS secret key. Omit to use the instance/IRSA credential chain | +| `S3_BUCKET_NAME` | General workspace files bucket — set with `AWS_REGION` to enable S3 | +| `AZURE_STORAGE_CONTAINER_NAME` | General files container — set with Azure credentials to enable Blob (takes precedence over S3) | +| `AZURE_CONNECTION_STRING` | Azure connection string, or use `AZURE_ACCOUNT_NAME` + `AZURE_ACCOUNT_KEY` | + ## Email Providers Configure one provider — the mailer auto-detects in priority order: **Resend → AWS SES → SMTP → Azure Communication Services**. If none are configured, emails are logged to the console instead. 
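Note on the File Storage table added in the hunk above: the rules it states (S3 is enabled only when both `AWS_REGION` and `S3_BUCKET_NAME` are set, and Azure Blob takes precedence over S3) can be sketched as a small TypeScript function. This is a minimal sketch under stated assumptions, not the code in `apps/sim/lib/uploads/config.ts`: `StorageEnv`, `Backend`, and `selectBackend` are hypothetical names introduced for illustration, and the Azure credential check follows the rule described in the new `object-storage.mdx` page.

```typescript
// Hypothetical sketch: `Backend`, `StorageEnv`, and `selectBackend` are
// illustrative names, not identifiers from the Sim codebase.
type Backend = 'blob' | 's3' | 'local'

interface StorageEnv {
  AZURE_STORAGE_CONTAINER_NAME?: string
  AZURE_ACCOUNT_NAME?: string
  AZURE_ACCOUNT_KEY?: string
  AZURE_CONNECTION_STRING?: string
  S3_BUCKET_NAME?: string
  AWS_REGION?: string
}

function selectBackend(env: StorageEnv): Backend {
  // Azure Blob: container name plus either (account name + key) or a connection string.
  const hasBlobCredentials =
    Boolean(env.AZURE_ACCOUNT_NAME && env.AZURE_ACCOUNT_KEY) ||
    Boolean(env.AZURE_CONNECTION_STRING)
  const useBlob = Boolean(env.AZURE_STORAGE_CONTAINER_NAME) && hasBlobCredentials

  // S3: bucket and region must both be set, and Azure must not be configured.
  const hasS3Config = Boolean(env.S3_BUCKET_NAME && env.AWS_REGION)
  const useS3 = hasS3Config && !useBlob

  if (useBlob) return 'blob'
  if (useS3) return 's3'
  // Local disk is the fallback when neither cloud backend is configured.
  return 'local'
}
```

Under this reading, a Cloudflare R2 setup with `AWS_REGION=auto` still selects `s3`. As far as this patch shows, `S3_ENDPOINT` and `S3_FORCE_PATH_STYLE` only change how the S3 client is constructed, not which backend is chosen.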
diff --git a/apps/docs/content/docs/en/self-hosting/meta.json b/apps/docs/content/docs/en/self-hosting/meta.json index 805cfb659a1..8ec1af87ec8 100644 --- a/apps/docs/content/docs/en/self-hosting/meta.json +++ b/apps/docs/content/docs/en/self-hosting/meta.json @@ -5,6 +5,7 @@ "docker", "kubernetes", "platforms", + "object-storage", "environment-variables", "troubleshooting" ], diff --git a/apps/docs/content/docs/en/self-hosting/object-storage.mdx b/apps/docs/content/docs/en/self-hosting/object-storage.mdx new file mode 100644 index 00000000000..0f0ed9e4100 --- /dev/null +++ b/apps/docs/content/docs/en/self-hosting/object-storage.mdx @@ -0,0 +1,272 @@ +--- +title: Object Storage +description: Configure where Sim stores uploaded files — local disk, AWS S3, or Azure Blob +--- + +import { Tab, Tabs } from 'fumadocs-ui/components/tabs' +import { Callout } from 'fumadocs-ui/components/callout' +import { Step, Steps } from 'fumadocs-ui/components/steps' +import { FAQ } from '@/components/ui/faq' + +Sim stores every uploaded file — knowledge base documents, chat attachments, execution outputs, profile pictures, and more — in object storage. Three backends are supported: + +| Backend | When to use | +|---------|-------------| +| **Local disk** | Single-node Docker, local development, evaluation | +| **AWS S3** | Production, especially when running more than one app replica | +| **Azure Blob** | Production on Azure | + + + Local disk writes to the container's `/uploads` directory. Files are lost when the container is recreated unless that path is on a persistent volume, and they are **not** shared across replicas. For any multi-replica or production deployment, use S3 or Azure Blob. + + +## How the backend is selected + +Sim picks the backend automatically from environment variables — there is no explicit "provider" flag. The logic, in order of precedence: + +1. 
**Azure Blob** — used if `AZURE_STORAGE_CONTAINER_NAME` is set **and** either (`AZURE_ACCOUNT_NAME` + `AZURE_ACCOUNT_KEY`) or `AZURE_CONNECTION_STRING` is set. +2. **AWS S3** — used if `S3_BUCKET_NAME` **and** `AWS_REGION` are set (and Azure is not configured). +3. **Local disk** — the fallback when neither is configured. + +If both Azure and S3 are configured, **Azure wins**. Set only the variables for the backend you intend to use. + +## Set up AWS S3 + + + + + +### Create the buckets + +Sim separates files into purpose-specific buckets. At minimum you need the general workspace bucket; the rest are created on demand based on which env vars you set. A bucket that isn't configured falls back to the general bucket where the code allows it, but the recommended setup is one bucket per purpose. + +```bash +# Set your region once +export AWS_REGION=us-east-1 + +# Create buckets (names must be globally unique — prefix with your org) +for name in workspace-files knowledge-base execution-files chat-files \ + copilot-files profile-pictures og-images workspace-logos; do + aws s3api create-bucket \ + --bucket "myorg-sim-$name" \ + --region "$AWS_REGION" \ + --create-bucket-configuration LocationConstraint="$AWS_REGION" +done +``` + + + In `us-east-1`, omit the `--create-bucket-configuration` flag — that region rejects an explicit `LocationConstraint`. + + +Keep all buckets **private** (block public access). Sim serves files through short-lived presigned URLs, so the buckets never need public read access. 
+ + + + + +### Grant access with an IAM policy + +Create an IAM policy scoped to your buckets and attach it to the user (or role) Sim runs as: + +```json +{ + "Version": "2012-10-17", + "Statement": [ + { + "Effect": "Allow", + "Action": [ + "s3:GetObject", + "s3:PutObject", + "s3:DeleteObject", + "s3:ListBucket" + ], + "Resource": [ + "arn:aws:s3:::myorg-sim-*", + "arn:aws:s3:::myorg-sim-*/*" + ] + } + ] +} +``` + +You then have two ways to supply credentials: + +- **Static keys** — create an IAM user with this policy and set `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY`. +- **Instance/role credentials (recommended)** — attach the policy to the EC2 instance role, ECS task role, or EKS IRSA role. Leave `AWS_ACCESS_KEY_ID` / `AWS_SECRET_ACCESS_KEY` unset and Sim falls back to the default AWS credential chain automatically. + + + + + +### Configure environment variables + +Set the region, optionally the credentials, and the bucket names: + +```bash +# Region + credentials +AWS_REGION=us-east-1 +AWS_ACCESS_KEY_ID=AKIA... # omit when using an instance/IRSA role +AWS_SECRET_ACCESS_KEY=... # omit when using an instance/IRSA role + +# Buckets (per purpose) +S3_BUCKET_NAME=myorg-sim-workspace-files +S3_KB_BUCKET_NAME=myorg-sim-knowledge-base +S3_EXECUTION_FILES_BUCKET_NAME=myorg-sim-execution-files +S3_CHAT_BUCKET_NAME=myorg-sim-chat-files +S3_COPILOT_BUCKET_NAME=myorg-sim-copilot-files +S3_PROFILE_PICTURES_BUCKET_NAME=myorg-sim-profile-pictures +S3_OG_IMAGES_BUCKET_NAME=myorg-sim-og-images +S3_WORKSPACE_LOGOS_BUCKET_NAME=myorg-sim-workspace-logos +``` + +Only `AWS_REGION` and `S3_BUCKET_NAME` are strictly required to switch Sim into S3 mode. Add the others so each file type lands in its own bucket. 
+ + + + + +### S3 bucket reference + +| Variable | Stores | Required | +|----------|--------|----------| +| `AWS_REGION` | Region for all buckets | **Yes** (enables S3) | +| `AWS_ACCESS_KEY_ID` | Access key | No (uses credential chain if unset) | +| `AWS_SECRET_ACCESS_KEY` | Secret key | No (uses credential chain if unset) | +| `S3_BUCKET_NAME` | General workspace files | **Yes** (enables S3) | +| `S3_KB_BUCKET_NAME` | Knowledge base documents | Recommended | +| `S3_EXECUTION_FILES_BUCKET_NAME` | Workflow execution files (default: `sim-execution-files`) | Recommended | +| `S3_CHAT_BUCKET_NAME` | Deployed chat assets | Recommended | +| `S3_COPILOT_BUCKET_NAME` | Copilot attachments | Recommended | +| `S3_PROFILE_PICTURES_BUCKET_NAME` | User avatars | Recommended | +| `S3_OG_IMAGES_BUCKET_NAME` | OpenGraph preview images (falls back to `S3_BUCKET_NAME`) | Optional | +| `S3_WORKSPACE_LOGOS_BUCKET_NAME` | Workspace logos (falls back to `S3_BUCKET_NAME`) | Optional | +| `S3_LOGS_BUCKET_NAME` | Stored logs | Optional | +| `S3_ENDPOINT` | Custom endpoint for S3-compatible storage (R2, MinIO, B2) | Optional (AWS S3 if unset) | +| `S3_FORCE_PATH_STYLE` | `true` for path-style addressing (MinIO/Ceph) | Optional (defaults `false`) | + +## Apply the configuration + + + + +Add the storage variables to the `.env` file used by `docker-compose.prod.yml`, then restart: + +```bash +docker compose -f docker-compose.prod.yml up -d +``` + +Because files now live in S3, you no longer depend on a local `/uploads` volume for durability. + + + + +Set the variables under `app.env` (non-secret, e.g. region and bucket names) and supply credentials through a secret. 
The chart ships a complete example at `helm/sim/examples/values-aws.yaml`: + +```yaml +app: + env: + AWS_REGION: "us-east-1" + S3_BUCKET_NAME: "myorg-sim-workspace-files" + S3_KB_BUCKET_NAME: "myorg-sim-knowledge-base" + S3_EXECUTION_FILES_BUCKET_NAME: "myorg-sim-execution-files" + # ...remaining buckets +``` + +On EKS, prefer **IRSA**: attach the IAM policy to the service account's role and leave the access-key variables unset. + + + + +## Set up Azure Blob + +Azure Blob uses one container per purpose, mirroring the S3 layout. Authenticate with either a connection string or an account name + key. + +```bash +# Credentials — provide ONE of these forms +AZURE_ACCOUNT_NAME=mystorageaccount +AZURE_ACCOUNT_KEY=... +# or +AZURE_CONNECTION_STRING=DefaultEndpointsProtocol=https;AccountName=...;AccountKey=...;EndpointSuffix=core.windows.net + +# Containers (per purpose) +AZURE_STORAGE_CONTAINER_NAME=workspace-files +AZURE_STORAGE_KB_CONTAINER_NAME=knowledge-base +AZURE_STORAGE_EXECUTION_FILES_CONTAINER_NAME=execution-files +AZURE_STORAGE_CHAT_CONTAINER_NAME=chat-files +AZURE_STORAGE_COPILOT_CONTAINER_NAME=copilot-files +AZURE_STORAGE_PROFILE_PICTURES_CONTAINER_NAME=profile-pictures +AZURE_STORAGE_OG_IMAGES_CONTAINER_NAME=og-images +AZURE_STORAGE_WORKSPACE_LOGOS_CONTAINER_NAME=workspace-logos +``` + +A full Helm example lives at `helm/sim/examples/values-azure.yaml`. + +## Set up an S3-compatible provider (R2, MinIO, B2) + +Sim works with any S3-compatible store by pointing the S3 client at a custom endpoint. Configure it exactly like AWS S3 (buckets, access key, secret), then add `S3_ENDPOINT` — and `S3_FORCE_PATH_STYLE` where the provider requires path-style addressing. + + + `S3_ENDPOINT` is trusted operator configuration, so it is used as-is — `http://` and private hosts are accepted (no SSRF/HTTPS gate). Don't wire it to untrusted input. 
+ + + + **The endpoint must be reachable from your users' browsers, and the bucket needs CORS.** Uploads use presigned `PUT` requests sent **directly from the browser** to `S3_ENDPOINT` (downloads are proxied back through the app, so they only need server-side reachability). This means: + + - A purely internal endpoint (e.g. `https://minio.internal:9000` that only the app pods can resolve) will let the server start cleanly but **uploads will fail in the browser**. Use an endpoint your users can reach. + - Configure a **CORS policy** on the bucket that allows your Sim origin (`PUT`, `GET`, and the `Authorization` / `Content-Type` / `x-amz-*` headers). This applies to AWS S3 too — R2 and MinIO are no different. + + + + + +R2 uses virtual-hosted style (the default) and the region `auto`: + +```bash +AWS_REGION=auto +S3_ENDPOINT=https://.r2.cloudflarestorage.com +AWS_ACCESS_KEY_ID= +AWS_SECRET_ACCESS_KEY= +S3_BUCKET_NAME=myorg-sim-workspace-files +# ...remaining S3_*_BUCKET_NAME vars, one R2 bucket each +``` + +Leave `S3_FORCE_PATH_STYLE` unset — R2 supports the default virtual-hosted addressing. + + + + +MinIO (and Ceph RGW) need path-style addressing and accept any region string: + +```bash +AWS_REGION=us-east-1 +S3_ENDPOINT=https://minio.example.com # must be reachable from users' browsers, not app-pods-only +S3_FORCE_PATH_STYLE=true +AWS_ACCESS_KEY_ID= +AWS_SECRET_ACCESS_KEY= +S3_BUCKET_NAME=myorg-sim-workspace-files +# ...remaining S3_*_BUCKET_NAME vars, one bucket each +``` + +`http://` works server-side, but since the browser uploads directly to this endpoint, prefer a TLS endpoint your users can reach (a mixed-content `http://` target will be blocked on an `https://` Sim origin). + + + + +## Verify it works + +After restarting with the new configuration: + +1. Open the app and upload a document to a knowledge base (or set a profile picture). +2. Confirm an object appears in the corresponding bucket/container. +3. 
Reload the page — the file should still render (downloads stream back through the app at `/api/files/serve`). + +If uploads fail, check the app logs for credential or permission errors (see [Troubleshooting](/self-hosting/troubleshooting)). + + diff --git a/apps/sim/.env.example b/apps/sim/.env.example index ca6012c7bb1..180e9b56e98 100644 --- a/apps/sim/.env.example +++ b/apps/sim/.env.example @@ -71,6 +71,21 @@ API_ENCRYPTION_KEY=your_api_encryption_key # Use `openssl rand -hex 32` to gener # PEOPLEDATALABS_API_KEY_1= # People Data Labs API key #1 # PEOPLEDATALABS_API_KEY_2= # People Data Labs API key #2 +# File Storage (Optional - defaults to local disk; use S3 or Azure Blob for production) +# AWS_REGION=us-east-1 # Required with S3_BUCKET_NAME to enable S3. Use "auto" for Cloudflare R2 +# AWS_ACCESS_KEY_ID= # Omit to use the instance/IRSA credential chain +# AWS_SECRET_ACCESS_KEY= # Omit to use the instance/IRSA credential chain +# S3_BUCKET_NAME= # General workspace files bucket (required with AWS_REGION to enable S3) +# S3_KB_BUCKET_NAME= # Knowledge base documents +# S3_EXECUTION_FILES_BUCKET_NAME= # Workflow execution files +# S3_CHAT_BUCKET_NAME= # Deployed chat assets +# S3_COPILOT_BUCKET_NAME= # Copilot attachments +# S3_PROFILE_PICTURES_BUCKET_NAME= # User profile pictures +# S3_OG_IMAGES_BUCKET_NAME= # OpenGraph preview images (falls back to S3_BUCKET_NAME) +# S3_WORKSPACE_LOGOS_BUCKET_NAME= # Workspace logos (falls back to S3_BUCKET_NAME) +# S3_ENDPOINT= # Custom endpoint for S3-compatible storage (Cloudflare R2, MinIO, Backblaze B2). Leave unset for AWS S3 +# S3_FORCE_PATH_STYLE=true # Required for MinIO/Ceph RGW. Leave unset for AWS S3 and R2 + # Admin API (Optional - for self-hosted GitOps) # ADMIN_API_KEY= # Use `openssl rand -hex 32` to generate. Enables admin API for workflow export/import. 
# Usage: curl -H "x-admin-key: your_key" https://your-instance/api/v1/admin/workspaces diff --git a/apps/sim/lib/core/config/env.ts b/apps/sim/lib/core/config/env.ts index 223eb519524..9714415226d 100644 --- a/apps/sim/lib/core/config/env.ts +++ b/apps/sim/lib/core/config/env.ts @@ -218,8 +218,10 @@ export const env = createEnv({ S3_PROFILE_PICTURES_BUCKET_NAME: z.string().optional(), // S3 bucket for profile pictures S3_OG_IMAGES_BUCKET_NAME: z.string().optional(), // S3 bucket for OpenGraph images S3_WORKSPACE_LOGOS_BUCKET_NAME: z.string().optional(), // S3 bucket for workspace logos + S3_ENDPOINT: z.string().optional(), // Custom endpoint for S3-compatible storage (Cloudflare R2, MinIO, Backblaze B2). Leave unset for AWS S3 + S3_FORCE_PATH_STYLE: z.boolean().optional(), // Force path-style addressing (MinIO/Ceph RGW). Defaults to false (AWS S3, R2) - // Cloud Storage - Azure Blob + // Cloud Storage - Azure Blob AZURE_ACCOUNT_NAME: z.string().optional(), // Azure storage account name AZURE_ACCOUNT_KEY: z.string().optional(), // Azure storage account key AZURE_CONNECTION_STRING: z.string().optional(), // Azure storage connection string diff --git a/apps/sim/lib/uploads/config.ts b/apps/sim/lib/uploads/config.ts index f38d147db07..d833d19c9f8 100644 --- a/apps/sim/lib/uploads/config.ts +++ b/apps/sim/lib/uploads/config.ts @@ -1,4 +1,4 @@ -import { env } from '@/lib/core/config/env' +import { env, envBoolean } from '@/lib/core/config/env' import type { StorageConfig, StorageContext } from '@/lib/uploads/shared/types' export type { StorageConfig, StorageContext } from '@/lib/uploads/shared/types' @@ -17,6 +17,15 @@ export const USE_S3_STORAGE = hasS3Config && !USE_BLOB_STORAGE export const S3_CONFIG = { bucket: env.S3_BUCKET_NAME || '', region: env.AWS_REGION || '', + /** + * Custom endpoint for S3-compatible providers (Cloudflare R2, MinIO, Backblaze B2). + * Unset means the AWS SDK derives the host from `region`, targeting AWS S3. 
+ * This is trusted operator configuration (not user input), so it is passed + * through verbatim — `http://` and private hosts are allowed for on-prem MinIO. + */ + endpoint: env.S3_ENDPOINT || undefined, + /** Path-style addressing — required by MinIO/Ceph RGW; AWS S3 and R2 use the default `false`. */ + forcePathStyle: envBoolean(env.S3_FORCE_PATH_STYLE) ?? false, } export const BLOB_CONFIG = { diff --git a/apps/sim/lib/uploads/core/setup.server.ts b/apps/sim/lib/uploads/core/setup.server.ts index 87801a29f96..f2dcd51daff 100644 --- a/apps/sim/lib/uploads/core/setup.server.ts +++ b/apps/sim/lib/uploads/core/setup.server.ts @@ -2,7 +2,7 @@ import { existsSync } from 'fs' import { mkdir } from 'fs/promises' import path, { join } from 'path' import { createLogger } from '@sim/logger' -import { env } from '@/lib/core/config/env' +import { env, isTruthy } from '@/lib/core/config/env' import { getStorageProvider, USE_BLOB_STORAGE, USE_S3_STORAGE } from '@/lib/uploads/config' const logger = createLogger('UploadsSetup') @@ -79,6 +79,12 @@ if (typeof process !== 'undefined') { } else { logger.info('AWS S3 credentials found in environment variables') } + + if (env.S3_ENDPOINT) { + logger.info( + `Using S3-compatible endpoint: ${env.S3_ENDPOINT} (path-style: ${isTruthy(env.S3_FORCE_PATH_STYLE)})` + ) + } } else { // Local storage mode logger.info('Using local file storage') diff --git a/apps/sim/lib/uploads/providers/s3/client.test.ts b/apps/sim/lib/uploads/providers/s3/client.test.ts index 4e62109a5d8..48c017d2cb6 100644 --- a/apps/sim/lib/uploads/providers/s3/client.test.ts +++ b/apps/sim/lib/uploads/providers/s3/client.test.ts @@ -14,6 +14,7 @@ const { mockDeleteObjectCommand, mockGetSignedUrl, mockEnv, + mockS3Config, } = vi.hoisted(() => { const mockSend = vi.fn() const mockS3Client = { send: mockSend } @@ -24,9 +25,21 @@ const { AWS_ACCESS_KEY_ID: 'test-access-key', AWS_SECRET_ACCESS_KEY: 'test-secret-key', } + const mockS3Config: { + bucket: string + region: string 
+ endpoint: string | undefined + forcePathStyle: boolean + } = { + bucket: 'test-bucket', + region: 'test-region', + endpoint: undefined, + forcePathStyle: false, + } return { mockSend, mockS3Client, + mockS3Config, mockS3ClientConstructor: vi.fn().mockImplementation( class { constructor() { @@ -71,10 +84,7 @@ vi.mock('@/lib/uploads/setup', () => ({ })) vi.mock('@/lib/uploads/config', () => ({ - S3_CONFIG: { - bucket: 'test-bucket', - region: 'test-region', - }, + S3_CONFIG: mockS3Config, S3_KB_CONFIG: { bucket: 'test-kb-bucket', region: 'test-region', @@ -97,6 +107,8 @@ describe('S3 Client', () => { vi.spyOn(Date.prototype, 'toISOString').mockReturnValue('2025-06-16T01:13:10.765Z') mockEnv.AWS_ACCESS_KEY_ID = 'test-access-key' mockEnv.AWS_SECRET_ACCESS_KEY = 'test-secret-key' + mockS3Config.endpoint = undefined + mockS3Config.forcePathStyle = false resetS3ClientForTesting() }) @@ -342,6 +354,8 @@ describe('S3 Client', () => { expect(client).toBeDefined() expect(mockS3ClientConstructor).toHaveBeenCalledWith({ region: 'test-region', + endpoint: undefined, + forcePathStyle: false, credentials: { accessKeyId: 'test-access-key', secretAccessKey: 'test-secret-key', @@ -359,8 +373,29 @@ describe('S3 Client', () => { expect(client).toBeDefined() expect(mockS3ClientConstructor).toHaveBeenCalledWith({ region: 'test-region', + endpoint: undefined, + forcePathStyle: false, credentials: undefined, }) }) + + it('should pass a custom endpoint and path-style flag for S3-compatible providers', () => { + mockS3Config.endpoint = 'https://account.r2.cloudflarestorage.com' + mockS3Config.forcePathStyle = true + resetS3ClientForTesting() + + const client = getS3Client() + + expect(client).toBeDefined() + expect(mockS3ClientConstructor).toHaveBeenCalledWith({ + region: 'test-region', + endpoint: 'https://account.r2.cloudflarestorage.com', + forcePathStyle: true, + credentials: { + accessKeyId: 'test-access-key', + secretAccessKey: 'test-secret-key', + }, + }) + }) }) }) diff --git 
a/apps/sim/lib/uploads/providers/s3/client.ts b/apps/sim/lib/uploads/providers/s3/client.ts index fe939cb506f..ed66bfdddc3 100644 --- a/apps/sim/lib/uploads/providers/s3/client.ts +++ b/apps/sim/lib/uploads/providers/s3/client.ts @@ -54,6 +54,8 @@ export function getS3Client(): S3Client { _s3Client = new S3Client({ region, + endpoint: S3_CONFIG.endpoint, + forcePathStyle: S3_CONFIG.forcePathStyle, credentials: env.AWS_ACCESS_KEY_ID && env.AWS_SECRET_ACCESS_KEY ? { @@ -386,6 +388,19 @@ export async function getS3MultipartPartUrls( return presignedUrls } +/** + * Build a fallback object URL for when the SDK omits `Location` on multipart + * completion. Honors a custom `S3_CONFIG.endpoint` (R2/MinIO) with path-style + * addressing; otherwise falls back to the AWS virtual-hosted-style host. + */ +function buildObjectFallbackUrl(bucket: string, region: string, key: string): string { + if (S3_CONFIG.endpoint) { + const base = S3_CONFIG.endpoint.replace(/\/+$/, '') + return `${base}/${bucket}/${key}` + } + return `https://${bucket}.s3.${region}.amazonaws.com/${key}` +} + /** * Complete multipart upload for S3 */ @@ -408,8 +423,7 @@ export async function completeS3MultipartUpload( }) const response = await s3Client.send(command) - const location = - response.Location || `https://${config.bucket}.s3.${config.region}.amazonaws.com/${key}` + const location = response.Location || buildObjectFallbackUrl(config.bucket, config.region, key) const path = `/api/files/serve/${encodeURIComponent(key)}` return { diff --git a/helm/sim/examples/values-aws.yaml b/helm/sim/examples/values-aws.yaml index c8795a8e976..a0837b2672b 100644 --- a/helm/sim/examples/values-aws.yaml +++ b/helm/sim/examples/values-aws.yaml @@ -103,6 +103,9 @@ app: S3_PROFILE_PICTURES_BUCKET_NAME: "profile-pictures" # User avatars S3_OG_IMAGES_BUCKET_NAME: "og-images" # OpenGraph preview images S3_WORKSPACE_LOGOS_BUCKET_NAME: "workspace-logos" # Workspace logos + # For S3-compatible storage (Cloudflare R2, MinIO, 
Backblaze B2) instead of AWS S3: + # S3_ENDPOINT: "https://.r2.cloudflarestorage.com" # custom endpoint; set AWS_REGION: "auto" for R2 + # S3_FORCE_PATH_STYLE: "true" # required for MinIO/Ceph; omit for AWS S3 and R2 # Realtime service realtime: diff --git a/helm/sim/values.yaml b/helm/sim/values.yaml index 6b48a957bd3..5fad7baa674 100644 --- a/helm/sim/values.yaml +++ b/helm/sim/values.yaml @@ -260,6 +260,8 @@ app: S3_PROFILE_PICTURES_BUCKET_NAME: "" # S3 bucket for user profile pictures S3_OG_IMAGES_BUCKET_NAME: "" # S3 bucket for OpenGraph preview images S3_WORKSPACE_LOGOS_BUCKET_NAME: "" # S3 bucket for workspace logos + S3_ENDPOINT: "" # Custom endpoint for S3-compatible storage (Cloudflare R2, MinIO, Backblaze B2). Leave empty for AWS S3 + S3_FORCE_PATH_STYLE: "" # Set to "true" for path-style addressing (MinIO/Ceph RGW). Leave empty for AWS S3 and R2 # Azure Blob Storage Configuration (optional - for file storage) # If configured, files will be stored in Azure Blob instead of local storage From 74afb56280e4386b91b70c74998f43f84251072a Mon Sep 17 00:00:00 2001 From: Waleed Latif Date: Wed, 3 Jun 2026 10:28:08 -0700 Subject: [PATCH 2/4] docs(storage): add RustFS as an S3-compatible provider example --- .../docs/en/self-hosting/object-storage.mdx | 19 ++++++++++++++++++- 1 file changed, 18 insertions(+), 1 deletion(-) diff --git a/apps/docs/content/docs/en/self-hosting/object-storage.mdx b/apps/docs/content/docs/en/self-hosting/object-storage.mdx index 0f0ed9e4100..e9255afa9a5 100644 --- a/apps/docs/content/docs/en/self-hosting/object-storage.mdx +++ b/apps/docs/content/docs/en/self-hosting/object-storage.mdx @@ -216,7 +216,7 @@ Sim works with any S3-compatible store by pointing the S3 client at a custom end - Configure a **CORS policy** on the bucket that allows your Sim origin (`PUT`, `GET`, and the `Authorization` / `Content-Type` / `x-amz-*` headers). This applies to AWS S3 too — R2 and MinIO are no different. 
- + R2 uses virtual-hosted style (the default) and the region `auto`: @@ -249,6 +249,23 @@ S3_BUCKET_NAME=myorg-sim-workspace-files `http://` works server-side, but since the browser uploads directly to this endpoint, prefer a TLS endpoint your users can reach (a mixed-content `http://` target will be blocked on an `https://` Sim origin). + + + +RustFS is a Rust-based, S3-compatible store (a MinIO drop-in). Configure it exactly like MinIO — path-style, any region string, SigV4 access key/secret: + +```bash +AWS_REGION=us-east-1 +S3_ENDPOINT=https://rustfs.example.com # must be reachable from users' browsers +S3_FORCE_PATH_STYLE=true +AWS_ACCESS_KEY_ID= +AWS_SECRET_ACCESS_KEY= +S3_BUCKET_NAME=myorg-sim-workspace-files +# ...remaining S3_*_BUCKET_NAME vars, one bucket each +``` + +The same browser-reachability and CORS requirements apply. + From f3c7b144faa53e001c14a8bf8bf86e6df0896e0c Mon Sep 17 00:00:00 2001 From: Waleed Latif Date: Wed, 3 Jun 2026 10:35:24 -0700 Subject: [PATCH 3/4] fix(storage): address review feedback and fix env mock for CI MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit - Add envBoolean to the shared env test mock (createEnvMock) so config.ts's forcePathStyle coercion resolves — fixes failing knowledge/utils.test.ts - Declare S3_FORCE_PATH_STYLE as z.string() (every other env var's pattern); it's coerced via envBoolean at the consumption site, avoiding a boolean type that never matches the string process.env value - Log path-style from S3_CONFIG.forcePathStyle (envBoolean) instead of a separate isTruthy call, so the startup log can't disagree with the client - Make buildObjectFallbackUrl honor forcePathStyle: virtual-hosted-style URL (bucket as subdomain) for R2, path-style only when forcePathStyle is set --- apps/sim/lib/core/config/env.ts | 2 +- apps/sim/lib/uploads/core/setup.server.ts | 11 ++++++++--- apps/sim/lib/uploads/providers/s3/client.ts | 13 ++++++++++--- 
packages/testing/src/mocks/env.mock.ts | 8 ++++++++ 4 files changed, 27 insertions(+), 7 deletions(-) diff --git a/apps/sim/lib/core/config/env.ts b/apps/sim/lib/core/config/env.ts index 9714415226d..e47ca481d02 100644 --- a/apps/sim/lib/core/config/env.ts +++ b/apps/sim/lib/core/config/env.ts @@ -219,7 +219,7 @@ export const env = createEnv({ S3_OG_IMAGES_BUCKET_NAME: z.string().optional(), // S3 bucket for OpenGraph images S3_WORKSPACE_LOGOS_BUCKET_NAME: z.string().optional(), // S3 bucket for workspace logos S3_ENDPOINT: z.string().optional(), // Custom endpoint for S3-compatible storage (Cloudflare R2, MinIO, Backblaze B2). Leave unset for AWS S3 - S3_FORCE_PATH_STYLE: z.boolean().optional(), // Force path-style addressing (MinIO/Ceph RGW). Defaults to false (AWS S3, R2) + S3_FORCE_PATH_STYLE: z.string().optional(), // Force path-style addressing (MinIO/Ceph RGW). Defaults to false (AWS S3, R2). Coerced via envBoolean at the consumption site // Cloud Storage - Azure Blob AZURE_ACCOUNT_NAME: z.string().optional(), // Azure storage account name diff --git a/apps/sim/lib/uploads/core/setup.server.ts b/apps/sim/lib/uploads/core/setup.server.ts index f2dcd51daff..70a03f43976 100644 --- a/apps/sim/lib/uploads/core/setup.server.ts +++ b/apps/sim/lib/uploads/core/setup.server.ts @@ -2,8 +2,13 @@ import { existsSync } from 'fs' import { mkdir } from 'fs/promises' import path, { join } from 'path' import { createLogger } from '@sim/logger' -import { env, isTruthy } from '@/lib/core/config/env' -import { getStorageProvider, USE_BLOB_STORAGE, USE_S3_STORAGE } from '@/lib/uploads/config' +import { env } from '@/lib/core/config/env' +import { + getStorageProvider, + S3_CONFIG, + USE_BLOB_STORAGE, + USE_S3_STORAGE, +} from '@/lib/uploads/config' const logger = createLogger('UploadsSetup') @@ -82,7 +87,7 @@ if (typeof process !== 'undefined') { if (env.S3_ENDPOINT) { logger.info( - `Using S3-compatible endpoint: ${env.S3_ENDPOINT} (path-style: 
${isTruthy(env.S3_FORCE_PATH_STYLE)})` + `Using S3-compatible endpoint: ${env.S3_ENDPOINT} (path-style: ${S3_CONFIG.forcePathStyle})` ) } } else { diff --git a/apps/sim/lib/uploads/providers/s3/client.ts b/apps/sim/lib/uploads/providers/s3/client.ts index ed66bfdddc3..42b0c62fac8 100644 --- a/apps/sim/lib/uploads/providers/s3/client.ts +++ b/apps/sim/lib/uploads/providers/s3/client.ts @@ -390,13 +390,20 @@ export async function getS3MultipartPartUrls( /** * Build a fallback object URL for when the SDK omits `Location` on multipart - * completion. Honors a custom `S3_CONFIG.endpoint` (R2/MinIO) with path-style - * addressing; otherwise falls back to the AWS virtual-hosted-style host. + * completion. For a custom `S3_CONFIG.endpoint` it matches the configured + * addressing mode — path-style for MinIO/Ceph (`forcePathStyle`), virtual-hosted + * (bucket as a subdomain) for R2 and friends. Falls back to the AWS + * virtual-hosted host when no custom endpoint is set. */ function buildObjectFallbackUrl(bucket: string, region: string, key: string): string { if (S3_CONFIG.endpoint) { const base = S3_CONFIG.endpoint.replace(/\/+$/, '') - return `${base}/${bucket}/${key}` + if (S3_CONFIG.forcePathStyle) { + return `${base}/${bucket}/${key}` + } + const url = new URL(base) + url.hostname = `${bucket}.${url.hostname}` + return `${url.origin}/${key}` } return `https://${bucket}.s3.${region}.amazonaws.com/${key}` } diff --git a/packages/testing/src/mocks/env.mock.ts b/packages/testing/src/mocks/env.mock.ts index 61f733c1ec2..e6bbd09f7b4 100644 --- a/packages/testing/src/mocks/env.mock.ts +++ b/packages/testing/src/mocks/env.mock.ts @@ -53,6 +53,14 @@ export function createEnvMock(overrides: Record = {} typeof value === 'string' ? 
value.toLowerCase() === 'false' || value === '0' : value === false, + envBoolean: (value: boolean | string | undefined | null): boolean | undefined => { + if (typeof value === 'boolean') return value + if (value === undefined || value === null || value === '') return undefined + const normalized = String(value).trim().toLowerCase() + return ( + normalized === 'true' || normalized === '1' || normalized === 'yes' || normalized === 'on' + ) + }, envNumber: ( value: number | string | undefined | null, fallback: number, From 12c14e298160fd6c6bc6ca2b61d346dfec3ddcf9 Mon Sep 17 00:00:00 2001 From: Waleed Latif Date: Wed, 3 Jun 2026 11:26:29 -0700 Subject: [PATCH 4/4] docs(storage): add backlinks to S3-compatible providers (R2, MinIO, Ceph, B2, RustFS) and backends --- .../content/docs/en/self-hosting/object-storage.mdx | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/apps/docs/content/docs/en/self-hosting/object-storage.mdx b/apps/docs/content/docs/en/self-hosting/object-storage.mdx index e9255afa9a5..edeee7b87d5 100644 --- a/apps/docs/content/docs/en/self-hosting/object-storage.mdx +++ b/apps/docs/content/docs/en/self-hosting/object-storage.mdx @@ -13,8 +13,8 @@ Sim stores every uploaded file — knowledge base documents, chat attachments, e | Backend | When to use | |---------|-------------| | **Local disk** | Single-node Docker, local development, evaluation | -| **AWS S3** | Production, especially when running more than one app replica | -| **Azure Blob** | Production on Azure | +| **[AWS S3](https://aws.amazon.com/s3/)** | Production, especially when running more than one app replica | +| **[Azure Blob](https://learn.microsoft.com/azure/storage/blobs/)** | Production on Azure | Local disk writes to the container's `/uploads` directory. Files are lost when the container is recreated unless that path is on a persistent volume, and they are **not** shared across replicas. For any multi-replica or production deployment, use S3 or Azure Blob. 
@@ -203,7 +203,7 @@ A full Helm example lives at `helm/sim/examples/values-azure.yaml`. ## Set up an S3-compatible provider (R2, MinIO, B2) -Sim works with any S3-compatible store by pointing the S3 client at a custom endpoint. Configure it exactly like AWS S3 (buckets, access key, secret), then add `S3_ENDPOINT` — and `S3_FORCE_PATH_STYLE` where the provider requires path-style addressing. +Sim works with any S3-compatible store by pointing the S3 client at a custom endpoint. Configure it exactly like AWS S3 (buckets, access key, secret), then add `S3_ENDPOINT` — and `S3_FORCE_PATH_STYLE` where the provider requires path-style addressing. Verified with [Cloudflare R2](https://developers.cloudflare.com/r2/), [MinIO](https://min.io/), [Backblaze B2](https://www.backblaze.com/cloud-storage), and [RustFS](https://rustfs.com/). `S3_ENDPOINT` is trusted operator configuration, so it is used as-is — `http://` and private hosts are accepted (no SSRF/HTTPS gate). Don't wire it to untrusted input. @@ -219,7 +219,7 @@ Sim works with any S3-compatible store by pointing the S3 client at a custom end -R2 uses virtual-hosted style (the default) and the region `auto`: +[Cloudflare R2](https://developers.cloudflare.com/r2/api/s3/) uses virtual-hosted style (the default) and the region `auto`: ```bash AWS_REGION=auto @@ -235,7 +235,7 @@ Leave `S3_FORCE_PATH_STYLE` unset — R2 supports the default virtual-hosted add -MinIO (and Ceph RGW) need path-style addressing and accept any region string: +[MinIO](https://min.io/docs/minio/linux/index.html) (and [Ceph RGW](https://docs.ceph.com/en/latest/radosgw/)) need path-style addressing and accept any region string: ```bash AWS_REGION=us-east-1 @@ -252,7 +252,7 @@ S3_BUCKET_NAME=myorg-sim-workspace-files -RustFS is a Rust-based, S3-compatible store (a MinIO drop-in). Configure it exactly like MinIO — path-style, any region string, SigV4 access key/secret: +[RustFS](https://rustfs.com/) is a Rust-based, S3-compatible store (a MinIO drop-in). 
Configure it exactly like MinIO — path-style, any region string, SigV4 access key/secret: ```bash AWS_REGION=us-east-1