github · dsyme · Jun 3, 2026
diff --git a/.github/workflows/daily-ambient-context-optimizer.lock.yml b/.github/workflows/daily-ambient-context-optimizer.lock.yml
diff --git a/.github/workflows/daily-ambient-context-optimizer.md b/.github/workflows/daily-ambient-context-optimizer.md
@@ -0,0 +1,297 @@
+---
+emoji: "🌫️"
+name: Daily Ambient Context Optimizer
+description: Samples recent agentic workflow runs, inspects the first DLLM request text, and recommends prompt, skill, and agent changes to shrink ambient context
+on:
+  schedule: daily
+  workflow_dispatch:
+permissions:
+  contents: read
+  actions: read
+  issues: read
+  pull-requests: read
+tracker-id: daily-ambient-context-optimizer
+strict: true
+max-daily-effective-tokens: 100M
+network:
+  allowed: [defaults, github]
+tools:
+  agentic-workflows:
+  bash: true
+safe-outputs:
+  mentions: false
+  allowed-github-references: []
+  create-issue:
+    title-prefix: "[ambient-context] "
+    labels: [automation, report, workflow-optimization, analysis]
+    close-older-issues: true
+    expires: 7d
+    max: 1
+timeout-minutes: 45
+steps:
+  - name: Setup Python runtime
+    uses: actions/setup-python@v6.2.0
+    with:
+      python-version: "3.12"
+  - name: Prepare analysis workspace
+    run: |
+      mkdir -p /tmp/gh-aw/ambient-context
+imports:
+  - shared/otlp.md
+---
+
+# Daily Ambient Context Optimizer
+
+You are a cost-optimization analyst for `${{ github.repository }}`.
+
+Your job is to inspect the **first request sent to the DLLM** for several recent workflow runs, identify avoidable ambient context, and publish exactly one issue with concrete workflow improvements.
+
+## Goals
+
+1. Sample several agentic workflow runs from the last 24 hours.
+2. Inspect the first DLLM request text for each sampled run.
+3. Use deterministic Python analysis to measure prompt bloat and repetition.
+4. Recommend the highest-leverage improvements to workflow `.md` files, skill usage, and the set of agents/sub-agents.
+5. Create exactly one detailed issue report.
+
+## Data Collection
+
+### Step 1 — Download recent runs
+
+Use the `logs` tool from the `agentic-workflows` MCP server with:
+
+- `start_date: "-1d"`
+- `count: 120`
+- `parse: true`
+
+The tool downloads run data under `/tmp/gh-aw/aw-mcp/logs/`.
+
+### Step 2 — Pick the sample set
+
+Sample **6 runs** when available. If fewer than 6 eligible runs exist, sample all of them. If fewer than 3 exist, fall back to a reduced-data report.
+
+Eligibility rules:
+
+- `status == "completed"`
+- exclude this workflow itself
+- prefer successful runs, but include up to 2 failed runs when they have usable prompt artifacts
+- prefer breadth: no more than 2 runs from the same workflow when alternatives exist
+- require a usable first-request source:
+  - preferred: `prompt.txt`
+  - fallback: the first `user.message` event in `events.jsonl`
+
+Rank candidates by cost, highest first, using `effective_tokens`, `token_usage`, `turns`, or prompt size, whichever is available.
+
+### Step 3 — Enrich a subset with audits
+
+Run the `audit` tool from the `agentic-workflows` MCP server for the **3 most expensive sampled runs** so you have richer token context and references.
+
+## First-Request Extraction Rules
+
+Treat the first DLLM request text as:
+
+1. `prompt.txt` when present, because it is the generated prompt sent to the agent
+2. otherwise, extract the first user-message payload from the run's `events.jsonl`
+
+For each sampled run, save the extracted text to:
+
+- `/tmp/gh-aw/ambient-context/samples/run-<id>.txt`
+
+Also save one metadata JSON file per run at:
+
+- `/tmp/gh-aw/ambient-context/samples/run-<id>.json`
+
+Include at least:
+
+- `run_id`
+- `workflow_name`
+- `workflow_path`
+- `run_url`
+- `status`
+- `conclusion`
+- `effective_tokens`
+- `token_usage`
+- `turns`
+- `request_chars`
+- `request_lines`
+- `request_source`
+
+## Deterministic Analysis
+
+Write and run a Python script at `/tmp/gh-aw/ambient-context/analyze_requests.py`.
+
+Use only the Python standard library. Do **not** install third-party packages.
+
+The script must read every sampled `run-*.txt` and `run-*.json` file and produce:
+
+- `/tmp/gh-aw/ambient-context/request-analysis.json`
+- `/tmp/gh-aw/ambient-context/request-analysis.md`
+
+The script must compute deterministic metrics for each sampled first request:
+
+- bytes, characters, lines, words
+- markdown heading count
+- list item count
+- code fence count
+- HTML `<details>` count
+- table row count
+- inline agent count (`## agent:`)
+- inline skill count (`## skill:`)
+- imported skill reference count (`SKILL.md`)
+- duplicate line ratio
+- duplicate paragraph ratio
+- longest 5 sections by heading
+- top repeated non-trivial lines or paragraphs
+- count of lines mentioning tools, skills, agents, safe outputs, and workflow instructions
+
+Aggregate metrics across the sample set:
+
+- sampled run count
+- distinct workflow count
+- median request chars
+- p95 request chars
+- top workflows by first-request size
+- most common repeated fragments
+- most common large-section headings
+
+## Source Review
+
+For every sampled run, read the current workflow source file from the repository when `workflow_path` resolves to a local `.github/workflows/*.md` file.
+
+Assess whether the request size is likely driven by:
+
+- verbose workflow markdown
+- overly broad or duplicated skill instructions
+- too many inline agents or agent definitions that are not justified
+- duplicated guardrails, examples, or formatting rules
+- context that should be moved to deterministic `steps:` or smaller sub-agents
+
+## Sub-Agent Usage
+
+After the deterministic Python script finishes, and only when at least 3 sampled runs exist, invoke `request-optimizer` **once per sampled run**, passing that run's compact JSON summary rather than the raw full prompt.
+
+Each sub-agent invocation may return at most 3 opportunities for its run. Aggregate and deduplicate those per-run opportunities, then do the final prioritization yourself.
+
+## Recommendation Rules
+
+Produce **3 to 7** recommendations total.
+
+Each recommendation must include:
+
+- category: `workflow-md`, `skills`, or `agents`
+- affected workflow(s)
+- evidence from deterministic metrics
+- why it should shrink the first request
+- expected impact: `high`, `medium`, or `low`
+- whether the change is likely safe immediately or needs manual review
+
+Prioritize recommendations that:
+
+1. remove repeated context shared across many runs
+2. reduce broad skill loading or oversized skill fusion
+3. simplify or remove low-value inline agents
+4. move deterministic data gathering out of the main prompt
+
+Do not recommend changes that would obviously weaken safety or remove necessary task context.
+
+## Report Requirements
+
+Create exactly one issue titled:
+
+`[ambient-context] Daily Ambient Context Optimizer - YYYY-MM-DD`
+
+Use only `###` or deeper heading levels; do not use `#` or `##`.
+
+Keep the issue structured like this:
+
+### Executive Summary
+- runs sampled
+- workflows covered
+- median and p95 first-request size
+- highest-level conclusion
+
+### Highest-Leverage Changes
+- a concise numbered list of the top recommendations
+
+### Key Metrics
+| Metric | Value |
+|---|---|
+| Sampled runs | ... |
+| Distinct workflows | ... |
+| Median chars | ... |
+| P95 chars | ... |
+| Largest sampled request | ... |
+
+<details>
+<summary>Per-Run First-Request Metrics</summary>
+
+Include a markdown table with one row per sampled run.
+
+</details>
+
+<details>
+<summary>Repeated Ambient Context Signals</summary>
+
+Summarize repeated sections, duplicated fragments, and bloated headings.
+
+</details>
+
+<details>
+<summary>Deterministic Analysis Output</summary>
+
+Summarize the Python script outputs and cite the most relevant metrics.
+
+</details>
+
+### Recommendations by Category
+#### Workflow Markdown
+#### Skills
+#### Agents
+
+### References
+- Include up to 3 sampled run links in `[§12345](https://github.com/owner/repo/actions/runs/12345)` format
+
+## Reduced-Data Behavior
+
+If fewer than 3 eligible runs exist, still create the issue.
+
+In that case:
+
+- explain the reduced sample size clearly
+- report whatever evidence is available
+- make repository-wide recommendations only when the sampled data supports them
+
+Do not use `noop` merely because the sample is small or imperfect. Create exactly one issue whenever logs are available. Use `noop` only if no run logs can be downloaded at all or the repository context is unavailable.
+
+## agent: `request-optimizer`
+---
+description: Ranks prompt-shrinking opportunities for one sampled run from compact deterministic metrics
+model: small
+---
+You are a compact optimization classifier.
+
+Input:
+- one JSON object for a sampled run
+- optional workflow source excerpt
+
+Return JSON only:
+
+```json
+{
+  "run_id": 123,
+  "workflow_name": "name",
+  "opportunities": [
+    {
+      "category": "workflow-md|skills|agents",
+      "finding": "short statement",
+      "evidence": ["metric or source detail"],
+      "impact": "high|medium|low"
+    }
+  ]
+}
+```
+
+Rules:
+- return at most 3 opportunities
+- use only provided evidence
+- prefer opportunities that reduce first-request size without reducing safety
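
Note on the "Deterministic Analysis" section of the workflow above: the prompt names its metrics but does not define them, so the agent-written script may choose its own definitions from run to run. The sketch below shows one possible reading using only the standard library. The function names (`duplicate_ratio`, `request_metrics`, `percentile`), the "repeats divided by non-empty items" definition of the duplicate ratios, and the nearest-rank percentile are all assumptions made for illustration; none of them come from the workflow file.

```python
import math
import re
from collections import Counter

FENCE = "`" * 3  # Markdown code fence marker, built here to keep this block fence-safe


def duplicate_ratio(items):
    """Share of non-empty items that repeat an earlier identical item (assumed definition)."""
    items = [item for item in items if item]
    if not items:
        return 0.0
    repeats = sum(count - 1 for count in Counter(items).values())
    return repeats / len(items)


def request_metrics(text):
    """Compute a few of the per-request metrics listed in the workflow prompt."""
    lines = [line.strip() for line in text.splitlines()]
    # Paragraphs are taken to be runs of text separated by blank lines.
    paragraphs = [part.strip() for part in re.split(r"\n\s*\n", text)]
    return {
        "bytes": len(text.encode("utf-8")),
        "chars": len(text),
        "lines": len(lines),
        "words": len(text.split()),
        # Simplification: '#' lines inside code fences also count as headings.
        "headings": sum(1 for line in lines if re.match(r"#{1,6}\s", line)),
        "code_fences": sum(1 for line in lines if line.startswith(FENCE)),
        "inline_agents": sum(1 for line in lines if line.startswith("## agent:")),
        "duplicate_line_ratio": duplicate_ratio(lines),
        "duplicate_paragraph_ratio": duplicate_ratio(paragraphs),
    }


def percentile(values, pct):
    """Nearest-rank percentile (assumed definition); returns None for an empty sample."""
    ordered = sorted(values)
    if not ordered:
        return None
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

With a sample of only 3 to 6 runs, a nearest-rank p95 is simply the largest request, so the "P95 chars" and "Largest sampled request" rows of the report table will usually show the same value under this definition.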
diff --git a/actions/setup/js/copilot_harness.test.cjs b/actions/setup/js/copilot_harness.test.cjs
@@ -425,12 +425,51 @@ describe("copilot_harness.cjs", () => {
         expect(onPermissionRequest({ kind: "mcp", serverName: "github", toolName: "get_file_contents" })).toEqual({ kind: "approve-once" });
         expect(onPermissionRequest({ kind: "url", url: "https://example.com" })).toEqual({ kind: "approve-once" });
         expect(onPermissionRequest({ kind: "write", fileName: "a.txt", diff: "", intention: "" })).toEqual({ kind: "approve-once" });
+        expect(onPermissionRequest({ kind: "read", fileName: "a.txt" })).toEqual({
+          kind: "reject",
+          feedback: "Tool invocation is not allowed by workflow tool permissions.",
+        });
         expect(onPermissionRequest({ kind: "shell", commands: [{ identifier: "rm" }], fullCommandText: "rm -rf /tmp/x" })).toEqual({
           kind: "reject",
           feedback: "Tool invocation is not allowed by workflow tool permissions.",
         });
       });
 
+      it("allows read requests when read is explicitly allowlisted", async () => {
+        const disconnect = vi.fn().mockResolvedValue(undefined);
+        const stop = vi.fn().mockResolvedValue(undefined);
+        const createSession = vi.fn().mockResolvedValue({
+          sessionId: "session-read-allowed",
+          on: () => {},
+          sendAndWait: vi.fn().mockResolvedValue({ data: { content: "ok" } }),
+          disconnect,
+        });
+        class FakeCopilotClient {
+          start = vi.fn().mockResolvedValue(undefined);
+          createSession = createSession;
+          stop = stop;
+        }
+
+        const result = await runWithCopilotSDK({
+          sdkUri: "http://127.0.0.1:3002",
+          prompt: "test prompt",
+          logger: () => {},
+          permissionConfig: {
+            allowedTools: ["read"],
+          },
+          sdkModule: {
+            CopilotClient: FakeCopilotClient,
+            RuntimeConnection: { forUri: vi.fn(() => ({})) },
+            approveAll: () => ({ kind: "approve-once" }),
+          },
+        });
+
+        expect(result.exitCode).toBe(0);
+        const sessionConfig = createSession.mock.calls[0][0];
+        const onPermissionRequest = sessionConfig.onPermissionRequest;
+        expect(onPermissionRequest({ kind: "read", fileName: "a.txt" })).toEqual({ kind: "approve-once" });
+      });
+
       it("logs permission-denied SDK requests as core warnings", async () => {
         const disconnect = vi.fn().mockResolvedValue(undefined);
         const stop = vi.fn().mockResolvedValue(undefined);

diff --git a/actions/setup/js/copilot_sdk_driver.cjs b/actions/setup/js/copilot_sdk_driver.cjs
@@ -77,6 +77,8 @@ function summarizePermissionRequest(request) {
       return `url(${request.url || "unknown"})`;
     case "write":
       return `write(${request.fileName || "unknown"})`;
+    case "read":
+      return "read";
     case "custom-tool":
       return `custom-tool(${request.toolName || "unknown"})`;
     default:
@@ -164,8 +166,7 @@ function buildCopilotSDKPermissionHandler(permissionConfig, approveAll, logOptio
       case "write":
         return allowedToolEntries.has("write");
       case "read":
-        // Read permissions are low-risk and are broadly expected by the agent flow.
-        return true;
+        return allowedToolEntries.has("read");
       case "url":
         return allowedToolEntries.has("web_fetch");
       case "mcp":