Observe -> Workflow IR -> Skill Pack -> Improve

Apprentice learns one real workflow and turns it into a reusable skill.

Screen and OCR capture is idle

YC Proof Path

Session to reusable skill

1. Observe: Bounded behavioral trace

seeded meeting-prep fixture

fixture seed until real packet
2. Understand: Workflow IR and execution choice

1 candidate; api native recommended

useful after edits
3. Skill pack: Reusable agent skill pack

meeting-prep-agent skill pack: 9 files, 5 eval cases

skill pack export ready
4. Reuse: Supervised run without recording

0 saved rerun records

real replay pending
5. Improve: Draft skill improvements

19 proposals, 4 next eval cases

operator review required
Built runtime: Real

Workflow IR, skill pack export, supervised reuse, and draft improvements run in app.

Evidence source: Fixture-backed

3 demo profiles, 0 verified partners.

External proof: Pending

Real partner-approved replay still needs a packet path and proof-gate run.

Current demo data is explicitly fixture-backed. The YC proof gate is a real partner-approved redacted packet replayed without fixture mode.

0 verified partners
3 demo queue profiles
0 saved skill packs
0 demo minutes tracked

Needs Attention

Workflow operations queue

7

Guided UI vs compiled workflow

Competitive demo

6
Guided Through UI vs Compiled Out Of UI

A UI guide helps the user complete one task; Apprentice turns the same session into a reusable, governed agent skill.

An AI-native operator has a recurring partner meeting. Today they manually open Calendar, Gmail, Slack, Drive, and notes to prepare a brief and draft follow-up.
Guided UI: 1
Compiled stages: 3
Approval gates: 1
Reuse loops: 1
1. Guided through UI
Start with the familiar screen-aware helper moment

The operator clicks Record Workflow and receives a Clicky-like prompt while opening the calendar invite and searching related Gmail and Slack context.

Explicit Record Workflow mode, Workflow interviewer overlay, Browser/native capture mode selector
Guardrails: Capture is explicit and bounded. Raw capture stays local. No always-on recorder.

The audience recognizes the familiar guided-through-UI pattern in the first 20 seconds.

2. Compiled out of UI
Extract the process instead of teaching more clicks

Apprentice shows the inferred Workflow IR from its internal Workflow Compiler engine: trigger, inputs, apps, data sources, steps, decision points, sensitive data, and scores.

Workflow IR review, Field-level IR diff, Structured UI step plan ingestion
Guardrails: Raw video and key material are not promoted. Redacted evidence only enters durable IR.

The audience can explain what work should no longer require manual UI traversal.

3. Compiled out of UI
Recommend the agent-native execution surface

The dashboard contrasts browser/computer-use fallback with API, MCP, and skill-first execution, then selects the meeting-prep agent/tool path.

Execution-surface hierarchy, CUA fallback marked deferred, Generated MCP tool and Codex skill artifacts
Guardrails: UI automation remains fallback only. CUA is not a capture mode. Computer-use execution requires separate approval.

The audience sees that Apprentice is compiler-backed, not a desktop copilot clone.
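
The execution-surface hierarchy described in this step can be sketched as a simple preference order. This is a minimal illustration, not Apprentice's actual selector: the identifiers in `SURFACE_ORDER`, the exact ranking, and the `recommend_surface` helper are assumptions made for the example.

```python
from typing import Iterable, Optional

# Assumed preference order, most preferred first. The surface names echo the
# dashboard; the identifiers and exact ranking are illustrative only.
SURFACE_ORDER = ("api_native", "mcp_tool", "skill", "browser_automation", "computer_use")


def recommend_surface(available: Iterable[str], cua_approved: bool = False) -> Optional[str]:
    """Return the most preferred available surface, or None if nothing is usable."""
    usable = set(available)
    for surface in SURFACE_ORDER:
        if surface not in usable:
            continue
        # Computer-use execution needs its own approval before it can be chosen.
        if surface == "computer_use" and not cua_approved:
            continue
        return surface
    return None
```

With both `api_native` and `browser_automation` available, the sketch picks `api_native`, which matches the "api native recommended" status shown for the meeting-prep candidate.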

4. Compiled out of UI
Generate reusable meeting-prep and follow-up artifacts

The same captured workflow becomes a meeting-prep artifact bundle, follow-up draft policy, MCP tool stub, eval checklist, and registry promotion candidate.

Generated Artifacts table, Compiled Workflow IR draft, Promote to Registry Review action
Guardrails: Raw upload blocked. Registry promotion stores metadata and generated artifacts only.

A design partner asks to reuse the workflow for the next recurring meeting.

5. Approval gate
Require approval before anything external happens

The approval panel shows read-only, draft-only, and approval-required modes. Follow-up messages are drafts; external sends stay blocked.

Approval Policy panel, Eval checklist, Needs Attention queue
Guardrails: No autonomous send. External communication requires a separate approval event. Recurring runs remain supervised.

The audience trusts the workflow enough to approve a draft-only recurring prep run.

6. Reuse loop
End with tomorrow's brief, not today's instructions

The final screen shows the workflow queued for registry review so the next matching meeting can produce a brief without repeating the UI path.

Workflow Registry, Registry Review Queued, Design-partner success criteria
Guardrails: Raw local evidence can be discarded after extraction. Reusable template keeps approval and eval metadata attached.

The design partner says they would stop doing this workflow manually.

Partner Replay Intake

Review redacted replay packet

Storage posture

Local parse and sanitized metadata only

No raw packet JSON is stored in localStorage or sent to backend payloads.

Import only a partner-approved redacted packet. Fixture packets fail by default in this UI.
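
A minimal sketch of the intake rule above, assuming a JSON packet: parse locally, reject fixture packets by default, and keep only sanitized metadata. The field names (`source`, `partner_approved`, `redacted`, and the entries in `SAFE_FIELDS`) are hypothetical; the real packet schema is not shown on this page.

```python
import json

# Hypothetical allowlist of metadata fields that may leave the local parse step.
SAFE_FIELDS = ("packet_id", "workflow_name", "app_count", "event_count")


def intake_packet(raw_json: str, allow_fixture: bool = False) -> dict:
    """Parse a replay packet locally and return sanitized metadata only."""
    packet = json.loads(raw_json)
    # Fixture packets fail by default, mirroring the UI rule.
    if packet.get("source") == "fixture" and not allow_fixture:
        raise ValueError("fixture packets are rejected by default")
    if not packet.get("partner_approved") or not packet.get("redacted"):
        raise ValueError("packet must be partner-approved and redacted")
    # Only allowlisted keys are returned; the raw JSON is never stored or forwarded.
    return {key: packet[key] for key in SAFE_FIELDS if key in packet}
```

The allowlist approach means any field not named in `SAFE_FIELDS` is dropped, so new raw fields in a packet cannot leak into storage or backend payloads by accident.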

Usefulness verification

Observer Review

useful after edits

Local deterministic review loaded. Packet source: seeded meeting-prep fixture.

Backend persistence: No backend observer outcome has been saved yet. Saved records this session: 0.

Usefulness

71/100

local-workflow-observer-v0
Confidence

91/100

Capture evidence covers enough apps/events to review the workflow boundary.
Risk

29/100

Observer packet excludes raw screen video, raw typed text, private message bodies, and full browsing history.
Registry readiness

needs feedback

operator accepted, edited, reused, or observed time saved
Suggested Workflow IR edits

decision points: Add explicit decision questions for attendee matching, evidence sufficiency, and follow-up readiness.

Skill and MCP improvements
  • Add an explicit source-of-truth order for Calendar, Gmail, Slack, Drive, and manual notes.
  • Add a stop condition when attendee/company matching is ambiguous.
  • Expose a dry-run mode that returns planned connector queries and missing context.
  • Return evidence_refs and unsupported_claims as first-class output fields.
Policy and eval improvements
  • Require approval for expanded connector scopes, new recipients, recurring schedule changes, and any external send.
  • Keep raw capture discard visible before registry promotion.
  • Generated artifact is accepted or edited by the operator before registry promotion.
  • At least one reuse run saves measurable time without unsupported claims.
Follow-up questions
  • Which source should win if Gmail and Slack disagree?
  • What exact brief sections do you reuse before every meeting?
  • What would make the follow-up draft unsafe to send?
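
The dry-run mode and first-class output fields suggested above could look roughly like this. It is a sketch under assumptions: `DryRunResult` and `dry_run` are hypothetical names, and the query strings simply reuse the `<source>.read` scope style from the permission manifest.

```python
from dataclasses import dataclass, field
from typing import Iterable, List


@dataclass
class DryRunResult:
    """Hypothetical dry-run output; nothing is fetched or sent."""
    planned_queries: List[str] = field(default_factory=list)
    missing_context: List[str] = field(default_factory=list)
    evidence_refs: List[str] = field(default_factory=list)       # empty until a real run
    unsupported_claims: List[str] = field(default_factory=list)  # empty until a real run


def dry_run(required: Iterable[str], connected: Iterable[str]) -> DryRunResult:
    """Plan connector reads for connected sources and report the rest as missing."""
    connected_set = set(connected)
    result = DryRunResult()
    for source in required:
        if source in connected_set:
            result.planned_queries.append(f"{source}.read")
        else:
            result.missing_context.append(source)
    return result
```

For example, `dry_run(["calendar", "gmail", "slack"], {"calendar", "slack"})` plans two reads and reports `gmail` as missing context, which gives the operator a chance to stop before any retrieval happens.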

Skill Pack Compiler

Skill Studio

passes basic eval
Pack: meeting-prep-agent skill pack

v0.1.0

Files: 9

SKILL.md entrypoint

Eval lift: +33

85 skill vs 52 baseline

Export posture: ready

No raw capture or private content included

Generated files: skill_pack_partner-meeting-prep-and-follow-up
manifest.json

Skill pack manifest for export, reuse, and registry review.

manifest
SKILL.md

Codex-compatible skill entrypoint generated from Workflow IR.

entrypoint
README.md

Human-readable summary of the generated skill pack.

export
references/workflow-ir.json

Durable Workflow IR object used to generate the skill.

reference
references/mcp-tool-stub.json

MCP-style tool stub with schemas, permissions, and dry-run behavior.

reference
references/connector-retrieval-plan.md

API/tool-first retrieval plan for approved context sources.

reference
policies/approval-policy.md

Approval and blocked-operation rules for the workflow.

policy
evals/eval-criteria.md

Generated and observer-suggested eval criteria.

eval
evals/evals.json

Basic eval harness cases for the exported skill pack.

eval
Basic eval harness: 100% pass rate
Complete meeting-prep context

82 vs 58 baseline, delta 24

4/4
Follow-up draft guardrail

88 vs 52 baseline, delta 36

4/4
Sparse context recovery

84 vs 47 baseline, delta 37

4/4
Outcome acceptance signal

86 vs 50 baseline, delta 36

4/4
Regression against previous cases

87 vs 55 baseline, delta 32

4/4
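
The per-case numbers above are consistent with the headline "85 skill vs 52 baseline" and the +33 eval lift if those figures are unweighted means across the five cases. The dashboard does not state the aggregation method, so treat the averaging below as an assumption that happens to reproduce the displayed values.

```python
# (skill score, baseline score) per eval case, copied from the harness results.
cases = {
    "Complete meeting-prep context": (82, 58),
    "Follow-up draft guardrail": (88, 52),
    "Sparse context recovery": (84, 47),
    "Outcome acceptance signal": (86, 50),
    "Regression against previous cases": (87, 55),
}

deltas = {name: skill - base for name, (skill, base) in cases.items()}

mean_skill = sum(skill for skill, _ in cases.values()) / len(cases)    # 427 / 5 = 85.4
mean_baseline = sum(base for _, base in cases.values()) / len(cases)   # 262 / 5 = 52.4
lift = sum(deltas.values()) / len(deltas)                              # 165 / 5 = 33.0

print(round(mean_skill), round(mean_baseline), lift)  # prints: 85 52 33.0
```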
SKILL.md preview: 3847 chars
---
name: meeting-prep-agent
description: Prepare a read-only meeting brief and draft follow-up from approved context.
---

# Meeting Prep Agent

## Purpose
Turn the Workflow IR "Partner meeting prep and follow-up" into a repeatable meeting-prep and follow-up drafting process for AI-native operators.

## When To Use
Use when Calendar event with external attendee and company domain 30 minutes before start.

## Required Inputs
- calendar invite
- recent Gmail threads
- Slack mentions
- Google Drive docs
- previous meeting notes
- browser context from allowlisted pages

## Connector And Tool Preferences
- Prefer Calendar, Gmail, Slack, and Drive APIs/connectors before browser or computer-use automation.
- Use browser context only for allowlisted pages.
- Use manual notes when the operator supplies context that connectors do not cover.

## Workflow Steps
- 1. Inspect upcoming calendar event and attendee domains.
- 2. Collect recent email and Slack context for attendee and company.
- 3. Find relevant Drive docs and previous meeting notes.
- 4. Generate brief, agenda, open questions, and commitments.
- 5. Draft follow-up after meeting, requiring approval before external send.
- 6. observe: Open the upcoming partner meeting invite and identify attendees.
- 7. keyboard shortcut: Search email for the attendee domain.
- 8. click: Open the most recent relevant email thread.
- 9. wait for state: Wait for Drive meeting notes to load.
- 10. type: Draft the safe meeting brief notes.
- 11. Compile reviewed capture evidence into a Workflow IR draft before generating durable artifacts.

## Approval Rules
- Default mode: draft_only.
- External sends are blocked unless the user explicitly approves a separate send action.
- Recurring runs, expanded scopes, new recipients, and state-changing writes require approval.
- Do not request raw keystrokes or always-on screen recording.

## Output Format
- One-page meeting brief
- Attendee and company summary
- Last contact and recent commitments
- Open questions
- Suggested agenda and asks
- Evidence references
- Follow-up draft only when the user provides notes or asks for a draft

## Eval Checklist
- Trigger matches the intended calendar event and attendee/company context.
- Required Calendar, Gmail, Slack, Drive, browser, and manual-note inputs are covered when available.
- Output includes attendees, company context, last contact, open commitments, relevant docs, agenda, and suggested asks.
- Evidence references support commitments, suggested asks, and follow-up claims.
- Unsupported claims are absent.
- Sensitive internal content is not overexposed.
- Permission scopes match the approved manifest.
- Approval policy is followed before any recurring run, expanded scope, write, or external send.
- Rollback steps are available and understandable.
- Artifact is reusable as a reviewed template after a dry run.
- Follow-up recipients are correct.
- Follow-up tone fits the relationship and meeting context.
- No external follow-up is sent autonomously.

## Failure Handling
- If missing relevant thread, stop and ask for review instead of guessing.
- If wrong attendee match, stop and ask for review instead of guessing.
- If overly broad context retrieval, stop and ask for review instead of guessing.
- If capture evidence does not match the intended workflow boundary, stop and ask for review instead of guessing.
- If UI state differs from the imported plan, stop and ask for review instead of guessing.
- If selector target no longer exists, stop and ask for review instead of guessing.
- If wait condition cannot be observed reliably, stop and ask for review instead of guessing.

## Non-Goals
- Do not send external messages autonomously.
- Do not operate as an always-on screen recorder.
- Do not store raw keystrokes.
- Do not expand beyond approved connector scopes.
Export bundle: manifest included
0 saved

No skill pack export has been saved yet.

No raw screen video in exported skill pack. No raw typed text or keystroke material in exported skill pack. No autonomous external send. Use connector summaries and evidence refs before UI automation.

Evidence-derived skill updates

Skill Improvement Loop

draft only
Status: draft only

Operator review is required before any skill file changes.

Version: v0.1.0 -> v0.1.1

meeting-prep-agent skill pack

Input signals: 71/100

Passes basic eval at a 100% pass rate.

Safety: send blocked

No raw capture, private content, or autonomous mutation.

Daily evidence review: revise existing skill
9 signals, 4 change, 3 no-change
revise existing skill

Observer, operator, or existing proposal evidence contradicts the current skill behavior, so the review drafts a skill revision for operator review.

operator review required
workflow candidate

Partner meeting prep and follow-up candidate: manual context stitching, meeting prep pattern, follow up drafting pattern, connector overlap, repeated app sequence.

no change
observer review

useful after edits observer review with 3 skill suggestion(s).

revise existing skill
observer feedback

Operator feedback 1: 1 reuse(s), 1 edited field(s), dismissed false.

needs more observation
Draft-only improvement proposals: 19 proposals
Skill patches: 3, Tool schema: 3, Policies: 2, Evals: 9, Prompts: 2
skill file patch

Add an explicit source-of-truth order for Calendar, Gmail, Slack, Drive, and manual notes.

high
skill file patch

Add a stop condition when attendee/company matching is ambiguous.

medium
skill file patch

Require evidence references next to commitments, suggested asks, and follow-up claims.

medium
mcp tool schema

Expose a dry-run mode that returns planned connector queries and missing context.

high
mcp tool schema

Return evidence_refs and unsupported_claims as first-class output fields.

medium
mcp tool schema

Keep external follow-up drafting as a separate approval-required operation.

medium
Knowledge entries: 7 entries
Source-of-truth order travels with the skill

Add an explicit source-of-truth order for Calendar, Gmail, Slack, Drive, and manual notes.

source of truth rule
External sends require explicit approval

Require approval for expanded connector scopes, new recipients, recurring schedule changes, and any external send.

approval rule
Eval rule for reusable workflow quality

Brief includes attendees, company, last contact, open tasks, relevant docs, agenda, and suggested asks.

eval rule
Decision boundary for ambiguous context

Add explicit decision questions for attendee matching, evidence sufficiency, and follow-up readiness.

decision rule
Operator follow-up question 1

Which source should win if Gmail and Slack disagree?

operator question
Operator follow-up question 2

What exact brief sections do you reuse before every meeting?

operator question
Next eval cases: 4 proposed
Regression check: Brief includes attendees, company, last contact, open tasks, relevant docs, agenda, and suggested asks.

eval
Regression check: Follow-up draft includes only facts supported by retrieved context.

eval
Regression check: No external message is sent without approval.

eval
Regression check: Every imported step maps to an ordered Workflow IR step.

eval
Dry-run curator report: 1 pinned fence
Keep: 1, Revise: 1, Archive: 1, Consolidate: 1, Manual review: 1
keep

Curator keep pack: 22 min saved, eval 100%, edits -1

draft
revise

Curator revise pack: 4 min saved, eval 67%, edits 3

draft
archive

Curator archive pack: 0 min saved, eval 100%, edits 0

archived
consolidate

Curator consolidate pack: 22 min saved, eval 100%, edits -1

draft
manual review requested

Curator pinned pack: 4 min saved, eval 67%, edits 3

manual review requested

Report skill_pack_curator_report_2026_04_28 is dry-run only; no skill packs were mutated, archived, consolidated, deleted, sent, or patched.

Claim discipline: repo-native

This is the implemented Apprentice improvement loop: observer/eval/reuse metadata becomes draft skill updates, eval cases, and knowledge entries. It is not a current AutoContext or GBrain dependency.

First proposal: SKILL.md - Updated skill pack passes the basic eval harness and still excludes raw capture/private content.


Workflow Inbox

Captured operator workflows

3
Detected candidates: 1
Partner meeting prep and follow-up

Approve candidate, review Workflow IR diff, then promote only metadata and generated artifacts.

manual context stitching, meeting prep pattern, follow up drafting pattern, connector overlap, repeated app sequence; suggested

Workflow IR

Executive workflow

Trigger
Calendar event with external attendee and company domain 30 minutes before start
Execution surface
api native: Calendar, Gmail, Slack, Drive, and Browser are available through approved connectors; outputs are read-only or draft-only and external sends are blocked.
Inputs
Calendar, Gmail, Slack, Drive, Browser
Value: 91
Risk: 28
Confidence: 84
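
As a rough illustration of the Workflow IR shown above, the record could be modeled as a small data class. The field names are assumptions drawn from the labels on this page (trigger, execution surface, inputs, steps, decision points, scores); the actual IR schema lives in `references/workflow-ir.json` and is not reproduced here.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List


@dataclass
class WorkflowIR:
    """Hypothetical shape of a Workflow IR record; not the exported schema."""
    name: str
    trigger: str
    execution_surface: str
    inputs: List[str]
    steps: List[str] = field(default_factory=list)
    decision_points: List[str] = field(default_factory=list)
    scores: Dict[str, int] = field(default_factory=dict)


ir = WorkflowIR(
    name="Partner meeting prep and follow-up",
    trigger="Calendar event with external attendee and company domain 30 minutes before start",
    execution_surface="api_native",
    inputs=["Calendar", "Gmail", "Slack", "Drive", "Browser"],
    scores={"value": 91, "risk": 28, "confidence": 84},
)

# Serialize to JSON, e.g. for a field-level diff during IR review.
ir_json = json.dumps(asdict(ir), indent=2)
```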

Connector Evidence

Normalized source events

Partner check-in / calendar

Upcoming external partner meeting with company-domain attendee.

calendar.read; workflow ir reference; public or low sensitivity
Follow-up on integration timeline / gmail

Recent email thread includes open question about integration timeline and owner.

gmail.read; short lived summary; email content, private message
Partner launch notes / slack

Internal Slack thread mentions launch blocker and suggested agenda item.

slack.read; short lived summary; private message
Partner implementation notes / drive

Drive doc contains prior implementation notes and unresolved next steps.

drive.read; workflow ir reference; private message

Structured UI Step Plan

Imported plan normalized into Workflow IR

Imported workflow

Imported partner meeting prep UI plan

5 ordered steps mapped
Recommended fallback

browser automation

UI selectors are evidence, not execution authority.
Redaction posture

typed values redacted as metadata

Typed values stay redacted unless explicitly allowlisted later.
Generated review artifacts

2

imported-ui-plan-playwright-draft
Computer-use fallback: defer adapter

Use CUA Driver as a research reference for replayable trajectories and MCP-compatible computer-use fallback, but do not integrate it until API/MCP/CLI and Playwright surfaces fail a validated workflow.

native helper capture, playwright, cua driver, direct appkit screencapturekit

Partner Evidence Ledger

Concierge sprint operating queue

Target metrics

5 partners / 2 paid signals

Targets are goals, not achieved traction.
Demo fixtures

3

Visibly separated from partner-confirmed proof
Verified partners

0

0 partner-confirmed signals
Verified paid signals

0

0 until a verified payment record is linked
Demo founder operator sprint / founder operator

Replace fixture row with a partner-approved capture session before claiming traction.

ready, demo fixture, 3 workflows, 5 deliverables, raw retention 24h
usage, demo fixture

demo fixture / 4/28/2026

Demo chief of staff sprint / chief of staff

Collect a real example recurring meeting and source metadata before upgrading this row.

scheduled, demo fixture, 3 workflows, 5 deliverables, raw retention 12h
verbal, demo fixture

demo fixture / 4/28/2026

Demo investor operator sprint / investor operator

Run intake interview and replace demo fixture with a partner-approved packet.

candidate, demo fixture, 3 workflows, 5 deliverables, raw retention 24h
follow up, demo fixture

demo fixture / 4/28/2026

Verification

Permission, eval, and audit trail

Permission manifest

calendar.read, gmail.read, slack.read, drive.read

Raw connector bodies are not rendered in the dashboard; raw content is summarized ephemerally and deleted after Workflow IR extraction unless explicitly retained.
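
One way to enforce the eval item "Permission scopes match the approved manifest" is a set difference against the four read scopes listed above. The helper name `unapproved_scopes` is hypothetical, and `gmail.send` in the usage note is only an example of a scope outside the manifest.

```python
from typing import Iterable, List

# Read-only scopes from the permission manifest shown on this page.
APPROVED_SCOPES = frozenset({"calendar.read", "gmail.read", "slack.read", "drive.read"})


def unapproved_scopes(requested: Iterable[str]) -> List[str]:
    """Return requested scopes that are not in the approved manifest, sorted."""
    return sorted(set(requested) - APPROVED_SCOPES)
```

A call such as `unapproved_scopes(["gmail.read", "gmail.send"])` returns `["gmail.send"]`, which a runner could treat as an expanded-scope request that requires approval.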
Sensitivity posture

public or low sensitivity, email content, private message

No escalation flags in canonical fixture
Eval runner

85/100

0 blocking failures
Audit log

2 metadata-only entries

brief_ref:partner-meeting-2026-04-30

Workflow Registry

Reusable agent-native templates

Partner meeting prep and follow-up / chief of staff

Calendar event with external attendee and company domain 30 minutes before start

v0.1.0, reviewed, 4 artifacts, 18 min/run saved
Weekly investor update packet / founder/operator

Friday digest window or explicit workflow capture

v0.1.0, draft, 3 artifacts, 45 min/run saved
Candidate sourcing follow-up / operator

Candidate thread with unresolved next step

v0.1.0, draft, 3 artifacts, 15 min/run saved
Daily team digest / chief of staff

Daily digest window or explicit team-summary request

v0.1.0, draft, 2 artifacts, 20 min/run saved
Research brief builder / investor/operator

Manual research brief request with target company or topic

v0.1.0, draft, 3 artifacts, 35 min/run saved

Team Workspace

Operator Apprentice Sprint

Members

4

founder operator, chief of staff, admin, reviewer
Shared workflows

5

workflow_partner_meeting_prep, workflow_weekly_investor_update, workflow_candidate_sourcing_follow_up
Admin controls

Audit export on

Keystrokes and always-on screen recording default to off.
Retention

metadata only

ephemeral delete after ir

Team Governance

Policies, exports, and analytics

Shared approval policies

3

read only, draft only, defer to checklist
Admin controls

calendar, gmail, slack, drive, browser

Raw capture TTL: 12 hours
Audit export

jsonl metadata-only

audit-export://workspace_operator_team/2026-04/team_audit_export_2026_04.jsonl
Workflow analytics

18 recurring runs

9 approved, 1 rejected
MCP server manifest

operator-workflow-compiler

1 tool, 3 resources, 2 prompts
Developer API

/v1

5 metadata-only resources
Marketplace

2 listings

Partner meeting prep and follow-up
Observability

78% avg success

2 checklist degradations
Partner meeting prep and follow-up: 270 min saved
Daily team digest: 180 min saved
Candidate sourcing follow-up: 90 min saved

Generated Artifacts

Agent-ready outputs

codex_skill, meeting prep agent, draft
mcp_tool_stub, meeting prep brief, draft
approval_policy, meeting prep draft only, ready for review
eval_checklist, meeting prep quality eval, ready for review
prompt_template, meeting prep synthesis prompt, draft

Report Export

Agent Workflow Report bundle

Design partner

Founder operator sprint

founder operator / Operator office
Workflow IR draft

Partner meeting prep and follow-up

api_native
Expected ROI

4.2 h/mo

18 min/run x 14 runs
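
The expected ROI figure follows directly from the two numbers shown. The variable names below are illustrative, and the result is an expectation for a fixture-backed workflow, not measured time saved.

```python
MINUTES_SAVED_PER_RUN = 18  # from the registry entry for this workflow
RUNS_PER_MONTH = 14         # from the report export

minutes_per_month = MINUTES_SAVED_PER_RUN * RUNS_PER_MONTH  # 252 minutes
hours_per_month = round(minutes_per_month / 60, 1)          # 4.2 hours

print(f"{hours_per_month} h/mo")  # prints: 4.2 h/mo
```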
Safety posture

draft only

Raw keystrokes, always-on screen, and autonomous sends are off.
meeting-prep-agent: primary generated artifact

Calendar, Gmail, Slack, Drive, and Browser are available through approved connectors; outputs are read-only or draft-only and external sends are blocked.

13 eval checks, 5 artifacts, 10663 markdown chars

Approval Policy

Safe execution defaults

  1. Read approved connector context during prep windows.
  2. External messages remain blocked unless explicitly approved.
  3. 4 connector reads planned; raw private content is excluded from logs.
  4. Rollback preview: delete generated draft; disable scheduled run; revoke connector token.
Mode
External sends: Always blocked unless a separate approval-required send action exists.
Checklist degradation: In high-risk fixture mode, defer to checklist.
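
The safe execution defaults above can be sketched as a fail-closed decision function. This is an illustration under assumptions, not the shipped policy engine: the action names, the `Decision` enum, and the `decide` helper are hypothetical, and `approved` stands in for a separate, explicit approval event.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    NEEDS_APPROVAL = "needs_approval"
    BLOCK = "block"


# Hypothetical action names for operations the policy says require approval.
APPROVAL_REQUIRED = {"recurring_run", "expand_scope", "new_recipient", "write"}


def decide(action: str, approved: bool = False) -> Decision:
    """Map an action to a decision under draft-only defaults."""
    if action in {"read", "draft"}:
        return Decision.ALLOW
    if action == "external_send":
        # Sends stay blocked unless a separate send approval exists.
        return Decision.ALLOW if approved else Decision.BLOCK
    if action in APPROVAL_REQUIRED:
        return Decision.ALLOW if approved else Decision.NEEDS_APPROVAL
    # Unknown actions fail closed.
    return Decision.BLOCK
```

Failing closed on unknown actions keeps the default aligned with the draft-only posture: anything the policy does not explicitly recognize is blocked rather than allowed.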