Version: 1.0
Generated: 2026-03-22
Scope: Complete reference for ADLC + PDLC autonomous pipelines
The ODS Platform runs two autonomous pipelines on a single server, each managed by a Claude Code session inside a tmux window, driven by bash-based dispatchers running on systemd timers.
              +-----------------------------------+
              |            Linux Server           |
              |   /home/jniox_orbusdigital_com/   |
              +-----------------------------------+
                    |                       |
        +-----------+                       +-----------+
        |                                               |
+-------v--------------+               +----------------v-------+
|    ADLC Pipeline     |               |     PDLC Pipeline      |
|    (Development)     |               |       (Product)        |
+-------+--------------+               +----------------+-------+
        |                                               |
+-------v--------------+               +----------------v-------+
| tmux: ods-claude     |               | tmux: ods-pdlc         |
| Claude supervisor    |               | Claude supervisor      |
| CLAUDE.md rules      |               | CLAUDE.md rules        |
+-------+--------------+               +----------------+-------+
        |                                               |
+-------v--------------+               +----------------v-------+
| dispatcher-v3.sh     |               | dispatcher-pdlc.sh     |
| systemd timer: 5 min |               | systemd timer: 10 min  |
| deterministic bash   |               | deterministic bash     |
+-------+--------------+               +----------------+-------+
        |                                               |
+-------v--------------+               +----------------v-------+
| session-health.sh    | <-- shared -->| session-health.sh      |
| systemd timer: 1 min |               | systemd timer: 1 min   |
+----------------------+               +------------------------+
| Pipeline | Purpose | tmux Session | Dispatcher | Interval | CLAUDE.md |
|---|---|---|---|---|---|
| ADLC | Autonomous Development Lifecycle | ods-claude | dispatcher-v3.sh | 5 min | ~/dev/CLAUDE.md |
| PDLC | Product Development Lifecycle | ods-pdlc | dispatcher-pdlc.sh | 10 min | PDLC CLAUDE.md |
dispatcher-v3.sh (ADLC, every 5 minutes):
- Consolidates feature branches into dev
- Reads status files for every service in ~/dev/projects/
- Runs deterministic state transitions (tests, status recovery)
- Injects complex decisions into Claude via tmux send-keys
- Processes Slack inbox messages
- Detects external blockers (missing .env vars, missing Coolify configs)
- Spawns resolver for systemic issues (2+ services blocked on same cause)
- Posts kanban summary every 6th run (30 min)

dispatcher-pdlc.sh (PDLC, every 10 minutes):
- Processes PDLC Slack inbox messages
- Checks ADLC state for GTM/Analytics triggers (services on staging)
- Pokes PDLC Claude if idle (no agents running, waiting at prompt)
- Posts PDLC kanban summary every 6th run (60 min)
Monitors the ods-claude tmux session:
1. Session alive check: if the tmux session is missing, creates a new one and boots Claude
2. Idle detection: hashes pane content; if unchanged for 15 consecutive minutes with no agents running, kills and restarts the session
3. Memory check: if available memory < 512MB, kills the largest non-supervisor Claude process
4. Session age check: if the session is > 12 hours old AND > 200 interactions, restarts when no agents are running
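Check 2 can be sketched as follows. This is a minimal sketch: in the real script the input would come from `tmux capture-pane -p -t ods-claude`, and the helper name and state-file format here are illustrative assumptions.

```shell
# Sketch of idle detection (check 2): hash the captured pane text each
# minute and count consecutive unchanged minutes. Helper name and
# state-file format are illustrative.
idle_minutes() {
  # $1 = current pane text, $2 = state file holding "last-hash count"
  local state="$2" hash last="" count=0
  hash=$(printf '%s' "$1" | md5sum | cut -d' ' -f1)
  [ -f "$state" ] && read -r last count < "$state"
  count=${count:-0}
  if [ "$hash" = "$last" ]; then
    count=$((count + 1))    # unchanged for another interval
  else
    count=0                 # content changed: reset the idle counter
  fi
  printf '%s %s\n' "$hash" "$count" > "$state"
  echo "$count"             # restart the session when this reaches 15
}
```

A health checker would restart the session once this counter reaches 15 and no agents are running.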
| Bridge | Script | Inbox Directory | Channel Monitored |
|---|---|---|---|
| ADLC | slack-bridge.sh | ~/dev/ops/slack-inbox/ | C0AN0N8AUGZ (ADLC), D0AGRAVEC1K (DM) |
| PDLC | pdlc-slack-bridge.sh | ~/dev/ops/pdlc-slack-inbox/ | C0AN42N3C0L (PDLC) |
Bridges poll Slack for new messages and write them as files to the inbox directory. Dispatchers pick them up and inject them into the appropriate Claude session.
At 4:00 AM UTC, a systemd timer triggers a full restart:
- Kills both tmux sessions
- Starts fresh sessions
- Claude reads CLAUDE.md and runs /boot (ADLC) or /boot-pdlc (PDLC)
- /boot reconstructs state from: agent-memory files, progress.md per project, git state per service, interrupted RUNNING status files, system resources
- Starts pipeline loop
Dev (code) --> Tests (bash) --> BA (review) --> Architect+Security+DevOps (parallel reviews)
--> PR (create+merge) --> Provisioner (if needed) --> Deploy (staging)
--> Scenario (generate tests) --> E2E (execute tests)
Trigger: Dispatcher detects pending tasks in progress.md with no dev.status file, or Claude receives /dev-task command.

Agent: dev (model: opus, maxTurns: 50)

Execution:
1. Dev agent reads spec at ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md
2. Creates branch feat/$TASK_ID from dev
3. TDD cycle: write failing test, implement, refactor, repeat
4. Commits with conventional format: feat($SERVICE): description [$TASK_ID]
5. Pushes feature branch

Output files:
- Git commits on feat/$TASK_ID branch
- Update to ~/dev/specs/$PROJECT/gestion/progress.md: - [x] DEV: $TASK_ID -- $DESCRIPTION ($DATE)
- Update to ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$TASK_ID: DEV_COMPLETE @ ISO-timestamp

Status file written: None directly. The dispatcher's branch consolidation step merges feature branches into dev, then sets $SERVICE-dev.status.

On success: Branch exists with commits. Dispatcher proceeds to tests.
On failure: If tests fail 3 times on same issue, agent stops and reports blocker. Circuit breaker increments crash counter.
Trigger: Every dispatcher run (5 min), before pipeline scan.

Executor: dispatcher-v3.sh consolidate_branches() function (pure bash).

Execution:
1. For each service in ~/dev/projects/*/:
   - Auto-commit any uncommitted changes on feature branches: git add -A && git commit -m "wip: auto-commit $branch"
   - Switch to dev branch
   - Merge each feat/* branch with --no-edit
   - Delete merged feature branches
   - Push dev to origin
2. If merge conflict: abort merge, log conflict, post to Slack DM

Output: Consolidated dev branch with all feature work merged.
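The per-service merge loop can be sketched roughly like this. It is a condensed sketch: the real consolidate_branches() also auto-commits WIP changes and posts conflicts to Slack DM, and the helper name here is illustrative.

```shell
# Sketch of per-service branch consolidation (condensed; the real
# consolidate_branches() also auto-commits WIP and notifies Slack).
consolidate_one() {
  local repo="$1" branch
  cd "$repo" || return 1
  git checkout -q dev || return 1
  for branch in $(git branch --list 'feat/*' --format='%(refname:short)'); do
    if git merge --no-edit -q "$branch"; then
      git branch -d "$branch" >/dev/null   # delete merged feature branch
    else
      git merge --abort                    # conflict: abort, leave for escalation
      echo "merge conflict on $branch" >&2
    fi
  done
  git push -q origin dev 2>/dev/null || true   # no-op if no remote configured
}
```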
Trigger: $SERVICE-dev.status = DONE AND no $SERVICE-test.status file exists.

Executor: test-runner.sh (pure bash, zero Claude tokens).

Execution:
bash ~/dev/ops/adlc-v2/scripts/test-runner.sh $PROJECT $SERVICE

Status file written:
- Success: echo "PASS" > ~/dev/ops/outputs/$SERVICE-test.status
- Failure: echo "FAIL" > ~/dev/ops/outputs/$SERVICE-test.status

On success: Dispatcher proceeds to BA review.
On failure: Crash counter incremented. If < 3 crashes, may retry. If >= 3, circuit breaker triggers.
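The crash-counter behavior can be sketched as follows. The $SERVICE.crashes file layout (a plain integer) comes from this document; the helper name and the "retry"/"circuit-open" strings are illustrative.

```shell
# Sketch of the circuit breaker: bump the per-service crash counter
# ($SERVICE.crashes, a plain integer) and decide whether to keep retrying.
bump_crashes() {
  local file="$1" n=0
  [ -f "$file" ] && n=$(cat "$file")
  n=$((n + 1))
  echo "$n" > "$file"
  if [ "$n" -ge 3 ]; then
    echo "circuit-open"   # >= 3 crashes: stop retrying, escalate
  else
    echo "retry"          # < 3 crashes: dispatcher may retry
  fi
}
```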
Trigger: $SERVICE-test.status = PASS AND no $SERVICE-ba.status file exists.

Agent: ba (model: sonnet, maxTurns: 35)

Dispatcher action: Sets $SERVICE-ba.status to RUNNING, then injects into Claude:
Spawn /agent ba for $SERVICE. PROJECT: $PROJECT. Spec: ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md

Execution:
1. BA agent reads FULL spec, extracts every acceptance criterion (AC-001, AC-002, …)
2. Reads the code, checks last-reviewed-commit.txt for incremental diff
3. For each criterion: finds implementing code, records file+line as evidence
4. Evaluates each: MET, PARTIAL, MISSING, DEVIATION, N/A
5. For app services: verifies API contracts, database schema, events match spec exactly
6. For infra services: verifies library API, deployment config, topic definitions

Output files:
- ~/dev/ops/reviews/$SERVICE/ba-report.json (JSON with criteria, deviations, verdict)
- ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$SERVICE: BA_PASS|BA_FAIL @ timestamp

Status file format:
DONE | 2026-03-22T10:30:00+00:00 | ba | $SERVICE | from-json-report
or
FAILED | 2026-03-22T10:30:00+00:00 | ba | $SERVICE | from-json: non-compliant

Verdict rules:
- compliant: criteriaMissing == 0 AND criteriaPartial == 0 AND no HIGH/CRITICAL deviations
- non-compliant: criteriaMissing > 0 OR any HIGH/CRITICAL deviation

On success (compliant): Dispatcher proceeds to parallel reviews.
On failure (non-compliant):
- Dispatcher sets $SERVICE-ba.status to TRIAGING and injects into Claude for analysis
- Claude reads the BA report JSON to determine if missing criteria map to pending tasks (not a real failure) or to completed tasks (real failure requiring dev fix)

Anti-rubber-stamp rules:
- Every criterion MUST have evidence (file path + line number)
- If implementing code cannot be found, mark MISSING (never assume)
- Read FULL spec every time (never rely on memory)
- Never suggest code changes (only report deviations)
- Minimum review depth: read every controller/route file and every test file
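The verdict rules can be expressed as a small jq check over ba-report.json. The exact report schema is not spelled out here, so the field names (criteriaMissing, criteriaPartial, deviations[].severity) are assumptions inferred from the verdict rules above:

```shell
# Sketch: compute the BA verdict from ba-report.json. Field names are
# assumptions inferred from the verdict rules in this document.
ba_verdict() {
  jq -r '
    if .criteriaMissing == 0 and .criteriaPartial == 0
       and ([.deviations[]? | select(.severity == "HIGH" or .severity == "CRITICAL")]
            | length) == 0
    then "compliant"
    else "non-compliant"
    end
  ' "$1"
}
```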
Trigger: $SERVICE-ba.status = DONE AND no $SERVICE-review.status file exists.

Dispatcher action: Sets $SERVICE-review.status to RUNNING, then injects into Claude:
Spawn /agent architect + /agent security + /agent devops for $SERVICE

All three run in parallel (if memory > 2000MB).

Agent: architect (model: sonnet, maxTurns: 30)

8 mandatory checks (ALL must pass):
1. Schema Isolation – service uses own DB schema only
2. Inter-Service Communication – Redpanda events only, no direct HTTP between ODS services
3. Multi-Tenancy – tenant_id from JWT, RLS on all tables, tenant_id in all CloudEvents
4. Layer Structure – Controllers > Services > Repositories separation
5. No Hardcoded URLs – all external endpoints via env vars
6. Header Propagation – Authorization, X-Tenant-Id, X-Correlation-Id, X-Source-Service forwarded
7. CloudEvents Compliance – specversion, type, source, id, time, tenantid fields present
8. Error Handling – service-specific exception filters, correlation ID in logs, no stack traces in responses

Output: ~/dev/ops/reviews/$SERVICE/architect-report.json

Verdict: Single FAIL on any check = overall FAIL. Every check needs evidence.
Agent: security (model: sonnet, maxTurns: 30)

OWASP Top 10 checks:
- A01 Injection, A02 Broken Auth, A03 Sensitive Data, A04 XXE, A05 Access Control
- A06 Security Misconfiguration, A07 XSS, A08 Insecure Deserialization
- A09 Known Vulnerabilities (npm audit), A10 Insufficient Logging

Automated scans run first: npm audit, secrets scan (grep), hardcoded URLs scan, .gitignore check.

Output: ~/dev/ops/reviews/$SERVICE/security-report.json

Verdict:
- clean: all checks PASS/N/A, no npm audit critical/high, no secrets
- concerns: any WARN, medium/low npm advisories
- critical: any FAIL, critical/high npm advisory, any secret in code

CRITICAL or HIGH severity = automatic FAIL, PR must NOT merge.
Agent: devops (model: sonnet, maxTurns: 30, mode: review)

Mandatory validation (all executed):
1. Docker build test – docker build -t test-$SERVICE .
2. .dockerignore check
3. Health endpoint verification (grep for health/readiness/liveness)
4. Structured logging check
5. Env vars documentation (.env.example)
6. Migrations check
7. Test suite execution

Output: ~/dev/ops/reviews/$SERVICE/devops-report.json

Verdict: PASS, FAIL, or PASS_WITH_NOTES
After all three reviews complete: Dispatcher checks $SERVICE-review.status. If the three JSON report files all exist, sets it to DONE. Then validates verdicts:

BA=compliant AND ARCH=PASS AND SEC=(clean|concerns with severity!=HIGH/CRITICAL) AND DEVOPS=(PASS|PASS_WITH_NOTES)

If all pass: proceed to PR. If any fails: set $SERVICE-review.status to FAILED, increment crashes.
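The combined gate might look like the following in bash. This is a sketch: it assumes each JSON report exposes a top-level "verdict" field plus a "severity" field on the security report, which this document does not specify.

```shell
# Sketch of all_reviews_pass(). Report field names ("verdict",
# "severity") are assumptions; the actual schema is not specified here.
all_reviews_pass() {
  local dir="$1" ba arch sec sev devops
  ba=$(jq -r '.verdict // empty' "$dir/ba-report.json" 2>/dev/null)
  arch=$(jq -r '.verdict // empty' "$dir/architect-report.json" 2>/dev/null)
  sec=$(jq -r '.verdict // empty' "$dir/security-report.json" 2>/dev/null)
  sev=$(jq -r '.severity // "NONE"' "$dir/security-report.json" 2>/dev/null)
  devops=$(jq -r '.verdict // empty' "$dir/devops-report.json" 2>/dev/null)

  [ "$ba" = "compliant" ] || return 1
  [ "$arch" = "PASS" ] || return 1
  case "$sec" in
    clean) ;;
    concerns) case "$sev" in HIGH|CRITICAL) return 1 ;; esac ;;
    *) return 1 ;;
  esac
  case "$devops" in PASS|PASS_WITH_NOTES) ;; *) return 1 ;; esac
  return 0
}
```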
Trigger: $SERVICE-review.status = DONE AND all_reviews_pass() returns true AND no $SERVICE-pr.status file exists.

Agent: pr (model: sonnet, maxTurns: 15)

Execution:
1. Reads all four JSON review reports programmatically (never trusts markdown)
2. Validates: BA=compliant, ARCH=PASS, SEC not critical, DEVOPS=PASS/PASS_WITH_NOTES
3. If ANY report MISSING or FAIL: ABORT, do not create PR

Auto-merge safeguards (ALL must pass for auto-merge):
- All reviews PASS (not just PASS_WITH_NOTES)
- Lines changed <= 500
- No migration files in diff
- No cross-service/shared lib changes
- Security severity = NONE

If all pass: gh pr merge --squash --auto
If any fails: create PR with label human-review-required, post to Slack DM, do NOT merge.

Output:
- GitHub PR on staging branch
- Status update to ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$SERVICE: STAGING_DEPLOYED @ timestamp
- Update to ~/dev/specs/$PROJECT/gestion/progress.md

Status file: $SERVICE-pr.status = DONE | date | pr | $SERVICE | PR merged

Abort conditions: Any review MISSING/FAIL, CI fails, merge conflicts, security HIGH/CRITICAL.
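Two of the mechanical safeguards (lines changed, migration files) can be checked with plain git. A sketch, with the helper name and branch arguments illustrative; the 500-line threshold comes from the rules above:

```shell
# Sketch of the diff-based auto-merge safeguards: total lines changed
# <= 500 and no migration files in the diff.
automerge_diff_ok() {
  local base="$1" head="$2" changed
  # Sum added + deleted lines across the diff (numstat is machine-readable)
  changed=$(git diff --numstat "$base...$head" | awk '{s += $1 + $2} END {print s + 0}')
  [ "$changed" -le 500 ] || return 1
  # Reject if any path in the diff looks like a migration file
  if git diff --name-only "$base...$head" | grep -qi 'migration'; then
    return 1
  fi
  return 0
}
```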
Trigger: $SERVICE-pr.status = DONE AND no $SERVICE-deploy.status AND no Coolify config file (~/dev/ops/coolify/$SERVICE.json).

Agent: provisioner (model: sonnet, maxTurns: 30)

Execution (based on service type from service-project-map.json):

| Service Type | Action |
|---|---|
| web-service | Create Coolify Application (Dockerfile build pack) |
| infrastructure | Create Coolify Service (docker-compose) |
| script | No Coolify app. Mark DONE immediately. |
Resources provisioned:
- PostgreSQL schema + user (psql)
- GCS bucket + service account (gcloud)
- Coolify app (Coolify API)
- GitHub repo (gh)
- .env.staging file

Post-provision check (MANDATORY before marking DONE):
- web-service: trigger initial deploy, health check (2 min timeout)
- infrastructure: check containers running
- script: verify project builds
Output:
- ~/dev/ops/coolify/$SERVICE.json (Coolify config with appUuid, URLs, registry)
- ~/dev/ops/reviews/$SERVICE/provisioner-report.json
- ~/dev/projects/$SERVICE/.env.staging

Status file:
DONE | date | provisioner | $SERVICE | verified-operational
or
PROVISION_INCOMPLETE

On failure: PROVISION_INCOMPLETE or BLOCKED, escalate to Slack DM.

What requires human: COOLIFY_API_TOKEN, external API credentials (Stripe, SendGrid, CinetPay), DNS wildcard, SMTP credentials.

What does NOT require human: Coolify app creation, PostgreSQL schema, GCS bucket, GitHub repo, env vars, TLS certs.
Trigger: $SERVICE-pr.status = DONE AND Coolify config exists AND no $SERVICE-deploy.status.

Agent: devops (model: sonnet, maxTurns: 30, mode: deploy)

Execution:
1. Build and tag Docker image
2. Push to registry (if configured)
3. Deploy to Coolify via API (restart application)
4. Health check loop (every 10s, up to 3 min)
5. Smoke test
6. Write deploy report

If health fails: rollback via Coolify API, notify Slack DM.
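Step 4's loop might look like this sketch. Interval and timeout are parameterized here so the behavior is testable; the /health path is an assumption, and the rollback itself is left to the caller.

```shell
# Sketch of the deploy health-check loop (step 4): poll GET /health
# every 10s for up to 3 minutes. The /health path is an assumption.
wait_for_health() {
  local url="$1" timeout="${2:-180}" interval="${3:-10}"
  local deadline=$(( $(date +%s) + timeout ))
  while [ "$(date +%s)" -lt "$deadline" ]; do
    if curl -fsS --max-time 5 "$url/health" >/dev/null 2>&1; then
      return 0    # healthy
    fi
    sleep "$interval"
  done
  return 1        # never became healthy: caller rolls back via Coolify API
}
```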
Output:
- ~/dev/ops/reviews/$SERVICE/deploy-report.json
- ~/dev/projects/$SERVICE/.env.staging

Status file:
DONE | date | deploy | $SERVICE | script-type: no permanent deploy
or
DEPLOYED | date | devops | $SERVICE | $STAGING_URL
Trigger: Deploy health check passes (sequential, must wait for deploy).

Agent: scenario (model: sonnet, maxTurns: 25)

Execution:
1. Reads spec and all review files
2. Generates API scenarios (scenarios.json)
3. Generates browser scenarios (browser-scenarios.json) if service has UI
4. Writes mock data SQL and cleanup SQL

Mandatory API scenario categories: happy-path, multi-tenancy, auth, validation, error
Mandatory browser scenario categories (if UI): user-journey, multi-tenancy, responsive, error-display, navigation

Output:
- ~/dev/projects/$SERVICE/tests/e2e/scenarios.json
- ~/dev/projects/$SERVICE/tests/e2e/browser-scenarios.json
- ~/dev/projects/$SERVICE/tests/e2e/mock-data.sql
- ~/dev/projects/$SERVICE/tests/e2e/cleanup.sql

Status:
$PROJECT/$SERVICE: SCENARIOS_READY @ timestamp
Trigger: Scenario agent completes (sequential after scenarios).

Agent: e2e-test (model: sonnet, maxTurns: 30)

Tools: Full Playwright MCP tool suite + Read, Write, Bash, Glob, Grep

Pre-check: Verify staging URL is reachable via curl $STAGING_URL/health. If not: ABORT with E2E_BLOCKED.

Phase 1 – API E2E Tests: Execute each scenario from scenarios.json using curl against the staging URL. Test categories: happy path, auth (expired/wrong/no token), multi-tenancy (cross-tenant access blocked), input validation, error handling.

Phase 2 – Browser E2E Tests: Execute browser scenarios using Playwright MCP tools: browser_navigate, browser_fill_form, browser_click, browser_wait_for, browser_snapshot, browser_take_screenshot. Save screenshots to ~/dev/ops/reviews/$SERVICE/screenshots/.

Phase 3 – Cleanup: Run cleanup.sql against the test database.

Output:
- ~/dev/ops/reviews/$SERVICE/e2e-report.json
- Screenshots in ~/dev/ops/reviews/$SERVICE/screenshots/

Verdict rules:
- All API + browser tests pass: E2E_PASS
- Any test fails: E2E_FAIL
- Staging not reachable: E2E_BLOCKED
- Console JS errors during browser tests: E2E_FAIL (even if assertions pass)
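As an illustration of the auth category in Phase 1, a no-token check might look like this. The /api/items endpoint is hypothetical; real checks are driven by scenarios.json.

```shell
# Sketch: one "auth" category API check. A request with no token must
# get 401. The /api/items path is hypothetical.
check_auth_required() {
  local staging_url="$1" path="${2:-/api/items}" code
  code=$(curl -s -o /dev/null -w '%{http_code}' "$staging_url$path")
  [ "$code" = "401" ]
}
```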
Discovery --> Spec Writing --> Prioritization --> Validation --> ADLC Handoff
--> (wait for ADLC) --> GTM Prep --> Analytics
DISCOVERY --> SPEC_WRITING --> PRIORITIZED --> VALIDATING --> VALIDATED
--> SUBMITTED_TO_ADLC (passive wait) --> ADLC_ACTIVE (passive wait)
--> ADLC_STAGING --> GTM --> ANALYTICS
Trigger: New market signal, user feedback, stakeholder request, or Slack command discover {topic}.

Agent: discovery (model: sonnet, maxTurns: 25)

Tools: Read, Bash, Glob, Grep, WebSearch, WebFetch

Execution:
1. Understand the request
2. Research: WebSearch for market data, competitor analysis, industry trends
3. Analyze existing context: read PROJECT.md, existing specs, business-rules.md
4. Identify gaps in platform capabilities
5. Write opportunity brief

Output:
- ~/dev/specs/$PROJECT/pdlc/opportunities/$OPPORTUNITY_NAME.md (markdown brief with problem statement, market evidence, proposed solution, target users, business impact, risks, recommendation)
- ~/dev/specs/$PROJECT/pdlc/opportunities/$OPPORTUNITY_NAME.json (structured summary)

Recommendation values: GO, NO_GO, NEEDS_MORE_DATA
Priority values: CRITICAL, HIGH, MEDIUM, LOW

Status file:
echo "RUNNING | $(date) | discovery | $SERVICE" > ~/dev/ops/outputs/$SERVICE-discovery.status

On success: Proceed to spec writing (if recommendation=GO).

Slack: Posts to PDLC channel (C0AN42N3C0L).
Trigger: Discovery recommendation = GO AND no spec exists for target service.

Agent: spec-writer (model: opus, maxTurns: 40)

Execution:
1. Read opportunity brief
2. Read architecture.md and business-rules.md for constraints
3. Read 1-2 existing specs for format and depth reference
4. Write the spec

Spec structure (11 mandatory sections):
1. Objectif
2. API Endpoints (method, path, request/response types, status codes, auth, multi-tenancy)
3. Data Model (tables, columns, types, indexes, RLS, relationships)
4. Events/CloudEvents (types, payload schema, topics)
5. Business Rules (validation, edge cases, error handling)
6. Acceptance Criteria (numbered AC-001..AC-N, Given/When/Then, minimum 10)
7. Non-Functional Requirements (performance, security, multi-tenancy)
8. Dependencies
9. Service Classification (MANDATORY – type, stack, deploy mode, staging domain)
10. Infrastructure Requirements (MANDATORY – resource table with auto-provisionable flag)
11. Out of Scope

Output:
- ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md
- Updates ~/dev/ops/agents/service-project-map.json with new entry
- Updates ~/dev/specs/$PROJECT/gestion/backlog.md

Quality rules: Every endpoint must have types. Every table must have tenant_id + RLS. Every AC must be testable. Minimum 10 ACs. Sections 9 and 10 are mandatory.

Status file:
echo "RUNNING | $(date) | spec-writer | $SERVICE" > ~/dev/ops/outputs/$SERVICE-spec-writer.status

Slack: Posts to PDLC channel.
Trigger: New spec written, not yet prioritized.

Agent: prioritization (model: sonnet, maxTurns: 20)

RICE Scoring:
- Reach (1-10): users/tenants affected per quarter
- Impact (0.25, 0.5, 1, 2, 3): Minimal to Massive
- Confidence (0.5, 0.8, 1.0): Low, Medium, High
- Effort (1-10): agent-weeks
- Score = (Reach x Impact x Confidence) / Effort

Output:
- Updates ~/dev/specs/$PROJECT/gestion/backlog.md (ranked table with RICE scores, phase assignments P0-P3)
- ~/dev/specs/$PROJECT/pdlc/prioritization-$DATE.json

Phase groupings: P0 (foundation, blocks everything), P1 (core, enabled by P0), P2 (growth), P3 (nice-to-have)

Slack: Posts top 3 items and current phase count to PDLC channel.
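The score formula is plain arithmetic; a sketch (helper name illustrative), with awk handling the fractional impact and confidence values:

```shell
# Sketch: RICE score = (Reach x Impact x Confidence) / Effort.
rice_score() {
  # rice_score REACH IMPACT CONFIDENCE EFFORT
  awk -v r="$1" -v i="$2" -v c="$3" -v e="$4" \
      'BEGIN { printf "%.2f\n", (r * i * c) / e }'
}
```

For example, Reach 8, Impact 2, Confidence 0.8, Effort 4 scores (8 x 2 x 0.8) / 4 = 3.20.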
Trigger: Spec prioritized, not yet validated.

Agent: validation (model: sonnet, maxTurns: 25)

Validation checklist:
- Spec completeness (6 checks): API types, DB schema with tenant_id+RLS, 10+ testable ACs, CloudEvents, dependencies, out of scope
- Technical feasibility: architecture alignment, no circular deps, dependencies exist/planned, no stack conflicts
- Effort estimation: T-shirt sizing (S/M/L/XL), broken down by API/DB/logic/tests/integration
- Risk assessment: technical, integration, security, scope risks
- Dependency map: blocked-by, blocks, shared libs, schema conflicts
- Infrastructure readiness: Section 9 exists, auto-provisionable vs human resources identified, env vars match .env.example

Output:
~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json

Verdict rules:
- APPROVED: spec complete, feasible, no HIGH risks, dependencies satisfied
- NEEDS_REVISION: spec incomplete or has addressable issues – sent back to spec-writer
- BLOCKED: depends on unbuilt services, not feasible with current stack

On APPROVED:
- Creates handoff file: ~/dev/specs/$PROJECT/pdlc/handoffs/$SERVICE-handoff.json
- Posts to ADLC channel: spec ready for development

On NEEDS_REVISION: Posts details to PDLC channel. Spec-writer agent may be re-spawned for revision.

Slack: Posts to PDLC channel.
Trigger: Validation verdict = APPROVED.

Execution (by PDLC orchestrator):
1. Ensure spec.md exists
2. Write handoff file:

{
  "service": "$SERVICE",
  "project": "$PROJECT",
  "date": "ISO-timestamp",
  "state": "SUBMITTED_TO_ADLC",
  "specPath": "~/dev/specs/$PROJECT/specs/$SERVICE/spec.md"
}

3. Post to ADLC channel: :package: PDLC->ADLC handoff: $SERVICE spec ready for development
4. Update state: $SERVICE: SUBMITTED_TO_ADLC @ timestamp

After handoff – passive wait rules:
- PDLC MUST NOT re-submit, re-validate, re-discover, or spawn any agent for this service
- PDLC only reads ADLC state passively every 10 minutes
| ADLC State Detected | PDLC Action |
|---|---|
| No dev status yet (< 24h) | Wait. Normal. |
| No dev status yet (> 24h) | Post reminder to ADLC channel |
| Dev RUNNING or DONE | Update to ADLC_ACTIVE |
| STAGING_DEPLOYED or PR merged | Update to ADLC_STAGING, spawn GTM |
| E2E_PASS | Post to DM: ready for prod promotion |
| PROD_DEPLOYED | Spawn Analytics agent |
Trigger: ADLC deploys service to staging (dispatcher-pdlc.sh detects STAGING_DEPLOYED in ADLC state and no GTM brief exists).

Agent: gtm (model: sonnet, maxTurns: 20)

Tools: Read, Bash, Glob, Grep, WebSearch

Output: ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md with:
- Positioning (what, who, value prop, differentiation)
- Feature summary
- Rollout plan (internal testing, beta, GA, rollback)
- Documentation needs (API docs, user guide, admin guide, migration guide, changelog)
- Communication plan
- Success metrics (adoption, engagement, quality, business)
- Risks and mitigations
Slack: Posts to PDLC channel.
Trigger: Service deployed to production (future: PROD_DEPLOYED state).

Agent: analytics (model: sonnet, maxTurns: 20)

Data platform: ClickHouse (OLAP), Metabase (BI)

Output: ~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md with:
- KPI definitions (adoption rate, API latency p95, error rate, etc.)
- Tracking plan (CloudEvents via Redpanda)
- ClickHouse tables (materialized views)
- Metabase dashboards (Operations, Product, Business)
- Success criteria checklist
- Review schedule

Slack: Posts KPI count, event count, dashboard count to PDLC channel.
| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 50 |
| Tools | Read, Write, Edit, Bash, Glob, Grep |
| Input | Spec (~/dev/specs/$PROJECT/specs/$SERVICE/spec.md), service CLAUDE.md, existing code |
| Output | Git commits on feat/$TASK_ID branch, progress.md update, state.md update |
| Status format | $PROJECT/$TASK_ID: DEV_COMPLETE @ ISO-timestamp (in state.md) |
| Pass criteria | All tests pass, code committed and pushed |
| Fail criteria | Tests fail 3 times on same issue |
| Forbidden | Modify docker-compose.yml, change DB schemas outside Prisma, hardcode secrets |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 35 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, architecture.md, business-rules.md, service code |
| Output | ~/dev/ops/reviews/$SERVICE/ba-report.json |
| Status format | $PROJECT/$SERVICE: BA_PASS\|BA_FAIL @ ISO-timestamp |
| Pass criteria | criteriaMissing == 0 AND criteriaPartial == 0 AND no HIGH/CRITICAL deviations |
| Fail criteria | criteriaMissing > 0 OR any HIGH/CRITICAL deviation |
| Forbidden | Suggest code changes, write code, mark criteria MET without evidence |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | architecture.md, business-rules.md, spec, service code |
| Output | ~/dev/ops/reviews/$SERVICE/architect-report.json |
| Status format | $PROJECT/$SERVICE: ARCHITECT_PASS\|ARCHITECT_FAIL @ ISO-timestamp -- X/8 checks |
| Pass criteria | All 8 architectural checks pass |
| Fail criteria | Single FAIL on any check |
| Forbidden | Write production code, mark PASS without evidence |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | Service code |
| Output | ~/dev/ops/reviews/$SERVICE/security-report.json |
| Status format | $PROJECT/$SERVICE: SECURITY_PASS\|SECURITY_FAIL @ ISO-timestamp -- OWASP X/10, severity=Y |
| Pass criteria | All OWASP checks PASS/N/A, no npm critical/high, no secrets |
| Fail criteria | Any FAIL, any critical/high npm advisory, any secret in code |
| Forbidden | Write code, skip automated scans, assume zero findings |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep |
| Input | Service code, Dockerfile, .env.example |
| Output (review mode) | ~/dev/ops/reviews/$SERVICE/devops-report.json |
| Output (deploy mode) | ~/dev/ops/reviews/$SERVICE/deploy-report.json, .env.staging |
| Status format (review) | Contributes to $SERVICE-review.status |
| Status format (deploy) | DEPLOYED\|DEPLOY_FAILED\|FIRST_DEPLOY_NEEDED in deploy-report.json |
| Pass criteria (review) | Docker builds, health endpoint exists, tests pass |
| Fail criteria | Docker build fails, no health endpoint, tests fail |
| Forbidden | Write application code |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 15 |
| Tools | Read, Bash, Glob, Grep |
| Input | All JSON review reports in ~/dev/ops/reviews/$SERVICE/ |
| Output | GitHub PR, state.md update, progress.md update |
| Status format | $PROJECT/$SERVICE: STAGING_DEPLOYED @ ISO-timestamp |
| Pass criteria | All reviews pass, PR created/merged |
| Fail criteria | Any review MISSING/FAIL, CI fails, merge conflicts |
| Forbidden | Write application code, merge with security HIGH/CRITICAL, force merge conflicts |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep |
| Input | Spec (sections 9+10), service-project-map.json, .env.example |
| Output | ~/dev/ops/coolify/$SERVICE.json, provisioner-report.json, .env.staging, DB schema, GCS bucket, GitHub repo |
| Status format | DONE \| date \| provisioner \| $SERVICE \| verified-operational or PROVISION_INCOMPLETE |
| Pass criteria | All resources created AND post-provision health check passes |
| Fail criteria | Cannot create resource, health check fails |
| Forbidden | Write application code, mark DONE before health check |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Write, Edit, Bash, Glob, Grep, Playwright (navigate, snapshot, screenshot) |
| Input | Spec, review files |
| Output | scenarios.json, browser-scenarios.json, mock-data.sql, cleanup.sql in ~/dev/projects/$SERVICE/tests/e2e/ |
| Status format | $PROJECT/$SERVICE: SCENARIOS_READY @ ISO-timestamp -- X api, Y browser scenarios |
| Forbidden | Write application code |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep, full Playwright MCP suite (24 tools) |
| Input | Scenario files, staging URL, mock data |
| Output | ~/dev/ops/reviews/$SERVICE/e2e-report.json, screenshots |
| Status format | E2E_PASS\|E2E_FAIL\|E2E_BLOCKED |
| Forbidden | Write application code |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | All status files, all review reports, registry, git history |
| Output | ~/dev/ops/reviews/adlc-audit-YYYYMMDD-HHMM.json, last-audit.md |
| Status format | N/A (audit is a cross-cutting concern) |
| Pass criteria | Zero CRITICAL/HIGH findings = COMPLIANT |
| Triggers | Every 6 hours (automated), on-demand via Slack, after major milestones |
| Audit scope | Pipeline sequence compliance, review report quality (anti-rubber-stamp), registry consistency, orchestrator behavior (not coding directly), blocked/failed services, test coverage |
| Forbidden | Write application code, modify review reports |
| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 40 |
| Tools | Read, Write, Edit, Bash, Glob, Grep |
| Input | Blocked/failed status files, pipeline.log, external-blockers.log, agent definitions, skill definitions, specs |
| Output | ~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json, changelog.md, resolver-fixes.log, resolver-monitor-*.json |
| Status format | N/A (operates above both pipelines) |
| CAN modify | Agent definitions, skill definitions, dispatcher scripts, spec-writer templates, CLAUDE.md rules, provisioner behavior, service-project-map.json |
| CANNOT modify | Application source code, spec content for specific services, review reports, git history, systemd units |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Bash, Glob, Grep, WebSearch, WebFetch |
| Input | Market signals, user feedback, existing specs, business-rules.md |
| Output | ~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.md + .json |
| Forbidden | Write specs, write code |
| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 40 |
| Tools | Read, Bash, Glob, Grep |
| Input | Opportunity brief, architecture.md, business-rules.md, existing specs |
| Output | ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md, updates to service-project-map.json and backlog.md |
| Forbidden | Write code |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep |
| Input | Pending specs, opportunity briefs, backlog, roadmap, business-rules.md |
| Output | Updated backlog.md, prioritization-$DATE.json |
| Forbidden | Write specs, write code |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, architecture.md, business-rules.md, existing services, ADLC state |
| Output | ~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json, handoff file on APPROVED |
| Forbidden | Write code, write specs |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep, WebSearch |
| Input | Spec, ADLC state, PROJECT.md |
| Output | ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md |
| Forbidden | Write code, write specs |
| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, GTM brief, ADLC state |
| Output | ~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md |
| Forbidden | Write code |
All status files are written to ~/dev/ops/outputs/ and follow this format:
STATUS | ISO-date | agent-type | service | details
Example:
DONE | 2026-03-22T10:30:00+00:00 | ba | oid | from-json-report
FAILED | 2026-03-22T11:00:00+00:00 | security | docstore | severity=CRITICAL, findings=3
RUNNING | 2026-03-22T09:00:00+00:00 | dev | pdf-engine | task T-042
BLOCKED_EXTERNAL | 2026-03-22T08:00:00+00:00 | provisioner | billing-engine | need Stripe API key
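A writer for this format might look like the following sketch. The output directory is a parameter here for testability; the real scripts write straight to ~/dev/ops/outputs/, and the helper name is illustrative.

```shell
# Sketch: emit a status line in the shared format
#   STATUS | ISO-date | agent-type | service | details
write_status() {
  # write_status OUTDIR STATUS STAGE SERVICE DETAILS
  local outdir="$1" status="$2" stage="$3" service="$4" details="$5"
  echo "$status | $(date -u +%Y-%m-%dT%H:%M:%S+00:00) | $stage | $service | $details" \
    > "$outdir/$service-$stage.status"
}
```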
| Status | Meaning |
|---|---|
| DONE | Stage completed successfully |
| PASS | Tests or checks passed |
| FAIL | Tests or checks failed |
| FAILED | Agent or stage failed |
| RUNNING | Agent currently executing |
| BLOCKED | Blocked on internal dependency |
| BLOCKED_EXTERNAL | Blocked on external resource (human action needed) |
| PAUSED | Manually paused |
| TRIAGING | BA failure being analyzed by orchestrator |
| DEPLOYED | Successfully deployed to staging |
| PROVISION_INCOMPLETE | Provisioner created resources but health check failed |
Anything not in the above list is considered corrupted. The dispatcher's read_status() function validates the first word:

case "$first_word" in
  DONE|PASS|FAIL|FAILED|RUNNING|BLOCKED|BLOCKED_EXTERNAL|PAUSED|TRIAGING|DEPLOYED|PROVISION_INCOMPLETE|"")
    echo "$first_word"
    ;;
  *)
    # Corrupted: log and auto-delete
    rm "$actual_file"
    echo ""
    ;;
esac

Invalid examples: JSON blobs, service names, partial data, markdown, anything an agent writes that is not a standard keyword.
~/dev/ops/outputs/$SERVICE-$STAGE.status
Where $STAGE is one of: dev, test, ba, review, pr, deploy, provisioner, discovery, spec-writer, prioritization, validation, gtm, analytics.
Additional files:
- $SERVICE.crashes – crash counter (plain integer)
- $SERVICE-dev-$TASKID.status – per-task dev status (glob fallback)
The dispatcher resets any RUNNING status file older than
30 minutes (agent likely crashed or was killed by session restart):
if [ "$stage_status" = "RUNNING" ]; then
stage_age=$(( $(date +%s) - $(stat -c '%Y' "$stage_file") ))
if [ "$stage_age" -gt 1800 ]; then
rm "$stage_file"
fi
fi

The dispatcher's main loop runs these functions in order: consolidate_branches, process_pipeline, process_slack_inbox, detect_external_blockers, maybe_spawn_resolver, maybe_post_kanban (every 6th run).

For each service in ~/dev/projects/*/ (must have .git):
+-----------+
| No status |
+-----+-----+
|
v
+-----------+
| dev: DONE |
+-----+-----+
|
[bash: test-runner.sh, zero tokens]
|
+---------v---------+
| test: PASS/FAIL |
+---------+---------+
|
(if PASS)
|
[inject: "Spawn /agent ba"]
[write: ba.status = RUNNING]
|
+---------v---------+
| ba: DONE/FAILED |
+---------+---------+
|
DONE FAILED
| |
| [inject: "BA FAIL, analyze"]
| [write: ba.status = TRIAGING]
| |
[inject: "Spawn reviews"] |
[write: review.status = RUNNING]
|
+---------v---------+
| review: DONE/FAIL |
+---------+---------+
|
(if DONE AND all_reviews_pass)
|
[inject: "Spawn /agent pr"]
[write: pr.status = RUNNING]
|
+---------v---------+
| pr: DONE/FAIL |
+---------+---------+
|
(if DONE, based on service type)
|
+----+----+--------+
| | |
script infra web-service
| | |
DONE +------+-------+
|
[has coolify config?]
| |
YES NO
| |
[deploy] [provisioner]
| |
+---v---+ +----v----+
|DEPLOYED| |PROVISION|
+--------+ +---------+
| Source | What It Extracts |
|---|---|
| ~/dev/ops/outputs/$SERVICE-$STAGE.status | First word = status keyword |
| ~/dev/ops/reviews/$SERVICE/*.json | JSON verdicts (fallback if no .status file) |
| ~/dev/ops/outputs/$SERVICE.crashes | Crash count (circuit breaker) |
| ~/dev/ops/agents/service-project-map.json | Service -> project mapping, service type, deploy mode |
| ~/dev/ops/coolify/$SERVICE.json | Coolify config existence check |
| ~/dev/ops/slack-inbox/*.txt | Slack messages to inject |
The dispatcher NEVER uses Claude for decisions. It uses:
- read_status() – extract and validate the first word from a status file
- all_reviews_pass() – parse all 4 JSON review files with python3 one-liners
- get_project(), get_service_type(), get_deploy_mode() – read from service-project-map.json
- get_crashes(), inc_crashes() – crash counter management
- review_verdict() – extract a JSON field from a review report
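A sketch of the `all_reviews_pass()` helper in this style. The `verdict` field name and the review directory are assumptions for illustration; the real reports live under `~/dev/ops/reviews/$SERVICE/` and may use a different schema:

```shell
# Sketch: all reviews pass only if every JSON report's "verdict" is PASS.
# The "verdict" key is an assumed field name, not the confirmed schema.
REVIEW_DIR="${REVIEW_DIR:-/tmp/ods-reviews/oid}"
mkdir -p "$REVIEW_DIR"

all_reviews_pass() {
  local dir="$1" f verdict
  for f in "$dir"/*.json; do
    [ -f "$f" ] || return 1   # no reports at all -> not passing
    verdict=$(python3 -c 'import json,sys; print(json.load(open(sys.argv[1])).get("verdict",""))' "$f")
    [ "$verdict" = "PASS" ] || return 1
  done
  return 0
}

echo '{"verdict": "PASS"}' > "$REVIEW_DIR/architect-report.json"
echo '{"verdict": "PASS"}' > "$REVIEW_DIR/security-report.json"
all_reviews_pass "$REVIEW_DIR" && echo "all reviews pass"
```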
The dispatcher sends text commands to Claude via
tmux send-keys. Examples:
Spawn /agent ba for oid. PROJECT: ods-platform. Spec: ~/dev/specs/ods-platform/specs/oid/spec.md
BA FAIL for docstore. Read ~/dev/ops/reviews/docstore/ba-report.json. If missing criteria map to pending tasks, spawn dev agents. If real failure, spawn dev fix.
Spawn /agent architect + /agent security + /agent devops for pdf-engine. PROJECT: ods-platform.
Spawn /agent provisioner for notification-hub. PROJECT: ods-platform. TYPE: web-service. Create Coolify application.
Spawn /agent resolver. SYSTEMIC BLOCKER DETECTED: 3 services blocked. Top causes: missing Coolify config.
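A sketch of how such a command reaches Claude: `tmux send-keys` when the session is up, with an inbox-file fallback. The session name matches the document; the fallback inbox path here is a stand-in (the real one is `~/dev/ops/slack-inbox/`):

```shell
# Sketch of command injection into the supervisor session.
SESSION="${SESSION:-ods-claude}"
INBOX="${INBOX:-$(mktemp -d)}"   # stand-in for the real inbox directory

inject() {
  local msg="$1"
  if tmux has-session -t "$SESSION" 2>/dev/null; then
    # Type the command into the Claude pane and press Enter
    tmux send-keys -t "$SESSION" "$msg" Enter
  else
    # Session down: queue the message as an inbox file instead
    printf '%s\n' "$msg" > "$INBOX/$(date +%s%N).txt"
  fi
}

inject 'Spawn /agent ba for oid. PROJECT: ods-platform.'
```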
| Action | File Written |
|---|---|
| Tests pass | $SERVICE-test.status = PASS |
| Tests fail | $SERVICE-test.status = FAIL |
| Spawn BA | $SERVICE-ba.status = RUNNING |
| BA FAIL triage | $SERVICE-ba.status = TRIAGING |
| Spawn reviews | $SERVICE-review.status = RUNNING |
| Spawn PR | $SERVICE-pr.status = RUNNING |
| Spawn deploy | $SERVICE-deploy.status = RUNNING |
| Spawn provisioner | $SERVICE-provisioner.status = RUNNING |
| Script service (no deploy) | $SERVICE-deploy.status = DONE \| date \| deploy \| $SERVICE \| script-type |
| Recover BA from JSON | $SERVICE-ba.status = DONE \| date \| ba \| $SERVICE \| from-json-report |
| Recover reviews from JSON | $SERVICE-review.status = DONE \| date \| reviews \| $SERVICE \| from-json-reports |
| Stale RUNNING (>30min) | Deletes the status file |
| Corrupted status | Deletes the status file |
check-pdlc logic:

# For each service with a spec under each project:
if ADLC state shows STAGING_DEPLOYED for $SERVICE:
    if no GTM brief exists at ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md:
        if $SERVICE-gtm.status != RUNNING and != DONE:
            inject "Spawn /agent gtm for $SERVICE"

# If last visible line contains "bypass permissions" (Claude prompt)
# AND no agents running (no "local agents" or "background tasks" in pane):
inject "Run check-pdlc. Read ~/.claude/skills/check-pdlc/SKILL.md."

| Type | Description | Deployment | Coolify Entity | Health Check |
|---|---|---|---|---|
| web-service | REST API or frontend, runs permanently | Dockerfile build | Coolify Application | /health endpoint |
| infrastructure | Broker, database, cache (Redpanda, ClickHouse, etc.) | docker-compose | Coolify Service | Container status |
| script | Migration, data import, CLI tool | None (one-shot) | None | Build check only |
Location: ~/dev/ops/agents/service-project-map.json
{
"oid": {
"project": "ods-platform",
"type": "web-service",
"stack": "node",
"deploy": "dockerfile"
},
"redpanda": {
"project": "ods-platform",
"type": "infrastructure",
"stack": "docker",
"deploy": "docker-compose"
},
"migration": {
"project": "lejecos",
"type": "script",
"stack": "node",
"deploy": "one-shot"
}
}

Legacy format (backward compatible): "oid": "ods-platform" (a simple string value is the project name; defaults to type=web-service, deploy=dockerfile).
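A sketch of a registry reader that honors the legacy fallback. The map file is written to a temp path here; the real registry is `~/dev/ops/agents/service-project-map.json`:

```shell
# Sketch of get_service_type() with legacy-format fallback: a bare string
# entry is just the project name, so the type defaults to web-service.
MAP="${MAP:-$(mktemp)}"
cat > "$MAP" <<'EOF'
{
  "oid": {"project": "ods-platform", "type": "web-service", "deploy": "dockerfile"},
  "redpanda": {"project": "ods-platform", "type": "infrastructure", "deploy": "docker-compose"},
  "legacy-svc": "ods-platform"
}
EOF

get_service_type() {
  python3 -c '
import json, sys
entry = json.load(open(sys.argv[1])).get(sys.argv[2], {})
# Legacy format: a plain string means project-name-only -> default type
print(entry["type"] if isinstance(entry, dict) and "type" in entry else "web-service")
' "$MAP" "$1"
}

get_service_type redpanda     # infrastructure
get_service_type legacy-svc   # web-service (legacy string entry)
```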
| Deploy Mode | Coolify Action | Dispatcher Behavior |
|---|---|---|
| dockerfile | Create Coolify Application, Dockerfile build pack | Spawn provisioner if no config, then devops deploy |
| docker-compose | Create Coolify Service | Spawn provisioner for compose, then devops deploy |
| one-shot | No Coolify deployment | Mark deploy as DONE immediately |
| Project | Specs Directory | Services |
|---|---|---|
| ods-platform | ~/dev/specs/ods-platform/ | oid, redpanda, docstore, pdf-engine, notification-hub, workflow-engine, form-engine, billing-engine, securemail, doceditor, agenda |
| ods-dashboard | ~/dev/specs/ods-dashboard/ | ods-dashboard |
| lejecos | ~/dev/specs/lejecos/ | migration |
| Resource Type | CLI/Method | Auto-Provisionable | Agent |
|---|---|---|---|
| PostgreSQL schema + user | psql | Yes | provisioner |
| GCS bucket + service account | gcloud storage | Yes | provisioner |
| Coolify app/service | Coolify API | Yes (if token exists) | provisioner |
| GitHub repo | gh | Yes | provisioner |
| Redpanda topics | rpk / docker exec | Yes | provisioner |
| Redis instance | docker / GCP Memorystore | Yes | provisioner |
| TLS certificates | Coolify (Let's Encrypt) | Yes (automatic) | Coolify |
| COOLIFY_API_TOKEN | Manual | No (first time) | Human |
| External API credentials (Stripe, SendGrid, CinetPay) | Manual | No | Human |
| DNS wildcard configuration | Manual | No (first time) | Human |
| SMTP server credentials | Manual | No | Human |
Tracked in ~/dev/ops/external-deps.md:
| Service | Dependency | Type | Auto |
|---|---|---|---|
| oid | PostgreSQL 5433 | infra | Yes |
| docstore | S3/MinIO | infra | Partial (need bucket + credentials) |
| notification-hub | SMTP server | infra | No |
| notification-hub | SendGrid API | external-api | No |
| pdf-engine | (self-contained) | - | N/A |
| billing-engine | Stripe API | external-api | No |
| billing-engine | CinetPay API | external-api | No |
| all | Coolify | deployment | Partial (need per-service UUID) |
When a resource requires human action, the provisioner (or orchestrator) posts to Slack DM:
:key: EXTERNAL BLOCKER -- {service}/{task}
Category: {credentials|infrastructure|deployment|external-api|network|permissions}
Missing: {specific resource or credential}
Spec reference: {where in spec.md this is mentioned}
Impact: {what cannot proceed without this}
Action needed: {exact steps for the human}
After posting:
1. Mark the task as BLOCKED_EXTERNAL in pipeline state
2. Log to ~/dev/ops/outputs/external-blockers.log with a timestamp
3. Move to the next task (do NOT wait for resolution)
4. When the human responds in Slack, the slack-bridge skill resets the blocked status
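A sketch of the bookkeeping steps (status write plus log append). Paths are redirected to `/tmp` so the sketch is self-contained; the function name is illustrative:

```shell
# Sketch: record an external blocker, then move on to the next task.
OUT_DIR="${OUT_DIR:-/tmp/ods-outputs}"   # real path: ~/dev/ops/outputs/
mkdir -p "$OUT_DIR"

mark_blocked_external() {
  local service="$1" agent="$2" reason="$3"
  # 1. Mark the task BLOCKED_EXTERNAL in pipeline state
  printf 'BLOCKED_EXTERNAL | %s | %s | %s | %s\n' \
    "$(date -Is)" "$agent" "$service" "$reason" \
    > "$OUT_DIR/${service}-${agent}.status"
  # 2. Append to the external blocker history with a timestamp
  printf '%s %s: %s\n' "$(date -Is)" "$service" "$reason" \
    >> "$OUT_DIR/external-blockers.log"
}

mark_blocked_external billing-engine provisioner "need Stripe API key"
```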
The dispatcher spawns the resolver agent when:
1. 2+ services are blocked on the same root cause: maybe_spawn_resolver() counts blocked/failed deploy/provisioner status files and groups them by reason. If the total is >= 2 and there are <= 2 unique reasons, the blocker is systemic.
2. 2+ corrupted status files: a non-standard first word in status files indicates agents are writing an invalid format.
3. A periodic scan (30 min) finds recurring patterns.
4. A human requests it via Slack: "resolve", "root cause", "why is everything blocked".

Cooldown: the resolver will not re-run within 1 hour of its last run (checks resolver-last-run.ts).
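A sketch of the systemic-blocker heuristic from condition 1. It assumes the reason is the fifth pipe-delimited field of the status line and scans a temp directory; the real function also covers deploy status files and more failure states:

```shell
# Sketch: 2+ blocked services sharing at most 2 unique reasons = systemic.
OUT_DIR="${OUT_DIR:-/tmp/ods-outputs}"
mkdir -p "$OUT_DIR"
echo 'BLOCKED_EXTERNAL | 2026-03-22T08:00:00+00:00 | provisioner | oid | missing Coolify config' \
  > "$OUT_DIR/oid-provisioner.status"
echo 'BLOCKED_EXTERNAL | 2026-03-22T08:05:00+00:00 | provisioner | docstore | missing Coolify config' \
  > "$OUT_DIR/docstore-provisioner.status"

is_systemic() {
  local total unique
  total=$(grep -l '^BLOCKED' "$OUT_DIR"/*-provisioner.status 2>/dev/null | wc -l)
  # Field 5 (trimmed) is assumed to be the blocking reason
  unique=$(awk -F'|' '/^BLOCKED/ {gsub(/^ +| +$/, "", $5); print $5}' \
    "$OUT_DIR"/*-provisioner.status 2>/dev/null | sort -u | wc -l)
  [ "$total" -ge 2 ] && [ "$unique" -le 2 ]
}

is_systemic && echo "systemic: spawn resolver"
```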
Trace from symptom to root cause using a structured tree:
BLOCKER: [description]
+-- IMMEDIATE: [what directly failed]
+-- AGENT GAP: [which agent should have caught/handled this]
+-- SPEC GAP: [what the spec should have specified]
+-- TEMPLATE GAP: [what the spec-writer template is missing]
+-- PIPELINE GAP: [what the dispatcher/provisioner doesn't check]
+-- DESIGN GAP: [systemic assumption that was wrong]
Sources read: blocked/failed status files, pipeline.log, external-blockers.log, agent definitions, skill definitions, specs.
A detailed fix plan is written BEFORE any changes. For each proposed change:
- Component and file path
- What to modify
- Why this fixes the root cause
For each proposed change, evaluate across 4 dimensions:
A. ADLC Integrity Check: Does it alter pipeline sequence? Weaken review gates? Bypass safety mechanisms?
B. PDLC Integrity Check: Does it alter product lifecycle? Affect spec quality? Bypass validation?
C. Bias Detection: Does it favor one service type? Reduce observability? Make rollback harder?
D. Impact Score (6 dimensions, each 0-3):
| Dimension | Scale |
|---|---|
| ADLC pipeline integrity | 0=no impact, 3=breaks pipeline |
| PDLC pipeline integrity | 0=no impact, 3=breaks pipeline |
| Review quality | 0=no impact, 3=weakens reviews |
| System generality | 0=no impact, 3=becomes specific |
| Observability | 0=no impact, 3=reduces visibility |
| Rollback safety | 0=easy rollback, 3=irreversible |
| TOTAL | sum / 18 |
Decision thresholds:
| Total Score | Decision |
|---|---|
| 0-2 | AUTO-APPLY – minimal impact, proceed |
| 3-5 | APPLY WITH MONITORING – low risk, watch closely |
| 6-9 | HUMAN REVIEW REQUIRED – post plan to Slack DM, wait for approval |
| 10+ | DO NOT APPLY – redesign the fix |
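The threshold logic above reduces to a simple band check on the summed score (0-18). A minimal sketch, with the function name chosen for illustration:

```shell
# Sketch: map a resolver impact score (sum of six 0-3 dimensions) to a decision.
decide() {
  local score="$1"
  if   [ "$score" -le 2 ]; then echo "AUTO-APPLY"
  elif [ "$score" -le 5 ]; then echo "APPLY WITH MONITORING"
  elif [ "$score" -le 9 ]; then echo "HUMAN REVIEW REQUIRED"
  else                          echo "DO NOT APPLY"
  fi
}

decide 1    # AUTO-APPLY
decide 7    # HUMAN REVIEW REQUIRED
decide 12   # DO NOT APPLY
```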
cp $FILE ${FILE}.resolver-backup-$(date +%Y%m%d-%H%M)

What the resolver CAN modify:
- Agent definitions (~/.claude/agents/*.md)
- Skill definitions (~/.claude/skills/*/SKILL.md)
- Dispatcher scripts (~/dev/ops/adlc-v2/scripts/*.sh, ~/dev/ops/pdlc/scripts/*.sh)
- Spec-writer templates
- CLAUDE.md orchestrator rules
- Provisioner behavior
- service-project-map.json

What the resolver CANNOT modify:
- Application source code in ~/dev/projects/*/src/
- Spec content for specific services
- Review reports (read-only audit trails)
- Git history
- systemd units (escalate to human)
After applying fixes:
1. Record the fix in ~/dev/ops/outputs/resolver-fixes.log
2. Write monitoring criteria to ~/dev/ops/outputs/resolver-monitor-$BLOCKER_ID.json
3. On the next resolver scan (30 min): read the monitor files, execute the check commands, compare against the success criteria
4. If success: mark resolved, clean up
5. If failed: execute the rollback plan, escalate to Slack DM
The dispatcher also checks monitoring files: if
monitorUntil timestamp has passed, it spawns the resolver
for verification.
Write the resolution report to ~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json with:
- Blocker description, services affected, duration blocked
- Root cause chain (immediate, agent gap, spec gap, template gap, pipeline gap, design gap)
- Fix details (plan, files modified, impact score)
- Monitoring results (expected vs actual outcome)
- Lessons learned, prevention measures

Append a summary to ~/dev/ops/reviews/resolver/changelog.md.
| Channel | ID | Purpose | Pipeline |
|---|---|---|---|
| ADLC | C0AN0N8AUGZ | Pipeline commands, progress milestones, kanban | ADLC |
| PDLC | C0AN42N3C0L | PM commands, product updates, PDLC kanban | PDLC |
| DM | D0AGRAVEC1K | Blockers, human review, human interaction | Both |
source ~/.env.adlc 2>/dev/null || source ~/.env.openclaw 2>/dev/null

:rotating_light: BLOCKED -- {service}/{task}
Reason: {reason}
Action needed: {what the human should do}
:key: EXTERNAL BLOCKER -- {service}/{task}
Category: {credentials|infrastructure|deployment|external-api|network|permissions}
Missing: {specific resource or credential}
Spec reference: {where in spec.md this is mentioned}
Impact: {what cannot proceed without this}
Action needed: {exact steps for the human}
:eyes: HUMAN REVIEW -- {service}/{task}
Context: {summary}
Options: {what the human can reply}
:white_check_mark: {service} -- {milestone}
{brief details}
[BA] $SERVICE: $STATUS -- $CRITERIA_MET/$CRITERIA_TOTAL criteria met. $DEVIATIONS deviations.
[ARCHITECT] $SERVICE: $VERDICT -- $CHECKS_PASSED/8 checks passed.
[SECURITY] $SERVICE: $STATUS -- OWASP $SCORE/10, severity=$SEVERITY, $FINDINGS findings.
[E2E] $SERVICE: $VERDICT -- API: $PASSED/$TOTAL, Browser: $PASSED/$TOTAL.
[PROVISIONER] $SERVICE: $VERDICT -- Coolify=$X, DB=$Y, Bucket=$Z.
[AUDIT] $VERDICT -- $N services. $FINDINGS findings ($CRITICAL critical, $HIGH high).
[DISCOVERY] $OPPORTUNITY -- $RECOMMENDATION ($PRIORITY). Impact: $IMPACT.
[SPEC] $SERVICE spec written -- $AC_COUNT acceptance criteria, $ENDPOINT_COUNT endpoints.
[PRIORITY] $PROJECT backlog re-prioritized -- $TOTAL items scored. Top 3: $TOP3.
[VALIDATION] $SERVICE -- $VERDICT. Effort: $EFFORT. Risks: $RISK_COUNT.
[GTM] $SERVICE brief ready -- rollout plan, docs checklist, success metrics.
[ANALYTICS] $SERVICE -- $KPI_COUNT KPIs, $EVENT_COUNT events, $DASHBOARD_COUNT dashboards.
:package: PDLC->ADLC handoff: $SERVICE spec ready for development.
:mag: RESOLVER analyzing systemic blocker: $DESCRIPTION ($N services affected)
:wrench: RESOLVER applying fix: $SUMMARY (impact $SCORE/18 -- auto-approved)
:eyes: RESOLVER needs approval: $SUMMARY (impact $SCORE/18) [DM]
:white_check_mark: RESOLVER fix verified: $BLOCKER_ID -- $N services unblocked.
:rotating_light: RESOLVER fix FAILED: $BLOCKER_ID -- rolling back. $REASON
When the human replies to a :key: or :rotating_light: message with any of the unblock keywords (resolved, fixed, done, c'est fait, ok, credentials added, token ajoute, cle ajoutee, {service} unblocked), the bridge clears the blocked status files:

rm ~/dev/ops/outputs/$SERVICE-deploy.status ~/dev/ops/outputs/$SERVICE-provisioner.status

and confirms in Slack:

:white_check_mark: $SERVICE unblocked. Dispatcher will retry in <5 min.

| Pattern | Action |
|---|---|
| status, etat, avancement | Run /status, post summary |
| kanban, board | Run /kanban, post board |
| pause, stop, arreter | Pause all agents |
| resume, reprendre, go | Resume pipeline |
| deploy {service} | Spawn DevOps deploy mode |
| merge {service} | Create PR and merge |
| rollback {service} | Roll back staging |
| onboard {project} | Queue generate-specs |
| launch {service} | Queue dev-task |
| audit | Spawn auditor |
| fix {service} {desc} | Spawn dev agent with fix |
| review {service} | Spawn BA + Architect + Security + DevOps |
| test {service} | Run test-runner.sh |
| e2e {service} | Spawn scenario + e2e-test |
| logs {service} | Read/summarize review reports |
| Pattern | Action |
|---|---|
| discover {topic} | Spawn discovery agent |
| spec {service} | Spawn spec-writer |
| prioritize, backlog | Spawn prioritization |
| validate {service} | Spawn validation |
| gtm {service} | Spawn GTM |
| analytics {service} | Spawn analytics |
| handoff {service} | Trigger ADLC handoff |
| status, pipeline | Post pipeline status |
| adlc status | Read and summarize ADLC state |
Problem: Claude sessions accumulate context over hours, eventually degrading quality or crashing.
Mitigations:
- Daily restart at 4am UTC – kills and restarts both tmux sessions, runs /boot to reconstruct state from disk files
- session-health.sh (1 min) – monitors session age; if > 12h AND > 200 interactions, restarts when no agents are running
- Interaction counter (~/dev/ops/outputs/claude-interactions.count) – tracks cumulative injections
- Idle detection – if the pane is unchanged for 15 min with no agents, the session is stuck; kill and restart
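A sketch of the idle-detection mechanic: hash the pane content each run and count consecutive unchanged checks. The pane text is passed in here rather than captured; the real monitor would feed it from `tmux capture-pane -p`, and the state-file layout is an assumption:

```shell
# Sketch: if the pane hash is unchanged for 15 consecutive 1-minute
# checks, treat the session as stuck (restart would follow).
STATE="${STATE:-/tmp/session-health-state}"

check_idle() {
  local pane_text="$1" hash prev_hash prev_count count
  hash=$(printf '%s' "$pane_text" | md5sum | cut -d' ' -f1)
  if [ -f "$STATE" ]; then
    read -r prev_hash prev_count < "$STATE"
  else
    prev_hash=""; prev_count=0
  fi
  if [ "$hash" = "$prev_hash" ]; then count=$((prev_count + 1)); else count=0; fi
  echo "$hash $count" > "$STATE"
  [ "$count" -ge 15 ]   # true => stuck
}

check_idle "some pane output" || echo "session active"
```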
Problem: Agents sometimes write non-standard status values (JSON blobs, service names, markdown).
Mitigations:
- read_status() validation – the dispatcher extracts the first word from each status file and checks it against the known list; corrupted files are auto-deleted and logged
- Resolver auto-trigger – if 2+ corrupted files are detected in a single scan, the resolver agent is spawned to trace which agent is writing the invalid format and fix the agent prompt
- Stale RUNNING detection – any RUNNING status file older than 30 minutes is auto-deleted (the agent likely crashed)
Problem: An agent writes a status file that doesn't match the expected STATUS | date | agent | service | details format.

Mitigations:
- Dispatcher validation – read_status() only accepts known keywords; anything else is treated as corrupted
- JSON report fallback – if there is no .status file but a JSON review report exists, the dispatcher recovers status from the JSON (ba-report.json status=compliant -> ba.status = DONE)
- Resolver – traces which agent prompt is producing the invalid format and fixes the agent definition
Problem: The orchestrator Claude writes code, runs tests, or does work it should delegate to subagents.
Mitigations:
- CLAUDE.md absolute rules – 8 rules that MUST NEVER be broken:
  1. NEVER edit source code
  2. NEVER run tests (test-runner.sh does that)
  3. NEVER create files in service directories
  4. NEVER run cargo/npm/pnpm/go/dotnet/node
  5. NEVER commit or push
  6. NEVER loop or schedule itself
  7. NEVER run build commands
  8. NEVER run PDLC skills (those are for the other orchestrator)
- Auditor agent – checks git history for commits by the supervisor and checks pipeline.log for direct file edits
Problem: ADLC orchestrator accidentally runs
/check-pdlc or /boot-pdlc, which belong to the
PDLC orchestrator.
Mitigation: CLAUDE.md Rule 8 explicitly states: > NEVER run /check-pdlc or /boot-pdlc – those are PDLC skills for the other orchestrator (ods-pdlc tmux). You are ADLC only. Your skills: /check-pipeline, /boot, /status, /kanban, /dev-task, /slack-bridge, /registry.
Problem: A service fails 3+ times and gets permanently blocked.
Mitigations:
- Crash counter ($SERVICE.crashes) tracked per service
- After 3 crashes: the service is skipped by the dispatcher and BLOCKED is posted to Slack DM
- Human can reset: reply {service} unblocked in Slack DM to clear the status files
- Resolver can analyze systemic causes across multiple blocked services
Problem: Too many concurrent Claude agents exhaust server RAM.
Mitigations:
- Before spawning agents, the orchestrator checks: awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo
- If > 2000MB: spawn freely, all pending work in parallel
- If < 2000MB: queue new spawns, wait for running agents to finish, post to Slack DM
- If < 512MB (critical): session-health.sh kills the largest non-supervisor Claude process
- Subagent design: review agents use sonnet (lighter) rather than opus; only dev, spec-writer, and resolver use opus
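A sketch of this memory gate as a pair of functions. The policy names are illustrative labels, not the orchestrator's actual strings; the thresholds match the documented 2000MB / 512MB levels:

```shell
# Sketch: read MemAvailable (Linux-only, default 0 if unreadable)
# and map it to a spawn policy.
mem_available_mb() {
  awk '/MemAvailable/ {print int($2/1024); f=1} END {if (!f) print 0}' \
    /proc/meminfo 2>/dev/null || echo 0
}

spawn_policy() {
  local mb="$1"
  if   [ "$mb" -gt 2000 ]; then echo "spawn-freely"
  elif [ "$mb" -ge 512 ];  then echo "queue-spawns"
  else                          echo "critical"
  fi
}

spawn_policy "$(mem_available_mb)"
```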
Problem: PDLC dispatcher could spawn duplicate agents for the same service+stage.
Mitigation: check-pdlc skill implements
anti-duplicate gates:
check_agent_status() {
local status_file=~/dev/ops/outputs/${service}-${agent_type}.status
# RUNNING -> do NOT spawn
# DONE -> do NOT re-run
# FAILED (<3 attempts) -> retry once
# FAILED (>=3) -> BLOCKED
# NONE -> eligible to spawn
}

Problem: Feature branches conflict when merged into dev.
Mitigation: The dispatcher tries git merge --no-edit. On failure:
- git merge --abort
- Log the conflict
- Post to Slack DM: :warning: Merge conflict: $SERVICE/$branch needs manual resolution
- Skip that branch, continue with the others
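A throwaway-repo sketch of this merge-then-abort behavior (repo, file, and branch contents are fabricated for the demo; only the merge/abort sequence mirrors the dispatcher):

```shell
# Sketch: attempt a clean merge; on conflict, abort and report.
REPO=$(mktemp -d)
cd "$REPO"
git init -q
git checkout -q -b dev
git config user.email ops@example.com
git config user.name ods
echo base > file.txt && git add file.txt && git commit -qm base
git checkout -q -b feature
echo feature > file.txt && git commit -qam feature
git checkout -q dev
echo dev > file.txt && git commit -qam dev

if git merge --no-edit feature >/dev/null 2>&1; then
  echo "merged feature into dev"
else
  git merge --abort
  echo "conflict: feature needs manual resolution"  # would post :warning: to Slack DM
fi
```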
| Path | Purpose |
|---|---|
| ~/dev/ | Working root for all development |
| ~/dev/projects/ | Git repositories for each service |
| ~/dev/specs/ | Project specs, backlogs, roadmaps, PDLC artifacts |
| ~/dev/ops/ | Operations: scripts, agents, reviews, outputs |
| ~/.claude/ | Claude Code configuration and agent memory |
Operations directory (~/dev/ops/):

| Path | Purpose |
|---|---|
| ~/dev/ops/outputs/ | Status files, pipeline log, crash counters, dispatcher state |
| ~/dev/ops/outputs/pipeline.log | ADLC dispatcher log (all transitions, injections, errors) |
| ~/dev/ops/outputs/pdlc-pipeline.log | PDLC dispatcher log |
| ~/dev/ops/outputs/session-health.log | Session health monitor log |
| ~/dev/ops/outputs/external-blockers.log | External blocker history |
| ~/dev/ops/outputs/resolver-fixes.log | Resolver fix history |
| ~/dev/ops/outputs/$SERVICE-$STAGE.status | Per-service per-stage status files |
| ~/dev/ops/outputs/$SERVICE.crashes | Per-service crash counter |
| ~/dev/ops/outputs/dispatcher-run-count | ADLC dispatcher run counter (for kanban every 6th run) |
| ~/dev/ops/outputs/pdlc-dispatcher-run-count | PDLC dispatcher run counter |
| ~/dev/ops/outputs/session-health-state | Idle counter and hash state |
| ~/dev/ops/outputs/claude-interactions.count | Cumulative interaction counter |
| ~/dev/ops/outputs/resolver-last-run.ts | Timestamp of last resolver execution |
| ~/dev/ops/outputs/resolver-monitor-*.json | Post-fix monitoring criteria |
| ~/dev/ops/reviews/ | Review reports directory |
| ~/dev/ops/reviews/$SERVICE/ | Per-service review reports |
| ~/dev/ops/reviews/$SERVICE/ba-report.json | BA review report |
| ~/dev/ops/reviews/$SERVICE/architect-report.json | Architect review report |
| ~/dev/ops/reviews/$SERVICE/security-report.json | Security review report |
| ~/dev/ops/reviews/$SERVICE/devops-report.json | DevOps review report |
| ~/dev/ops/reviews/$SERVICE/deploy-report.json | Deploy report |
| ~/dev/ops/reviews/$SERVICE/e2e-report.json | E2E test report |
| ~/dev/ops/reviews/$SERVICE/provisioner-report.json | Provisioner report |
| ~/dev/ops/reviews/$SERVICE/screenshots/ | E2E browser test screenshots |
| ~/dev/ops/reviews/$SERVICE/last-reviewed-commit.txt | Last commit reviewed by BA |
| ~/dev/ops/reviews/$SERVICE/last-fail-commit.txt | Commit at time of last review failure |
| ~/dev/ops/reviews/adlc-audit-YYYYMMDD-HHMM.json | Periodic audit reports |
| ~/dev/ops/reviews/resolver/ | Resolver resolution reports and changelog |
| ~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json | Individual resolution reports |
| ~/dev/ops/reviews/resolver/changelog.md | Resolver fix history |
| ~/dev/ops/agents/service-project-map.json | Service -> project mapping (THE registry) |
| ~/dev/ops/coolify/$SERVICE.json | Coolify deployment config per service |
| ~/dev/ops/external-deps.md | Known external dependencies table |
| ~/dev/ops/slack-inbox/ | ADLC Slack message inbox |
| ~/dev/ops/slack-inbox/processed/ | Processed ADLC Slack messages |
| ~/dev/ops/pdlc-slack-inbox/ | PDLC Slack message inbox |
| ~/dev/ops/pdlc-slack-inbox/processed/ | Processed PDLC Slack messages |
| Path | Purpose | Interval |
|---|---|---|
| ~/dev/ops/adlc-v2/scripts/dispatcher-v3.sh | ADLC bash dispatcher | 5 min (systemd) |
| ~/dev/ops/adlc-v2/scripts/session-health.sh | Claude session health monitor | 1 min (systemd) |
| ~/dev/ops/adlc-v2/scripts/test-runner.sh | Bash test runner (zero tokens) | On demand |
| ~/dev/ops/adlc-v2/scripts/slack-bridge.sh | ADLC Slack polling bridge | Continuous |
| ~/dev/ops/pdlc/scripts/dispatcher-pdlc.sh | PDLC bash dispatcher | 10 min (systemd) |
| ~/dev/ops/pdlc/scripts/pdlc-slack-bridge.sh | PDLC Slack polling bridge | Continuous |
| Path | Agent |
|---|---|
| ~/.claude/agents/dev.md | Developer agent |
| ~/.claude/agents/ba.md | Business Analyst agent |
| ~/.claude/agents/architect.md | Architect review agent |
| ~/.claude/agents/security.md | Security review agent |
| ~/.claude/agents/devops.md | DevOps review/deploy agent |
| ~/.claude/agents/pr.md | PR creation agent |
| ~/.claude/agents/provisioner.md | Infrastructure provisioner agent |
| ~/.claude/agents/scenario.md | E2E scenario generator agent |
| ~/.claude/agents/e2e-test.md | E2E test executor agent |
| ~/.claude/agents/auditor.md | ADLC compliance auditor agent |
| ~/.claude/agents/resolver.md | Systemic problem resolver agent |
| ~/.claude/agents/discovery.md | Product discovery agent |
| ~/.claude/agents/spec-writer.md | Spec writer agent |
| ~/.claude/agents/prioritization.md | Backlog prioritization agent |
| ~/.claude/agents/validation.md | Spec validation agent |
| ~/.claude/agents/gtm.md | Go-to-market agent |
| ~/.claude/agents/analytics.md | Analytics/KPI agent |
| Path | Skill | Pipeline |
|---|---|---|
| ~/.claude/skills/dev-task/SKILL.md | /dev-task – develop a feature | ADLC |
| ~/.claude/skills/boot/SKILL.md | /boot – daily context rebuild | ADLC |
| ~/.claude/skills/status/SKILL.md | /status – show current state | ADLC |
| ~/.claude/skills/kanban/SKILL.md | /kanban – visual kanban board | ADLC |
| ~/.claude/skills/registry/SKILL.md | /registry – maintain service-project-map.json | ADLC |
| ~/.claude/skills/check-pipeline/SKILL.md | /check-pipeline – scan and advance pipeline | ADLC |
| ~/.claude/skills/slack-bridge/SKILL.md | /slack-bridge – interpret Slack messages | ADLC |
| ~/.claude/skills/boot-pdlc/SKILL.md | /boot-pdlc – PDLC context rebuild | PDLC |
| ~/.claude/skills/pdlc-bridge/SKILL.md | /pdlc-bridge – interpret PDLC Slack messages | PDLC |
| ~/.claude/skills/check-pdlc/SKILL.md | /check-pdlc – scan and advance PDLC pipeline | PDLC |
Specs directory (~/dev/specs/$PROJECT/):

| Path | Purpose |
|---|---|
| ~/dev/specs/$PROJECT/PROJECT.md | Project definition |
| ~/dev/specs/$PROJECT/context/architecture.md | Architecture decisions |
| ~/dev/specs/$PROJECT/context/business-rules.md | Business rules |
| ~/dev/specs/$PROJECT/gestion/progress.md | Dev progress tracking |
| ~/dev/specs/$PROJECT/gestion/backlog.md | Prioritized backlog (RICE scored) |
| ~/dev/specs/$PROJECT/gestion/roadmap.md | Timeline / roadmap |
| ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md | Service specification |
| ~/dev/specs/$PROJECT/pdlc/pipeline-state.md | PDLC pipeline state per feature |
| ~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.md | Discovery opportunity briefs |
| ~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.json | Discovery opportunity JSON summaries |
| ~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json | Spec validation reports |
| ~/dev/specs/$PROJECT/pdlc/handoffs/$SERVICE-handoff.json | ADLC handoff files |
| ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md | GTM briefs |
| ~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md | Analytics/KPI definitions |
| ~/dev/specs/$PROJECT/pdlc/prioritization-$DATE.json | Prioritization snapshots |
Service directory (~/dev/projects/$SERVICE/):

| Path | Purpose |
|---|---|
| ~/dev/projects/$SERVICE/CLAUDE.md | Service-specific Claude rules |
| ~/dev/projects/$SERVICE/.env | Environment variables (runtime) |
| ~/dev/projects/$SERVICE/.env.example | Env var documentation |
| ~/dev/projects/$SERVICE/.env.staging | Staging URL and config |
| ~/dev/projects/$SERVICE/tests/e2e/scenarios.json | API E2E test scenarios |
| ~/dev/projects/$SERVICE/tests/e2e/browser-scenarios.json | Browser E2E test scenarios |
| ~/dev/projects/$SERVICE/tests/e2e/mock-data.sql | E2E test data |
| ~/dev/projects/$SERVICE/tests/e2e/cleanup.sql | E2E cleanup script |
Claude configuration (~/.claude/):

| Path | Purpose |
|---|---|
| ~/.claude/agent-memory/pipeline/state.md | ADLC pipeline state (read by both pipelines) |
| ~/.claude/agent-memory/pipeline/last-audit.md | Last audit summary |
| ~/.claude/agent-memory/pipeline/last-audit-ts.txt | Last audit timestamp |
| ~/.claude/agent-memory/pipeline/run-count | Check-pipeline run counter |
| ~/.claude/agent-memory/pipeline/retries-$SERVICE.txt | Per-service retry counter |
| ~/.claude/agent-memory/pdlc-bridge/ | PDLC bridge timestamps |
| ~/dev/CLAUDE.md | ADLC orchestrator rules |
| Path | Purpose |
|---|---|
| ~/dev/ops/coolify/$SERVICE.json | Coolify app config (UUID, URLs, registry) |
| ~/dev/infra-registry/ | Infrastructure asset registry (git repo) |
| ~/.env.adlc | ADLC environment (SLACK_BOT_TOKEN, COOLIFY_API_TOKEN, etc.) |
| ~/.env.openclaw | Fallback environment file |
| /tmp/dispatcher-v3.lock | ADLC dispatcher lock file |
| /tmp/dispatcher-pdlc.lock | PDLC dispatcher lock file |
| Resource | Connection |
|---|---|
| PostgreSQL | postgres://ods:ods-dev-2026@127.0.0.1:5433/ods |
| Resource | Value |
|---|---|
| Project ID | ninth-park-452914-v8 |
| Region | europe-west1 |
| Bucket naming | ods-$SERVICE-staging |
End of System Reference Document
Third pipeline alongside ADLC and PDLC. Feeds both with technology signals.
| Agent | Model | Turns | Role |
|---|---|---|---|
| veille | sonnet | 30 | Daily tech watch: releases, CVEs, trending repos (top 10 rotation), ad-hoc URL review, security audit |
| benchmark | opus | 35 | Monthly comparison ODS vs industry best practices |
| poc-builder | opus | 40 | Rapid PoC prototypes in ~/dev/pocs/ |
| innovation-scorer | sonnet | 20 | FIRE framework scoring + correlation with ODS backlog/specs/daily findings |
| adr-writer | sonnet | 15 | Architecture Decision Records |
| Pattern | Action |
|---|---|
| Any URL | Ad-hoc veille review with security audit |
| veille, watch | Run daily veille |
| benchmark | Run monthly benchmark |
| poc {description} | Build proof of concept |
| score {proposal} | FIRE score + correlation |
| adr {decision} | Write Architecture Decision Record |
| Timer | Schedule | Action |
|---|---|---|
| ods-innovation.timer | Daily 07:00 UTC | Run veille agent |
| ods-innovation-bridge.service | Always-on | Poll #Innovation every 30s |
| CLI | Purpose | Validates |
|---|---|---|
| write-status.sh | Status files | Status ∈ enum (11 values), agent ∈ enum |
| write-review.sh | JSON review reports | Schema per agent type, verdict ∈ enum |
| write-lesson.sh | Lessons learned | All 6 fields non-empty (min 5 chars) |
| write-pipeline-state.sh | Pipeline state | State ∈ enum (27 values) |
| write-human-review.sh | Human decision reports | JSON schema, generates HTML + PDF + GDrive + Slack |
Agents MUST use CLI tools to write output files. Never Write/Edit directly. The CLI validates format and rejects invalid input — the agent has no choice but to produce correct output.
All 3 bridges poll every 30s, inject directly into tmux via send-keys. Fallback to inbox files if tmux session is down.
| Bridge | Channel | Tmux Target | Method |
|---|---|---|---|
| ADLC | C0AN0N8AUGZ + D0AGRAVEC1K | ods-claude | Direct tmux + inbox fallback |
| PDLC | C0AN42N3C0L | ods-pdlc | Direct tmux |
| Innovation | C0AMSKF5NCF | ods-claude | Direct tmux |
The PDLC dispatcher auto-detects its mode based on pending work:
| Mode | Trigger | Scan Interval | Actions |
|---|---|---|---|
| Active | Specs being written/validated/designed | 10 min | Full pipeline scan + ADLC monitoring + poke Claude |
| Monitoring | All specs done, ADLC working | 30 min | Check ADLC state → GTM/Analytics/UI validation triggers only |
| Idle | Nothing pending, all deployed | Bridge only | Process Slack inbox, no scan |
pending = count(spec-writing + validating + UI_DESIGN + DISCOVERY in pipeline-state.md)
running = count(RUNNING in *-spec-writer.status, *-validation.status, *-ui-design.status)
monitoring_work = count(STAGING_DEPLOYED without GTM, PROD_DEPLOYED without Analytics)
if pending + running > 0 → active
elif monitoring_work > 0 → monitoring
else → idle
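The rule above reduces to a three-way branch. In this sketch the counts are passed as arguments rather than scanned from pipeline-state.md and the status files, and the function name is illustrative:

```shell
# Sketch: map (pending, running, monitoring_work) counts to a PDLC mode.
pdlc_mode() {
  local pending="$1" running="$2" monitoring_work="$3"
  if   [ $((pending + running)) -gt 0 ]; then echo active
  elif [ "$monitoring_work" -gt 0 ];    then echo monitoring
  else                                        echo idle
  fi
}

pdlc_mode 2 1 0   # active
pdlc_mode 0 0 3   # monitoring
pdlc_mode 0 0 0   # idle
```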
| ADLC State | PDLC Action |
|---|---|
| STAGING_DEPLOYED + no GTM brief | Spawn GTM agent |
| STAGING_DEPLOYED + has design brief + no UI review | Spawn UI Designer Mode 2 (visual validation) |
| PROD_DEPLOYED + no Analytics KPIs | Spawn Analytics agent |