ODS Platform -- System Reference


Version: 1.0
Generated: 2026-03-22
Scope: Complete reference for ADLC + PDLC autonomous pipelines


Table of Contents

  1. System Architecture
  2. ADLC Pipeline Flow
  3. PDLC Pipeline Flow
  4. Agent Registry
  5. Status File Format Standard
  6. Dispatcher State Machine
  7. Service Classification
  8. External Dependencies
  9. Resolver Protocol
  10. Slack Channels and Communication
  11. Known Issues and Mitigations
  12. File System Map

1. System Architecture

Overview

The ODS Platform runs two autonomous pipelines on a single server, each managed by a Claude Code session inside a tmux window, driven by bash-based dispatchers running on systemd timers.

                     +-----------------------------------+
                     |           Linux Server            |
                     |  /home/jniox_orbusdigital_com/    |
                     +-----------------------------------+
                              |              |
               +--------------+              +------------+
               |                                          |
   +-----------v-----------+                   +-----------v-----------+
   |   ADLC Pipeline       |                   |   PDLC Pipeline       |
   |   (Development)       |                   |   (Product)           |
   +-----------+-----------+                   +-----------+-----------+
               |                                           |
   +-----------v-----------+                   +-----------v-----------+
   | tmux: ods-claude      |                   | tmux: ods-pdlc        |
   | Claude supervisor     |                   | Claude supervisor     |
   | CLAUDE.md rules       |                   | CLAUDE.md rules       |
   +-----------+-----------+                   +-----------+-----------+
               |                                           |
   +-----------v-----------+                   +-----------v-----------+
   | dispatcher-v3.sh      |                   | dispatcher-pdlc.sh    |
   | systemd timer: 5 min  |                   | systemd timer: 10 min |
   | deterministic bash    |                   | deterministic bash    |
   +-----------+-----------+                   +-----------+-----------+
               |                                           |
   +-----------v-----------+                   +-----------v-----------+
   | session-health.sh     |  <--- shared ---> | session-health.sh     |
   | systemd timer: 1 min  |                   | systemd timer: 1 min  |
   +-----------------------+                   +-----------------------+

Two Pipelines

| Pipeline | Purpose | tmux Session | Dispatcher | Interval | CLAUDE.md |
|---|---|---|---|---|---|
| ADLC | Autonomous Development Lifecycle | ods-claude | dispatcher-v3.sh | 5 min | ~/dev/CLAUDE.md |
| PDLC | Product Development Lifecycle | ods-pdlc | dispatcher-pdlc.sh | 10 min | PDLC CLAUDE.md |

Bash Dispatchers

dispatcher-v3.sh (ADLC, every 5 minutes):
- Consolidates feature branches into dev
- Reads status files for every service in ~/dev/projects/
- Runs deterministic state transitions (tests, status recovery)
- Injects complex decisions into Claude via tmux send-keys
- Processes Slack inbox messages
- Detects external blockers (missing .env vars, missing Coolify configs)
- Spawns resolver for systemic issues (2+ services blocked on same cause)
- Posts kanban summary every 6th run (30 min)

dispatcher-pdlc.sh (PDLC, every 10 minutes):
- Processes PDLC Slack inbox messages
- Checks ADLC state for GTM/Analytics triggers (services on staging)
- Pokes PDLC Claude if idle (no agents running, waiting at prompt)
- Posts PDLC kanban summary every 6th run (60 min)

Session Health (session-health.sh, every 1 minute)

Monitors the ods-claude tmux session:
1. Session alive check: if tmux session missing, creates new one and boots Claude
2. Idle detection: hashes pane content; if unchanged for 15 consecutive minutes with no agents running, kills and restarts session
3. Memory check: if available memory < 512MB, kills largest non-supervisor Claude process
4. Session age check: if session is > 12 hours old AND > 200 interactions, restarts when no agents are running
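The idle-detection logic can be sketched as follows (a simplified sketch: the real session-health.sh hashes live `tmux capture-pane` output each minute; here the snapshot is passed in as a string, and the function name and state-file layout are assumptions for illustration):

```shell
#!/usr/bin/env bash
# Sketch: count consecutive checks where the pane snapshot hash is unchanged.
IDLE_LIMIT=15   # minutes of identical pane content before restart

check_idle() {
    # $1 = current pane snapshot, $2 = state file holding "last_hash count"
    local snapshot="$1" state_file="$2"
    local hash last_hash count
    hash=$(printf '%s' "$snapshot" | md5sum | cut -d' ' -f1)
    read -r last_hash count < "$state_file" 2>/dev/null || { last_hash=""; count=0; }
    if [ "$hash" = "$last_hash" ]; then
        count=$((count + 1))    # same content as last minute
    else
        count=0                 # pane changed, reset the idle counter
    fi
    echo "$hash $count" > "$state_file"
    # Caller kills and restarts the session when this prints RESTART
    if [ "$count" -ge "$IDLE_LIMIT" ]; then echo "RESTART"; else echo "OK"; fi
}
```

The hash-and-count approach avoids storing full pane snapshots and makes the check cheap enough to run every minute.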

Slack Bridges

| Bridge | Script | Inbox Directory | Channel Monitored |
|---|---|---|---|
| ADLC | slack-bridge.sh | ~/dev/ops/slack-inbox/ | C0AN0N8AUGZ (ADLC), D0AGRAVEC1K (DM) |
| PDLC | pdlc-slack-bridge.sh | ~/dev/ops/pdlc-slack-inbox/ | C0AN42N3C0L (PDLC) |

Bridges poll Slack for new messages and write them as files to the inbox directory. Dispatchers pick them up and inject them into the appropriate Claude session.

Daily Restart

At 4:00 AM UTC, a systemd timer triggers a full restart:
- Kills both tmux sessions
- Starts fresh sessions
- Claude reads CLAUDE.md and runs /boot (ADLC) or /boot-pdlc (PDLC)
- /boot reconstructs state from: agent-memory files, progress.md per project, git state per service, interrupted RUNNING status files, system resources
- Starts pipeline loop


2. ADLC Pipeline Flow

Complete Pipeline Sequence

Dev (code) --> Tests (bash) --> BA (review) --> Architect+Security+DevOps (parallel reviews)
    --> PR (create+merge) --> Provisioner (if needed) --> Deploy (staging)
    --> Scenario (generate tests) --> E2E (execute tests)

Step-by-Step Detail

Step 1: Development

Trigger: Dispatcher detects pending tasks in progress.md with no dev.status file, or Claude receives /dev-task command.

Agent: dev (model: opus, maxTurns: 50)

Execution:
1. Dev agent reads spec at ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md
2. Creates branch feat/$TASK_ID from dev
3. TDD cycle: write failing test, implement, refactor, repeat
4. Commits with conventional format: feat($SERVICE): description [$TASK_ID]
5. Pushes feature branch

Output files:
- Git commits on feat/$TASK_ID branch
- Update to ~/dev/specs/$PROJECT/gestion/progress.md: - [x] DEV: $TASK_ID -- $DESCRIPTION ($DATE)
- Update to ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$TASK_ID: DEV_COMPLETE @ ISO-timestamp

Status file written: None directly. The dispatcher’s branch consolidation step merges feature branches into dev, then sets $SERVICE-dev.status.

On success: Branch exists with commits. Dispatcher proceeds to tests.
On failure: If tests fail 3 times on the same issue, the agent stops and reports a blocker. The circuit breaker increments the crash counter.


Step 2: Branch Consolidation (Dispatcher, automatic)

Trigger: Every dispatcher run (5 min), before pipeline scan.

Executor: dispatcher-v3.sh consolidate_branches() function (pure bash).

Execution:
1. For each service in ~/dev/projects/*/:
   - Auto-commit any uncommitted changes on feature branches: git add -A && git commit -m "wip: auto-commit $branch"
   - Switch to dev branch
   - Merge each feat/* branch with --no-edit
   - Delete merged feature branches
   - Push dev to origin
2. If merge conflict: abort merge, log conflict, post to Slack DM

Output: Consolidated dev branch with all feature work merged.
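The merge loop at the heart of this step can be sketched in bash (a simplified sketch of consolidate_branches(): the real dispatcher also auto-commits work-in-progress on each feature branch, pushes dev to origin, and posts conflicts to Slack, which is omitted here):

```shell
#!/usr/bin/env bash
# Sketch: merge every feat/* branch into dev, deleting branches that merge cleanly.
consolidate_branches() {
    local repo="$1"
    git -C "$repo" checkout -q dev
    for branch in $(git -C "$repo" for-each-ref --format='%(refname:short)' refs/heads/feat/); do
        if git -C "$repo" merge --no-edit -q "$branch"; then
            git -C "$repo" branch -d "$branch" >/dev/null   # merged, safe to delete
        else
            git -C "$repo" merge --abort                    # leave the branch for a human
            echo "CONFLICT: $branch" >&2
        fi
    done
}
```

Because conflicted merges are aborted rather than resolved, the dev branch is never left in a half-merged state between dispatcher runs.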


Step 3: Tests (Bash)

Trigger: $SERVICE-dev.status = DONE AND no $SERVICE-test.status file exists.

Executor: test-runner.sh (pure bash, zero Claude tokens).

Execution:

bash ~/dev/ops/adlc-v2/scripts/test-runner.sh $PROJECT $SERVICE

Status file written:
- Success: echo "PASS" > ~/dev/ops/outputs/$SERVICE-test.status
- Failure: echo "FAIL" > ~/dev/ops/outputs/$SERVICE-test.status

On success: Dispatcher proceeds to BA review.
On failure: Crash counter is incremented. If < 3 crashes, the stage may retry. If >= 3, the circuit breaker triggers.
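The crash counter and circuit breaker can be sketched as (a sketch: the function name is an assumption, but the counter file matches the $SERVICE.crashes convention described in Section 5, with the directory passed in instead of hard-coding ~/dev/ops/outputs):

```shell
#!/usr/bin/env bash
# Sketch: increment the per-service crash counter; open the circuit at 3.
record_failure() {
    local service="$1" outdir="$2"
    local crashes=0
    [ -f "$outdir/$service.crashes" ] && crashes=$(cat "$outdir/$service.crashes")
    crashes=$((crashes + 1))
    echo "$crashes" > "$outdir/$service.crashes"   # plain integer, per Section 5
    if [ "$crashes" -ge 3 ]; then
        echo "CIRCUIT_OPEN"   # stop retrying, escalate instead
    else
        echo "RETRY"
    fi
}
```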


Step 4: BA Review

Trigger: $SERVICE-test.status = PASS AND no $SERVICE-ba.status file exists.

Agent: ba (model: sonnet, maxTurns: 35)

Dispatcher action: Sets $SERVICE-ba.status to RUNNING, then injects into Claude:

Spawn /agent ba for $SERVICE. PROJECT: $PROJECT. Spec: ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md

Execution:
1. BA agent reads FULL spec, extracts every acceptance criterion (AC-001, AC-002, …)
2. Reads the code, checks last-reviewed-commit.txt for incremental diff
3. For each criterion: finds implementing code, records file+line as evidence
4. Evaluates each: MET, PARTIAL, MISSING, DEVIATION, N/A
5. For app services: verifies API contracts, database schema, events match spec exactly
6. For infra services: verifies library API, deployment config, topic definitions

Output files:
- ~/dev/ops/reviews/$SERVICE/ba-report.json (JSON with criteria, deviations, verdict)
- ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$SERVICE: BA_PASS|BA_FAIL @ timestamp

Status file format:

DONE | 2026-03-22T10:30:00+00:00 | ba | $SERVICE | from-json-report

or

FAILED | 2026-03-22T10:30:00+00:00 | ba | $SERVICE | from-json: non-compliant

Verdict rules:
- compliant: criteriaMissing == 0 AND criteriaPartial == 0 AND no HIGH/CRITICAL deviations
- non-compliant: criteriaMissing > 0 OR any HIGH/CRITICAL deviation

On success (compliant): Dispatcher proceeds to parallel reviews.
On failure (non-compliant):
- Dispatcher sets $SERVICE-ba.status to TRIAGING and injects the failure into Claude for analysis
- Claude reads the BA report JSON to determine whether missing criteria map to pending tasks (not a real failure) or to completed tasks (a real failure requiring a dev fix)

Anti-rubber-stamp rules:
- Every criterion MUST have evidence (file path + line number)
- If implementing code cannot be found, mark MISSING (never assume)
- Read FULL spec every time (never rely on memory)
- Never suggest code changes (only report deviations)
- Minimum review depth: read every controller/route file and every test file


Step 5: Architect + Security + DevOps Reviews (Parallel)

Trigger: $SERVICE-ba.status = DONE AND no $SERVICE-review.status file exists.

Dispatcher action: Sets $SERVICE-review.status to RUNNING, then injects into Claude:

Spawn /agent architect + /agent security + /agent devops for $SERVICE

All three run in parallel (if memory > 2000MB).

5a. Architect Review

Agent: architect (model: sonnet, maxTurns: 30)

8 mandatory checks (ALL must pass):
1. Schema Isolation – service uses own DB schema only
2. Inter-Service Communication – Redpanda events only, no direct HTTP between ODS services
3. Multi-Tenancy – tenant_id from JWT, RLS on all tables, tenant_id in all CloudEvents
4. Layer Structure – Controllers > Services > Repositories separation
5. No Hardcoded URLs – all external endpoints via env vars
6. Header Propagation – Authorization, X-Tenant-Id, X-Correlation-Id, X-Source-Service forwarded
7. CloudEvents Compliance – specversion, type, source, id, time, tenantid fields present
8. Error Handling – service-specific exception filters, correlation ID in logs, no stack traces in responses

Output: ~/dev/ops/reviews/$SERVICE/architect-report.json
Verdict: Single FAIL on any check = overall FAIL. Every check needs evidence.

5b. Security Review

Agent: security (model: sonnet, maxTurns: 30)

OWASP Top 10 checks:
- A01 Injection, A02 Broken Auth, A03 Sensitive Data, A04 XXE, A05 Access Control
- A06 Security Misconfiguration, A07 XSS, A08 Insecure Deserialization
- A09 Known Vulnerabilities (npm audit), A10 Insufficient Logging

Automated scans run first: npm audit, secrets scan (grep), hardcoded URLs scan, .gitignore check.

Output: ~/dev/ops/reviews/$SERVICE/security-report.json
Verdict:
- clean: all checks PASS/N/A, no npm audit critical/high, no secrets
- concerns: any WARN, medium/low npm advisories
- critical: any FAIL, critical/high npm advisory, any secret in code

CRITICAL or HIGH severity = automatic FAIL, PR must NOT merge.

5c. DevOps Review

Agent: devops (model: sonnet, maxTurns: 30, mode: review)

Mandatory validation (all executed):
1. Docker build test – docker build -t test-$SERVICE .
2. .dockerignore check
3. Health endpoint verification (grep for health/readiness/liveness)
4. Structured logging check
5. Env vars documentation (.env.example)
6. Migrations check
7. Test suite execution

Output: ~/dev/ops/reviews/$SERVICE/devops-report.json
Verdict: PASS, FAIL, or PASS_WITH_NOTES

After all three reviews complete: Dispatcher checks $SERVICE-review.status. If all three JSON report files exist, it sets the status to DONE, then validates the verdicts:

BA=compliant AND ARCH=PASS AND SEC=(clean|concerns with severity!=HIGH/CRITICAL) AND DEVOPS=(PASS|PASS_WITH_NOTES)

If all pass: proceed to PR. If any fails: set $SERVICE-review.status to FAILED, increment crashes.
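The combined gate can be sketched as a bash predicate (a sketch: the real dispatcher extracts these verdict strings from the four JSON reports; here they are passed in directly, and the function name mirrors the all_reviews_pass() mentioned in Step 6):

```shell
#!/usr/bin/env bash
# Sketch: return 0 only when every review verdict clears the gate.
all_reviews_pass() {
    local ba="$1" arch="$2" sec="$3" sec_sev="$4" devops="$5"
    [ "$ba" = "compliant" ] || return 1
    [ "$arch" = "PASS" ] || return 1
    case "$sec" in
        clean) ;;                                   # always acceptable
        concerns)
            case "$sec_sev" in
                HIGH|CRITICAL) return 1 ;;          # concerns only pass below HIGH
            esac ;;
        *) return 1 ;;                              # critical or unknown verdict
    esac
    case "$devops" in PASS|PASS_WITH_NOTES) ;; *) return 1 ;; esac
    return 0
}
```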


Step 6: PR Creation and Merge

Trigger: $SERVICE-review.status = DONE AND all_reviews_pass() returns true AND no $SERVICE-pr.status file exists.

Agent: pr (model: sonnet, maxTurns: 15)

Execution:
1. Reads all four JSON review reports programmatically (never trusts markdown)
2. Validates: BA=compliant, ARCH=PASS, SEC not critical, DEVOPS=PASS/PASS_WITH_NOTES
3. If ANY report MISSING or FAIL: ABORT, do not create PR

Auto-merge safeguards (ALL must pass for auto-merge):
- All reviews PASS (not just PASS_WITH_NOTES)
- Lines changed <= 500
- No migration files in diff
- No cross-service/shared lib changes
- Security severity = NONE

If all pass: gh pr merge --squash --auto
If any fails: create PR with label human-review-required, post to Slack DM, do NOT merge.
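The safeguard check itself is deterministic and can be sketched as (a sketch: the argument names are assumptions; in practice the values would come from the PR diff and the review reports):

```shell
#!/usr/bin/env bash
# Sketch: every safeguard must hold, or the PR goes to human review.
can_auto_merge() {
    local all_pass="$1" lines_changed="$2" has_migrations="$3" \
          cross_service="$4" sec_severity="$5"
    [ "$all_pass" = "true" ] || return 1         # PASS only, not PASS_WITH_NOTES
    [ "$lines_changed" -le 500 ] || return 1     # small diffs only
    [ "$has_migrations" = "false" ] || return 1  # schema changes need a human
    [ "$cross_service" = "false" ] || return 1   # no shared-lib blast radius
    [ "$sec_severity" = "NONE" ] || return 1
    return 0
}
```

Keeping the gate as a pure predicate means the dispatcher can log exactly which safeguard failed before labeling the PR human-review-required.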

Output:
- GitHub PR on staging branch
- Status update to ~/.claude/agent-memory/pipeline/state.md: $PROJECT/$SERVICE: STAGING_DEPLOYED @ timestamp
- Update to ~/dev/specs/$PROJECT/gestion/progress.md

Status file: $SERVICE-pr.status = DONE | date | pr | $SERVICE | PR merged

Abort conditions: Any review MISSING/FAIL, CI fails, merge conflicts, security HIGH/CRITICAL.


Step 7: Provisioner (if needed)

Trigger: $SERVICE-pr.status = DONE AND no $SERVICE-deploy.status AND no Coolify config file (~/dev/ops/coolify/$SERVICE.json).

Agent: provisioner (model: sonnet, maxTurns: 30)

Execution (based on service type from service-project-map.json):

| Service Type | Action |
|---|---|
| web-service | Create Coolify Application (Dockerfile build pack) |
| infrastructure | Create Coolify Service (docker-compose) |
| script | No Coolify app. Mark DONE immediately. |

Resources provisioned:
- PostgreSQL schema + user (psql)
- GCS bucket + service account (gcloud)
- Coolify app (Coolify API)
- GitHub repo (gh)
- .env.staging file

Post-provision check (MANDATORY before marking DONE):
- web-service: trigger initial deploy, health check (2 min timeout)
- infrastructure: check containers running
- script: verify project builds

Output:
- ~/dev/ops/coolify/$SERVICE.json (Coolify config with appUuid, URLs, registry)
- ~/dev/ops/reviews/$SERVICE/provisioner-report.json
- ~/dev/projects/$SERVICE/.env.staging

Status file:

DONE | date | provisioner | $SERVICE | verified-operational

or

PROVISION_INCOMPLETE

On failure: PROVISION_INCOMPLETE or BLOCKED, escalate to Slack DM.

What requires human: COOLIFY_API_TOKEN, external API credentials (Stripe, SendGrid, CinetPay), DNS wildcard, SMTP credentials.

What does NOT require human: Coolify app creation, PostgreSQL schema, GCS bucket, GitHub repo, env vars, TLS certs.


Step 8: Deploy to Staging

Trigger: $SERVICE-pr.status = DONE AND Coolify config exists AND no $SERVICE-deploy.status.

Agent: devops (model: sonnet, maxTurns: 30, mode: deploy)

Execution:
1. Build and tag Docker image
2. Push to registry (if configured)
3. Deploy to Coolify via API (restart application)
4. Health check loop (every 10s, up to 3 min)
5. Smoke test
6. Write deploy report

If health fails: rollback via Coolify API, notify Slack DM.

Output:
- ~/dev/ops/reviews/$SERVICE/deploy-report.json
- ~/dev/projects/$SERVICE/.env.staging

Status file:

DONE | date | deploy | $SERVICE | script-type: no permanent deploy

or

DEPLOYED | date | devops | $SERVICE | $STAGING_URL

Step 9: Scenario Generation

Trigger: Deploy health check passes (sequential, must wait for deploy).

Agent: scenario (model: sonnet, maxTurns: 25)

Execution:
1. Reads spec and all review files
2. Generates API scenarios (scenarios.json)
3. Generates browser scenarios (browser-scenarios.json) if service has UI
4. Writes mock data SQL and cleanup SQL

Mandatory API scenario categories: happy-path, multi-tenancy, auth, validation, error
Mandatory browser scenario categories (if UI): user-journey, multi-tenancy, responsive, error-display, navigation

Output:
- ~/dev/projects/$SERVICE/tests/e2e/scenarios.json
- ~/dev/projects/$SERVICE/tests/e2e/browser-scenarios.json
- ~/dev/projects/$SERVICE/tests/e2e/mock-data.sql
- ~/dev/projects/$SERVICE/tests/e2e/cleanup.sql

Status: $PROJECT/$SERVICE: SCENARIOS_READY @ timestamp
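The actual scenarios.json schema is not documented in this reference; an illustrative entry (field names are assumptions, not the real schema) covering two of the mandatory categories might look like:

```json
{
  "service": "oid",
  "scenarios": [
    {
      "id": "S-001",
      "category": "happy-path",
      "method": "POST",
      "path": "/api/v1/documents",
      "auth": "valid-token",
      "expectStatus": 201
    },
    {
      "id": "S-002",
      "category": "multi-tenancy",
      "method": "GET",
      "path": "/api/v1/documents/{other-tenant-doc-id}",
      "auth": "tenant-a-token",
      "expectStatus": 404
    }
  ]
}
```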


Step 10: E2E Tests

Trigger: Scenario agent completes (sequential after scenarios).

Agent: e2e-test (model: sonnet, maxTurns: 30)

Tools: Full Playwright MCP tool suite + Read, Write, Bash, Glob, Grep

Pre-check: Verify staging URL is reachable via curl $STAGING_URL/health. If not: ABORT with E2E_BLOCKED.

Phase 1 – API E2E Tests: Execute each scenario from scenarios.json using curl against staging URL. Test categories: happy path, auth (expired/wrong/no token), multi-tenancy (cross-tenant access blocked), input validation, error handling.

Phase 2 – Browser E2E Tests: Execute browser scenarios using Playwright MCP tools: browser_navigate, browser_fill_form, browser_click, browser_wait_for, browser_snapshot, browser_take_screenshot. Save screenshots to ~/dev/ops/reviews/$SERVICE/screenshots/.

Phase 3 – Cleanup: Run cleanup.sql against test database.

Output:
- ~/dev/ops/reviews/$SERVICE/e2e-report.json
- Screenshots in ~/dev/ops/reviews/$SERVICE/screenshots/

Verdict rules:
- All API + browser tests pass: E2E_PASS
- Any test fails: E2E_FAIL
- Staging not reachable: E2E_BLOCKED
- Console JS errors during browser tests: E2E_FAIL (even if assertions pass)
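The verdict rules reduce to a small decision function (a sketch with assumed inputs: reachability of the staging URL, the failed-test count, and the console-error count observed during browser runs):

```shell
#!/usr/bin/env bash
# Sketch: compute the E2E verdict from the three observations above.
e2e_verdict() {
    local reachable="$1" failures="$2" console_errors="$3"
    if [ "$reachable" != "true" ]; then
        echo "E2E_BLOCKED"; return     # staging down: blocked, not failed
    fi
    # Console JS errors fail the run even when every assertion passed.
    if [ "$failures" -gt 0 ] || [ "$console_errors" -gt 0 ]; then
        echo "E2E_FAIL"; return
    fi
    echo "E2E_PASS"
}
```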


3. PDLC Pipeline Flow

Complete Pipeline Sequence

Discovery --> Spec Writing --> Prioritization --> Validation --> ADLC Handoff
    --> (wait for ADLC) --> GTM Prep --> Analytics

Pipeline States

DISCOVERY --> SPEC_WRITING --> PRIORITIZED --> VALIDATING --> VALIDATED
  --> SUBMITTED_TO_ADLC (passive wait) --> ADLC_ACTIVE (passive wait)
  --> ADLC_STAGING --> GTM --> ANALYTICS

Step-by-Step Detail

Step 1: Discovery

Trigger: New market signal, user feedback, stakeholder request, or Slack command discover {topic}.

Agent: discovery (model: sonnet, maxTurns: 25)

Tools: Read, Bash, Glob, Grep, WebSearch, WebFetch

Execution:
1. Understand the request
2. Research: WebSearch for market data, competitor analysis, industry trends
3. Analyze existing context: read PROJECT.md, existing specs, business-rules.md
4. Identify gaps in platform capabilities
5. Write opportunity brief

Output:
- ~/dev/specs/$PROJECT/pdlc/opportunities/$OPPORTUNITY_NAME.md (markdown brief with problem statement, market evidence, proposed solution, target users, business impact, risks, recommendation)
- ~/dev/specs/$PROJECT/pdlc/opportunities/$OPPORTUNITY_NAME.json (structured summary)

Recommendation values: GO, NO_GO, NEEDS_MORE_DATA
Priority values: CRITICAL, HIGH, MEDIUM, LOW

Status file: echo "RUNNING | $(date) | discovery | $SERVICE" > ~/dev/ops/outputs/$SERVICE-discovery.status

On success: Proceed to spec writing (if recommendation=GO).

Slack: Posts to PDLC channel (C0AN42N3C0L).


Step 2: Spec Writing

Trigger: Discovery recommendation = GO AND no spec exists for target service.

Agent: spec-writer (model: opus, maxTurns: 40)

Execution:
1. Read opportunity brief
2. Read architecture.md and business-rules.md for constraints
3. Read 1-2 existing specs for format and depth reference
4. Write the spec

Spec structure (11 mandatory sections):
1. Objectif
2. API Endpoints (method, path, request/response types, status codes, auth, multi-tenancy)
3. Data Model (tables, columns, types, indexes, RLS, relationships)
4. Events/CloudEvents (types, payload schema, topics)
5. Business Rules (validation, edge cases, error handling)
6. Acceptance Criteria (numbered AC-001..AC-N, Given/When/Then, minimum 10)
7. Non-Functional Requirements (performance, security, multi-tenancy)
8. Dependencies
9. Service Classification (MANDATORY – type, stack, deploy mode, staging domain)
10. Infrastructure Requirements (MANDATORY – resource table with auto-provisionable flag)
11. Out of Scope

Output:
- ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md
- Updates ~/dev/ops/agents/service-project-map.json with new entry
- Updates ~/dev/specs/$PROJECT/gestion/backlog.md

Quality rules: Every endpoint must have types. Every table must have tenant_id + RLS. Every AC must be testable. Minimum 10 ACs. Section 9 and 10 are mandatory.

Status file: echo "RUNNING | $(date) | spec-writer | $SERVICE" > ~/dev/ops/outputs/$SERVICE-spec-writer.status

Slack: Posts to PDLC channel.


Step 3: Prioritization

Trigger: New spec written, not yet prioritized.

Agent: prioritization (model: sonnet, maxTurns: 20)

RICE Scoring:
- Reach (1-10): users/tenants affected per quarter
- Impact (0.25, 0.5, 1, 2, 3): Minimal to Massive
- Confidence (0.5, 0.8, 1.0): Low, Medium, High
- Effort (1-10): agent-weeks
- Score = (Reach x Impact x Confidence) / Effort
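The score formula involves fractional Impact and Confidence values, which bash integer arithmetic cannot handle; a small awk wrapper computes it (a sketch; the agent itself does this reasoning rather than calling a script):

```shell
#!/usr/bin/env bash
# RICE score as defined above: (Reach x Impact x Confidence) / Effort.
rice_score() {
    # $1=reach $2=impact $3=confidence $4=effort
    awk -v r="$1" -v i="$2" -v c="$3" -v e="$4" \
        'BEGIN { printf "%.2f\n", (r * i * c) / e }'
}
```

For example, a feature with Reach 8, Impact 2, Confidence 0.8, and Effort 4 scores (8 x 2 x 0.8) / 4 = 3.2.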

Output:
- Updates ~/dev/specs/$PROJECT/gestion/backlog.md (ranked table with RICE scores, phase assignments P0-P3)
- ~/dev/specs/$PROJECT/pdlc/prioritization-$DATE.json

Phase groupings: P0 (foundation, blocks everything), P1 (core, enabled by P0), P2 (growth), P3 (nice-to-have)

Slack: Posts top 3 items and current phase count to PDLC channel.


Step 4: Validation

Trigger: Spec prioritized, not yet validated.

Agent: validation (model: sonnet, maxTurns: 25)

Validation checklist:
- Spec completeness (6 checks): API types, DB schema with tenant_id+RLS, 10+ testable ACs, CloudEvents, dependencies, out of scope
- Technical feasibility: architecture alignment, no circular deps, dependencies exist/planned, no stack conflicts
- Effort estimation: T-shirt sizing (S/M/L/XL), broken down by API/DB/logic/tests/integration
- Risk assessment: technical, integration, security, scope risks
- Dependency map: blocked-by, blocks, shared libs, schema conflicts
- Infrastructure readiness: Section 9 exists, auto-provisionable vs human resources identified, env vars match .env.example

Output: ~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json

Verdict rules:
- APPROVED: spec complete, feasible, no HIGH risks, dependencies satisfied
- NEEDS_REVISION: spec incomplete or has addressable issues – sent back to spec-writer
- BLOCKED: depends on unbuilt services, not feasible with current stack

On APPROVED:
- Creates handoff file: ~/dev/specs/$PROJECT/pdlc/handoffs/$SERVICE-handoff.json
- Posts to ADLC channel: spec ready for development

On NEEDS_REVISION: Posts details to PDLC channel. Spec-writer agent may be re-spawned for revision.

Slack: Posts to PDLC channel.


Step 5: ADLC Handoff

Trigger: Validation verdict = APPROVED.

Execution (by PDLC orchestrator):
1. Ensure spec.md exists
2. Write handoff file:

{
  "service": "$SERVICE",
  "project": "$PROJECT",
  "date": "ISO-timestamp",
  "state": "SUBMITTED_TO_ADLC",
  "specPath": "~/dev/specs/$PROJECT/specs/$SERVICE/spec.md"
}
3. Post to ADLC Slack channel: :package: PDLC->ADLC handoff: $SERVICE spec ready for development
4. Update pipeline-state.md: $SERVICE: SUBMITTED_TO_ADLC @ timestamp

After handoff – passive wait rules:
- PDLC MUST NOT re-submit, re-validate, re-discover, or spawn any agent for this service
- PDLC only reads ADLC state passively every 10 minutes

| ADLC State Detected | PDLC Action |
|---|---|
| No dev status yet (< 24h) | Wait. Normal. |
| No dev status yet (> 24h) | Post reminder to ADLC channel |
| Dev RUNNING or DONE | Update to ADLC_ACTIVE |
| STAGING_DEPLOYED or PR merged | Update to ADLC_STAGING, spawn GTM |
| E2E_PASS | Post to DM: ready for prod promotion |
| PROD_DEPLOYED | Spawn Analytics agent |
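The passive-wait poll is a straightforward state-to-action mapping (a sketch: the state names and action labels here are illustrative; the real dispatcher-pdlc.sh derives the ADLC state from status files and state.md rather than receiving it as a string):

```shell
#!/usr/bin/env bash
# Sketch: map the detected ADLC state to the PDLC action from the table above.
pdlc_action() {
    local adlc_state="$1" age_hours="$2"
    case "$adlc_state" in
        NONE)   # no dev status file yet
            if [ "$age_hours" -gt 24 ]; then
                echo "POST_REMINDER"
            else
                echo "WAIT"
            fi ;;
        DEV_RUNNING|DEV_DONE)  echo "SET_ADLC_ACTIVE" ;;
        STAGING_DEPLOYED)      echo "SET_ADLC_STAGING_AND_SPAWN_GTM" ;;
        E2E_PASS)              echo "POST_DM_PROD_READY" ;;
        PROD_DEPLOYED)         echo "SPAWN_ANALYTICS" ;;
        *)                     echo "WAIT" ;;
    esac
}
```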

Step 6: GTM (Go-to-Market)

Trigger: ADLC deploys service to staging (dispatcher-pdlc.sh detects STAGING_DEPLOYED in ADLC state and no GTM brief exists).

Agent: gtm (model: sonnet, maxTurns: 20)

Tools: Read, Bash, Glob, Grep, WebSearch

Output: ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md with:
- Positioning (what, who, value prop, differentiation)
- Feature summary
- Rollout plan (internal testing, beta, GA, rollback)
- Documentation needs (API docs, user guide, admin guide, migration guide, changelog)
- Communication plan
- Success metrics (adoption, engagement, quality, business)
- Risks and mitigations

Slack: Posts to PDLC channel.


Step 7: Analytics

Trigger: Service deployed to production (future: PROD_DEPLOYED state).

Agent: analytics (model: sonnet, maxTurns: 20)

Data platform: ClickHouse (OLAP), Metabase (BI)

Output: ~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md with:
- KPI definitions (adoption rate, API latency p95, error rate, etc.)
- Tracking plan (CloudEvents via Redpanda)
- ClickHouse tables (materialized views)
- Metabase dashboards (Operations, Product, Business)
- Success criteria checklist
- Review schedule

Slack: Posts KPI count, event count, dashboard count to PDLC channel.


4. Agent Registry

ADLC Agents

dev

| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 50 |
| Tools | Read, Write, Edit, Bash, Glob, Grep |
| Input | Spec (~/dev/specs/$PROJECT/specs/$SERVICE/spec.md), service CLAUDE.md, existing code |
| Output | Git commits on feat/$TASK_ID branch, progress.md update, state.md update |
| Status format | $PROJECT/$TASK_ID: DEV_COMPLETE @ ISO-timestamp (in state.md) |
| Pass criteria | All tests pass, code committed and pushed |
| Fail criteria | Tests fail 3 times on same issue |
| Forbidden | Modify docker-compose.yml, change DB schemas outside Prisma, hardcode secrets |

ba

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 35 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, architecture.md, business-rules.md, service code |
| Output | ~/dev/ops/reviews/$SERVICE/ba-report.json |
| Status format | $PROJECT/$SERVICE: BA_PASS\|BA_FAIL @ ISO-timestamp |
| Pass criteria | criteriaMissing == 0 AND criteriaPartial == 0 AND no HIGH/CRITICAL deviations |
| Fail criteria | criteriaMissing > 0 OR any HIGH/CRITICAL deviation |
| Forbidden | Suggest code changes, write code, mark criteria MET without evidence |

architect

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | architecture.md, business-rules.md, spec, service code |
| Output | ~/dev/ops/reviews/$SERVICE/architect-report.json |
| Status format | $PROJECT/$SERVICE: ARCHITECT_PASS\|ARCHITECT_FAIL @ ISO-timestamp -- X/8 checks |
| Pass criteria | All 8 architectural checks pass |
| Fail criteria | Single FAIL on any check |
| Forbidden | Write production code, mark PASS without evidence |

security

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | Service code |
| Output | ~/dev/ops/reviews/$SERVICE/security-report.json |
| Status format | $PROJECT/$SERVICE: SECURITY_PASS\|SECURITY_FAIL @ ISO-timestamp -- OWASP X/10, severity=Y |
| Pass criteria | All OWASP checks PASS/N/A, no npm critical/high, no secrets |
| Fail criteria | Any FAIL, any critical/high npm advisory, any secret in code |
| Forbidden | Write code, skip automated scans, assume zero findings |

devops

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep |
| Input | Service code, Dockerfile, .env.example |
| Output (review mode) | ~/dev/ops/reviews/$SERVICE/devops-report.json |
| Output (deploy mode) | ~/dev/ops/reviews/$SERVICE/deploy-report.json, .env.staging |
| Status format (review) | Contributes to $SERVICE-review.status |
| Status format (deploy) | DEPLOYED\|DEPLOY_FAILED\|FIRST_DEPLOY_NEEDED in deploy-report.json |
| Pass criteria (review) | Docker builds, health endpoint exists, tests pass |
| Fail criteria | Docker build fails, no health endpoint, tests fail |
| Forbidden | Write application code |

pr

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 15 |
| Tools | Read, Bash, Glob, Grep |
| Input | All JSON review reports in ~/dev/ops/reviews/$SERVICE/ |
| Output | GitHub PR, state.md update, progress.md update |
| Status format | $PROJECT/$SERVICE: STAGING_DEPLOYED @ ISO-timestamp |
| Pass criteria | All reviews pass, PR created/merged |
| Fail criteria | Any review MISSING/FAIL, CI fails, merge conflicts |
| Forbidden | Write application code, merge with security HIGH/CRITICAL, force merge conflicts |

provisioner

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep |
| Input | Spec (sections 9+10), service-project-map.json, .env.example |
| Output | ~/dev/ops/coolify/$SERVICE.json, provisioner-report.json, .env.staging, DB schema, GCS bucket, GitHub repo |
| Status format | DONE \| date \| provisioner \| $SERVICE \| verified-operational or PROVISION_INCOMPLETE |
| Pass criteria | All resources created AND post-provision health check passes |
| Fail criteria | Cannot create resource, health check fails |
| Forbidden | Write application code, mark DONE before health check |

scenario

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Write, Edit, Bash, Glob, Grep, Playwright (navigate, snapshot, screenshot) |
| Input | Spec, review files |
| Output | scenarios.json, browser-scenarios.json, mock-data.sql, cleanup.sql in ~/dev/projects/$SERVICE/tests/e2e/ |
| Status format | $PROJECT/$SERVICE: SCENARIOS_READY @ ISO-timestamp -- X api, Y browser scenarios |
| Forbidden | Write application code |

e2e-test

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Write, Bash, Glob, Grep, full Playwright MCP suite (24 tools) |
| Input | Scenario files, staging URL, mock data |
| Output | ~/dev/ops/reviews/$SERVICE/e2e-report.json, screenshots |
| Status format | E2E_PASS\|E2E_FAIL\|E2E_BLOCKED |
| Forbidden | Write application code |

auditor

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 30 |
| Tools | Read, Bash, Glob, Grep |
| Input | All status files, all review reports, registry, git history |
| Output | ~/dev/ops/reviews/adlc-audit-YYYYMMDD-HHMM.json, last-audit.md |
| Status format | N/A (audit is a cross-cutting concern) |
| Pass criteria | Zero CRITICAL/HIGH findings = COMPLIANT |
| Triggers | Every 6 hours (automated), on-demand via Slack, after major milestones |
| Audit scope | Pipeline sequence compliance, review report quality (anti-rubber-stamp), registry consistency, orchestrator behavior (not coding directly), blocked/failed services, test coverage |
| Forbidden | Write application code, modify review reports |

resolver

| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 40 |
| Tools | Read, Write, Edit, Bash, Glob, Grep |
| Input | Blocked/failed status files, pipeline.log, external-blockers.log, agent definitions, skill definitions, specs |
| Output | ~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json, changelog.md, resolver-fixes.log, resolver-monitor-*.json |
| Status format | N/A (operates above both pipelines) |
| CAN modify | Agent definitions, skill definitions, dispatcher scripts, spec-writer templates, CLAUDE.md rules, provisioner behavior, service-project-map.json |
| CANNOT modify | Application source code, spec content for specific services, review reports, git history, systemd units |

PDLC Agents

discovery

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Bash, Glob, Grep, WebSearch, WebFetch |
| Input | Market signals, user feedback, existing specs, business-rules.md |
| Output | ~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.md + .json |
| Forbidden | Write specs, write code |

spec-writer

| Property | Value |
|---|---|
| Model | opus |
| Max Turns | 40 |
| Tools | Read, Bash, Glob, Grep |
| Input | Opportunity brief, architecture.md, business-rules.md, existing specs |
| Output | ~/dev/specs/$PROJECT/specs/$SERVICE/spec.md, updates to service-project-map.json and backlog.md |
| Forbidden | Write code |

prioritization

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep |
| Input | Pending specs, opportunity briefs, backlog, roadmap, business-rules.md |
| Output | Updated backlog.md, prioritization-$DATE.json |
| Forbidden | Write specs, write code |

validation

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 25 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, architecture.md, business-rules.md, existing services, ADLC state |
| Output | ~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json, handoff file on APPROVED |
| Forbidden | Write code, write specs |

gtm

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep, WebSearch |
| Input | Spec, ADLC state, PROJECT.md |
| Output | ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md |
| Forbidden | Write code, write specs |

analytics

| Property | Value |
|---|---|
| Model | sonnet |
| Max Turns | 20 |
| Tools | Read, Bash, Glob, Grep |
| Input | Spec, GTM brief, ADLC state |
| Output | ~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md |
| Forbidden | Write code |

5. Status File Format Standard

THE Definitive Format

All status files are written to ~/dev/ops/outputs/ and follow this format:

STATUS | ISO-date | agent-type | service | details

Example:

DONE | 2026-03-22T10:30:00+00:00 | ba | oid | from-json-report
FAILED | 2026-03-22T11:00:00+00:00 | security | docstore | severity=CRITICAL, findings=3
RUNNING | 2026-03-22T09:00:00+00:00 | dev | pdf-engine | task T-042
BLOCKED_EXTERNAL | 2026-03-22T08:00:00+00:00 | provisioner | billing-engine | need Stripe API key
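A status line can be produced mechanically. The sketch below shows the line shape only; in production, agents write status files through the write-status.sh CLI (section 14), which also validates the keyword enum. The helper name write_status_line is illustrative, not a platform script.

```shell
# Illustrative helper (not a platform script): emits one line in the
# STATUS | ISO-date | agent-type | service | details format.
write_status_line() {
    local status="$1" agent="$2" service="$3" details="$4"
    printf '%s | %s | %s | %s | %s\n' \
        "$status" "$(date -Is)" "$agent" "$service" "$details"
}

# e.g. redirect into ~/dev/ops/outputs/oid-ba.status
write_status_line DONE ba oid from-json-report
```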

Valid STATUS Values

Status Meaning
DONE Stage completed successfully
PASS Tests or checks passed
FAIL Tests or checks failed
FAILED Agent or stage failed
RUNNING Agent currently executing
BLOCKED Blocked on internal dependency
BLOCKED_EXTERNAL Blocked on external resource (human action needed)
PAUSED Manually paused
TRIAGING BA failure being analyzed by orchestrator
DEPLOYED Successfully deployed to staging
PROVISION_INCOMPLETE Provisioner created resources but health check failed

INVALID Values

Anything not in the above list is considered corrupted. The dispatcher’s read_status() function validates the first word (the surrounding wrapper is reconstructed here for context; the case list is the authoritative part):

read_status() {
    local actual_file="$1"
    [ -f "$actual_file" ] || { echo ""; return; }

    local first_word
    first_word=$(awk '{print $1; exit}' "$actual_file")

    case "$first_word" in
        DONE|PASS|FAIL|FAILED|RUNNING|BLOCKED|BLOCKED_EXTERNAL|PAUSED|TRIAGING|DEPLOYED|PROVISION_INCOMPLETE|"")
            echo "$first_word"
            ;;
        *)
            # Corrupted -- log and auto-delete
            rm "$actual_file"
            echo ""
            ;;
    esac
}

Invalid examples: JSON blobs, service names, partial data, markdown, anything an agent writes that is not a standard keyword.

Status File Naming Convention

~/dev/ops/outputs/$SERVICE-$STAGE.status

Where $STAGE is one of: dev, test, ba, review, pr, deploy, provisioner, discovery, spec-writer, prioritization, validation, gtm, analytics

Additional files:

- $SERVICE.crashes – crash counter (plain integer)
- $SERVICE-dev-$TASKID.status – per-task dev status (glob fallback)

Stale RUNNING Detection

The dispatcher resets any RUNNING status file older than 30 minutes (agent likely crashed or was killed by session restart):

if [ "$stage_status" = "RUNNING" ]; then
    stage_age=$(( $(date +%s) - $(stat -c '%Y' "$stage_file") ))
    if [ "$stage_age" -gt 1800 ]; then
        rm "$stage_file"
    fi
fi

6. Dispatcher State Machine

ADLC Dispatcher (dispatcher-v3.sh)

Execution Order (every 5 minutes)

  1. Branch consolidation (consolidate_branches)
  2. Pipeline scan (process_pipeline)
  3. Slack inbox (process_slack_inbox)
  4. External blocker detection (detect_external_blockers)
  5. Resolver trigger (maybe_spawn_resolver)
  6. Kanban summary (maybe_post_kanban, every 6th run)

State Transitions

For each service in ~/dev/projects/*/ (must have .git):

              +-----------+
              | No status |
              +-----+-----+
                    |
                    v
              +-----------+
              | dev: DONE |
              +-----+-----+
                    |
    [bash: test-runner.sh, zero tokens]
                    |
          +---------v---------+
          |  test: PASS/FAIL  |
          +---------+---------+
                    |
              (if PASS)
                    |
    [inject: "Spawn /agent ba"]
    [write: ba.status = RUNNING]
                    |
          +---------v---------+
          |   ba: DONE/FAILED |
          +---------+---------+
                    |
         DONE               FAILED
           |                  |
           |     [inject: "BA FAIL, analyze"]
           |     [write: ba.status = TRIAGING]
           |                  |
    [inject: "Spawn reviews"] |
    [write: review.status = RUNNING]
           |
    +---------v---------+
    | review: DONE/FAIL |
    +---------+---------+
              |
   (if DONE AND all_reviews_pass)
              |
    [inject: "Spawn /agent pr"]
    [write: pr.status = RUNNING]
              |
    +---------v---------+
    |   pr: DONE/FAIL   |
    +---------+---------+
              |
      (if DONE, based on service type)
              |
    +----+----+--------+
    |    |             |
  script infra     web-service
    |    |             |
  DONE  +------+-------+
               |
     [has coolify config?]
        |            |
       YES          NO
        |            |
    [deploy]    [provisioner]
        |            |
    +----v----+  +----v----+
    |DEPLOYED |  |PROVISION|
    +---------+  +---------+

What the Dispatcher Reads

Source What It Extracts
~/dev/ops/outputs/$SERVICE-$STAGE.status First word = status keyword
~/dev/ops/reviews/$SERVICE/*.json JSON verdicts (fallback if no .status file)
~/dev/ops/outputs/$SERVICE.crashes Crash count (circuit breaker)
~/dev/ops/agents/service-project-map.json Service -> project mapping, service type, deploy mode
~/dev/ops/coolify/$SERVICE.json Coolify config existence check
~/dev/ops/slack-inbox/*.txt Slack messages to inject

Decision Logic (Pure Bash)

The dispatcher NEVER uses Claude for decisions. It uses:

- read_status() – extract and validate the first word from a status file
- all_reviews_pass() – parse all 4 JSON review files with python3 one-liners
- get_project(), get_service_type(), get_deploy_mode() – read from service-project-map.json
- get_crashes(), inc_crashes() – crash counter management
- review_verdict() – extract a JSON field from a review report
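A minimal sketch of all_reviews_pass(), assuming each report carries a top-level "verdict" field whose passing value is "pass" — the real report schemas may differ. REVIEWS_DIR is parameterized here for illustration; the dispatcher reads ~/dev/ops/reviews/ directly.

```shell
# Sketch only: the field name "verdict" and value "pass" are assumptions
# about the report schema.
REVIEWS_DIR="${REVIEWS_DIR:-$HOME/dev/ops/reviews}"

all_reviews_pass() {
    local service="$1" agent report
    for agent in ba architect security devops; do
        report="$REVIEWS_DIR/$service/$agent-report.json"
        [ -f "$report" ] || return 1    # missing report = not passed
        python3 -c 'import json,sys; sys.exit(0 if json.load(open(sys.argv[1])).get("verdict")=="pass" else 1)' "$report" \
            || return 1
    done
    return 0
}
```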

What the Dispatcher Injects into Claude

The dispatcher sends text commands to Claude via tmux send-keys. Examples:

Spawn /agent ba for oid. PROJECT: ods-platform. Spec: ~/dev/specs/ods-platform/specs/oid/spec.md
BA FAIL for docstore. Read ~/dev/ops/reviews/docstore/ba-report.json. If missing criteria map to pending tasks, spawn dev agents. If real failure, spawn dev fix.
Spawn /agent architect + /agent security + /agent devops for pdf-engine. PROJECT: ods-platform.
Spawn /agent provisioner for notification-hub. PROJECT: ods-platform. TYPE: web-service. Create Coolify application.
Spawn /agent resolver. SYSTEMIC BLOCKER DETECTED: 3 services blocked. Top causes: missing Coolify config.

What Status Files the Dispatcher Creates/Modifies

| Action | File Written |
| --- | --- |
| Tests pass | $SERVICE-test.status = PASS |
| Tests fail | $SERVICE-test.status = FAIL |
| Spawn BA | $SERVICE-ba.status = RUNNING |
| BA FAIL triage | $SERVICE-ba.status = TRIAGING |
| Spawn reviews | $SERVICE-review.status = RUNNING |
| Spawn PR | $SERVICE-pr.status = RUNNING |
| Spawn deploy | $SERVICE-deploy.status = RUNNING |
| Spawn provisioner | $SERVICE-provisioner.status = RUNNING |
| Script service (no deploy) | $SERVICE-deploy.status = DONE \| date \| deploy \| $SERVICE \| script-type |
| Recover BA from JSON | $SERVICE-ba.status = DONE \| date \| ba \| $SERVICE \| from-json-report |
| Recover reviews from JSON | $SERVICE-review.status = DONE \| date \| reviews \| $SERVICE \| from-json-reports |
| Stale RUNNING (>30min) | Deletes the status file |
| Corrupted status | Deletes the status file |

PDLC Dispatcher (dispatcher-pdlc.sh)

Execution Order (every 10 minutes)

  1. Process PDLC Slack inbox
  2. Check ADLC state for GTM/Analytics triggers
  3. Detect idle PDLC Claude and inject check-pdlc
  4. Kanban summary (every 6th run = 60 min)

GTM Auto-Trigger Logic

# For each service with a spec under each project:
if ADLC state shows STAGING_DEPLOYED for $SERVICE:
    if no GTM brief exists at ~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md:
        if $SERVICE-gtm.status != RUNNING and != DONE:
            inject "Spawn /agent gtm for $SERVICE"
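The pseudocode above can be made concrete. In this sketch the state file, brief path, and status path are passed in as arguments so the gate logic is visible; the real dispatcher derives them from $PROJECT/$SERVICE and injects via tmux send-keys rather than echoing, and the STAGING_DEPLOYED grep pattern is an assumption about the state-file format.

```shell
# Sketch of the GTM auto-trigger gate (argument-driven for illustration).
maybe_trigger_gtm() {
    local state_file="$1" brief="$2" status_file="$3" service="$4"
    # Only fire once the service is staged...
    grep -q "STAGING_DEPLOYED" "$state_file" || return 0
    # ...and no GTM brief exists yet...
    [ -f "$brief" ] && return 0
    # ...and no gtm agent is RUNNING or already DONE.
    if [ -f "$status_file" ]; then
        case "$(awk '{print $1; exit}' "$status_file")" in
            RUNNING|DONE) return 0 ;;
        esac
    fi
    echo "Spawn /agent gtm for $service"   # real dispatcher: tmux send-keys
}
```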

Idle Detection

# If last visible line contains "bypass permissions" (Claude prompt)
# AND no agents running (no "local agents" or "background tasks" in pane):
inject "Run check-pdlc. Read ~/.claude/skills/check-pdlc/SKILL.md."
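As a runnable sketch, with the pane text passed in as a string (the dispatcher obtains it with tmux capture-pane -p -t ods-pdlc):

```shell
# Idle check sketch: idle = Claude prompt visible on the last line AND
# no agent activity anywhere in the pane.
pdlc_is_idle() {
    local pane_text="$1"
    printf '%s\n' "$pane_text" | tail -n 1 | grep -q "bypass permissions" || return 1
    printf '%s\n' "$pane_text" | grep -Eqi "local agents|background tasks" && return 1
    return 0
}
```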

7. Service Classification

Three Service Types

| Type | Description | Deployment | Coolify Entity | Health Check |
| --- | --- | --- | --- | --- |
| web-service | REST API or frontend, runs permanently | Dockerfile build | Coolify Application | /health endpoint |
| infrastructure | Broker, database, cache (Redpanda, ClickHouse, etc.) | docker-compose | Coolify Service | Container status |
| script | Migration, data import, CLI tool | None (one-shot) | None | Build check only |

service-project-map.json Structure

Location: ~/dev/ops/agents/service-project-map.json

{
  "oid": {
    "project": "ods-platform",
    "type": "web-service",
    "stack": "node",
    "deploy": "dockerfile"
  },
  "redpanda": {
    "project": "ods-platform",
    "type": "infrastructure",
    "stack": "docker",
    "deploy": "docker-compose"
  },
  "migration": {
    "project": "lejecos",
    "type": "script",
    "stack": "node",
    "deploy": "one-shot"
  }
}

Legacy format (backward compatible): "oid": "ods-platform" (simple string = project name, defaults to type=web-service, deploy=dockerfile).
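Both formats can be read with one helper. A sketch (python3-based, as the real dispatcher helpers are described to be; the helper name map_field is illustrative, and the defaulting mirrors the backward-compatibility rule above):

```shell
# Sketch: resolves project/type/deploy for a service, accepting both the
# object format and the legacy bare-string format.
map_field() {
    local map_file="$1" service="$2" field="$3"
    python3 - "$map_file" "$service" "$field" <<'PY'
import json, sys
map_file, service, field = sys.argv[1], sys.argv[2], sys.argv[3]
entry = json.load(open(map_file)).get(service)
if entry is None:
    print("")
    sys.exit(0)
if isinstance(entry, str):  # legacy: bare string = project name
    entry = {"project": entry, "type": "web-service", "deploy": "dockerfile"}
print(entry.get(field, ""))
PY
}
```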

Deploy Mode Mapping

| Deploy Mode | Coolify Action | Dispatcher Behavior |
| --- | --- | --- |
| dockerfile | Create Coolify Application, Dockerfile build pack | Spawn provisioner if no config, then devops deploy |
| docker-compose | Create Coolify Service | Spawn provisioner for compose, then devops deploy |
| one-shot | No Coolify deployment | Mark deploy as DONE immediately |

Current Projects

Project Specs Directory Services
ods-platform ~/dev/specs/ods-platform/ oid, redpanda, docstore, pdf-engine, notification-hub, workflow-engine, form-engine, billing-engine, securemail, doceditor, agenda
ods-dashboard ~/dev/specs/ods-dashboard/ ods-dashboard
lejecos ~/dev/specs/lejecos/ migration

8. External Dependencies

Resource Categories and Provisionability

| Resource Type | CLI/Method | Auto-Provisionable | Agent |
| --- | --- | --- | --- |
| PostgreSQL schema + user | psql | Yes | provisioner |
| GCS bucket + service account | gcloud storage | Yes | provisioner |
| Coolify app/service | Coolify API | Yes (if token exists) | provisioner |
| GitHub repo | gh | Yes | provisioner |
| Redpanda topics | rpk / docker exec | Yes | provisioner |
| Redis instance | docker / GCP Memorystore | Yes | provisioner |
| TLS certificates | Coolify (Let’s Encrypt) | Yes (automatic) | Coolify |
| COOLIFY_API_TOKEN | Manual | No (first time) | Human |
| External API credentials (Stripe, SendGrid, CinetPay) | Manual | No | Human |
| DNS wildcard configuration | Manual | No (first time) | Human |
| SMTP server credentials | Manual | No | Human |

Known External Dependencies Per Service

Tracked in ~/dev/ops/external-deps.md:

| Service | Dependency | Type | Auto |
| --- | --- | --- | --- |
| oid | PostgreSQL 5433 | infra | Yes |
| docstore | S3/MinIO | infra | Partial (need bucket + credentials) |
| notification-hub | SMTP server | infra | No |
| notification-hub | SendGrid API | external-api | No |
| pdf-engine | (self-contained) | - | N/A |
| billing-engine | Stripe API | external-api | No |
| billing-engine | CinetPay API | external-api | No |
| all | Coolify | deployment | Partial (need per-service UUID) |

Escalation Format for Human Blockers

When a resource requires human action, the provisioner (or orchestrator) posts to Slack DM:

:key: EXTERNAL BLOCKER -- {service}/{task}
Category: {credentials|infrastructure|deployment|external-api|network|permissions}
Missing: {specific resource or credential}
Spec reference: {where in spec.md this is mentioned}
Impact: {what cannot proceed without this}
Action needed: {exact steps for the human}

After posting:

  1. Mark task as BLOCKED_EXTERNAL in pipeline state
  2. Log to ~/dev/ops/outputs/external-blockers.log with timestamp
  3. Move to next task (do NOT wait for resolution)
  4. When human responds in Slack, the slack-bridge skill resets the blocked status


9. Resolver Protocol

When It Triggers

The dispatcher spawns the resolver agent when:

  1. 2+ services blocked on the same root cause – maybe_spawn_resolver() counts blocked/failed deploy/provisioner status files and groups them by reason. If the total is >= 2 and there are <= 2 unique reasons, the blocker is systemic.
  2. 2+ corrupted status files – a non-standard first word in status files indicates agents are writing an invalid format.
  3. Periodic scan (every 30 min) finds recurring patterns.
  4. Human request via Slack: “resolve”, “root cause”, “why is everything blocked”

Cooldown: resolver will not re-run within 1 hour of last run (checks resolver-last-run.ts).

6-Phase Cycle

Phase 1: Root Cause Analysis

Trace from symptom to root cause using a structured tree:

BLOCKER: [description]
+-- IMMEDIATE: [what directly failed]
+-- AGENT GAP: [which agent should have caught/handled this]
+-- SPEC GAP: [what the spec should have specified]
+-- TEMPLATE GAP: [what the spec-writer template is missing]
+-- PIPELINE GAP: [what the dispatcher/provisioner doesn't check]
+-- DESIGN GAP: [systemic assumption that was wrong]

Sources read: blocked/failed status files, pipeline.log, external-blockers.log, agent definitions, skill definitions, specs.

Phase 2: Plan

A detailed fix plan is written BEFORE any changes. For each proposed change:

- Component and file path
- What to modify
- Why this fixes the root cause

Phase 3: Impact Analysis (MANDATORY)

For each proposed change, evaluate across 4 dimensions:

A. ADLC Integrity Check: Does it alter pipeline sequence? Weaken review gates? Bypass safety mechanisms?

B. PDLC Integrity Check: Does it alter product lifecycle? Affect spec quality? Bypass validation?

C. Bias Detection: Does it favor one service type? Reduce observability? Make rollback harder?

D. Impact Score (6 dimensions, each 0-3):

Dimension Scale
ADLC pipeline integrity 0=no impact, 3=breaks pipeline
PDLC pipeline integrity 0=no impact, 3=breaks pipeline
Review quality 0=no impact, 3=weakens reviews
System generality 0=no impact, 3=becomes specific
Observability 0=no impact, 3=reduces visibility
Rollback safety 0=easy rollback, 3=irreversible
TOTAL sum / 18

Decision thresholds:

Total Score Decision
0-2 AUTO-APPLY – minimal impact, proceed
3-5 APPLY WITH MONITORING – low risk, watch closely
6-9 HUMAN REVIEW REQUIRED – post plan to Slack DM, wait for approval
10+ DO NOT APPLY – redesign the fix
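The thresholds reduce to a small pure function. A sketch (function name illustrative):

```shell
# Maps a total impact score (sum of six 0-3 dimensions, max 18) to the
# resolver decision from the table above.
impact_decision() {
    local total="$1"
    if   [ "$total" -le 2 ]; then echo "AUTO-APPLY"
    elif [ "$total" -le 5 ]; then echo "APPLY WITH MONITORING"
    elif [ "$total" -le 9 ]; then echo "HUMAN REVIEW REQUIRED"
    else                          echo "DO NOT APPLY"
    fi
}
```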

Phase 4: Apply (if impact acceptable)

What resolver CAN modify:

- Agent definitions (~/.claude/agents/*.md)
- Skill definitions (~/.claude/skills/*/SKILL.md)
- Dispatcher scripts (~/dev/ops/adlc-v2/scripts/*.sh, ~/dev/ops/pdlc/scripts/*.sh)
- Spec-writer templates
- CLAUDE.md orchestrator rules
- Provisioner behavior
- service-project-map.json

What resolver CANNOT modify:

- Application source code in ~/dev/projects/*/src/
- Spec content for specific services
- Review reports (read-only audit trails)
- Git history
- systemd units (escalate to human)

Phase 5: Post-Application Monitoring

After applying fixes:

  1. Record the fix in ~/dev/ops/outputs/resolver-fixes.log
  2. Write monitoring criteria to ~/dev/ops/outputs/resolver-monitor-$BLOCKER_ID.json
  3. On the next resolver scan (30 min): read monitor files, execute check commands, compare against success criteria
  4. If success: mark resolved, clean up
  5. If failed: execute the rollback plan, escalate to Slack DM

The dispatcher also checks monitoring files: if monitorUntil timestamp has passed, it spawns the resolver for verification.

Phase 6: Documentation

Write the resolution report to ~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json with:

- Blocker description, services affected, duration blocked
- Root cause chain (immediate, agent gap, spec gap, template gap, pipeline gap, design gap)
- Fix details (plan, files modified, impact score)
- Monitoring results (expected vs actual outcome)
- Lessons learned, prevention measures

Append summary to ~/dev/ops/reviews/resolver/changelog.md.


10. Slack Channels and Communication

Channel Map

| Channel | ID | Purpose | Pipeline |
| --- | --- | --- | --- |
| ADLC | C0AN0N8AUGZ | Pipeline commands, progress milestones, kanban | ADLC |
| PDLC | C0AN42N3C0L | PM commands, product updates, PDLC kanban | PDLC |
| DM | D0AGRAVEC1K | Blockers, human review, human interaction | Both |

Token Loading

source ~/.env.adlc 2>/dev/null || source ~/.env.openclaw 2>/dev/null

Message Formats by Type

Blocker Notification (DM)

:rotating_light: BLOCKED -- {service}/{task}
Reason: {reason}
Action needed: {what the human should do}

External Blocker (DM)

:key: EXTERNAL BLOCKER -- {service}/{task}
Category: {credentials|infrastructure|deployment|external-api|network|permissions}
Missing: {specific resource or credential}
Spec reference: {where in spec.md this is mentioned}
Impact: {what cannot proceed without this}
Action needed: {exact steps for the human}

Human Review Required (DM)

:eyes: HUMAN REVIEW -- {service}/{task}
Context: {summary}
Options: {what the human can reply}

Progress Milestone (ADLC Channel)

:white_check_mark: {service} -- {milestone}
{brief details}

Agent Results (ADLC Channel)

[BA] $SERVICE: $STATUS -- $CRITERIA_MET/$CRITERIA_TOTAL criteria met. $DEVIATIONS deviations.
[ARCHITECT] $SERVICE: $VERDICT -- $CHECKS_PASSED/8 checks passed.
[SECURITY] $SERVICE: $STATUS -- OWASP $SCORE/10, severity=$SEVERITY, $FINDINGS findings.
[E2E] $SERVICE: $VERDICT -- API: $PASSED/$TOTAL, Browser: $PASSED/$TOTAL.
[PROVISIONER] $SERVICE: $VERDICT -- Coolify=$X, DB=$Y, Bucket=$Z.
[AUDIT] $VERDICT -- $N services. $FINDINGS findings ($CRITICAL critical, $HIGH high).

PDLC Notifications (PDLC Channel)

[DISCOVERY] $OPPORTUNITY -- $RECOMMENDATION ($PRIORITY). Impact: $IMPACT.
[SPEC] $SERVICE spec written -- $AC_COUNT acceptance criteria, $ENDPOINT_COUNT endpoints.
[PRIORITY] $PROJECT backlog re-prioritized -- $TOTAL items scored. Top 3: $TOP3.
[VALIDATION] $SERVICE -- $VERDICT. Effort: $EFFORT. Risks: $RISK_COUNT.
[GTM] $SERVICE brief ready -- rollout plan, docs checklist, success metrics.
[ANALYTICS] $SERVICE -- $KPI_COUNT KPIs, $EVENT_COUNT events, $DASHBOARD_COUNT dashboards.

PDLC to ADLC Handoff (ADLC Channel)

:package: PDLC->ADLC handoff: $SERVICE spec ready for development.

Resolver Notifications

:mag: RESOLVER analyzing systemic blocker: $DESCRIPTION ($N services affected)
:wrench: RESOLVER applying fix: $SUMMARY (impact $SCORE/18 -- auto-approved)
:eyes: RESOLVER needs approval: $SUMMARY (impact $SCORE/18) [DM]
:white_check_mark: RESOLVER fix verified: $BLOCKER_ID -- $N services unblocked.
:rotating_light: RESOLVER fix FAILED: $BLOCKER_ID -- rolling back. $REASON

Blocker Resolution Flow

  1. System detects blocker, posts to DM with :key: or :rotating_light: format
  2. Human fixes the issue externally (adds credentials, creates resource, etc.)
  3. Human responds in Slack DM with one of: resolved, fixed, done, c'est fait, ok, credentials added, token ajoute, cle ajoutee, {service} unblocked
  4. Slack bridge (ADLC or PDLC) detects message, routes to orchestrator
  5. Orchestrator’s slack-bridge skill clears the service’s blocked status files
  6. Dispatcher’s next 5-min cycle detects missing status files, resumes pipeline for the service

Slack Bridge Intent Routing (ADLC)

| Pattern | Action |
| --- | --- |
| status, etat, avancement | Run /status, post summary |
| kanban, board | Run /kanban, post board |
| pause, stop, arreter | Pause all agents |
| resume, reprendre, go | Resume pipeline |
| deploy {service} | Spawn DevOps deploy mode |
| merge {service} | Create PR and merge |
| rollback {service} | Roll back staging |
| onboard {project} | Queue generate-specs |
| launch {service} | Queue dev-task |
| audit | Spawn auditor |
| fix {service} {desc} | Spawn dev agent with fix |
| review {service} | Spawn BA + Architect + Security + DevOps |
| test {service} | Run test-runner.sh |
| e2e {service} | Spawn scenario + e2e-test |
| logs {service} | Read/summarize review reports |

Slack Bridge Intent Routing (PDLC)

| Pattern | Action |
| --- | --- |
| discover {topic} | Spawn discovery agent |
| spec {service} | Spawn spec-writer |
| prioritize, backlog | Spawn prioritization |
| validate {service} | Spawn validation |
| gtm {service} | Spawn GTM |
| analytics {service} | Spawn analytics |
| handoff {service} | Trigger ADLC handoff |
| status, pipeline | Post pipeline status |
| adlc status | Read and summarize ADLC state |
adlc status Read and summarize ADLC state

11. Known Issues and Mitigations

Context Window Exhaustion

Problem: Claude sessions accumulate context over hours, eventually degrading quality or crashing.

Mitigations:

- Daily restart at 4am UTC – kills and restarts both tmux sessions, runs /boot to reconstruct state from disk files
- session-health.sh (1 min) – monitors session age; if > 12h AND > 200 interactions, restarts when no agents are running
- Interaction counter (~/dev/ops/outputs/claude-interactions.count) – tracks cumulative injections
- Idle detection – if the pane is unchanged for 15 min with no agents, the session is stuck; kill and restart

Status File Corruption

Problem: Agents sometimes write non-standard status values (JSON blobs, service names, markdown).

Mitigations:

- read_status() validation – the dispatcher extracts the first word from each status file and checks it against the known list; corrupted files are auto-deleted and logged
- Resolver auto-trigger – if 2+ corrupted files are detected in a single scan, the resolver agent is spawned to trace which agent is writing the invalid format and fix its prompt
- Stale RUNNING detection – any RUNNING status file older than 30 minutes is auto-deleted (agent likely crashed)

Agent Writes Wrong Format

Problem: An agent writes a status file that doesn’t match the expected STATUS | date | agent | service | details format.

Mitigations:

- Dispatcher validation – read_status() only accepts known keywords; anything else is treated as corrupted
- JSON report fallback – if no .status file exists but a JSON review report does, the dispatcher recovers status from JSON (ba-report.json status=compliant -> ba.status = DONE)
- Resolver – traces which agent prompt is producing the invalid format and fixes the agent definition

Claude Acts Instead of Delegating

Problem: The orchestrator Claude writes code, runs tests, or does work it should delegate to subagents.

Mitigations:

- CLAUDE.md absolute rules – 8 rules that MUST NEVER be broken:
  1. NEVER edit source code
  2. NEVER run tests (test-runner.sh does that)
  3. NEVER create files in service directories
  4. NEVER run cargo/npm/pnpm/go/dotnet/node
  5. NEVER commit or push
  6. NEVER loop or schedule itself
  7. NEVER run build commands
  8. NEVER run PDLC skills (those are for the other orchestrator)
- Auditor agent – checks git history for commits by the supervisor and pipeline.log for direct file edits

PDLC Skills Executed by ADLC

Problem: ADLC orchestrator accidentally runs /check-pdlc or /boot-pdlc, which belong to the PDLC orchestrator.

Mitigation: CLAUDE.md Rule 8 explicitly states: > NEVER run /check-pdlc or /boot-pdlc – those are PDLC skills for the other orchestrator (ods-pdlc tmux). You are ADLC only. Your skills: /check-pipeline, /boot, /status, /kanban, /dev-task, /slack-bridge, /registry.

Circuit Breaker Exhaustion

Problem: A service fails 3+ times and gets permanently blocked.

Mitigations:

- Crash counter ($SERVICE.crashes) tracked per service
- After 3 crashes: the service is skipped by the dispatcher and BLOCKED is posted to Slack DM
- Human can reset: reply {service} unblocked in Slack DM to clear the status files
- Resolver can analyze systemic causes across multiple blocked services

Memory Pressure

Problem: Too many concurrent Claude agents exhaust server RAM.

Mitigations:

- Before spawning agents, the orchestrator checks awk '/MemAvailable/ {print int($2/1024)}' /proc/meminfo
- If > 2000MB: spawn freely, all pending work in parallel
- If < 2000MB: queue new spawns, wait for running agents to finish, post to Slack DM
- If < 512MB (critical): session-health.sh kills the largest non-supervisor Claude process
- Subagent design: review agents use sonnet (lighter) rather than opus; only dev, spec-writer, and resolver use opus
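The gate can be sketched as two small helpers (thresholds from the mitigation list; function names illustrative):

```shell
# MemAvailable in MB, read from /proc/meminfo (or a test file).
mem_available_mb() {
    awk '/MemAvailable/ {print int($2/1024)}' "${1:-/proc/meminfo}"
}

# Spawn decision from the thresholds above.
spawn_policy() {
    local mb="$1"
    if   [ "$mb" -lt 512 ];  then echo "CRITICAL"   # kill largest non-supervisor Claude
    elif [ "$mb" -lt 2000 ]; then echo "QUEUE"      # wait for running agents, notify DM
    else                          echo "SPAWN"      # spawn freely
    fi
}
```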

Duplicate Agent Spawning (PDLC)

Problem: PDLC dispatcher could spawn duplicate agents for the same service+stage.

Mitigation: check-pdlc skill implements anti-duplicate gates:

check_agent_status() {
  local status_file=~/dev/ops/outputs/${service}-${agent_type}.status
  # RUNNING -> do NOT spawn
  # DONE -> do NOT re-run
  # FAILED (<3 attempts) -> retry once
  # FAILED (>=3) -> BLOCKED
  # NONE -> eligible to spawn
}

Merge Conflicts During Branch Consolidation

Problem: Feature branches conflict when merged into dev.

Mitigation: The dispatcher tries git merge --no-edit. On failure:

- git merge --abort
- Log the conflict
- Post to Slack DM: :warning: Merge conflict: $SERVICE/$branch needs manual resolution
- Skip that branch, continue with others
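The merge-or-skip step can be sketched as follows (run inside the service checkout; logging and Slack posting elided, and the function name is illustrative):

```shell
# Try to merge one feature branch; on conflict, abort the merge and
# report failure so the caller can skip the branch and continue.
consolidate_branch() {
    local branch="$1"
    if git merge --no-edit "$branch" >/dev/null 2>&1; then
        return 0
    fi
    git merge --abort >/dev/null 2>&1
    echo "merge conflict: $branch needs manual resolution" >&2
    return 1
}
```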


12. File System Map

Root Directories

Path Purpose
~/dev/ Working root for all development
~/dev/projects/ Git repositories for each service
~/dev/specs/ Project specs, backlogs, roadmaps, PDLC artifacts
~/dev/ops/ Operations: scripts, agents, reviews, outputs
~/.claude/ Claude Code configuration and agent memory

Operations (~/dev/ops/)

Path Purpose
~/dev/ops/outputs/ Status files, pipeline log, crash counters, dispatcher state
~/dev/ops/outputs/pipeline.log ADLC dispatcher log (all transitions, injections, errors)
~/dev/ops/outputs/pdlc-pipeline.log PDLC dispatcher log
~/dev/ops/outputs/session-health.log Session health monitor log
~/dev/ops/outputs/external-blockers.log External blocker history
~/dev/ops/outputs/resolver-fixes.log Resolver fix history
~/dev/ops/outputs/$SERVICE-$STAGE.status Per-service per-stage status files
~/dev/ops/outputs/$SERVICE.crashes Per-service crash counter
~/dev/ops/outputs/dispatcher-run-count ADLC dispatcher run counter (for kanban every 6th)
~/dev/ops/outputs/pdlc-dispatcher-run-count PDLC dispatcher run counter
~/dev/ops/outputs/session-health-state Idle counter and hash state
~/dev/ops/outputs/claude-interactions.count Cumulative interaction counter
~/dev/ops/outputs/resolver-last-run.ts Timestamp of last resolver execution
~/dev/ops/outputs/resolver-monitor-*.json Post-fix monitoring criteria
~/dev/ops/reviews/ Review reports directory
~/dev/ops/reviews/$SERVICE/ Per-service review reports
~/dev/ops/reviews/$SERVICE/ba-report.json BA review report
~/dev/ops/reviews/$SERVICE/architect-report.json Architect review report
~/dev/ops/reviews/$SERVICE/security-report.json Security review report
~/dev/ops/reviews/$SERVICE/devops-report.json DevOps review report
~/dev/ops/reviews/$SERVICE/deploy-report.json Deploy report
~/dev/ops/reviews/$SERVICE/e2e-report.json E2E test report
~/dev/ops/reviews/$SERVICE/provisioner-report.json Provisioner report
~/dev/ops/reviews/$SERVICE/screenshots/ E2E browser test screenshots
~/dev/ops/reviews/$SERVICE/last-reviewed-commit.txt Last commit reviewed by BA
~/dev/ops/reviews/$SERVICE/last-fail-commit.txt Commit at time of last review failure
~/dev/ops/reviews/adlc-audit-YYYYMMDD-HHMM.json Periodic audit reports
~/dev/ops/reviews/resolver/ Resolver resolution reports and changelog
~/dev/ops/reviews/resolver/RES-$DATE-$SEQ.json Individual resolution reports
~/dev/ops/reviews/resolver/changelog.md Resolver fix history
~/dev/ops/agents/service-project-map.json Service -> project mapping (THE registry)
~/dev/ops/coolify/$SERVICE.json Coolify deployment config per service
~/dev/ops/external-deps.md Known external dependencies table
~/dev/ops/slack-inbox/ ADLC Slack message inbox
~/dev/ops/slack-inbox/processed/ Processed ADLC Slack messages
~/dev/ops/pdlc-slack-inbox/ PDLC Slack message inbox
~/dev/ops/pdlc-slack-inbox/processed/ Processed PDLC Slack messages

Scripts

| Path | Purpose | Interval |
| --- | --- | --- |
| ~/dev/ops/adlc-v2/scripts/dispatcher-v3.sh | ADLC bash dispatcher | 5 min (systemd) |
| ~/dev/ops/adlc-v2/scripts/session-health.sh | Claude session health monitor | 1 min (systemd) |
| ~/dev/ops/adlc-v2/scripts/test-runner.sh | Bash test runner (zero tokens) | On demand |
| ~/dev/ops/adlc-v2/scripts/slack-bridge.sh | ADLC Slack polling bridge | Continuous |
| ~/dev/ops/pdlc/scripts/dispatcher-pdlc.sh | PDLC bash dispatcher | 10 min (systemd) |
| ~/dev/ops/pdlc/scripts/pdlc-slack-bridge.sh | PDLC Slack polling bridge | Continuous |

Agent Definitions

Path Agent
~/.claude/agents/dev.md Developer agent
~/.claude/agents/ba.md Business Analyst agent
~/.claude/agents/architect.md Architect review agent
~/.claude/agents/security.md Security review agent
~/.claude/agents/devops.md DevOps review/deploy agent
~/.claude/agents/pr.md PR creation agent
~/.claude/agents/provisioner.md Infrastructure provisioner agent
~/.claude/agents/scenario.md E2E scenario generator agent
~/.claude/agents/e2e-test.md E2E test executor agent
~/.claude/agents/auditor.md ADLC compliance auditor agent
~/.claude/agents/resolver.md Systemic problem resolver agent
~/.claude/agents/discovery.md Product discovery agent
~/.claude/agents/spec-writer.md Spec writer agent
~/.claude/agents/prioritization.md Backlog prioritization agent
~/.claude/agents/validation.md Spec validation agent
~/.claude/agents/gtm.md Go-to-market agent
~/.claude/agents/analytics.md Analytics/KPI agent

Skill Definitions

| Path | Skill | Pipeline |
| --- | --- | --- |
| ~/.claude/skills/dev-task/SKILL.md | /dev-task – develop a feature | ADLC |
| ~/.claude/skills/boot/SKILL.md | /boot – daily context rebuild | ADLC |
| ~/.claude/skills/status/SKILL.md | /status – show current state | ADLC |
| ~/.claude/skills/kanban/SKILL.md | /kanban – visual kanban board | ADLC |
| ~/.claude/skills/registry/SKILL.md | /registry – maintain service-project-map.json | ADLC |
| ~/.claude/skills/check-pipeline/SKILL.md | /check-pipeline – scan and advance pipeline | ADLC |
| ~/.claude/skills/slack-bridge/SKILL.md | /slack-bridge – interpret Slack messages | ADLC |
| ~/.claude/skills/boot-pdlc/SKILL.md | /boot-pdlc – PDLC context rebuild | PDLC |
| ~/.claude/skills/pdlc-bridge/SKILL.md | /pdlc-bridge – interpret PDLC Slack messages | PDLC |
| ~/.claude/skills/check-pdlc/SKILL.md | /check-pdlc – scan and advance PDLC pipeline | PDLC |

Project Specs (~/dev/specs/$PROJECT/)

Path Purpose
~/dev/specs/$PROJECT/PROJECT.md Project definition
~/dev/specs/$PROJECT/context/architecture.md Architecture decisions
~/dev/specs/$PROJECT/context/business-rules.md Business rules
~/dev/specs/$PROJECT/gestion/progress.md Dev progress tracking
~/dev/specs/$PROJECT/gestion/backlog.md Prioritized backlog (RICE scored)
~/dev/specs/$PROJECT/gestion/roadmap.md Timeline / roadmap
~/dev/specs/$PROJECT/specs/$SERVICE/spec.md Service specification
~/dev/specs/$PROJECT/pdlc/pipeline-state.md PDLC pipeline state per feature
~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.md Discovery opportunity briefs
~/dev/specs/$PROJECT/pdlc/opportunities/$NAME.json Discovery opportunity JSON summaries
~/dev/specs/$PROJECT/pdlc/validations/$SERVICE-validation.json Spec validation reports
~/dev/specs/$PROJECT/pdlc/handoffs/$SERVICE-handoff.json ADLC handoff files
~/dev/specs/$PROJECT/pdlc/gtm/$SERVICE-gtm.md GTM briefs
~/dev/specs/$PROJECT/pdlc/analytics/$SERVICE-kpis.md Analytics/KPI definitions
~/dev/specs/$PROJECT/pdlc/prioritization-$DATE.json Prioritization snapshots

Service Projects (~/dev/projects/$SERVICE/)

Path Purpose
~/dev/projects/$SERVICE/CLAUDE.md Service-specific Claude rules
~/dev/projects/$SERVICE/.env Environment variables (runtime)
~/dev/projects/$SERVICE/.env.example Env var documentation
~/dev/projects/$SERVICE/.env.staging Staging URL and config
~/dev/projects/$SERVICE/tests/e2e/scenarios.json API E2E test scenarios
~/dev/projects/$SERVICE/tests/e2e/browser-scenarios.json Browser E2E test scenarios
~/dev/projects/$SERVICE/tests/e2e/mock-data.sql E2E test data
~/dev/projects/$SERVICE/tests/e2e/cleanup.sql E2E cleanup script

Claude Memory (~/.claude/)

Path Purpose
~/.claude/agent-memory/pipeline/state.md ADLC pipeline state (read by both pipelines)
~/.claude/agent-memory/pipeline/last-audit.md Last audit summary
~/.claude/agent-memory/pipeline/last-audit-ts.txt Last audit timestamp
~/.claude/agent-memory/pipeline/run-count Check-pipeline run counter
~/.claude/agent-memory/pipeline/retries-$SERVICE.txt Per-service retry counter
~/.claude/agent-memory/pdlc-bridge/ PDLC bridge timestamps
~/dev/CLAUDE.md ADLC orchestrator rules

Infrastructure

Path Purpose
~/dev/ops/coolify/$SERVICE.json Coolify app config (UUID, URLs, registry)
~/dev/infra-registry/ Infrastructure asset registry (git repo)
~/.env.adlc ADLC environment (SLACK_BOT_TOKEN, COOLIFY_API_TOKEN, etc.)
~/.env.openclaw Fallback environment file
/tmp/dispatcher-v3.lock ADLC dispatcher lock file
/tmp/dispatcher-pdlc.lock PDLC dispatcher lock file

Database

Resource Connection
PostgreSQL postgres://ods:ods-dev-2026@127.0.0.1:5433/ods

GCP

Resource Value
Project ID ninth-park-452914-v8
Region europe-west1
Bucket naming ods-$SERVICE-staging


13. Innovation Pipeline

Architecture

Third pipeline alongside ADLC and PDLC. Feeds both with technology signals.

Agents (5)

| Agent | Model | Turns | Role |
| --- | --- | --- | --- |
| veille | sonnet | 30 | Daily tech watch: releases, CVEs, trending repos (top 10 rotation), ad-hoc URL review, security audit |
| benchmark | opus | 35 | Monthly comparison of ODS vs industry best practices |
| poc-builder | opus | 40 | Rapid PoC prototypes in ~/dev/pocs/ |
| innovation-scorer | sonnet | 20 | FIRE framework scoring + correlation with ODS backlog/specs/daily findings |
| adr-writer | sonnet | 15 | Architecture Decision Records |

Slack Channel

#Innovation (C0AMSKF5NCF), polled by the innovation bridge.

Commands (via Slack #Innovation)

Pattern Action
Any URL Ad-hoc veille review with security audit
veille, watch Run daily veille
benchmark Run monthly benchmark
poc {description} Build proof of concept
score {proposal} FIRE score + correlation
adr {decision} Write Architecture Decision Record
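A hedged sketch of how a bridge might map these message patterns to actions. Only the patterns come from the table above; `route_innovation_msg` and the action strings it emits are illustrative names, not the real bridge implementation:

```shell
# Illustrative routing sketch for #Innovation messages. The patterns
# mirror the command table; the function name and its outputs are
# hypothetical.
route_innovation_msg() {
  local msg=$1
  case $msg in
    *http://*|*https://*) echo "adhoc-veille-review" ;;       # Any URL
    veille|watch)         echo "run-veille" ;;                # Daily veille
    benchmark)            echo "run-benchmark" ;;             # Monthly benchmark
    "poc "*)              echo "build-poc: ${msg#poc }" ;;
    "score "*)            echo "fire-score: ${msg#score }" ;;
    "adr "*)              echo "write-adr: ${msg#adr }" ;;
    *)                    echo "ignore" ;;
  esac
}
```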

Timers

Timer Schedule Action
ods-innovation.timer Daily 07:00 UTC Run veille agent
ods-innovation-bridge.service Always-on Poll #Innovation every 30s
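A plausible shape for ods-innovation.timer under the schedule above; this is an illustrative sketch, as the real unit file is not shown in this document:

```ini
# Illustrative sketch of ods-innovation.timer (daily 07:00 UTC).
[Unit]
Description=Daily innovation veille run

[Timer]
OnCalendar=*-*-* 07:00:00 UTC
Persistent=true

[Install]
WantedBy=timers.target
```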

Output Format

FIRE Scoring Framework

Integration

14. CLI Tools (Deterministic Output)

Available Tools

CLI Purpose Validates
write-status.sh Status files Status ∈ enum (11 values), agent ∈ enum
write-review.sh JSON review reports Schema per agent type, verdict ∈ enum
write-lesson.sh Lessons learned All 6 fields non-empty (min 5 chars)
write-pipeline-state.sh Pipeline state State ∈ enum (27 values)
write-human-review.sh Human decision reports JSON schema, generates HTML + PDF + GDrive + Slack

Principle

Agents MUST use the CLI tools to write output files; they never use Write/Edit directly. Each CLI validates the format and rejects invalid input, so an agent has no choice but to produce correct output.
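A minimal sketch of the validate-and-reject pattern, in the spirit of write-status.sh; the function name and the enum subset below are illustrative, whereas the real CLI validates against the full 11-value status enum:

```shell
# Illustrative validate-and-reject sketch. The enum here is a made-up
# subset for demonstration, not the real status enum.
validate_status() {
  local status=$1
  case $status in
    RUNNING|DONE|FAILED|BLOCKED)
      return 0 ;;
    *)
      echo "invalid status: $status" >&2
      return 1 ;;
  esac
}
```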

15. Bridges (Slack → Orchestrator)

Architecture

All three bridges poll every 30s and inject messages directly into tmux via send-keys. The ADLC bridge additionally falls back to inbox files if its tmux session is down.

Bridge Channel Tmux Target Method
ADLC C0AN0N8AUGZ + D0AGRAVEC1K ods-claude Direct tmux + inbox fallback
PDLC C0AN42N3C0L ods-pdlc Direct tmux
Innovation C0AMSKF5NCF ods-claude Direct tmux
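A hedged sketch of the inject-with-fallback delivery path described above, assuming `deliver_msg`, `INBOX_DIR`, and the inbox filename as illustrative names:

```shell
# Illustrative sketch of bridge delivery: try to inject a message into
# the supervisor's tmux session with send-keys, and fall back to an
# inbox file if tmux (or the target session) is unavailable. All names
# here are hypothetical, not the real bridge code.
INBOX_DIR=${INBOX_DIR:-/tmp/bridge-inbox}

deliver_msg() {
  local session=$1 msg=$2
  if command -v tmux >/dev/null 2>&1 && tmux has-session -t "$session" 2>/dev/null; then
    # Direct injection: type the message into the session and press Enter.
    tmux send-keys -t "$session" "$msg" Enter
  else
    # Fallback: append to an inbox file polled by the orchestrator later.
    mkdir -p "$INBOX_DIR"
    printf '%s\n' "$msg" >> "$INBOX_DIR/$session.inbox"
  fi
}
```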

Known Issues

16. PDLC Adaptive Modes

The PDLC dispatcher auto-detects its mode based on pending work:

Mode Trigger Scan Interval Actions
Active Specs being written/validated/designed 10 min Full pipeline scan + ADLC monitoring + poke Claude
Monitoring All specs done, ADLC working 30 min Check ADLC state → GTM/Analytics/UI validation triggers only
Idle Nothing pending, all deployed Bridge only Process Slack inbox, no scan

Mode Detection Logic

pending = count(spec-writing + validating + UI_DESIGN + DISCOVERY in pipeline-state.md)
running = count(RUNNING in *-spec-writer.status, *-validation.status, *-ui-design.status)
monitoring_work = count(STAGING_DEPLOYED without GTM, PROD_DEPLOYED without Analytics)

if pending + running > 0 → active
elif monitoring_work > 0 → monitoring
else → idle
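The pseudocode above maps directly onto a small bash helper. This sketch is illustrative (`detect_mode` is a made-up name), with the three counts passed in as arguments rather than scanned from pipeline-state.md and the status files:

```shell
# Illustrative bash version of the mode-detection pseudocode above.
# The real dispatcher derives these counts by scanning the state files.
detect_mode() {
  local pending=$1 running=$2 monitoring_work=$3
  if (( pending + running > 0 )); then
    echo active
  elif (( monitoring_work > 0 )); then
    echo monitoring
  else
    echo idle
  fi
}
```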

Mode Transitions

Monitoring Triggers

ADLC State PDLC Action
STAGING_DEPLOYED + no GTM brief Spawn GTM agent
STAGING_DEPLOYED + has design brief + no UI review Spawn UI Designer Mode 2 (visual validation)
PROD_DEPLOYED + no Analytics KPIs Spawn Analytics agent
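The triggers above can be read as a simple decision function. A hedged sketch, where `pdlc_monitor_action` and its yes/no flags are illustrative stand-ins for the real state-file checks:

```shell
# Illustrative decision sketch for the monitoring triggers above. The
# real dispatcher reads ADLC state and artifact presence from files;
# this function and its flag arguments are hypothetical.
pdlc_monitor_action() {
  local adlc_state=$1 has_gtm=$2 has_design_brief=$3 has_ui_review=$4 has_kpis=$5
  case $adlc_state in
    STAGING_DEPLOYED)
      if [ "$has_gtm" = no ]; then
        echo "spawn-gtm"
      elif [ "$has_design_brief" = yes ] && [ "$has_ui_review" = no ]; then
        echo "spawn-ui-designer-mode2"
      else
        echo "none"
      fi ;;
    PROD_DEPLOYED)
      if [ "$has_kpis" = no ]; then echo "spawn-analytics"; else echo "none"; fi ;;
    *) echo "none" ;;
  esac
}
```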