SOP-008: Security Findings Triage and Remediation

Purpose

Triage, prioritize, and remediate security findings from automated security reviews, dependency audits, and manual assessments. Ensure all findings are tracked to resolution and that critical vulnerabilities are escalated immediately.

Scope

Applies to all services in the ADLC pipeline. Covers security agent review reports, dependency CVEs, authentication/authorization issues, rate limiting bypasses, and data exposure risks.

Prerequisites

Security review report: ~/dev/ops/reviews/{service}/security-report.json
Slack credentials: source ~/.env.adlc
Understanding of the service’s authentication model (JWT with tenant_id)
Access to the service’s spec for expected security controls

Procedure

1. Read the security report

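python3 -c "
import json
# Load the report; replace {service} with the service name before running.
with open('$HOME/dev/ops/reviews/{service}/security-report.json') as fh:
    r = json.load(fh)
print(f'Status: {r.get(\"status\", \"?\")}')
print(f'Severity: {r.get(\"severity\", \"?\")}')
# List every check that did not pass (status compared case-insensitively).
for c in r.get('checks', []):
    if str(c.get('status', '')).upper() not in ('PASS', 'OK'):
        print(f'  FINDING: {c.get(\"id\", \"?\")} -- {c.get(\"description\", \"?\")} [{c.get(\"severity\", \"?\")}]')
"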

2. Classify findings by severity

Severity	Definition	SLA
CRITICAL	Active exploit possible, data breach imminent	Block pipeline, fix NOW
HIGH	Vulnerability exploitable with moderate effort	Fix before staging promotion
MEDIUM	Defense-in-depth gap, no immediate exploit	Fix before production
LOW	Best practice deviation, informational	Track, fix when convenient
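
The SLA column can be read as a promotion gate. The following is a minimal sketch, assuming findings are dicts with a severity key as in the report from step 1; the helper name blocks_promotion and the stage names are hypothetical, not part of the ADLC tooling:

```python
# Hypothetical mapping from pipeline stage to the severities that block it,
# following the SLA column of the severity table.
BLOCKING = {
    "pipeline": {"CRITICAL"},                      # CRITICAL blocks the pipeline outright
    "staging": {"CRITICAL", "HIGH"},               # HIGH must be fixed before staging
    "production": {"CRITICAL", "HIGH", "MEDIUM"},  # MEDIUM must be fixed before production
}


def blocks_promotion(findings, stage):
    """Return the findings whose severity blocks promotion to `stage`."""
    blocking = BLOCKING[stage]
    return [f for f in findings if str(f.get("severity", "")).upper() in blocking]
```

LOW findings never appear in the result; they are tracked but do not gate any stage.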

3. Triage common finding categories

A. Authentication / Authorization

Missing tenant_id filter (CRITICAL): Every database query MUST include tenant_id. Cross-tenant data access is the highest severity finding.

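# Check for queries without tenant_id
# Heuristic only: the match is per line, so a multi-line query whose tenant_id
# filter sits on a later line is still listed and needs manual review.
grep -rn "SELECT\|INSERT\|UPDATE\|DELETE" ~/dev/projects/{service}/src/ | grep -v "tenant_id" | grep -v "migration"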
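
Because the grep check works line by line, a statement-level check may cut down on false positives. The sketch below assumes the SQL is available as plain text with statements terminated by semicolons; the function name is hypothetical, and the naive split will misfire on semicolons inside string literals:

```python
import re

# Matches the same four statement keywords as the grep check.
_DML = re.compile(r"\b(SELECT|INSERT|UPDATE|DELETE)\b", re.IGNORECASE)


def statements_missing_tenant_id(sql_text):
    """Return the statements in sql_text that never mention tenant_id."""
    missing = []
    for stmt in sql_text.split(";"):
        stmt = " ".join(stmt.split())  # collapse newlines and indentation
        if _DML.search(stmt) and "tenant_id" not in stmt.lower():
            missing.append(stmt)
    return missing
```

A hit is a candidate for review, not proof of a cross-tenant leak; the filter could also be applied by a query builder or a view.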

JWT validation gaps:
Missing signature verification
Missing expiry check
Missing tenant_id extraction from claims
Hardcoded secrets (check .env files for actual keys in repo)
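
For reference when reviewing a service, the first three checks can be sketched with the standard library alone. This assumes HS256-signed tokens and a tenant_id claim at the top level of the payload; the function name validate_jwt is hypothetical, and a real service would normally rely on a maintained JWT library rather than this sketch:

```python
import base64
import hashlib
import hmac
import json
import time


def _b64url_decode(segment):
    # JWT segments are base64url without padding; restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def validate_jwt(token, secret, now=None):
    """Verify an HS256 token and return its claims, or raise ValueError."""
    try:
        header_b64, payload_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        signature = _b64url_decode(sig_b64)
    except Exception as exc:
        raise ValueError("malformed token") from exc

    # Pin the algorithm instead of letting the token choose it.
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        raise ValueError("unexpected algorithm")

    # Check 1: signature verification, with a constant-time comparison.
    signed = f"{header_b64}.{payload_b64}".encode()
    expected = hmac.new(secret, signed, hashlib.sha256).digest()
    if not hmac.compare_digest(signature, expected):
        raise ValueError("bad signature")

    claims = json.loads(_b64url_decode(payload_b64))
    if not isinstance(claims, dict):
        raise ValueError("claims are not an object")

    # Check 2: expiry must be present and in the future.
    current = time.time() if now is None else now
    exp = claims.get("exp")
    if not isinstance(exp, (int, float)) or exp <= current:
        raise ValueError("token expired or missing exp")

    # Check 3: tenant_id must be extractable from the claims.
    if not claims.get("tenant_id"):
        raise ValueError("missing tenant_id claim")
    return claims
```

The secret argument is expected to come from the environment at runtime, which ties into the fourth item: a key committed to the repo defeats the other three checks.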

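Open redirect via callbackUrl (lesson from 2026-03-23): Any login form accepting callbackUrl/returnTo/next must verify that the value is a same-site relative path (a single leading slash, not the protocol-relative //host form) and otherwise redirect to a fixed fallback:

// fallbackUrl is a placeholder for a fixed in-app path
const target = callbackUrl.startsWith('/') && !callbackUrl.startsWith('//') ? callbackUrl : fallbackUrl;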
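
The same rule can be sketched as a standalone function for use in tests or review tooling. The name safe_callback_url and the default fallback of "/" are assumptions; the extra backslash and control-character checks go beyond the rule above and reflect how browsers normalize URLs:

```python
def safe_callback_url(callback_url, fallback="/"):
    """Return callback_url only if it is a same-site relative path."""
    if not isinstance(callback_url, str):
        return fallback
    # Must start with exactly one "/": "//host" is protocol-relative.
    if not callback_url.startswith("/") or callback_url.startswith("//"):
        return fallback
    # Browsers treat "\" like "/", so "/\host" can also leave the site,
    # and URL parsers strip tabs and newlines before resolving.
    if "\\" in callback_url or any(ord(ch) < 0x20 for ch in callback_url):
        return fallback
    return callback_url
```

For example, safe_callback_url("/dashboard") returns the path unchanged, while an absolute URL such as "https://evil.example" yields the fallback.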

B. Rate Limiting

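x-forwarded-for bypass (lesson from 2026-03-23): Rate limiters MUST NOT key on the first x-forwarded-for value, because the client can set that header and rotate it to evade the limit. Behind Cloudflare, read cf-connecting-ip and fall back to x-forwarded-for only when it is absent: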

request.headers.get('cf-connecting-ip') ?? request.headers.get('x-forwarded-for')?.split(',')[0].trim()
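
A Python rendering of the same lookup order, useful for unit-testing a rate limiter's key function. It assumes headers arrive as a dict with lowercase names; client_ip is a hypothetical helper:

```python
def client_ip(headers):
    """Pick the IP a rate limiter should key on, preferring cf-connecting-ip."""
    cf_ip = (headers.get("cf-connecting-ip") or "").strip()
    if cf_ip:
        return cf_ip
    # Fallback for requests without the Cloudflare header.
    forwarded = headers.get("x-forwarded-for") or ""
    first = forwarded.split(",")[0].strip()
    return first or None
```

With both headers present, a spoofed x-forwarded-for value is ignored. The cf-connecting-ip value is only trustworthy if the origin accepts traffic exclusively from Cloudflare.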

C. Dependency Vulnerabilities

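# For Rust services
# cargo audit exits non-zero when it finds vulnerabilities, so test for the tool first
# instead of treating any failure as "not installed".
cd ~/dev/projects/{service} && if cargo audit --version >/dev/null 2>&1; then cargo audit; else echo "cargo-audit not installed -- check Cargo.toml manually"; fi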

# For Node services (--omit=dev replaces the deprecated --production flag on npm 8+)
cd ~/dev/projects/{service} && npm audit --omit=dev

Note: cargo audit may not be installed (lesson from 2026-03-22). Fall back to manual Cargo.toml review. Do not block the entire report for this.

Active CVE: CVE-2026-2005 (PostgreSQL pgcrypto HIGH) – upgrade to PostgreSQL 17.9 required.

D. Configuration / Secrets

.env files must NOT be committed to git
Check .env.example uses correct variable names (lesson: AUTH_SECRET not NEXTAUTH_SECRET for next-auth v5)
NODE_ENV must be production on staging (lesson from 2026-03-23)
next-auth signOut requires POST, not GET (lesson from 2026-03-23)

E. Code Quality Security Issues

Commented-out security controls are NOT fixes (lesson from 2026-03-23). Verify controls are active at runtime.
Stub modules not replaced (lesson from 2026-03-23). When a real auth module replaces a stub, grep for all imports of the stub.
Empty data files (lesson from 2026-03-23). Check content (wc -c), not just existence.
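
The empty-file check can be taken one step further than wc -c by also parsing the content. A minimal sketch, assuming the data files are JSON; data_file_problem is a hypothetical helper:

```python
import json
import os


def data_file_problem(path):
    """Return a short reason if the file is missing, empty, or holds no data; else None."""
    if not os.path.isfile(path):
        return "missing"
    if os.path.getsize(path) == 0:
        return "empty (0 bytes)"
    if path.endswith(".json"):
        try:
            with open(path, encoding="utf-8") as fh:
                data = json.load(fh)
        except ValueError:  # covers JSON syntax errors and bad encodings
            return "invalid JSON"
        # A file holding only {} or [] exists and is non-empty, but carries no data.
        if data in ({}, [], None, ""):
            return "JSON parses but contains no data"
    return None
```

A file containing only {} passes both an existence check and a byte-count check, which is why the parsed content is inspected as well.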

4. Escalate CRITICAL/HIGH findings

For CRITICAL findings, post immediately to Slack DM:

source ~/.env.adlc
curl -sf -X POST "https://slack.com/api/chat.postMessage" \
  -H "Authorization: Bearer $SLACK_BOT_TOKEN" \
  -H "Content-Type: application/json; charset=utf-8" \
  -d "$(python3 -c "
import json
print(json.dumps({
    'channel': 'D0AGRAVEC1K',
    'text': ':rotating_light: SECURITY CRITICAL -- {service}\nFinding: {finding_id} -- {description}\nRisk: {risk_description}\nRemediation: {proposed_fix}'
}))
")"

For CRITICAL findings, also generate a human review report:

cat << 'EOF' | bash ~/dev/ops/adlc-v2/scripts/cli/write-human-review.sh {service} security
{
  "title": "{finding_id}: {short_title}",
  "summary": "{description of the security finding and its impact}",
  "timeline": [...],
  "blocker": {
    "description": "{what is at risk}",
    "rootCause": "{why the vulnerability exists}",
    "impact": "{potential damage}",
    "severity": "CRITICAL"
  },
  "options": [
    {"label": "Fix now", "description": "Block pipeline, remediate immediately", "slackReply": "approved fix {finding_id}"},
    {"label": "Accept risk", "description": "Document and proceed to staging", "slackReply": "accepted risk {finding_id}"}
  ]
}
EOF

5. Remediate

Spawn dev agent for fix:

/agent dev "SERVICE: {service}. PROJECT: {project}. SECURITY FIX: {finding_id} -- {description}. Fix: {remediation_steps}. Spec: ~/dev/specs/{project}/specs/{service}/spec.md"

6. Re-audit after fix

Spawn ONLY the security agent (not all 4 reviews):

/agent security "SERVICE: {service}. PROJECT: {project}. RE-AUDIT after fix for {finding_id}. Write JSON report to ~/dev/ops/reviews/{service}/security-report.json"

Track retry count. After 3 retries, mark BLOCKED.
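
One way to make the retry rule explicit is a small state function. This sketch reads the rule as "a failed third re-audit marks the finding BLOCKED"; the RESOLVED and RETRY status names and the function itself are hypothetical, only BLOCKED comes from this procedure:

```python
MAX_RETRIES = 3  # limit stated in this step


def next_state(retries_so_far, audit_passed):
    """Decide what happens after one more re-audit.

    retries_so_far: re-audits already completed for this finding.
    Returns (new_retry_count, status).
    """
    count = retries_so_far + 1
    if audit_passed:
        return count, "RESOLVED"
    if count >= MAX_RETRIES:
        return count, "BLOCKED"
    return count, "RETRY"
```

The count is expected to be stored per finding, not per service, so that one stubborn finding does not consume the retries of another.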

7. Write findings using CLI

CLI="$HOME/dev/ops/adlc-v2/scripts/cli"
echo '{"id":"{finding_id}","severity":"{severity}","description":"{desc}","status":"OPEN","service":"{service}"}' | bash "$CLI/write-finding.sh" {service}

Verification

Security report status is clean, or lists concerns with no HIGH or CRITICAL findings: cat ~/dev/ops/reviews/{service}/security-report.json
All CRITICAL/HIGH findings have corresponding fix commits
Re-audit confirms fixes are active at runtime (not just commented code)
No secrets in git history: git log --all -p -- '*.env' '*.key' '*.pem'
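
The output of the git log command can be long, so a filter for obvious key material may help. This sketch only looks for PEM private-key headers on added lines of a patch; the function name is hypothetical, and it does not replace a dedicated secret scanner:

```python
import re

# PEM private keys start with a header such as "-----BEGIN RSA PRIVATE KEY-----".
PRIVATE_KEY_HEADER = re.compile(r"-----BEGIN (?:[A-Z]+ )*PRIVATE KEY-----")


def added_secret_lines(patch_text):
    """Return added lines in git patch output that contain a private-key header."""
    hits = []
    for line in patch_text.splitlines():
        # "+" marks an added line; "+++" is the file header, not content.
        if line.startswith("+") and not line.startswith("+++"):
            if PRIVATE_KEY_HEADER.search(line):
                hits.append(line[1:])
    return hits
```

An empty result means no PEM header was added in the scanned history; tokens and passwords in .env files still need a manual look.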

Rollback

Security fixes should not be rolled back. If a fix introduces a regression:

1. Keep the security fix
2. Fix the regression separately
3. Re-run both security audit and functional tests

If a finding is a false positive:

1. Document why in the security report
2. Add to an exceptions list: ~/dev/ops/reviews/{service}/security-exceptions.json
3. Re-run audit to confirm it passes with the exception noted
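
How exceptions are applied can be sketched as follows. The layout of security-exceptions.json is not specified here, so this assumes a list of objects with id and reason fields; both the layout and the helper name are assumptions:

```python
def split_exceptions(findings, exceptions):
    """Split findings into (active, excepted) using documented exceptions.

    findings:   list of dicts with an "id" key, as in the security report.
    exceptions: list of dicts with "id" and "reason" keys (assumed layout).
    """
    reasons = {e["id"]: (e.get("reason") or "").strip() for e in exceptions}
    active, excepted = [], []
    for finding in findings:
        reason = reasons.get(finding["id"])
        if reason:
            # Keep the justification next to the finding for the audit trail.
            excepted.append({**finding, "exception_reason": reason})
        else:
            # No entry, or an entry without a documented reason: still active.
            active.append(finding)
    return active, excepted
```

An exception without a reason is deliberately ignored, which enforces the first step of documenting why the finding is a false positive.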

References

Security reports: ~/dev/ops/reviews/{service}/security-report.json
CLI finding tool: ~/dev/ops/adlc-v2/scripts/cli/write-finding.sh
Human review tool: ~/dev/ops/adlc-v2/scripts/cli/write-human-review.sh
Lesson: cargo-audit not installed (2026-03-22) – flag and continue, don’t block
Lesson: x-forwarded-for rate limit bypass (2026-03-23) – use cf-connecting-ip behind Cloudflare
Lesson: Open redirect via callbackUrl (2026-03-23) – validate relative path
Lesson: Commented-out control is not a fix (2026-03-23) – verify runtime activation
Lesson: Parallel auth modules / stub not replaced (2026-03-23) – grep all imports
Lesson: next-auth v5 AUTH_SECRET not NEXTAUTH_SECRET (2026-03-23)
Lesson: next-auth signOut requires POST (2026-03-23)
Lesson: Empty JSON file is not the same as absent file (2026-03-23)