BA Review — redpanda

ODS Platform · Business Analyst Agent
INFRA
⚠ NON-COMPLIANT
Commit: 6f99451 · 2026-03-18
Total: 29 · Met ✅: 21 · Partial 🔶: 1 · Missing ❌: 2 · N/A: 5

Compliance Rate (applicable criteria)

21 of 24 applicable criteria met — 87.5% · 5 criteria N/A (infrastructure service)
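
The rate above follows from the tallies: N/A criteria are excluded from the denominator, and the single PARTIAL criterion does not count as met. A minimal sketch of that arithmetic, assuming a hypothetical `Tally` struct (the names are illustrative and not taken from the review tooling):

```rust
/// Tallies from the summary above; the struct and field names are illustrative only.
struct Tally {
    met: u32,
    partial: u32,
    missing: u32,
    not_applicable: u32,
}

impl Tally {
    /// All criteria, including those marked N/A.
    fn total(&self) -> u32 {
        self.met + self.partial + self.missing + self.not_applicable
    }

    /// Criteria that count toward the rate: everything except N/A.
    fn applicable(&self) -> u32 {
        self.total() - self.not_applicable
    }

    /// Percentage of applicable criteria fully met; PARTIAL counts as not met.
    fn compliance_rate(&self) -> f64 {
        if self.applicable() == 0 {
            return 0.0;
        }
        f64::from(self.met) * 100.0 / f64::from(self.applicable())
    }
}

fn main() {
    let tally = Tally { met: 21, partial: 1, missing: 2, not_applicable: 5 };
    // prints "21 of 24 applicable criteria met: 87.5%"
    println!(
        "{} of {} applicable criteria met: {:.1}%",
        tally.met,
        tally.applicable(),
        tally.compliance_rate()
    );
}
```
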
⚠ Correction from Previous Report

The previous BA review (same commit, stored at specs/.../ops/reviews/redpanda/) incorrectly applied application-service checks to this infrastructure library crate. REST API, PostgreSQL schema/RLS, audit trail, soft delete, and Dockerfile-as-binary were marked MISSING with HIGH severity. These are all N/A for an infrastructure service. With correct service type classification, no HIGH severity deviations exist. The crate is functionally complete for P0.

📋 Spec Source

No spec.md found at /dev/specs/ods-platform/specs/redpanda/spec.md. Acceptance criteria were derived from the service-specific CLAUDE.md, the global CLAUDE.md, PROJECT.md, and ADR-001 through ADR-005.

Acceptance Criteria
| ID | Criterion | Status | Evidence |
|---|---|---|---|
| AC-001 | CloudEvents v1.0 with ODS extensions (tenantid, correlationid) | MET | src/event.rs (18 unit tests) |
| AC-002 | Async EventProducer publishing to Redpanda | MET | src/producer.rs (5 unit tests) |
| AC-003 | Async EventConsumer receiving and deserializing CloudEvents | MET | src/consumer.rs (4 async tests) |
| AC-004 | Topic naming convention: events.{source} | MET | src/event.rs::topic() |
| AC-005 | 19 ODS topics defined (8 events, 7 CDC, 4 billing) with correct configs | MET | src/topic.rs, all_topics() (17 tests) |
| AC-006 | Idempotent topic provisioning on cluster | MET | src/provisioning.rs (6 tests) |
| AC-007 | Cluster health checks (broker count, topic availability) | MET | src/health.rs (4 tests) |
| AC-008 | Metrics collection (consumer lag, partition watermarks) | MET | src/metrics.rs (22 tests) |
| AC-009 | Alert evaluation with configurable thresholds and severity levels | MET | src/alerting.rs (17 tests) |
| AC-010 | Prometheus metrics exposition format export | MET | src/monitoring.rs, PrometheusExporter |
| AC-011 | Webhook alert forwarding (Slack-compatible payloads) | MET | src/monitoring.rs, WebhookNotifier |
| AC-012 | Cross-cluster replication (ActivePassive / ActiveActive) | MET | src/replication.rs (30 tests) |
| AC-013 | Topic filtering for replication (All / Explicit / Prefix) | MET | src/replication.rs, TopicFilter |
| AC-014 | Topic mapping for replication (Identity / Prefixed) | MET | src/replication.rs, TopicMapping |
| AC-015 | Replication lag tracking across partitions | MET | src/replication.rs, ReplicationStatus |
| AC-016 | Structured error handling covering all failure modes | MET | src/error.rs (6 variants, 6 tests) |
| AC-017 | CI/CD: test + clippy + fmt + build + push + deploy | MET | .github/workflows/ci-deploy.yml |
| AC-018 | Redpanda single-node production deployment on VPS #3 | MET | docker-compose.yml (v24.3.1, 4GB, 2 SMP) |
| AC-019 | Prometheus + Grafana observability stack deployed | MET | docker-compose.yml (prom:v2.51.2, grafana:11.0.0) |
| AC-020 | Shell script for topic creation via rpk | MET | scripts/create-topics.sh (19 topics, idempotent) |
| AC-021 | Architecture Decision Records for all major decisions | MET | docs/adr/, ADR-001..005 (all Accepted) |
| AC-022 | events.redpanda topic in topic registry | MISSING | Not in src/topic.rs all_topics(): 19 topics defined but the service's own topic is absent |
| AC-023 | PostgreSQL schema 'redpanda' with RLS by tenant_id | N/A | Infra service: no DB persistence layer required |
| AC-024 | REST API endpoints (Actix-web) with JWT auth | N/A | Infra service: shared library crate, no HTTP server |
| AC-025 | Audit trail: log who/when/what for every mutation | N/A | Infra service: no mutable application state |
| AC-026 | Soft delete only (no hard deletes) | N/A | Infra service: no persistence layer |
| AC-027 | Integration tests with testcontainers (real Redpanda) | MISSING | Deferred in ADR-001 and ADR-004; docker-compose.test.yml exists but no testcontainers tests |
| AC-028 | Dockerfile produces deployable service binary | N/A | Infra service: crate is [lib], distributed as Rust dependency |
| AC-029 | Grafana admin password secured (no default 'admin') | PARTIAL | docker-compose.yml defaults to 'admin' if GRAFANA_ADMIN_PASSWORD is unset |

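
AC-013 and AC-014 name the filter and mapping variants but not their behaviour. The sketch below shows one plausible reading of those names. The enum shapes, the `matches` and `target` methods, the `dr.` prefix, and the sample topic names are all assumptions made for illustration; the actual definitions in src/replication.rs are not shown in this review and may differ.

```rust
/// Hypothetical shape of the AC-013 filter; the real type may carry different fields.
enum TopicFilter {
    /// Replicate every topic.
    All,
    /// Replicate only the listed topics.
    Explicit(Vec<String>),
    /// Replicate topics whose name starts with the given prefix.
    Prefix(String),
}

impl TopicFilter {
    /// Returns true when `topic` should be replicated under this filter.
    fn matches(&self, topic: &str) -> bool {
        match self {
            TopicFilter::All => true,
            TopicFilter::Explicit(names) => names.iter().any(|n| n.as_str() == topic),
            TopicFilter::Prefix(prefix) => topic.starts_with(prefix.as_str()),
        }
    }
}

/// Hypothetical shape of the AC-014 mapping from source to target topic name.
enum TopicMapping {
    /// Target topic keeps the source name.
    Identity,
    /// Target topic is the source name with a prefix prepended.
    Prefixed(String),
}

impl TopicMapping {
    /// Returns the topic name to write to on the target cluster.
    fn target(&self, source: &str) -> String {
        match self {
            TopicMapping::Identity => source.to_string(),
            TopicMapping::Prefixed(prefix) => format!("{prefix}{source}"),
        }
    }
}

fn main() {
    // Sample topic names are placeholders, not entries from the ODS registry.
    let filter = TopicFilter::Prefix("events.".to_string());
    let mapping = TopicMapping::Prefixed("dr.".to_string());
    for topic in ["events.orders", "billing.invoices"] {
        if filter.matches(topic) {
            println!("{topic} -> {}", mapping.target(topic));
        }
    }
    // The remaining variants, exercised once each.
    assert!(TopicFilter::All.matches("billing.invoices"));
    assert!(!TopicFilter::Explicit(vec!["events.orders".to_string()]).matches("billing.invoices"));
    assert_eq!(TopicMapping::Identity.target("events.orders"), "events.orders");
}
```
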
Deviations
MEDIUM
Topic events.redpanda absent from topic registry. 19 topics defined in src/topic.rs, but the service's own operational topic is missing.
Spec: CLAUDE.md — Topic: events.redpanda (Redpanda, CloudEvents v1.0)
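
Once the topic is added, a regression check could stop it from silently dropping out of the registry again. The sketch below is only an illustration: `all_topics()` here is a stand-in returning plain names, since the real signature in src/topic.rs is not shown in this review, and `missing_topics` is a hypothetical helper.

```rust
/// Stand-in for the registry in src/topic.rs. The real `all_topics()` signature
/// and topic names are not shown in this review, so these entries are placeholders.
fn all_topics() -> Vec<&'static str> {
    vec![
        "events.example-a", // placeholder for an existing registry entry
        "events.example-b", // placeholder for an existing registry entry
        "events.redpanda",  // the entry AC-022 asks for
    ]
}

/// Returns the required topic names that the registry does not contain.
fn missing_topics(registry: &[&str], required: &[&str]) -> Vec<String> {
    let mut missing = Vec::new();
    for name in required {
        if !registry.iter().any(|have| have == name) {
            missing.push(name.to_string());
        }
    }
    missing
}

fn main() {
    let registry = all_topics();
    let missing = missing_topics(&registry, &["events.redpanda"]);
    // Fails loudly if the service's own topic is ever removed from the registry.
    assert!(missing.is_empty(), "registry is missing: {missing:?}");
    println!("registry contains events.redpanda");
}
```

In the crate itself the same assertion would more naturally live in a `#[test]` next to the existing topic tests.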
MEDIUM
No integration tests with a real Redpanda cluster. ADR-001 and ADR-004 explicitly defer testcontainers integration tests. docker-compose.test.yml exists but is unused by tests.
Spec: Global CLAUDE.md — TDD: Integration tests: Real PostgreSQL + real Redpanda (testcontainers)
LOW
Grafana admin password defaults to 'admin' when GRAFANA_ADMIN_PASSWORD env var is not set. Production risk if Coolify deployment omits this variable.
Spec: Global CLAUDE.md — Security-First Rules
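
One way to close this gap is a pre-deploy check that rejects an unset, blank, or default password before the stack starts. The sketch below assumes a hypothetical `check_grafana_password` helper that is not part of the reviewed crate; where such a check would run (CI step, deploy script) is also an assumption.

```rust
/// Hypothetical pre-deploy guard; not part of the reviewed crate.
/// Rejects a Grafana admin password that is unset, blank, or the default 'admin'.
fn check_grafana_password(value: Option<&str>) -> Result<(), String> {
    match value.map(str::trim) {
        None | Some("") => Err("GRAFANA_ADMIN_PASSWORD is not set".to_string()),
        Some("admin") => Err("GRAFANA_ADMIN_PASSWORD must not be the default 'admin'".to_string()),
        Some(_) => Ok(()),
    }
}

fn main() {
    // Literal inputs for demonstration; the last value is a made-up example.
    for candidate in [None, Some("admin"), Some("s3cr3t-example")] {
        println!("{candidate:?}: {:?}", check_grafana_password(candidate));
    }
    // In a deploy step the input would come from the environment instead:
    // check_grafana_password(std::env::var("GRAFANA_ADMIN_PASSWORD").ok().as_deref())
}
```

Docker Compose can also enforce this at interpolation time: writing the variable as `${GRAFANA_ADMIN_PASSWORD:?message}` makes Compose abort with the message when the variable is unset or empty, though that form does not catch an explicit value of 'admin'.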