| AC-001 | CloudEvents v1.0 format with ODS extensions | Met | src/event.rs: CloudEvent struct, specversion='1.0', tenantid + correlationid required fields. validate() enforces all rules. 13 unit tests. |
| AC-002 | Async EventProducer publishing CloudEvents | Met | src/client/producer.rs: EventProducer wraps rdkafka FutureProducer. emit() + emit_raw(). Subject as Kafka key for partition affinity. 5s default timeout. 6 unit tests. |
| AC-003 | Async EventConsumer receiving CloudEvents | Met | src/client/consumer.rs: EventConsumer wraps StreamConsumer. 1 MB payload guard, malformed-message skip + warn, stream-ended error propagation. 7 tests. |
| AC-004 | Topic naming convention: events.{source} | Met | CloudEvent::topic() returns format!("events.{}", source). Verified by topic_follows_ods_convention test. |
| AC-005 | 20 ODS platform topics with correct retention/partition configs | Met | src/admin/topic.rs: 9 events (7d/3p/delete), 7 CDC (3d/1p/compact), 4 billing (30d/3p/delete). 13 unit tests verify counts, names, retention values, cleanup policies. |
| AC-006 | Idempotent topic provisioning on cluster | Met | src/admin/provisioning.rs: TopicProvisioner::provision() + provision_all(). Fetches existing topics, skips known ones. ProvisionResult::Created\|AlreadyExists\|Failed. validate_topic_configs() detects drift. 8 tests. |
| AC-007 | Cluster health checks | Met | src/admin/health.rs: HealthChecker::check() returns ClusterHealth with broker count and visible topics (internal _* filtered). topic_exists() helper. 4 tests. |
| AC-008 | Metrics collection (cluster, consumer lag, watermarks) | Met | src/monitoring/metrics.rs: MetricsCollector, ClusterMetrics, ConsumerGroupLag, PartitionLag. Watermark-delta message counting. 14 unit tests. |
| AC-009 | Alert evaluation with configurable thresholds | Met | src/monitoring/alerting.rs: AlertEvaluator with AlertThresholds (lag_warning:1000, lag_critical:10000, min_brokers:1, min_topics:20). evaluate_all() sorts critical-first. 17 tests including boundary conditions. |
| AC-010 | Prometheus text exposition format export | Met | src/monitoring/prometheus.rs: PrometheusExporter outputs HELP/TYPE/gauge lines for broker count, topic count, per-topic messages, per-partition watermarks. Configurable prefix. 10 tests. |
| AC-011 | Webhook-based alert forwarding (Slack-compatible) | Met | src/monitoring/webhook.rs: WebhookNotifier, WebhookPayloadBuilder::build_generic() + build_slack(). Sensitive headers (Authorization, X-Api-Key) redacted during serialization. 22 tests. |
| AC-012 | Cross-cluster replication (ActivePassive / ActiveActive) | Met | src/replication/replicator.rs: TopicReplicator::replicate_batch() consumes source, produces to destination. status() computes per-partition lag via watermark comparison. Mode: ActivePassive\|ActiveActive. |
| AC-013 | Topic filtering (All / Explicit / Prefix) | Met | src/replication/config.rs: TopicFilter::All\|Explicit(Vec)\|Prefix(String). Internal _* topics always excluded. filter_topics() helper. 7 tests. |
| AC-014 | Topic mapping (Identity / Prefixed) | Met | TopicMapping::Identity\|Prefixed(String) with map_topic(). Empty-prefix edge case tested. 3 tests. |
| AC-015 | Replication lag tracking across partitions | Met | src/replication/status.rs: ReplicationStatus + PartitionReplicationStatus with has_critical_lag(), max_lag(), topic_count(). 7 tests. |
| AC-016 | Structured error handling (all failure modes) | Met | src/error.rs: EventError with 6 variants via thiserror: Serialization, Send, Consume, InvalidEvent, Monitoring, Replication. 6 tests. |
| AC-017 | TLS/SASL security config for broker connections | Met | src/security.rs: SecurityConfig with Plaintext\|Ssl\|SaslPlaintext\|SaslSsl, SCRAM-SHA-256/512, with_client_cert() for mTLS. apply() writes to rdkafka::ClientConfig. 9 tests. |
| AC-018 | CI/CD pipeline (test + clippy + fmt + build + deploy) | Met | .github/workflows/ci-deploy.yml: cargo fmt --check, clippy -D warnings, cargo test, Docker push to registry.agirdigital.com, Coolify webhooks for dev/staging/prod. |
| AC-019 | Redpanda single-node production deployment | Met | docker-compose.yml: Redpanda v24.3.1, VPN-bound ports on 10.0.0.3, 2 SMP, 4 GB memory, healthcheck via rpk cluster health. Schema Registry + Console included. |
| AC-020 | Prometheus + Grafana observability stack | Met | docker-compose.yml: prom/prometheus:v2.51.2 (30-day retention), grafana/grafana:11.0.0. config/prometheus.yml scrapes /public_metrics + /metrics at 15s. |
| AC-021 | Shell script for topic creation via rpk | Met | scripts/create-topics.sh: creates all 20 topics with correct configs via rpk. Idempotent. Accepts optional BROKER argument. |
| AC-022 | Architecture Decision Records (ADR-001..005) | Met | docs/adr/: ADR-001 (crate design), ADR-002 (topic provisioning), ADR-003 (monitoring), ADR-004 (replication), ADR-005 (third-party integrations). All Accepted. |
| AC-023 | events.redpanda self-topic in registry | Met | Fixed in this commit. all_topics() now includes events.redpanda (9 events total, 20 overall). Test event_topic_names_match_spec() verifies its presence. Previously MISSING. |
| AC-024 | Grafana admin password mandatory enforcement | Met | Fixed in this commit. docker-compose.yml line 83 now uses ${GRAFANA_ADMIN_PASSWORD:?...}, which fails startup if unset. Previously PARTIAL (defaulted to 'admin'). |
| AC-025 | Integration tests with real Redpanda (testcontainers) | Missing | docker-compose.test.yml prepared but testcontainers absent from Cargo.toml. All 163 tests are unit-only. ADR-001 and ADR-004 explicitly defer this. Unchanged since prior review. |
| AC-026 | PostgreSQL schema with RLS by tenant_id | N/A | Infrastructure service: shared library crate. No application persistence layer. |
| AC-027 | REST API endpoints with JWT authentication | N/A | Infrastructure service: library crate with no HTTP server. |
| AC-028 | Audit trail (who/when/what for mutations) | N/A | Infrastructure service: no mutable application state. Audit trail applies to consuming application services. |
| AC-029 | Soft delete only | N/A | Infrastructure service: no database persistence layer. |
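
For readers checking the AC-004 and AC-005 evidence, the naming convention and the per-family topic settings can be sketched as below. This is a minimal, standalone illustration, not the code in src/admin/topic.rs: `TopicFamily`, `FamilyConfig`, `event_topic`, `total_topics`, and `retention_ms` are hypothetical names, and only the counts, retention periods, partition counts, cleanup policies, and the `events.{source}` format come from the table above.

```rust
use std::time::Duration;

/// Cleanup policy for a topic (the Kafka `cleanup.policy` setting).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CleanupPolicy {
    Delete,
    Compact,
}

/// Hypothetical grouping of the three topic families listed under AC-005.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TopicFamily {
    Events,
    Cdc,
    Billing,
}

/// Settings shared by every topic in one family (assumed shape).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct FamilyConfig {
    topic_count: usize,
    retention: Duration,
    partitions: i32,
    cleanup: CleanupPolicy,
}

const SECS_PER_DAY: u64 = 24 * 60 * 60;

impl TopicFamily {
    /// Values taken from the AC-005 row: 9 events (7d/3p/delete),
    /// 7 CDC (3d/1p/compact), 4 billing (30d/3p/delete).
    fn config(self) -> FamilyConfig {
        let (topic_count, days, partitions, cleanup) = match self {
            TopicFamily::Events => (9, 7, 3, CleanupPolicy::Delete),
            TopicFamily::Cdc => (7, 3, 1, CleanupPolicy::Compact),
            TopicFamily::Billing => (4, 30, 3, CleanupPolicy::Delete),
        };
        FamilyConfig {
            topic_count,
            retention: Duration::from_secs(days * SECS_PER_DAY),
            partitions,
            cleanup,
        }
    }
}

/// Same format string as the AC-004 row: `events.{source}`.
fn event_topic(source: &str) -> String {
    format!("events.{}", source)
}

/// Sum of topic counts across all families.
fn total_topics() -> usize {
    [TopicFamily::Events, TopicFamily::Cdc, TopicFamily::Billing]
        .iter()
        .map(|family| family.config().topic_count)
        .sum()
}

/// Retention in milliseconds, the unit used by the `retention.ms` topic config.
fn retention_ms(family: TopicFamily) -> u128 {
    family.config().retention.as_millis()
}

fn main() {
    for family in [TopicFamily::Events, TopicFamily::Cdc, TopicFamily::Billing] {
        let cfg = family.config();
        println!(
            "{:?}: {} topics, retention.ms={}, partitions={}, cleanup={:?}",
            family,
            cfg.topic_count,
            retention_ms(family),
            cfg.partitions,
            cfg.cleanup
        );
    }
    println!("total topics: {}", total_topics());
    println!("self-topic: {}", event_topic("redpanda"));
}
```

Keeping the settings per family rather than per topic makes the 9 + 7 + 4 = 20 total easy to assert in a unit test, which is the kind of count check the AC-005 and AC-023 rows describe.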