Observability — Langfuse, Pulse, status

Cavaridge™ observability has three feeds that work together: Langfuse for LLM traces and spend, Pulse for domain events, and the status registry for uptime and dependency health.

Every LLM call from every app routes through the AI gateway, and the gateway emits a trace to Langfuse with:

  • Tenant ID + source app
  • Model, provider, latency, token count
  • Per-request cost (input + output)
  • Hashed prompt fingerprint for similarity grouping
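A minimal sketch of the metadata envelope described above. The function names and field names here are hypothetical (the doc doesn't show the gateway's actual code); the fingerprint is assumed to be a SHA-256 hash of the normalized prompt, so similar requests group together without storing raw text.

```python
import hashlib

def prompt_fingerprint(prompt: str) -> str:
    """Hash the prompt so similar requests can be grouped without storing raw text.
    Whitespace is normalized so trivially different prompts collapse together."""
    normalized = " ".join(prompt.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]

def gateway_trace(tenant_id: str, source_app: str, model: str, provider: str,
                  latency_ms: float, input_tokens: int, output_tokens: int,
                  input_cost: float, output_cost: float, prompt: str) -> dict:
    """Assemble the per-request metadata the gateway attaches to a Langfuse trace."""
    return {
        "tenant_id": tenant_id,
        "source_app": source_app,
        "model": model,
        "provider": provider,
        "latency_ms": latency_ms,
        "tokens": input_tokens + output_tokens,
        "cost_usd": round(input_cost + output_cost, 6),  # input + output cost
        "prompt_fingerprint": prompt_fingerprint(prompt),
    }
```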

Langfuse is the single place to debug “why is this user’s spend high” or “why did this completion regress.”

User-journey milestones, domain verbs, and trial-lifecycle events flow through Pulse. The status page consumes Pulse to show “AEGIS scans completed in last hour” or “trial conversions today.”
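The shape of a Pulse event and the status-page query over it can be sketched as follows. The `PulseEvent` fields and event names are illustrative assumptions, not Pulse's actual schema; the point is that the status page only counts named domain events inside a time window.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

@dataclass
class PulseEvent:
    tenant_id: str
    name: str            # a domain verb, e.g. "trial_converted" (hypothetical names)
    at: datetime
    payload: dict = field(default_factory=dict)

def count_recent(events: list, name_prefix: str, now: datetime, window: timedelta) -> int:
    """What a status-page widget does to show e.g. scans completed in the last hour."""
    return sum(1 for e in events
               if e.name.startswith(name_prefix) and now - e.at <= window)
```

A "completed in last hour" widget is then just `count_recent(events, "aegis_scan", now, timedelta(hours=1))`.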

Status registry — uptime + dependency health

packages/status-registry/ exposes per-app health checks. Each app registers a /healthz that the registry probes. Cascade failures (e.g., AI gateway down → every dependent app degraded) are visible at a glance.
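The cascade behavior can be sketched as a walk over the dependency graph. This is a hypothetical model, not the registry's actual code: an app is "down" if its own /healthz probe failed, "degraded" if any transitive dependency is unhealthy, and "ok" otherwise.

```python
def effective_status(app: str, probe_ok: dict, deps: dict, _seen=None) -> str:
    """probe_ok maps app -> did its /healthz probe succeed;
    deps maps app -> list of apps it depends on."""
    if _seen is None:
        _seen = set()
    if app in _seen:                  # guard against dependency cycles
        return "ok"
    _seen.add(app)
    if not probe_ok.get(app, False):
        return "down"
    for dep in deps.get(app, []):
        if effective_status(dep, probe_ok, deps, _seen) != "ok":
            return "degraded"        # cascade: unhealthy dependency
    return "ok"
```

With the AI gateway down, every app that depends on it reports "degraded" rather than "ok", which is the at-a-glance view the registry provides.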

| Question | Look at |
| --- | --- |
| “Why is our LLM bill high?” | Langfuse → group by `tenant_id` + model |
| “Did the trial lifecycle email send?” | Pulse → `trial_*` events for that tenant |
| “Is AEGIS healthy?” | Status registry + AEGIS Pulse domain events |
| “What did this user do today?” | Activity feed (Pulse consumer) |
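The "group by `tenant_id` + model" query for the spend question amounts to a simple aggregation over trace records. A minimal sketch, assuming each trace carries the `tenant_id`, `model`, and `cost_usd` fields the gateway emits:

```python
from collections import defaultdict

def spend_by_tenant_model(traces: list) -> dict:
    """Sum per-request cost, keyed by (tenant_id, model)."""
    totals = defaultdict(float)
    for t in traces:
        totals[(t["tenant_id"], t["model"])] += t["cost_usd"]
    return dict(totals)
```

The tenant/model pair with the largest total answers "why is this user's spend high" directly.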
  • Don’t bypass the AI gateway — Langfuse won’t see the spend.
  • Don’t write to Pulse for high-frequency internal events. Reserve it for user-meaningful actions.