Health Checks
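Stategraph provides two health check endpoints, a liveness probe and a readiness probe, plus a legacy endpoint kept for backwards compatibility. They are intended for deployment environments such as ECS/ALB, Kubernetes, and other container orchestrators.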
Endpoints
Liveness Probe
GET /health/live
Returns 200 OK as long as the nginx process is running. This endpoint is served directly by nginx and does not depend on the backend application or database.
Use this for:
- ALB target group health checks
- ECS container health checks
- Kubernetes liveness probes
Readiness Probe
GET /health/ready
Returns 200 OK only when the backend application is running and ready to serve requests. nginx proxies this endpoint to the backend, so it returns a non-200 status (502 in the startup sequence described under Startup Behavior) until the backend has started, for example while database migrations are running.
Use this for:
- Kubernetes readiness probes
- Load balancer routing decisions (only send traffic to ready instances)
- Monitoring systems
Legacy Endpoint
GET /api/v1/health
This endpoint is served by the backend application and behaves the same as /health/ready: it returns 200 OK only when the backend is running. It remains supported for backwards compatibility, but new deployments should use /health/live and /health/ready.
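To make the difference between the two probes concrete, here is a minimal sketch that queries both endpoints and maps the results to a coarse state. The function names, the state labels, and the base URL `http://localhost:8080` are illustrative assumptions, not part of Stategraph.

```python
import urllib.error
import urllib.request
from typing import Optional


def fetch_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as 502) are raised as HTTPError.
        return err.code
    except OSError:
        # Connection refused, DNS failure, timeout, and similar.
        return None


def classify(live_status: Optional[int], ready_status: Optional[int]) -> str:
    """Map the two probe results to a coarse, hypothetical state label."""
    if live_status != 200:
        return "down"        # nginx itself is not answering
    if ready_status != 200:
        return "not_ready"   # nginx is up, backend is not serving yet
    return "ready"


def check_instance(base_url: str = "http://localhost:8080") -> str:
    """Probe both endpoints on one instance (base_url is an assumption)."""
    live = fetch_status(f"{base_url}/health/live")
    ready = fetch_status(f"{base_url}/health/ready")
    return classify(live, ready)
```

Calling `check_instance()` against a running container would return one of the three labels; `classify` is kept separate so the mapping can be exercised without a network.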
Startup Behavior
During container startup, Stategraph runs database migrations before starting the backend HTTP server. The timeline looks like this:
t=0 Container starts
t=1 nginx starts listening on port 8080
t=1 /health/live returns 200
t=2 Database migrations begin
t=5+ Migrations complete, backend starts
t=5+ /health/ready returns 200
During the migration window:
- /health/live returns 200 (nginx is up)
- /health/ready returns 502 (backend not yet listening)
After migrations complete:
- /health/live returns 200
- /health/ready returns 200
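A deploy script that needs to wait out the migration window can poll /health/ready until it returns 200. The following is a minimal sketch under stated assumptions: the function names, the 180-second timeout, and the 2-second poll interval are illustrative choices, and the URL assumes the container is reachable on localhost port 8080.

```python
import time
import urllib.error
import urllib.request
from typing import Callable, Optional


def http_status(url: str, timeout: float = 5.0) -> Optional[int]:
    """Return the HTTP status code for url, or None if no response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code          # e.g. 502 while the backend is not listening
    except OSError:
        return None              # nginx not up yet, or the request timed out


def wait_until_ready(
    probe: Callable[[], Optional[int]],
    timeout_s: float = 180.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll probe() until it returns 200 or timeout_s elapses."""
    deadline = clock() + timeout_s
    while True:
        if probe() == 200:
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


def wait_for_stategraph(base_url: str = "http://localhost:8080") -> bool:
    """Convenience wrapper; base_url is an assumption for illustration."""
    return wait_until_ready(lambda: http_status(f"{base_url}/health/ready"))
```

The probe is passed in as a callable so the loop can be tested with a fake sequence of status codes instead of a live server.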
ALB / ECS Configuration
Recommended Settings
| Setting | Value | Reason |
|---|---|---|
| Health check path | /health/live | Available immediately, survives migration window |
| Health check interval | 30s | Standard interval |
| Healthy threshold | 2 | Two consecutive successes |
| Unhealthy threshold | 5 | Tolerant during startup |
| Health check grace period | 120s | Allow time for migrations on first deploy |
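As a rough check on these thresholds, the time an ALB needs to change a target's state is about the interval multiplied by the relevant threshold. The sketch below does that arithmetic; the function name is hypothetical, and it ignores the check timeout and where in the interval a failure begins, so treat the results as estimates.

```python
def state_change_window_s(interval_s: int, threshold: int) -> int:
    """Approximate seconds of consecutive results needed to flip a target's state."""
    return interval_s * threshold


# Values taken from the recommended settings table.
to_healthy = state_change_window_s(interval_s=30, threshold=2)    # 60
to_unhealthy = state_change_window_s(interval_s=30, threshold=5)  # 150
print(f"healthy after ~{to_healthy}s, unhealthy after ~{to_unhealthy}s")
```

With these settings a new target starts receiving traffic after roughly a minute of passing checks, and a failing target is removed after roughly two and a half minutes.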
ECS Task Definition
{
  "healthCheck": {
    "command": ["CMD-SHELL", "curl -f http://localhost:8080/health/live || exit 1"],
    "interval": 30,
    "timeout": 5,
    "retries": 5,
    "startPeriod": 120
  }
}
The startPeriod of 120 seconds gives migrations time to complete: health check failures during this window do not count toward the retries limit.
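The interaction between startPeriod and retries can be illustrated with a simplified model. This sketch is not ECS's implementation: it assumes checks run at fixed multiples of the interval, that failures inside the start period are ignored until a first success, and that the container is marked unhealthy once consecutive counted failures reach retries. The function name is hypothetical.

```python
from typing import Iterable


def ecs_container_health(
    results: Iterable[bool],
    interval_s: int = 30,
    retries: int = 5,
    start_period_s: int = 120,
) -> str:
    """Simplified model of how a sequence of check results maps to a status."""
    consecutive_failures = 0
    seen_success = False
    status = "UNKNOWN"
    for i, ok in enumerate(results, start=1):
        t = i * interval_s  # assumed time of the i-th check
        if ok:
            seen_success = True
            consecutive_failures = 0
            status = "HEALTHY"
            continue
        # Failures inside the start period are not counted until a check has passed.
        if t <= start_period_s and not seen_success:
            continue
        consecutive_failures += 1
        if consecutive_failures >= retries:
            return "UNHEALTHY"
    return status


# Four failures inside the 120s start period, then a success.
print(ecs_container_health([False] * 4 + [True]))  # HEALTHY
# Nine straight failures: four ignored, five counted.
print(ecs_container_health([False] * 9))           # UNHEALTHY
```

Under this model, a container that never passes a check is marked unhealthy about startPeriod + retries × interval = 120 + 5 × 30 = 270 seconds after startup.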
ALB Target Group
resource "aws_lb_target_group" "stategraph" {
  # ...
  health_check {
    path                = "/health/live"
    interval            = 30
    timeout             = 5
    healthy_threshold   = 2
    unhealthy_threshold = 5
  }
}
Verifying Readiness Before Routing Traffic
To hold traffic until the backend is fully ready, use /health/ready as the ALB health check path instead. Because /health/ready fails during migrations, set a longer health check grace period (180s) and raise the unhealthy threshold, as in this example:
health_check {
  path                = "/health/ready"
  interval            = 30
  timeout             = 5
  healthy_threshold   = 2
  unhealthy_threshold = 10
}
Kubernetes Configuration
livenessProbe:
  httpGet:
    path: /health/live
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 10
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
  failureThreshold: 10