Troubleshooting
Common issues and solutions when running Stategraph.
Deployment Issues
Container won't start
Symptoms: Server container exits immediately or keeps restarting.
Check logs:
docker compose logs server
Missing required environment variables
Error: Key_error "STATEGRAPH_UI_BASE"
Solution:
- Set all required environment variables (see Environment Variables)
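To find the variable behind a Key_error before starting the server, a pre-flight check along these lines may help. This is a minimal sketch: check_required_vars is a hypothetical helper, not part of Stategraph, and the names in the example call are only those mentioned on this page. The authoritative list is in Environment Variables.

```shell
# Hypothetical helper (not shipped with Stategraph): print each named
# variable that is unset or empty, and return non-zero if any is missing.
check_required_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup through eval keeps this portable to plain POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name"
      missing=1
    fi
  done
  return "$missing"
}

# Example call; the list is an assumption drawn from names on this page:
# check_required_vars STATEGRAPH_UI_BASE DB_HOST DB_PORT DB_USER DB_PASS DB_NAME
```

The check reads the current shell's environment, so run it in the same shell (or after loading the same env file) that supplies values to docker compose.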
Database connection failed
Error: Could not connect to database
Solution:
- Verify database is running and credentials are correct
Port already in use
Error: bind: address already in use
Solution:
- Stop the service using the port or change STATEGRAPH_PORT
Database connection errors
Symptoms: Server starts but can't connect to PostgreSQL.
Checklist:
- PostgreSQL container is healthy:
docker compose ps
- DB_HOST matches the service name (e.g., db for Docker Compose)
- DB_PORT is correct (default: 5432)
- DB_USER, DB_PASS, DB_NAME match PostgreSQL configuration
- Network connectivity between containers
Test connection:
docker compose exec server nc -zv db 5432
Health check failing
Symptoms: Container marked unhealthy, restarts repeatedly.
Check health endpoint:
curl http://localhost:8080/api/v1/health
Common causes:
- Database not ready yet (make the server wait with depends_on and condition: service_healthy, or increase the health check start_period or retries)
- Port mismatch between health check and actual port
- Internal service not starting
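When the cause is slow startup rather than a real failure, polling the health endpoint for a bounded time can tell the two apart. The sketch below assumes curl is installed and uses the URL shown above; HEALTH_URL, probe_health, and wait_for_health are hypothetical names introduced for illustration.

```shell
# Probe the health endpoint once. -f makes curl exit non-zero on HTTP errors.
# HEALTH_URL is a hypothetical override; the default is the URL from this page.
probe_health() {
  curl -fsS "${HEALTH_URL:-http://localhost:8080/api/v1/health}" >/dev/null 2>&1
}

# Retry the probe up to $1 times (default 30), sleeping $2 seconds (default 2)
# after each failed attempt.
wait_for_health() {
  attempts=${1:-30}
  delay=${2:-2}
  i=1
  while [ "$i" -le "$attempts" ]; do
    if probe_health; then
      echo "healthy after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "still unhealthy after $attempts attempt(s)" >&2
  return 1
}

# Example: wait_for_health 30 2
```

If the endpoint becomes healthy after a few attempts, the container's health check is probably firing too early. If it never does, look at the port and internal service causes instead.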
Authentication Issues
OAuth redirect errors
"redirect_uri_mismatch"
The callback URL doesn't match your OAuth provider configuration.
Solution:
1. Check STATEGRAPH_UI_BASE matches your access URL exactly
2. Add exact callback URL to OAuth provider:
- For Google: {STATEGRAPH_UI_BASE}/oauth2/google/callback
- For OIDC: {STATEGRAPH_UI_BASE}/oauth2/oidc/callback
3. Verify protocol (http vs https) matches
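Printing the exact callback URL that follows from your STATEGRAPH_UI_BASE makes it easier to compare against the provider's configured value character by character. A minimal sketch, where callback_url is a hypothetical helper and the path templates are the two listed above:

```shell
# Hypothetical helper: build the callback URL from a base URL and a
# provider name ("google" or "oidc"), following the templates on this page.
callback_url() {
  printf '%s/oauth2/%s/callback\n' "$1" "$2"
}

# Example: callback_url "$STATEGRAPH_UI_BASE" google
```

The helper does not normalize anything on purpose: a trailing slash or a wrong scheme in STATEGRAPH_UI_BASE shows up in the output exactly as the provider would see it.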
"invalid_client"
Client ID or secret is incorrect.
Solution:
- Verify credentials from OAuth provider dashboard
- Check for extra whitespace or newlines
- Regenerate secret if needed
Session not persisting
Symptoms: Login succeeds but immediately redirects back to login.
Causes:
1. URL mismatch: STATEGRAPH_UI_BASE differs from access URL
2. Cookie not set: Proxy stripping cookies, or SameSite issues
3. HTTPS mismatch: Accessing via http when configured for https
Solutions:
- Verify STATEGRAPH_UI_BASE exactly matches your browser URL
- Check for reverse proxy cookie handling
- Use consistent protocol
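The URL and protocol checks above amount to comparing the scheme, host, and port of STATEGRAPH_UI_BASE with those of the address in the browser. The sketch below does that comparison with plain string handling; origin_of and same_origin are hypothetical helpers, and they do not normalize letter case or default ports.

```shell
# Hypothetical helper: reduce a URL to scheme://host[:port].
origin_of() {
  url=$1
  scheme=${url%%://*}          # text before the first "://"
  rest=${url#*://}             # text after the first "://"
  hostport=${rest%%[/?#]*}     # cut at the first "/", "?" or "#"
  printf '%s://%s\n' "$scheme" "$hostport"
}

# Succeed only when both URLs share the same scheme, host, and port.
same_origin() {
  [ "$(origin_of "$1")" = "$(origin_of "$2")" ]
}

# Example (the second argument is the URL copied from the browser):
# same_origin "$STATEGRAPH_UI_BASE" "https://stategraph.example.com/login" \
#   || echo "STATEGRAPH_UI_BASE does not match the access URL"
```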
"Access denied" after authentication
Causes:
1. Email domain restriction: User email not in allowed domain
2. Google Groups: User not in required group
3. OAuth app not approved in organization
Solutions:
- Check STATEGRAPH_OAUTH_EMAIL_DOMAIN setting
- Verify group membership (Google Groups)
- Request app approval from organization admin
Terraform Backend Issues
"Failed to get existing workspaces"
Symptoms: terraform init fails with HTTP error.
Causes:
1. Stategraph server not running
2. URL incorrect
3. Network connectivity
4. Authentication failure
Solutions:
# Test connectivity
curl http://localhost:8080/api/v1/health
# Test with credentials
curl -H "Authorization: Bearer $STATEGRAPH_API_KEY" http://localhost:8080/api/v1/whoami
"Error acquiring the state lock"
Symptoms: Terraform can't acquire the state lock, usually because another process is holding it.
Solutions:
1. Wait for other operation to complete
2. If stuck, force unlock:
terraform force-unlock LOCK_ID
3. Check for crashed Terraform processes
"HTTP error: 401 Unauthorized"
Causes:
1. API key invalid or expired
2. Backend username not set to "session"
3. Token has leading/trailing whitespace
Solutions:
- Create a new API key
- Verify username = "session" in backend config
- Check token value for whitespace
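Stray whitespace is easy to miss by eye, especially a trailing newline picked up when copying a key. A small check like the following can confirm it; has_edge_whitespace is a hypothetical helper, and the example assumes the key is held in STATEGRAPH_API_KEY as in the curl command earlier in this section.

```shell
# Hypothetical helper: succeed if the value starts or ends with whitespace
# (space, tab, newline, carriage return, and similar).
has_edge_whitespace() {
  case "$1" in
    [[:space:]]*|*[[:space:]]) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
# has_edge_whitespace "$STATEGRAPH_API_KEY" && echo "token has stray whitespace"
```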
Large state timeout
Symptoms: Operations time out or fail for large state files.
Solutions:
1. Increase STATEGRAPH_CLIENT_MAX_BODY_SIZE (default: 512m)
2. Check reverse proxy timeouts
3. Consider splitting state into smaller files
UI Issues
Page won't load
Symptoms: Browser shows blank page or error.
Check:
1. Browser developer console for JavaScript errors
2. Network tab for failed requests
3. Server logs for backend errors
Solutions:
- Clear browser cache
- Try incognito/private mode
- Check CORS settings if UI is separate
Query returns no results
Symptoms: MQL query runs but returns empty.
Causes:
1. No matching data
2. Query syntax issue
3. Wrong table or column names
Debug steps:
-- Verify data exists
SELECT count(*) FROM instances
-- Check available types
SELECT DISTINCT r.type FROM resources r ORDER BY r.type
-- Simplify query
SELECT * FROM instances LIMIT 10
Graph won't render
Symptoms: Dependency graph blank or shows error.
Causes:
1. State has no resources
2. Very large state causing performance issues
3. Browser memory limitations
Solutions:
- Check state has resources in the list view
- Apply filters to reduce graph size
- Try a different browser
Performance Issues
Slow queries
Symptoms: MQL queries take a long time to execute.
Solutions:
1. Add LIMIT clause:
SELECT * FROM instances LIMIT 100
2. Use specific columns instead of *
3. Add filters early in query
4. Check database indexes
High memory usage
Causes:
1. Large state files
2. Many concurrent connections
3. Memory leak (report as bug)
Solutions:
- Increase container memory limits
- Reduce DB_MAX_POOL_SIZE
- Monitor and restart periodically if needed
Database connection exhaustion
Symptoms: "too many connections" errors.
Solutions:
1. Reduce DB_MAX_POOL_SIZE
2. Increase PostgreSQL max_connections
3. Check for connection leaks
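Choosing between a smaller pool and a larger max_connections is a matter of simple arithmetic. The sketch below assumes that each Stategraph server instance opens at most DB_MAX_POOL_SIZE connections, which this page does not confirm, and that PostgreSQL keeps its default of 3 superuser-reserved connections. pool_fits is a hypothetical helper.

```shell
# Hypothetical helper: succeed if the combined pools fit within PostgreSQL's limit.
#   $1 = number of server instances
#   $2 = DB_MAX_POOL_SIZE per instance (assumed to be a per-instance cap)
#   $3 = PostgreSQL max_connections
#   $4 = reserved connections (default 3, PostgreSQL's default
#        superuser_reserved_connections)
pool_fits() {
  replicas=$1
  pool=$2
  max_conn=$3
  reserved=${4:-3}
  [ $((replicas * pool)) -le $((max_conn - reserved)) ]
}

# Example: pool_fits 3 40 100 || echo "pools can exceed max_connections"
```

With PostgreSQL's default max_connections of 100, two instances with a pool of 40 fit (80 of 97 usable), while three do not (120).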
Gap Analysis Issues
"Not ready for gap analysis"
Symptoms: Gap analysis reports not ready.
Causes:
1. AWS Config not enabled
2. Aggregator not configured
3. Missing IAM permissions
Solutions:
- Enable AWS Config in your account
- Create a configuration aggregator
- Grant Stategraph required permissions
AWS resources not appearing
Causes:
1. AWS Config not recording resource types
2. Aggregator missing regions
3. Stale cache
Solutions:
- Verify AWS Config recording settings
- Check aggregator configuration
- Use source=no-cache to force refresh
Getting Help
Collect diagnostic information
Before reporting issues, gather:
- Server logs:
docker compose logs server > server.log 2>&1
- Environment (redact secrets):
docker compose config
- Version information:
docker compose images
- Health check output:
curl http://localhost:8080/api/v1/health
Reporting issues
Report issues at: https://github.com/stategraph/releases/issues
Include:
- Description of the problem
- Steps to reproduce
- Expected vs actual behavior
- Diagnostic information (above)
- Screenshots if applicable
Common Error Messages
| Error | Cause | Solution |
|---|---|---|
| Key_error "..." | Missing environment variable | Set the required variable |
| Connection refused | Service not running | Start the service |
| 401 Unauthorized | Invalid credentials | Check token/session |
| redirect_uri_mismatch | OAuth URL mismatch | Fix callback URL |
| Lock held by another | Concurrent Terraform | Wait or force-unlock |
| too many connections | Pool exhausted | Reduce pool size |