Google Cloud Run

Deploy Stategraph on Google Cloud Run as a serverless, fully managed container. This guide pairs the Stategraph server image with a managed Cloud SQL for PostgreSQL database, connected through a Cloud SQL Auth Proxy sidecar, with the database password stored in Secret Manager.

Prerequisites

Before deploying Stategraph on Cloud Run, ensure you have:

  • gcloud CLI installed and authenticated (gcloud auth login)
  • A Google Cloud project with billing enabled
  • Owner or Editor on the project (or the equivalent granular roles for Cloud Run, Cloud SQL, Secret Manager, and Artifact Registry)
  • Docker installed locally (only needed to mirror the image into Artifact Registry — see below)
  • Access to the Stategraph server image — it's distributed privately so we can make sure your team has the support it needs; reach out and we'll get you set up

Cloud Run only pulls from Artifact Registry, GCR, or Docker Hub

Cloud Run cannot deploy an image directly from an arbitrary registry such as ghcr.io — it accepts images only from Artifact Registry (*-docker.pkg.dev), Container Registry (gcr.io), or Docker Hub (docker.io). Mirror the Stategraph image into Artifact Registry in your project first (a one-time copy), then point Cloud Run at the Artifact Registry path.
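As a pre-flight check, the host restriction above can be expressed as a small shell function. This is a minimal sketch, not part of Stategraph or gcloud: the name is_cloud_run_image_host is hypothetical, and it assumes the image reference always starts with an explicit registry host.

```shell
# Hypothetical helper: succeed only if the image reference starts with a
# registry host that Cloud Run accepts (Artifact Registry, Container Registry,
# or Docker Hub). Assumes the reference includes an explicit host.
is_cloud_run_image_host() {
  host="${1%%/*}"   # everything before the first slash
  case "$host" in
    gcr.io|*.gcr.io|docker.pkg.dev|*-docker.pkg.dev|docker.io) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_cloud_run_image_host "$IMAGE" || echo "mirror this image first" flags a ghcr.io reference before a deploy is attempted.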

Architecture Overview

Stategraph on Cloud Run uses a serverless, GCP-native architecture:

Internet (HTTPS)
    │
    ▼
Cloud Run Service
  ┌──────────────────────────────────┐
  │  stategraph (port 8080)           │
  │       │  127.0.0.1:5432           │
  │       ▼                           │
  │  cloud-sql-proxy (sidecar)        │
  └───────────────┬──────────────────┘
                  ▼
        Cloud SQL for PostgreSQL
                  ▲
        Secret Manager (DB password)

What you create:

  • A Cloud Run service running two containers: the Stategraph server (ingress, port 8080) and the Cloud SQL Auth Proxy sidecar
  • A Cloud SQL for PostgreSQL instance, database, and user
  • A Secret Manager secret holding the database password
  • An Artifact Registry repository holding the Stategraph image
  • IAM bindings granting the Cloud Run runtime service account access to Cloud SQL, Secret Manager, and Artifact Registry

Why a Cloud SQL Auth Proxy sidecar? Stategraph connects to PostgreSQL over TCP using DB_HOST and DB_PORT. The sidecar exposes the database on 127.0.0.1:5432 inside the service, so the server connects with DB_HOST=127.0.0.1 — no code changes, no public database IP exposed.

Quick Start

The commands below assume these shell variables. Adjust them to your project:

export PROJECT_ID="your-project-id"
export REGION="us-central1"
export INSTANCE="stategraph-db"
export AR_REPO="stategraph"

gcloud config set project "$PROJECT_ID"
gcloud config set run/region "$REGION"

1. Enable APIs

gcloud services enable \
  run.googleapis.com \
  sqladmin.googleapis.com \
  secretmanager.googleapis.com \
  artifactregistry.googleapis.com \
  compute.googleapis.com

2. Mirror the image into Artifact Registry

Create an Artifact Registry repository and copy the Stategraph image into it. The image reference (<stategraph-server-image>) is provided when you get access.

# Create the repository
gcloud artifacts repositories create "$AR_REPO" \
  --repository-format=docker \
  --location="$REGION" \
  --description="Stategraph server images"

# Authenticate Docker to Artifact Registry
gcloud auth configure-docker "${REGION}-docker.pkg.dev" --quiet

# Mirror the image (one-time copy)
export IMAGE="${REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}/stategraph-server:latest"
docker pull <stategraph-server-image>:latest   # image URL provided when you get access: https://stategraph.com/contact
docker tag  <stategraph-server-image>:latest "$IMAGE"
docker push "$IMAGE"

3. Create the Cloud SQL database

# Create the instance (--edition=ENTERPRISE is needed for db-custom-* and shared-core tiers)
gcloud sql instances create "$INSTANCE" \
  --database-version=POSTGRES_17 \
  --edition=ENTERPRISE \
  --tier=db-custom-1-3840 \
  --region="$REGION" \
  --storage-size=10 \
  --storage-type=SSD

# Create the database and user
DB_PASSWORD="$(openssl rand -base64 24 | tr -d '/+=' | head -c 24)"
gcloud sql databases create stategraph --instance="$INSTANCE"
gcloud sql users create stategraph --instance="$INSTANCE" --password="$DB_PASSWORD"

# Note the instance connection name — you'll need it for the sidecar
export CONNECTION_NAME="$(gcloud sql instances describe "$INSTANCE" --format='value(connectionName)')"
echo "$CONNECTION_NAME"   # e.g. your-project-id:us-central1:stategraph-db
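A malformed connection name is a common cause of sidecar startup failures, so it can help to sanity-check the value before it goes into service.yaml. The sketch below is illustrative only: is_valid_connection_name is a hypothetical helper, and it checks just the project:region:instance shape (allowing one extra field for legacy domain-scoped project IDs), not whether the instance exists.

```shell
# Hypothetical helper: succeed if the argument looks like
# project:region:instance (three non-empty colon-separated fields, or four
# for a legacy domain-scoped project such as example.com:project).
is_valid_connection_name() {
  printf '%s\n' "$1" | grep -Eq '^([^:]+:){2,3}[^:]+$'
}
```

For example, is_valid_connection_name "$CONNECTION_NAME" || echo "unexpected connection name" catches an empty or truncated value early.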

Tier and edition

POSTGRES_17 defaults to the Enterprise Plus edition, which rejects both custom and shared-core tiers. Pass --edition=ENTERPRISE to use cost-effective tiers like db-custom-1-3840 (1 vCPU / 3.75 GB) or, for evaluation, the shared-core db-g1-small. Size up for production.

4. Store the database password in Secret Manager

printf '%s' "$DB_PASSWORD" | gcloud secrets create stategraph-db-pass --data-file=-

5. Grant the runtime service account access

Cloud Run runs as the project's default compute service account unless you specify another. Grant it access to Cloud SQL, the secret, and the Artifact Registry repository:

export PROJECT_NUMBER="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')"
export RUNTIME_SA="${PROJECT_NUMBER}-compute@developer.gserviceaccount.com"

gcloud projects add-iam-policy-binding "$PROJECT_ID" \
  --member="serviceAccount:${RUNTIME_SA}" \
  --role="roles/cloudsql.client" --condition=None

gcloud secrets add-iam-policy-binding stategraph-db-pass \
  --member="serviceAccount:${RUNTIME_SA}" \
  --role="roles/secretmanager.secretAccessor"

gcloud artifacts repositories add-iam-policy-binding "$AR_REPO" \
  --location="$REGION" \
  --member="serviceAccount:${RUNTIME_SA}" \
  --role="roles/artifactregistry.reader"

6. Create the service definition

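Save the following as service.yaml. Replace the IMAGE and CONNECTION_NAME placeholders with your values, and change the us-central1 location label if you deploy to another region. The placeholders are bare words, so envsubst will not substitute them unless you first rewrite them as ${IMAGE} and ${CONNECTION_NAME}: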

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: stategraph
  labels:
    cloud.googleapis.com/location: us-central1
spec:
  template:
    metadata:
      annotations:
        # Keep one warm instance: the price-book loader and cost scheduler are
        # background jobs that scale-to-zero would interrupt.
        autoscaling.knative.dev/minScale: "1"
        autoscaling.knative.dev/maxScale: "3"
        # CPU is always allocated so background work runs outside request handling.
        run.googleapis.com/cpu-throttling: "false"
        run.googleapis.com/startup-cpu-boost: "true"
        # Start the proxy before the server so the DB is reachable at boot.
        run.googleapis.com/container-dependencies: '{"server":["cloud-sql-proxy"]}'
    spec:
      containers:
        - name: server
          image: IMAGE
          ports:
            - containerPort: 8080
          env:
            - name: DB_HOST
              value: "127.0.0.1"
            - name: DB_PORT
              value: "5432"
            - name: DB_USER
              value: "stategraph"
            - name: DB_NAME
              value: "stategraph"
            - name: DB_PASS
              valueFrom:
                secretKeyRef:
                  name: stategraph-db-pass
                  key: "latest"
            # Set to the service URL after the first deploy (see step 8).
            - name: STATEGRAPH_UI_BASE
              value: "https://REPLACE_WITH_SERVICE_URL"
          # Migrations run on boot; /health/ready returns 502 until they finish.
          # /health/live is up immediately. Give migrations generous headroom.
          startupProbe:
            httpGet:
              path: /health/ready
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 10
            failureThreshold: 30
            timeoutSeconds: 5
          resources:
            limits:
              cpu: "1"
              memory: 1Gi
        - name: cloud-sql-proxy
          image: gcr.io/cloud-sql-connectors/cloud-sql-proxy:2.11.0
          args:
            - "--port=5432"
            - "--health-check"
            - "--http-address=0.0.0.0"
            - "--http-port=9090"
            - "CONNECTION_NAME"
          startupProbe:
            httpGet:
              path: /startup
              port: 9090
            periodSeconds: 5
            failureThreshold: 12
            timeoutSeconds: 3
          resources:
            limits:
              cpu: "1"
              memory: 512Mi
  traffic:
    - percent: 100
      latestRevision: true
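The IMAGE and CONNECTION_NAME placeholders in the definition above can also be filled in with sed rather than by hand. This is a minimal sketch under stated assumptions: render_service_yaml is a hypothetical helper, the template is the file shown above saved unmodified, and none of the values contain a | or & character.

```shell
# Hypothetical helper: print a copy of the service.yaml template with the
# IMAGE and CONNECTION_NAME placeholders and the location label filled in.
render_service_yaml() {
  template="$1"; image="$2"; connection_name="$3"; region="$4"
  sed \
    -e "s|image: IMAGE\$|image: ${image}|" \
    -e "s|\"CONNECTION_NAME\"|\"${connection_name}\"|" \
    -e "s|\(cloud.googleapis.com/location:\) .*|\1 ${region}|" \
    "$template"
}
```

If the template is saved under a different name, for instance service.tmpl.yaml, then render_service_yaml service.tmpl.yaml "$IMAGE" "$CONNECTION_NAME" "$REGION" > service.yaml produces the file used in the next step. Redirecting onto the template itself would truncate it before sed reads it.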

7. Deploy

gcloud run services replace service.yaml --region="$REGION"

The deploy completes once the startup probe passes — that is, after database migrations finish and /health/ready returns 200.

8. Set the public URL and redeploy

Cloud Run assigns the service URL on first deploy. Stategraph needs it in STATEGRAPH_UI_BASE for correct links, cookies, and OAuth callbacks:

export SERVICE_URL="$(gcloud run services describe stategraph --region="$REGION" --format='value(status.url)')"
echo "$SERVICE_URL"   # e.g. https://stategraph-xxxxxxxxxx.us-central1.run.app

# -i.bak works with both GNU and BSD/macOS sed; it leaves service.yaml.bak behind
sed -i.bak "s#https://REPLACE_WITH_SERVICE_URL#${SERVICE_URL}#" service.yaml
gcloud run services replace service.yaml --region="$REGION"

Cookie Security

Stategraph derives cookie security from STATEGRAPH_UI_BASE. The Cloud Run URL is always https://, so cookies are set with the Secure flag automatically.

9. Expose the UI

To make the UI reachable in a browser, allow unauthenticated access:

gcloud run services add-iam-policy-binding stategraph \
  --region="$REGION" \
  --member=allUsers \
  --role=roles/run.invoker

To keep the service private instead, skip this step and reach it with an identity token (see the next step) or front it with Identity-Aware Proxy.

If your organization enforces Domain Restricted Sharing (iam.allowedPolicyMemberDomains), the allUsers binding is rejected with a "do not belong to a permitted customer" error. Grant the project an allowAll override for that constraint, or grant roles/run.invoker to specific users/groups instead.
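Granting roles/run.invoker to a specific principal uses the same command as above with a different --member. The following is a sketch only: grant_invoker is a hypothetical wrapper, and the group address in the usage note is a placeholder.

```shell
# Hypothetical wrapper: grant roles/run.invoker on the stategraph service to
# one principal (for example "user:..." or "group:...") instead of allUsers.
grant_invoker() {
  member="$1"; region="$2"
  gcloud run services add-iam-policy-binding stategraph \
    --region="$region" \
    --member="$member" \
    --role=roles/run.invoker
}
```

For example, grant_invoker "group:platform-team@yourcompany.com" "$REGION". Members granted access this way still need to present an identity token when calling the service.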

10. Verify the deployment

# Liveness — available as soon as nginx is up
curl -sS -o /dev/null -w "live: %{http_code}\n"  "$SERVICE_URL/health/live"
# Readiness — 200 once migrations are done and the backend is connected
curl -sS -o /dev/null -w "ready: %{http_code}\n" "$SERVICE_URL/health/ready"

If the service requires authentication (you skipped step 9), add an identity token:

TOKEN="$(gcloud auth print-identity-token)"
curl -sS -H "Authorization: Bearer $TOKEN" \
  -o /dev/null -w "ready: %{http_code}\n" "$SERVICE_URL/health/ready"

Then open $SERVICE_URL in your browser — the setup wizard prompts you to create an admin account.
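In scripts, the readiness check above can be wrapped in a polling loop instead of being run by hand. This is a minimal sketch: wait_for_ready is a hypothetical helper, the default of 30 attempts at 10-second intervals is an arbitrary choice, and it assumes the service allows unauthenticated requests (add the Authorization header shown above otherwise).

```shell
# Hypothetical helper: poll /health/ready until it returns 200 or the
# attempts run out. Usage: wait_for_ready URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_ready() {
  url="$1"; attempts="${2:-30}"; delay="${3:-10}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    # A failed connection yields a non-200 code rather than aborting the loop.
    code="$(curl -sS -o /dev/null -w '%{http_code}' "$url/health/ready" || true)"
    if [ "$code" = "200" ]; then
      echo "ready after $i attempt(s)"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  echo "not ready after $attempts attempts (last status: $code)" >&2
  return 1
}
```

For example, wait_for_ready "$SERVICE_URL" blocks until migrations finish or roughly five minutes pass.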

Configuration

Stategraph is configured entirely through environment variables on the server container. The Quick Start sets the minimum; common additions are below. See the Environment Variables reference for the full list.

| Variable | Description | Example |
|---|---|---|
| STATEGRAPH_UI_BASE | Public URL (the Cloud Run service URL or custom domain) | https://stategraph-xxxx.us-central1.run.app |
| DB_HOST | Database host (127.0.0.1 via the sidecar proxy) | 127.0.0.1 |
| DB_PORT | Database port the sidecar listens on | 5432 |
| DB_USER / DB_NAME | Database user and name | stategraph |
| DB_PASS | Database password (from Secret Manager) | |

Custom domain

Map a custom domain to the service, then update STATEGRAPH_UI_BASE to match:

gcloud run domain-mappings create \
  --service=stategraph \
  --domain=stategraph.example.com \
  --region="$REGION"

After the domain is mapped and DNS propagates, set STATEGRAPH_UI_BASE=https://stategraph.example.com and redeploy.
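Updating STATEGRAPH_UI_BASE in service.yaml can be scripted. The sketch below is an assumption-laden example rather than a supported tool: set_ui_base is a hypothetical helper, and it relies on the value: line directly following the name: STATEGRAPH_UI_BASE line, as it does in the service.yaml from this guide.

```shell
# Hypothetical helper: rewrite the value of STATEGRAPH_UI_BASE in a
# service.yaml whose "value:" line directly follows the "name:" line.
set_ui_base() {
  file="$1"; url="$2"
  tmp="$(mktemp)"
  # Write to a temp file and move it into place, which avoids the differing
  # "sed -i" syntax between GNU and BSD/macOS sed.
  sed "/name: STATEGRAPH_UI_BASE/{
n
s|value: .*|value: \"${url}\"|
}" "$file" > "$tmp" && mv "$tmp" "$file"
}
```

For example, set_ui_base service.yaml "https://stategraph.example.com" followed by gcloud run services replace service.yaml --region="$REGION".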

Authentication

Enable OAuth by adding environment variables to the server container. The OAuth callback path is /oauth2/{provider}/callback, so set the redirect base to your public URL:

# Add to the server container's env: in service.yaml
- name: STATEGRAPH_OAUTH_TYPE
  value: "google"            # or "oidc"
- name: STATEGRAPH_OAUTH_CLIENT_ID
  value: "your-client-id.apps.googleusercontent.com"
- name: STATEGRAPH_OAUTH_CLIENT_SECRET
  valueFrom:
    secretKeyRef:
      name: stategraph-oauth-secret
      key: "latest"
- name: STATEGRAPH_OAUTH_REDIRECT_BASE
  value: "https://stategraph.example.com"   # must match STATEGRAPH_UI_BASE
- name: STATEGRAPH_OAUTH_EMAIL_DOMAIN
  value: "yourcompany.com"                   # optional: restrict to a domain

Set the Authorized redirect URI in your provider to https://<your-domain>/oauth2/google/callback (or /oauth2/oidc/callback). Store the client secret in Secret Manager and grant the runtime service account roles/secretmanager.secretAccessor on it, exactly as with the database password. For details, see the Authentication guide.
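Storing the OAuth client secret follows the same two commands used for the database password in steps 4 and 5. The following is a minimal sketch: store_oauth_secret is a hypothetical wrapper, and it assumes RUNTIME_SA is still set from step 5.

```shell
# Hypothetical wrapper: create a Secret Manager secret from stdin and let the
# runtime service account read it. Reading from stdin keeps the secret value
# out of the command line.
store_oauth_secret() {
  secret_name="$1"; runtime_sa="$2"
  gcloud secrets create "$secret_name" --data-file=- &&
  gcloud secrets add-iam-policy-binding "$secret_name" \
    --member="serviceAccount:${runtime_sa}" \
    --role="roles/secretmanager.secretAccessor"
}
```

For example, printf '%s' "$OAUTH_CLIENT_SECRET" | store_oauth_secret stategraph-oauth-secret "$RUNTIME_SA", where OAUTH_CLIENT_SECRET is a placeholder variable holding the value issued by your provider.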

Enable cost estimation

Cost analysis is off by default. The pricing service ships inside the server image; turn it on by adding STATEGRAPH_COST_ENABLED=true to the server container's env. Because the variable lives in the service definition, the setting persists across every new revision and instance restart. The bundled pricing service runs in the same container, so STATEGRAPH_PRICING_SERVICE_URL already defaults to http://localhost:8090; set it only to point at an external pricing service.

# Add to the server container's env: in service.yaml
- name: STATEGRAPH_COST_ENABLED
  value: "true"

On first boot the pricing service loads the price book into the cloud_pricing database in the background, so the server and UI stay available immediately. Because Cloud Run instances are ephemeral, the price book is stored in your Cloud SQL instance (durable), and the minScale: "1" plus always-allocated CPU in this guide keep the background loader and scheduled recomputes running. See Cost Setup for verification, refresh cadence, and air-gapped installs.

Health Checks

Stategraph exposes two health endpoints. The server container's startup probe in this guide uses only /health/ready; /health/live is useful for manual checks:

| Endpoint | Purpose | Available |
|---|---|---|
| /health/live | Liveness: returns 200 as long as nginx is running | Immediately on startup |
| /health/ready | Readiness: returns 200 when the backend is ready | After database migrations complete |

The server container's startupProbe targets /health/ready so Cloud Run does not route traffic until migrations finish. With failureThreshold: 30 and periodSeconds: 10, migrations have up to five minutes — raise the threshold for very large databases. For details, see the Health Checks reference.
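The five-minute figure is periodSeconds multiplied by failureThreshold. The sketch below makes that arithmetic explicit so a new threshold can be sized quickly; probe_window_seconds is a hypothetical helper, and the result is an approximation that ignores initialDelaySeconds and per-probe timeouts.

```shell
# Hypothetical helper: approximate startup window, in seconds, implied by a
# startupProbe's periodSeconds and failureThreshold.
probe_window_seconds() {
  period_seconds="$1"; failure_threshold="$2"
  echo $(( period_seconds * failure_threshold ))
}

probe_window_seconds 10 30   # server probe in this guide: prints 300
probe_window_seconds 5 12    # cloud-sql-proxy probe in this guide: prints 60
```

Doubling failureThreshold to 60 while keeping periodSeconds at 10 would give migrations about ten minutes.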

Scaling

Cloud Run autoscales by request concurrency. Control the bounds with annotations on the revision template:

autoscaling.knative.dev/minScale: "1"   # keep one warm instance (recommended)
autoscaling.knative.dev/maxScale: "3"   # cap concurrent instances

Keep minScale at 1 or higher. Stategraph runs background work — the price-book loader and scheduled cost recomputes — that does not run when the service scales to zero. Always-allocated CPU (run.googleapis.com/cpu-throttling: "false") ensures that work proceeds between requests.

To scale vertically, raise the server container's CPU and memory limits:

resources:
  limits:
    cpu: "2"
    memory: 2Gi

Upgrading

Update the Stategraph version

Mirror the new image tag into Artifact Registry, update the image: in service.yaml, and redeploy:

docker pull <stategraph-server-image>:1.2.0   # image URL provided when you get access: https://stategraph.com/contact
docker tag  <stategraph-server-image>:1.2.0 "${REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}/stategraph-server:1.2.0"
docker push "${REGION}-docker.pkg.dev/${PROJECT_ID}/${AR_REPO}/stategraph-server:1.2.0"

# Update image: in service.yaml, then:
gcloud run services replace service.yaml --region="$REGION"

Cloud Run performs a gradual rollout and shifts traffic to the new revision once its startup probe passes — no downtime.
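The image: edit in service.yaml can be scripted so the tag is changed in exactly one place. This is a sketch under stated assumptions: set_image_tag is a hypothetical helper, and it expects the server image path to end in /stategraph-server:TAG, as it does when mirrored in step 2.

```shell
# Hypothetical helper: change the tag on the stategraph-server image line of
# service.yaml, leaving other image lines (such as the proxy) untouched.
set_image_tag() {
  file="$1"; tag="$2"
  tmp="$(mktemp)"
  sed "s|\(image: .*/stategraph-server\):.*|\1:${tag}|" "$file" > "$tmp" &&
    mv "$tmp" "$file"
}
```

For example, set_image_tag service.yaml 1.2.0 followed by gcloud run services replace service.yaml --region="$REGION".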

Monitoring

View logs

gcloud run services logs read stategraph --region="$REGION" --limit=100

Or stream them in the Cloud Console under Cloud Run > stategraph > Logs.

Inspect the service

gcloud run services describe stategraph --region="$REGION"

Request count, latency, instance count, CPU, and memory are available in the Cloud Console under Cloud Run > stategraph > Metrics.

Troubleshooting

Deploy rejected: invalid image host

Symptoms: gcloud run services replace fails referencing the image host.

Error message

Expected an image path like [host/]repo-path[:tag and/or @digest], where host is
one of [region.]gcr.io, [region-]docker.pkg.dev or docker.io

Solution:

  • Cloud Run cannot pull from ghcr.io or other arbitrary registries
  • Mirror the image into Artifact Registry (see step 2) and reference the *-docker.pkg.dev path

Cloud SQL instance create fails on tier

Symptoms: gcloud sql instances create fails immediately.

Error message

Invalid Tier (db-f1-micro) for (ENTERPRISE_PLUS) Edition

Solution:

  • POSTGRES_17 defaults to the Enterprise Plus edition, which does not allow shared-core or db-custom-* tiers
  • Add --edition=ENTERPRISE and choose a supported tier such as db-custom-1-3840 or db-g1-small

Service stuck starting / readiness never passes

Symptoms: The deploy times out, or /health/ready keeps returning 502.

Solutions:

  1. Confirm the runtime service account has roles/cloudsql.client and roles/secretmanager.secretAccessor
  2. Check the logs for the cloud-sql-proxy container — a bad CONNECTION_NAME or missing IAM shows up there
  3. For a large database, raise the server startup probe failureThreshold to give migrations more time

Next Steps