GCP Marketplace — Package Reference
This is a technical reference for the OpenDSO GCP Marketplace package: how the deployer works, what it creates, and the configuration, security, networking, and persistence model behind it. It complements the hands-on guides:
- Deploying OpenDSO on GCP with Helm - From-scratch Helm deployment
- GCP Marketplace — Customer Pre-Deployment Checklist - Preparing a cluster for a Marketplace install
Executive Summary
OpenDSO is a near-real-time Distribution System Operator (DSO) platform built by Open Energy Solutions Inc. (OES) for managing grid-edge devices, DERs, and microgrid operations using the OpenFMB standard. This GCP Marketplace package deploys the full OpenDSO stack onto a GKE cluster via a single Helm release managed by a custom deployer image.
The package deploys the full "backoffice" tier of the platform and consists of the deployer script, the Helm chart (38 subcharts), the Marketplace schema, and a verification script. Production installs do not rely on shipped plaintext credential fallbacks. TLS is standardized around a release-scoped secret, with compatibility aliases for older mounts, and several backend services use explicit numeric non-root security contexts.
1. Overview
What OpenDSO Is in This Marketplace Package
OpenDSO is an open-source platform enabling interoperability, edge intelligence, and application management for electric distribution grids. It uses the OpenFMB standard (UCAIUG) as its interoperability layer and NATS as its internal message bus. The platform supports DER management, conservation voltage reduction (CVR), ESS management, topology modeling from CIM data, and real-time operational dashboards.
The GCP Marketplace package deploys the full "backoffice" tier of the platform — all central services, databases, and UI applications — as a single Helm release onto a GKE cluster.
Source: opendso-gcp-marketplace/README.md
Main Components and Deployment Shape
A single Helm umbrella chart (opendso, version 0.1.0, appVersion 1.0.0) with 37 internal subcharts + 1 external (Grafana 10.5.14). All components are deployed into a single Kubernetes namespace under a single Helm release name.
| Category | Components |
|---|---|
| Infrastructure | NATS (messaging, with NKey auth callout), Keycloak 24.0 (identity), Grafana 10.5.14 (monitoring) |
| Databases | MongoDB 7.0.2-ubi8, Citus 12.1.2-alpine (PostgreSQL), TimescaleDB 2.26.0-pg16, Keycloak-DB (PostgreSQL 17, optional) |
| Core Services | GMS API, Historian, OpenFMB Event Service, NATS Auth Service, Topology Genesis, Topology Nodes |
| Grid Applications | DER Dispatch (app + svc), ESS Manager (app + svc + Redis), ESS Tester (app + svc), Asset Health (svc + sim-svc), CVR (3 services), OmegaDSS, RPCDSS |
| Frontend Apps | One-Line, GIS, Historian, Inspector, Inventory, Data Viewer, Event Viewer, DER Dispatch App, ESS Manager App, ESS Tester App, Schedule Dispatch, OpenFMB Event Creator, OpenDSO Docs, Genesis Node |
Source: chart/Chart.yaml, chart/README.md, chart/values.yaml
Intended Runtime Environment
- GKE cluster (Kubernetes 1.24+)
- External nginx ingress controller
- Wildcard DNS pointing to nginx LoadBalancer
- TLS certificate ideally pre-created as a Kubernetes secret; the chart can also generate a self-signed fallback in the Marketplace path for test or recovery scenarios
- Images served from GCP Artifact Registry (mirrored from OES registry)
2. Installation Model
How the Marketplace Deployment Works at a High Level
The deployer is a custom Docker image that extends the GCP deployer_helm base image. It executes deployer/deploy.sh, which performs these steps in sequence:
- NKey generation (step 1/3): Generates NATS NKey pairs (account, user, curve/xkey). On re-runs, existing keys from the <release>-nats-auth-keys secret are reused to avoid NATS reconfiguration.
- Secret creation (step 2/3): Creates/updates the <release>-nats-auth-keys Kubernetes secret. Generates or reuses passwords for opendso-apps-db and citus-db.
- Topology Genesis ConfigMap (step 3a): Pre-creates the topology-genesis site ConfigMap via kubectl apply --server-side. This is required because cim.xml is 367KB, exceeding the 262KB Kubernetes annotation limit for client-side apply.
- License secret (step 3a-2): Creates the <release>-opendso-license secret containing LICENSE_KEY, LICENSE_INSTALLATION_KEY, and LICENSE_ENVIRONMENT_NAME (cluster UID from the kube-system namespace).
- Keycloak client secrets (step 3b): Generates UUID secrets per Keycloak client, injects them into the realm JSON (so Keycloak imports pre-populated secrets on first boot), and creates <release>-<client>-keycloak-env Kubernetes secrets.
- Helm deploy (step 3/3): Runs helm upgrade --install with a 15-minute timeout, combining values-gcp.yaml, user values from /data/user/values.yaml, and an auto-generated NATS auth overlay.
- Keycloak secret sync (step 4/4): Pushes client secrets to Keycloak via the Admin REST API post-deploy (non-fatal if Keycloak is not yet ready, because the secrets are already in the realm JSON from step 3b).
- Status patch: Calls patch_assembly_phase.sh --status="Success".
Source: deployer/deploy.sh
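The size constraint behind step 3a comes down to simple arithmetic. The sketch below assumes the limit in question is the 262144-byte (256 KiB) cap Kubernetes places on an object's total annotation size, which client-side kubectl apply runs into because it stores the manifest in the last-applied-configuration annotation. The function name is hypothetical, and reading "367KB" as 367 × 1024 bytes is an assumption.

```python
# Kubernetes caps the total size of an object's annotations at 256 KiB.
ANNOTATION_LIMIT_BYTES = 256 * 1024  # 262144


def needs_server_side_apply(payload_bytes: int) -> bool:
    """Return True when a client-side apply would overflow the annotation cap.

    Client-side apply copies the manifest into the
    kubectl.kubernetes.io/last-applied-configuration annotation,
    so the manifest itself has to fit under the cap.
    """
    return payload_bytes > ANNOTATION_LIMIT_BYTES


# cim.xml is described as 367KB; treated here as 367 * 1024 bytes (assumption).
cim_xml_bytes = 367 * 1024
print(needs_server_side_apply(cim_xml_bytes))
```

Server-side apply tracks field ownership in managedFields instead of that annotation, which is why the deployer can apply the large ConfigMap that way.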
Role of Deployer, Helm Chart, and Configuration Values
- Deployer image (deployer/Dockerfile, deployer/deploy.sh): Orchestrates all pre-Helm setup, secret generation, and Keycloak provisioning. Extends deployer_helm.
- Helm chart (chart/): Umbrella chart with 38 subcharts. All services are configured via values.yaml defaults, overridden by values-gcp.yaml (GCP-specific production settings), user-supplied Marketplace UI parameters, and an auto-generated NATS auth overlay.
- schema.yaml: Defines the GCP Marketplace UI parameters (name, namespace, domain, license key, installation key, Keycloak/MongoDB/Grafana credentials, resource profile, image registry).
Values precedence (last wins):
values-gcp.yaml → /data/user/values.yaml (Marketplace UI inputs) → auto-generated NATS overlay → --set overrides for release-name-dependent values.
Source: deployer/deploy.sh, schema.yaml, chart/values.yaml, chart/values-gcp.yaml
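The "last wins" precedence can be pictured as a deep merge of value layers, which is how Helm combines multiple values sources: nested maps are merged key by key, while scalars and lists from a later layer replace earlier ones. The following is a minimal sketch of that idea, not the deployer's code; the layer contents and the release name myrelease are hypothetical.

```python
from copy import deepcopy


def merge_values(*layers: dict) -> dict:
    """Deep-merge value layers left to right; later layers win on conflicts."""
    merged: dict = {}
    for layer in layers:
        _merge_into(merged, layer)
    return merged


def _merge_into(target: dict, source: dict) -> None:
    for key, value in source.items():
        if isinstance(value, dict) and isinstance(target.get(key), dict):
            _merge_into(target[key], value)  # recurse into nested maps
        else:
            target[key] = deepcopy(value)    # scalars and lists are replaced


# Hypothetical layer contents, ordered as in the precedence chain above.
values_gcp = {"global": {"storageClass": "standard-rwo", "domain": ""}}
user_values = {"global": {"domain": "opendso.example.com"}}
nats_overlay = {"nats-auth-svc": {"enabled": True}}
set_overrides = {"global": {"tls": {"existingSecret": "myrelease-tls-secret"}}}

print(merge_values(values_gcp, user_values, nats_overlay, set_overrides))
```

In this example the user-supplied domain replaces the empty default, while storageClass from the GCP overlay survives because no later layer touches it.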
Release/Namespace Behavior
- All resources are deployed into a single namespace specified by the Marketplace UI.
- The Helm release name equals the application instance name (APP_INSTANCE_NAME) from the Marketplace framework.
- All service references use {{ .Release.Name }}-<service> naming patterns, enabling multi-release deployments in separate namespaces.
- The deployer adopts any Application resource pre-created by mpdev into Helm management via annotation/label patching.
Source: chart/README.md, deployer/deploy.sh
What Is Created During Install
- Kubernetes Deployments for all enabled services and frontends
- StatefulSets for MongoDB, Citus, opendso-apps-db, Keycloak-DB (if enabled), ESS Manager Redis
- Services (ClusterIP) for all components
- Ingress resources: <release>-ingress (UI apps), <release>-ingress-api (GMS API), <release>-ingress-nats-ws (NATS WebSocket)
- PVCs for stateful components (MongoDB, Citus, opendso-apps-db, Grafana, Keycloak, topology-genesis, openfmb-event-service, asset-health-sim-svc)
- Secrets: NATS auth keys, Keycloak env per client, Grafana credentials, database credentials, release-scoped TLS secret, and TLS compatibility aliases (root-ca, server-cert, server-key) when TLS management is enabled
- ConfigMaps: site configs (topology-genesis, keycloak realm, mongodb init, etc.), frontend environment config
- Jobs: <release>-mongodb-init (runs once to initialize MongoDB collections/users)
- Roles, RoleBindings, ServiceAccounts for specific services (gms-api, mongodb, der-dispatch-svc)
- An Application resource (app.k8s.io/v1beta1) for GCP Marketplace tracking
- PodDisruptionBudgets for omegadss-svc and rpcdss-svc
Source: chart/Chart.yaml, chart/templates/, individual subchart templates
3. Prerequisites
Cluster Requirements
- GKE cluster, Kubernetes 1.24+, Helm 3.8+
- nginx-ingress controller installed and running (external LoadBalancer IP assigned)
- app.k8s.io/v1beta1 Application CRD installed (required by mpdev tooling)
- GKE node service account must have Artifact Registry Reader access (for image pulls; imagePullSecrets is set to [] in the GCP overlay, so Workload Identity or a node SA role is required)
Source: opendso-gcp-marketplace/README.md, deployer/deploy.sh (NATS_AUTH_VALUES sets imagePullSecrets: [])
DNS / Ingress / TLS Requirements
- Wildcard DNS *.yourdomain.com pointing to the nginx LoadBalancer IP
- TLS certificate pre-created as a Kubernetes secret named <release-name>-tls-secret in the target namespace is the preferred production model. Options documented:
  - cert-manager + Let's Encrypt (recommended)
  - Self-signed certificate via mkcert
- Important nuance: When global.gcpMarketplace=true (set by the deployer), tls-secrets.yaml will auto-generate a self-signed <release>-tls-secret if it does not already exist and will also create the legacy alias secrets root-ca, server-cert, and server-key for workloads that still mount those names. This allows the deployer to proceed without a pre-existing cert, but the resulting certificate is self-signed and not trusted by browsers. The intended long-term production model is still a real pre-created or cert-manager-issued certificate.
Source: opendso-gcp-marketplace/README.md, chart/templates/tls-secrets.yaml
Storage Requirements
- Default storage class: standard (values.yaml) / standard-rwo (values-gcp.yaml)
- PVCs created at install:
| Component | Size (GCP profile) | Notes |
|---|---|---|
| MongoDB | 10Gi data + 5Gi log | deleteOnUpgrade: false |
| Citus DB | 50Gi | Production size; dev default is 10Gi |
| opendso-apps-db | 10Gi | Hosts ess_tester + assets DBs |
| Grafana | 10Gi | Dashboard persistence |
| Keycloak | 10Gi | Realm data persistence |
| topology-genesis | (chart default) | CIM topology data |
| openfmb-event-service | (chart default) | Event data |
| asset-health-sim-svc | (chart default) | Simulation data |
| ESS Manager Redis | 1Gi | Cache persistence |
Source: chart/values-gcp.yaml, chart/charts/*/templates/pvc.yaml
Image Registry / Pull Access Assumptions
- All images must be present in GCP Artifact Registry before deploying
- The GCP overlay sets imagePullSecrets: [], so pull access relies on GKE Workload Identity or the node service account's Artifact Registry Reader IAM role
- OES images (gms-api, historian, topology-genesis, etc.) plus third-party images (NATS, MongoDB, Citus, Keycloak, Redis, TimescaleDB, Envoy) must all be mirrored
- The schema.yaml images section is populated for the chart-managed default Marketplace image set
Source: schema.yaml, chart/values.yaml, chart/values-gcp.yaml, deployer/deploy.sh
Required Secrets, Licenses, and Keys
The following must be obtained from OES before deployment:
- OpenDSO License Key (license.key): required; stored as LICENSE_KEY in the <release>-opendso-license secret, which topology-nodes consumes for validation
- OpenDSO Installation Key (installation.key): required; stored as LICENSE_INSTALLATION_KEY in the same secret
The following credentials are user-supplied in the Marketplace UI:
- Keycloak admin password
- MongoDB root password and app password
- Grafana admin password
All credentials are passed via GCP Marketplace's MASKED_FIELD mechanism and stored in Kubernetes Secrets.
Source: schema.yaml, deployer/deploy.sh, chart/values-gcp.yaml (topology-nodes env section)
4. Configuration Model
Important Marketplace Inputs (schema.yaml properties)
| Parameter | Type | Required | Default | Notes |
|---|---|---|---|---|
| name | string | yes | — | Injected by Marketplace (release name) |
| namespace | string | yes | — | Injected by Marketplace |
| license.key | MASKED_FIELD | yes | — | OES-issued license key |
| installation.key | MASKED_FIELD | yes | — | OES-issued installation key |
| global.domain | string | yes | — | Base domain, e.g. opendso.example.com |
| global.imageRegistry | string | no | "" | Artifact Registry prefix |
| keycloak.config.adminUser | string | no | admin | |
| keycloak.config.adminPassword | MASKED_FIELD | yes | — | |
| mongodb.auth.rootUsername | string | no | root | |
| mongodb.auth.rootPassword | MASKED_FIELD | yes | — | |
| mongodb.auth.username | string | no | opendso | |
| mongodb.auth.password | MASKED_FIELD | yes | — | |
| grafana.adminUser | string | no | admin | |
| grafana.adminPassword | MASKED_FIELD | yes | — | |
| global.resourceProfile | enum | no | default | minimal, default, or production |
Source: schema.yaml
Domain, TLS, Keycloak, Database, and Storage Configuration
Domain: global.domain sets the base domain. All ingress routes and Keycloak URLs are derived from it. Frontend apps receive global.environment.apiUrl and global.environment.natsUrl via a frontend-environment-configmap rendered from this domain.
TLS: The deployer passes --set global.tls.existingSecret=<release>-tls-secret --set ingress.tls.secretName=<release>-tls-secret --set nats.tls.secretName=<release>-tls-secret. If the secret does not exist prior to Helm running, tls-secrets.yaml generates a self-signed cert when global.gcpMarketplace=true and creates root-ca, server-cert, and server-key compatibility secrets for older workload mounts.
Keycloak: The deployer sets:
- global.keycloak.internalUrl → http://<release>-keycloak-svc:8080 (for in-cluster service communication)
- global.keycloak.url → https://keycloak.<domain> (for browser-facing flows)
- Keycloak is configured with realm name oes, client ID gms (shared by UI and API)
- keycloak.config.hostnameStrict: "false" is set directly in values-gcp.yaml; no deployer --set override is needed or present
Databases: Internal databases are enabled by default:
- MongoDB (settings_api DB, user opendso)
- Citus DB (ofmb_db, user citususer): historian time-series data
- opendso-apps-db (TimescaleDB, user essuser): hosts ess_tester and assets databases
- keycloak-db (PostgreSQL): disabled by default; Keycloak uses in-chart persistence (keycloak.persistence.enabled: true)
External database overrides are supported for all three databases via externalDatabase.* values.
Storage: global.storageClass: standard-rwo in GCP overlay. All PVC-backed components use this storage class. pd-ssd is documented as an alternative for production workloads.
Source: chart/values.yaml, chart/values-gcp.yaml, deployer/deploy.sh, chart/templates/
Values Expected from User vs Generated Automatically
User-provided (Marketplace UI):
- Domain, license key, installation key, all passwords, image registry, resource profile
The production chart path no longer relies on shipped plaintext password defaults for MongoDB, Keycloak, Grafana, Citus, or opendso-apps-db. Those values must be supplied explicitly or created by the deploy flow.
Deployer-generated at deploy time:
- NATS NKey pairs (account seed, user seed, xkey seed) → stored in <release>-nats-auth-keys secret
- Keycloak client UUIDs → stored in <release>-<client>-keycloak-env secrets
- opendso-apps-db password → stored in <release>-opendso-apps-db-secret
- citus-db password → stored in <release>-citus-db-secret
- License environment name (cluster UID from kube-system)
- topology-genesis ConfigMap (server-side applied)
Helm-generated:
- TLS secrets (self-signed fallback if not pre-existing, plus root-ca, server-cert, and server-key aliases when managed by the chart)
- Grafana credentials secret
- MongoDB admin/app secrets
- GMS API MongoDB secret
- Grafana datasource credentials
Which Values Must Remain Release-Specific
The following values depend on {{ .Release.Name }}. Because Helm does not render template expressions inside values files, they cannot live in values-gcp.yaml and must be passed as --set overrides, as documented in deployer/deploy.sh:
- grafana.admin.existingSecret
- grafana.envValueFrom.CITUS_PASSWORD.secretKeyRef.name
- grafana.envValueFrom.OPENDSO_APPS_DB_PASSWORD.secretKeyRef.name
- global.tls.existingSecret
- ingress.tls.secretName
- nats.tls.secretName
- global.keycloak.internalUrl
- global.environment.apiUrl
- global.keycloak.url
Source: chart/README.md, deployer/deploy.sh
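To make the release-scoped overrides concrete, here is a minimal sketch of how such --set arguments could be assembled from a release name and domain. It is not the deployer's code. The TLS and Keycloak values follow the patterns stated in this reference; pairing the two Grafana keys with the citus-db and opendso-apps-db secret names is an assumption, and grafana.admin.existingSecret and global.environment.apiUrl are left out because their exact values are not given here. The function name is hypothetical.

```python
from typing import List


def release_scoped_overrides(release: str, domain: str) -> List[str]:
    """Build helm --set arguments for values that embed the release name."""
    tls_secret = f"{release}-tls-secret"
    overrides = {
        "global.tls.existingSecret": tls_secret,
        "ingress.tls.secretName": tls_secret,
        "nats.tls.secretName": tls_secret,
        "global.keycloak.internalUrl": f"http://{release}-keycloak-svc:8080",
        "global.keycloak.url": f"https://keycloak.{domain}",
        # Assumption: Grafana reads the DB passwords from the deployer-created secrets.
        "grafana.envValueFrom.CITUS_PASSWORD.secretKeyRef.name": f"{release}-citus-db-secret",
        "grafana.envValueFrom.OPENDSO_APPS_DB_PASSWORD.secretKeyRef.name": f"{release}-opendso-apps-db-secret",
    }
    args: List[str] = []
    for key, value in overrides.items():
        args += ["--set", f"{key}={value}"]
    return args


print(" ".join(release_scoped_overrides("opendso", "opendso.example.com")))
```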
5. Security Model
TLS Expectations
- TLS terminates at the nginx ingress layer; internal service-to-service communication is HTTP (ClusterIP, no TLS)
- NATS uses TLS (nats.tls.enabled: true in values-gcp.yaml) with the same <release>-tls-secret
- MongoDB TLS is explicitly disabled (mongodb.tls.enabled: false); its traffic is internal and ClusterIP-only
- Keycloak in values-gcp.yaml has httpsEnabled: true referencing cert files, and hostnameStrict: "false" is set directly in values-gcp.yaml; no deployer override is needed or present
- The NATS WebSocket ingress (ingress-nats-ws.yaml) uses nginx.ingress.kubernetes.io/backend-protocol: "HTTPS", indicating it connects to NATS over TLS internally
Source: chart/values-gcp.yaml, deployer/deploy.sh, chart/templates/ingress-nats-ws.yaml
Secret Handling
- License key and installation key are stored in the <release>-opendso-license Kubernetes Secret (created by the deployer, not Helm, so not in Helm state)
- NATS NKey seeds are stored in the <release>-nats-auth-keys Kubernetes Secret
- Keycloak client secrets are stored in per-client <release>-<client>-keycloak-env Kubernetes Secrets
- All Marketplace credential inputs (MASKED_FIELD) are stored as Kubernetes Secrets by the deployer framework
- The production chart path does not ship plaintext fallback passwords; sensitive credentials must come from Marketplace inputs, deployer-generated secrets, or pre-created secrets
- TLS is standardized around <release>-tls-secret; when the chart manages TLS it also creates root-ca, server-cert, and server-key compatibility secrets for workloads that still mount legacy names
- NKey private seeds are never committed to the repository; they are generated at deploy time
- Keycloak client secrets are generated as UUIDs at deploy time and never stored in the repository
- Database internal passwords (opendso-apps-db, citus-db) are generated via python3 -c 'import secrets; print(secrets.token_urlsafe(24))' and stored in dedicated secrets
- On re-runs, all generated secrets are reused (idempotent)
Source: deployer/deploy.sh, opendso-gcp-marketplace/README.md
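The generate-or-reuse behavior for the internal database passwords can be sketched as follows. This is an illustration of the idempotency rule, not the deployer's implementation: the function name and the "password" key are hypothetical, and the existing secret is modelled as a plain dict rather than read from the cluster. Only the secrets.token_urlsafe(24) call is taken from the deployer one-liner.

```python
import secrets
from typing import Dict, Optional


def resolve_db_password(existing_secret: Optional[Dict[str, str]],
                        key: str = "password") -> str:
    """Reuse the stored password when present, otherwise generate a new one."""
    if existing_secret and existing_secret.get(key):
        return existing_secret[key]
    # Same generator as the deployer one-liner: 24 random bytes, URL-safe base64.
    return secrets.token_urlsafe(24)


first_run = resolve_db_password(None)                   # fresh install
re_run = resolve_db_password({"password": first_run})   # upgrade or re-run
print(len(first_run), first_run == re_run)
```

Reusing the stored value on re-runs matters because the databases already hold the original password on their PVCs; regenerating it would lock the services out.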
Authentication / Keycloak Model
- Keycloak 24.0 is deployed in-cluster, accessible externally at https://keycloak.<domain>
- Realm name: oes
- Client ID shared by UI and API: gms
- Per-service NATS clients each have their own Keycloak client ID and secret, injected via <release>-<client>-keycloak-env secrets
- The realm JSON (configs/ieee13/keycloak/realm/oes-realm.json) contains REPLACE_SECRET_<client-id> placeholders. The deployer replaces these with generated UUIDs before Helm runs.
- Grafana anonymous access is disabled (GF_AUTH_ANONYMOUS_ENABLED: "false")
Source: deployer/deploy.sh, chart/values.yaml
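The placeholder substitution in the realm JSON can be illustrated with a short sketch. It assumes plain string replacement of REPLACE_SECRET_<client-id> tokens; the function name, the sample realm document, and the historian-svc client ID are hypothetical, and the real deployer may perform the substitution differently.

```python
import json
import uuid
from typing import Dict, List, Tuple


def inject_client_secrets(realm_json: str, client_ids: List[str]) -> Tuple[str, Dict[str, str]]:
    """Replace REPLACE_SECRET_<client-id> placeholders with fresh UUIDs."""
    generated: Dict[str, str] = {}
    # Longest IDs first so "gms" cannot clobber a placeholder for "gms-api".
    for client_id in sorted(client_ids, key=len, reverse=True):
        placeholder = f"REPLACE_SECRET_{client_id}"
        if placeholder in realm_json:
            generated[client_id] = str(uuid.uuid4())
            realm_json = realm_json.replace(placeholder, generated[client_id])
    return realm_json, generated


# Hypothetical realm fragment with a single confidential client.
realm = json.dumps({
    "realm": "oes",
    "clients": [{"clientId": "historian-svc",
                 "secret": "REPLACE_SECRET_historian-svc"}],
})
rendered, client_secrets = inject_client_secrets(realm, ["historian-svc"])
print(json.loads(rendered)["clients"][0]["secret"] == client_secrets["historian-svc"])
```

The returned mapping is what would feed the per-client <release>-<client>-keycloak-env secrets, so Keycloak and the consuming service start with the same value.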
Non-Root / Container Security Posture
The chart uses a mixed hardening model rather than forcing one policy across every workload:
- Selected backend services that are known to support numeric non-root execution use explicit non-root security contexts. Examples include omegadss-svc, rpcdss-svc, ess-manager-svc, ess-tester-svc, nats-auth-svc, and keycloak.
- Additional backend services such as historian-svc, topology-genesis, openfmb-event-service, and der-dispatch-svc also carry explicit non-root-oriented security settings in their subchart values.
- Frontend apps generally carry baseline hardening (seccompProfile.type: RuntimeDefault, allowPrivilegeEscalation: false, dropped capabilities), but universal runAsNonRoot is not yet documented as safe for every frontend image.
- Some stateful or broker-style workloads remain intentionally more conservative because their startup routines still need image-default filesystem behavior. This currently applies to components such as NATS, Citus DB, opendso-apps-db, and ESS Manager Redis.
- MongoDB and keycloak-db have their own image-specific security contexts, but those should still be validated against the exact runtime path used in Marketplace testing.
In practice, the hardening baseline that is most consistently applied across the chart is:
- allowPrivilegeEscalation: false
- capabilities.drop: [ALL]
- seccompProfile.type: RuntimeDefault
Source: chart/values.yaml, chart/values-gcp.yaml
Documented Limitations or Exceptions
- MongoDB TLS is disabled; relies on ClusterIP network isolation
- GMS API's Docker API is pointed to http://127.0.0.1:2376 on GKE because GKE uses containerd (no Docker socket). Orchestration API calls will fail gracefully.
- keycloak-db (separate PostgreSQL for Keycloak) is disabled by default; Keycloak uses its built-in persistence with a PVC
- grafana.rbac.namespaced: true is set in values-gcp.yaml because the GCP Marketplace deployer SA only has namespace-scoped permissions
Source: chart/values-gcp.yaml
6. Networking
Public Endpoints
After deployment at <domain>:
| Endpoint | URL Pattern | Backend Service / Port |
|---|---|---|
| GMS / Genesis Node (root) | https://gms.<domain> and https://<domain> | genesis-node-app :8081 |
| Keycloak | https://keycloak.<domain> | keycloak-svc :8080 |
| Grafana | https://grafana.<domain> | grafana :80 |
| GMS API | https://api.<domain> | gms-api :8000 |
| NATS WebSocket | wss://nats.<domain> | nats-service-ws :9222 |
| GIS App | https://gis.<domain> | gis-app :8084 |
| One-Line App | https://oneline.<domain> | one-line-app :8085 |
| Event Viewer | https://eventviewer.<domain> | event-viewer-app :8088 |
| Inventory | https://inventory.<domain> | inventory-app :8089 |
| OpenFMB Event Creator | https://openfmbeventcreator.<domain> | openfmb-event-creator-app :8090 |
| Data Viewer | https://dataviewer.<domain> | data-viewer-app :8093 |
| Inspector (OpenFMB) | https://openfmb.<domain> | inspector-app :8086 |
| DER Dispatch | https://derdispatch.<domain> | der-dispatch-app :8095 |
| ESS Manager | https://device.<domain> | ess-manager-app :8094 |
| ESS Tester | https://esstesting.<domain> | ess-tester-app :8096 |
| Historian | https://historian.<domain> | historian-app :8087 |
| Schedule Dispatch | https://scheduledispatch.<domain> | schedule-dispatch-app :8094 |
| OpenDSO Docs | https://docs.<domain> | opendso-docs-app :8092 |
Source: chart/templates/ingress.yaml, chart/templates/ingress-api.yaml, chart/templates/ingress-nats-ws.yaml, opendso-gcp-marketplace/README.md
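Since every public endpoint is a fixed subdomain of global.domain, the URL set can be derived mechanically. The sketch below covers only a handful of rows from the table above and is not taken from the chart templates; the function name and the component keys are hypothetical.

```python
from typing import Dict

# Subset of the endpoint table: component -> subdomain label.
SUBDOMAINS: Dict[str, str] = {
    "keycloak": "keycloak",
    "grafana": "grafana",
    "gms-api": "api",
    "nats-ws": "nats",
    "gis-app": "gis",
    "one-line-app": "oneline",
}


def public_urls(domain: str) -> Dict[str, str]:
    """Derive browser-facing URLs from the base domain."""
    urls: Dict[str, str] = {}
    for component, label in SUBDOMAINS.items():
        # NATS is reached over secure WebSocket; everything else over HTTPS.
        scheme = "wss" if component == "nats-ws" else "https"
        urls[component] = f"{scheme}://{label}.{domain}"
    return urls


print(public_urls("opendso.example.com")["gms-api"])
```

This is also why a wildcard DNS record and a certificate covering all of these hosts are prerequisites: each application resolves to its own hostname under the same base domain.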
Internal Service Communication
- All service-to-service communication uses ClusterIP Services with the pattern <release>-<service>
- NATS messaging: internal services connect to <release>-nats-service:4222
- MongoDB: <release>-mongodb:27017
- Citus DB: <release>-citus-db:5432
- opendso-apps-db (TimescaleDB): <release>-opendso-apps-db:5432
- Keycloak internal: http://<release>-keycloak-svc:8080
- GMS API: <release>-gms-api:8000
- ESS Manager Redis: <release>-ess-manager-redis:6379 (assumed from Redis default)
Ingress Behavior
Three separate Ingress resources are created:
- <release>-ingress: all UI frontend apps, Keycloak, Grafana; shared TLS secret; nginx annotations for CORS, proxy timeouts, and SSL redirect
- <release>-ingress-api: GMS API at api.<domain>; CORS with dynamically computed origin list
- <release>-ingress-nats-ws: NATS WebSocket at nats.<domain>; backend-protocol: HTTPS, WebSocket enabled, 3600s timeouts

All Ingress resources use ingressClassName: nginx. TLS is applied via <release>-tls-secret, which covers all hosts.
Source: chart/templates/ingress.yaml, chart/templates/ingress-api.yaml, chart/templates/ingress-nats-ws.yaml
Ports and Protocols
| Protocol | Port | Usage |
|---|---|---|
| HTTPS | 443 | All external access via nginx ingress |
| HTTP | 80 | Redirected to HTTPS by nginx |
| NATS TCP | 4222 | Internal in-cluster NATS client connections |
| NATS WebSocket | 9222 | Internal; exposed externally via ingress as WSS |
| MongoDB | 27017 | Internal ClusterIP only |
| PostgreSQL (Citus) | 5432 | Internal ClusterIP only |
| PostgreSQL (apps-db) | 5432 | Internal ClusterIP only |
| Keycloak HTTP | 8080 | Internal; nginx terminates TLS externally |
7. Persistence and Data
Stateful Components
Stateful components use StatefulSets with PVCs:
| Component | StatefulSet | Data Stored |
|---|---|---|
| MongoDB | <release>-mongodb | GMS API settings, user config, application state (settings_api DB) |
| Citus DB | <release>-citus-db | OpenFMB historian time-series data (ofmb_db DB) |
| opendso-apps-db | <release>-opendso-apps-db | ESS tester data (ess_tester DB) and asset health data (assets DB, TimescaleDB features) |
| ESS Manager Redis | <release>-ess-manager-redis | ESS state caching |
Deployments with PVCs (not StatefulSets):
- Keycloak: realm/session data (keycloak.persistence.enabled: true, 10Gi)
- Grafana: dashboard persistence (10Gi)
- topology-genesis: parsed CIM topology data
- openfmb-event-service: event data
- asset-health-sim-svc: simulation data
PVC / Storage Expectations
- global.storageClass: standard-rwo (on GKE this class provisions balanced persistent disks, pd-balanced). pd-ssd is noted as recommended for production databases.
- All PVCs use ReadWriteOnce access mode (implied by standard-rwo)
- MongoDB: deleteOnUpgrade: false, so PVC data is preserved across Helm upgrades
What Data Is Persisted
- Grid Management System (GMS) configuration and user settings — MongoDB
- OpenFMB historian time-series data — Citus DB
- ESS testing data and asset health telemetry — opendso-apps-db (TimescaleDB)
- Keycloak realm state, users, sessions
- Grafana dashboards and preferences
- Topology CIM model (CIM XML parsed by topology-genesis)
Upgrade or Reinstall Implications
- Helm upgrade re-uses all existing Kubernetes secrets (NATS keys, Keycloak client secrets, database passwords) — fully idempotent by design
- The mongodb-init Job is deleted before each upgrade (immutable Job specs) and re-created; it runs idempotently
- Keycloak realm data is NOT re-imported on upgrades if a PVC already exists; it only imports on a fresh database. The deployer's step 4 (Keycloak Admin API sync) compensates for this by syncing client secrets post-deploy.
- PVC data is preserved across upgrades (deleteOnUpgrade: false on MongoDB)
- Uninstall (helm uninstall) does NOT delete PVCs by default; manual cleanup required
Source: chart/values-gcp.yaml, chart/README.md, deployer/deploy.sh
8. Operations
Basic Health / Verification Expectations
scripts/verify.sh performs the following checks post-deploy (10-minute timeout):
- StatefulSet readiness: <release>-mongodb
- Deployment readiness: <release>-keycloak, <release>-nats
- Database StatefulSet readiness attempts: <release>-citus-db, <release>-opendso-apps-db, <release>-keycloak-db
- Application service readiness: <release>-gms-api, <release>-historian-svc, <release>-nats-auth-svc
- NATS auth keys secret existence check
- Keycloak OIDC discovery via port-forward: http://localhost:18080/realms/oes/.well-known/openid-configuration
- GMS API health via port-forward: http://localhost:18081/api/health (falls back to /api)
Important implementation detail: although the database checks are written with || true, the helper they call exits the script on timeout. In practice, these database readiness checks are currently fatal, not non-fatal.
Source: scripts/verify.sh
Known Startup Dependencies
- Keycloak readinessProbe.initialDelaySeconds: 120, livenessProbe.initialDelaySeconds: 180; a long startup is expected
- Grafana uses the same values: 120s readiness, 180s liveness initial delay
- MongoDB init job waits for MongoDB readiness before running the init script
- Services that consume Keycloak client secrets via envFrom will fail if those secrets are missing, which is why the deployer creates them before Helm runs
- Keycloak Admin API sync in deploy.sh retries for up to 60 seconds (12 attempts × 5s) while waiting for Keycloak to become available
- Helm is invoked with --timeout 15m and no --wait flag, so the 15-minute timeout bounds individual Kubernetes operations during resource application (for example hook Jobs), not pod readiness
Source: chart/values.yaml, deployer/deploy.sh
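The bounded retry used for the Keycloak Admin API sync (12 attempts, 5 seconds apart) follows a common pattern that can be sketched as below. This is an illustration rather than the logic in deploy.sh: the function name and the is_ready probe are hypothetical, and the sleep function is injectable so the loop can be exercised without real waiting.

```python
import time
from typing import Callable


def wait_for_keycloak(is_ready: Callable[[], bool],
                      attempts: int = 12,
                      delay_seconds: float = 5.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll is_ready up to `attempts` times; return False instead of raising."""
    for _ in range(attempts):
        if is_ready():
            return True
        sleep(delay_seconds)  # 12 failed attempts x 5s = 60s worst case
    return False


# Hypothetical probe that succeeds immediately.
print(wait_for_keycloak(lambda: True))
```

Returning False rather than raising mirrors the non-fatal design: the client secrets are already in the imported realm JSON, so a missed sync does not have to abort the deployment.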
Upgrade Behavior
    helm upgrade <release-name> . \
      -f values-gcp.yaml \
      -f <user-values> \
      -n <namespace>

- All generated secrets are reused on re-runs (NATS keys, Keycloak secrets, DB passwords)
- MongoDB init Job is deleted before upgrade and re-applied (idempotent)
- topology-genesis ConfigMap is re-applied via --server-side
- Keycloak client secrets are synced to the live Keycloak instance via Admin API
- Application resource version is patched to 1.0.0 after Helm completes
Source: deployer/deploy.sh, chart/README.md
Uninstall / Cleanup Expectations
    # Uninstall release
    helm uninstall <release-name> -n <namespace>

    # PVCs must be deleted manually
    kubectl delete pvc -l app.kubernetes.io/instance=<release-name> -n <namespace>

    # Secrets not managed by Helm must be deleted manually
    kubectl delete secret <release-name>-nats-auth-keys \
      <release-name>-opendso-license \
      <release-name>-opendso-apps-db-secret \
      <release-name>-citus-db-secret \
      -n <namespace>

    # Per-client Keycloak env secrets
    kubectl delete secret -l ... -n <namespace>  # (no label selector documented)

    # Optional: delete namespace
    kubectl delete namespace <namespace>

The deployer-created secrets (nats-auth-keys, opendso-license, per-client Keycloak envs, opendso-apps-db-secret, citus-db-secret) are NOT managed by Helm and will NOT be deleted by helm uninstall.
Source: chart/README.md, deployer/deploy.sh
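Because no label selector is documented for the per-client Keycloak env secrets, one way to find them is by their <release>-<client>-keycloak-env naming pattern. The sketch below filters a list of secret names, which is assumed to come from something like kubectl get secrets -o name; the function name and sample names are hypothetical.

```python
from typing import List


def keycloak_env_secrets(release: str, secret_names: List[str]) -> List[str]:
    """Select secrets named <release>-<client>-keycloak-env for cleanup."""
    prefix, suffix = f"{release}-", "-keycloak-env"
    return sorted(
        name for name in secret_names
        if name.startswith(prefix)
        and name.endswith(suffix)
        # Require a non-empty <client> part between prefix and suffix.
        and len(name) > len(prefix) + len(suffix)
    )


names = [
    "opendso-gms-keycloak-env",
    "opendso-nats-auth-keys",
    "other-gms-keycloak-env",
    "opendso-historian-svc-keycloak-env",
]
print(keycloak_env_secrets("opendso", names))
```

Name matching is only a heuristic: if two releases in the same namespace share a name prefix, the result should be reviewed before deleting anything.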
Troubleshooting Guidance Present in Docs
From chart/README.md:
- Grafana pod pending → check CPU/memory resources, describe pod
- Secret not found → kubectl get secrets, recreate manually
- Grafana redirect issues → verify GF_SERVER_ROOT_URL
- Image pull errors → verify the GKE node service account has roles/artifactregistry.reader on the Artifact Registry; imagePullSecrets is empty ([]) in the GCP overlay, so image pulls rely on Workload Identity or the node SA IAM role, not a regsecret
- Helm dependency issues → rm -rf charts/*.tgz Chart.lock && helm dependency update
Debug commands documented:
- kubectl get all -n <namespace>
- kubectl get pods -n <namespace> -o wide
- kubectl logs -l app.kubernetes.io/name=<name> -n <namespace>
- kubectl get events -n <namespace> --sort-by='.lastTimestamp'
- kubectl get/describe ingress -n <namespace>
- Test service connectivity with a debug pod (busybox)
Note: A dedicated GKE troubleshooting guide exists at GKE_MARKETPLACE_TROUBLESHOOTING.md. The troubleshooting guidance in chart/README.md still serves as the lower-level Helm/Kubernetes quick reference.
Source: chart/README.md
9. Marketplace Review Considerations
GCP Marketplace Hard Requirements / Relevant Items
- Application CRD: app.k8s.io/v1beta1 Application resource is correctly created with partner_id: oesinc, product_id: opendso, partner_name: Open Energy Solutions Inc. (see chart/templates/application.yaml)
- Schema version: schemaVersion: v2, applicationApiVersion: v1beta1; correct for current Marketplace deployer_helm
- schema.yaml images section: populated with 29 repository/tag/digest mappings for the chart-managed default Marketplace image set. This covers the OpenDSO application images plus the main chart-managed infrastructure images such as NATS, Keycloak, MongoDB, Citus, Redis, and the core services. It does not describe every dependency-chart image path in the repo, most notably Grafana dependency images.
- managedUpdates: kalmSupported: false; KALM update support is not claimed
- All passwords use MASKED_FIELD, which correctly prevents plaintext display in the Marketplace UI
- helm --timeout 15m without --wait: the deployer runs helm upgrade --install with --timeout 15m but no --wait flag. Helm exits after resource application and readiness is checked separately by scripts/verify.sh.
- Deployer image: Must be built and pushed to Artifact Registry. Build instructions are in the README.
- mpdev verify tooling: scripts/mpdev.sh and scripts/provision-test-env.sh exist for local testing.
Potential Reviewer Questions or Weak Spots
- TLS certificate fallback remains a product-policy question: tls-secrets.yaml can still auto-generate a self-signed certificate when global.gcpMarketplace=true and no <release>-tls-secret exists. The fallback exists only for mpdev verify, controlled test installs, and installer resilience; it is not the intended production TLS model. A reviewer may still ask why that fallback remains enabled in Marketplace code paths.
- License model is intentionally out-of-band from Marketplace metering: OpenDSO uses OES-issued license.key and installation.key values and validates them through the license API path used by topology-nodes. Reviewers may ask whether Marketplace entitlement and OES licensing are expected to coexist or whether OES licensing is the sole enforcement mechanism.
- Artifact Registry readiness is still an operational dependency: the GCP overlay relies on node-level IAM / Workload Identity instead of imagePullSecrets, and the image mirroring process remains manual. A misconfigured registry or IAM binding fails at runtime rather than at chart render time.
- GMS API orchestration features are intentionally limited on GKE: gms-api.config.dockerApi is stubbed on GKE because there is no Docker socket. A reviewer may ask which user-visible features are unavailable as a result.
10. Reference Notes / Clarifications
Default Marketplace-enabled component set
The baseline chart enables the following by default in the Marketplace path: nats, keycloak, grafana, mongodb, citus-db, historian-svc, gms-api, openfmb-event-service, topology-genesis, topology-nodes, der-dispatch-app, der-dispatch-svc, genesis-node-app, data-viewer-app, event-viewer-app, gis-app, historian-app, inspector-app, inventory-app, one-line-app, opendso-docs-app, openfmb-event-creator-app, schedule-dispatch-app, ess-manager-app, ess-tester-app, ess-manager-svc, ess-tester-svc, opendso-apps-db, ess-manager-redis, omegadss-svc, and rpcdss-svc. Disabled by default are nats-auth-svc in base values, keycloak-db, asset-health-svc, asset-health-sim-svc, and the three CVR services. In the Marketplace path specifically, the deployer enables nats-auth-svc via a generated overlay. This means the practical Marketplace default set is the base enabled list plus nats-auth-svc.
License and installation-key validation behavior
topology-nodes enforces runtime license validation when built with ENABLE_LICENSE_VALIDATION=ON by calling POST <LICENSE_API_URL>/v1/license/validate with license_key, installation_key, and environment_name at startup and then re-validating every 8 hours by default. After 3 consecutive re-validation failures, it exits. The entitlement service in ../entitlement-py verifies both the HMAC-signed license key and the installation key, binds installation keys to an environment name, and enforces installation limits. The Marketplace deployer creates <release>-opendso-license with the env var names topology-nodes expects: LICENSE_KEY, LICENSE_INSTALLATION_KEY, and LICENSE_ENVIRONMENT_NAME.
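The re-validation rule (every 8 hours, exit after 3 consecutive failures) amounts to a small counter that resets on success. The sketch below models that rule only; it is not the topology-nodes implementation, the function name is hypothetical, and it assumes the first re-validation happens one interval after a successful startup check.

```python
from typing import Iterable, Optional

MAX_CONSECUTIVE_FAILURES = 3
REVALIDATION_INTERVAL_HOURS = 8


def hours_until_exit(revalidations: Iterable[bool]) -> Optional[int]:
    """Return the uptime in hours at which the process would exit, or None.

    Each item is the outcome of one periodic re-validation (True = valid).
    """
    consecutive = 0
    for index, ok in enumerate(revalidations, start=1):
        consecutive = 0 if ok else consecutive + 1  # any success resets the count
        if consecutive >= MAX_CONSECUTIVE_FAILURES:
            return index * REVALIDATION_INTERVAL_HOURS
    return None


print(hours_until_exit([False, False, False]))
print(hours_until_exit([False, False, True, False, False]))
```

Under these assumptions a license API outage has to persist across three checks, roughly a day of uptime, before topology-nodes stops, while a single successful check in between resets the window.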
NATS and ESS state during rolling updates
The current chart does not prove durable preservation of all transient state across rolling updates. NATS is deployed as a Deployment, not a StatefulSet, and its JetStream store_dir: datastore is not backed by a PVC in the chart, so in-flight or broker-local persisted NATS state should not be documented as durable across pod replacement. For ESS Manager, Redis state is persisted via a PVC-backed StatefulSet, but the service's DAY_AHEAD_DIR is mounted from emptyDir, so working files in /var/lib/ess/dayahead are ephemeral across pod replacement. The correct stance is that some state is durable (Redis PVC), some is explicitly not (emptyDir working files), and the repo does not establish a guarantee for preserving in-flight NATS messages across rolling upgrades.