Production Deployment on AWS EC2

This guide provides step-by-step instructions for deploying OpenDSO in a production environment on AWS EC2 instances using Terraform Infrastructure as Code (IaC). It repeats some steps from the Local Deployment page, but goes deeper into common IaC structures and how your team and OES have deployed OpenDSO in your production environment.

Overview

The production deployment uses Terraform to automate the provisioning and configuration of AWS infrastructure. The deployment:

Provisions EC2 instances with Amazon Linux
Configures security groups and networking
Automatically installs Docker and dependencies on the EC2 instances
Downloads versioned release archives (zips) of opendso, config, and models from OES's GitHub releases via setup-opendso.sh
Sets up TLS certificates using Let's Encrypt with DNS-01 challenge
Supports multiple environments (Lab/Test and Field)

Repository Overview

Terraform IaC projects are usually named {client_project}-iac.

Key Files:

main.tf - Main Terraform configuration defining AWS resources
variables.tf - Input variable definitions
outputs.tf - Output definitions (SSH key, IP address)
run.sh - Helper script for Terraform operations
*.tfvars - Environment-specific variable files
assets/ - Provisioning scripts and configuration

Environment Files:

{environment_name}.tfvars

Prerequisites

Local Machine Requirements

Terraform >= 1.2.0
AWS CLI v2 (configured with credentials)
GitHub Personal Access Token (for downloading releases)
SSH client
git

Install Terraform

# Linux (Ubuntu/Debian)
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform

# macOS (Homebrew)
brew tap hashicorp/tap
brew install hashicorp/tap/terraform

# Verify installation
terraform --version

Configure AWS CLI

# Install AWS CLI v2 (Linux x86_64 installer)
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install

# Configure credentials
aws configure
# Enter: AWS Access Key ID, Secret Access Key, Region, Output format

GitHub Personal Access Token

Create a GitHub PAT with repo scope for downloading OpenDSO releases:

Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
Click "Generate new token (classic)"
Select scope: repo (Full control of private repositories)
Generate and save the token securely

AWS Infrastructure Overview

The usual Terraform configuration creates the following AWS resources. Note: The actual deployed infrastructure may differ from the Terraform configuration due to manual modifications.

EC2 Instance (Actual Deployed Configuration)

AMI: Amazon Linux
Instance Type: t2.large (2 vCPU, 8GB RAM)
Root Volume: 8GB or more EBS
Connection: SSH via generated RSA key pair (4096-bit RSA)
Tags: Name=OpenDSO, CreatedBy=terraform, autoShutdown=Off

Security Group

Name: oes-sg
Ingress Rules (from actual EC2 console):
- Port 22 (SSH) from 0.0.0.0/0
- Port 443 (HTTPS) from 0.0.0.0/0
- Port 8080 (HTTP Alt) from 0.0.0.0/0
- Port 3389 (RDP) from 0.0.0.0/0
Egress: All traffic to 0.0.0.0/0 (IPv4) and ::/0 (IPv6)

SSH Key Pair

Algorithm: RSA 4096-bit
Generated by Terraform: Automatic key generation
Output: Private key should be saved to ~/.ssh/tf_id_rsa.pem

Network Configuration

Uses existing VPC and subnet (specified in tfvars)
Private IP address assigned from subnet
Requires VPN or jump host for access

Deployment Process

Step 1: Clone the IaC Repository

git clone https://github.com/openenergysolutions/{client_project}-iac.git

Step 2: Review Environment Configuration

Choose the appropriate environment file:

OES Test Example Environment:

cat oes-test.tfvars

aws_region    = "example"
aws_subnet_id = "example"
aws_vpc_id    = "example"
sshkey_name   = "tf_id_rsa"

Step 3: Review Deployment Configuration

The config.json file defines which GitHub repositories setup-opendso.sh will download release archives from, and what local directory each archive should be unzipped into. On a deployed host, both config.json and setup-opendso.sh live in the home directory next to the unzipped opendso/, config/, and models/ folders:

cat assets/config.json

{
  "organization": "openenergysolutions",
  "releases": [
    {
      "repositoryName": "opendso-docker-compose",
      "displayName": "opendso"
    },
    {
      "repositoryName": "{client_project}-docker-compose",
      "displayName": "models"
    },
    {
      "repositoryName": "{client_project}-config-docker-compose",
      "displayName": "config"
    }
  ]
}

How this is used at deploy time:

For each entry, setup-opendso.sh calls https://api.github.com/repos/{organization}/{repositoryName}/releases/latest to find the most recent published release, downloads its first attached asset (the config.zip, models.zip, or opendso.zip produced by each repo's tag-archive workflow), and unzips it into ~/{displayName}/. See Versioned Release Archives for the workflow that publishes those zips.

Important: setup-opendso.sh always pulls the latest release. To roll a deployment forward you push a new git tag on the config or models repo (which triggers the release workflow) and then re-run setup-opendso.sh on the host. See Tagging Versioned Updates below.
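
The lookup described above can be sketched as follows. This illustration is based only on the description in this section, not on the contents of setup-opendso.sh. Both function names are hypothetical, and list_release_urls assumes jq is installed.

```bash
# Build the "latest release" API URL for one repository.
# $1 = organization, $2 = repositoryName (both come from config.json)
latest_release_url() {
  printf 'https://api.github.com/repos/%s/%s/releases/latest\n' "$1" "$2"
}

# Print one "displayName<TAB>URL" line per entry in a config.json file.
# Assumes jq is available. The body runs in a subshell to keep variables local.
list_release_urls() (
  config=$1
  org=$(jq -r '.organization' "$config")
  jq -r '.releases[] | "\(.repositoryName)\t\(.displayName)"' "$config" |
    while IFS="$(printf '\t')" read -r repo name; do
      printf '%s\t%s\n' "$name" "$(latest_release_url "$org" "$repo")"
    done
)
```

Running list_release_urls assets/config.json shows which endpoint would be queried for each target directory, which is a quick way to spot a misspelled repositoryName before deploying.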

Step 4: Deploy Infrastructure

Use the run.sh helper script to deploy:

# Deploy to environment
./run.sh -i ./oes-test.tfvars -p -s

Script Options:

-i <file> - Specify tfvars file (default: dev.tfvars)
-p - Prompt for GitHub PAT (Personal Access Token)
-s - Setup (create and provision infrastructure)
-t - Teardown (destroy infrastructure)
-h - Display help
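
For readers unfamiliar with how a helper script typically handles flags like these, the sketch below parses the same options with getopts. It is a hypothetical illustration, not the actual run.sh; the function name and the variables TFVARS, PROMPT_PAT, and ACTION are assumptions.

```bash
# Parse -i <file>, -p, -s, -t, -h into shell variables.
parse_run_args() {
  TFVARS="dev.tfvars"   # default named in the option list above
  PROMPT_PAT=0
  ACTION=""
  OPTIND=1              # reset so the function can be called more than once
  while getopts "i:psth" opt; do
    case "$opt" in
      i) TFVARS=$OPTARG ;;
      p) PROMPT_PAT=1 ;;
      s) ACTION="setup" ;;
      t) ACTION="teardown" ;;
      h) ACTION="help" ;;
      *) return 2 ;;    # unknown option or missing argument
    esac
  done
}
```

After parse_run_args -i ./oes-test.tfvars -p -s, TFVARS holds ./oes-test.tfvars, PROMPT_PAT is 1, and ACTION is setup.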

What Happens During Deployment:

Terraform Initialization
- Downloads AWS provider plugins
- Initializes backend
Infrastructure Creation
- Creates security group with SSH and HTTPS rules
- Generates SSH key pair (RSA 4096-bit)
- Provisions EC2 instance
Automatic Provisioning
- Copies provisioning scripts to EC2 instance
- Runs init.sh - Installs Docker, Docker Compose, golang
- Runs setup-opendso.sh - Downloads the latest tagged release archives (opendso.zip, config.zip, models.zip) from GitHub and unzips them into ~/opendso/, ~/config/, and ~/models/
Output
- SSH private key saved to ~/.ssh/tf_id_rsa.pem
- EC2 instance private IP displayed

Step 5: Access the Instance

The deployment outputs the EC2 private IP address. Since the instance is in a private subnet, you need VPN or jump host access:

# View the private IP (output is marked sensitive)
terraform output -raw apphost_ip

# Or extract from terraform output
PRIVATE_IP=$(terraform output -raw apphost_ip)
echo $PRIVATE_IP

# SSH to the instance (requires VPN connection to the VPC)
ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@$PRIVATE_IP

Note: The apphost_ip output is marked as sensitive, so a plain terraform output lists it as <sensitive>. To view the value, request the output by name, for example terraform output -raw apphost_ip.
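
Because access depends on reaching a private address, a quick sanity check on the value can save a confusing SSH timeout. The helper below is a hypothetical sketch, not part of the IaC repository. It tests whether an IPv4 address falls in the RFC 1918 private ranges and assumes well-formed dotted-quad input.

```bash
# Return 0 if $1 is in 10.0.0.0/8, 172.16.0.0/12, or 192.168.0.0/16.
is_private_ipv4() {
  case "$1" in
    10.*|192.168.*) return 0 ;;
    172.1[6-9].*|172.2[0-9].*|172.3[01].*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For example, is_private_ipv4 "$(terraform output -raw apphost_ip)" && echo "VPN or jump host required" prints the reminder only when the instance has a private address.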

Post-Deployment Configuration

Once connected to the EC2 instance, complete the setup:

Verify Installation

# Check Docker installation
docker --version
docker compose version

# Check downloaded repositories
ls -la ~/
# Should see: opendso/, models/, config/ directories

Configure DNS

The deployment uses Let's Encrypt with DNS-01 challenge for TLS certificates. The domain is configured in setup-certs.sh.

Update DNS records to point to your instance:

Create an A record for {client_address_project_name}.oesinc.dev pointing to the instance IP (or jump host IP)
Create a wildcard A record for *.{client_address_project_name}.oesinc.dev pointing to the same IP

Set Up Certificates

The setup-certs.sh script uses certbot with Google Cloud DNS for Let's Encrypt certificates.

Prerequisites:

Google Cloud DNS credentials file at ~/.secrets/
Domain delegation to Google Cloud DNS

Run Certificate Setup:

# Install certbot and Google DNS plugin
sudo yum install -y certbot python3-certbot-dns-google

# Place Google Cloud credentials
mkdir -p ~/.secrets
# Upload your Google Cloud DNS credentials JSON file

# Run certificate generation
chmod +x setup-certs.sh
./setup-certs.sh

What This Does:

Generates Let's Encrypt certificates using DNS-01 challenge
Creates certificates for both the base domain and the wildcard domain
Copies certificates to ~/certs/ directory

Generated Certificates:

~/certs/
├── fullchain.pem       # Full certificate chain
├── server-cert.pem     # Server certificate
├── server-key.pem      # Private key
├── chain.pem           # Intermediate chain
├── rootCA.pem          # Root CA
└── {other_certs}.pem   # Depending on the client deployment, you may generate additional certs

Configure Database Credentials

Before deploying services, configure database usernames and passwords for the various OpenDSO services. Database credentials are typically stored in environment files within the config repositories.

Common Database Credential Locations:

GMS API MongoDB Credentials - config/gms-api/env/production.env:

MONGODB_DOMAIN = mongodb:27017
MONGODB_DBNAME = settings_api
MONGODB_USERNAME = SettingsAPIUser
MONGODB_PASSWORD = your_secure_password

Docker Environment Variables - config/docker/.env:

# PostgreSQL credentials for various services
CITUS_PGUSER="postgres"
CITUS_PGPASSWORD="your_secure_password"

HISTORIAN_PGUSER="historian"
HISTORIAN_PGPASSWORD="your_secure_password"

DER_DISPATCH_PGUSER="opendso"
DER_DISPATCH_PGPASSWORD="your_secure_password"

# Keycloak admin credentials
KEYCLOAK_ADMIN="admin"
KEYCLOAK_PASSWORD="your_secure_password"

Security Best Practices:

Change Default Passwords: Always replace default passwords before production deployment
Use Strong Passwords: Generate complex passwords with sufficient entropy
Restrict File Permissions: Limit access to environment files containing credentials
```
chmod 600 ~/config/gms-api/env/production.env
chmod 600 ~/config/docker/.env
```
Consider Secrets Management: For enhanced security, consider using AWS Secrets Manager or similar services instead of plain-text environment files
Avoid Version Control: Ensure .env files with real credentials are in .gitignore

Note: The environment files shown here contain example credentials from project templates. Update these with your actual secure credentials appropriate for your production environment.
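
One way to catch leftover template values before deploying is to scan the environment files for the placeholder string used in the examples above. This is a minimal sketch with a hypothetical function name. It only detects the literal your_secure_password placeholder, not weak passwords in general.

```bash
# Print "file:line" for every line that still contains the template placeholder.
# Prints nothing when the files are clean.
find_placeholder_passwords() {
  # Passing /dev/null as an extra file makes grep print the file name
  # even when only one real file is given.
  grep -n 'your_secure_password' "$@" /dev/null | cut -d: -f1,2
}
```

For example, find_placeholder_passwords ~/config/docker/.env ~/config/gms-api/env/production.env lists each location that still needs a real credential.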

Deploy OpenDSO Services

Navigate to the OpenDSO orchestration directory and deploy:

cd ~/opendso/opendso-docker-compose

# View available profiles
./run.sh -l

# Deploy all services
./run.sh -p all -c

# Verify deployment
docker ps

Verify Deployment

Access services via your domain, for example https://{client_address_project_name}.oesinc.dev.

Terraform Operations

Manual Terraform Commands

If you prefer direct Terraform commands over the run.sh script:

# Initialize Terraform
terraform init

# Plan changes
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"

# Apply changes
terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve

# Show outputs
terraform output

# Destroy infrastructure (careful!)
terraform destroy -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve

State Management

Terraform state is stored locally by default.

Updating Deployments

Update OpenDSO Components

Updates roll forward by pushing a new git tag on the config and/or models repo (which triggers the GitHub Actions release workflow described in Tagging Versioned Updates) and then re-running setup-opendso.sh on the host so it picks up the new latest release:

# SSH to instance
ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@<PRIVATE_IP>

# Stop services
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d

# Re-run setup to download the latest tagged release archives
# (config.zip / models.zip / opendso.zip from each repo's GitHub releases)
export GITHUB_TOKEN="your-token"
cd ~
./setup-opendso.sh

# Restart services
cd ~/opendso/opendso-docker-compose
./run.sh -p all -c

Note: setup-opendso.sh always pulls the latest GitHub release for each repo. Make sure the tag you intend to deploy is the most recent published release on the config and models repos before running this; otherwise the host will pick up whichever tag is most recent.

Update Terraform Configuration

# From the IaC git directory
git pull

# Review changes
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"

# Apply updates
terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"

Tagging Versioned Updates

A client deployment is "versioned" by which git tags are the most recent published releases on the config, models, and opendso repositories. The deployment host pulls the latest release of each via setup-opendso.sh. To roll forward (or back) you cut a new tag, let the GitHub Actions workflow build and publish the archive, then re-run setup-opendso.sh on the host.

When to Tag

Tag a new version any time you have a deployable change to one of these repos:

{client_repo_name}-config-docker-compose: environment variables, image tags in docker/.env, service config files. Most client-driven changes (image version bumps, credential updates, new service tunables) live here.
{client_repo_name}-docker-compose (the models repo): OpenFMB adapter mappings for the client's field devices. Tag whenever adapter coverage or configuration changes.
opendso-docker-compose — base orchestration. Usually tagged by the OpenDSO platform team, not the client team.

Tagging Convention

Use SemVer-style tags (e.g. 1.4.0, 1.4.1). The workflow accepts any tag pattern (tags: '*'), but staying consistent makes the GitHub release list and the .version file dropped into each archive readable.

# From a clean checkout of the repo you want to release
git checkout main
git pull

# Create and push the tag
git tag 1.4.0
git push origin 1.4.0
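
Since the workflow accepts any tag pattern, a local check is one way to keep tags consistent before pushing. The helper below is a hypothetical sketch that accepts only plain MAJOR.MINOR.PATCH tags. It deliberately rejects prefixes such as v and pre-release suffixes, which you may want to allow in your own convention.

```bash
# Return 0 if $1 looks like MAJOR.MINOR.PATCH with numeric parts (e.g. 1.4.0).
is_semver_tag() {
  printf '%s\n' "$1" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'
}
```

A guarded tag command could then read: is_semver_tag "$TAG" && git tag "$TAG" && git push origin "$TAG", where TAG is a variable you set yourself.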

What Happens on Push

The .github/workflows/main.yaml in the config and models repos triggers on the tag push and:

Checks out the tagged commit with full history.
Runs GitVersion to derive a version.
Writes the tag name into a .version file at the repo root so the unpacked archive on the deployment host carries the version stamp (cat ~/config/.version will show the tag a deployed host is running).
Zips the working tree (excluding .git*, .vscode/*, .editorconfig) into config.zip (config repo) or models.zip (models repo).
Creates a GitHub release for the tag and attaches the zip as a release asset.

You can confirm a release published correctly by visiting the repo's Releases page on GitHub and verifying the new tag has a config.zip or models.zip attached.

Deploying the New Tag

Once the workflow has finished and the new tag is the most recent release on its repo, deploy it to the host:

ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@<PRIVATE_IP>

# Stop services so the archive contents can be replaced cleanly
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d

# Re-run setup-opendso.sh — this will fetch the *latest* release of each repo
# listed in config.json and overwrite ~/opendso/, ~/config/, and ~/models/
export GITHUB_TOKEN="your-token"
cd ~
./setup-opendso.sh

# Confirm the deployed version stamp
cat ~/config/.version
cat ~/models/.version

# Bring services back up
cd ~/opendso/opendso-docker-compose
./run.sh -p all -c

Notes and gotchas:

setup-opendso.sh always pulls releases/latest. There is no per-tag pinning in config.json. If you need to roll back, the supported path is to either re-publish an older tag as the latest release on GitHub, or to manually download and unzip the older release asset on the host.
Pushing a tag that already exists will not re-run the workflow. Delete the tag remotely (git push --delete origin 1.4.0) and the GitHub release before re-tagging if you need to rebuild an archive at the same version.
The workflow only runs on tag pushes. Pushing commits to main does not publish a new archive — nothing changes for deployed hosts until a tag is cut.
unzip -o overwrites files in place but does not delete files that were removed in the new tag. If a release removes a file, manually delete it from ~/config/ or ~/models/ (or wipe the directory before running setup-opendso.sh).
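
To find files that a new release removed but that are still present on the host, the deployed directory can be compared against the archive's file listing. The function below is a hypothetical sketch. It assumes the listing is a text file with one archive path per line, which Info-ZIP's unzip -Z1 config.zip would produce.

```bash
# List files present under a deployed directory but absent from an archive.
# $1 = deployed directory (e.g. ~/config)
# $2 = text file with one archive path per line
# The body runs in a subshell so its variables stay local.
stale_files() (
  dir=$1
  listing=$2
  deployed=$(mktemp)
  archived=$(mktemp)
  # Paths relative to the directory, sorted so comm can compare them.
  (cd "$dir" && find . -type f | sed 's|^\./||' | LC_ALL=C sort) > "$deployed"
  # Drop directory entries (trailing slash) from the archive listing.
  grep -v '/$' "$listing" | LC_ALL=C sort > "$archived"
  # comm -23 keeps only lines unique to the first file.
  LC_ALL=C comm -23 "$deployed" "$archived"
  rm -f "$deployed" "$archived"
)
```

For example, unzip -Z1 config.zip > /tmp/config.list followed by stale_files ~/config /tmp/config.list prints candidates for manual deletion. Review the output before removing anything, since files created on the host after deployment will also appear.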

For troubleshooting failed workflows, missing release assets, or setup-opendso.sh errors, see Production Deployment Troubleshooting → Release Archive and setup-opendso.sh Issues.

Provisioning Scripts Explained

init.sh

Installs system dependencies and Docker:

Updates system packages
Installs unzip, golang, nss-tools, docker
Enables and starts Docker service
Installs Docker Compose
Adds ec2-user to docker group

setup-opendso.sh

Downloads the versioned OpenDSO release archives from GitHub and unpacks them into the deployment layout that run.sh expects.

What it does:

Reads config.json (in the same directory it is run from) to get the GitHub organization and the list of releases to fetch
For each entry, calls GET /repos/{organization}/{repositoryName}/releases/latest and reads .assets[0].id (the first attached asset on the latest release) — this is the config.zip, models.zip, or opendso.zip published by each repo's tag-archive workflow
Downloads the asset with Accept: application/octet-stream against /repos/{organization}/{repositoryName}/releases/assets/{ASSET_ID} and writes it to {displayName}.zip
Runs unzip -o {displayName}.zip -d {displayName} so the archive contents land in ~/opendso/, ~/config/, and ~/models/
If a repo has no releases yet, the API returns a null asset id and the script logs Unable to find release for {OWNER}/{REPO} and skips it (the script does not fail the whole deployment)

Requires:

GITHUB_TOKEN environment variable with repo scope on the OES org (the script exits with code 5 if it is unset)
jq and curl available on the host (installed by init.sh)
config.json present in the working directory
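
The requirements above can be checked up front before running the script. The following is a hypothetical sketch, not code from setup-opendso.sh. The return code 5 for a missing token mirrors the behavior described above, while codes 6 and 7 are arbitrary values chosen for this example.

```bash
# Check the documented requirements for setup-opendso.sh.
preflight_setup() {
  if [ -z "${GITHUB_TOKEN:-}" ]; then
    echo "GITHUB_TOKEN is not set" >&2
    return 5
  fi
  for tool in jq curl; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing required tool: $tool" >&2
      return 6   # arbitrary code for this sketch
    fi
  done
  if [ ! -f config.json ]; then
    echo "config.json not found in $(pwd)" >&2
    return 7     # arbitrary code for this sketch
  fi
}
```

Running cd ~ && preflight_setup && ./setup-opendso.sh starts the download only when all three requirements hold.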

Why this matters for upgrades: because the script only ever asks for releases/latest, the way to change what gets deployed is to change which tagged release is the most recent on the config / models / opendso repo — see Tagging Versioned Updates.

setup-certs.sh

Generates TLS certificates using Let's Encrypt:

Uses certbot with DNS-01 challenge (Google Cloud DNS)
Generates wildcard certificates
Copies certificates to ~/certs/ directory
Creates OpenADR-specific certificate files
Sets appropriate file permissions

Requires: Google Cloud DNS credentials

Monitoring and Maintenance

View Logs

# Docker service logs
sudo journalctl -u docker -f

# Container logs
docker compose logs -f

# System logs
sudo journalctl -f

Monitor Resources

# Docker stats
docker stats

# System resources
htop
df -h
free -h

Database Backup and Restore

OpenDSO uses multiple databases to store different types of data:

MongoDB - Stores GMS API configuration data, user settings, and application state
PostgreSQL (Topology Genesis) - Stores parsed CIM topology data and equipment information
SQLite (OpenADR Service) - Stores OpenADR VEN/VTN registration and event data

Backup MongoDB (GMS API Data)

The run.sh script provides a convenient backup command that uses mongodump to create a binary archive of the MongoDB database:

cd ~/opendso/opendso-docker-compose
./run.sh -b

What this does:

Connects to the running MongoDB container
Authenticates using credentials from environment variables
Creates a binary archive dump of the database
Saves the output to db.dump in the current directory

Requirements:

MongoDB container must be running
Credentials must be properly configured in the environment

Manual Backup (Alternative):

If you need more control over the backup process:

# Backup with custom filename
# Note: Use the database name as authenticationDatabase (typically settings_api)
docker exec mongodb sh -c 'mongodump --authenticationDatabase ${MONGODB_COLLECTION} \
  -u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
  --db ${MONGODB_COLLECTION} --archive' > gms-backup-$(date +%Y%m%d-%H%M%S).dump

# Backup to a directory (not archive)
docker exec mongodb sh -c 'mongodump --authenticationDatabase ${MONGODB_COLLECTION} \
  -u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
  --db ${MONGODB_COLLECTION} --out /data/backup'

Important: The SettingsAPIUser is created in the settings_api database, not in the admin database. Therefore, use --authenticationDatabase settings_api (or the database name from ${MONGODB_COLLECTION}) instead of --authenticationDatabase admin.

Restore MongoDB Database

To restore from a backup:

cd ~/opendso/opendso-docker-compose
./run.sh -r

Requirements:

db.dump file must exist in the current directory
MongoDB container must be running

Manual Restore (Alternative):

# Restore from custom backup file
# Note: Use the database name as authenticationDatabase (typically settings_api)
docker exec -i mongodb sh -c 'mongorestore --authenticationDatabase ${MONGODB_COLLECTION} \
  -u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
  --db ${MONGODB_COLLECTION} --archive' < gms-backup-20240101.dump

# Drop existing database before restore
docker exec -i mongodb sh -c 'mongorestore --authenticationDatabase ${MONGODB_COLLECTION} \
  -u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
  --db ${MONGODB_COLLECTION} --drop --archive' < db.dump

Backup PostgreSQL (Topology Genesis Data)

The Topology Genesis service uses PostgreSQL to store parsed CIM files and topology data:

# Backup PostgreSQL database
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' > topology-backup-$(date +%Y%m%d-%H%M%S).sql

# Compressed backup
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' | gzip > topology-backup-$(date +%Y%m%d-%H%M%S).sql.gz

Restore PostgreSQL:

# Restore from SQL dump
docker exec -i postgres sh -c 'psql -U ${POSTGRES_USER} ${POSTGRES_DB}' < topology-backup.sql

# Restore from compressed backup
gunzip < topology-backup.sql.gz | docker exec -i postgres sh -c 'psql -U ${POSTGRES_USER} ${POSTGRES_DB}'

Backup OpenADR Service Data

The OpenADR service uses SQLite for VEN registrations and event data:

# Locate and backup SQLite database
docker exec oadr-service sh -c 'sqlite3 /app/data/oadr.db ".backup /tmp/oadr-backup.db"'
docker cp oadr-service:/tmp/oadr-backup.db ./oadr-backup-$(date +%Y%m%d-%H%M%S).db

# Or simply copy the database file
docker cp oadr-service:/app/data/oadr.db ./oadr-backup-$(date +%Y%m%d-%H%M%S).db

Restore OpenADR Database:

# Stop the service first
docker stop oadr-service

# Copy backup to container
docker cp oadr-backup.db oadr-service:/app/data/oadr.db

# Restart service
docker start oadr-service

Complete System Backup Example

Use this script example to help create a backup of OpenDSO database systems:

#!/bin/bash
# Complete backup script

BACKUP_DIR="./backups/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"

echo "Starting OpenDSO complete backup..."

# Backup MongoDB (GMS API)
echo "Backing up MongoDB..."
./run.sh -b
mv db.dump "$BACKUP_DIR/mongodb-gms.dump"

# Backup PostgreSQL (Topology Genesis)
echo "Backing up PostgreSQL..."
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' | gzip > "$BACKUP_DIR/postgres-topology.sql.gz"

# Backup OpenADR Service (if running)
if docker ps | grep -q oadr-service; then
  echo "Backing up OpenADR Service..."
  docker cp oadr-service:/app/data/oadr.db "$BACKUP_DIR/oadr-service.db"
fi

# Backup configuration files
echo "Backing up configuration..."
cp -r ../config "$BACKUP_DIR/config-backup"

# Example copy to S3 (if configured)
# aws s3 cp "$BACKUP_DIR" s3://your-backup-bucket/opendso/backup-$(date +%Y%m%d) --recursive

echo "Backup complete: $BACKUP_DIR"
ls -lh "$BACKUP_DIR"
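
A backup is only useful if its files are actually present and readable. The function below is a hypothetical sketch that checks the artifacts written by the example script above, using the same file names. It does not validate the database contents. A failed pg_dump can still leave behind a valid but nearly empty gzip file, so treat a pass as a basic sanity check only.

```bash
# Return 0 only if the expected backup artifacts exist and look usable.
# $1 = backup directory created by the example script above.
# The body runs in a subshell so its variables stay local.
verify_backup() (
  dir=$1
  status=0
  # The MongoDB and PostgreSQL dumps must exist and be non-empty.
  for f in mongodb-gms.dump postgres-topology.sql.gz; do
    if [ ! -s "$dir/$f" ]; then
      echo "missing or empty: $f" >&2
      status=1
    fi
  done
  # gzip -t checks the compressed stream for corruption.
  if [ -s "$dir/postgres-topology.sql.gz" ] &&
     ! gzip -t "$dir/postgres-topology.sql.gz" 2>/dev/null; then
    echo "corrupt gzip: postgres-topology.sql.gz" >&2
    status=1
  fi
  if [ ! -d "$dir/config-backup" ]; then
    echo "missing: config-backup/" >&2
    status=1
  fi
  return "$status"
)
```

Appending verify_backup "$BACKUP_DIR" || echo "Backup verification failed" >&2 to the end of the backup script would flag incomplete runs.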

Certificate Renewal

Let's Encrypt certificates expire after 90 days, so the certificates have to be reissued and the running services have to pick up the new files. Each cert-consuming service mounts the host's ~/certs/ directory, so once new certificates are written the containers only need to be restarted to reload them — they do not need to be destroyed and recreated.

The recommended approach is therefore to reissue the certificates and restart the containers in place.

Reissue and restart in place (recommended)

Step 1 — Issue the new certificates

Run the certificate script. This refreshes the files in ~/certs/ in place:

cd ~
./setup-certs.sh

If setup-certs.sh reports errors (for example, a Google Cloud DNS or certbot failure), stop here and resolve those first — see Let's Encrypt Certificate Generation Fails. The rest of this process assumes new certificates were written successfully.

Step 2 — Restart the containers so they reload the certs

The certificates are read when a container starts, so the running containers keep serving the old certificate until they are restarted. docker compose ... restart stops and starts the existing containers — it does not run down/up, so no containers, volumes, or networks are removed and no database data is touched.

Restart the whole stack with --profile all, using the same env file you deployed with:

# cd to wherever the compose project lives on your host. This is
# ~/opendso/opendso-docker-compose on a standard deployment, but may be
# ~/opendso or elsewhere depending on how the host was set up.
cd ~/opendso/opendso-docker-compose

docker compose --env-file ../config/docker/.env --profile all restart

Use --profile all, not a subset. Restarting only some profiles (for example --profile ui --profile openadr) fails with dependency errors, because a service restarted on its own still expects the services listed in its depends_on to be part of the same operation. --profile all restarts the full dependency graph together, which is what works in practice.

Step 3 — Verify

Confirm the containers came back up, then check that the served certificate is the new one:

docker ps

# Inspect the certificate the running endpoint is now serving
echo | openssl s_client -connect localhost:443 -servername <your-domain> 2>/dev/null \
  | openssl x509 -noout -dates

The notBefore/notAfter dates should reflect the certificate you just issued.

Automating renewal (cron)

Because the restart-in-place method does not destroy anything, it is safe to run unattended. Create a renewal script and schedule it monthly:

# Create renewal script
cat > ~/renew-certs.sh <<'EOF'
#!/bin/bash
set -e
cd ~
./setup-certs.sh
cd ~/opendso/opendso-docker-compose
docker compose --env-file ../config/docker/.env --profile all restart
EOF

chmod +x ~/renew-certs.sh

# Add to crontab (run monthly)
crontab -e
# Add: 0 0 1 * * /home/ec2-user/renew-certs.sh
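
If you would rather reissue only when the current certificate is close to expiry, the renewal script can check the remaining lifetime first. The helper below is a hypothetical sketch that uses openssl x509 -checkend; the function name and the 30-day default are assumptions, and whether setup-certs.sh itself skips unnecessary reissues is not covered here.

```bash
# Return 0 if the certificate expires within the given number of days
# (or cannot be read), 1 if it remains valid beyond that window.
# $1 = certificate file, $2 = days (default 30)
# The body runs in a subshell so its variables stay local.
cert_needs_renewal() (
  cert=$1
  days=${2:-30}
  seconds=$((days * 86400))
  # openssl exits 0 when the certificate will NOT expire within $seconds.
  if openssl x509 -checkend "$seconds" -noout -in "$cert" >/dev/null 2>&1; then
    return 1
  fi
  return 0
)
```

Defining this function in renew-certs.sh and adding cert_needs_renewal ~/certs/server-cert.pem 30 || exit 0 before ./setup-certs.sh would skip the reissue and restart while more than 30 days remain.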

Reissue and rebuild the deployment (last resort)

If a restart alone does not pick up the new certificates, you can fully recreate the stack:

cd ~
./setup-certs.sh
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d
./run.sh -p all -c

Warning: This destroys containers and removes volumes. ./run.sh -p all -d runs docker compose down -v, which tears down every container and removes both anonymous volumes and the named volumes declared in the Compose file. Any in-memory state and any database data that is not held in an external volume or a host bind mount, or backed up, will be lost. Do not run this unattended (for example from cron). Back up first (see Database Backup and Restore) and only proceed when you are confident nothing will be erased. Prefer the restart-in-place method above wherever possible.

Tearing Down Infrastructure

To destroy the infrastructure:

# Using run.sh script
./run.sh -i ./oes-test.tfvars -p -t

# Or using Terraform directly
terraform destroy -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve

Warning: This permanently deletes the EC2 instance and all data. Ensure backups are taken first.

Troubleshooting

For production deployment issues, see the Production Deployment Troubleshooting Guide.

Common issues covered include:

Terraform deployment failures
Provisioner script errors
EC2 access and SSH issues
DNS and certificate problems
Infrastructure drift detection and resolution
Multi-environment management
Emergency procedures and disaster recovery

For Docker and container-specific issues, see the Docker Troubleshooting Guide.

Infrastructure Drift Management

Understanding Drift

Infrastructure drift occurs when the actual deployed infrastructure differs from the Terraform configuration.

Causes of Drift

Manual AWS Console changes after Terraform deployment
EBS volume resizing performed outside Terraform
Security group rule additions for operational needs
Terraform configuration not applied to existing resources

Detecting Drift

Check for drift using Terraform:

# View current state vs configuration
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"

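# Update state to match AWS without changing infrastructure
# (terraform refresh is deprecated in favor of -refresh-only)
terraform apply -refresh-only -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"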

# Show detailed state
terraform show
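
For scripted drift checks, terraform plan accepts -detailed-exitcode, which exits 0 when there are no changes, 1 on error, and 2 when the plan contains changes. The helper below is a hypothetical sketch that turns that exit code into a short label; the function name and label strings are assumptions.

```bash
# Translate the exit code of `terraform plan -detailed-exitcode` into a label.
# 0 = no changes, 2 = changes present (possible drift), anything else = error.
describe_plan_exit() {
  case "$1" in
    0) echo "in-sync" ;;
    2) echo "drift-or-pending-changes" ;;
    *) echo "plan-error" ;;
  esac
}
```

A check could run terraform plan -detailed-exitcode with the usual -var-file and -var arguments, then call describe_plan_exit "$?" on the next line to report the result.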

Resolving Drift

Option 1: Update Terraform to match actual infrastructure

Option 2: Import existing resources into Terraform

Option 3: Accept managed drift

For resources with intentional manual changes, use lifecycle rules:

resource "aws_security_group" "ssh_sg" {
  # ... config ...

  lifecycle {
    ignore_changes = [ingress]  # Allow manual ingress rule changes
  }
}

Security Best Practices

1. Restrict SSH Access

Update security group to limit SSH to specific IPs:

# In main.tf, modify ingress rule:
ingress {
  from_port   = 22
  to_port     = 22
  protocol    = "tcp"
  cidr_blocks = ["YOUR_VPN_CIDR"]  # Instead of 0.0.0.0/0
}

2. Use Systems Manager Session Manager

Instead of SSH, use AWS Systems Manager:

# Start a session (requires the Session Manager plugin on your local machine)
aws ssm start-session --target i-1234567890abcdef0

3. Secrets Management

Store sensitive data securely:

# Use AWS Secrets Manager for GitHub token
aws secretsmanager create-secret \
  --name opendso/github-token \
  --secret-string "your-token"

4. Enable CloudWatch Logging

Add CloudWatch agent for centralized logging:

sudo yum install -y amazon-cloudwatch-agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard

5. Regular Updates

# System updates
sudo yum update -y

# Docker updates
sudo yum update docker -y
sudo systemctl restart docker

Advanced Configuration

Custom Domain Configuration

To use a different domain, modify setup-certs.sh:

# Edit domain variable
DOMAIN=your-domain.com

Update DNS records accordingly.

Resize Instance

To change instance type (warning: this could destroy existing data, back up everything first):

# In main.tf, modify instance_type:
resource "aws_instance" "app_server" {
  instance_type = "t2.xlarge"  # Change from t2.large
  # ...
}

Then apply changes:

terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"

Add Additional Storage

# In main.tf, add additional block device:
resource "aws_instance" "app_server" {
  # ... existing config ...

  ebs_block_device {
    device_name = "/dev/sdf"
    volume_size = 100
    volume_type = "gp3"
  }
}

Next Steps

Local Testing: See Local Test Deployment for development environment
Adding Services: See Adding Custom Services to extend the deployment
Architecture: Review OpenDSO Architecture

Support

For production deployment support:

Review Terraform plan output before applying changes
Contact the OpenDSO software team
Consult with your OES support team
