Production Deployment on AWS EC2
This guide provides step-by-step instructions for deploying OpenDSO in a production environment using AWS EC2 instances with Terraform Infrastructure as Code (IaC). It revisits some steps from the Local Deployment page, but goes deeper into common IaC structures and into how your team and OES have deployed OpenDSO in your production environment.
Overview
The production deployment uses Terraform to automate the provisioning and configuration of AWS infrastructure. The deployment:
- Provisions EC2 instances with Amazon Linux
- Configures security groups and networking
- Automatically installs Docker and dependencies on the EC2 instances
- Downloads versioned release archives (zips) of opendso, config, and models from OES's GitHub releases via setup-opendso.sh
- Sets up TLS certificates using Let's Encrypt with DNS-01 challenge
- Supports multiple environments (Lab/Test and Field)
Repository Overview
The Terraform IaC projects are usually named as: \{client_project\}-iac
Key Files:
- main.tf - Main Terraform configuration defining AWS resources
- variables.tf - Input variable definitions
- outputs.tf - Output definitions (SSH key, IP address)
- run.sh - Helper script for Terraform operations
- *.tfvars - Environment-specific variable files
- assets/ - Provisioning scripts and configuration
Environment Files:
\{environment_name\}.tfvars
Prerequisites
Local Machine Requirements
- Terraform >= 1.2.0
- AWS CLI v2 (configured with credentials)
- GitHub Personal Access Token (for downloading releases)
- SSH client
- git
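Before installing anything, it can help to confirm which of the tools above are already present. The following is a minimal sketch and is not part of the IaC repository: `missing_tools` and `version_ge` are hypothetical helper names, and `version_ge` assumes a `sort` that supports `-V` (GNU coreutils and recent macOS).

```shell
#!/bin/sh
# Hypothetical preflight helpers; not part of the IaC repository.

# Print the name of every listed command that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Succeed when version $1 is greater than or equal to version $2.
# Assumes sort supports -V (version sort).
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

Running `missing_tools terraform aws git ssh` prints nothing when all four tools are installed, and `version_ge <installed-version> 1.2.0` can gate the Terraform minimum listed above.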
Install Terraform
# Linux (Ubuntu/Debian)
wget -O- https://apt.releases.hashicorp.com/gpg | sudo gpg --dearmor -o /usr/share/keyrings/hashicorp-archive-keyring.gpg
echo "deb [signed-by=/usr/share/keyrings/hashicorp-archive-keyring.gpg] https://apt.releases.hashicorp.com $(lsb_release -cs) main" | sudo tee /etc/apt/sources.list.d/hashicorp.list
sudo apt update && sudo apt install terraform
# macOS (Homebrew)
brew tap hashicorp/tap
brew install hashicorp/tap/terraform
# Verify installation
terraform --version
Configure AWS CLI
# Install AWS CLI
curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
unzip awscliv2.zip
sudo ./aws/install
# Configure credentials
aws configure
# Enter: AWS Access Key ID, Secret Access Key, Region, Output format
GitHub Personal Access Token
Create a GitHub PAT with repo scope for downloading OpenDSO releases:
- Go to GitHub Settings → Developer settings → Personal access tokens → Tokens (classic)
- Click "Generate new token (classic)"
- Select scope: repo (Full control of private repositories)
- Generate and save the token securely
AWS Infrastructure Overview
The usual Terraform configuration creates the following AWS resources. Note: The actual deployed infrastructure may differ from the Terraform configuration due to manual modifications.
EC2 Instance (Actual Deployed Configuration)
- AMI: Amazon Linux
- Instance Type: t2.large (2 vCPU, 8GB RAM)
- Root Volume: 8GB or more EBS
- Connection: SSH via generated RSA key pair (4096-bit RSA)
- Tags: Name=OpenDSO, CreatedBy=terraform, autoShutdown=Off
Security Group
- Name: oes-sg
- Ingress Rules (from actual EC2 console):
- Port 22 (SSH) from 0.0.0.0/0
- Port 443 (HTTPS) from 0.0.0.0/0
- Port 8080 (HTTP Alt) from 0.0.0.0/0
- Port 3389 (RDP) from 0.0.0.0/0
- Egress: All traffic to 0.0.0.0/0 (IPv4) and ::/0 (IPv6)
SSH Key Pair
- Algorithm: RSA 4096-bit
- Generated by Terraform: Automatic key generation
- Output: Private key should be saved to ~/.ssh/tf_id_rsa.pem
Network Configuration
- Uses existing VPC and subnet (specified in tfvars)
- Private IP address assigned from subnet
- Requires VPN or jump host for access
Deployment Process
Step 1: Clone the IaC Repository
git clone https://github.com/openenergysolutions/\{client_project\}-iac.git
Step 2: Review Environment Configuration
Choose the appropriate environment file:
OES Test Example Environment:
cat oes-test.tfvars
aws_region = "example"
aws_subnet_id = "example"
aws_vpc_id = "example"
sshkey_name = "tf_id_rsa"
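A malformed tfvars file (for example, a value with a missing closing quote) only surfaces once Terraform runs. As a rough sketch, the check below looks for the four keys shown in the example and reports any that lack a complete `key = "value"` assignment. The function name `missing_tfvars_keys` is hypothetical, and the pattern assumes one assignment per line with no trailing comments.

```shell
#!/bin/sh
# Hypothetical pre-deploy check; the key names come from the example tfvars above.

# Print each required key that has no complete `key = "value"` line in the file.
missing_tfvars_keys() {
  file=$1
  for key in aws_region aws_subnet_id aws_vpc_id sshkey_name; do
    grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\"[^\"]*\"[[:space:]]*$" "$file" ||
      echo "$key"
  done
}
```

`missing_tfvars_keys ./oes-test.tfvars` prints nothing when all four assignments are well formed.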
Step 3: Review Deployment Configuration
The config.json file defines which GitHub repositories setup-opendso.sh will download release archives from, and what local directory each archive should be unzipped into. On a deployed host, both config.json and setup-opendso.sh live in the home directory next to the unzipped opendso/, config/, and models/ folders:
cat assets/config.json
{
"organization": "openenergysolutions",
"releases": [
{
"repositoryName": "opendso-docker-compose",
"displayName": "opendso"
},
{
"repositoryName": "\{client_project\}-docker-compose",
"displayName": "models"
},
{
"repositoryName": "\{client_project\}-config-docker-compose",
"displayName": "config"
}
]
}
How this is used at deploy time:
For each entry, setup-opendso.sh calls https://api.github.com/repos/{organization}/{repositoryName}/releases/latest to find the most recent published release, downloads its first attached asset (the config.zip, models.zip, or opendso.zip produced by each repo's tag-archive workflow), and unzips it into ~/{displayName}/. See Versioned Release Archives for the workflow that publishes those zips.
Important: setup-opendso.sh always pulls the latest release. To roll a deployment forward you push a new git tag on the config or models repo (which triggers the release workflow) and then re-run setup-opendso.sh on the host. See Tagging Versioned Updates below.
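Because the host always takes whatever release is currently latest, it is worth checking which tag that is before deploying. The sketch below queries the same `releases/latest` endpoint and prints the release's `tag_name` field. The helper names are hypothetical; it assumes `curl` and `jq` are installed and that `GITHUB_TOKEN` holds the PAT created earlier.

```shell
#!/bin/sh
# Hypothetical helpers; they mirror the releases/latest lookup described above.

# Build the GitHub API URL for a repository's latest release.
latest_release_url() {
  echo "https://api.github.com/repos/$1/$2/releases/latest"
}

# Print the tag name of the latest published release of org/repo.
# Requires curl, jq, and GITHUB_TOKEN in the environment.
latest_release_tag() {
  curl -fsSL \
    -H "Authorization: Bearer ${GITHUB_TOKEN:?GITHUB_TOKEN is not set}" \
    -H "Accept: application/vnd.github+json" \
    "$(latest_release_url "$1" "$2")" | jq -r '.tag_name'
}
```

For example, `latest_release_tag openenergysolutions opendso-docker-compose` prints the tag that the next run of setup-opendso.sh would pick up for the opendso archive.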
Step 4: Deploy Infrastructure
Use the run.sh helper script to deploy:
# Deploy to environment
./run.sh -i ./oes-test.tfvars -p -s
Script Options:
- -i <file> - Specify tfvars file (default: dev.tfvars)
- -p - Prompt for GitHub PAT (Personal Access Token)
- -s - Setup (create and provision infrastructure)
- -t - Teardown (destroy infrastructure)
- -h - Display help
What Happens During Deployment:
- Terraform Initialization
  - Downloads AWS provider plugins
  - Initializes backend
- Infrastructure Creation
  - Creates security group with SSH and HTTPS rules
  - Generates SSH key pair (RSA 4096-bit)
  - Provisions EC2 instance
- Automatic Provisioning
  - Copies provisioning scripts to EC2 instance
  - Runs init.sh - Installs Docker, Docker Compose, golang
  - Runs setup-opendso.sh - Downloads the latest tagged release archives (opendso.zip, config.zip, models.zip) from GitHub and unzips them into ~/opendso/, ~/config/, and ~/models/
- Output
  - SSH private key saved to ~/.ssh/tf_id_rsa.pem
  - EC2 instance private IP displayed
Step 5: Access the Instance
The deployment outputs the EC2 private IP address. Since the instance is in a private subnet, you need VPN or jump host access:
# View the private IP (output is marked sensitive)
terraform output -raw apphost_ip
# Or extract from terraform output
PRIVATE_IP=$(terraform output -raw apphost_ip)
echo $PRIVATE_IP
# SSH to the instance (requires VPN connection to the VPC)
ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@$PRIVATE_IP
Note: The apphost_ip output is marked as sensitive, so a bare terraform output masks it. To view it, request it by name with terraform output -raw apphost_ip, or use terraform output -json.
Post-Deployment Configuration
Once connected to the EC2 instance, complete the setup:
Verify Installation
# Check Docker installation
docker --version
docker compose version
# Check downloaded repositories
ls -la ~/
# Should see: opendso/, models/, config/ directories
Configure DNS
The deployment uses Let's Encrypt with DNS-01 challenge for TLS certificates. The domain is configured in setup-certs.sh.
Update DNS records to point to your instance:
- Create A record for \{client_address_project_name\}.oesinc.dev pointing to instance IP (or jump host IP)
- Create wildcard A record for *.\{client_address_project_name\}.oesinc.dev pointing to same IP
Set Up Certificates
The setup-certs.sh script uses certbot with Google Cloud DNS for Let's Encrypt certificates.
Prerequisites:
- Google Cloud DNS credentials file at ~/.secrets/
- Domain delegation to Google Cloud DNS
Run Certificate Setup:
# Install certbot and Google DNS plugin
sudo yum install -y certbot python3-certbot-dns-google
# Place Google Cloud credentials
mkdir -p ~/.secrets
# Upload your Google Cloud DNS credentials JSON file
# Run certificate generation
chmod +x setup-certs.sh
./setup-certs.sh
What This Does:
- Generates Let's Encrypt certificates using DNS-01 challenge
- Creates certificates for both domains
- Copies certificates to ~/certs/ directory
Generated Certificates:
~/certs/
├── fullchain.pem # Full certificate chain
├── server-cert.pem # Server certificate
├── server-key.pem # Private key
├── chain.pem # Intermediate chain
├── rootCA.pem # Root CA
└── \{other_certs\}.pem # Depending on the client deployment, you may generate additional certs
Configure Database Credentials
Before deploying services, configure database usernames and passwords for the various OpenDSO services. Database credentials are typically stored in environment files within the config repositories.
Common Database Credential Locations:
- GMS API MongoDB Credentials - config/gms-api/env/production.env:
MONGODB_DOMAIN = mongodb:27017
MONGODB_DBNAME = settings_api
MONGODB_USERNAME = SettingsAPIUser
MONGODB_PASSWORD = your_secure_password
- Docker Environment Variables - config/docker/.env:
# PostgreSQL credentials for various services
CITUS_PGUSER="postgres"
CITUS_PGPASSWORD="your_secure_password"
HISTORIAN_PGUSER="historian"
HISTORIAN_PGPASSWORD="your_secure_password"
DER_DISPATCH_PGUSER="opendso"
DER_DISPATCH_PGPASSWORD="your_secure_password"
# Keycloak admin credentials
KEYCLOAK_ADMIN="admin"
KEYCLOAK_PASSWORD="your_secure_password"
Security Best Practices:
- Change Default Passwords: Always replace default passwords before production deployment
- Use Strong Passwords: Generate complex passwords with sufficient entropy
- Restrict File Permissions: Limit access to environment files containing credentials
chmod 600 ~/config/gms-api/env/production.env
chmod 600 ~/config/docker/.env
- Consider Secrets Management: For enhanced security, consider using AWS Secrets Manager or similar services instead of plain-text environment files
- Avoid Version Control: Ensure .env files with real credentials are in .gitignore
Note: The environment files shown here contain example credentials from project templates. Update these with your actual secure credentials appropriate for your production environment.
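The two password-related practices above can be partly automated. The sketch below is one possible approach and not part of the OpenDSO tooling: `gen_password` and `find_placeholders` are hypothetical helpers, `gen_password` assumes `/dev/urandom` and a `base64` command are available, and `find_placeholders` looks for the literal template value `your_secure_password` shown in the examples.

```shell
#!/bin/sh
# Hypothetical helpers for preparing the env files listed above.

# Print a random alphanumeric password of the given length (default 32).
# Reads twice as many random bytes as needed so that enough characters
# remain after the non-alphanumeric base64 symbols are stripped.
gen_password() {
  n=${1:-32}
  head -c "$((n * 2))" /dev/urandom | base64 | tr -dc 'A-Za-z0-9' | head -c "$n"
}

# Print every line that still carries the template placeholder value.
# Exit status is 0 when placeholders remain and 1 when the file is clean.
find_placeholders() {
  grep -n 'your_secure_password' "$1"
}
```

Running `find_placeholders ~/config/docker/.env` before starting services lists any credential that was never changed from the template.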
Deploy OpenDSO Services
Navigate to the OpenDSO orchestration directory and deploy:
cd ~/opendso/opendso-docker-compose
# View available profiles
./run.sh -l
# Deploy all services
./run.sh -p all -c
# Verify deployment
docker ps
Verify Deployment
Access services via your domain, for example:
- https://docs.\{client_domain\}.oesinc.dev
- https://api.\{client_domain\}.oesinc.dev
- https://hmi.\{client_domain\}.oesinc.dev
- https://events.\{client_domain\}.oesinc.dev
- https://historian.\{client_domain\}.oesinc.dev
Terraform Operations
Manual Terraform Commands
If you prefer direct Terraform commands over the run.sh script:
# Initialize Terraform
terraform init
# Plan changes
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"
# Apply changes
terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve
# Show outputs
terraform output
# Destroy infrastructure (careful!)
terraform destroy -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve
State Management
Terraform state is stored locally by default.
Updating Deployments
Update OpenDSO Components
Updates roll forward by pushing a new git tag on the config and/or models repo (which triggers the GitHub Actions release workflow described in Tagging Versioned Updates) and then re-running setup-opendso.sh on the host so it picks up the new latest release:
# SSH to instance
ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@<PRIVATE_IP>
# Stop services
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d
# Re-run setup to download the latest tagged release archives
# (config.zip / models.zip / opendso.zip from each repo's GitHub releases)
export GITHUB_TOKEN="your-token"
cd ~
./setup-opendso.sh
# Restart services
cd ~/opendso/opendso-docker-compose
./run.sh -p all -c
Note: setup-opendso.sh always pulls the latest GitHub release for each repo. Make sure the tag you intend to deploy is the most recent published release on the config and models repos before running this; otherwise the host will pick up whichever tag is most recent.
Update Terraform Configuration
# From the IaC git directory
git pull
# Review changes
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"
# Apply updates
terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"
Tagging Versioned Updates
A client deployment is "versioned" by which git tags are the most recent published releases on the config, models, and opendso repositories. The deployment host pulls the latest release of each via setup-opendso.sh. To roll forward (or back) you cut a new tag, let the GitHub Actions workflow build and publish the archive, then re-run setup-opendso.sh on the host.
When to Tag
Tag a new version any time you have a deployable change to one of these repos:
- \{client_repo_name\}-config-docker-compose: environment variables, image tags in docker/.env, service config files. Most client-driven changes (image version bumps, credential updates, new service tunables) live here.
- \{client_repo_name\}-docker-compose (the models repo): OpenFMB adapter mappings for the client's field devices. Tag whenever adapter coverage or configuration changes.
- opendso-docker-compose: base orchestration. Usually tagged by the OpenDSO platform team, not the client team.
Tagging Convention
Use SemVer-style tags (e.g. 1.4.0, 1.4.1). The workflow accepts any tag pattern (tags: '*'), but staying consistent makes the GitHub release list and the .version file dropped into each archive readable.
# From a clean checkout of the repo you want to release
git checkout main
git pull
# Create and push the tag
git tag 1.4.0
git push origin 1.4.0
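Since the workflow accepts any tag pattern, a small local guard can keep tags consistent with the convention above. This is a sketch only: `is_semver_tag` and `bump_patch` are hypothetical helpers, and they assume plain MAJOR.MINOR.PATCH tags with no `v` prefix or pre-release suffix, matching the examples in this guide.

```shell
#!/bin/sh
# Hypothetical tag helpers; assume plain MAJOR.MINOR.PATCH tags such as 1.4.0.

# Succeed when the argument is a MAJOR.MINOR.PATCH tag without leading zeros.
is_semver_tag() {
  printf '%s\n' "$1" |
    grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}

# Print the next patch version for a MAJOR.MINOR.PATCH tag.
bump_patch() {
  major=${1%%.*}
  rest=${1#*.}
  minor=${rest%%.*}
  patch=${rest#*.}
  echo "$major.$minor.$((patch + 1))"
}
```

A guarded release could then read `is_semver_tag "$TAG" && git tag "$TAG" && git push origin "$TAG"`, where `TAG` is a variable you set yourself.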
What Happens on Push
The .github/workflows/main.yaml in the config and models repos triggers on the tag push and:
- Checks out the tagged commit with full history.
- Runs GitVersion to derive a version.
- Writes the tag name into a .version file at the repo root so the unpacked archive on the deployment host carries the version stamp (cat ~/config/.version will show the tag a deployed host is running).
- Zips the working tree (excluding .git*, .vscode/*, .editorconfig) into config.zip (config repo) or models.zip (models repo).
- Creates a GitHub release for the tag and attaches the zip as a release asset.
You can confirm a release published correctly by visiting the repo's Releases page on GitHub and verifying the new tag has a config.zip or models.zip attached.
Deploying the New Tag
Once the workflow has finished and the new tag is the most recent release on its repo, deploy it to the host:
ssh -i ~/.ssh/tf_id_rsa.pem ec2-user@<PRIVATE_IP>
# Stop services so the archive contents can be replaced cleanly
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d
# Re-run setup-opendso.sh — this will fetch the *latest* release of each repo
# listed in config.json and overwrite ~/opendso/, ~/config/, and ~/models/
export GITHUB_TOKEN="your-token"
cd ~
./setup-opendso.sh
# Confirm the deployed version stamp
cat ~/config/.version
cat ~/models/.version
# Bring services back up
cd ~/opendso/opendso-docker-compose
./run.sh -p all -c
Notes and gotchas:
- setup-opendso.sh always pulls releases/latest. There is no per-tag pinning in config.json. If you need to roll back, the supported path is to either re-publish an older tag as the latest release on GitHub, or to manually download and unzip the older release asset on the host.
- Pushing a tag that already exists will not re-run the workflow. Delete the tag remotely (git push --delete origin 1.4.0) and the GitHub release before re-tagging if you need to rebuild an archive at the same version.
- The workflow only runs on tag pushes. Pushing commits to main does not publish a new archive; nothing changes for deployed hosts until a tag is cut.
- unzip -o overwrites files in place but does not delete files that were removed in the new tag. If a release removes a file, manually delete it from ~/config/ or ~/models/ (or wipe the directory before running setup-opendso.sh).
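One way to handle the stale-file case is to move the old directory aside rather than delete it, which also leaves a copy to fall back on. The helper below is a hypothetical sketch of that variant and is not part of setup-opendso.sh.

```shell
#!/bin/sh
# Hypothetical helper; moves a directory aside instead of wiping it.

# Rename a directory to <name>.bak-<timestamp> and print the new path.
# Returns 1 when the directory does not exist.
stash_dir() {
  src=${1%/}
  [ -d "$src" ] || return 1
  dest="$src.bak-$(date +%Y%m%d-%H%M%S)"
  mv -- "$src" "$dest" && echo "$dest"
}
```

With services stopped, `stash_dir ~/config` followed by `./setup-opendso.sh` unpacks the new release into an empty directory. Any file that was edited on the host, such as an env file holding real credentials, then has to be copied back from the stashed directory before restarting services.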
For troubleshooting failed workflows, missing release assets, or setup-opendso.sh errors, see Production Deployment Troubleshooting → Release Archive and setup-opendso.sh Issues.
Provisioning Scripts Explained
init.sh
Installs system dependencies and Docker:
- Updates system packages
- Installs unzip, golang, nss-tools, docker
- Enables and starts Docker service
- Installs Docker Compose
- Adds ec2-user to docker group
setup-opendso.sh
Downloads the versioned OpenDSO release archives from GitHub and unpacks them into the deployment layout that run.sh expects.
What it does:
- Reads config.json (in the same directory it is run from) to get the GitHub organization and the list of releases to fetch
- For each entry, calls GET /repos/{organization}/{repositoryName}/releases/latest and reads .assets[0].id (the first attached asset on the latest release); this is the config.zip, models.zip, or opendso.zip published by each repo's tag-archive workflow
- Downloads the asset with Accept: application/octet-stream against /repos/{organization}/{repositoryName}/releases/assets/{ASSET_ID} and writes it to {displayName}.zip
- Runs unzip -o {displayName}.zip -d {displayName} so the archive contents land in ~/opendso/, ~/config/, and ~/models/
- If a repo has no releases yet, the API returns a null asset id and the script logs Unable to find release for {OWNER}/{REPO} and skips it (the script does not fail the whole deployment)
Requires:
- GITHUB_TOKEN environment variable with repo scope on the OES org (the script exits with code 5 if it is unset)
- jq and curl available on the host (installed by init.sh)
- config.json present in the working directory
Why this matters for upgrades: because the script only ever asks for releases/latest, the way to change what gets deployed is to change which tagged release is the most recent on the config / models / opendso repo — see Tagging Versioned Updates.
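The flow described in this section can be sketched roughly as follows. This is an illustration reconstructed from the description above, not the contents of the real setup-opendso.sh, which may differ in structure and error handling. The function names are hypothetical and the `Authorization: Bearer` header is an assumption about how the token is passed.

```shell
#!/bin/sh
# Illustrative sketch only; the real setup-opendso.sh may differ.
API="https://api.github.com"

# Build the download URL for a release asset.
asset_url() {
  echo "$API/repos/$1/$2/releases/assets/$3"
}

# Fetch the first asset of the latest release of org/repo and unzip it into <name>/.
fetch_latest() {
  org=$1 repo=$2 name=$3
  asset_id=$(curl -fsSL -H "Authorization: Bearer ${GITHUB_TOKEN:-}" \
    "$API/repos/$org/$repo/releases/latest" | jq -r '.assets[0].id')
  if [ -z "$asset_id" ] || [ "$asset_id" = "null" ]; then
    # Skip this repo without failing the whole run.
    echo "Unable to find release for $org/$repo" >&2
    return 0
  fi
  curl -fsSL -H "Authorization: Bearer ${GITHUB_TOKEN:-}" \
    -H "Accept: application/octet-stream" \
    -o "$name.zip" "$(asset_url "$org" "$repo" "$asset_id")"
  unzip -o "$name.zip" -d "$name"
}

# Walk every entry of config.json in the current directory.
fetch_all() {
  [ -n "${GITHUB_TOKEN:-}" ] || { echo "GITHUB_TOKEN is not set" >&2; return 5; }
  cfg_org=$(jq -r '.organization' config.json)
  jq -r '.releases[] | "\(.repositoryName) \(.displayName)"' config.json |
    while read -r cfg_repo cfg_name; do
      fetch_latest "$cfg_org" "$cfg_repo" "$cfg_name"
    done
}
```

Nothing in this flow names a tag, which is why the only lever for changing what gets deployed is the release list on GitHub.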
setup-certs.sh
Generates TLS certificates using Let's Encrypt:
- Uses certbot with DNS-01 challenge (Google Cloud DNS)
- Generates wildcard certificates
- Copies certificates to ~/certs/ directory
- Creates OpenADR-specific certificate files
- Sets appropriate file permissions
Requires: Google Cloud DNS credentials
Monitoring and Maintenance
View Logs
# Docker service logs
sudo journalctl -u docker -f
# Container logs
docker compose logs -f
# System logs
sudo journalctl -f
Monitor Resources
# Docker stats
docker stats
# System resources
htop
df -h
free -h
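With a root volume that can be as small as 8GB, disk usage is worth checking on a schedule rather than by eye. The following is a minimal sketch using `df -P`; the function names and the default threshold of 80 percent are assumptions, not values from the OpenDSO tooling.

```shell
#!/bin/sh
# Hypothetical disk check; the 80 percent default threshold is an assumption.

# Print the used-space percentage (integer, no % sign) of the filesystem
# that holds the given path (default /).
disk_usage_pct() {
  df -P "${1:-/}" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Print a warning and return 1 when usage is at or above the threshold.
check_disk() {
  pct=$(disk_usage_pct "$1")
  if [ "$pct" -ge "${2:-80}" ]; then
    echo "WARNING: $1 is ${pct}% full"
    return 1
  fi
}
```

`check_disk / 80` could be run from cron, with the non-zero exit status used to trigger whatever alerting the host already has.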
Database Backup and Restore
OpenDSO uses multiple databases to store different types of data:
- MongoDB - Stores GMS API configuration data, user settings, and application state
- PostgreSQL (Topology Genesis) - Stores parsed CIM topology data and equipment information
- SQLite (OpenADR Service) - Stores OpenADR VEN/VTN registration and event data
Backup MongoDB (GMS API Data)
The run.sh script provides a convenient backup command that uses mongodump to create a binary archive of the MongoDB database:
cd ~/opendso/opendso-docker-compose
./run.sh -b
What this does:
- Connects to the running MongoDB container
- Authenticates using credentials from environment variables
- Creates a binary archive dump of the database
- Saves the output to db.dump in the current directory
Requirements:
- MongoDB container must be running
- Credentials must be properly configured in the environment
Manual Backup (Alternative):
If you need more control over the backup process:
# Backup with custom filename
# Note: Use the database name as authenticationDatabase (typically settings_api)
docker exec mongodb sh -c 'mongodump --authenticationDatabase ${MONGODB_COLLECTION} \
-u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
--db ${MONGODB_COLLECTION} --archive' > gms-backup-$(date +%Y%m%d-%H%M%S).dump
# Backup to a directory (not archive)
docker exec mongodb sh -c 'mongodump --authenticationDatabase ${MONGODB_COLLECTION} \
-u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
--db ${MONGODB_COLLECTION} --out /data/backup'
Important: The SettingsAPIUser is created in the settings_api database, not in the admin database. Therefore, use --authenticationDatabase settings_api (or the database name from ${MONGODB_COLLECTION}) instead of --authenticationDatabase admin.
Restore MongoDB Database
To restore from a backup:
cd ~/opendso/opendso-docker-compose
./run.sh -r
Requirements:
- db.dump file must exist in the current directory
- MongoDB container must be running
Manual Restore (Alternative):
# Restore from custom backup file
# Note: Use the database name as authenticationDatabase (typically settings_api)
docker exec -i mongodb sh -c 'mongorestore --authenticationDatabase ${MONGODB_COLLECTION} \
-u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
--db ${MONGODB_COLLECTION} --archive' < gms-backup-20240101.dump
# Drop existing database before restore
docker exec -i mongodb sh -c 'mongorestore --authenticationDatabase ${MONGODB_COLLECTION} \
-u ${MONGODB_USERNAME} -p ${MONGODB_PASSWORD} \
--db ${MONGODB_COLLECTION} --drop --archive' < db.dump
Backup PostgreSQL (Topology Genesis Data)
The Topology Genesis service uses PostgreSQL to store parsed CIM files and topology data:
# Backup PostgreSQL database
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' > topology-backup-$(date +%Y%m%d-%H%M%S).sql
# Compressed backup
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' | gzip > topology-backup-$(date +%Y%m%d-%H%M%S).sql.gz
Restore PostgreSQL:
# Restore from SQL dump
docker exec -i postgres sh -c 'psql -U ${POSTGRES_USER} ${POSTGRES_DB}' < topology-backup.sql
# Restore from compressed backup
gunzip < topology-backup.sql.gz | docker exec -i postgres sh -c 'psql -U ${POSTGRES_USER} ${POSTGRES_DB}'
Backup OpenADR Service Data
The OpenADR service uses SQLite for VEN registrations and event data:
# Locate and backup SQLite database
docker exec oadr-service sh -c 'sqlite3 /app/data/oadr.db ".backup /tmp/oadr-backup.db"'
docker cp oadr-service:/tmp/oadr-backup.db ./oadr-backup-$(date +%Y%m%d-%H%M%S).db
# Or simply copy the database file
docker cp oadr-service:/app/data/oadr.db ./oadr-backup-$(date +%Y%m%d-%H%M%S).db
Restore OpenADR Database:
# Stop the service first
docker stop oadr-service
# Copy backup to container
docker cp oadr-backup.db oadr-service:/app/data/oadr.db
# Restart service
docker start oadr-service
Complete System Backup Example
Use this example script as a starting point for backing up the OpenDSO databases and configuration:
#!/bin/bash
# Complete backup script
BACKUP_DIR="./backups/$(date +%Y%m%d-%H%M%S)"
mkdir -p "$BACKUP_DIR"
echo "Starting OpenDSO complete backup..."
# Backup MongoDB (GMS API)
echo "Backing up MongoDB..."
./run.sh -b
mv db.dump "$BACKUP_DIR/mongodb-gms.dump"
# Backup PostgreSQL (Topology Genesis)
echo "Backing up PostgreSQL..."
docker exec postgres sh -c 'pg_dump -U ${POSTGRES_USER} ${POSTGRES_DB}' | gzip > "$BACKUP_DIR/postgres-topology.sql.gz"
# Backup OpenADR Service (if running)
if docker ps | grep -q oadr-service; then
echo "Backing up OpenADR Service..."
docker cp oadr-service:/app/data/oadr.db "$BACKUP_DIR/oadr-service.db"
fi
# Backup configuration files
echo "Backing up configuration..."
cp -r ../config "$BACKUP_DIR/config-backup"
# Example copy to S3 (if configured)
# aws s3 cp "$BACKUP_DIR" s3://your-backup-bucket/opendso/backup-$(date +%Y%m%d) --recursive
echo "Backup complete: $BACKUP_DIR"
ls -lh "$BACKUP_DIR"
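Each run of the script above creates a new timestamped directory, so backups accumulate on the instance. A retention step could look like the sketch below. `prune_backups` is a hypothetical helper; it relies on the `YYYYMMDD-HHMMSS` directory names sorting in chronological order and assumes the backup directory contains nothing but those directories.

```shell
#!/bin/sh
# Hypothetical retention helper for the ./backups directory created above.

# Keep the newest <keep> entries in <dir> and delete the rest.
# Timestamped names sort lexically in chronological order.
prune_backups() {
  dir=$1
  keep=$2
  ls -1 "$dir" | sort -r | tail -n +"$((keep + 1))" |
    while IFS= read -r name; do
      rm -rf -- "${dir:?}/$name"
    done
}
```

For example, `prune_backups ./backups 7` keeps the seven most recent backup directories.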
Certificate Renewal
Let's Encrypt certificates expire after 90 days. Set up automatic renewal:
# Create renewal script
cat > ~/renew-certs.sh <<'EOF'
#!/bin/bash
cd ~
./setup-certs.sh
cd ~/opendso/opendso-docker-compose
./run.sh -p all -d
./run.sh -p all -c
EOF
chmod +x ~/renew-certs.sh
# Add to crontab (run monthly)
crontab -e
# Add: 0 0 1 * * /home/ec2-user/renew-certs.sh
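The renewal script above restarts all services on every run. A possible refinement is to check the remaining validity first and renew only when the certificate is close to expiry. The sketch below uses `openssl x509 -checkend`; the helper name is hypothetical, and the 30-day default is an assumption rather than a value from setup-certs.sh.

```shell
#!/bin/sh
# Hypothetical expiry check; the 30-day default window is an assumption.

# Succeed when the certificate stays valid for at least <days> more days.
# Returns 2 when the certificate file is missing and 1 when it expires sooner.
cert_valid_for_days() {
  cert=$1
  days=${2:-30}
  [ -f "$cert" ] || return 2
  openssl x509 -checkend "$((days * 86400))" -noout -in "$cert" >/dev/null
}
```

Placing `cert_valid_for_days ~/certs/server-cert.pem 30 && exit 0` near the top of renew-certs.sh would let the script run more often than monthly while only restarting services when a renewal is actually due.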
Tearing Down Infrastructure
To destroy the infrastructure:
# Using run.sh script
./run.sh -i ./oes-test.tfvars -p -t
# Or using Terraform directly
terraform destroy -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN" -auto-approve
Warning: This permanently deletes the EC2 instance and all data. Ensure backups are taken first.
Troubleshooting
For production deployment issues, see the Production Deployment Troubleshooting Guide.
Common issues covered include:
- Terraform deployment failures
- Provisioner script errors
- EC2 access and SSH issues
- DNS and certificate problems
- Infrastructure drift detection and resolution
- Multi-environment management
- Emergency procedures and disaster recovery
For Docker and container-specific issues, see the Docker Troubleshooting Guide.
Infrastructure Drift Management
Understanding Drift
Infrastructure drift occurs when the actual deployed infrastructure differs from the Terraform configuration.
Causes of Drift
- Manual AWS Console changes after Terraform deployment
- EBS volume resizing performed outside Terraform
- Security group rule additions for operational needs
- Terraform configuration not applied to existing resources
Detecting Drift
Check for drift using Terraform:
# View current state vs configuration
terraform plan -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"
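# Refresh state from AWS without changing infrastructure
# (terraform refresh is deprecated; -refresh-only is its replacement)
terraform apply -refresh-only -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"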
# Show detailed state
terraform show
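For scheduled drift checks, `terraform plan` accepts a `-detailed-exitcode` flag that exits with 0 when there is nothing to change, 1 on error, and 2 when the plan contains changes. The helper below is a hypothetical sketch that turns that status into a readable message.

```shell
#!/bin/sh
# Hypothetical helper for scripted drift checks.

# Translate the exit status of `terraform plan -detailed-exitcode`.
describe_plan_status() {
  case "$1" in
    0) echo "no changes: infrastructure matches configuration" ;;
    2) echo "changes present: possible drift or unapplied configuration" ;;
    *) echo "plan failed" ;;
  esac
}
```

It would be called right after the plan, for example `terraform plan -detailed-exitcode -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"; describe_plan_status $?`.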
Resolving Drift
Option 1: Update Terraform to match actual infrastructure
Option 2: Import existing resources into Terraform
Option 3: Accept managed drift
For resources with intentional manual changes, use lifecycle rules:
resource "aws_security_group" "ssh_sg" {
# ... config ...
lifecycle {
ignore_changes = [ingress] # Allow manual ingress rule changes
}
}
Security Best Practices
1. Restrict SSH Access
Update security group to limit SSH to specific IPs:
# In main.tf, modify ingress rule:
ingress {
from_port = 22
to_port = 22
protocol = "tcp"
cidr_blocks = ["YOUR_VPN_CIDR"] # Instead of 0.0.0.0/0
}
2. Use Systems Manager Session Manager
Instead of SSH, use AWS Systems Manager:
# Requires the Session Manager plugin for the AWS CLI on your local machine
aws ssm start-session --target i-1234567890abcdef0
3. Secrets Management
Store sensitive data securely:
# Use AWS Secrets Manager for GitHub token
aws secretsmanager create-secret \
--name opendso/github-token \
--secret-string "your-token"
4. Enable CloudWatch Logging
Add CloudWatch agent for centralized logging:
sudo yum install -y amazon-cloudwatch-agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-config-wizard
5. Regular Updates
# System updates
sudo yum update -y
# Docker updates
sudo yum update docker -y
sudo systemctl restart docker
Advanced Configuration
Custom Domain Configuration
To use a different domain, modify setup-certs.sh:
# Edit domain variable
DOMAIN=your-domain.com
Update DNS records accordingly.
Resize Instance
To change instance type (warning: this could destroy existing data, back up everything first):
# In main.tf, modify instance_type:
resource "aws_instance" "app_server" {
instance_type = "t2.xlarge" # Change from t2.large
# ...
}
Then apply changes:
terraform apply -var-file="./oes-test.tfvars" -var="github_token=$GITHUB_TOKEN"
Add Additional Storage
# In main.tf, add additional block device:
resource "aws_instance" "app_server" {
# ... existing config ...
ebs_block_device {
device_name = "/dev/sdf"
volume_size = 100
volume_type = "gp3"
}
}
Next Steps
- Local Testing: See Local Test Deployment for development environment
- Adding Services: See Adding Custom Services to extend the deployment
- Architecture: Review OpenDSO Architecture
Support
For production deployment support:
- Review Terraform plan output before applying changes
- Contact the OpenDSO software team
- Consult with your OES support team