Container Architecture¶
Overview¶
The EEMT system uses a multi-layered container architecture designed for scalability, reproducibility, and ease of deployment. This document describes the container ecosystem: its design decisions, component interactions, and deployment patterns.
Architecture Design Principles¶
Layered Container Strategy¶
The EEMT container architecture follows a layered approach with clear separation of concerns:
- Base Layer: Ubuntu 24.04 LTS with system dependencies
- Scientific Stack Layer: GDAL, GRASS GIS, geospatial libraries
- EEMT Core Layer: Workflow scripts and scientific algorithms
- Application Layer: Web interface, API, orchestration tools
graph TD
subgraph "Container Layers"
A[Ubuntu 24.04 LTS Base] --> B[Scientific Stack]
B --> C[EEMT Core]
C --> D1[Web Interface]
C --> D2[Worker Node]
C --> D3[Documentation]
end
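The layer dependencies in the diagram imply a build order: an image can only be built once its parent exists. The following is a minimal sketch of deriving that order. The layer names (`base`, `scientific-stack`, and so on) are hypothetical labels that mirror the diagram, not actual image tags.

```python
from graphlib import TopologicalSorter

# Each key depends on the layers in its value set (hypothetical labels
# mirroring the diagram, not real image tags).
LAYER_PARENTS = {
    "base": set(),
    "scientific-stack": {"base"},
    "eemt-core": {"scientific-stack"},
    "web-interface": {"eemt-core"},
    "worker-node": {"eemt-core"},
    "documentation": {"eemt-core"},
}


def build_order(parents: dict[str, set[str]]) -> list[str]:
    """Return the layers ordered so that every parent precedes its children."""
    return list(TopologicalSorter(parents).static_order())


if __name__ == "__main__":
    for step, layer in enumerate(build_order(LAYER_PARENTS), start=1):
        print(f"{step}. {layer}")
```

The three application-level images share one parent, so their relative order is arbitrary and they could be built in parallel.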
Container Images¶
Base Container: eemt:ubuntu24.04¶
Purpose: Core computational environment with all scientific dependencies
Key Components:
- Ubuntu 24.04 LTS base operating system
- Python 3.11 with Miniconda environment management
- GDAL 3.11+ with complete geospatial stack
- GRASS GIS 8.4+ compiled with EEMT extensions
- CCTools 7.8.2 (Makeflow + Work Queue)
- Scientific Python libraries (numpy, pandas, xarray, rasterio)
Build Process:
Size: ~3.5 GB
Base Image: ubuntu:24.04
Web Interface Container: eemt-web¶
Purpose: FastAPI application for job submission and monitoring
Key Components:
- Inherits from eemt:ubuntu24.04
- FastAPI web framework
- SQLite job database
- Docker SDK for container management
- Workflow orchestration logic
Build Process:
Size: ~3.8 GB (includes base)
Exposed Ports: 5000 (web), 9123 (work queue)
Documentation Container: eemt-docs¶
Purpose: MkDocs documentation server
Key Components:
- Python 3.11 slim base
- MkDocs Material theme
- Documentation plugins
- Live reload capability
Build Process:
Size: ~200 MB
Exposed Port: 8000
Container Orchestration¶
Docker Compose Architecture¶
The docker-compose.yml defines three deployment profiles:
# Default Profile: Local single-node deployment
services:
  eemt-web:
    profiles: [default]
    ports: ["5000:5000"]
    volumes:
      - ./data:/app/data
      - /var/run/docker.sock:/var/run/docker.sock
  # Distributed Profile: Multi-node cluster
  eemt-master:
    profiles: [distributed]
    ports: ["5000:5000", "9123:9123"]
  eemt-worker:
    profiles: [distributed]
    deploy:
      replicas: 5
  # Documentation Profile
  eemt-docs:
    profiles: [docs]
    ports: ["8000:8000"]
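To make the profile behaviour concrete, here is a small sketch of how Compose decides which services to start for a given set of active profiles. The `SERVICE_PROFILES` mapping is a hand-copied, hypothetical view of the file above; nothing here reads the actual docker-compose.yml.

```python
# Hypothetical in-memory view of the services and profiles listed above.
SERVICE_PROFILES = {
    "eemt-web": ["default"],
    "eemt-master": ["distributed"],
    "eemt-worker": ["distributed"],
    "eemt-docs": ["docs"],
}


def services_for(active_profiles: set[str], service_profiles: dict[str, list[str]]) -> list[str]:
    """Return the services Compose would start for the given active profiles.

    A service with no profiles is always started; a service with profiles is
    started only if at least one of them is active.
    """
    return sorted(
        name
        for name, profiles in service_profiles.items()
        if not profiles or active_profiles.intersection(profiles)
    )


if __name__ == "__main__":
    print(services_for({"distributed"}, SERVICE_PROFILES))
    print(services_for({"default", "docs"}, SERVICE_PROFILES))
```

Under these rules a profile literally named `default` is not special, so it would need to be activated explicitly (for example with `--profile default`), unless the project sets it through `COMPOSE_PROFILES`.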
Container Communication¶
graph LR
subgraph "Host Machine"
subgraph "Docker Network: eemt-network"
WEB[Web Interface<br/>Container]
WORK1[Worker 1<br/>Container]
WORK2[Worker 2<br/>Container]
WORKN[Worker N<br/>Container]
end
subgraph "Volumes"
UPLOADS[uploads/]
RESULTS[results/]
TEMP[temp/]
CACHE[cache/]
end
DOCKER[Docker<br/>Daemon]
end
USER[User<br/>Browser] --> WEB
WEB --> DOCKER
DOCKER --> WORK1
DOCKER --> WORK2
DOCKER --> WORKN
WORK1 --> UPLOADS
WORK1 --> RESULTS
WORK2 --> TEMP
WORKN --> CACHE
Volume Management¶
Data Persistence Strategy¶
The container architecture uses Docker volumes for data persistence with specific mount points:
| Volume | Container Path | Purpose | Persistence |
|---|---|---|---|
| uploads | /app/uploads | DEM file uploads | Persistent |
| results | /app/results | Workflow outputs | Persistent |
| temp | /app/temp | Processing scratch space | Ephemeral |
| cache | /app/cache | Workflow caching | Semi-persistent |
| shared | /app/shared | Distributed mode shared data | Persistent |
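As an illustration of how the mount points in the table translate into container arguments, the sketch below builds `docker run --mount` options from a volume-to-path mapping. The host layout (`<host_root>/<volume name>`) is an assumption made for the example, and the function only builds strings; it does not invoke Docker.

```python
from pathlib import Path

# Volume name -> container path, copied from the table above.
VOLUME_TARGETS = {
    "uploads": "/app/uploads",
    "results": "/app/results",
    "temp": "/app/temp",
    "cache": "/app/cache",
    "shared": "/app/shared",
}


def mount_args(host_root: Path, targets: dict[str, str]) -> list[str]:
    """Build `docker run` --mount arguments binding host_root/<name> to each target."""
    args: list[str] = []
    for name, target in targets.items():
        source = host_root / name  # assumed host layout, not confirmed by the docs
        args += ["--mount", f"type=bind,source={source},target={target}"]
    return args


if __name__ == "__main__":
    print(" ".join(mount_args(Path("/data/eemt"), VOLUME_TARGETS)))
```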
Volume Configuration¶
# Docker Compose volume definitions (top level of the Compose file)
volumes:
  eemt-data:
    driver: local
    driver_opts:
      type: none
      o: bind
      device: /data/eemt # Host directory
# Container mount configuration (inside a service definition)
volumes:
  - type: bind
    source: ./data/uploads
    target: /app/uploads
    read_only: false
  - type: bind
    source: ./data/results
    target: /app/results
    read_only: false
  - type: tmpfs
    target: /tmp
    tmpfs:
      size: 2G
Data Flow Architecture¶
sequenceDiagram
participant User
participant WebUI
participant Container
participant Volume
participant Results
User->>WebUI: Upload DEM
WebUI->>Volume: Save to uploads/
WebUI->>Container: Start workflow
Container->>Volume: Read DEM from uploads/
Container->>Volume: Write temp data
Container->>Container: Process workflow
Container->>Results: Write outputs
Container->>WebUI: Report completion
WebUI->>User: Provide download link
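The sequence above can be walked through on a local directory tree. This is a simplified sketch, not the EEMT workflow: `run_job` is a hypothetical helper, the "processing" step only records the input size, and everything runs in one process rather than across containers.

```python
import shutil
import tempfile
from pathlib import Path


def run_job(data_root: Path, dem_name: str, dem_bytes: bytes) -> Path:
    """Walk one job through uploads/ -> temp/ -> results/ and return the output path."""
    uploads, temp, results = (data_root / d for d in ("uploads", "temp", "results"))
    for directory in (uploads, temp, results):
        directory.mkdir(parents=True, exist_ok=True)

    # 1. The web UI saves the uploaded DEM.
    dem_path = uploads / dem_name
    dem_path.write_bytes(dem_bytes)

    # 2. The worker reads the DEM and writes scratch data under temp/.
    scratch = Path(tempfile.mkdtemp(dir=temp))
    try:
        intermediate = scratch / "intermediate.bin"
        intermediate.write_bytes(dem_path.read_bytes())

        # 3. Placeholder for the real workflow: record the input size.
        output = results / f"{dem_path.stem}.summary.txt"
        output.write_text(f"{dem_name}: {intermediate.stat().st_size} bytes\n")
    finally:
        # 4. Scratch data is ephemeral, so remove it even if processing fails.
        shutil.rmtree(scratch, ignore_errors=True)
    return output
```

Cleaning up in a `finally` block keeps the ephemeral temp/ volume from filling up when a job fails partway through.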
Network Configuration¶
Bridge Network¶
The default eemt-network uses Docker's bridge driver:
Service Discovery¶
Containers use Docker's internal DNS for service discovery:
- Web interface: eemt-web or eemt-master
- Workers: eemt-worker, eemt-worker-2, etc.
- Documentation: eemt-docs
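Because the web interface may be reachable under either of two names depending on the profile, a client can try each name in turn. The helper below is a hypothetical sketch; the resolver is injectable so the logic can be exercised outside a Docker network.

```python
import socket
from typing import Callable, Iterable, Optional


def first_resolvable(
    names: Iterable[str],
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> Optional[tuple[str, str]]:
    """Return (name, address) for the first service name that resolves, else None."""
    for name in names:
        try:
            return name, resolve(name)
        except OSError:  # socket.gaierror is a subclass of OSError
            continue
    return None


if __name__ == "__main__":
    # Inside eemt-network, Docker's embedded DNS would answer one of these.
    print(first_resolvable(["eemt-web", "eemt-master"]))
```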
Port Exposure¶
| Service | Internal Port | External Port | Protocol | Purpose |
|---|---|---|---|---|
| Web Interface | 5000 | 5000 | HTTP | FastAPI application |
| Work Queue | 9123 | 9123 | TCP | CCTools master |
| Documentation | 8000 | 8000 | HTTP | MkDocs server |
| Monitoring | 9090 | 9090 | HTTP | Prometheus (future) |
Container Lifecycle Management¶
Startup Sequence¶
- Network Creation: Docker creates eemt-network
- Volume Initialization: Bind mounts are established
- Base Services: Database, cache services start
- Web Interface: FastAPI application initializes
- Workers: Worker containers connect to master
- Health Checks: Containers report ready status
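Workers can only connect once the master is listening, so a worker entrypoint typically waits for the Work Queue port before starting. A minimal sketch of such a wait loop follows, using only the standard library. The function name and defaults are illustrative, and the host and port in the usage line come from the tables in this document.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Poll until a TCP connection to host:port succeeds or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused, unresolved name, or timed out: retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)


if __name__ == "__main__":
    ready = wait_for_port("eemt-master", 9123, timeout=60.0)
    print("master ready" if ready else "master not reachable")
```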
Health Monitoring¶
# Health check configuration
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
CMD curl -f http://localhost:5000/health || exit 1
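If `curl` is not available in an image, the same probe can be expressed in Python. The sketch below roughly mirrors the intent of `curl -f` by treating only 2xx responses as healthy; the `/health` URL is taken from the health check above, and the function name is hypothetical.

```python
import sys
import urllib.error
import urllib.request


def is_healthy(url: str = "http://localhost:5000/health", timeout: float = 10.0) -> bool:
    """Return True only if the endpoint answers with a 2xx status within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # HTTP 4xx/5xx, refused connections, and timeouts all count as unhealthy.
        return False


if __name__ == "__main__":
    # Exit code 0 means healthy, 1 means unhealthy, as HEALTHCHECK expects.
    sys.exit(0 if is_healthy() else 1)
```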
Resource Limits¶
# Container resource constraints
deploy:
  resources:
    limits:
      cpus: '4'
      memory: 8G
    reservations:
      cpus: '2'
      memory: 4G
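Reservations should never exceed the corresponding limits. A small sanity check, sketched under the assumption that memory values use the K/M/G suffixes shown above and are interpreted as binary multiples:

```python
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_memory(value: str) -> int:
    """Convert a size such as '8G' or '512M' to bytes (binary multiples)."""
    text = value.strip().upper().removesuffix("B")
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def reservation_fits(limits: dict, reservations: dict) -> bool:
    """Check that reserved CPU and memory do not exceed the hard limits."""
    return (
        float(reservations["cpus"]) <= float(limits["cpus"])
        and parse_memory(reservations["memory"]) <= parse_memory(limits["memory"])
    )


if __name__ == "__main__":
    limits = {"cpus": "4", "memory": "8G"}
    reservations = {"cpus": "2", "memory": "4G"}
    print(reservation_fits(limits, reservations))
```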
Security Considerations¶
Container Security¶
- Non-root Execution: Containers run as user eemt (UID 1000)
- Minimal Base Images: Only essential packages installed
- Read-only Root Filesystem: Where applicable
- Secrets Management: Environment variables for sensitive data
# Security configuration in Dockerfile
RUN useradd -m -s /bin/bash -u 1000 eemt
USER eemt
WORKDIR /home/eemt
Network Security¶
Volume Security¶
- Read-only mounts for input data
- Restricted permissions on result directories
- Temporary data cleaned after processing
Performance Optimization¶
Build Optimization¶
# Multi-stage build for smaller images
FROM ubuntu:24.04 AS builder
RUN apt-get update && apt-get install -y build-essential
# Build steps...
FROM ubuntu:24.04
COPY --from=builder /app /app
Layer Caching¶
Runtime Optimization¶
- Shared memory for inter-process communication
- Volume caching for frequently accessed data
- CPU affinity for computational tasks
Troubleshooting¶
Common Issues¶
Container Fails to Start¶
# Check logs
docker logs eemt-web
# Inspect container
docker inspect eemt-web
# Debug interactively
docker run -it --entrypoint /bin/bash eemt-web
Volume Permission Issues¶
# Fix ownership (run as root, since the container's default user is eemt)
docker exec -u root eemt-web chown -R eemt:eemt /app/data
# Check permissions
docker exec eemt-web ls -la /app/data
Network Connectivity¶
# Test internal DNS
docker exec eemt-worker ping -c 3 eemt-master
# Check network configuration
docker network inspect eemt-network
Debugging Tools¶
# Container shell access
docker exec -it eemt-web /bin/bash
# Process monitoring
docker top eemt-worker
# Resource usage
docker stats --no-stream
# Network debugging
docker run --rm -it --network eemt-network nicolaka/netshoot
Best Practices¶
Image Management¶
- Version Tags: Always use specific version tags
- Regular Updates: Rebuild images monthly for security patches
- Image Scanning: Use docker scout (the successor to the retired docker scan) for vulnerability detection
- Registry Usage: Push to registry for distributed deployments
Container Operations¶
- Graceful Shutdown: Implement SIGTERM handlers
- Log Management: Use centralized logging
- Monitoring: Implement health checks and metrics
- Backup Strategy: Regular volume backups
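For the graceful-shutdown item, `docker stop` sends SIGTERM and follows up with SIGKILL once the grace period expires, so a long-running process should notice SIGTERM and finish its current unit of work. Below is a minimal sketch of one way to do that in Python; the function names are hypothetical and not taken from the EEMT code base.

```python
import signal
import threading

# Set once a termination signal arrives; long-running loops poll it.
shutdown_requested = threading.Event()


def _handle_sigterm(signum, frame):
    """Record the request instead of exiting so in-flight work can finish."""
    shutdown_requested.set()


def install_sigterm_handler() -> None:
    """Register the handler; must be called from the main thread."""
    signal.signal(signal.SIGTERM, _handle_sigterm)


def run_until_shutdown(do_work, poll_seconds: float = 1.0) -> int:
    """Call do_work() repeatedly until SIGTERM is received; return iterations run."""
    iterations = 0
    while not shutdown_requested.is_set():
        do_work()
        iterations += 1
        # Sleep between iterations, waking early if shutdown is requested.
        shutdown_requested.wait(poll_seconds)
    return iterations
```

Setting a flag rather than exiting inside the handler lets the loop stop at a safe point, for example after a workflow task has written its outputs.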
Development Workflow¶
- Local Development: Use bind mounts for code changes
- Testing: Separate test containers with isolated data
- Staging: Mirror production configuration
- CI/CD Integration: Automated builds and deployments
Migration and Upgrades¶
Container Image Updates¶
# Pull latest images
docker pull eemt:ubuntu24.04
# Stop running containers
docker-compose down
# Update and restart
docker-compose up -d
Data Migration¶
# Backup volumes
docker run --rm -v eemt_data:/data -v "$(pwd)":/backup \
  ubuntu tar czf /backup/eemt-backup.tar.gz -C / data
# Restore volumes
docker run --rm -v eemt_data:/data -v "$(pwd)":/backup \
  ubuntu tar xzf /backup/eemt-backup.tar.gz -C /
Future Enhancements¶
Planned Improvements¶
- Kubernetes Support: Helm charts for K8s deployment
- GPU Acceleration: CUDA-enabled containers
- Service Mesh: Istio/Linkerd integration
- Observability: Prometheus + Grafana stack
- Registry Integration: Harbor/Nexus private registry
Container Roadmap¶
- Q1 2025: GPU-accelerated GRASS GIS container
- Q2 2025: Kubernetes operators for EEMT
- Q3 2025: Multi-architecture images (ARM64)
- Q4 2025: Serverless container deployments