Security Configuration
Phentrieve's Docker deployment is hardened for production use to minimize the attack surface and follow security best practices. This document describes the security measures implemented in the Docker containers.
Security Philosophy
The deployment follows a defense-in-depth approach with multiple layers of security controls:
- Principle of Least Privilege: Containers run with minimal permissions
- Immutable Infrastructure: Read-only filesystems prevent runtime tampering
- Resource Isolation: CPU and memory limits contain resource-exhaustion (DoS) attacks
- Network Segmentation: Internal networks isolate services
- Minimal Attack Surface: Only essential capabilities granted
Container Security Hardening
1. Non-Root Execution
Configuration:
# API Container
user: "10001:10001" # phentrieve:phentrieve
# Frontend Container (Nginx)
user: "101:101" # nginx:nginx
Why This Matters:
- Prevents container breakout from escalating to root on host
- Limits damage if application is compromised
- Enforces principle of least privilege
Verification:
# Check user inside API container
docker exec phentrieve-api-1 id
# Output: uid=10001(phentrieve) gid=10001(phentrieve)
# Check user inside frontend container
docker exec phentrieve-frontend-1 id
# Output: uid=101(nginx) gid=101(nginx)
2. Read-Only Root Filesystem
Configuration:
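A minimal sketch of this setting, assuming a Compose service named `api` (the service name is illustrative; the `read_only: true` key is the one named in the production checklist later in this document):

```yaml
services:
  api:                 # assumed service name
    read_only: true    # mount the container's root filesystem read-only
```

With this key set, any path the application must write to has to be provided explicitly as a tmpfs mount or a volume.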
Why This Matters:
- Prevents malicious code from modifying the container's root filesystem
- Makes containers immutable after deployment
- Blocks runtime tampering attempts: writes outside the explicit tmpfs mounts and volumes fail with an error
Writable Areas (Explicit tmpfs Mounts):
API Container:
tmpfs:
  - /tmp:uid=10001,gid=10001,mode=1777,size=1G # Temporary files
  - /app/.cache:uid=10001,gid=10001,mode=0755,size=2G # Model cache
Frontend Container:
tmpfs:
  - /tmp:uid=101,gid=101,mode=1777,size=100M # Nginx temp
  - /var/cache/nginx:uid=101,gid=101,mode=0755,size=50M
  - /var/run:uid=101,gid=101,mode=0755,size=10M
Verification:
# Try to create file in read-only area (should fail)
docker exec phentrieve-api-1 touch /app/test.txt
# Error: Read-only file system
# Create file in allowed tmpfs area (should succeed)
docker exec phentrieve-api-1 touch /tmp/test.txt
# Success
3. Capability Dropping
Configuration:
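A minimal sketch of this setting, assuming a Compose service named `api` (illustrative name; the `cap_drop: [ALL]` value is the one named in the production checklist later in this document):

```yaml
services:
  api:          # assumed service name
    cap_drop:
      - ALL     # drop every Linux capability, including Docker's default set
```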
What Gets Dropped:
All Linux capabilities (around 40 on current kernels), including:
- CAP_NET_ADMIN - Network configuration
- CAP_SYS_ADMIN - System administration operations
- CAP_CHOWN - Ownership changes
- CAP_DAC_OVERRIDE - File permission bypassing
- CAP_SETUID/SETGID - UID/GID changes
- CAP_KILL - Signal sending to arbitrary processes
Why This Matters:
- Prevents privilege escalation attacks
- Limits the privileged operations available to a compromised container
- Follows principle of least privilege
When to Add Capabilities Back:
Only if absolutely necessary (extremely rare):
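As an illustration only, a single capability can be re-added on top of `cap_drop: ALL`. The service name and the choice of `NET_BIND_SERVICE` are assumptions for this sketch, not a statement of what Phentrieve requires:

```yaml
services:
  frontend:                  # assumed service name
    cap_drop:
      - ALL
    cap_add:
      - NET_BIND_SERVICE     # hypothetical example: bind to ports below 1024
```

Each re-added capability should be documented with the reason it is needed, so it can be removed again when that reason goes away.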
Verification:
# Check capabilities inside container (requires capsh from libcap in the image)
docker exec phentrieve-api-1 capsh --print
# Should show an empty current set, e.g. "Current: ="
4. Security Options
Configuration:
security_opt:
  - no-new-privileges:true # Prevent privilege escalation
  - seccomp:unconfined # May need tuning for ChromaDB
no-new-privileges:
- Prevents processes from gaining new privileges via setuid or setgid binaries
- Makes sudo and su ineffective, which blocks exploits that depend on them
- Essential for defense-in-depth
seccomp (Secure Computing Mode):
- Filters the system calls available to container processes
- Currently unconfined for ChromaDB compatibility, which also disables Docker's default seccomp profile
- Can be hardened with a custom seccomp profile once the required system calls are known
Future Hardening:
Create custom seccomp profile:
{
  "defaultAction": "SCMP_ACT_ERRNO",
  "architectures": ["SCMP_ARCH_X86_64"],
  "syscalls": [
    {"names": ["read", "write", "open", "close", ...], "action": "SCMP_ACT_ALLOW"}
  ]
}
Resource Limits
CPU and Memory Constraints
API Container:
deploy:
  resources:
    limits:
      cpus: '4.0' # Maximum 4 CPU cores
      memory: 8G # Maximum 8GB RAM
    reservations:
      cpus: '1.0' # Minimum 1 CPU core
      memory: 4G # Minimum 4GB RAM
Frontend Container:
deploy:
  resources:
    limits:
      cpus: '1.0' # Maximum 1 CPU core
      memory: 512M # Maximum 512MB RAM
    reservations:
      cpus: '0.25' # Minimum 0.25 CPU cores
      memory: 128M # Minimum 128MB RAM
Why This Matters:
- DoS Prevention: Prevents resource exhaustion attacks
- Multi-Tenancy: Ensures fair resource sharing on shared infrastructure
- Predictability: Guarantees minimum resources for operation
- Cost Control: Prevents runaway processes from consuming excessive resources
Monitoring Resource Usage:
# Real-time stats
docker stats phentrieve-api-1 phentrieve-frontend-1
# Check for OOM kills (prints true if the container was killed for exceeding its memory limit)
docker inspect --format='{{.State.OOMKilled}}' phentrieve-api-1
Log Management
Structured Logging with Rotation
Configuration:
logging:
  driver: "json-file"
  options:
    max-size: "10m" # Maximum 10MB per log file
    max-file: "3" # Keep 3 rotated files
    labels: "service,env"
Why This Matters:
- Disk Exhaustion Prevention: Logs don't fill up disk
- Log Retention: Keeps recent logs while managing space
- Structured Format: JSON format for log aggregation systems
Log Storage:
Total log storage per container:
- API: 30MB maximum (10MB × 3 files)
- Frontend: 30MB maximum (10MB × 3 files)
Accessing Logs:
# View recent logs
docker logs phentrieve-api-1 --tail 100
# Follow logs in real-time
docker logs -f phentrieve-api-1
# Export the last hour of logs with timestamps (stdout and stderr)
docker logs --timestamps --since 1h phentrieve-api-1 > api-logs.txt 2>&1
Network Isolation
Network Architecture
Internal Network:
phentrieve_internal_net:
  driver: bridge
  internal: false # Allows internet access for model downloads
External Proxy Network (Optional):
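A sketch of how such a network is typically declared, assuming the reverse-proxy network is named `npm_proxy_network` (as in the topology shown in this section) and already exists outside the Compose project. The service name `frontend` is an assumption:

```yaml
networks:
  npm_proxy_network:
    external: true            # created and managed outside this Compose project

services:
  frontend:                   # assumed service name
    networks:
      - npm_proxy_network         # reachable by the reverse proxy
      - phentrieve_internal_net   # can reach the API
```

Only the frontend joins the proxy network in this sketch, so the API stays reachable solely over the internal network.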
Why This Matters:
- Service Isolation: Backend not directly exposed to internet
- Controlled Access: Only frontend accessible via proxy
- Defense in Depth: Network segmentation limits lateral movement
Network Topology:
Internet
  │
  ├─> npm_proxy_network (external)
  │     │
  │     └─> phentrieve_frontend (nginx)
  │           │
  └─> phentrieve_internal_net
        │
        └─> phentrieve_api (FastAPI)
Verification:
# List networks
docker network ls
# Inspect network
docker network inspect phentrieve_phentrieve_internal_net
Data Volume Security
Mount Permissions
Read-Only Data Mount:
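A sketch of a read-only mount, assuming the whole data directory is mounted at `/phentrieve_data_mount`. The source and target paths are inferred from the selective mounts shown in this section and may differ from the actual Compose file; the service name `api` is also an assumption:

```yaml
services:
  api:    # assumed service name
    volumes:
      # Assumed layout: core data is mounted read-only
      - ${PHENTRIEVE_HOST_DATA_DIR}:/phentrieve_data_mount:ro
```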
Selective Read-Write Mounts:
volumes:
  # Only indexes need write access
  - ${PHENTRIEVE_HOST_DATA_DIR}/indexes:/phentrieve_data_mount/indexes:rw
  # Model cache needs write access
  - ${PHENTRIEVE_HOST_HF_CACHE_DIR}:/app/.cache/huggingface:rw
Why This Matters:
- Data Integrity: Core data cannot be corrupted by container processes
- Principle of Least Privilege: Write access only where absolutely necessary
- Audit Trail: Changes limited to specific directories
Security Best Practices:
- Separate Data Directory: Don't mount entire host filesystem
- Explicit Permissions: Always specify :ro or :rw explicitly
- No Sensitive Data: Don't mount host /etc, /var, or /home
- UID/GID Alignment: Ensure host permissions match container UID 10001
Verification:
# Check mount permissions
docker inspect phentrieve-api-1 | grep -A 10 Mounts
# Verify data directory ownership on host (-n prints numeric UID/GID)
ls -lan "${PHENTRIEVE_HOST_DATA_DIR}"
# Should show: drwxr-xr-x ... 10001 10001 ...
Health Checks
Liveness and Readiness Probes
API Health Check:
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:8000/api/v1/health || exit 1"]
  interval: 30s # Check every 30 seconds
  timeout: 10s # 10 second timeout
  retries: 5 # 5 consecutive failures before unhealthy
  start_period: 180s # 3 minute grace period
Frontend Health Check:
healthcheck:
  test: ["CMD-SHELL", "curl -f http://localhost:80/health || exit 1"]
  interval: 30s
  timeout: 5s
  retries: 3
  start_period: 30s
Why This Matters:
- Automatic Recovery: Orchestrators such as Docker Swarm restart unhealthy containers automatically; plain Docker Compose only reports the health status
- Load Balancer Integration: Health checks guide traffic routing
- Early Detection: Identifies issues before users notice
Monitoring Health:
# Check health status
docker ps --filter name=phentrieve
# View health logs
docker inspect --format='{{json .State.Health}}' phentrieve-api-1 | jq
Restart Policies
Configuration:
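A minimal sketch, assuming a Compose service named `api` (illustrative name) and the `unless-stopped` policy recommended in this section:

```yaml
services:
  api:                        # assumed service name
    restart: unless-stopped   # restart on exit or daemon start, unless stopped manually
```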
Options:
- no: Never restart (not recommended for production)
- always: Always restart, including after a daemon restart when the container was stopped manually (can cause restart loops)
- on-failure: Restart only on a non-zero exit code (good for debugging)
- unless-stopped: Restart except when explicitly stopped (recommended)
Why unless-stopped:
- Survives server reboots
- Doesn't restart when manually stopped
- Avoids restart loops during maintenance
Security Checklist for Production
Before deploying to production, verify:
- [ ] Containers run as non-root users (UID 10001 for API, 101 for frontend)
- [ ] Read-only filesystem enabled (read_only: true)
- [ ] All capabilities dropped (cap_drop: [ALL])
- [ ] Security options set (no-new-privileges:true)
- [ ] Resource limits configured (CPU, memory)
- [ ] Log rotation enabled (max-size, max-file)
- [ ] Health checks configured and passing
- [ ] Volumes mounted with explicit permissions (:ro or :rw)
- [ ] No secrets in environment variables (use Docker secrets instead)
- [ ] Latest security patches applied (rebuild images regularly)
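The checklist items map onto Compose keys roughly as follows. This is a condensed sketch for the API container that reuses the values shown earlier in this document; the service name `api` and the use of the `latest` image tag are assumptions, and the actual Compose file may be organized differently:

```yaml
services:
  api:                                   # assumed service name
    image: ghcr.io/berntpopp/phentrieve/api:latest
    user: "10001:10001"                  # non-root user
    read_only: true                      # read-only root filesystem
    cap_drop:
      - ALL                              # all capabilities dropped
    security_opt:
      - no-new-privileges:true
    tmpfs:
      - /tmp:uid=10001,gid=10001,mode=1777,size=1G
    restart: unless-stopped
    deploy:
      resources:
        limits:
          cpus: '4.0'
          memory: 8G
    logging:
      driver: "json-file"
      options:
        max-size: "10m"                  # log rotation
        max-file: "3"
    healthcheck:
      test: ["CMD-SHELL", "curl -f http://localhost:8000/api/v1/health || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 5
      start_period: 180s
```

Keeping all hardening keys on one service definition makes it easier to review them against the checklist in a single pass.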
Vulnerability Management
Regular Security Updates
Update Schedule:
1. Base Images: Update monthly or when CVEs announced
2. Dependencies: Run uv lock --upgrade weekly, then test and deploy
3. Security Patches: Apply critical patches immediately
Scanning for Vulnerabilities:
# Scan local images with Docker Scout
docker scout cves ghcr.io/berntpopp/phentrieve/api:latest
# Scan with Trivy (more detailed)
trivy image ghcr.io/berntpopp/phentrieve/api:latest
# Only show HIGH and CRITICAL
trivy image --severity HIGH,CRITICAL ghcr.io/berntpopp/phentrieve/api:latest
Automated Scanning:
- GitHub Dependabot enabled (weekly dependency checks)
- GitHub Actions CI scans on every PR
- GHCR vulnerability scanning on push
Secrets Management
NEVER do this:
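An illustrative anti-pattern, with hypothetical variable names and placeholder values: secrets written in plain text into the Compose file end up in version control and are visible through `docker inspect`.

```yaml
# Anti-pattern sketch: do not copy. Names and values are placeholders.
services:
  api:                                        # assumed service name
    environment:
      - DATABASE_PASSWORD=plaintext-password  # exposed in the file and in docker inspect
      - API_KEY=plaintext-api-key
```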
DO this instead:
secrets:
  - database_password
  - api_key
environment:
  - DATABASE_PASSWORD_FILE=/run/secrets/database_password
Creating Docker Secrets:
# Create secret from stdin (docker secret requires Swarm mode)
printf '%s' "my_secret_value" | docker secret create database_password -
# Reference it in docker-compose.yml
secrets:
  database_password:
    external: true
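`docker secret create` is a Swarm command. For a plain Docker Compose deployment, a file-based secret is a common alternative. The sketch below assumes a service named `api` and a hypothetical file path; neither is taken from the Phentrieve repository:

```yaml
services:
  api:                                   # assumed service name
    secrets:
      - database_password                # mounted at /run/secrets/database_password
    environment:
      - DATABASE_PASSWORD_FILE=/run/secrets/database_password

secrets:
  database_password:
    file: ./secrets/database_password.txt   # hypothetical path; keep it out of version control
```

The application then reads the secret from the file named in `DATABASE_PASSWORD_FILE` instead of from an environment variable that holds the value itself.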
Incident Response
If Container is Compromised
- Isolate Immediately
- Capture Forensics
- Investigate
- Rebuild and Redeploy
- Post-Incident:
  - Review and update security controls
  - Patch vulnerabilities
  - Update incident response plan
Security Monitoring
Recommended Tools
Host-Level:
- Falco: Runtime security monitoring
- Auditd: Linux kernel audit logs
- OSSEC/Wazuh: Host-based intrusion detection
Container-Level:
- Docker Bench Security: Automated security audit
- Clair/Trivy: Vulnerability scanning
- Sysdig: Container forensics and monitoring
Network-Level:
- Zeek: Network traffic analysis
- Suricata: Intrusion detection/prevention
Audit Logging
Enable Docker daemon audit logging:
Monitor events:
# Docker events
docker events --filter type=container --filter event=start
# Docker daemon logs on systemd hosts
journalctl -u docker.service -f
Compliance Considerations
These security measures help satisfy requirements for:
- ISO 27001: Information security management
- SOC 2 Type II: Security, availability, confidentiality
- HIPAA: Healthcare data protection (with additional controls)
- GDPR: Data protection and privacy
- PCI DSS: Payment card data security (if applicable)
Note: Full compliance requires additional organizational and process controls beyond container security.
Further Reading
- CIS Docker Benchmark
- OWASP Docker Security Cheat Sheet
- Docker Security Best Practices
- NIST Application Container Security Guide
Security is a Process, Not a Product
These measures provide strong baseline security, but must be combined with:
- Regular security audits and penetration testing
- Incident response planning and drills
- Security awareness training for developers
- Continuous monitoring and threat intelligence
- Patch management and vulnerability remediation
Security Updates
Subscribe to security advisories for all dependencies:
- Docker Security: https://docs.docker.com/engine/security/
- Python Security: https://www.python.org/news/security/
- FastAPI Security: https://fastapi.tiangolo.com/deployment/security/