Monitoring & Alerting¶
Monitoring strategy for the JDX Odoo production environment.
Health Endpoints¶
| Service | Endpoint | Expected |
|---|---|---|
| Odoo | /web/health | HTTP 200 |
| PWA | /api/health | HTTP 200 |
| Docs | / | HTTP 200 |
| Nginx | Direct probe | HTTP 200 |
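As a minimal sketch of how one of these endpoints could be probed by hand, the helpers below fetch the status code with `curl` and format a report line. The function names `http_status` and `report_status` are hypothetical, and the local URL in the usage comment is an assumption based on the Docker health check on this page.

```shell
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of the project scripts.

# Print the HTTP status code for a URL ("000" if the request fails).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Turn a service name and status code into a report line.
# Returns 0 when the code is 200, 1 otherwise.
report_status() {
  local name="$1" code="$2"
  if [ "$code" = "200" ]; then
    echo "$name: OK (HTTP $code)"
  else
    echo "$name: FAIL (HTTP $code)"
    return 1
  fi
}

# Usage (assumed local URL):
# report_status "Odoo" "$(http_status http://localhost:8069/web/health)"
```

Keeping the fetch and the formatting separate makes it possible to check the reporting logic without network access.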
Health Check Script¶
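Run a manual health check with the health-check script (in production, the cron job under Scheduled Checks calls /opt/odoo/scripts/health-check.sh):
Example output: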
============================================
JDX Odoo Health Check
2025-12-12 10:45:00
============================================
Service Health:
---------------
Odoo: OK (HTTP 200)
PWA: OK (HTTP 200)
Docs: OK (HTTP 200)
Nginx: OK (HTTP 200)
Container Status:
-----------------
odoo: healthy
pwa: healthy
docs: healthy
nginx: healthy
db: healthy
System Resources:
-----------------
Disk space: OK (45% used)
Memory: OK (62% used)
============================================
Status: ALL HEALTHY
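The "System Resources" lines in the output above could be produced along these lines. This is a sketch, not the actual script: the function names are hypothetical, the memory reading assumes Linux (`/proc/meminfo`), and the limits in the usage comments are borrowed from the CloudWatch thresholds on this page.

```shell
#!/usr/bin/env bash
# Sketch only: illustrative helpers for the resource section of a health report.

# Percentage of the root filesystem in use, as a bare number.
disk_used_percent() {
  df -P / | awk 'NR==2 { gsub("%", "", $5); print $5 }'
}

# Percentage of RAM in use (Linux only, reads /proc/meminfo).
mem_used_percent() {
  awk '/^MemTotal:/ {t=$2} /^MemAvailable:/ {a=$2}
       END { printf "%d\n", (t - a) * 100 / t }' /proc/meminfo
}

# Format one report line; returns 1 when usage reaches the limit.
usage_line() {
  local label="$1" used="$2" limit="$3"
  if [ "$used" -lt "$limit" ]; then
    echo "$label: OK (${used}% used)"
  else
    echo "$label: WARNING (${used}% used)"
    return 1
  fi
}

# Usage (limits are assumptions):
# usage_line "Disk space" "$(disk_used_percent)" 80
# usage_line "Memory" "$(mem_used_percent)" 85
```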
Docker Health Checks¶
All services have built-in health checks:
# docker-compose.yml
healthcheck:
  test: ["CMD", "curl", "-f", "http://localhost:8069/web/health"]
  interval: 30s
  timeout: 10s
  retries: 3
  start_period: 60s
View container health:
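The command is not shown on this page. One way to read the health status with the standard Docker CLI is sketched below. The function name `service_health` is hypothetical, and the service names are assumed to match the Compose file (for example `odoo`, `pwa`, `docs`, `nginx`, `db`).

```shell
#!/usr/bin/env bash
# Sketch only: print the health status of one Compose service.
service_health() {
  local service="$1" id
  # Resolve the Compose service name to its container ID.
  id="$(docker compose ps -q "$service")"
  # Read the status reported by the container's healthcheck.
  docker inspect --format '{{.State.Health.Status}}' "$id"
}

# Usage (run from the directory containing docker-compose.yml):
# service_health odoo
```

For a quick overview of all services, `docker compose ps` also lists each container's state, including its health.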
CloudWatch Metrics (Production)¶
Recommended Metrics¶
| Metric | Threshold | Action |
|---|---|---|
| CPU Utilization | > 80% for 5 min | Alert |
| Memory Utilization | > 85% | Alert |
| Disk Usage | > 80% | Alert |
| HTTP 5xx Errors | > 10/min | Alert |
| Response Time | > 5s avg | Alert |
| Database Connections | > 80% | Alert |
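As an illustration of turning the first row into an alarm, the sketch below wraps `aws cloudwatch put-metric-alarm`. It assumes the built-in EC2 metric `CPUUtilization` in the `AWS/EC2` namespace rather than the agent metrics. The alarm name, the instance ID, and the SNS topic ARN are placeholders.

```shell
#!/usr/bin/env bash
# Sketch only: create a CPU alarm (average above 80% over one 5-minute period).
# $1 = EC2 instance ID, $2 = SNS topic ARN to notify (both supplied by the caller).
create_cpu_alarm() {
  local instance_id="$1" sns_topic_arn="$2"
  aws cloudwatch put-metric-alarm \
    --alarm-name "odoo-cpu-high" \
    --namespace "AWS/EC2" \
    --metric-name "CPUUtilization" \
    --dimensions "Name=InstanceId,Value=${instance_id}" \
    --statistic Average \
    --period 300 \
    --evaluation-periods 1 \
    --threshold 80 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions "${sns_topic_arn}"
}
```

The other rows would follow the same pattern with a different metric, namespace, and threshold.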
CloudWatch Agent Config¶
{
  "metrics": {
    "metrics_collected": {
      "cpu": {
        "measurement": ["cpu_usage_active"],
        "metrics_collection_interval": 60
      },
      "mem": {
        "measurement": ["mem_used_percent"],
        "metrics_collection_interval": 60
      },
      "disk": {
        "measurement": ["used_percent"],
        "resources": ["/"],
        "metrics_collection_interval": 60
      }
    }
  }
}
Log Aggregation¶
Docker Logs¶
# View all logs
docker compose logs -f
# View specific service
docker compose logs -f odoo
# Last 100 lines
docker compose logs --tail=100 odoo
Log Locations¶
| Service | Log Path |
|---|---|
| Odoo | Container stdout/stderr |
| Nginx | /var/log/nginx/*.log |
| PostgreSQL | Container stdout/stderr |
CloudWatch Logs (Production)¶
Configure log shipping:
# Install the CloudWatch agent
sudo yum install -y amazon-cloudwatch-agent
# Load the agent config and start the agent
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl \
  -a fetch-config -m ec2 -c file:/opt/cloudwatch-config.json -s
Alerting¶
Alert Channels¶
| Severity | Channel | Response Time |
|---|---|---|
| Critical | PagerDuty/Phone | 15 min |
| High | Slack + Email | 1 hour |
| Medium | | 4 hours |
| Low | Dashboard | Next business day |
Critical Alerts¶
- Service down > 2 minutes
- Database connection failure
- SSL certificate expiring < 7 days
- Disk usage > 90%
- Memory usage > 95%
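The certificate condition in the list above could be checked with `openssl x509 -checkend`, which reports whether a certificate expires within a given number of seconds. The sketch below is one possible approach. The function name is hypothetical and the host in the usage comment is the placeholder domain used elsewhere on this page.

```shell
#!/usr/bin/env bash
# Sketch only: succeeds (exit 0) if the certificate served on host:443 is
# still valid N days from now; fails if it expires sooner or cannot be fetched.
cert_valid_for_days() {
  local host="$1" days="$2"
  local seconds=$(( days * 86400 ))
  openssl s_client -connect "${host}:443" -servername "$host" </dev/null 2>/dev/null \
    | openssl x509 -noout -checkend "$seconds" >/dev/null
}

# Usage:
# cert_valid_for_days erp.domain.com 7 || echo "Certificate expires within 7 days"
```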
Slack Webhook Example¶
#!/bin/bash
# Send an alert to Slack (SLACK_WEBHOOK_URL must be set in the environment)
curl -X POST -H 'Content-type: application/json' \
  --data '{"text":"🚨 Odoo service is down!"}' \
  "$SLACK_WEBHOOK_URL"
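To reuse the call above with different messages, it could be wrapped in a function that escapes the text before embedding it in JSON. This is a sketch: `json_escape` and `send_slack_alert` are hypothetical names, and the escaping covers only backslashes and double quotes, not newlines or control characters.

```shell
#!/usr/bin/env bash
# Sketch only: escape backslashes and double quotes for use in a JSON string.
json_escape() {
  printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# Post a message to the webhook in SLACK_WEBHOOK_URL.
send_slack_alert() {
  local text
  text="$(json_escape "$1")"
  curl -sS -X POST -H 'Content-type: application/json' \
    --data "{\"text\":\"${text}\"}" \
    "$SLACK_WEBHOOK_URL"
}

# Usage:
# send_slack_alert "Odoo service is down!"
```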
Uptime Monitoring¶
External Monitoring¶
Recommended services:
- UptimeRobot (free tier)
- Pingdom
- AWS Route 53 Health Checks
Endpoints to Monitor¶
| URL | Interval | Alert After |
|---|---|---|
| https://erp.domain.com/web/health | 1 min | 2 failures |
| https://field.domain.com/api/health | 1 min | 2 failures |
| https://docs.domain.com/ | 5 min | 2 failures |
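Hosted monitors apply the "alert after 2 failures" rule themselves. For a self-hosted probe, for example one run from cron, the same rule could be sketched with a small counter file per check. The state directory and the function name `record_probe` are assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: track consecutive failures per check in a counter file.
STATE_DIR="${STATE_DIR:-/tmp/health-state}"  # assumed location

# Record one probe result ("ok" or anything else for a failure).
# Prints the consecutive failure count and returns 0 once it
# reaches the threshold, meaning an alert is due.
record_probe() {
  local name="$1" result="$2" threshold="${3:-2}"
  local file="$STATE_DIR/$name.count" count=0
  mkdir -p "$STATE_DIR"
  if [ -f "$file" ]; then
    count="$(cat "$file")"
  fi
  if [ "$result" = "ok" ]; then
    count=0                      # any success resets the streak
  else
    count=$(( count + 1 ))
  fi
  echo "$count" > "$file"
  echo "$count"
  [ "$count" -ge "$threshold" ]
}

# Usage:
# record_probe erp fail >/dev/null && echo "erp: alert after 2 failures"
```

Resetting the counter on the first success keeps a single transient failure from ever triggering an alert.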
Scheduled Checks¶
Cron Jobs (Production)¶
# Health check every 5 minutes
*/5 * * * * /opt/odoo/scripts/health-check.sh >> /var/log/health-check.log 2>&1
# Daily backup at 2 AM
0 2 * * * /opt/odoo/scripts/backup.sh full >> /var/log/backup.log 2>&1
# Weekly cleanup on Sunday at 3 AM
0 3 * * 0 /opt/odoo/scripts/backup.sh cleanup >> /var/log/backup.log 2>&1
Dashboard¶
Key Metrics to Display¶
- Service status (green/red)
- Response times (graph)
- Error rates (graph)
- Active users (counter)
- Database connections (gauge)
- Disk/Memory usage (gauge)