# Docker Metrics Collector

A lightweight, modular Docker monitoring tool that collects comprehensive metrics from containers, volumes, and the Docker system, then sends them to Graphite.

## 🚀 Features

### Comprehensive Metrics Collection

**Container Metrics:**

- CPU usage percentage (accurate per-container calculation)
- Memory usage (bytes and percentage)
- Network I/O (rx/tx bytes and packets)
- Block I/O (read/write bytes)
- Container state (running=2, paused=1, stopped=0)
- Health status (healthy=2, starting=1, unhealthy=0)
- Restart count

**Volume Metrics:**

- Container count per volume
- Volume labels count
- Volume usage tracking

**System Metrics:**

- Total/running/paused/stopped container counts
- Total image count and active images
- System-wide storage usage (images, containers, volumes)
- Docker system df parsing for detailed disk usage

**Aggregated Metrics:**

- Per-container metric summaries
- Volume usage patterns (in-use vs unused)
- Container utilization percentage

## Quick Start

### Using Docker Compose

```bash
# Start Graphite and the metrics collector
docker compose up -d

# View logs
docker logs -f docker-df-collector

# Access Grafana
open http://localhost:80
```

The collector gathers metrics every `INTERVAL_SECONDS` seconds (60 by default) and sends them to Graphite.

## Configuration

Configure via environment variables in `compose.yml`:

| Variable            | Description                    | Default                |
| ------------------- | ------------------------------ | ---------------------- |
| `GRAPHITE_ENDPOINT` | Graphite plaintext endpoint    | `http://graphite:2003` |
| `GRAPHITE_PREFIX`   | Prefix for all metric names    | `docker-metrics`       |
| `INTERVAL_SECONDS`  | Collection interval in seconds | `60`                   |
| `DEBUG`             | Enable debug console output    | `false`                |

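
As an illustration of how these variables and their documented defaults could be read, here is a minimal sketch. The `Config` class and `load_config` function are hypothetical names, not part of the collector, and the set of strings accepted as a truthy `DEBUG` value is an assumption.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical container for the four documented settings."""
    graphite_endpoint: str
    graphite_prefix: str
    interval_seconds: int
    debug: bool


def load_config(env=None) -> Config:
    """Read settings from an environment mapping, applying the documented defaults."""
    env = os.environ if env is None else env
    return Config(
        graphite_endpoint=env.get("GRAPHITE_ENDPOINT", "http://graphite:2003"),
        graphite_prefix=env.get("GRAPHITE_PREFIX", "docker-metrics"),
        interval_seconds=int(env.get("INTERVAL_SECONDS", "60")),
        # Assumption: "true", "1" and "yes" (case-insensitive) enable debug output.
        debug=env.get("DEBUG", "false").strip().lower() in {"true", "1", "yes"},
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to exercise in tests, for example `load_config({"INTERVAL_SECONDS": "15"})`.
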
## Metrics Reference

All metrics follow the pattern: `{prefix}.{category}.{name}.{metric}`

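To make the naming pattern concrete, the sketch below builds one line of the Graphite plaintext protocol (`<path> <value> <timestamp>` followed by a newline) and sends a batch over TCP. The helper names `sanitize`, `format_line`, and `send_lines` are hypothetical. Replacing dots and other unusual characters in a container name is an assumption, made so that the name stays a single path segment. How the collector itself interprets `GRAPHITE_ENDPOINT` is not shown here; this sketch only uses the host and port from the URL.

```python
import re
import socket
import time
from urllib.parse import urlparse


def sanitize(segment: str) -> str:
    """Keep a name as one path segment (assumed rule: replace anything unusual with '_')."""
    return re.sub(r"[^A-Za-z0-9_-]", "_", segment)


def format_line(prefix, category, name, metric, value, timestamp=None) -> str:
    """Build one plaintext-protocol line: '<path> <value> <unix timestamp>' plus newline."""
    ts = int(time.time() if timestamp is None else timestamp)
    return f"{prefix}.{category}.{sanitize(name)}.{metric} {value} {ts}\n"


def send_lines(endpoint: str, lines, timeout: float = 5.0) -> None:
    """Send pre-formatted lines to the host and port taken from the endpoint URL."""
    parsed = urlparse(endpoint)
    with socket.create_connection((parsed.hostname, parsed.port), timeout=timeout) as sock:
        sock.sendall("".join(lines).encode("utf-8"))
```

For example, `format_line("docker-metrics", "containers", "web", "cpu_percent", 12.5, 1700000000)` produces the line `docker-metrics.containers.web.cpu_percent 12.5 1700000000`.
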
### Container Metrics

```
docker-metrics.containers.{container_name}.cpu_percent
docker-metrics.containers.{container_name}.memory_bytes
docker-metrics.containers.{container_name}.memory_percent
docker-metrics.containers.{container_name}.state
docker-metrics.containers.{container_name}.health
docker-metrics.containers.{container_name}.restart_count
docker-metrics.containers.{container_name}.network.rx_bytes
docker-metrics.containers.{container_name}.network.tx_bytes
docker-metrics.containers.{container_name}.blkio.read_bytes
docker-metrics.containers.{container_name}.blkio.write_bytes
```

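A value like `cpu_percent` is commonly derived from the Docker Engine API stats payload by comparing the current sample (`cpu_stats`) with the previous one (`precpu_stats`). The sketch below shows that usual delta formula. It is not necessarily how this collector computes the value, and the function name `cpu_percent` is chosen here for illustration.

```python
def cpu_percent(stats: dict) -> float:
    """CPU usage in percent from one Docker stats sample (100.0 means one full core)."""
    cpu = stats.get("cpu_stats", {})
    pre = stats.get("precpu_stats", {})

    # Container CPU time and host CPU time consumed between the two samples.
    cpu_delta = (cpu.get("cpu_usage", {}).get("total_usage", 0)
                 - pre.get("cpu_usage", {}).get("total_usage", 0))
    system_delta = cpu.get("system_cpu_usage", 0) - pre.get("system_cpu_usage", 0)
    if cpu_delta <= 0 or system_delta <= 0:
        return 0.0

    # Prefer online_cpus; fall back to the per-CPU list, then to a single CPU.
    per_cpu = cpu.get("cpu_usage", {}).get("percpu_usage") or []
    online = cpu.get("online_cpus") or len(per_cpu) or 1
    return cpu_delta / system_delta * online * 100.0
```

With this scaling a container that saturates two cores reports about 200, so the value can exceed 100 on multi-core hosts.
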
### System Metrics

```
docker-metrics.system.containers.total
docker-metrics.system.containers.running
docker-metrics.system.images.total
docker-metrics.system.images.total_size_bytes
docker-metrics.system.containers.total_size_bytes
docker-metrics.system.volumes.total_size_bytes
```

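The `*_size_bytes` metrics need byte counts, while `docker system df` prints human-readable sizes. The sketch below converts such a string to bytes. It assumes sizes look like `1.5GB`, `12.3kB`, or `0B` and use decimal (power of 1000) units, and it ignores any trailing text such as a reclaimable percentage. The function name `parse_size` is hypothetical and this is not taken from the collector's source.

```python
import re

# Assumption: decimal units, as in "1.5GB" = 1.5 * 1000**3 bytes.
_UNITS = {"B": 1, "KB": 1000, "MB": 1000**2, "GB": 1000**3, "TB": 1000**4}
_SIZE_RE = re.compile(r"^\s*([0-9]*\.?[0-9]+)\s*([kKMGT]?B)")


def parse_size(text: str) -> int:
    """Convert a size string such as '1.5GB' or '2.1GB (50%)' to a byte count."""
    match = _SIZE_RE.match(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    return int(round(float(number) * _UNITS[unit.upper()]))
```

If the Docker Engine API is available, its system disk usage endpoint reports sizes in bytes directly, which avoids string parsing altogether.
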
### Aggregated Metrics

```
docker-metrics.aggregated.volumes.unused_count
docker-metrics.aggregated.system.container_utilization_percent
```

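The exact definitions of these two aggregates are not spelled out here. One plausible reading, sketched below, treats container utilization as the share of running containers among all containers, and an unused volume as one with no containers attached. Both definitions and both function names are assumptions made for illustration.

```python
def container_utilization_percent(running: int, total: int) -> float:
    """Assumed definition: running containers as a percentage of all containers."""
    if total <= 0:
        return 0.0
    return running / total * 100.0


def unused_volume_count(containers_per_volume: dict) -> int:
    """Assumed definition: volumes whose attached-container count is zero."""
    return sum(1 for count in containers_per_volume.values() if count == 0)
```

Under these assumptions, 3 running containers out of 4 give a utilization of 75 percent.
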
## 📊 Grafana Queries

A few example Graphite queries for common Grafana views (container, host, or aggregate):

- Top 10 CPU consumers: `aliasByNode(highestMax(docker-metrics.containers.*.cpu_percent, 10), 2)`
- Total network traffic: `sumSeries(docker-metrics.containers.*.network.rx_bytes)`
- Container health: `aliasByNode(docker-metrics.containers.*.health, 2)`

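The same query expressions can be checked outside Grafana through Graphite's render API (`/render` with `target`, `from`, and `format` parameters). The sketch below builds such a URL and fetches the JSON result. The helper names and the base URL passed in are assumptions, so point it at wherever your Graphite web interface is reachable.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(base_url: str, target: str, since: str = "-1h") -> str:
    """Build a Graphite render API URL that returns JSON for one target expression."""
    query = urlencode({"target": target, "from": since, "format": "json"})
    return f"{base_url.rstrip('/')}/render?{query}"


def fetch_series(base_url: str, target: str, since: str = "-1h", timeout: float = 10.0):
    """Fetch the series for a target; returns the decoded JSON response."""
    with urlopen(render_url(base_url, target, since), timeout=timeout) as response:
        return json.load(response)
```

For example, `fetch_series("http://localhost", "sumSeries(docker-metrics.containers.*.network.rx_bytes)")` would request the last hour of the total network traffic query, assuming Graphite answers at that address.
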
## 🛠️ Development

### Running Locally

```bash
cd src
pip install -r requirements.txt

export GRAPHITE_ENDPOINT=http://localhost:2003
export DEBUG=true

python main.py
```

### Adding Custom Collectors

Extend `BaseCollector` to create new metric collectors:

```python
import time

from collectors.base import BaseCollector


class MyCollector(BaseCollector):
    def get_name(self) -> str:
        return "mycollector"

    def collect(self) -> list:
        # Return a list of metric dicts: name, value, and Unix timestamp
        return [{'name': 'my.metric', 'value': 42, 'timestamp': time.time()}]
```

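To try out a collector without the rest of the project, the sketch below uses a stand-in for `BaseCollector` that only mirrors the two methods shown above. The real class in `collectors.base` may define more. `UptimeCollector` and `run_collectors` are hypothetical names, and the error handling is one possible choice rather than the project's actual behavior.

```python
import time
from abc import ABC, abstractmethod


class BaseCollector(ABC):
    """Stand-in for collectors.base.BaseCollector (assumed interface)."""

    @abstractmethod
    def get_name(self) -> str: ...

    @abstractmethod
    def collect(self) -> list: ...


class UptimeCollector(BaseCollector):
    """Example collector reporting seconds since it was created."""

    def __init__(self) -> None:
        self._started = time.time()

    def get_name(self) -> str:
        return "uptime"

    def collect(self) -> list:
        now = time.time()
        return [{"name": "uptime.seconds", "value": now - self._started, "timestamp": now}]


def run_collectors(collectors) -> list:
    """Gather metrics from every collector; a failing collector is logged and skipped."""
    metrics = []
    for collector in collectors:
        try:
            metrics.extend(collector.collect())
        except Exception as exc:
            print(f"collector {collector.get_name()} failed: {exc}")
    return metrics
```

Isolating each `collect()` call means one broken collector does not prevent the others from reporting in the same cycle.
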
## Performance

- **Memory:** ~50-100MB
- **CPU:** <1% (during collection)
- **Collection time:** 1-5 seconds
- **Network:** Minimal (Graphite plaintext protocol)

## Why This Tool?

This tool brings comprehensive Docker monitoring to Graphite with:

- ✅ **Modular design** - Easy to extend and customize
- ✅ **Lightweight** - Minimal resource usage
- ✅ **Comprehensive** - 50+ metrics out of the box
- ✅ **Production-ready** - Runs in container, non-root, read-only socket access

Perfect for monitoring Docker hosts without complex setups.