docu/pve2/power-mqtt-agent.md
root 6f52d46267 Initial infrastructure documentation for pve1 and pve2.
Includes host documentation, MQTT/HA, Git setup, power monitoring, and GPU idle (pve2).

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-06-27 19:53:55 +02:00

# pve2 — Power-MQTT-Agent
CPU (Intel RAPL) + GPU (`nvidia-smi`) → MQTT → Home Assistant.
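
How the agent reads RAPL is not documented here. As a rough sketch, CPU package power can be derived from two readings of the kernel's powercap energy counter. The domain path `intel-rapl:0` is an assumption (it is the usual package domain, but not guaranteed), and the helper names `rapl_watts` and `rapl_sample` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: average CPU package power from two RAPL energy counter readings.
# Assumption: the package domain is exposed as intel-rapl:0.
RAPL_DIR="${RAPL_DIR:-/sys/class/powercap/intel-rapl:0}"

# rapl_watts E1_UJ E2_UJ SECONDS MAX_RANGE_UJ
# Prints the average power in watts and tolerates one counter wraparound.
rapl_watts() {
    awk -v e1="$1" -v e2="$2" -v dt="$3" -v max="$4" 'BEGIN {
        d = e2 - e1
        if (d < 0) d += max            # counter wrapped past max_energy_range_uj
        printf "%.2f\n", d / 1e6 / dt  # microjoules -> joules -> watts
    }'
}

# Live measurement; reading energy_uj usually requires root. Runs only when called.
rapl_sample() {
    local dt="${1:-1}" e1 e2 max
    max=$(cat "$RAPL_DIR/max_energy_range_uj")
    e1=$(cat "$RAPL_DIR/energy_uj")
    sleep "$dt"
    e2=$(cat "$RAPL_DIR/energy_uj")
    rapl_watts "$e1" "$e2" "$dt" "$max"
}
```

On the host, `rapl_sample 5` would print the average package power over five seconds.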
## Installation
| Component | Path |
|-----------|------|
| Binary | `/usr/local/bin/pve-power-mqtt` |
| systemd unit | `/etc/systemd/system/pve-power-mqtt.service` |
| Env file | `/etc/pve-power-mqtt.env` |
| Source code | `/root/code/pve-power-mqtt` |
| Repo | https://git.jeanavril.com/jean/server-power.git |
## Configuration `/etc/pve-power-mqtt.env`
```ini
POWER_MQTT_BROKER=tcp://homeassistant.iot:1883
POWER_MQTT_USER=server
POWER_MQTT_PASSWORD="F0x84rAOW#q@LX"
POWER_MQTT_HOSTNAME=pve2
POWER_MQTT_DISCOVERY=true
```
Client-ID: **`pve-power-mqtt-pve2`**
## HA sensors
| Entity | Source |
|--------|--------|
| sensor.pve2_cpu_power | RAPL |
| sensor.pve2_gpu0_power | GPU 0 |
| sensor.pve2_gpu1_power | GPU 1 |
| sensor.pve2_estimated_total | CPU + GPU0 + GPU1 |
`estimated_total` is **not** a PSU/PDU reading; it is only the sum of the CPU and GPU values listed above.
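
The sum behind `estimated_total` can be reproduced by hand to cross-check the HA value. This is a minimal sketch; the function name and the sample wattages are made up for illustration.

```bash
# Sketch: add up CPU and per-GPU watt values, printed with one decimal place.
estimated_total() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.1f\n", sum }'
}

# Example with invented readings: CPU 45.2 W, GPU0 8.5 W, GPU1 8.9 W
estimated_total 45.2 8.5 8.9
```

Everything outside these three sources (mainboard, drives, fans, PSU losses) is missing from the sum, so the wall-socket draw will be higher.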
Broker details: [../shared/mqtt-homeassistant.md](../shared/mqtt-homeassistant.md)
## Build & Deploy
```bash
cd /root/code/pve-power-mqtt
git pull
export PATH="/usr/local/go/bin:$PATH"
go build -o pve-power-mqtt ./cmd/pve-power-mqtt
install -m 755 pve-power-mqtt /usr/local/bin/pve-power-mqtt
systemctl restart pve-power-mqtt
```
## NVIDIA Persistence (prerequisite for meaningful GPU idle readings)
```bash
systemctl status nvidia-persistenced
nvidia-smi --query-gpu=power.draw,pstate,persistence_mode --format=csv
```
Expected at idle: **P8**, ~8–9 W per GTX 1080.
See [09_GPU-Idle-vollstaendig.md](09_GPU-Idle-vollstaendig.md)
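
The idle expectation above can be checked mechanically. The sketch below assumes `nvidia-smi` is queried with `--format=csv,noheader,nounits` and the fields `index,power.draw,pstate,persistence_mode`. The helper name `check_idle` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: flag GPUs that are not idling in P8, given nvidia-smi CSV rows.
# Expected row layout: "<index>, <power W>, <pstate>, <persistence>"

check_idle() {
    awk -F', *' '
        $3 != "P8" { printf "GPU %s: %s at %.1f W (not idle)\n", $1, $3, $2; bad = 1; next }
                   { printf "GPU %s: P8 at %.1f W\n", $1, $2 }
        END        { exit bad + 0 }   # non-zero exit if any GPU left P8
    '
}

# Live usage on the host:
# nvidia-smi --query-gpu=index,power.draw,pstate,persistence_mode \
#            --format=csv,noheader,nounits | check_idle
```

The non-zero exit status makes the check usable in scripts, for example `... | check_idle || echo "GPU not idle"`.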
## Agent vs. GPU Idle
Polling `nvidia-smi` every **5 s** can briefly wake the GPUs out of P8. For a clean idle measurement, stop the agent first:
```bash
systemctl stop pve-power-mqtt
sleep 60
nvidia-smi --query-gpu=power.draw,pstate --format=csv
systemctl start pve-power-mqtt
```
Optional: increase the GPU polling interval in the code (e.g. to 60 s); see the server-power repo.
## Operation
```bash
systemctl status pve-power-mqtt
journalctl -u pve-power-mqtt -f
```
## Fixes (history)
- Removed `expire_after` / `availability_topic` from the discovery payload (fixes HA showing entities as "unavailable")
- Unique client IDs per host
- Keepalive 120 s, ping timeout 30 s
- MQTT reconnect logging
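
To illustrate the first fix, here is a sketch of a Home Assistant MQTT discovery payload for one power sensor without `expire_after` or `availability_topic`. The agent's real payload and topic layout are not shown in this document. The state topic `power/<host>/<sensor>` and the function name are assumptions for illustration only.

```bash
#!/usr/bin/env bash
# Sketch: build a Home Assistant MQTT discovery payload for one power sensor.
# Assumption: state topics follow power/<host>/<sensor> (not confirmed by the repo).

discovery_payload() {
    local host="$1" sensor="$2"
    printf '{"name":"%s %s","unique_id":"%s_%s","state_topic":"power/%s/%s",' \
        "$host" "$sensor" "$host" "$sensor" "$host" "$sensor"
    printf '"unit_of_measurement":"W","device_class":"power","state_class":"measurement"}\n'
}

# Publishing by hand (needs mosquitto-clients; -r retains the config message):
# mosquitto_pub -h homeassistant.iot -p 1883 -u server -P "$POWER_MQTT_PASSWORD" -r \
#   -t "homeassistant/sensor/pve2_cpu_power/config" -m "$(discovery_payload pve2 cpu_power)"
```

Without `expire_after`, HA keeps the last received value instead of marking the sensor unavailable when an update is late.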
## Troubleshooting
| Problem | Solution |
|---------|--------|
| GPU unavailable in HA | Is the agent running? Does `nvidia-smi` work on the host? |
| High GPU idle readings | Check persistence mode and LXC mounts (CT 101 without NVIDIA) |
| MQTT timeout | VLAN 10→40: is the broker homeassistant.iot reachable? |
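
For the MQTT timeout case, broker reachability can be tested from the host without the agent. The sketch below splits a `POWER_MQTT_BROKER`-style URL and opens a plain TCP connection using bash's `/dev/tcp`. Both helper names are hypothetical, and a successful connect only proves that the port is reachable, not that the credentials are valid.

```bash
#!/usr/bin/env bash
# Sketch: split a broker URL like tcp://host:port and test TCP reachability.

# parse_broker URL -> prints "host port" (port defaults to 1883)
parse_broker() {
    local rest="${1#*://}"      # drop the scheme, e.g. "tcp://"
    local host="${rest%%:*}"
    local port=1883
    [[ "$rest" == *:* ]] && port="${rest##*:}"
    echo "$host $port"
}

# broker_reachable URL -> exit status 0 if a TCP connect succeeds within 3 s
broker_reachable() {
    local host port
    read -r host port < <(parse_broker "$1")
    timeout 3 bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}
```

Example: `broker_reachable tcp://homeassistant.iot:1883 && echo reachable || echo "no route or port closed"`.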