Migrate Ansible into the repo and centralize SSH keys in shared/ssh.

Playbooks live under pve1/ansible and pve2/ansible; authorized_keys are kept
as fragments, with a deploy script and a target matrix for Proxmox, VM 101, and the CTs.

Co-authored-by: Cursor <cursoragent@cursor.com>
root
2026-06-28 11:24:31 +02:00
parent 842e66996f
commit e98e3a2b84
27 changed files with 876 additions and 5 deletions
+15 -5
@@ -14,17 +14,25 @@ Instead:
```
/etc/cron.weekly/pve-lxc-disk-maintenance
↓ (Symlink)
/root/ansible/run-disk-maintenance.sh
/root/ansible/run-disk-maintenance.sh ← symlink to /root/docu/pve2/ansible
ansible-playbook playbooks/disk-maintenance.yml
↓ SSH
docker (101) · media (109) · AIDEV (110)
```
## Directory structure
## Directory structure (Git)
The source lives in the **`docu`** repo; to deploy it on pve2:
```bash
cd /root/docu && git pull
ln -sfn /root/docu/pve2/ansible /root/ansible
```
```
/root/ansible/
/root/docu/pve2/ansible/ # (= /root/ansible via symlink)
├── README.md
├── ansible.cfg
├── run-disk-maintenance.sh → called by cron.weekly
├── inventory/
@@ -39,6 +47,8 @@ ansible-playbook playbooks/disk-maintenance.yml
└── handlers/main.yml
```
SSH keys for Ansible → [../shared/ssh/README.md](../shared/ssh/README.md)
## Managed hosts
| Ansible host | VMID | IP | Notes |
@@ -47,7 +57,7 @@ ansible-playbook playbooks/disk-maintenance.yml
| media | 109 | 192.168.20.6 | Jellyfin cache path |
| aidev | 110 | 10.100.2.13 | Dev tooling optional |
SSH as `root` from the Proxmox host; key auth was already set up.
SSH as `root` from the Proxmox host; the public key `root@pve2` must be present in the CTs ([shared/ssh](../shared/ssh/README.md)).
## What the playbook does
@@ -101,7 +111,7 @@ echo '0 3 * * * root /root/ansible/run-disk-maintenance.sh' > /etc/cron.d/pve-lx
## Adjusting the configuration
Global values: `/root/ansible/inventory/group_vars/all.yml`
Global values: `/root/docu/pve2/ansible/inventory/group_vars/all.yml` (or `/root/ansible/…` via the symlink)
```yaml
journal_max_size: 200M
+42
@@ -0,0 +1,42 @@
# Ansible on pve2: LXC Disk Maintenance
Weekly maintenance for CTs **101 docker**, **109 media**, and **110 AIDEV** over SSH from the Proxmox host.
| Path | Contents |
|------|--------|
| [ansible.cfg](ansible.cfg) | Defaults |
| [inventory/hosts.yml](inventory/hosts.yml) | Hosts + CT variables |
| [inventory/group_vars/all.yml](inventory/group_vars/all.yml) | Thresholds |
| [playbooks/disk-maintenance.yml](playbooks/disk-maintenance.yml) | Playbook |
| [roles/disk_cleanup/](roles/disk_cleanup/) | Tasks (journal, Docker, fstrim, …) |
| [run-disk-maintenance.sh](run-disk-maintenance.sh) | Cron entry point |
Docs: [../06_Ansible-Automatisierung.md](../06_Ansible-Automatisierung.md)
## Running
```bash
cd /root/docu/pve2/ansible # or: /root/ansible (symlink)
./run-disk-maintenance.sh
# or
ansible-playbook playbooks/disk-maintenance.yml
```
## Cron (pve2)
```text
/etc/cron.weekly/pve-lxc-disk-maintenance → /root/ansible/run-disk-maintenance.sh
```
Once `/root/ansible` is symlinked to this directory, the cron entry remains valid.
## Deploy
```bash
cd /root/docu && git pull
ln -sfn /root/docu/pve2/ansible /root/ansible
```
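The deploy step relies on `ln -sfn` being safe to repeat. A minimal sketch of that behavior in a throwaway directory follows; the sandbox path and the `deploy_link` helper are illustrative assumptions and not part of the repo.

```shell
#!/bin/bash
set -euo pipefail

# Illustrative sandbox standing in for /root; nothing outside it is touched.
sandbox=$(mktemp -d)
mkdir -p "$sandbox/docu/pve2/ansible"
touch "$sandbox/docu/pve2/ansible/run-disk-maintenance.sh"

# Hypothetical helper wrapping the same ln call as the deploy step.
# -s: symbolic link, -f: replace an existing link,
# -n: treat an existing link to a directory as a file instead of descending into it.
deploy_link() {
    ln -sfn "$1" "$2"
}

deploy_link "$sandbox/docu/pve2/ansible" "$sandbox/ansible"
deploy_link "$sandbox/docu/pve2/ansible" "$sandbox/ansible"   # repeat: the link is replaced in place

link_target=$(readlink "$sandbox/ansible")
echo "ansible -> $link_target"
```

Without `-n`, the second call would follow the existing link and create a nested `ansible` link inside the target directory instead of replacing the link itself.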
## SSH
Ansible connects to the CTs as **root**; the public key `root@pve2` must be listed in each CT's `authorized_keys` → [../../shared/ssh/README.md](../../shared/ssh/README.md).
+12
@@ -0,0 +1,12 @@
[defaults]
inventory = inventory/hosts.yml
roles_path = roles
remote_user = root
host_key_checking = False
retry_files_enabled = False
gathering = implicit
stdout_callback = yaml
interpreter_python = auto_silent
[privilege_escaping]
paramiko = ansible.paramiko_ssh.paramiko_ssh
+33
@@ -0,0 +1,33 @@
---
# Disk maintenance defaults — tune per host in inventory if needed
disk_maintenance_enabled: true
# systemd journal
journal_max_size: 200M
# Docker
docker_prune_stopped_containers_older_than: 168h # 7 days
docker_prune_dangling_images: true
docker_prune_unused_images_older_than: 336h # 14 days (aggressive tag)
docker_prune_build_cache_older_than: 336h
docker_prune_dangling_volumes: true
docker_log_truncate_threshold: 50M
docker_log_truncate_target: 10M
# LVM thin provisioning — critical on Proxmox local-lvm / nvme_second
fstrim_enabled: true
# Frigate recordings on docker CT (matches config.yaml retain.days: 30)
frigate_recordings_retain_days: 30
frigate_clips_retain_days: 14
# Jellyfin transcode/image cache (not metadata — that is library artwork)
jellyfin_cache_max_age_days: 30
# Optional dev tooling (AIDEV)
npm_cache_clean: false
apt_clean: true
# Alert thresholds for summary output
disk_warn_percent: 80
thin_pool_warn_percent: 85
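`disk_warn_percent` is described as an alert threshold for the summary output, while the role tasks in this commit only print the raw `df` line. One way such a check could look, as a sketch: parse `df -P` style output and compare the capacity column with the threshold. The `over_threshold` helper and the sample numbers are assumptions for illustration.

```shell
#!/bin/bash
set -euo pipefail

disk_warn_percent=80   # same value as in group_vars/all.yml

# Hypothetical helper: reads `df -P` output on stdin and prints the mount points
# whose usage is at or above the given percentage.
over_threshold() {
    awk -v warn="$1" 'NR > 1 { use = $5; sub(/%/, "", use); if (use + 0 >= warn) print $6 }'
}

# Made-up sample in `df -P` layout; on a real host this would be
# `df -P / | over_threshold "$disk_warn_percent"`.
sample='Filesystem 1024-blocks Used Available Capacity Mounted on
/dev/mapper/pve-vm--101--disk--0 32768000 27852800 4915200 85% /
/dev/sdb1 10485760 2097152 8388608 20% /mnt/records'

flagged=$(printf '%s\n' "$sample" | over_threshold "$disk_warn_percent")
echo "at or above ${disk_warn_percent}%: ${flagged:-none}"
```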
+17
@@ -0,0 +1,17 @@
all:
  children:
    lxc_containers:
      hosts:
        docker:
          ansible_host: 192.168.10.101
          proxmox_vmid: 101
          frigate_recordings_path: /mnt/records/recordings
          frigate_clips_path: /mnt/records/clips
        media:
          ansible_host: 192.168.20.6
          proxmox_vmid: 109
          jellyfin_cache_path: /opt/stacks/jellyfin/config/cache
        aidev:
          ansible_host: 10.100.2.13
          proxmox_vmid: 110
          dev_tooling_cleanup: true
@@ -0,0 +1,37 @@
---
# Weekly disk maintenance for Proxmox LXC containers
# Run from the Proxmox host: ansible-playbook playbooks/disk-maintenance.yml
#
# Tags:
#   aggressive:  also prune unused images older than 14 days
#   frigate:     enforce recording/clip retention on docker CT
#   jellyfin:    clean stale transcode/image cache on media CT
#   dev-tooling: npm cache clean on AIDEV (off by default)

- name: LXC disk maintenance
  hosts: lxc_containers
  become: true
  gather_facts: true
  vars:
    disk_maintenance_enabled: true
  roles:
    - role: disk_cleanup
      when: disk_maintenance_enabled | bool

- name: Report Proxmox thin pool usage
  hosts: localhost
  connection: local
  gather_facts: false
  tasks:
    - name: Get LVM thin pool stats
      ansible.builtin.shell: lvs pve/data nvme_second/nvme_second -o vg_name,lv_name,data_percent 2>/dev/null --noheadings
      register: thin_pools
      changed_when: false

    - name: Thin pool summary
      ansible.builtin.debug:
        msg: |
          Proxmox thin pools after maintenance:
          {{ thin_pools.stdout }}
          Schedule: see /etc/cron.weekly/pve-lxc-disk-maintenance
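The report play prints the raw `lvs` output. A sketch of how that output could be compared with `thin_pool_warn_percent` (85 in `group_vars/all.yml`) follows. The `pools_over` helper and the percentages in the sample are invented for illustration; only the column order follows the `-o vg_name,lv_name,data_percent` call in the play.

```shell
#!/bin/bash
set -euo pipefail

thin_pool_warn_percent=85   # same value as in group_vars/all.yml

# Hypothetical helper: reads `lvs --noheadings -o vg_name,lv_name,data_percent`
# output on stdin and prints pools at or above the warning threshold.
pools_over() {
    awk -v warn="$1" 'NF >= 3 && $3 + 0 >= warn { print $1 "/" $2 " " $3 "%" }'
}

# Made-up sample; real values come from the lvs call on the Proxmox host.
sample='  pve         data        62.17
  nvme_second nvme_second 91.40'

warnings=$(printf '%s\n' "$sample" | pools_over "$thin_pool_warn_percent")
echo "thin pools over ${thin_pool_warn_percent}%: ${warnings:-none}"
```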
@@ -0,0 +1,17 @@
---
journal_max_size: 200M
docker_prune_stopped_containers_older_than: 168h
docker_prune_dangling_images: true
docker_prune_unused_images_older_than: 336h
docker_prune_build_cache_older_than: 336h
docker_prune_dangling_volumes: true
docker_log_truncate_threshold: 50M
docker_log_truncate_target: 10M
fstrim_enabled: true
frigate_recordings_retain_days: 30
frigate_clips_retain_days: 14
jellyfin_cache_max_age_days: 30
npm_cache_clean: false
apt_clean: true
disk_warn_percent: 80
thin_pool_warn_percent: 85
@@ -0,0 +1,5 @@
---
- name: Restart docker
  ansible.builtin.service:
    name: docker
    state: restarted
@@ -0,0 +1,224 @@
---
- name: Disk usage before maintenance
  ansible.builtin.shell: df -hT / | tail -1
  register: disk_before
  changed_when: false

- name: Show disk before
  ansible.builtin.debug:
    msg: "{{ inventory_hostname }} before: {{ disk_before.stdout }}"

- name: Vacuum systemd journal
  ansible.builtin.command: "journalctl --vacuum-size={{ journal_max_size }}"
  register: journal_vacuum
  changed_when: "'Vacuuming done' in journal_vacuum.stdout"
  failed_when: false

- name: Clean apt cache
  ansible.builtin.apt:
    autoclean: true
    autoremove: true
    clean: true
  when: apt_clean | bool

- name: Check if docker is available
  ansible.builtin.command: docker info
  register: docker_info
  changed_when: false
  failed_when: false
  tags: [always, docker]

- name: Truncate oversized Docker container logs
  ansible.builtin.shell: |
    set -o pipefail
    find /var/lib/docker/containers -name '*-json.log' -size +{{ docker_log_truncate_threshold }} \
      -exec truncate -s {{ docker_log_truncate_target }} {} \;
    echo done
  args:
    executable: /bin/bash
  register: log_truncate
  changed_when: log_truncate.stdout is search('done')
  when: docker_info is defined and docker_info.rc == 0
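The log truncation task combines `find -size` with `truncate -s`. The same pattern can be tried safely on sparse dummy files, as in the sketch below. The temporary directory, the file names, and the reduced sizes (1M and 512K standing in for the 50M and 10M defaults) are assumptions chosen to keep the experiment small.

```shell
#!/bin/bash
set -euo pipefail

logdir=$(mktemp -d)

# Sparse dummy logs: only the apparent size matters for find -size.
truncate -s 3M  "$logdir/aaa-json.log"
truncate -s 10K "$logdir/bbb-json.log"

threshold=1M   # stands in for docker_log_truncate_threshold
target=512K    # stands in for docker_log_truncate_target

# Same shape as the task: shrink every matching log above the threshold.
find "$logdir" -name '*-json.log' -size +"$threshold" \
    -exec truncate -s "$target" {} \;

big_size=$(stat -c %s "$logdir/aaa-json.log")
small_size=$(stat -c %s "$logdir/bbb-json.log")
echo "big=$big_size small=$small_size"
```

Two details worth keeping in mind: `find` rounds sizes up to whole units, so `+1M` only matches files that exceed 1 MiB after rounding, and `truncate -s` keeps the beginning of the file, so the oldest log entries survive and the newest are dropped.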
- name: Prune stopped containers
  ansible.builtin.command: >-
    docker container prune -f --filter until={{ docker_prune_stopped_containers_older_than }}
  register: container_prune
  changed_when: "'Total reclaimed space' in container_prune.stdout and '0B' not in container_prune.stdout.split('Total reclaimed space')[1].split('\n')[0]"
  when: docker_info is defined and docker_info.rc == 0

- name: Prune dangling images
  ansible.builtin.command: docker image prune -f
  register: image_prune_dangling
  changed_when: "'Total reclaimed space' in image_prune_dangling.stdout and '0B' not in image_prune_dangling.stdout.split('Total reclaimed space')[1].split('\n')[0]"
  when:
    - docker_info is defined
    - docker_info.rc == 0
    - docker_prune_dangling_images | bool

- name: Prune unused images older than threshold
  ansible.builtin.command: >-
    docker image prune -af --filter until={{ docker_prune_unused_images_older_than }}
  register: image_prune_old
  changed_when: "'Total reclaimed space' in image_prune_old.stdout and '0B' not in image_prune_old.stdout.split('Total reclaimed space')[1].split('\n')[0]"
  when:
    - docker_info is defined
    - docker_info.rc == 0
    - docker_prune_unused_images_older_than | length > 0
  tags:
    - aggressive

- name: Prune docker build cache
  ansible.builtin.command: >-
    docker builder prune -af --filter until={{ docker_prune_build_cache_older_than }}
  register: builder_prune
  changed_when: "'Total:' in builder_prune.stdout"
  failed_when: false
  when: docker_info is defined and docker_info.rc == 0

- name: Prune dangling docker volumes
  ansible.builtin.command: docker volume prune -f
  register: volume_prune
  changed_when: "'Total reclaimed space' in volume_prune.stdout and '0B' not in volume_prune.stdout.split('Total reclaimed space')[1].split('\n')[0]"
  when:
    - docker_info is defined
    - docker_info.rc == 0
    - docker_prune_dangling_volumes | bool
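The `changed_when` expressions on the prune tasks look for `Total reclaimed space` and treat `0B` as "nothing happened". The same decision is written out below as a shell sketch, fed with made-up prune output. The `reclaimed_something` and `check` helpers are hypothetical. The sketch compares the whole amount with `0B` instead of searching for the substring, because a substring test would also match amounts such as `10B`.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: reads prune output on stdin and succeeds only if a
# "Total reclaimed space" line exists and reports something other than 0B.
reclaimed_something() {
    local line amount
    line=$(grep -m1 '^Total reclaimed space:' || true)
    [ -n "$line" ] || return 1
    amount=${line#Total reclaimed space: }
    [ "$amount" != "0B" ]
}

# Maps the helper's exit status to the wording Ansible would report.
check() {
    if printf '%s\n' "$1" | reclaimed_something; then
        echo changed
    else
        echo unchanged
    fi
}

# Made-up outputs in the format the tasks above expect.
nothing=$(check 'Total reclaimed space: 0B')
small=$(check 'Total reclaimed space: 10B')
large=$(check $'Deleted Containers:\n4f1c2a9d\n\nTotal reclaimed space: 1.2GB')
missing=$(check 'Cannot connect to the Docker daemon')

echo "$nothing $small $large $missing"   # prints: unchanged changed changed unchanged
```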
- name: Check for existing Docker daemon.json
  ansible.builtin.stat:
    path: /etc/docker/daemon.json
  register: docker_daemon_json
  when: docker_info is defined and docker_info.rc == 0

- name: Ensure Docker log rotation defaults
  ansible.builtin.copy:
    dest: /etc/docker/daemon.json
    owner: root
    group: root
    mode: "0644"
    force: false
    content: |
      {
        "log-driver": "json-file",
        "log-opts": {
          "max-size": "10m",
          "max-file": "3"
        }
      }
  notify: Restart docker
  when:
    - docker_info is defined
    - docker_info.rc == 0
    - not docker_daemon_json.stat.exists

- name: Remove old Frigate recording day folders
  ansible.builtin.shell: |
    set -euo pipefail
    retain={{ frigate_recordings_retain_days }}
    cutoff=$(date -d "-${retain} days" +%Y-%m-%d)
    removed=0
    for d in "{{ frigate_recordings_path }}"/20??-??-??; do
      [ -d "$d" ] || continue
      day=$(basename "$d")
      if [[ "$day" < "$cutoff" ]]; then
        rm -rf "$d"
        echo "removed $day"
        removed=1
      fi
    done
    [ "$removed" -eq 0 ] || true
  args:
    executable: /bin/bash
  register: frigate_recording_cleanup
  changed_when: frigate_recording_cleanup.stdout | length > 0
  when:
    - frigate_recordings_path is defined
    - frigate_recordings_path | length > 0
  tags:
    - frigate
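The recording cleanup depends on ISO-formatted folder names sorting the same way as the dates they stand for, so a plain string comparison against the cutoff is enough. Below is a dry-run sketch of that logic with a fixed reference date so the result is reproducible. The `expired_days` helper, the reference date, and the folder names are assumptions for illustration; nothing is deleted.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: prints the day folders under $1 that are older than
# $3 days, counted back from the reference date $2 (YYYY-MM-DD).
expired_days() {
    local root=$1 today=$2 retain=$3 cutoff d day
    cutoff=$(TZ=UTC date -d "$today -$retain days" +%Y-%m-%d)
    for d in "$root"/20??-??-??; do
        [ -d "$d" ] || continue
        day=$(basename "$d")
        # ISO dates sort lexicographically, so string order equals date order.
        if [[ "$day" < "$cutoff" ]]; then
            echo "$day"
        fi
    done
}

# Throwaway directory with made-up recording days.
recordings=$(mktemp -d)
mkdir "$recordings"/{2024-04-15,2024-04-30,2024-05-01,2024-05-20}

expired=$(expired_days "$recordings" 2024-05-31 30 | tr '\n' ' ')
echo "would remove: $expired"
```

With a 30-day retention counted back from 2024-05-31 the cutoff is 2024-05-01, and a folder named exactly like the cutoff is kept because the comparison is strict.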
- name: Remove old Frigate clip previews
  ansible.builtin.find:
    paths: "{{ frigate_clips_path | default('') }}/previews"
    age: "{{ frigate_clips_retain_days }}d"
    file_type: any
    recurse: true
  register: old_frigate_clips
  when:
    - frigate_clips_path is defined
    - frigate_clips_path | length > 0
  # Tagged like the delete task below so that --tags frigate also collects the files.
  tags:
    - frigate

- name: Delete old Frigate clip files
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: absent
  loop: "{{ old_frigate_clips.files | default([]) }}"
  when:
    - frigate_clips_path is defined
    - frigate_clips_path | length > 0
  loop_control:
    label: "{{ item.path }}"
  tags:
    - frigate

- name: Clean stale Jellyfin cache files
  ansible.builtin.find:
    paths: "{{ jellyfin_cache_path | default('') }}"
    age: "{{ jellyfin_cache_max_age_days }}d"
    file_type: file
    recurse: true
  register: old_jellyfin_cache
  when:
    - jellyfin_cache_path is defined
    - jellyfin_cache_path | length > 0
  # Tagged like the delete task below so that --tags jellyfin also collects the files.
  tags:
    - jellyfin

- name: Delete stale Jellyfin cache
  ansible.builtin.file:
    path: "{{ item.path }}"
    state: absent
  loop: "{{ old_jellyfin_cache.files | default([]) }}"
  when:
    - jellyfin_cache_path is defined
    - jellyfin_cache_path | length > 0
  loop_control:
    label: "{{ item.path }}"
  tags:
    - jellyfin
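The find-then-delete pair for the Jellyfin cache selects files by modification age. A rough shell equivalent, useful for previewing what a given age limit would catch, is sketched below as a dry run. The temporary directory and file names are made up, and `-mmin` is used here as an approximation of the module's age filter, not as a statement about how Ansible implements it.

```shell
#!/bin/bash
set -euo pipefail

# Throwaway cache tree with files of different ages.
cache=$(mktemp -d)
mkdir -p "$cache/transcodes" "$cache/images"
touch -d '40 days ago' "$cache/transcodes/old.ts"
touch -d '29 days ago' "$cache/images/recent.jpg"
touch "$cache/images/fresh.jpg"

max_age_days=30   # same value as jellyfin_cache_max_age_days

# -mmin +N matches files modified more than N minutes ago;
# %P prints the path relative to the starting directory.
stale=$(find "$cache" -type f -mmin +"$((max_age_days * 24 * 60))" -printf '%P\n' | sort)
echo "stale: $stale"
```

Adding `-delete` to the `find` call would turn the preview into an actual cleanup.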
- name: Clean npm cache on dev hosts
  ansible.builtin.command: npm cache clean --force
  when:
    - dev_tooling_cleanup | default(false) | bool
    - npm_cache_clean | bool
  changed_when: true
  failed_when: false
  tags:
    - dev-tooling

- name: Run fstrim on root filesystem
  ansible.builtin.command: fstrim -v /
  register: fstrim_result
  changed_when: "'trimmed' in fstrim_result.stdout and '0 B' not in fstrim_result.stdout"
  when: fstrim_enabled | bool

- name: Docker disk summary
  ansible.builtin.command: docker system df
  register: docker_df
  changed_when: false
  failed_when: false
  when: docker_info is defined and docker_info.rc == 0

- name: Disk usage after maintenance
  ansible.builtin.shell: df -hT / | tail -1
  register: disk_after
  changed_when: false

- name: Maintenance summary
  ansible.builtin.debug:
    msg: |
      {{ inventory_hostname }}:
        before: {{ disk_before.stdout }}
        after: {{ disk_after.stdout }}
        fstrim: {{ fstrim_result.stdout | default('skipped') }}
        docker: {{ docker_df.stdout | default('n/a') }}
+9
@@ -0,0 +1,9 @@
#!/bin/bash
# Weekly disk maintenance — runs Ansible playbook from Proxmox host
set -euo pipefail
export ANSIBLE_CONFIG=/root/ansible/ansible.cfg
LOG=/var/log/pve-lxc-disk-maintenance.log
exec >>"$LOG" 2>&1
echo "=== $(date -Is) disk maintenance start ==="
ansible-playbook /root/ansible/playbooks/disk-maintenance.yml
echo "=== $(date -Is) disk maintenance done ==="
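Because the wrapper runs under `set -e`, a failing playbook ends the script before the "done" marker is written, so a run that has a start line without a matching done line in the log indicates a failure. The sketch below shows a variation that records the failure explicitly. The `run_logged` helper, the temporary log file, and the stand-in commands are assumptions for illustration, not the script from this commit.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: appends start/done (or failed) markers and the
# command's own output to the given log, and passes the exit code through.
run_logged() {
    local log=$1 rc
    shift
    {
        echo "=== $(date -Is) start ==="
        if "$@"; then
            echo "=== $(date -Is) done ==="
        else
            rc=$?
            echo "=== $(date -Is) failed (rc=$rc) ==="
            return "$rc"
        fi
    } >>"$log" 2>&1
}

log=$(mktemp)

# Stand-ins for the ansible-playbook call: one success, one failure.
run_logged "$log" echo "pretend playbook run"
failed_rc=0
run_logged "$log" false || failed_rc=$?

starts=$(grep -c ' start ===' "$log")
dones=$(grep -c ' done ===' "$log")
echo "starts=$starts dones=$dones failed_rc=$failed_rc"
```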