hops/TODO.md
Stephen Klein cce25bfa4d Add TODO: review Huntarr for elfhosted rebuild (M5)
Current definition uses the retired plexguide image. Need to
evaluate and switch to the elfhosted-maintained rebuild before 1.0.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 21:23:39 -04:00


# HOPS TODO
Generated by codebase audit (2026-06-10). Ranked by severity.
## Decisions (2026-06-10)
- **Canonical pipeline**: Path A (`hops` -> `install` -> `services`). Path B deleted.
- **Deleted**: `setup`, `privileged-setup`, `user-operations`, `services-improved`,
`lib/privileges.sh` -- all Path B artifacts, gone.
- **Service catalog**: `services` is the single source of truth. Latest tags kept.
- **`lib/secrets.sh`**: keep and fix. Goal is to encrypt the `.env` file at rest
(passwords/API keys written by `install`). Fix the broken AES-GCM crypto and
wire encryption/decryption into the install flow.
- **macOS**: future roadmap. Linux is the target for now.
---
## CRITICAL BUGS (breaks primary use cases)
### B1 -- Infinite recursion in `services` on Linux [CRITICAL]
- File: `services:25-46`
- `get_timezone_mount()` and `get_gpu_devices()` call themselves on the non-Darwin
branch via `echo "$(get_timezone_mount)"`. The recursion never terminates, so every
Linux compose generation ends at the bash `FUNCNEST` limit (if one is set) or in a
stack overflow. The main `./hops` install is broken on Linux.
- Fix: replace the recursive calls with the literal YAML strings they should emit.
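
A minimal sketch of the non-recursive shape. The YAML strings and the empty macOS branch are assumptions for illustration; the real fragments should come from what `services` is meant to emit.

```bash
# Hypothetical sketch: the strings below are placeholders for whatever YAML
# fragments `services` is supposed to produce.
get_timezone_mount() {
    if [ "$(uname -s)" = "Darwin" ]; then
        return 0   # assumption: nothing to mount on macOS
    fi
    # Emit the literal line; do not call get_timezone_mount from here.
    printf '%s\n' '      - /etc/localtime:/etc/localtime:ro'
}

get_gpu_devices() {
    if [ "$(uname -s)" = "Darwin" ]; then
        return 0   # assumption: no /dev/dri passthrough on macOS
    fi
    printf '%s\n' '      - /dev/dri:/dev/dri'
}
```
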
### B2 -- Brace mismatch in `lib/privileges.sh` [CRITICAL] -- RESOLVED: delete file
- File: `lib/privileges.sh:429,612`
- Moot -- `lib/privileges.sh` is Path B dead code, scheduled for deletion (see A3).
---
## HIGH BUGS
### B3 -- Glob stored as string, directory detection always fails [HIGH]
- Files: `hops:154-166`, `uninstall:127-147`
- `homelab_dirs=( "/home/*/hops" )` stores a literal glob; the quoted for-loop
never expands it. Multi-user detection is broken, and `cd "$HOMELAB_DIR"` fails
under `set -e`.
- Fix: iterate unquoted or use `compgen -G "/home/*/hops"`.
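
One possible shape for the unquoted iteration. The function name and the base-directory parameter are hypothetical; the parameter exists only so the sketch can be run against a scratch tree.

```bash
# Print every existing <base>/<user>/hops directory, one per line.
find_homelab_dirs() {
    local base=${1:-/home} dir
    for dir in "$base"/*/hops; do      # the glob sits outside the quotes, so it expands
        [ -d "$dir" ] || continue      # an unmatched glob stays literal; skip it
        printf '%s\n' "$dir"
    done
}
```

The `-d` guard covers the no-match case without needing `shopt -s nullglob`.
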
### B4 -- Missing service definitions file reference [HIGH]
- File: `install:916`
- `setup_firewall()` sources `"$SCRIPT_DIR/hops_service_definitions.sh"` which
does not exist (the file is named `services`). Per-service firewall rules are
silently never applied.
- Fix: correct the filename to `services`.
### B5 -- `((x++))` aborts script under `set -e` [HIGH]
- Files: `hops:299,317`, `install:784`, and others
- `((running_count++))` evaluates to the value before the increment, so it returns
exit status 1 whenever that value is 0, which kills the script under `set -e`.
- Fix: use `running_count=$((running_count + 1))` or append `|| true`.
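
A small sketch of the assignment form. `count_running` and its arguments are made up for the example; only the increment line mirrors the fix above.

```bash
# Count how many of the given states equal "running".
count_running() {
    local running_count=0 state
    for state in "$@"; do
        if [ "$state" = "running" ]; then
            # A plain assignment returns status 0 even when the old value was 0,
            # so `set -e` does not abort here.
            running_count=$((running_count + 1))
        fi
    done
    printf '%s\n' "$running_count"
}
```
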
### B6 -- `hops` entry point is Linux-only despite macOS library support [HIGH]
- File: `hops:108-136,263`
- `check_dependencies` requires `systemctl`, `check_system_requirements` calls
`free` and `df -BG`, `show_service_status` calls `systemctl`. All Linux-only.
The documented entry point fails immediately on macOS.
- Fix: add OS guards or document `hops` as Linux-only.
### B7 -- Port collisions not detected within a selection [HIGH]
- File: `services` (port map)
- sabnzbd and traefik dashboard both use 8080; traefik and nginx-proxy-manager
both bind 80/443; authelia and transmission both use 9091.
- `check_all_ports` only checks host listeners, not intra-selection conflicts,
so users can generate an un-startable compose silently.
- Fix: add intra-selection conflict check before compose generation.
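
A rough sketch of an intra-selection check. The input format (one `service port` pair per line on stdin) and the function name are assumptions; the real check would read from the canonical port map in `services`.

```bash
# Print every host port claimed by more than one selected service.
# Exit status: 0 when there are no conflicts, 1 otherwise.
find_port_conflicts() {
    awk '
        {
            count[$2]++
            owners[$2] = (count[$2] > 1) ? owners[$2] " " $1 : $1
        }
        END {
            for (port in count)
                if (count[port] > 1) { print port ": " owners[port]; conflict = 1 }
            exit conflict + 0
        }
    '
}
```

Usage, with the collisions named above (the order of reported ports is not guaranteed):

```bash
printf '%s\n' 'sabnzbd 8080' 'traefik 8080' 'authelia 9091' 'transmission 9091' \
    | find_port_conflicts || echo "port conflicts found, refusing to generate compose"
```
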
---
## MEDIUM BUGS
### B8 -- Watchtower assigned a bogus port [MED]
- Files: `lib/docker.sh:47`, `services-improved:90`
- Watchtower has no web UI. Assigning it port 8080 emits a spurious `ports:`
block and broken healthcheck in the generated compose.
### B9 -- Update backup copies into itself [MED]
- File: `hops:586-595`
- `update_hops` does `cp -r "$SCRIPT_DIR" "$backup_dir"` where `$backup_dir`
is inside `$SCRIPT_DIR`. Results in recursive self-copy including `.git/`.
- Fix: create the backup dir outside the script directory.
### B10 -- `secure_delete` `stat` flag wrong on macOS [MED]
- File: `lib/secrets.sh:146`
- Uses `stat -c%s` (GNU) which fails on macOS (`stat -f%z`).
Manual-overwrite fallback silently no-ops on macOS.
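
A possible portable wrapper, trying the GNU form first and the BSD/macOS form second. `file_size` is a hypothetical helper name.

```bash
# Print the size of a file in bytes on both GNU and BSD userlands.
file_size() {
    stat -c%s "$1" 2>/dev/null || stat -f%z "$1"
}
```
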
### B11 -- `jellystat` generated with wrong template in `services-improved` [MED]
- File: `services-improved:422`
- Routed through the generic media-server template; gets no postgres DB and no
JWT_SECRET, so it cannot run. The hand-written `services` version is correct.
### B12 -- Empty-password detection regex broken [LOW]
- File: `lib/security.sh:361-384`
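- `grep "PASSWORD=\s*$"` runs without `-E` or `-P`. `\s` is a GNU extension rather
than POSIX BRE, so it is not guaranteed to match whitespace; where it does not,
empty-password detection is dead.
- Fix: use the POSIX class, e.g. `grep -E 'PASSWORD=[[:space:]]*$'`.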
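
A sketch of the check written with the POSIX class `[[:space:]]`, which does not depend on any grep extension. `has_empty_password` is a hypothetical name.

```bash
# Succeed when the file has a *PASSWORD= line with an empty or blank value.
has_empty_password() {
    grep -Eq 'PASSWORD=[[:space:]]*$' "$1"
}
```
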
---
## SECURITY
### S1 -- Broken/unauthenticated encryption [HIGH]
- File: `lib/secrets.sh:85,115`
- `openssl enc` does not support AEAD modes: with `-aes-256-gcm` the CLI neither
stores nor verifies the GCM auth tag. The result is not authenticated encryption
and round-trips unreliably.
- Fix: use a supported openssl mode or switch to `gpg --symmetric`.
### S2 -- Passphrases/keys exposed in process list [HIGH]
- Files: `lib/secrets.sh:85,115`, `lib/security.sh:140,156,175,204,416,442`
- `-pass pass:"$key"` and `--passphrase "$x"` on the command line are visible
to any local user via `ps`.
- Fix: use `-pass fd:N` or `--passphrase-fd N` with a file descriptor.
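
A sketch of passing the passphrase on file descriptor 3 instead of the command line. The helper names are hypothetical, and the cipher (`-aes-256-cbc -pbkdf2`, which needs OpenSSL 1.1.1 or later) is only a stand-in to show the `fd:` plumbing; it is not authenticated, so S1 still has to be settled separately.

```bash
# encrypt_file IN OUT PASSPHRASE
# The passphrase travels through a here-document on fd 3, never through argv.
encrypt_file() {
    openssl enc -aes-256-cbc -pbkdf2 -salt -in "$1" -out "$2" -pass fd:3 3<<EOF
$3
EOF
}

# decrypt_file IN OUT PASSPHRASE
decrypt_file() {
    openssl enc -d -aes-256-cbc -pbkdf2 -in "$1" -out "$2" -pass fd:3 3<<EOF
$3
EOF
}
```

Function arguments stay inside the shell process, so only `openssl enc ... -pass fd:3` shows up in `ps`.
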
### S3 -- Committed default Authelia credential [HIGH]
- File: `services:1148-1157`
- `users_database.yml` ships a default admin account with a known password hash
(hash of literal "password"). Every Authelia deploy has this credential.
- Fix: force password change on first login or generate the hash at deploy time.
### S4 -- Traefik dashboard exposed with no auth [MED]
- File: `services:672-673,684`
- `api.insecure=true` exposes the Traefik dashboard on :8080 with no auth.
Consider disabling or requiring middleware auth.
### S5 -- `eval` on environment-derived value [MED]
- Files: `install:598,671`, `uninstall:136,462`, `lib/system.sh:306`, others
- `eval echo "~$SUDO_USER"` expands an env-sourced value through eval.
- Fix: `getent passwd "$SUDO_USER" | cut -d: -f6`
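
Wrapped as a helper, the lookup might look like this. `user_home` is a hypothetical name; `getent` is assumed to be present, which holds on the Linux targets but not on macOS.

```bash
# Print the home directory recorded in the passwd database for a user.
# Returns non-zero when the user does not exist.
user_home() {
    local entry
    entry=$(getent passwd "$1") || return 1
    printf '%s\n' "$entry" | cut -d: -f6
}
```

Usage: `home_dir=$(user_home "$SUDO_USER") || error_exit "unknown user"`.
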
### S6 -- Predictable temp file paths [MED]
- Files: `lib/secrets.sh:16,188,288`, `uninstall:374`
- `/tmp/hops_env_$$` etc. in world-writable `/tmp` are symlink-race targets
before the `chmod 600` runs.
- Fix: use `mktemp` and assign before use.
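
A minimal sketch, assuming the `hops_env` prefix is kept. `mktemp` creates the file atomically with an unpredictable name and owner-only permissions, so no later `chmod` is needed. `make_private_tmp` is a hypothetical name.

```bash
# Create a private temp file and print its path.
make_private_tmp() {
    mktemp "${TMPDIR:-/tmp}/hops_env.XXXXXX"
}
```

Usage: `tmp_env=$(make_private_tmp) || exit 1`, followed by `trap 'rm -f "$tmp_env"' EXIT` in the calling script.
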
### S7 -- Commands built as strings, run unquoted [MED]
- File: `install:731-736,755,773-779`
- `pull_cmd="sudo -u $SUDO_USER docker compose pull"` run as `$pull_cmd`
depends on unquoted word splitting, so a username containing whitespace or glob
characters is split or expanded before `sudo` sees it.
- Fix: use bash arrays.
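
A sketch of the array form (bash only). The fallback to the current user is an assumption added so the snippet runs outside `sudo`.

```bash
# Build the command as an array so each argument stays exactly one word.
target_user=${SUDO_USER:-$(id -un)}
pull_cmd=(sudo -u "$target_user" docker compose pull)

# Execute it with:  "${pull_cmd[@]}"
# Printing one element per line shows that no word splitting takes place.
printf '%s\n' "${pull_cmd[@]}"
```
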
### S9 -- Non-idempotent sysctl append [LOW]
- File: `privileged-setup:224`
- `echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf` appended every run.
Accumulates duplicate lines.
- Fix: check before appending (`grep -q ... || echo ... >>`)
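
The check-then-append pattern, sketched as a reusable helper. `append_once` is a hypothetical name; `-x` matches the whole line and `-F` treats it as a fixed string.

```bash
# Append LINE to FILE only if an identical line is not already present.
append_once() {
    local line=$1 file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

Usage: `append_once 'net.ipv4.ip_forward = 1' /etc/sysctl.conf`.
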
---
## ARCHITECTURE / DESIGN
### A1 -- Two divergent install pipelines [HIGH] -- RESOLVED
- Path A chosen. Delete: `setup`, `privileged-setup`, `user-operations`,
`services-improved`.
### A2 -- Three sources of truth for the service catalog [HIGH] -- DO FIRST
- `services`: `get_service_ports()` + inline image strings (CANONICAL)
- `services-improved`: scheduled for deletion (Path B)
- `lib/docker.sh`: `HOPS_SERVICES` array -- reconcile or remove duplicates
- Fix: `lib/docker.sh` service maps must match `services`; remove anything
only used by Path B.
- Must be done before port collision fix (B7) so there is one authoritative map.
### A3 -- `lib/privileges.sh` is dead code [MED] -- RESOLVED: delete
- Path B artifact. Delete it.
### A4 -- `lib/secrets.sh` crypto needs fixing and wiring in [MED]
- Goal: encrypt the `.env` file at rest after `install` writes it.
- Fix broken AES-GCM (use `gpg --symmetric` or a supported openssl mode) (S1).
- Fix passphrase-on-command-line exposure (S2).
- Wire encrypt/decrypt calls into `install` flow.
### A5 -- `hops` duplicates functions from `lib/common.sh` [HIGH] -- DO FIRST
- `log`, `error_exit`, `warning`, `success`, `info`, `validate_timezone`,
`validate_password`, `generate_secure_password`, `create_docker_networks`,
`get_service_port/image` are all defined twice (or three times).
- Fix: source `lib/common.sh` from `hops` and remove local duplicates.
- Must be done before bug fixes to avoid patching the same logic in multiple places.
### A6 -- Caddy is unreachable via the menu [LOW]
- `services` defines `generate_caddy` but the `select_services` menu in
`install` never lists caddy as a selectable option.
### A7 -- Committed dev artifacts [LOW]
- `summary7-19.txt` and `discord-header.md` should not be in the repo.
Add to `.gitignore` or delete.
---
## MISSING / INCOMPLETE
### M1 -- RESOLVED (Path B deleted)
### M2 -- RESOLVED (Path B deleted)
### M3 -- RESOLVED (Path B deleted)
### M4 -- RESOLVED (`lib/privileges.sh` deleted)
### M5 -- Review Huntarr inclusion -- switch to elfhosted rebuild [MED]
- The current `services` definition uses the original plexguide/Huntarr.io image.
The project has been rebuilt by elfhosted under a new image/repo.
- Review the new elfhosted image name, default port, and any config changes.
- Update the service definition in `services` and the entry in SERVICES.md.
- Verify the new image is actively maintained before shipping 1.0.0.
---
## PLATFORM SUPPORT
### P2 -- `uninstall` is Linux-only [HIGH] -- deferred (Linux-first)
- Unconditional `apt-get`, `dpkg`, `systemctl`, `groupdel`, `ufw` with no
OS branching. Acceptable for now; revisit when macOS support is scoped.
### P3 -- RESOLVED (Path B deleted)
### P4 -- No WSL2 detection [MED]
- README claims WSL2 support but there is no WSL2 detection.
`systemctl`-based service management fails on WSL distros without systemd.
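
One way detection could look, assuming the usual heuristic that WSL kernels report `microsoft` in their release string. Both helper names are hypothetical, and the optional path argument exists only to make the check testable.

```bash
# Succeed when the kernel release string looks like WSL.
is_wsl() {
    grep -qi 'microsoft' "${1:-/proc/sys/kernel/osrelease}" 2>/dev/null
}

# Succeed when systemd is the running init (the directory exists only then).
has_systemd() {
    [ -d /run/systemd/system ]
}
```

Usage: `if is_wsl && ! has_systemd; then warning "WSL without systemd: skipping systemctl steps"; fi`.
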
### P5 -- Inconsistent port-check tools [MED]
- `lib/common.sh` uses `ss`; `install` uses `lsof`. `ss` is absent on macOS.
### P6 -- Hardcoded render GID for Jellyfin GPU [LOW]
- File: `services:435`
- `group_add: "109"` is the render GID on a specific distro, wrong on most
systems and meaningless on macOS.
---
## CODE QUALITY
### Q1 -- Three separate error-handling implementations [HIGH] -- DO FIRST
- `hops`, `uninstall`, and `lib/common.sh` each define their own `error_exit`
and `log` with different formats. Consolidate in `lib/common.sh`.
- Covered by A5; tracked here for completeness.
### Q2 -- `set -e` + intentional non-zero returns is a minefield [MED]
- `validate_password` returns 1/2/3, `check_port` returns 1 -- these work only
because they happen to be in conditionals. Combined with B5 this is fragile.
Consider `set -euo pipefail` with explicit `|| true` where non-zero is intended.
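
A sketch of capturing an intentional non-zero status without relying on the call happening to sit in a conditional. `validate_password_demo` is a stand-in with made-up rules, not the real `validate_password`.

```bash
# Stand-in validator: 1 = empty, 2 = shorter than 12 characters, 0 = ok.
validate_password_demo() {
    [ -n "$1" ] || return 1
    [ "${#1}" -ge 12 ] || return 2
    return 0
}

# A command on the left of || is exempt from `set -e`, and rc keeps its
# default of 0 when the validator succeeds.
password_status() {
    local rc=0
    validate_password_demo "$1" || rc=$?
    printf '%s\n' "$rc"
}
```
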
### Q3 -- Debug `echo` statements left in production code [MED] -- DO FIRST
- Files: `lib/system.sh:605,823,1043,1046,1084,1089,1149-1156`
- `DEBUG:` prefixed echo lines should be removed or gated behind a `$DEBUG` flag.
- Clean these before bug fixes so signal isn't buried in debug noise.
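
If the lines are gated rather than removed, a helper along these lines would do. The name `debug` and the `DEBUG=1` convention are assumptions.

```bash
# Print a DEBUG: line to stderr only when DEBUG=1; always returns 0,
# so it is safe under `set -e`.
debug() {
    if [ "${DEBUG:-0}" = "1" ]; then
        printf 'DEBUG: %s\n' "$*" >&2
    fi
}
```
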
### Q4 -- `services-improved` leaks `set -e` when sourced [LOW]
- File: `services-improved` top of file
- File sets `set -e` then is sourced by `user-operations` and `privileges`,
leaking the option into the caller's shell.
---
## SUGGESTED ORDER OF ATTACK
### Cleanup first (do before any bug fixes)
1. [DONE] Delete Path B files (A1/A3)
2. Consolidate duplicate functions into `lib/common.sh` (A5, Q1) -- one copy to fix
3. Reconcile `lib/docker.sh` service maps with `services` (A2) -- one catalog to fix
4. Remove debug echo statements from `lib/system.sh` (Q3) -- reduce noise
### Bug fixes
5. Fix B1 (infinite recursion in `services`) -- unblocks all Linux installs
6. Fix B5 (`((x++))` under `set -e`) -- prevents silent aborts
7. Fix B3 (glob directory detection) -- fixes multi-user and uninstall
8. Fix B4 (wrong filename in firewall setup)
9. Fix B7 (intra-selection port collision detection)
### Security pass
10. S3 (default Authelia cred), S5 (eval on env var), S6 (mktemp),
S7 (string-built commands), S2 (passphrase on cmdline)
11. Fix and wire in `lib/secrets.sh`: replace broken crypto, hook into install
flow to encrypt `.env` at rest (A4/S1/S2)
### Remaining
12. Fix B12 empty-password regex, B8 watchtower port, B9 backup self-copy
13. macOS / WSL2 support (B6, P4) -- future roadmap