hops/TODO.md
Stephen Klein cce25bfa4d Add TODO: review Huntarr for elfhosted rebuild (M5)
Current definition uses the retired plexguide image. Need to
evaluate and switch to the elfhosted-maintained rebuild before 1.0.0.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-06-10 21:23:39 -04:00

HOPS TODO

Generated by codebase audit (2026-06-10). Ranked by severity.

Decisions (2026-06-10)

  • Canonical pipeline: Path A (hops -> install -> services). Path B deleted.
  • Deleted: setup, privileged-setup, user-operations, services-improved, lib/privileges.sh -- all Path B artifacts, gone.
  • Service catalog: services is the single source of truth. Latest tags kept.
  • lib/secrets.sh: keep and fix. Goal is to encrypt the .env file at rest (passwords/API keys written by install). Fix the broken AES-GCM crypto and wire encryption/decryption into the install flow.
  • macOS: future roadmap. Linux is the target for now.

CRITICAL BUGS (breaks primary use cases)

B1 -- Infinite recursion in services on Linux [CRITICAL]

  • File: services:25-46
  • get_timezone_mount() and get_gpu_devices() call themselves on the non-Darwin branch via echo "$(get_timezone_mount)". The recursion never terminates: it stops only when FUNCNEST (if set) or a process/resource limit is hit, and this happens on every Linux compose generation. The main ./hops install path is broken on Linux.
  • Fix: replace the recursive calls with the literal YAML strings they should emit.
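
The shape of that fix can be sketched as follows. The mount line is an assumed placeholder, since the actual YAML emitted by services is not reproduced in this file, and the optional OS argument is a hypothetical addition that only exists to make the function easy to exercise.

```shell
#!/usr/bin/env bash
# Sketch only. The mount line below is an assumed placeholder, not the
# string that services actually emits.
get_timezone_mount() {
    local os="${1:-$(uname -s)}"   # optional override (hypothetical), handy for testing
    if [[ "$os" == "Darwin" ]]; then
        return 0                    # Darwin branch elided in this sketch
    fi
    # Emit the literal YAML line directly; no call back into this function.
    echo "      - /etc/localtime:/etc/localtime:ro"
}

get_timezone_mount Linux
```

get_gpu_devices() would presumably follow the same pattern with its own literal block.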

B2 -- Brace mismatch in lib/privileges.sh [CRITICAL] -- RESOLVED: delete file

  • File: lib/privileges.sh:429,612
  • Moot -- lib/privileges.sh is Path B dead code, scheduled for deletion (see A3).

HIGH BUGS

B3 -- Glob stored as string, directory detection always fails [HIGH]

  • Files: hops:154-166, uninstall:127-147
  • homelab_dirs=( "/home/*/hops" ) stores a literal glob, and the quoted for-loop never expands it. Multi-user detection is broken, and cd "$HOMELAB_DIR" fails under set -e.
  • Fix: iterate unquoted or use compgen -G "/home/*/hops".
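
A minimal sketch of the unquoted-expansion variant, assuming a hypothetical helper named find_homelab_dirs. It enables nullglob temporarily so that a pattern with no matches yields an empty list rather than the literal pattern.

```shell
#!/usr/bin/env bash
# find_homelab_dirs is a hypothetical helper, not a function from hops.
# It prints every directory matching the glob pattern given as $1.
find_homelab_dirs() {
    local pattern="$1" restore d
    local -a matches
    # shopt -p returns non-zero when the option is off, hence || true.
    restore="$(shopt -p nullglob)" || true
    shopt -s nullglob
    # Intentionally unquoted so the glob expands here. Word splitting also
    # applies, so this assumes the pattern itself contains no whitespace.
    matches=( $pattern )
    $restore                        # put nullglob back the way it was
    for d in "${matches[@]}"; do
        [[ -d "$d" ]] && printf '%s\n' "$d"
    done
    return 0
}

find_homelab_dirs "/home/*/hops"
```

The caller can then test whether the output is empty before attempting cd.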

B4 -- Missing service definitions file reference [HIGH]

  • File: install:916
  • setup_firewall() sources "$SCRIPT_DIR/hops_service_definitions.sh" which does not exist (the file is named services). Per-service firewall rules are silently never applied.
  • Fix: correct the filename to services.

B5 -- ((x++)) aborts script under set -e [HIGH]

  • Files: hops:299,317, install:784, and others
  • ((running_count++)) returns exit status 1 when the value before the increment is 0 (the post-increment expression evaluates to 0), which kills the script under set -e.
  • Fix: use running_count=$((running_count + 1)) or append || true.
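
Both suggested forms, shown side by side under set -e. The counter name stopped_count is illustrative only.

```shell
#!/usr/bin/env bash
set -e

running_count=0
stopped_count=0

# A bare ((running_count++)) here would evaluate to 0, return status 1,
# and abort the script under set -e.

running_count=$((running_count + 1))   # plain assignment, status 0
((stopped_count++)) || true            # status 1 is swallowed by || true

echo "running=$running_count stopped=$stopped_count"
```

The assignment form is arguably the safer default, since || true also hides genuine arithmetic errors.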

B6 -- hops entry point is Linux-only despite macOS library support [HIGH]

  • File: hops:108-136,263
  • check_dependencies requires systemctl, check_system_requirements calls free and df -BG, show_service_status calls systemctl. All Linux-only. The documented entry point fails immediately on macOS.
  • Fix: add OS guards or document hops as Linux-only.

B7 -- Port collisions not detected within a selection [HIGH]

  • File: services (port map)
  • sabnzbd and traefik dashboard both use 8080; traefik and nginx-proxy-manager both bind 80/443; authelia and transmission both use 9091.
  • check_all_ports only checks host listeners, not intra-selection conflicts, so users can generate an un-startable compose silently.
  • Fix: add intra-selection conflict check before compose generation.
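
One possible shape for the intra-selection check, assuming the selection can be flattened into service:port pairs. find_port_conflicts is a hypothetical name, and the sketch needs bash 4 or newer for associative arrays.

```shell
#!/usr/bin/env bash
# find_port_conflicts is a hypothetical helper. It takes "service:port"
# pairs, prints one line per port claimed by more than one service, and
# returns 1 if any conflict was found.
find_port_conflicts() {
    local -A owner=()
    local pair svc port status=0
    for pair in "$@"; do
        svc="${pair%%:*}"
        port="${pair##*:}"
        if [[ -n "${owner[$port]:-}" ]]; then
            printf 'port %s: %s conflicts with %s\n' "$port" "$svc" "${owner[$port]}"
            status=1
        else
            owner[$port]="$svc"    # first service to claim this port
        fi
    done
    return "$status"
}

find_port_conflicts sabnzbd:8080 traefik:8080 authelia:9091 transmission:9091 || true
```

A service that binds several host ports would simply contribute several pairs.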

MEDIUM BUGS

B8 -- Watchtower assigned a bogus port [MED]

  • Files: lib/docker.sh:47, services-improved:90
  • Watchtower has no web UI. Assigning it port 8080 emits a spurious ports: block and broken healthcheck in the generated compose.

B9 -- Update backup copies into itself [MED]

  • File: hops:586-595
  • update_hops does cp -r "$SCRIPT_DIR" "$backup_dir" where $backup_dir is inside $SCRIPT_DIR. Results in recursive self-copy including .git/.
  • Fix: create the backup dir outside the script directory.
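
A sketch of a backup helper that refuses to write inside the directory it is copying. backup_script_dir and the hops-backup prefix are assumed names, not taken from update_hops.

```shell
#!/usr/bin/env bash
# backup_script_dir is a hypothetical helper: $1 is the directory to back up,
# $2 (optional) is the parent directory for the backup, default $TMPDIR or /tmp.
backup_script_dir() {
    local script_dir backup_root backup_dir
    script_dir="$(cd "$1" && pwd -P)" || return 1
    backup_root="$(cd "${2:-${TMPDIR:-/tmp}}" && pwd -P)" || return 1
    # Reject a backup root that is the source directory or lives inside it.
    case "$backup_root/" in
        "$script_dir/"*)
            echo "backup root must be outside $script_dir" >&2
            return 1
            ;;
    esac
    backup_dir="$(mktemp -d "$backup_root/hops-backup.XXXXXX")" || return 1
    # The trailing /. copies the directory contents, including dotfiles.
    cp -R "$script_dir/." "$backup_dir/" || return 1
    printf '%s\n' "$backup_dir"
}
```

Whether .git/ belongs in the backup at all is a separate decision; this sketch copies it.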

B10 -- secure_delete stat flag wrong on macOS [MED]

  • File: lib/secrets.sh:146
  • Uses stat -c%s (GNU) which fails on macOS (stat -f%z). Manual-overwrite fallback silently no-ops on macOS.
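
A portable size lookup could try the GNU form first and fall back to the BSD form. file_size_bytes is a hypothetical name for illustration.

```shell
#!/usr/bin/env bash
# file_size_bytes is a hypothetical helper that prints the size of $1 in bytes.
file_size_bytes() {
    local f="$1"
    # GNU coreutils stat: -c takes a format string, %s is size in bytes.
    if stat -c %s "$f" 2>/dev/null; then
        return 0
    fi
    # BSD / macOS stat: -f takes a format string, %z is size in bytes.
    stat -f %z "$f" 2>/dev/null
}
```

wc -c < "$file" is another option that avoids stat entirely, at the cost of reading the file.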

B11 -- jellystat generated with wrong template in services-improved [MED]

  • File: services-improved:422
  • Routed through the generic media-server template; gets no postgres DB and no JWT_SECRET, so it cannot run. The hand-written services version is correct.

B12 -- Empty-password detection regex broken [LOW]

  • File: lib/security.sh:361-384
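  • grep "PASSWORD=\s*$" depends on \s, which is a GNU extension and not POSIX regex syntax (adding -E does not change this). On a grep without that extension \s is not treated as whitespace, and empty-password detection is dead. A portable pattern uses [[:space:]] instead.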
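
One portable way to write the check uses the POSIX character class, which works in basic regular expressions without extra flags. has_empty_password is a hypothetical name, not the function in lib/security.sh.

```shell
#!/usr/bin/env bash
# has_empty_password is a hypothetical helper. It succeeds if file $1 has a
# line ending in PASSWORD= followed only by optional whitespace.
has_empty_password() {
    # Single quotes keep the shell from touching $ and the brackets.
    grep -q 'PASSWORD=[[:space:]]*$' "$1"
}
```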

SECURITY

S1 -- Broken/unauthenticated encryption [HIGH]

  • File: lib/secrets.sh:85,115
  • openssl enc -aes-256-gcm via the CLI does not handle the GCM auth tag. This is not authenticated encryption and round-trips unreliably.
  • Fix: use a supported openssl mode or switch to gpg --symmetric.

S2 -- Passphrases/keys exposed in process list [HIGH]

  • Files: lib/secrets.sh:85,115, lib/security.sh:140,156,175,204,416,442
  • -pass pass:"$key" and --passphrase "$x" on the command line are visible to any local user via ps.
  • Fix: use -pass fd:N or --passphrase-fd N with a file descriptor.
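
A sketch of the file-descriptor approach with openssl. The function names are hypothetical, and the cipher is only a placeholder to make the example runnable; it is not a recommendation and does not address S1. The -pbkdf2 flag assumes OpenSSL 1.1.1 or newer.

```shell
#!/usr/bin/env bash
# Sketch: the passphrase travels on file descriptor 3, so it never appears
# in any process's argument list. printf is a shell builtin here.
# Cipher choice is a placeholder only (see S1 for the real question).
encrypt_file_fd() {
    local in="$1" out="$2" key="$3"
    openssl enc -aes-256-cbc -pbkdf2 -salt -in "$in" -out "$out" \
        -pass fd:3 3< <(printf '%s\n' "$key")
}

# Decrypts $1 to stdout using passphrase $2.
decrypt_file_fd() {
    local in="$1" key="$2"
    openssl enc -d -aes-256-cbc -pbkdf2 -in "$in" \
        -pass fd:3 3< <(printf '%s\n' "$key")
}
```

The gpg equivalent would use --passphrase-fd with the same redirection.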

S3 -- Committed default Authelia credential [HIGH]

  • File: services:1148-1157
  • users_database.yml ships a default admin account with a known password hash (hash of literal "password"). Every Authelia deploy has this credential.
  • Fix: force password change on first login or generate the hash at deploy time.

S4 -- Traefik dashboard exposed with no auth [MED]

  • File: services:672-673,684
  • api.insecure=true exposes the Traefik dashboard on :8080 with no auth. Consider disabling or requiring middleware auth.

S5 -- eval on environment-derived value [MED]

  • Files: install:598,671, uninstall:136,462, lib/system.sh:306, others
  • eval echo "~$SUDO_USER" expands an env-sourced value through eval.
  • Fix: getent passwd "$SUDO_USER" | cut -d: -f6
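
Wrapped in a function, the getent lookup might look like this. user_home is a hypothetical name.

```shell
#!/usr/bin/env bash
# user_home is a hypothetical helper. It prints the home directory of user $1
# from the passwd database, or returns non-zero if the user does not exist.
user_home() {
    local user="$1" entry
    entry="$(getent passwd "$user")" || return 1
    # Field 6 of a passwd entry is the home directory.
    printf '%s\n' "$entry" | cut -d: -f6
}

# Typical call site: fall back to $HOME when not running under sudo.
target_home="$(user_home "${SUDO_USER:-$(id -un)}")" || target_home="$HOME"
```

Unlike eval echo "~$SUDO_USER", the value is only ever passed as an argument, never re-parsed as shell code.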

S6 -- Predictable temp file paths [MED]

  • Files: lib/secrets.sh:16,188,288, uninstall:374
  • /tmp/hops_env_$$ etc. in world-writable /tmp are symlink-race targets before the chmod 600 runs.
  • Fix: use mktemp and assign before use.
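
A minimal sketch of the mktemp pattern. The hops_env prefix mirrors the existing file name, and make_env_tmp is a hypothetical helper.

```shell
#!/usr/bin/env bash
# make_env_tmp is a hypothetical helper. mktemp creates the file atomically
# with an unpredictable name and owner-only permissions, so there is no
# window between creation and chmod.
make_env_tmp() {
    local tmp
    tmp="$(mktemp "${TMPDIR:-/tmp}/hops_env.XXXXXX")" || return 1
    printf '%s\n' "$tmp"
}

env_tmp="$(make_env_tmp)" || exit 1
trap 'rm -f "$env_tmp"' EXIT    # clean up however the script ends
```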

S7 -- Commands built as strings, run unquoted [MED]

  • File: install:731-736,755,773-779
  • pull_cmd="sudo -u $SUDO_USER docker compose pull" run as $pull_cmd is fragile with unusual usernames and bypasses quoting.
  • Fix: use bash arrays.
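
The array form could look like this. build_pull_cmd is a hypothetical helper, and the sketch prints the command instead of running it so that it works without Docker.

```shell
#!/usr/bin/env bash
# build_pull_cmd is a hypothetical helper. It fills the global array pull_cmd;
# each array element is exactly one argument, whatever characters it contains.
build_pull_cmd() {
    local user="$1"
    pull_cmd=(docker compose pull)
    if [[ -n "$user" ]]; then
        pull_cmd=(sudo -u "$user" "${pull_cmd[@]}")
    fi
}

build_pull_cmd "${SUDO_USER:-}"
printf 'would run:'
printf ' %q' "${pull_cmd[@]}"
printf '\n'
# To execute for real: "${pull_cmd[@]}"
```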

S9 -- Non-idempotent sysctl append [LOW]

  • File: privileged-setup:224
  • echo 'net.ipv4.ip_forward = 1' >> /etc/sysctl.conf appended every run. Accumulates duplicate lines.
  • Fix: check before appending (grep -q ... || echo ... >>)
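
The check-before-append idea generalizes to a small helper. append_once is a hypothetical name.

```shell
#!/usr/bin/env bash
# append_once is a hypothetical helper. It appends line $1 to file $2 only
# if that exact line is not already present.
append_once() {
    local line="$1" file="$2"
    # -q quiet, -x match whole line, -F fixed string (no regex surprises).
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

For example, append_once 'net.ipv4.ip_forward = 1' /etc/sysctl.conf can run any number of times and leaves a single line.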

ARCHITECTURE / DESIGN

A1 -- Two divergent install pipelines [HIGH] -- RESOLVED

  • Path A chosen. Delete: setup, privileged-setup, user-operations, services-improved.

A2 -- Three sources of truth for the service catalog [HIGH] -- DO FIRST

  • services: get_service_ports() + inline image strings (CANONICAL)
  • services-improved: scheduled for deletion (Path B)
  • lib/docker.sh: HOPS_SERVICES array -- reconcile or remove duplicates
  • Fix: lib/docker.sh service maps must match services; remove anything only used by Path B.
  • Must be done before port collision fix (B7) so there is one authoritative map.

A3 -- lib/privileges.sh is dead code [MED] -- RESOLVED: delete

  • Path B artifact. Delete it.

A4 -- lib/secrets.sh crypto needs fixing and wiring in [MED]

  • Goal: encrypt the .env file at rest after install writes it.
  • Fix broken AES-GCM (use gpg --symmetric or a supported openssl mode) (S1).
  • Fix passphrase-on-command-line exposure (S2).
  • Wire encrypt/decrypt calls into install flow.

A5 -- hops duplicates functions from lib/common.sh [HIGH] -- DO FIRST

  • log, error_exit, warning, success, info, validate_timezone, validate_password, generate_secure_password, create_docker_networks, get_service_port/image are all defined twice (or three times).
  • Fix: source lib/common.sh from hops and remove local duplicates.
  • Must be done before bug fixes to avoid patching the same logic in multiple places.

A6 -- Caddy is unreachable via the menu [LOW]

  • services defines generate_caddy but the select_services menu in install never lists caddy as a selectable option.

A7 -- Committed dev artifacts [LOW]

  • summary7-19.txt and discord-header.md should not be in the repo. Add to .gitignore or delete.

MISSING / INCOMPLETE

M1 -- RESOLVED (Path B deleted)

M2 -- RESOLVED (Path B deleted)

M3 -- RESOLVED (Path B deleted)

M4 -- RESOLVED (lib/privileges.sh deleted)

M5 -- Review Huntarr inclusion -- switch to elfhosted rebuild [MED]

  • The current services definition uses the original plexguide/Huntarr.io image. The project has been rebuilt by elfhosted under a new image/repo.
  • Review the new elfhosted image name, default port, and any config changes.
  • Update the service definition in services and the entry in SERVICES.md.
  • Verify the new image is actively maintained before shipping 1.0.0.

PLATFORM SUPPORT

P2 -- uninstall is Linux-only [HIGH] -- deferred (Linux-first)

  • Unconditional apt-get, dpkg, systemctl, groupdel, ufw with no OS branching. Acceptable for now; revisit when macOS support is scoped.

P3 -- RESOLVED (Path B deleted)

P4 -- No WSL2 detection [MED]

  • README claims WSL2 support but there is no WSL2 detection. systemctl-based service management fails on WSL distros without systemd.
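
A common detection heuristic, sketched with hypothetical function names: WSL kernels report "microsoft" in /proc/version, and a running systemd creates /run/systemd/system. The optional path arguments exist only so the checks can be exercised against fixture files.

```shell
#!/usr/bin/env bash
# is_wsl and has_systemd are hypothetical helpers, not existing hops functions.

# Succeeds if the kernel version string mentions Microsoft (WSL1 and WSL2).
is_wsl() {
    local version_file="${1:-/proc/version}"
    [[ -r "$version_file" ]] && grep -qi 'microsoft' "$version_file"
}

# Succeeds if systemd is the running init system.
has_systemd() {
    [[ -d "${1:-/run/systemd/system}" ]]
}

if is_wsl && ! has_systemd; then
    echo "WSL without systemd detected: systemctl-based steps will not work" >&2
fi
```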

P5 -- Inconsistent port-check tools [MED]

  • lib/common.sh uses ss; install uses lsof. ss is absent on macOS.

P6 -- Hardcoded render GID for Jellyfin GPU [LOW]

  • File: services:435
  • group_add: "109" is the render GID on a specific distro, wrong on most systems and meaningless on macOS.

CODE QUALITY

Q1 -- Three separate error-handling implementations [HIGH] -- DO FIRST

  • hops, uninstall, and lib/common.sh each define their own error_exit and log with different formats. Consolidate in lib/common.sh.
  • Covered by A5; tracked here for completeness.

Q2 -- set -e + intentional non-zero returns is a minefield [MED]

  • validate_password returns 1/2/3, check_port returns 1 -- these work only because they happen to be in conditionals. Combined with B5 this is fragile. Consider set -euo pipefail with explicit || true where non-zero is intended.
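
One way to make the intent explicit is to capture the status at the call site. The validate_password below is a stand-in with assumed rules, not the real function.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Stand-in for illustration only; the real validate_password rules may differ.
validate_password() {
    local pw="$1"
    (( ${#pw} >= 8 )) || return 1      # too short
    [[ "$pw" =~ [0-9] ]] || return 2   # no digit
    return 0
}

# Capture the status explicitly: the || keeps set -e from aborting,
# and rc records which rule failed.
rc=0
validate_password "short" || rc=$?
echo "rc=$rc"
```

This keeps the distinct return codes usable without relying on the call happening to sit inside an if.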

Q3 -- Debug echo statements left in production code [MED] -- DO FIRST

  • Files: lib/system.sh:605,823,1043,1046,1084,1089,1149-1156
  • DEBUG: prefixed echo lines should be removed or gated behind a $DEBUG flag.
  • Clean these before bug fixes so signal isn't buried in debug noise.
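
If the lines are gated instead of removed, a small helper keeps call sites short. The function name debug and the DEBUG=1 convention are assumptions.

```shell
#!/usr/bin/env bash
# debug is a hypothetical helper. It prints to stderr only when DEBUG=1,
# so normal output stays clean and stdout is never polluted.
debug() {
    [[ "${DEBUG:-0}" == "1" ]] || return 0
    printf 'DEBUG: %s\n' "$*" >&2
}

debug "compose file generated"    # silent unless DEBUG=1 is set
```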

Q4 -- services-improved leaks set -e when sourced [LOW]

  • File: services-improved top of file
  • File sets set -e then is sourced by user-operations and privileges, leaking the option into the caller's shell.

SUGGESTED ORDER OF ATTACK

Cleanup first (do before any bug fixes)

  1. [DONE] Delete Path B files (A1/A3)
  2. Consolidate duplicate functions into lib/common.sh (A5, Q1) -- one copy to fix
  3. Reconcile lib/docker.sh service maps with services (A2) -- one catalog to fix
  4. Remove debug echo statements from lib/system.sh (Q3) -- reduce noise

Bug fixes

  1. Fix B1 (infinite recursion in services) -- unblocks all Linux installs
  2. Fix B5 (((x++)) under set -e) -- prevents silent aborts
  3. Fix B3 (glob directory detection) -- fixes multi-user and uninstall
  4. Fix B4 (wrong filename in firewall setup)
  5. Fix B7 (intra-selection port collision detection)

Security pass

  1. S3 (default Authelia cred), S5 (eval on env var), S6 (mktemp), S7 (string-built commands), S2 (passphrase on cmdline)
  2. Fix and wire in lib/secrets.sh: replace broken crypto, hook into install flow to encrypt .env at rest (A4/S1/S2)

Remaining

  1. Fix B12 empty-password regex, B8 watchtower port, B9 backup self-copy
  2. macOS / WSL2 support (B6, P4) -- future roadmap