Infrastructure Monitoring Dashboard

Project Overview

A web-based status feed aggregator for a K-12 school district IT department. Provides a single-pane-of-glass view of vendor service health, replacing the need to manually check multiple status pages during incidents.

Target Vendors

| Vendor | Type | Status Source |
| --- | --- | --- |
| Microsoft 365 | Productivity suite | Service Communications API (Graph API) |
| SpamTitan | Email security | Synthetic check — district appliance (mailportal.nhsd.net) |
| PowerSchool | Student Information System | Atlassian Statuspage API (status.powerschool.com) |
| Classlink | SSO / Identity | Atlassian Statuspage API (status.classlink.com) |
| Apple | Device ecosystem | Apple System Status JSON feed |
| DRC | Assessment / Testing | Synthetic check — PA INSIGHT portal (status page is JS-rendered Angular app) |
| FinalSite | School website CMS | Atlassian Statuspage API (status.finalsite.com) |
| Google Workspace | Productivity suite | Google Workspace Status Dashboard JSON feed |
| Follett | Library management | Synthetic check — district Destiny instance (northhills.follettdestiny.com) |
| EdInsight | Data analytics (Harris Education Solutions) | Synthetic check — no public status page found |
| Raptor | Visitor management | Status.io API (status.raptortech.com); incidents in "Monitoring" state (state ≥ 300) are suppressed from the message |
| SchoolMessenger | Communication platform | Atlassian Statuspage API (PowerSchool status page, SchoolMessenger components filtered) |
| McGraw Hill | Curriculum / assessment | Synthetic check — ConnectED portal (status.mcgrawhill.com is JS-rendered) |
| Fortinet | Network security | Atlassian Statuspage API (FortiGate Cloud — status.fortigate.forticloud.com) |
| SherpaDesk | Helpdesk / ticketing | Synthetic check — district portal (nhsd.sherpadesk.com); HEAD not supported, uses GET |
| Study Island | Instructional practice | Atlassian Statuspage API (Edmentum — status.edmentum.com, Study Island component filtered) |
| Classkick | Classroom assessment | Synthetic check — app portal (StatusCast API requires auth token) |
| ClassDojo | Classroom communication | Synthetic check — app portal (no machine-readable status feed) |
| Savvas | K-12 Curriculum / learning platform | Atlassian Statuspage API (status.savvas.com) |
| Amazon AWS | Cloud infrastructure | RSS feed polling (EC2 us-east-1/2, S3, CloudFront, Route 53) |
| Cloudflare | CDN / DNS | Atlassian Statuspage API (cloudflarestatus.com) |
| SmartPass | Hall pass management | Instatus JSON API (smartpass.instatus.com) |
| School Dismissal Manager | Dismissal management | Synthetic check — admin portal (status page redirects to StatusGator) |
| Promethean | Interactive displays | Synthetic check — prometheanworld.com (panels used in standalone mode; no cloud features) |
| RAZ-Kids | Reading platform | Synthetic check — Learning A-Z login portal; always returns unknown due to Cloudflare managed challenge blocking server-side fetches |
| Internet Connectivity | | TCP check to 8.8.8.8:53 |
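
Several vendors in the table share the Atlassian Statuspage API, so one helper could plausibly serve all of them. The sketch below assumes each provider fetches the page's public `/api/v2/summary.json` and reduces its `components` array to a single status. The function name `normalizeStatuspage`, the dashboard labels (`operational`, `degraded`, `outage`, `unknown`), and the optional component filter are illustrative assumptions, not the project's actual code.

```javascript
// Hypothetical helper: reduce a Statuspage summary.json payload to one status.
// The component status values are those used by the public Statuspage API;
// the severity ranking and dashboard labels are assumptions for illustration.
const COMPONENT_SEVERITY = {
  operational: 0,
  under_maintenance: 1,
  degraded_performance: 1,
  partial_outage: 2,
  major_outage: 2,
};
const LABELS = ['operational', 'degraded', 'outage'];

function normalizeStatuspage(summary, componentFilter = () => true) {
  const components = (summary && summary.components) || [];
  const matched = components.filter((c) => componentFilter(c.name));
  if (matched.length === 0) return 'unknown'; // nothing to judge by

  let worst = 0;
  for (const component of matched) {
    const severity = COMPONENT_SEVERITY[component.status];
    if (severity === undefined) return 'unknown'; // unrecognized status value
    worst = Math.max(worst, severity);
  }
  return LABELS[worst];
}
```

For a row such as SchoolMessenger, a filter like `(name) => /schoolmessenger/i.test(name)` would restrict the result to matching components on the shared PowerSchool page; the exact component names are an assumption here.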

Note: Exchange Online is intentionally excluded — it is a component of M365 Service Health and would be redundant.

New vendors should be added incrementally, not speculatively.
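
The synthetic checks and the TCP connectivity check listed in the vendor table could look roughly like the following. This is a sketch under stated assumptions: the function names, timeouts, and the mapping from HTTP status codes to dashboard states are hypothetical, and `fetch` and `AbortSignal.timeout` require Node 18 or later.

```javascript
const net = require('node:net');

// Hypothetical TCP probe: resolves 'operational' if a connection opens within
// timeoutMs, otherwise 'outage'. It never rejects.
function tcpCheck(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (status) => {
      socket.destroy();
      resolve(status);
    };
    socket.setTimeout(timeoutMs);
    socket.once('connect', () => finish('operational'));
    socket.once('timeout', () => finish('outage'));
    socket.on('error', () => finish('outage'));
  });
}

// Assumed mapping from an HTTP status code to a dashboard state.
function classifyHttpStatus(code) {
  if (code >= 200 && code < 400) return 'operational';
  if (code >= 500) return 'outage';
  return 'unknown'; // e.g. 403 from a bot challenge: cannot tell either way
}

// Hypothetical HTTP synthetic check. Defaults to HEAD; pass { method: 'GET' }
// for portals that do not support HEAD.
async function httpCheck(url, { method = 'HEAD', timeoutMs = 5000 } = {}) {
  try {
    const res = await fetch(url, { method, signal: AbortSignal.timeout(timeoutMs) });
    return classifyHttpStatus(res.status);
  } catch {
    return 'unknown'; // network error or timeout
  }
}
```

Under these assumptions the connectivity row would be `tcpCheck('8.8.8.8', 53)`, and a portal that rejects HEAD would be called as `httpCheck(url, { method: 'GET' })`.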

FortiGate Dashboard Features

The WAN throughput graphs and FortiGate health card were implemented but have since been removed from both the frontend and backend because of an unresolved FortiGate API access issue. The FortiGate .env variables listed under Credentials / Environment are kept as placeholders for when this is revisited.

Credentials / Environment

backend/.env is gitignored and contains:

  • AZURE_TENANT_ID, AZURE_CLIENT_ID, AZURE_CLIENT_SECRET — Microsoft 365 Graph API
  • FORTIGATE_HOST, FORTIGATE_API_TOKEN — FortiGate REST API (not currently used)
  • FORTIGATE_WAN1_INTERFACE, FORTIGATE_WAN1_LABEL — Crown Castle WAN (not currently used)
  • FORTIGATE_WAN2_INTERFACE, FORTIGATE_WAN2_LABEL — Comcast WAN (not currently used)
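
As an illustration of how the Microsoft 365 variables above might be consumed, the sketch below validates them and builds an OAuth 2.0 client-credentials token request for Microsoft Graph. It assumes the variables are already loaded into `process.env` (for example by a package such as dotenv); the function names `readAzureConfig` and `buildTokenRequest` are hypothetical and not taken from the project.

```javascript
// Hypothetical loader for the Microsoft 365 credentials listed above.
const REQUIRED_AZURE_VARS = ['AZURE_TENANT_ID', 'AZURE_CLIENT_ID', 'AZURE_CLIENT_SECRET'];

function readAzureConfig(env = process.env) {
  const missing = REQUIRED_AZURE_VARS.filter((name) => !env[name]);
  if (missing.length > 0) {
    // Fail early with the variable names only; never log secret values.
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    tenantId: env.AZURE_TENANT_ID,
    clientId: env.AZURE_CLIENT_ID,
    clientSecret: env.AZURE_CLIENT_SECRET,
  };
}

// Builds (but does not send) a client-credentials token request for Graph.
function buildTokenRequest({ tenantId, clientId, clientSecret }) {
  return {
    url: `https://login.microsoftonline.com/${encodeURIComponent(tenantId)}/oauth2/v2.0/token`,
    method: 'POST',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: new URLSearchParams({
      grant_type: 'client_credentials',
      client_id: clientId,
      client_secret: clientSecret,
      scope: 'https://graph.microsoft.com/.default',
    }).toString(),
  };
}
```

Keeping the request builder separate from the network call makes it easy to verify without real credentials.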

Source Control

Hosting

  • Web server: Shared Caddy instance also used by the staff lifecycle portal (C:\staff-lifecycle-portal\caddy\Caddyfile)
  • URL: https://status.nhsd.net (standard HTTPS, no custom port)
  • Access: All local network devices (no IP restriction on the status block)
  • TLS: Caddy internal TLS (self-signed). Browser cert warnings are acceptable; distribute caddy/data/caddy/pki/authorities/local/root.crt via Group Policy to eliminate them.
  • DNS: A record status.nhsd.net → 10.1.20.214 on nhsd-dc-04p.nhsd.net
  • Caddy reload: caddy reload --config "C:\staff-lifecycle-portal\caddy\Caddyfile"

Architecture

  • Frontend: HTML/CSS/JS dashboard — lightweight, no heavy framework. Designed to work on a wall-mounted monitor or quick browser check.
  • Backend: Node.js service that polls vendor status on a schedule and caches results.
  • Web server: Caddy reverse-proxies /api/* to the Node backend on port 3000 and serves the static frontend directly.
  • Services: Node backend runs as an NSSM Windows service named StatusDashboard (auto-start). Caddy is managed by the staff-lifecycle-portal project.
  • Data flow: Backend polls vendors → caches to local store → frontend fetches from backend API → auto-refreshes on interval.
  • Logs: C:\infrastructure-monitoring-dashboard\logs\backend.log
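
The data flow above (poll, cache, serve) can be sketched as follows. This is a minimal illustration, assuming each provider module exports an object of the form `{ name, check }` where `check()` resolves to a status string; the names `createStatusCache`, `pollOnce`, and `startPolling` are hypothetical, and the 2-minute default mirrors the poll interval given for `/api/status`.

```javascript
// Hypothetical in-memory cache holding the latest result per vendor.
function createStatusCache() {
  const entries = new Map();
  return {
    set(name, status) {
      entries.set(name, { name, status, checkedAt: new Date().toISOString() });
    },
    snapshot() {
      return [...entries.values()];
    },
  };
}

// Runs every provider once. A check that throws or rejects is recorded as
// "unknown" so one failing vendor cannot stop the rest of the poll.
async function pollOnce(providers, cache) {
  const results = await Promise.allSettled(
    providers.map((p) => Promise.resolve().then(() => p.check()))
  );
  results.forEach((result, i) => {
    const status = result.status === 'fulfilled' ? result.value : 'unknown';
    cache.set(providers[i].name, status);
  });
}

// Polls immediately, then on a fixed schedule. Returns the timer so the
// caller can stop it with clearInterval.
function startPolling(providers, cache, intervalMs = 2 * 60 * 1000) {
  pollOnce(providers, cache);
  return setInterval(() => pollOnce(providers, cache), intervalMs);
}
```

The frontend would then only ever read `cache.snapshot()` through the backend API, so a slow vendor never delays a page load.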

API Endpoints

| Endpoint | Description | Poll interval |
| --- | --- | --- |
| GET /api/status | All vendor status cards | 2 minutes |
| GET /api/health | Backend liveness check | On demand |
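
A request handler for these two endpoints might look like the sketch below, written against Node's built-in `http` request and response objects. The function name `createHandler`, the `getSnapshot` callback, and the JSON response shapes are assumptions for illustration; the project's actual responses may differ.

```javascript
// Hypothetical handler for GET /api/status and GET /api/health.
// getSnapshot is an assumed callback that returns the cached vendor statuses.
function createHandler(getSnapshot) {
  return (req, res) => {
    // req.url is a path, so a placeholder base is needed to parse it.
    const { pathname } = new URL(req.url, 'http://localhost');
    const sendJson = (code, body) => {
      res.writeHead(code, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify(body));
    };

    if (req.method !== 'GET') return sendJson(405, { error: 'method not allowed' });
    if (pathname === '/api/status') return sendJson(200, { vendors: getSnapshot() });
    if (pathname === '/api/health') return sendJson(200, { ok: true });
    return sendJson(404, { error: 'not found' });
  };
}
```

It could be mounted with `require('node:http').createServer(createHandler(() => latestStatuses)).listen(3000)`, where `latestStatuses` is a placeholder for the cached data and port 3000 is the backend port that Caddy proxies `/api/*` to.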

Directory Structure

infrastructure-monitoring-dashboard/
├── CLAUDE.md
├── README.md
├── .gitignore
├── bin/
│   └── nssm/                # Drop nssm.exe here (git-ignored)
├── config/
│   └── Caddyfile            # Unused — Caddy config lives in the lifecycle portal project
├── frontend/
│   ├── index.html
│   ├── css/style.css
│   └── js/app.js
├── backend/
│   ├── package.json
│   ├── server.js
│   └── providers/           # One module per vendor
├── logs/                    # backend.log (git-ignored)
└── scripts/                 # NSSM service install/uninstall helpers

Design Principles

  • Keep it simple. This is a status board, not a monitoring platform.
  • Degrade gracefully — if a vendor check fails, show "unknown" rather than crashing.
  • Each vendor integration should be a self-contained module so they can be added/removed independently.
  • Optimize for glanceability — status should be obvious from across the room (color-coded, large indicators).
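
The "degrade gracefully" principle can be sketched as a wrapper applied to every vendor module. This is one possible approach, not the project's confirmed implementation: `withFallback` is a hypothetical name, and the 10-second default timeout is an assumption.

```javascript
// Hypothetical wrapper: runs a provider check but never lets it throw or hang.
// Any error or timeout is reported as the string "unknown".
function withFallback(check, timeoutMs = 10000) {
  return async function safeCheck() {
    let timer;
    const timeout = new Promise((resolve) => {
      timer = setTimeout(() => resolve('unknown'), timeoutMs);
    });
    try {
      // Whichever settles first wins; a slow vendor yields "unknown".
      return await Promise.race([Promise.resolve().then(check), timeout]);
    } catch {
      return 'unknown'; // a thrown or rejected check degrades instead of crashing
    } finally {
      clearTimeout(timer); // avoid keeping the process alive for the timer
    }
  };
}
```

Because the wrapper takes and returns a plain function, a vendor module stays self-contained and can be added or removed without touching the others.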