# Infrastructure Monitoring Dashboard

## Project Overview

A web-based status feed aggregator for a K-12 school district IT department. Provides a single-pane-of-glass view of vendor service health, replacing the need to manually check multiple status pages during incidents.
## Target Vendors

| Vendor | Type | Status Source |
|---|---|---|
| Microsoft 365 | Productivity suite | Service Communications API (Graph API) |
| SpamTitan | Email security | Synthetic check: district appliance (mailportal.nhsd.net) |
| PowerSchool | Student Information System | Atlassian Statuspage API (status.powerschool.com) |
| ClassLink | SSO / Identity | Atlassian Statuspage API (status.classlink.com) |
| Apple | Device ecosystem | Apple System Status JSON feed |
| DRC | Assessment / Testing | Synthetic check: PA INSIGHT portal (status page is a JS-rendered Angular app) |
| Finalsite | School website CMS | Atlassian Statuspage API (status.finalsite.com) |
| Google Workspace | Productivity suite | Google Workspace Status Dashboard JSON feed |
| Follett | Library management | Synthetic check: district Destiny instance (northhills.follettdestiny.com) |
| EdInsight | Data analytics (Harris Education Solutions) | Synthetic check: no public status page found |
| Raptor | Visitor management | Status.io API (status.raptortech.com); incidents in "Monitoring" state (state ≥ 300) are suppressed from the message |
| SchoolMessenger | Communication platform | Atlassian Statuspage API (PowerSchool status page, SchoolMessenger components filtered) |
| McGraw Hill | Curriculum / assessment | Synthetic check: ConnectED portal (status.mcgrawhill.com is JS-rendered) |
| Fortinet | Network security | Atlassian Statuspage API (FortiGate Cloud, status.fortigate.forticloud.com) |
| SherpaDesk | Helpdesk / ticketing | Synthetic check: district portal (nhsd.sherpadesk.com); HEAD not supported, uses GET |
| Study Island | Instructional practice | Atlassian Statuspage API (Edmentum, status.edmentum.com, Study Island component filtered) |
| Classkick | Classroom assessment | Synthetic check: app portal (StatusCast API requires auth token) |
| ClassDojo | Classroom communication | Synthetic check: app portal (no machine-readable status feed) |
| Savvas K-12 | Curriculum / learning platform | Atlassian Statuspage API (status.savvas.com) |
| Amazon AWS | Cloud infrastructure | RSS feed polling (EC2 us-east-1/2, S3, CloudFront, Route 53) |
| Cloudflare | CDN / DNS | Atlassian Statuspage API (cloudflarestatus.com) |
| SmartPass | Hall pass management | Instatus JSON API (smartpass.instatus.com) |
| School Dismissal Manager | Dismissal management | Synthetic check: admin portal (status page redirects to StatusGator) |
| Promethean | Interactive displays | Synthetic check: prometheanworld.com (panels used in standalone mode; no cloud features) |
| RAZ-Kids | Reading platform | Synthetic check: Learning A-Z login portal; always returns `unknown` because a Cloudflare managed challenge blocks server-side fetches |
| Internet | Connectivity | TCP check to 8.8.8.8:53 |

Note: Exchange Online is intentionally excluded; it is a component of M365 Service Health, so a separate card would be redundant.
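Several rows in the table rely on the Atlassian Statuspage API. As a rough illustration of what such a provider might look like, the sketch below reads a page's public `/api/v2/status.json` summary and maps its `indicator` field onto a dashboard status. The function names and the normalized status values (`operational`, `degraded`, `outage`, `unknown`) are assumptions made for illustration and are not taken from the actual `providers/` modules; the global `fetch` call assumes Node 18 or later.

```javascript
// Hypothetical mapping from Statuspage indicators to dashboard statuses.
const INDICATOR_TO_STATUS = new Map([
  ["none", "operational"],
  ["minor", "degraded"],
  ["major", "outage"],
  ["critical", "outage"],
]);

// Convert the parsed body of /api/v2/status.json into a dashboard card.
// Anything unrecognized falls back to "unknown".
function toCard(vendor, body) {
  const summary = (body && body.status) || {};
  return {
    vendor,
    status: INDICATOR_TO_STATUS.get(summary.indicator) || "unknown",
    message: summary.description || "",
  };
}

// Fetch the public summary for one status page and normalize it.
async function checkStatuspage(vendor, baseUrl) {
  const res = await fetch(`${baseUrl}/api/v2/status.json`);
  if (!res.ok) throw new Error(`HTTP ${res.status} from ${baseUrl}`);
  return toCard(vendor, await res.json());
}
```

A call such as `checkStatuspage("PowerSchool", "https://status.powerschool.com")` would resolve to a single card. Rows that filter by component (SchoolMessenger, Study Island) would need per-component data instead, which this sketch does not cover.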
New vendors should be added incrementally, not speculatively.
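The `Internet` row is the simplest source in the vendor table: a TCP connection attempt to 8.8.8.8 on port 53. A minimal sketch using Node's built-in `net` module might look like the following. The function name `tcpCheck` and the 3-second default timeout are assumptions, not details from the real provider.

```javascript
const net = require("node:net");

// Resolve true if a TCP connection to host:port opens within timeoutMs,
// and false on a connection error or timeout. The promise never rejects.
function tcpCheck(host, port, timeoutMs = 3000) {
  return new Promise((resolve) => {
    const socket = net.connect({ host, port });
    const finish = (ok) => {
      socket.destroy(); // release the socket whichever event fired first
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}
```

Calling `tcpCheck("8.8.8.8", 53)` would then yield a boolean that a provider could translate into an up or down card.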
## FortiGate Dashboard Features

The WAN throughput graphs and FortiGate health card were implemented, but the feature is disabled because of an unresolved FortiGate API access issue, and its code has been removed from the frontend and backend until that is resolved. The FortiGate `.env` variables below are kept as placeholders for when this is revisited.
## Credentials / Environment

`backend/.env` is gitignored and contains:

- `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, `AZURE_CLIENT_SECRET`: Microsoft 365 Graph API
- `FORTIGATE_HOST`, `FORTIGATE_API_TOKEN`: FortiGate REST API (not currently used)
- `FORTIGATE_WAN1_INTERFACE`, `FORTIGATE_WAN1_LABEL`: Crown Castle WAN (not currently used)
- `FORTIGATE_WAN2_INTERFACE`, `FORTIGATE_WAN2_LABEL`: Comcast WAN (not currently used)
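One way the backend could guard against a missing or blank credential is a small startup check like the sketch below. It assumes the `.env` values have already been loaded into `process.env` by whatever mechanism the backend uses; the helper name `missingEnv` and the `REQUIRED_ENV` list are hypothetical. The FortiGate variables are left out because that feature is not currently used.

```javascript
// Variables the Microsoft 365 provider needs (names from backend/.env).
const REQUIRED_ENV = ["AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"];

// Return the names that are unset or blank in the given environment object.
function missingEnv(env, names = REQUIRED_ENV) {
  return names.filter((name) => !env[name] || env[name].trim() === "");
}

// Warn at startup rather than failing later inside the Graph API call.
const missing = missingEnv(process.env);
if (missing.length > 0) {
  console.warn(`Missing .env values: ${missing.join(", ")}`);
}
```

Warning instead of exiting keeps the other vendor cards working even when the Microsoft 365 credentials are absent.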
## Source Control

- **Remote**: https://git.canadabot.net/canadabot/infrastructure-monitoring-dashboard (remote name: `origin`)
- Credentials stored in `C:\users\kleins\.gitea_credentials`
## Hosting

- **Web server**: Shared Caddy instance also used by the staff lifecycle portal (`C:\staff-lifecycle-portal\caddy\Caddyfile`)
- **URL**: https://status.nhsd.net (standard HTTPS, no custom port)
- **Access**: All local network devices (no IP restriction on the status block)
- **TLS**: Caddy internal TLS (certificates issued by Caddy's local CA, which browsers do not trust by default). Browser cert warnings are acceptable; to eliminate them, distribute `caddy/data/caddy/pki/authorities/local/root.crt` via Group Policy.
- **DNS**: A record `status.nhsd.net → 10.1.20.214` on nhsd-dc-04p.nhsd.net
- **Caddy reload**: `caddy reload --config "C:\staff-lifecycle-portal\caddy\Caddyfile"`
## Architecture

- **Frontend**: HTML/CSS/JS dashboard; lightweight, no heavy framework. Designed to work on a wall-mounted monitor or quick browser check.
- **Backend**: Node.js service that polls vendor status on a schedule and caches results.
- **Web server**: Caddy reverse-proxies `/api/*` to the Node backend on port 3000 and serves the static frontend directly.
- **Services**: Node backend runs as an NSSM Windows service named `StatusDashboard` (auto-start). Caddy is managed by the staff-lifecycle-portal project.
- **Data flow**: Backend polls vendors → caches to local store → frontend fetches from backend API → auto-refreshes on interval.
- **Logs**: `C:\infrastructure-monitoring-dashboard\logs\backend.log`
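The data flow described above (poll, cache, serve) could be sketched roughly as follows. This is an illustration under stated assumptions, not the actual `server.js`: the in-memory `Map` stands in for the "local store", and `providers` is a hypothetical array of `{ name, check }` objects whose `check()` returns a card such as `{ status, message }`.

```javascript
// In-memory cache of the latest card per vendor (assumed storage).
const cache = new Map();

// Run every provider once and store each result, keyed by vendor name.
// A provider that throws or rejects is recorded as "unknown".
async function pollOnce(providers, now = () => new Date()) {
  const results = await Promise.allSettled(
    providers.map((p) => Promise.resolve().then(() => p.check()))
  );
  results.forEach((result, i) => {
    const { name } = providers[i];
    const card =
      result.status === "fulfilled"
        ? result.value
        : { status: "unknown", message: String(result.reason) };
    cache.set(name, { vendor: name, ...card, checkedAt: now().toISOString() });
  });
  return [...cache.values()];
}

// Poll immediately, then repeat on a fixed interval (2 minutes by default).
function startPolling(providers, intervalMs = 2 * 60 * 1000) {
  pollOnce(providers);
  return setInterval(() => pollOnce(providers), intervalMs);
}
```

Because the API layer only ever reads from `cache`, a slow or failing vendor does not delay what the frontend sees; it simply keeps showing the last stored card.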
## API Endpoints

| Endpoint | Description | Poll interval |
|---|---|---|
| `GET /api/status` | All vendor status cards | 2 minutes |
| `GET /api/health` | Backend liveness check | On demand |
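The two endpoints in the table could be served with nothing more than Node's built-in `http` module, as in the sketch below. The real `server.js` may well use a framework; `createHandler`, `startServer`, `getCards`, and the JSON response shapes are all assumptions made for illustration.

```javascript
const http = require("node:http");

// Build the request handler. `getCards` is a hypothetical function that
// returns the cached vendor cards as an array.
function createHandler(getCards) {
  return (req, res) => {
    const send = (code, body) => {
      res.writeHead(code, { "Content-Type": "application/json" });
      res.end(JSON.stringify(body));
    };
    if (req.method !== "GET") return send(405, { error: "method not allowed" });
    if (req.url === "/api/status") return send(200, getCards());
    if (req.url === "/api/health") return send(200, { ok: true });
    return send(404, { error: "not found" });
  };
}

// Caddy proxies /api/* to port 3000, so listening on loopback is enough
// when Caddy runs on the same host.
function startServer(getCards, port = 3000) {
  return http.createServer(createHandler(getCards)).listen(port, "127.0.0.1");
}
```

Keeping the handler separate from `listen` makes it easy to exercise the routes with stub request and response objects.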
## Directory Structure

```
infrastructure-monitoring-dashboard/
├── CLAUDE.md
├── README.md
├── .gitignore
├── bin/
│   └── nssm/            # Drop nssm.exe here (git-ignored)
├── config/
│   └── Caddyfile        # Unused; Caddy config lives in the lifecycle portal project
├── frontend/
│   ├── index.html
│   ├── css/style.css
│   └── js/app.js
├── backend/
│   ├── package.json
│   ├── server.js
│   └── providers/       # One module per vendor
├── logs/                # backend.log (git-ignored)
└── scripts/             # NSSM service install/uninstall helpers
```
## Design Principles

- Keep it simple. This is a status board, not a monitoring platform.
- Degrade gracefully: if a vendor check fails, show "unknown" rather than crashing.
- Each vendor integration should be a self-contained module so they can be added/removed independently.
- Optimize for glanceability: status should be obvious from across the room (color-coded, large indicators).
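The "degrade gracefully" and "self-contained module" principles might translate into a small wrapper like the one sketched here. It assumes each vendor module exports a hypothetical `{ name, check }` pair; `safeCheck` and the 10-second default timeout are illustrative names and values, not part of the existing code.

```javascript
// Run one provider's check without letting it throw or hang.
// Errors and timeouts both collapse to an "unknown" card.
async function safeCheck(provider, timeoutMs = 10000) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error("timed out")), timeoutMs);
  });
  try {
    const card = await Promise.race([provider.check(), timeout]);
    return { vendor: provider.name, ...card };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { vendor: provider.name, status: "unknown", message };
  } finally {
    clearTimeout(timer); // avoid keeping the process alive after a fast result
  }
}
```

With this shape, removing a vendor means deleting one module, and a broken vendor affects only its own card.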