From 473f585391dc4f70c18b88635e0b661cea5224ad Mon Sep 17 00:00:00 2001 From: "Tran G. (Revernomad) Khoa" Date: Thu, 16 Apr 2026 13:54:30 +0700 Subject: [PATCH] feat: initial system design documents for Languard Server Manager Complete backend design for an Arma 3 dedicated server management panel: - ARCHITECTURE.md: System architecture, tech stack, component responsibilities, data flows - DATABASE.md: SQLite schema with WAL mode, CHECK constraints, 16+ tables - API.md: REST + WebSocket API contract with auth, CRUD, and real-time channels - MODULES.md: Python module breakdown with class definitions and dependencies - THREADING.md: Concurrency model with thread safety, auto-restart, and WS bridge - IMPLEMENTATION_PLAN.md: 7-phase implementation plan with security from Phase 1 Key design decisions: - Sync SQLAlchemy only (no aiosqlite), thread-local DB connections - Structured config builder (not f-strings) preventing config injection - RCon request multiplexer for concurrent UDP access - BackgroundScheduler for sync DB cleanup jobs - ban.txt bidirectional sync with documented field mapping - Auto-restart sequenced after thread cleanup Co-Authored-By: Claude Opus 4.6 --- API.md | 755 ++++++++++++++++++++++++++++++++++ ARCHITECTURE.md | 309 ++++++++++++++ DATABASE.md | 586 +++++++++++++++++++++++++++ IMPLEMENTATION_PLAN.md | 445 ++++++++++++++++++++ MODULES.md | 900 +++++++++++++++++++++++++++++++++++++++++ THREADING.md | 600 +++++++++++++++++++++++++++ 6 files changed, 3595 insertions(+) create mode 100644 API.md create mode 100644 ARCHITECTURE.md create mode 100644 DATABASE.md create mode 100644 IMPLEMENTATION_PLAN.md create mode 100644 MODULES.md create mode 100644 THREADING.md diff --git a/API.md b/API.md new file mode 100644 index 0000000..e573223 --- /dev/null +++ b/API.md @@ -0,0 +1,755 @@ +# Languard Server Manager — API Contract + +## Base URL +``` +http://localhost:8000/api +``` + +## Authentication +- All endpoints except `POST /auth/login` require: 
`Authorization: Bearer {token}` +- WebSocket: pass token as query param: `ws://localhost:8000/ws/{server_id}?token={token}` +- JWT payload: `{ "sub": "user_id", "username": "string", "role": "admin|viewer", "exp": timestamp }` + +## Common Response Envelope +```json +{ + "success": true, + "data": { ... }, + "error": null +} +``` +Error response: +```json +{ + "success": false, + "data": null, + "error": { + "code": "NOT_FOUND", + "message": "Server with id 5 not found" + } +} +``` + +## HTTP Status Codes +| Code | Meaning | +|------|---------| +| 200 | Success | +| 201 | Created | +| 204 | No content (DELETE) | +| 400 | Validation error | +| 401 | Unauthenticated | +| 403 | Forbidden (insufficient role) | +| 404 | Not found | +| 409 | Conflict (already running, duplicate) | +| 422 | Unprocessable (Pydantic validation) | +| 429 | Too many requests (rate limited) | +| 500 | Internal server error | + +--- + +## Auth Endpoints + +### POST /auth/login +Log in and receive a JWT. + +**Request:** +```json +{ + "username": "admin", + "password": "secret" +} +``` +**Response 200:** +```json +{ + "success": true, + "data": { + "access_token": "eyJhbGciOiJIUzI1NiIs...", + "token_type": "bearer", + "expires_in": 86400, + "user": { + "id": 1, + "username": "admin", + "role": "admin" + } + } +} +``` + +### POST /auth/logout +Invalidate token (client-side token deletion; server-side blacklist optional). + +### GET /auth/me +Return current user info. + +### PUT /auth/password +Change password. Admin only. +```json +{ "current_password": "old", "new_password": "new" } +``` + +### GET /auth/users +List all users. Admin only. + +### POST /auth/users +Create user. Admin only. +```json +{ "username": "viewer1", "password": "pass", "role": "viewer" } +``` + +### DELETE /auth/users/{user_id} +Delete user. Admin only. + +--- + +## Server Endpoints + +### GET /servers +List all servers with current status. Supports pagination. 
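
As a rough client-side illustration of the conventions described so far (bearer header, response envelope, `limit`/`offset` pagination), the sketch below builds a list request and unwraps a response. The helper names `auth_headers`, `list_servers_url`, `unwrap` and the `ApiError` class are hypothetical and not part of the Languard codebase; falling back to `INTERNAL_ERROR` when the error object is missing is also an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000/api"  # base URL from this document


class ApiError(Exception):
    """Hypothetical exception carrying the envelope's error code and message."""

    def __init__(self, code, message):
        super().__init__(f"{code}: {message}")
        self.code = code
        self.message = message


def auth_headers(token):
    # Every endpoint except POST /auth/login expects this header.
    return {"Authorization": f"Bearer {token}"}


def list_servers_url(limit=50, offset=0):
    # Pagination uses the limit/offset query params.
    return f"{BASE_URL}/servers?" + urlencode({"limit": limit, "offset": offset})


def unwrap(envelope):
    # Return `data` on success; otherwise raise with the envelope's error code.
    if envelope.get("success"):
        return envelope["data"]
    err = envelope.get("error") or {}
    raise ApiError(err.get("code", "INTERNAL_ERROR"), err.get("message", ""))
```

A caller would pass `auth_headers(token)` and `list_servers_url(...)` to whatever HTTP client it uses, then feed the decoded JSON body to `unwrap`.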
+ +**Query params:** `?limit=50&offset=0` + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "id": 1, + "name": "Main Server", + "description": "Primary COOP server", + "status": "running", + "pid": 12345, + "game_port": 2302, + "rcon_port": 2306, + "player_count": 15, + "max_players": 40, + "current_mission": "MyMission.Altis", + "uptime_seconds": 3600, + "cpu_percent": 34.2, + "ram_mb": 1850.5, + "started_at": "2026-04-16T10:00:00Z" + } + ] +} +``` +**Computed fields:** `current_mission` is derived from the RCon `players` response or from `mission_rotation` combined with server status; `uptime_seconds` is computed as (now - started_at) in the service layer. + +### POST /servers +Create a new server. Admin only. + +**Request:** +```json +{ + "name": "Main Server", + "description": "Primary COOP server", + "exe_path": "C:/Arma3Server/arma3server_x64.exe", + "game_port": 2302, + "rcon_port": 2306, + "auto_restart": true, + "max_restarts": 3 +} +``` +**Note:** `password_admin` is auto-generated if not provided in the request. The generated value is returned once, in the creation response; later API responses do not return it in plaintext. `rcon_password` is also auto-generated if not provided. +**Response 201:** Returns full server object including auto-generated credentials. + +### GET /servers/{server_id} +Get server detail with full status. + +**Response 200:** +```json +{ + "success": true, + "data": { + "id": 1, + "name": "Main Server", + "status": "running", + "pid": 12345, + "game_port": 2302, + "rcon_port": 2306, + "auto_restart": true, + "restart_count": 0, + "player_count": 15, + "max_players": 40, + "cpu_percent": 34.2, + "ram_mb": 1850.5, + "started_at": "2026-04-16T10:00:00Z", + "uptime_seconds": 3600, + "current_mission": "MyMission.Altis" + } +} +``` + +### PUT /servers/{server_id} +Update server metadata (name, description, exe_path, ports). Admin only. + +### DELETE /servers/{server_id} +Delete server (must be stopped first). Admin only. 
Removes DB rows and `servers/{id}/` directory. + +### POST /servers/{server_id}/start +Start the server. Admin only. + +**Response 200:** +```json +{ "success": true, "data": { "status": "starting", "pid": null } } +``` +**Response 409:** Server already running. + +### POST /servers/{server_id}/stop +Graceful stop (send `#shutdown` via RCon, then kill after 30s). Admin only. + +**Request (optional):** +```json +{ "force": false, "reason": "Maintenance" } +``` + +### POST /servers/{server_id}/restart +Stop then start. Admin only. + +### POST /servers/{server_id}/kill +Force-kill the process immediately. Admin only. Emergency use only. + +--- + +## Server Config Endpoints + +### GET /servers/{server_id}/config +Get all config sections combined. + +**Response 200:** +```json +{ + "success": true, + "data": { + "server": { /* server_configs row */ }, + "basic": { /* basic_configs row */ }, + "profile": { /* server_profiles row */ }, + "launch": { /* launch_params row */ }, + "rcon": { "rcon_password": "***", "max_ping": 200, "enabled": true } + } +} +``` + +### PUT /servers/{server_id}/config/server +Update server.cfg settings. Admin only. + +**Request:** Partial object matching `server_configs` columns (snake_case). Any omitted field keeps current value. + +```json +{ + "hostname": "Updated Server Name", + "max_players": 64, + "battleye": 1, + "verify_signatures": 2, + "motd_lines": ["Welcome!", "Have fun"], + "motd_interval": 5.0 +} +``` + +### PUT /servers/{server_id}/config/basic +Update basic.cfg (bandwidth) settings. Admin only. + +```json +{ + "max_bandwidth": 50000000, + "max_msg_send": 256 +} +``` + +### PUT /servers/{server_id}/config/profile +Update difficulty profile. Admin only. + +```json +{ + "third_person_view": 0, + "weapon_crosshair": 0, + "ai_level_preset": 3, + "skill_ai": 0.7, + "precision_ai": 0.6 +} +``` + +### PUT /servers/{server_id}/config/launch +Update launch parameters. Admin only. 
+ +```json +{ + "world": "empty", + "limit_fps": 50, + "auto_init": false, + "load_mission_to_memory": true, + "bandwidth_alg": 2, + "enable_ht": true, + "huge_pages": false +} +``` + +### PUT /servers/{server_id}/config/rcon +Update BattlEye RCon settings. Admin only. +Regenerates `battleye/beserver.cfg` immediately. **Note:** BattlEye reads beserver.cfg only at server startup — RCon config changes require a server restart to take effect. The updated config file is ready for the next start. + +**Note on `rcon_port`:** This field is stored in the `servers` table (not `rcon_configs`). +The service layer updates both tables as needed. Include only fields you want to change. + +```json +{ + "rcon_password": "newpassword", + "rcon_port": 2306, + "max_ping": 300, + "enabled": true +} +``` + +### GET /servers/{server_id}/config/preview +Returns rendered `server.cfg` as plain text string (for preview in UI). **Admin only** — contains plaintext credentials. + +**Response 200:** `Content-Type: text/plain` + +### GET /servers/{server_id}/config/download/{filename} +Download generated config file. Filename must be one of: `server.cfg` | `basic.cfg` | `server.Arma3Profile` (whitelist-validated, no path traversal). **Admin only** — config files contain plaintext passwords. + +--- + +## Mission Endpoints + +### GET /servers/{server_id}/missions +List all mission PBOs for a server. + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "id": 1, + "filename": "MyMission.Altis.pbo", + "mission_name": "MyMission.Altis", + "terrain": "Altis", + "file_size": 102400, + "uploaded_at": "2026-04-16T09:00:00Z" + } + ] +} +``` + +### POST /servers/{server_id}/missions/upload +Upload a mission PBO. Admin only. `multipart/form-data`. 
+ +**Form fields:** +- `file`: the `.pbo` file (filename is sanitized with `os.path.basename()` to prevent path traversal; only `.pbo` extension allowed) + +**Response 201:** +```json +{ + "success": true, + "data": { + "id": 2, + "filename": "NewMission.Stratis.pbo", + "mission_name": "NewMission.Stratis", + "terrain": "Stratis", + "file_size": 51200 + } +} +``` + +### DELETE /servers/{server_id}/missions/{mission_id} +Delete a mission PBO (removes file from disk). Admin only. + +### GET /servers/{server_id}/missions/rotation +Get current mission rotation (ordered list). + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "id": 1, + "sort_order": 0, + "mission": { "id": 1, "mission_name": "MyMission.Altis" }, + "difficulty": "Regular", + "params": { "RespawnDelay": 15 } + } + ] +} +``` + +### PUT /servers/{server_id}/missions/rotation +Replace the entire mission rotation. Admin only. + +```json +{ + "rotation": [ + { "mission_id": 1, "difficulty": "Regular", "params": {} }, + { "mission_id": 2, "difficulty": "Veteran", "params": { "RespawnDelay": 30 } } + ] +} +``` + +--- + +## Mod Endpoints + +### GET /mods +List all registered mods (global list). + +### POST /mods +Register a mod folder. Admin only. + +```json +{ + "name": "@CBA_A3", + "folder_path": "C:/Arma3Server/@CBA_A3", + "workshop_id": "450814997", + "description": "Community Base Addons" +} +``` + +### DELETE /mods/{mod_id} +Delete mod registration. Admin only. + +### GET /servers/{server_id}/mods +Get mods enabled for a server. + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "mod_id": 1, + "name": "@CBA_A3", + "folder_path": "C:/Arma3Server/@CBA_A3", + "is_server_mod": false, + "sort_order": 0 + } + ] +} +``` + +### PUT /servers/{server_id}/mods +Replace the mod list for a server. Admin only. 
+ +```json +{ + "mods": [ + { "mod_id": 1, "is_server_mod": false, "sort_order": 0 }, + { "mod_id": 2, "is_server_mod": true, "sort_order": 1 } + ] +} +``` + +--- + +## Player Endpoints + +### GET /servers/{server_id}/players +Get currently connected players. + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "player_num": 1, + "name": "PlayerOne", + "guid": "abc123...", + "ping": 45, + "verified": true, + "joined_at": "2026-04-16T10:15:00Z" + } + ] +} +``` + +### POST /servers/{server_id}/players/{player_num}/kick +Kick a player via RCon. Admin only. + +```json +{ "reason": "AFK" } +``` + +### POST /servers/{server_id}/players/{player_num}/ban +Ban a player via RCon. Admin only. + +```json +{ + "reason": "Griefing", + "duration_minutes": 0 +} +``` + +### GET /servers/{server_id}/players/history +Player connection history. Supports pagination. + +**Query params:** `?limit=50&offset=0&search=PlayerName` + +--- + +## Ban Endpoints + +### GET /servers/{server_id}/bans +List all bans for a server. + +**Query params:** `?active_only=true&limit=50&offset=0` + +### POST /servers/{server_id}/bans +Add ban manually. Admin only. + +```json +{ + "guid": "abc123...", + "steam_uid": "76561198...", + "name": "PlayerName", + "reason": "Cheating", + "duration_minutes": 0 +} +``` + +### DELETE /servers/{server_id}/bans/{ban_id} +Remove a ban. Admin only. + +--- + +## Log Endpoints + +### GET /servers/{server_id}/logs +Query stored log lines. + +**Query params:** `?limit=200&offset=0&level=error&since=2026-04-16T10:00:00Z&search=BattlEye` + +**Response 200:** +```json +{ + "success": true, + "data": { + "total": 1542, + "logs": [ + { + "id": 100, + "timestamp": "2026-04-16T10:05:23Z", + "level": "info", + "message": "Player PlayerOne connected" + } + ] + } +} +``` + +### DELETE /servers/{server_id}/logs +Clear all stored log lines for a server. Admin only. + +--- + +## Metrics Endpoints + +### GET /servers/{server_id}/metrics +Get time-series metrics. 
+ +**Query params:** `?from=2026-04-16T00:00:00Z&to=2026-04-16T23:59:59Z&resolution=5m` + +**Response 200:** +```json +{ + "success": true, + "data": [ + { + "timestamp": "2026-04-16T10:00:00Z", + "cpu_percent": 34.2, + "ram_mb": 1850.5, + "player_count": 15 + } + ] +} +``` + +--- + +## RCon Endpoints + +### POST /servers/{server_id}/rcon/command +Send a raw RCon/admin command. Admin only. + +```json +{ "command": "#restart" } +``` + +**Available commands:** +- `#restart` — Restart mission +- `#reassign` — Restart with roles unassigned +- `#missions` — Open mission selection +- `#lock` / `#unlock` — Lock/unlock server +- `#mission NAME.TERRAIN [difficulty]` — Load specific mission +- `#shutdown` — Shut down server +- `#monitor N` — Toggle performance monitoring +- `say -1 MESSAGE` — Message all players + +### POST /servers/{server_id}/rcon/say +Broadcast a message to all players. Admin only. + +```json +{ "message": "Server restarting in 5 minutes!" } +``` + +--- + +## Event Log Endpoints + +### GET /servers/{server_id}/events +Get server event history (audit trail). + +**Query params:** `?limit=50&offset=0&event_type=crashed` + +--- + +## System Endpoints + +### GET /system/status +Overall system status. **Requires authentication** (admin or viewer). + +```json +{ + "success": true, + "data": { + "version": "1.0.0", + "running_servers": 2, + "total_servers": 3, + "uptime_seconds": 86400 + } +} +``` + +### GET /system/health +Health check (for load balancer/Docker). Returns 200 if healthy. + +--- + +## WebSocket API + +### Connection +``` +ws://localhost:8000/ws/{server_id}?token={token} +``` +Use `server_id = "all"` to subscribe to events from all servers. + +### Client → Server Messages + +```json +{ "type": "ping" } +{ "type": "subscribe", "channels": ["log", "players", "metrics", "status"] } +{ "type": "unsubscribe", "channels": ["metrics"] } +``` + +**Channel subscription**: The `ConnectionManager` tracks per-connection channel subscriptions. 
Only messages matching subscribed channels are delivered. Default subscriptions on connect: `["status"]`. + +**Channel names match message types exactly:** `status`, `log`, `players`, `metrics`, `event`. Subscribe with channel names matching the `type` field in server→client messages. + +### Server → Client Messages + +#### Status Update +Sent when server status changes (starting → running → stopped, etc.) +```json +{ + "type": "status", + "server_id": 1, + "data": { + "status": "running", + "pid": 12345, + "started_at": "2026-04-16T10:00:00Z" + } +} +``` + +#### Log Line +Sent for each new RPT log line. +```json +{ + "type": "log", + "server_id": 1, + "data": { + "timestamp": "2026-04-16T10:05:23Z", + "level": "info", + "message": "BattlEye Server: Initialized (v1.240)" + } +} +``` + +#### Player List Update +Sent after each RCon poll (every 10s). +```json +{ + "type": "players", + "server_id": 1, + "data": { + "players": [ + { "player_num": 1, "name": "PlayerOne", "ping": 45, "verified": true } + ], + "count": 1 + } +} +``` + +#### Metrics Update +Sent every 5 seconds. +```json +{ + "type": "metrics", + "server_id": 1, + "data": { + "cpu_percent": 34.2, + "ram_mb": 1850.5, + "player_count": 1, + "timestamp": "2026-04-16T10:05:25Z" + } +} +``` + +#### Server Event +Sent for significant events (crash, restart, etc.) +```json +{ + "type": "event", + "server_id": 1, + "data": { + "event_type": "crashed", + "detail": { "exit_code": 1 }, + "timestamp": "2026-04-16T10:30:00Z" + } +} +``` + +#### Pong +```json +{ "type": "pong" } +``` + +--- + +## Rate Limiting + +- `POST /auth/login`: 5 attempts per minute per IP. Exceeded returns `429 Too Many Requests`. +- All other endpoints: 60 requests per minute per token. Exceeded returns `429`. +- Implemented via FastAPI middleware (e.g., `slowapi`). 
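
The limits above can be pictured with a small in-memory sliding-window counter. This is only a sketch of the idea under stated assumptions, not how `slowapi` is implemented and not the project's middleware: the class name `SlidingWindowLimiter`, the injectable `clock` parameter, and the two module-level instances are hypothetical, and the code is not thread-safe as written.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window` seconds for each key."""

    def __init__(self, limit, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock               # injectable so tests can control time
        self._hits = defaultdict(deque)  # key -> timestamps of accepted calls

    def allow(self, key):
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False                 # caller would answer 429 / RATE_LIMITED
        hits.append(now)
        return True


# Hypothetical instances mirroring the two limits listed above.
login_limiter = SlidingWindowLimiter(limit=5)   # keyed by client IP
api_limiter = SlidingWindowLimiter(limit=60)    # keyed by token
```

A middleware would call `login_limiter.allow(client_ip)` or `api_limiter.allow(token)` before dispatching the request and return `429` when the call yields `False`.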
+ +--- + +## Error Codes Reference + +| Code | Description | +|------|-------------| +| `UNAUTHORIZED` | Missing or invalid token | +| `FORBIDDEN` | Role insufficient | +| `NOT_FOUND` | Resource not found | +| `SERVER_ALREADY_RUNNING` | Start called on running server | +| `SERVER_NOT_RUNNING` | Stop/command on stopped server | +| `RCON_UNAVAILABLE` | RCon connection failed | +| `INVALID_CONFIG` | Config validation failed | +| `EXE_NOT_FOUND` | arma3server.exe not at configured path | +| `PORT_IN_USE` | Game port already occupied | +| `UPLOAD_FAILED` | Mission file upload error | +| `VALIDATION_ERROR` | Pydantic validation failure | +| `INTERNAL_ERROR` | Unexpected server error | +| `MOD_IN_USE` | Cannot delete mod — enabled on one or more servers | +| `MISSION_IN_ROTATION` | Cannot delete mission — in active rotation | +| `RATE_LIMITED` | Too many requests | diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md new file mode 100644 index 0000000..48f4dd3 --- /dev/null +++ b/ARCHITECTURE.md @@ -0,0 +1,309 @@ +# Languard Server Manager — System Architecture + +## Overview + +Languard is a web-based management panel for Arma 3 dedicated servers. It provides a Python backend that manages one or more `arma3server_x64.exe` processes, exposes a REST + WebSocket API to a React frontend, and persists all state in SQLite. 
+ +--- + +## Technology Stack + +| Layer | Technology | Rationale | +|-------|-----------|-----------| +| Backend framework | **FastAPI** (Python 3.11+) | Async-native, built-in WebSocket, OpenAPI docs auto-generated | +| Database | **SQLite** via `SQLAlchemy` (sync) | Zero-config, file-based, sufficient for single-host server manager; all access is synchronous (WAL mode for concurrent reads) | +| Process management | `subprocess` + `threading` | Wrap arma3server.exe, watch stdout/stderr, check exit codes; **cwd** set to server instance dir for relative paths; on Windows `terminate()` is a hard kill (no SIGTERM) | +| Real-time comms | **WebSocket** (FastAPI) | Push log lines, player lists, server status to React | +| RCon client | Custom UDP client | BattlEye RCon protocol for in-game admin commands | +| Config generation | Python structured builder | Generate server.cfg, basic.cfg, server.Arma3Profile with proper escaping (no f-string injection) | +| Scheduling | `APScheduler` (BackgroundScheduler) | Auto-restart, mission rotation timers, log/metrics cleanup (sync DB ops → BackgroundScheduler, not AsyncIOScheduler) | +| Auth | **JWT** (python-jose) + bcrypt | Secure the API; React stores token in localStorage | +| Frontend | React + TypeScript (external repo) | Connects to this backend's API | + +--- + +## High-Level Architecture + +``` +┌─────────────────────────────────────────────────────────────┐ +│ React Frontend │ +│ Server List │ Server Detail │ Logs │ Players │ Config UI │ +└────────────────────────┬────────────────────────────────────┘ + │ HTTP REST + WebSocket + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ FastAPI Application │ +│ │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ +│ │ Auth Router │ │ Server Router│ │ Config Router │ │ +│ └──────────────┘ └──────────────┘ └──────────────────┘ │ +│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │ +│ │Mission Router│ │ Mod Router │ │ WS Router │ │ +│ 
└──────────────┘ └──────────────┘ └──────────────────┘ │ +│ │ +│ ┌─────────────────────────────────────────────────────┐ │ +│ │ Service Layer │ │ +│ │ ServerService │ ConfigService │ RConService │ │ +│ │ LogService │ MetricsService│ MissionService │ │ +│ └─────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─────────────────────────────────────────────────────┐ │ +│ │ Thread Pool │ │ +│ │ ProcessMonitorThread (per server) │ │ +│ │ LogTailThread (per server) │ │ +│ │ MetricsCollectorThread (per server) │ │ +│ │ RConPollerThread (per server) │ │ +│ │ BroadcastThread (global) │ │ +│ └─────────────────────────────────────────────────────┘ │ +│ │ +│ ┌─────────────────────────────────────────────────────┐ │ +│ │ Data Access Layer (DAL) │ │ +│ │ ServerRepository │ PlayerRepository │ │ +│ │ LogRepository │ MetricsRepository │ │ +│ └─────────────────────────────────────────────────────┘ │ +│ │ +│ ┌───────────────────┐ ┌────────────────────────────────┐ │ +│ │ SQLite (DB) │ │ Filesystem │ │ +│ │ languard.db │ │ servers/{id}/server.cfg │ │ +│ │ │ │ servers/{id}/basic.cfg │ │ +│ │ │ │ servers/{id}/server/ │ │ ← profile dir (Arma3 -name=server) +│ │ │ │ server.Arma3Profile │ │ ← profile settings +│ │ │ │ arma3server_*.rpt │ │ ← RPT logs (tailable) +│ │ │ │ servers/{id}/battleye/ │ │ +│ │ │ │ beserver.cfg │ │ ← RCon config +│ │ │ │ servers/{id}/mpmissions/ │ │ +│ └───────────────────┘ └────────────────────────────────┘ │ +└─────────────────────────────────────────────────────────────┘ + │ subprocess + ▼ +┌─────────────────────────────────────────────────────────────┐ +│ Arma 3 Server Processes (OS level) │ +│ arma3server_x64.exe (port 2302) │ +│ arma3server_x64.exe (port 2402) │ +│ ... 
│ +└─────────────────────────────────────────────────────────────┘ +``` + +--- + +## Component Responsibilities + +### FastAPI Routers +- Validate input (Pydantic models) +- Call service layer +- Return JSON responses +- Handle WebSocket connections + +### Service Layer +- Orchestrate operations (start server = generate config + launch process + start threads) +- No direct DB access — delegates to repositories +- No direct process access — delegates to ProcessManager + +### ProcessManager +- Singleton that owns all subprocess handles +- Thread-safe dict: `{server_id: subprocess.Popen}` +- `start()` sets `cwd=servers/{server_id}/` so relative config paths resolve correctly +- On Windows: `terminate()` = `TerminateProcess` (hard kill), no graceful SIGTERM — graceful shutdown must go through RCon `#shutdown` first +- Provides: `start()`, `stop()`, `restart()`, `is_running()`, `send_command()` + +### Thread Pool (per running server) +| Thread | Interval | Purpose | +|--------|----------|---------| +| `ProcessMonitorThread` | 1s | Detect crash / unexpected exit; update DB status; trigger auto-restart | +| `LogTailThread` | 100ms | Read new lines from .rpt file; store in DB; push to WS clients | +| `MetricsCollectorThread` | 5s | Collect CPU%, RAM MB for the process via psutil; write to DB | +| `RConPollerThread` | 10s | Query connected players via BattlEye RCon; update DB player table | +| `BroadcastThread` | event-driven | Consume from internal queue; push JSON to all subscribed WS clients | + +### RCon Client +- UDP socket to BattlEye RCon port (configured in `beserver.cfg` inside the server's `battleye/` directory) +- Implements BE RCon protocol: login, keepalive, send command, parse response +- **Request multiplexer**: tracks pending requests by sequence byte, routes responses to the correct caller via `threading.Event` per request. Prevents response misrouting when RConPollerThread and API-request RConService calls share the same UDP socket. 
+- Used by: `RConPollerThread`, `RConService` (for admin commands from UI) + +### Config Generator +- Takes `ServerConfig` Pydantic model from DB +- Renders `server.cfg`, `basic.cfg`, `*.Arma3Profile` using a **structured builder** (NOT f-strings — prevents config injection) +- Escapes double quotes and newlines in all user-supplied string values +- Writes files to `servers/{server_id}/` directory +- `server.Arma3Profile` written to `servers/{server_id}/server/` (Arma 3 reads from the `-name` subdirectory) + +### SQLite DAL +- Sync reads/writes using SQLAlchemy Core (not ORM — simpler for this use case) +- Thread-safe via SQLAlchemy's connection pooling +- One `languard.db` file at project root +- **PRAGMA busy_timeout=5000** — prevents "database is locked" errors under concurrent thread writes +- Thread-local connections via `get_thread_db()` — one connection per background thread + +--- + +## Data Flow: Start Server + +``` +Frontend → POST /api/servers/{id}/start + → ServerService.start(server_id) + ├── Load ServerConfig from DB + ├── ConfigGenerator.write_configs(server_id, config) + │ ├── server.cfg → servers/{id}/server.cfg + │ ├── basic.cfg → servers/{id}/basic.cfg + │ ├── server.Arma3Profile → servers/{id}/server/server.Arma3Profile + │ └── beserver.cfg → servers/{id}/battleye/beserver.cfg + ├── ProcessManager.start(server_id, exe_path, args, cwd=servers/{id}/) + ├── DB: update server.status = "starting" + ├── Spawn ProcessMonitorThread(server_id) + ├── Spawn LogTailThread(server_id) — tails servers/{id}/server/arma3server_*.rpt + ├── Spawn MetricsCollectorThread(server_id) + ├── Spawn RConPollerThread(server_id) [after 30s delay for server startup] + └── BroadcastThread pushes status update to WS clients +``` + +## Data Flow: Real-time Logs + +``` +arma3server.exe writes servers/{id}/server/arma3server_*.rpt + → LogTailThread reads new lines (recursive glob for *.rpt in profile dir) + → LogRepository.insert(server_id, line, timestamp) + → 
→ BroadcastQueue.put({type: "log", server_id, line, timestamp}) + → BroadcastThread sends to all WS subscribers for this server + → React frontend appends to log viewer +``` + +## Data Flow: Player List + +``` +RConPollerThread (every 10s) + → RConClient.send("players") + → Parse response: [{id, name, guid, ping, verified}] + → PlayerRepository.upsert_all(server_id, players) + → BroadcastQueue.put({type: "players", server_id, players}) + → React frontend updates player list +``` + +--- + +## Security Model + +- All API routes (except `POST /api/auth/login`) require a valid **JWT Bearer token** +- JWT contains: `sub` (user ID), `username`, `role` (`admin` | `viewer`), `exp` +- `viewer` role: read-only (GET endpoints other than the admin-only config preview and download, plus WebSocket) +- `admin` role: all operations +- CORS configured to accept only the frontend origin +- Passwords hashed with **bcrypt** (cost factor 12) +- `serverCommandPassword` and `passwordAdmin` stored encrypted in SQLite (Fernet from the `cryptography` library, i.e. AES-128-CBC with HMAC-SHA256; key from env) +- **Port conflict validation** at server creation and start: checks game_port through game_port+4 (game, Steam query, Steam master, VON (reserved), RCon) against all existing servers +- **ban.txt sync**: bans table is source of truth for UI; on ban add/delete via API, also write to `battleye/ban.txt`; on startup, read `ban.txt` and upsert into DB. Without this sync, DB-only bans are not enforced by BattlEye. 
+- Generated config files containing passwords (server.cfg, beserver.cfg) have restrictive file permissions (0600 on Unix, restricted ACL on Windows) +- Input sanitization on all string fields before config generation — no shell injection or config directive injection + +--- + +## Configuration (Environment Variables) + +```env +LANGUARD_SECRET_KEY= +LANGUARD_ENCRYPTION_KEY= +LANGUARD_DB_PATH=./languard.db +LANGUARD_SERVERS_DIR=./servers +LANGUARD_ARMA_EXE=C:/Arma3Server/arma3server_x64.exe +LANGUARD_HOST=0.0.0.0 +LANGUARD_PORT=8000 +LANGUARD_CORS_ORIGINS=http://localhost:5173,http://localhost:3000 +LANGUARD_LOG_RETENTION_DAYS=7 +``` + +--- + +## Directory Layout + +``` +languard-server-manager/ +├── backend/ +│ ├── main.py # FastAPI app factory +│ ├── config.py # Settings from env +│ ├── database.py # SQLAlchemy engine + session +│ ├── auth/ +│ │ ├── router.py +│ │ ├── service.py +│ │ └── schemas.py +│ ├── servers/ +│ │ ├── router.py # REST endpoints for servers +│ │ ├── service.py # ServerService +│ │ ├── process_manager.py # ProcessManager singleton +│ │ ├── config_generator.py # server.cfg / basic.cfg / beserver.cfg writer +│ │ └── schemas.py # Pydantic schemas +│ ├── rcon/ +│ │ ├── client.py # BattlEye RCon UDP client +│ │ └── service.py # RConService +│ ├── players/ +│ │ ├── router.py +│ │ ├── service.py +│ │ └── schemas.py +│ ├── missions/ +│ │ ├── router.py +│ │ └── service.py +│ ├── mods/ +│ │ ├── router.py +│ │ └── service.py +│ ├── logs/ +│ │ ├── router.py +│ │ └── service.py +│ ├── metrics/ +│ │ ├── router.py +│ │ └── service.py +│ ├── websocket/ +│ │ ├── router.py # WS connection handler +│ │ ├── manager.py # ConnectionManager (per-server subscriptions) +│ │ └── broadcaster.py # BroadcastThread + queue +│ ├── threads/ +│ │ ├── process_monitor.py # ProcessMonitorThread +│ │ ├── log_tail.py # LogTailThread +│ │ ├── metrics_collector.py # MetricsCollectorThread +│ │ └── rcon_poller.py # RConPollerThread +│ ├── system/ +│ │ └── router.py # GET 
/system/status, GET /system/health +│ ├── dal/ +│ │ ├── server_repository.py +│ │ ├── config_repository.py +│ │ ├── player_repository.py +│ │ ├── log_repository.py +│ │ ├── metrics_repository.py +│ │ ├── mission_repository.py +│ │ ├── mod_repository.py +│ │ ├── ban_repository.py +│ │ └── event_repository.py +│ └── migrations/ +│ └── 001_initial_schema.sql +├── servers/ # Runtime data per server instance +│ └── {server_id}/ +│ ├── server.cfg +│ ├── basic.cfg +│ ├── server/ # Arma 3 profile dir (matches -name=server) +│ │ ├── server.Arma3Profile +│ │ └── arma3server_*.rpt # Timestamped RPT logs +│ ├── battleye/ +│ │ └── beserver.cfg # BattlEye RCon config (generated on start) +│ └── mpmissions/ +├── frontend/ # React app (separate repo or subfolder) +├── requirements.txt +├── .env.example +├── ARCHITECTURE.md +├── DATABASE.md +├── API.md +├── MODULES.md +├── THREADING.md +└── IMPLEMENTATION_PLAN.md +``` + +--- + +## Key Design Decisions + +| Decision | Choice | Reason | +|----------|--------|--------| +| Sync vs async DB | **Sync SQLAlchemy only** | All DB access is synchronous; background threads are non-async; `get_thread_db()` provides thread-local connections; no aiosqlite dependency | +| ORM vs Core | **SQLAlchemy Core** | Simpler SQL control, less magic for embedded use case | +| WebSocket auth | JWT in query param on connect | Browser WS API doesn't support headers; query param `?token=...` | +| Process ownership | **ProcessManager singleton** | Single source of truth; prevents duplicate launches | +| Log storage | **DB + rolling file** | DB for fast queries/streaming; raw .rpt preserved on disk | +| Config files | **Regenerate on each start** | Always fresh from DB; no sync drift between DB and filesystem; **structured builder** (not f-strings) prevents config injection | +| RCon port convention | **User-configurable** | BattlEye RCon port is set in `beserver.cfg` (inside `battleye/` dir). Default suggestion: game port + 4 (e.g., 2302 → 2306). 
Must not conflict with the game (2302), Steam query (2303), Steam master (2304), or VON (2305, reserved) ports. **Note:** RCon config changes require server restart — BattlEye reads beserver.cfg only at startup. | diff --git a/DATABASE.md b/DATABASE.md new file mode 100644 index 0000000..145e558 --- /dev/null +++ b/DATABASE.md @@ -0,0 +1,586 @@ +# Languard Server Manager — Database Design + +## Engine +- **SQLite** via `SQLAlchemy Core` (sync for all access — routes and threads) +- File: `languard.db` at project root (configurable via `LANGUARD_DB_PATH`) +- WAL mode enabled: `PRAGMA journal_mode=WAL` — allows concurrent reads during writes +- Foreign keys enabled: `PRAGMA foreign_keys=ON` +- Busy timeout: `PRAGMA busy_timeout=5000` — prevents "database is locked" errors under concurrent thread writes + +--- + +## Schema + +### Table: `users` + +Stores web UI user accounts (admin and viewer). + +```sql +CREATE TABLE users ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + username TEXT NOT NULL UNIQUE, + password_hash TEXT NOT NULL, -- bcrypt hash + role TEXT NOT NULL DEFAULT 'viewer' CHECK (role IN ('admin', 'viewer')), -- column-level CHECK; SQLite rejects table constraints placed before later columns + created_at TEXT NOT NULL DEFAULT (datetime('now')), + last_login TEXT +); +``` + +--- + +### Table: `servers` + +One row per managed Arma 3 server instance. 
+ +```sql +CREATE TABLE servers ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + name TEXT NOT NULL, -- display name in UI + description TEXT, + status TEXT NOT NULL DEFAULT 'stopped', + -- status values: 'stopped' | 'starting' | 'running' | 'stopping' | 'crashed' | 'error' + + -- Process info + pid INTEGER, -- OS process ID when running + exe_path TEXT NOT NULL, -- path to arma3server_x64.exe + started_at TEXT, -- ISO datetime + stopped_at TEXT, + + -- Network + game_port INTEGER NOT NULL DEFAULT 2302, + rcon_port INTEGER NOT NULL DEFAULT 2306, -- user-configurable; written to battleye/beserver.cfg + steam_query_port INTEGER GENERATED ALWAYS AS (game_port + 1) VIRTUAL, -- convention, not enforced by engine + + -- Auto-management + auto_restart INTEGER NOT NULL DEFAULT 0, -- 1 = restart on crash + max_restarts INTEGER NOT NULL DEFAULT 3, -- within restart_window_seconds + restart_window_seconds INTEGER NOT NULL DEFAULT 300, + restart_count INTEGER NOT NULL DEFAULT 0, + last_restart_at TEXT, + + created_at TEXT NOT NULL DEFAULT (datetime('now')), + updated_at TEXT NOT NULL DEFAULT (datetime('now')), + + -- Table constraints (SQLite requires these after all column definitions) + CHECK (status IN ('stopped', 'starting', 'running', 'stopping', 'crashed', 'error')), + CHECK (game_port BETWEEN 1024 AND 65535), + CHECK (rcon_port BETWEEN 1024 AND 65535) +); + +CREATE INDEX idx_servers_status ON servers(status); +CREATE INDEX idx_servers_game_port ON servers(game_port); +CREATE INDEX idx_servers_rcon_port ON servers(rcon_port); +``` + +--- + +### Table: `server_configs` + +Stores all parameters for generating `server.cfg`. One row per server. 
+ +```sql +CREATE TABLE server_configs ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE, + + -- Basic identity + hostname TEXT NOT NULL DEFAULT 'My Arma 3 Server', + password TEXT, -- join password (encrypted at app layer via Fernet) + password_admin TEXT NOT NULL, -- encrypted (no default — must be set on creation) + server_command_password TEXT, -- encrypted + + -- Players + max_players INTEGER NOT NULL DEFAULT 40, + kick_duplicate INTEGER NOT NULL DEFAULT 1, + persistent INTEGER NOT NULL DEFAULT 1, + + -- Voting + vote_threshold REAL NOT NULL DEFAULT 0.33, + vote_mission_players INTEGER NOT NULL DEFAULT 1, + vote_timeout INTEGER NOT NULL DEFAULT 60, -- seconds + role_timeout INTEGER NOT NULL DEFAULT 90, + briefing_timeout INTEGER NOT NULL DEFAULT 60, + debriefing_timeout INTEGER NOT NULL DEFAULT 45, + lobby_idle_timeout INTEGER NOT NULL DEFAULT 300, + + -- Voice + disable_von INTEGER NOT NULL DEFAULT 0, + von_codec INTEGER NOT NULL DEFAULT 1 CHECK (von_codec IN (0, 1)), -- 1 = OPUS + von_codec_quality INTEGER NOT NULL DEFAULT 20, -- 1-30 + + -- Network quality kick thresholds + max_ping INTEGER NOT NULL DEFAULT 250, + max_packet_loss INTEGER NOT NULL DEFAULT 50, + max_desync INTEGER NOT NULL DEFAULT 200, + disconnect_timeout INTEGER NOT NULL DEFAULT 15, + kick_on_ping INTEGER NOT NULL DEFAULT 1, + kick_on_packet_loss INTEGER NOT NULL DEFAULT 1, + kick_on_desync INTEGER NOT NULL DEFAULT 1, + kick_on_timeout INTEGER NOT NULL DEFAULT 1, + + -- Security + battleye INTEGER NOT NULL DEFAULT 1, + verify_signatures INTEGER NOT NULL DEFAULT 2, -- 0 | 1 | 2 (1 = check but don't kick) + allowed_file_patching INTEGER NOT NULL DEFAULT 0, -- 0 | 1 | 2 + + -- Difficulty + forced_difficulty TEXT NOT NULL DEFAULT 'Regular', -- Recruit | Regular | Veteran | Custom + + -- Misc + timestamp_format TEXT NOT NULL DEFAULT 'short', -- none | short | full + auto_select_mission INTEGER NOT NULL DEFAULT 0, + 
random_mission_order INTEGER NOT NULL DEFAULT 0, + missions_to_restart INTEGER NOT NULL DEFAULT 0, + missions_to_shutdown INTEGER NOT NULL DEFAULT 0, + log_file TEXT NOT NULL DEFAULT 'server_console.log', + skip_lobby INTEGER NOT NULL DEFAULT 0, + drawing_in_map INTEGER NOT NULL DEFAULT 1, + upnp INTEGER NOT NULL DEFAULT 0, + loopback INTEGER NOT NULL DEFAULT 0, + statistics_enabled INTEGER NOT NULL DEFAULT 1, + force_rotor_lib INTEGER NOT NULL DEFAULT 0 CHECK (force_rotor_lib IN (0, 1, 2)), -- 0=player, 1=AFM, 2=SFM + required_build INTEGER NOT NULL DEFAULT 0, + steam_protocol_max_data_size INTEGER NOT NULL DEFAULT 1024, + + -- MOTD + motd_lines TEXT NOT NULL DEFAULT '[]', -- JSON array of strings + motd_interval REAL NOT NULL DEFAULT 5.0, + + -- Event scripts + on_user_connected TEXT DEFAULT '', + on_user_disconnected TEXT DEFAULT '', + on_unsigned_data TEXT DEFAULT 'kick (_this select 0)', + on_hacked_data TEXT DEFAULT 'kick (_this select 0)', + double_id_detected TEXT DEFAULT '', + + -- Headless clients (JSON arrays) + headless_clients TEXT NOT NULL DEFAULT '[]', -- e.g. 
'["127.0.0.1"]' + local_clients TEXT NOT NULL DEFAULT '[]', + + -- Admin UIDs whitelist + admin_uids TEXT NOT NULL DEFAULT '[]', -- JSON array of Steam UIDs + + -- File extension whitelists (JSON arrays) + allowed_load_extensions TEXT NOT NULL DEFAULT '["hpp","sqs","sqf","fsm","cpp","paa","txt","xml","inc","ext","sqm","ods","fxy","lip","csv","kb","bik","bikb","html","htm","biedi"]', + allowed_preprocess_extensions TEXT NOT NULL DEFAULT '["hpp","sqs","sqf","fsm","cpp","paa","txt","xml","inc","ext","sqm","ods","fxy","lip","csv","kb","bik","bikb","html","htm","biedi"]', + allowed_html_extensions TEXT NOT NULL DEFAULT '["htm","html","xml","txt"]', + + updated_at TEXT NOT NULL DEFAULT (datetime('now')), + + CHECK (verify_signatures IN (0, 1, 2)), + CHECK (allowed_file_patching IN (0, 1, 2)), + CHECK (von_codec_quality BETWEEN 1 AND 30), + CHECK (forced_difficulty IN ('Recruit', 'Regular', 'Veteran', 'Custom')), + CHECK (vote_threshold >= 0.0 AND vote_threshold <= 1.0), + CHECK (max_players > 0) +); +``` + +--- + +### Table: `basic_configs` + +Stores `basic.cfg` (bandwidth) settings. One row per server. + +```sql +CREATE TABLE basic_configs ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE, + + min_bandwidth INTEGER NOT NULL DEFAULT 800000, + max_bandwidth INTEGER NOT NULL DEFAULT 25000000, + max_msg_send INTEGER NOT NULL DEFAULT 384, -- default 128; higher = desync risk + max_size_guaranteed INTEGER NOT NULL DEFAULT 512, + max_size_non_guaranteed INTEGER NOT NULL DEFAULT 256, + min_error_to_send REAL NOT NULL DEFAULT 0.003, + max_custom_file_size INTEGER NOT NULL DEFAULT 100000, + + updated_at TEXT NOT NULL DEFAULT (datetime('now')) +); +``` + +--- + +### Table: `server_profiles` + +Stores `server.Arma3Profile` difficulty settings. One row per server. 
+ +```sql +CREATE TABLE server_profiles ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE, + + -- Custom difficulty options (all 0/1 or 0/1/2) + reduced_damage INTEGER NOT NULL DEFAULT 0, + group_indicators INTEGER NOT NULL DEFAULT 0, + friendly_tags INTEGER NOT NULL DEFAULT 0, + enemy_tags INTEGER NOT NULL DEFAULT 0, + detected_mines INTEGER NOT NULL DEFAULT 0, + commands INTEGER NOT NULL DEFAULT 1, + waypoints INTEGER NOT NULL DEFAULT 1, + tactical_ping INTEGER NOT NULL DEFAULT 0, + weapon_info INTEGER NOT NULL DEFAULT 2, + stance_indicator INTEGER NOT NULL DEFAULT 2, + stamina_bar INTEGER NOT NULL DEFAULT 0, + weapon_crosshair INTEGER NOT NULL DEFAULT 0, + vision_aid INTEGER NOT NULL DEFAULT 0, + third_person_view INTEGER NOT NULL DEFAULT 0, + camera_shake INTEGER NOT NULL DEFAULT 1, + score_table INTEGER NOT NULL DEFAULT 1, + death_messages INTEGER NOT NULL DEFAULT 1, + von_id INTEGER NOT NULL DEFAULT 1, + map_content_friendly INTEGER NOT NULL DEFAULT 0, + map_content_enemy INTEGER NOT NULL DEFAULT 0, + map_content_mines INTEGER NOT NULL DEFAULT 0, + auto_report INTEGER NOT NULL DEFAULT 0, + multiple_saves INTEGER NOT NULL DEFAULT 0, + + -- AI level + ai_level_preset INTEGER NOT NULL DEFAULT 3, -- 0=Low,1=Normal,2=High,3=Custom + skill_ai REAL NOT NULL DEFAULT 0.5, + precision_ai REAL NOT NULL DEFAULT 0.5, + + updated_at TEXT NOT NULL DEFAULT (datetime('now')), + + -- Table constraints (SQLite requires these after all column definitions) + CHECK (ai_level_preset BETWEEN 0 AND 3), + CHECK (skill_ai BETWEEN 0.0 AND 1.0), + CHECK (precision_ai BETWEEN 0.0 AND 1.0), + CHECK (group_indicators BETWEEN 0 AND 2), + CHECK (weapon_info BETWEEN 0 AND 2), + CHECK (stance_indicator BETWEEN 0 AND 2) +); +``` + +--- + +### Table: `launch_params` + +Extra command-line parameters added to the server launch command. One row per server. 
+ +```sql +CREATE TABLE launch_params ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE, + + world TEXT NOT NULL DEFAULT 'empty', + extra_params TEXT NOT NULL DEFAULT '', -- raw extra params string + limit_fps INTEGER NOT NULL DEFAULT 50, + auto_init INTEGER NOT NULL DEFAULT 0, + load_mission_to_memory INTEGER NOT NULL DEFAULT 0, + bandwidth_alg INTEGER CHECK (bandwidth_alg IS NULL OR bandwidth_alg = 2), -- NULL | 2 + enable_ht INTEGER NOT NULL DEFAULT 0, + huge_pages INTEGER NOT NULL DEFAULT 0, + cpu_count INTEGER, -- NULL = auto + ex_threads INTEGER NOT NULL DEFAULT 7, + max_mem INTEGER, -- NULL = auto + no_logs INTEGER NOT NULL DEFAULT 0, + netlog INTEGER NOT NULL DEFAULT 0, + + updated_at TEXT NOT NULL DEFAULT (datetime('now')) +); +``` + +--- + +### Table: `mods` + +Registered mods. Many-to-many with servers. + +```sql +CREATE TABLE mods ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + name TEXT NOT NULL, + folder_path TEXT NOT NULL UNIQUE, -- absolute or relative path + workshop_id TEXT, -- Steam Workshop ID if applicable + description TEXT, + created_at TEXT NOT NULL DEFAULT (datetime('now')) +); + +CREATE TABLE server_mods ( + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + mod_id INTEGER NOT NULL REFERENCES mods(id) ON DELETE CASCADE, + is_server_mod INTEGER NOT NULL DEFAULT 0, -- 1 = -serverMod (not broadcast to clients) + sort_order INTEGER NOT NULL DEFAULT 0, + PRIMARY KEY (server_id, mod_id) +); + +CREATE INDEX idx_server_mods_server ON server_mods(server_id); +``` + +--- + +### Table: `missions` + +Mission PBO files tracked per server. + +```sql +CREATE TABLE missions ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + filename TEXT NOT NULL, -- e.g. "MyMission.Altis.pbo" + mission_name TEXT NOT NULL, -- e.g. "MyMission.Altis" + terrain TEXT NOT NULL, -- e.g. 
"Altis" + file_size INTEGER, -- bytes + uploaded_at TEXT NOT NULL DEFAULT (datetime('now')), + UNIQUE (server_id, filename) +); + +CREATE INDEX idx_missions_server ON missions(server_id); +``` + +--- + +### Table: `mission_rotation` + +Ordered mission cycle for a server. + +```sql +CREATE TABLE mission_rotation ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + mission_id INTEGER NOT NULL REFERENCES missions(id) ON DELETE CASCADE, + sort_order INTEGER NOT NULL DEFAULT 0, + difficulty TEXT NOT NULL DEFAULT 'Regular', + CHECK (difficulty IN ('Recruit', 'Regular', 'Veteran', 'Custom')), + params_json TEXT NOT NULL DEFAULT '{}', -- mission params override as JSON + UNIQUE (server_id, sort_order) +); + +CREATE INDEX idx_mission_rotation_server ON mission_rotation(server_id); +``` + +--- + +### Table: `players` + +Currently connected players (live state, refreshed by RConPollerThread). + +```sql +CREATE TABLE players ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + player_num INTEGER NOT NULL, -- BE player# (slot number) + name TEXT NOT NULL, + guid TEXT, -- BattlEye GUID + steam_uid TEXT, + ip TEXT, + ping INTEGER, + verified INTEGER NOT NULL DEFAULT 0, -- 1 = signature verified + joined_at TEXT NOT NULL DEFAULT (datetime('now')), + updated_at TEXT NOT NULL DEFAULT (datetime('now')), + UNIQUE (server_id, player_num) +); + +CREATE INDEX idx_players_server ON players(server_id); +``` + +--- + +### Table: `player_history` + +Historical record of connections. Inserted when player disconnects. 
+ +```sql +CREATE TABLE player_history ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + name TEXT NOT NULL, + guid TEXT, + steam_uid TEXT, + ip TEXT, + joined_at TEXT NOT NULL, + left_at TEXT NOT NULL DEFAULT (datetime('now')), + session_duration_seconds INTEGER +); + +CREATE INDEX idx_player_history_server ON player_history(server_id); +CREATE INDEX idx_player_history_steam ON player_history(steam_uid); +``` + +### Player History Retention Cleanup (run daily via APScheduler, keep 90 days) +```sql +DELETE FROM player_history +WHERE left_at < datetime('now', '-90 days'); +``` + +--- + +### Table: `bans` + +Local ban records (source of truth for the UI). **Must sync bidirectionally with `battleye/ban.txt`** — BattlEye reads only from ban.txt. On API ban add/delete: also write to ban.txt. On startup: read ban.txt and upsert into DB. + +**ban.txt format** (one entry per line): +``` +GUID|IP timestamp reason +``` +Example: `a1b2c3d4e5f6|192.168.1.1 1713260000 Cheating` + +**Sync caveats:** ban.txt does not store `banned_by`, `expires_at`, or `is_active`. Timed bans are represented by a future timestamp (not minutes); permanent bans have timestamp `0`. On startup, `banned_by` is set to `'ban.txt'` for entries read from file. Deactivated bans (`is_active=0`) are not written to ban.txt. 
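The startup half of the ban.txt sync described above can be sketched as follows. This is a minimal illustration, not the project's implementation: `BanEntry` and `parse_ban_line` are hypothetical names, and the sketch assumes the first token is literally `GUID|IP` (with the `|IP` part optional), as in the example line.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class BanEntry:
    """Hypothetical in-memory form of one ban.txt line, ready for a DB upsert."""
    guid: str
    ip: Optional[str]
    expires_at: Optional[str]  # ISO datetime in UTC; None means permanent
    reason: str
    banned_by: str = "ban.txt"  # the file does not record who issued the ban


def parse_ban_line(line: str) -> Optional[BanEntry]:
    """Parse one `GUID|IP timestamp reason` line; return None if it is malformed."""
    parts = line.strip().split(None, 2)  # the reason may itself contain spaces
    if len(parts) < 2:
        return None
    ident, raw_ts = parts[0], parts[1]
    reason = parts[2] if len(parts) == 3 else ""
    try:
        ts = int(raw_ts)
    except ValueError:
        return None
    guid, _, ip = ident.partition("|")
    # Timestamp 0 marks a permanent ban; anything else is the expiry time.
    expires_at = None if ts == 0 else datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
    return BanEntry(guid=guid, ip=ip or None, expires_at=expires_at, reason=reason)
```

Returning `None` for malformed lines lets the caller skip and log them rather than aborting the whole import.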
+ +```sql +CREATE TABLE bans ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + guid TEXT, + steam_uid TEXT, + name TEXT, + reason TEXT, + banned_by TEXT, -- admin username, or 'ban.txt' for entries imported from file + banned_at TEXT NOT NULL DEFAULT (datetime('now')), + expires_at TEXT, -- NULL = permanent + is_active INTEGER NOT NULL DEFAULT 1, + CHECK (is_active IN (0, 1)) +); + +CREATE INDEX idx_bans_server ON bans(server_id); +CREATE INDEX idx_bans_guid ON bans(guid); +CREATE INDEX idx_bans_steam_uid ON bans(steam_uid); +CREATE INDEX idx_bans_active ON bans(is_active); +``` + +--- + +### Table: `logs` + +Parsed RPT log lines (rolling retention, default 7 days). + +```sql +CREATE TABLE logs ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + timestamp TEXT NOT NULL, + level TEXT NOT NULL DEFAULT 'info' CHECK (level IN ('info', 'warning', 'error')), + message TEXT NOT NULL, + created_at TEXT NOT NULL DEFAULT (datetime('now')) +); + +CREATE INDEX idx_logs_server_ts ON logs(server_id, timestamp); +CREATE INDEX idx_logs_level ON logs(level); -- for ?level= filter +CREATE INDEX idx_logs_created ON logs(created_at); -- for retention cleanup +``` + +--- + +### Table: `metrics` + +Time-series CPU/RAM/player count snapshots. + +```sql +CREATE TABLE metrics ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + timestamp TEXT NOT NULL DEFAULT (datetime('now')), + cpu_percent REAL, + ram_mb REAL, + player_count INTEGER +); + +CREATE INDEX idx_metrics_server_ts ON metrics(server_id, timestamp); +``` + +--- + +### Table: `server_events` + +Audit trail of all significant events (start, stop, crash, restart, admin actions). 
+ +```sql +CREATE TABLE server_events ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE, + event_type TEXT NOT NULL, + -- event_type values: + -- 'started' | 'stopped' | 'crashed' | 'restarted' | 'config_updated' + -- 'player_kicked' | 'player_banned' | 'mission_changed' | 'admin_login' + -- 'rcon_command' | 'auto_restarted' + actor TEXT, -- username or 'system' + detail TEXT, -- JSON with event-specific data + created_at TEXT NOT NULL DEFAULT (datetime('now')) +); + +CREATE INDEX idx_events_server ON server_events(server_id, created_at); +``` + +--- + +### Table: `rcon_configs` + +BattlEye RCon credentials per server. + +```sql +CREATE TABLE rcon_configs ( + id INTEGER PRIMARY KEY AUTOINCREMENT, + server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE, + rcon_password TEXT NOT NULL, -- encrypted at app layer + max_ping INTEGER NOT NULL DEFAULT 200, + enabled INTEGER NOT NULL DEFAULT 1, + updated_at TEXT NOT NULL DEFAULT (datetime('now')) +); +``` + +--- + +## Relationships Diagram + +``` +users (1) ──────────────────────────────────── (many) server_events.actor + +servers (1) ──┬── (1) server_configs + ├── (1) basic_configs + ├── (1) server_profiles + ├── (1) launch_params + ├── (1) rcon_configs + ├── (many) server_mods ──── (many) mods + ├── (many) missions + ├── (many) mission_rotation → missions + ├── (many) players + ├── (many) player_history + ├── (many) bans + ├── (many) logs + ├── (many) metrics + └── (many) server_events +``` + +--- + +## Maintenance Queries + +### Log Retention Cleanup (run daily via APScheduler) +```sql +DELETE FROM logs +WHERE created_at < datetime('now', '-7 days'); +``` + +### Metrics Retention Cleanup (keep 30 days) +```sql +DELETE FROM metrics +WHERE timestamp < datetime('now', '-30 days'); +``` + +### Clear disconnected players on server stop +```sql +DELETE FROM players WHERE server_id = ?; +``` + +### Vacuum (run weekly) +```sql +VACUUM; +``` + +--- 
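The retention queries in this document (logs 7 days, metrics 30 days, player history 90 days) share one shape, so a scheduled job could drive them from a single table. The sketch below uses the standard-library `sqlite3` module for brevity rather than SQLAlchemy Core; `RETENTION` and `run_retention` are hypothetical names.

```python
import sqlite3

# table -> (timestamp column, SQLite datetime modifier), taken from the queries above
RETENTION = {
    "logs": ("created_at", "-7 days"),
    "metrics": ("timestamp", "-30 days"),
    "player_history": ("left_at", "-90 days"),
}


def run_retention(conn: sqlite3.Connection) -> dict:
    """Delete expired rows from each table and return {table: rows_deleted}."""
    deleted = {}
    for table, (column, modifier) in RETENTION.items():
        # Table and column names come from the constant above, never from user
        # input; only the modifier is passed as a bound parameter.
        cur = conn.execute(
            f"DELETE FROM {table} WHERE {column} < datetime('now', ?)",
            (modifier,),
        )
        deleted[table] = cur.rowcount
    conn.commit()
    return deleted
```

Returning the per-table counts gives the scheduler something to log, which helps spot a cleanup job that has silently stopped matching rows.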
+ +## Migration Strategy + +- Migrations are plain `.sql` files in `backend/migrations/` +- Naming: `001_initial_schema.sql`, `002_add_bans.sql`, etc. +- Tracked in a `schema_migrations` table: + ```sql + CREATE TABLE schema_migrations ( + version INTEGER PRIMARY KEY, + applied_at TEXT NOT NULL DEFAULT (datetime('now')) + ); + ``` +- Applied automatically at app startup by `database.py:run_migrations()` diff --git a/IMPLEMENTATION_PLAN.md b/IMPLEMENTATION_PLAN.md new file mode 100644 index 0000000..598e653 --- /dev/null +++ b/IMPLEMENTATION_PLAN.md @@ -0,0 +1,445 @@ +# Languard Server Manager — Implementation Plan + +## Prerequisites + +Before starting, ensure the following are available: +- Python 3.11+ +- A working Arma 3 dedicated server installation (for testing) +- Node.js 18+ (for frontend dev server) +- The reference docs: ARCHITECTURE.md, DATABASE.md, API.md, MODULES.md, THREADING.md + +--- + +## Phase 1 — Foundation (Start Here) + +**Goal:** Running FastAPI server with DB, auth, and basic server CRUD. + +### Step 1.1 — Project scaffold + +``` +mkdir backend +cd backend +python -m venv venv +venv/Scripts/activate +pip install fastapi uvicorn[standard] sqlalchemy python-jose[cryptography] passlib[bcrypt] cryptography psutil apscheduler python-multipart slowapi pytest pytest-asyncio httpx +# uvloop (faster event loop) is Linux/macOS only — skip on Windows: +# pip install uvloop # only on Linux/macOS +pip freeze > requirements.txt +``` + +Create: +- `backend/config.py` — Settings class (see MODULES.md) +- `backend/main.py` — FastAPI app factory, startup/shutdown hooks +- `backend/conftest.py` — pytest fixtures (in-memory SQLite, test client) +- `.env.example` — All env vars documented + +### Step 1.2 — Database + Migrations + +1. Create `backend/migrations/001_initial_schema.sql` — all tables from DATABASE.md + - Include all CHECK constraints (role, status, verify_signatures, von_codec_quality, etc.) 
+ - Include `PRAGMA busy_timeout=5000` in engine setup + - **Important:** Put `CREATE TABLE IF NOT EXISTS schema_migrations` as the very first + statement — the migration runner queries this table before it can track anything. +2. Create `backend/dal/event_repository.py` — `ServerEventRepository` (needed by Phase 3 threads) +3. Create `backend/database.py`: + - `get_engine()` with WAL, foreign-key, and busy-timeout pragmas + - `run_migrations()` — reads and applies `.sql` files from migrations/ + - `get_db()` — FastAPI dependency (sync session) + - `get_thread_db()` — thread-local session factory +4. Call `run_migrations()` in `main.py:on_startup()` + +**Test:** Start app, confirm `languard.db` created with all tables. Run `pytest` with in-memory SQLite to verify schema creates cleanly. + +### Step 1.3 — Auth module + +1. `backend/auth/utils.py` — `hash_password`, `verify_password`, `create_access_token`, `decode_access_token` +2. `backend/auth/schemas.py` — `LoginRequest`, `TokenResponse`, `UserResponse` +3. `backend/auth/service.py` — `AuthService` (create user, login, list users) +4. `backend/auth/router.py` — login, me, users CRUD +5. `backend/dependencies.py` — `get_current_user`, `require_admin` +6. `main.py` — seed default admin user on first startup if users table empty + - **Generate a random password** and print it to stdout once (NOT admin/admin) + - Add rate limiting to `POST /auth/login` (5 attempts/minute per IP via slowapi) + - Add input sanitization for all string fields in auth schemas + +**Test:** `POST /api/auth/login` returns JWT. `GET /api/auth/me` with token returns user. Rate limiting returns 429 after 5 failed attempts. + +### Step 1.4 — Server CRUD (no process management yet) + +1. `backend/dal/server_repository.py` +2. `backend/dal/config_repository.py` +3. `backend/servers/schemas.py` +4. `backend/servers/router.py` — GET, POST, PUT, DELETE /servers and /servers/{id} +5. `backend/servers/service.py` — CRUD methods only (skip start/stop for now) +6. 
`backend/utils/file_utils.py` — `ensure_server_dirs()`, `sanitize_filename()` +7. `backend/utils/port_checker.py` — `is_port_in_use()`, `check_server_ports_available()` +8. Port validation on create/start: check game_port through game_port+4 + +**Test:** Create server via API, confirm DB row + directory created. + +--- + +## Phase 2 — Process Management + +**Goal:** Start/stop actual `arma3server.exe` processes. + +### Step 2.1 — Config Generator + +1. `backend/servers/config_generator.py` +2. **Use a structured builder** (NOT f-strings) — escape double quotes and newlines in all user-supplied string values to prevent config injection +3. Write `server.cfg` covering all params from DATABASE.md, including mission rotation as `class Missions {}` block +4. Write `basic.cfg` +5. Write `server.Arma3Profile` — **written to `servers/{id}/server/server.Arma3Profile`** (Arma 3 reads from the `-name` subdirectory) +6. Write `BESERVER_CFG_TEMPLATE` — **required for BattlEye RCon to work** + ``` + # servers/{id}/battleye/beserver.cfg + RConPassword {rcon_password} + RConPort {rcon_port} + ``` + `write_beserver_cfg()` must create the `battleye/` directory and write this file. + Without it BattlEye will not open an RCon port regardless of launch parameters. +7. `build_launch_args()` — assembles full CLI arg list + - Include `-bepath=./battleye` to point BE at the generated config (relative to cwd) + - Include `-profiles=./` and `-name=server` for profile directory + - All relative paths resolve against `cwd=servers/{id}/` set in ProcessManager +8. Set file permissions 0600 on config files containing passwords (server.cfg, beserver.cfg) + +**Test:** `ConfigGenerator.write_all(server_id)` → inspect all generated files for correctness. +Verify `servers/{id}/battleye/beserver.cfg` exists with the correct RCon password. +Verify `servers/{id}/server/server.Arma3Profile` exists. 
+Test config injection prevention: set hostname to `X"; passwordAdmin = "pwned"; //` — verify generated server.cfg does NOT contain the injected directive. +Validate generated `server.cfg` manually by running the server with it. + +### Step 2.2 — Process Manager + +1. `backend/servers/process_manager.py` — `ProcessManager` singleton +2. `start(server_id, exe_path, args, cwd=servers/{id}/)` — subprocess.Popen with cwd set to server instance dir +3. `stop(server_id, timeout=30)` — on Windows: `terminate()` = hard kill (no SIGTERM). Graceful shutdown is via RCon `#shutdown` in ServerService. +4. `kill()`, `is_running()`, `get_pid()` +5. `recover_on_startup()` — verify PID is alive AND process name matches arma3server (prevents PID reuse) +6. Wire `ServerService.start()` and `ServerService.stop()` +7. Add `POST /servers/{id}/start`, `POST /servers/{id}/stop`, `POST /servers/{id}/kill` endpoints + +**Test:** Start a server via API → confirm process appears in Task Manager. Stop it → confirm process ends. + +### Step 2.3 — Config endpoints + +1. `GET /servers/{id}/config` +2. `PUT /servers/{id}/config/server` +3. `PUT /servers/{id}/config/basic` +4. `PUT /servers/{id}/config/profile` +5. `PUT /servers/{id}/config/launch` +6. `GET /servers/{id}/config/preview` + +**Test:** Update hostname via API → regenerate and start server → confirm new hostname appears in server browser. + +--- + +## Phase 3 — Background Threads + +**Goal:** Live monitoring — process crash detection, log tailing, metrics. + +### Step 3.1 — Thread infrastructure + +1. `backend/threads/base_thread.py` — `BaseServerThread` +2. `backend/threads/thread_registry.py` — `ThreadRegistry` singleton +3. Wire `start_server_threads()` / `stop_server_threads()` into `ServerService.start()` / `ServerService.stop()` + +### Step 3.2 — Process Monitor Thread + +1. `backend/threads/process_monitor.py` +2. Crash detection + status update in DB +3. 
Auto-restart with exponential backoff + +**Test:** Start server → kill process manually → confirm DB status changes to 'crashed'. +**Test:** Enable auto_restart → kill → confirm server restarts automatically. + +### Step 3.3 — Log Tail Thread + +1. `backend/logs/parser.py` — `RPTParser` +2. `backend/dal/log_repository.py` +3. `backend/threads/log_tail.py` +4. `backend/logs/service.py` +5. `backend/logs/router.py` — `GET /servers/{id}/logs` + +**Test:** Start server → `GET /api/servers/{id}/logs` returns recent RPT lines. + +### Step 3.4 — Metrics Collector Thread + +1. `backend/metrics/service.py` +2. `backend/dal/metrics_repository.py` +3. `backend/threads/metrics_collector.py` +4. `backend/metrics/router.py` — `GET /servers/{id}/metrics` + +**Test:** Running server → query metrics endpoint → see CPU/RAM data points. + +--- + +## Phase 4 — BattlEye RCon + +**Goal:** Real-time player list, in-game admin commands. + +### Step 4.1 — RCon Client + +1. `backend/rcon/client.py` — `BERConClient` +2. Implement BE RCon UDP protocol: + - Packet structure: `'BE'` + CRC32 (little-endian) + `0xFF` + type byte + payload + - Login: type `0x00`, payload = password + - Command: type `0x01`, payload = sequence byte + command string + - Keepalive: an empty command packet (type `0x01`, payload = sequence byte only); type `0x02` is used for server messages, not keepalives +3. **Request multiplexer**: track pending requests by sequence byte, route responses to correct caller via `threading.Event` per request. Background receiver thread reads all incoming packets. +4. `parse_players_response()` — parse `players` command output +5. 
Handle unsolicited server messages (type 0x02) — enqueue for event logging + +BattlEye RCon packet format reference: +``` +Login packet (client → server): + 42 45 # 'BE' + [CRC32 LE] # checksum of bytes after CRC + FF # packet type prefix + 00 # login type + [password] # ASCII password + +Command packet: + 42 45 + [CRC32 LE] + FF + 01 + [seq byte] # 0x00-0xFF, wraps around + [command] # ASCII command string + +Command response (server → client): + 42 45 + [CRC32 LE] + FF + 01 # 0x01 = command response (same type byte as outgoing command) + [seq byte] + [response] # ASCII response text + +Server-pushed message (server → client, unsolicited): + 42 45 + [CRC32 LE] + FF + 02 # 0x02 = server message (chat events, kill events, etc.) + [seq byte] + [message] # ASCII message text +``` + +**Test:** Connect BERConClient to a running server with BattlEye → successfully login → send `players` → receive response. + +### Step 4.2 — RCon Service + Poller Thread + +1. `backend/rcon/service.py` — `RConService` +2. `backend/threads/rcon_poller.py` +3. `backend/dal/player_repository.py` +4. `backend/players/service.py` +5. `backend/players/router.py` — `GET /servers/{id}/players` + +**Test:** Players join server → `GET /players` returns them with pings. + +### Step 4.3 — Admin Actions via RCon + +1. `POST /servers/{id}/players/{num}/kick` +2. `POST /servers/{id}/players/{num}/ban` +3. `POST /servers/{id}/rcon/command` +4. `POST /servers/{id}/rcon/say` +5. `backend/dal/ban_repository.py` +6. `GET/POST/DELETE /servers/{id}/bans` +7. **ban.txt bidirectional sync**: on ban add/delete via API, write to `battleye/ban.txt`; on startup, read `ban.txt` and upsert into DB + +**Test:** Kick a player via API → confirm player disconnected from server. + +--- + +## Phase 5 — WebSocket Real-Time + +**Goal:** Live updates to React frontend without polling. + +### Step 5.1 — Broadcast infrastructure + +1. `backend/websocket/broadcaster.py` — `BroadcastThread` + `enqueue()` +2. 
`backend/websocket/manager.py` — `ConnectionManager` +3. Store event loop reference in `main.py:on_startup()`: + ```python + import asyncio + # on_startup() runs inside the asyncio event loop — use get_running_loop(), + # not get_event_loop() (deprecated in Python 3.10+ from async context). + _event_loop = asyncio.get_running_loop() + broadcaster.init(_event_loop, connection_manager) + ``` +4. Start `BroadcastThread` in `on_startup()` +5. Wire `BroadcastThread.enqueue()` calls into all background threads + +### Step 5.2 — WebSocket endpoint + +1. `backend/websocket/router.py` +2. JWT validation from query param +3. Subscribe/unsubscribe message handling +4. Ping/pong keepalive + +**Test:** Connect to `ws://localhost:8000/ws/1?token=...` → see live log lines stream in terminal. + +### Step 5.3 — Integrate all event sources + +Wire `BroadcastThread.enqueue()` into: +- `ProcessMonitorThread` → status updates, crash events +- `LogTailThread` → log lines +- `MetricsCollectorThread` → metrics snapshots +- `RConPollerThread` → player list updates +- `ServerService.start/stop` → status transitions + +**Test:** React frontend connects to WS → server starts → see status, logs, metrics all update in real time. + +--- + +## Phase 6 — Mission & Mod Management + +### Step 6.1 — Missions + +1. `backend/missions/service.py` +2. `backend/missions/router.py` +3. Upload PBO validation (check `.pbo` extension, parse name) +4. Mission rotation CRUD + +**Test:** Upload a `.pbo` → appears in `GET /missions` → set as rotation → start server → mission available. + +### Step 6.2 — Mods + +1. `backend/mods/service.py` +2. `backend/mods/router.py` +3. `build_mod_string()` — assemble `-mod=` and `-serverMod=` args +4. Wire mod string into `ConfigGenerator.build_launch_args()` + +**Test:** Register `@CBA_A3` → enable on server → start → server loads mod. 
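A possible shape for the mod-argument assembly in Step 6.2 is sketched below. It assumes rows are read from `mods` joined with `server_mods` as `(folder_path, is_server_mod, sort_order)` tuples, and it returns a list of arguments rather than one string because `subprocess.Popen` takes an argument list. The names `ModRow` and `build_mod_args` are hypothetical, not part of the design.

```python
from typing import Iterable, List, Tuple

# (folder_path, is_server_mod, sort_order), mirroring the mods/server_mods columns
ModRow = Tuple[str, bool, int]


def build_mod_args(rows: Iterable[ModRow]) -> List[str]:
    """Return launch arguments such as ['-mod=@a;@b', '-serverMod=@c']."""
    ordered = sorted(rows, key=lambda row: row[2])  # honour sort_order
    client_mods = [path for path, is_server, _ in ordered if not is_server]
    server_mods = [path for path, is_server, _ in ordered if is_server]
    args: List[str] = []
    if client_mods:
        # Arma 3 expects mod folders separated by semicolons.
        args.append("-mod=" + ";".join(client_mods))
    if server_mods:
        args.append("-serverMod=" + ";".join(server_mods))
    return args
```

Omitting an argument entirely when its list is empty avoids passing a bare `-mod=` to the server.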
+ +--- + +## Phase 7 — Polish & Production + +### Step 7.1 — APScheduler jobs + +Add to `on_startup()`: +```python +# Use BackgroundScheduler (not AsyncIOScheduler) because cleanup methods +# perform sync SQLite operations. AsyncIOScheduler would block the event loop. +from apscheduler.schedulers.background import BackgroundScheduler +scheduler = BackgroundScheduler() +scheduler.add_job(log_service.cleanup_old_logs, 'cron', hour=3) +scheduler.add_job(metrics_service.cleanup_old_metrics, 'cron', hour=3, minute=30) +scheduler.add_job(player_service.cleanup_old_history, 'cron', hour=4) # 90-day retention +scheduler.start() +``` + +### Step 7.2 — Startup recovery + +In `on_startup()` → `ProcessManager.recover_on_startup()`: +- Query DB for servers with `status='running'` +- Check if PID still alive (`psutil.pid_exists(pid)`) +- If alive: re-attach threads (skip process start, just start monitoring threads) +- If dead: mark as `crashed`, clear players + +### Step 7.3 — Events log + +1. `backend/dal/event_repository.py` +2. Insert events for: start, stop, crash, kick, ban, config change, mission change +3. `GET /servers/{id}/events` endpoint + +### Step 7.4 — Security hardening (additional layers) + +1. Encrypt sensitive DB fields: `password`, `password_admin`, `rcon_password` + - `backend/utils/crypto.py` with Fernet + - **Key format:** `LANGUARD_ENCRYPTION_KEY` must be a Fernet base64 key, NOT hex. + Generate with: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"` + Passing a hex string to `Fernet()` raises `ValueError` at startup. + - Encrypt on write, decrypt on read in repositories + - **NOTE:** Core security (rate limiting, input sanitization, config escaping, exe path validation) is already in Phases 1-2. +2. Additional penetration testing and security audit +3. 
Content-Security-Policy headers for frontend + +### Step 7.5 — Frontend integration checklist + +Verify React app can: +- [ ] Login and store JWT +- [ ] List servers with live status +- [ ] Start/stop server and see status update via WebSocket (no page refresh) +- [ ] View streaming log output +- [ ] See player list update every 10s +- [ ] See CPU/RAM charts update every 5s +- [ ] Edit all config sections and see preview +- [ ] Upload a mission PBO +- [ ] Kick a player +- [ ] Send a message to all players + +--- + +## Testing Strategy + +### Unit tests (pytest) +- `ConfigGenerator.write_server_cfg()` — compare output against expected string; test config injection prevention +- `ConfigGenerator._escape_config_string()` — test backslash, double-quote, and newline escaping +- `RPTParser.parse_line()` — test all log formats +- `BERConClient.parse_players_response()` — test with sample output +- `AuthService.login()` — correct password / wrong password / rate limiting +- Repository methods — use in-memory SQLite (`:memory:`) +- `check_server_ports_available()` — test derived port validation +- `sanitize_filename()` — test path traversal prevention +- In-memory SQLite setup in `conftest.py` — shared fixture for all repository tests + +### Integration tests +- Full start/stop cycle with a real arma3server.exe (manual — requires licensed Arma 3 installation, not in CI) +- WebSocket message delivery (can be automated with FastAPI's `TestClient.websocket_connect()`; httpx on its own has no WebSocket support) +- RCon command round-trip (manual — requires running server with BattlEye) + +### Load notes +- SQLite with WAL handles concurrent reads from 4 threads per server well +- For >10 simultaneous servers, consider connection pool size tuning +- WebSocket broadcast scales to ~100 concurrent connections without issue + +--- + +## Environment Setup (Developer) + +```bash +# 1. Clone repo +git clone +cd languard-server-manager + +# 2.
Backend +cd backend +python -m venv venv +source venv/bin/activate # or venv\Scripts\activate on Windows +pip install -r requirements.txt + +# 3. Environment +cp .env.example .env +# Edit .env: set LANGUARD_ARMA_EXE to your arma3server_x64.exe path + +# 4. Run backend +uvicorn main:app --reload --host 0.0.0.0 --port 8000 + +# 5. Frontend (separate) +cd ../frontend +npm install +npm run dev +``` + +Backend auto-creates `languard.db` and seeds an admin user on first run: +- Username: `admin` +- Password: **randomly generated** and printed to stdout once (e.g., `Initial admin password: a7b9c2d4e5f6...`) +- Change immediately via `PUT /api/auth/password` + +--- + +## Phase Summary + +| Phase | Deliverable | Est. Complexity | +|-------|-------------|----------------| +| 1 | Foundation (auth + server CRUD) | Low | +| 2 | Process management + config gen | Medium | +| 3 | Background threads (monitor, logs, metrics) | Medium-High | +| 4 | BattlEye RCon (player list, admin cmds) | High | +| 5 | WebSocket real-time | Medium | +| 6 | Mission + mod management | Low-Medium | +| 7 | Polish, security, recovery | Medium | + +Implement phases in order — each phase builds on the previous and is independently testable. 
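For the first-run admin seeding mentioned under Environment Setup, a minimal sketch of generating and announcing the password could look like this. The function names are hypothetical, and `secrets.token_hex` is only one reasonable choice that matches the hex-looking example above; the design documents do not prescribe a generator.

```python
import secrets


def generate_initial_admin_password(num_bytes: int = 16) -> str:
    """Return a random hex password drawn from the OS CSPRNG (2 hex chars per byte)."""
    return secrets.token_hex(num_bytes)


def format_initial_password_notice(password: str) -> str:
    """Build the one-time stdout line; the caller prints it once and stores only the hash."""
    return f"Initial admin password: {password}"


if __name__ == "__main__":
    print(format_initial_password_notice(generate_initial_admin_password()))
```

Only the bcrypt hash would be written to `languard.db`; the plaintext exists just long enough to be printed.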
diff --git a/MODULES.md b/MODULES.md new file mode 100644 index 0000000..84581cc --- /dev/null +++ b/MODULES.md @@ -0,0 +1,900 @@ +# Languard Server Manager — Python Module Breakdown + +## Project Structure + +``` +backend/ +├── main.py +├── config.py +├── database.py +├── dependencies.py +│ +├── auth/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ ├── schemas.py +│ └── utils.py +│ +├── servers/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ ├── schemas.py +│ ├── process_manager.py +│ └── config_generator.py +│ +├── rcon/ +│ ├── __init__.py +│ ├── client.py +│ └── service.py +│ +├── missions/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ └── schemas.py +│ +├── mods/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ └── schemas.py +│ +├── players/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ └── schemas.py +│ +├── logs/ +│ ├── __init__.py +│ ├── router.py +│ ├── service.py +│ └── parser.py +│ +├── metrics/ +│ ├── __init__.py +│ ├── router.py +│ └── service.py +│ +├── websocket/ +│ ├── __init__.py +│ ├── router.py +│ ├── manager.py +│ └── broadcaster.py +│ +├── threads/ +│ ├── __init__.py +│ ├── base_thread.py +│ ├── process_monitor.py +│ ├── log_tail.py +│ ├── metrics_collector.py +│ ├── rcon_poller.py +│ └── thread_registry.py +│ +├── system/ +│ ├── __init__.py +│ └── router.py +│ +├── dal/ +│ ├── __init__.py +│ ├── base_repository.py +│ ├── server_repository.py +│ ├── config_repository.py +│ ├── player_repository.py +│ ├── log_repository.py +│ ├── metrics_repository.py +│ ├── mission_repository.py +│ ├── mod_repository.py +│ ├── ban_repository.py +│ └── event_repository.py +│ +├── migrations/ +│ ├── runner.py +│ └── 001_initial_schema.sql +│ +└── utils/ + ├── __init__.py + ├── crypto.py + ├── file_utils.py + └── port_checker.py +``` + +--- + +## Module Details + +### `main.py` +Entry point. Creates and configures the FastAPI application. 
+ +```python +# Responsibilities: +# - Create FastAPI app instance +# - Register all routers with prefix /api +# - Configure CORS middleware +# - Add JWT auth middleware +# - Register startup/shutdown event handlers: +# startup: run DB migrations, init ProcessManager, restore running servers +# shutdown: stop all per-server threads (ThreadRegistry.stop_all()), stop the BroadcastThread, close DB +# - Mount static files if serving frontend +# - NOTE: Route handlers that perform blocking I/O (subprocess, file writes, +# socket checks) MUST be declared as plain `def` (not `async def`). +# FastAPI automatically runs plain-def handlers in a thread pool, +# preventing event loop blocking. Only truly async operations +# (WebSocket send, async library calls) should use `async def`. + +Key functions: + create_app() -> FastAPI + on_startup() # DB migrations, recover state + on_shutdown() # Clean up threads, close connections +``` + +--- + +### `config.py` +Loads and validates environment variables using `BaseSettings` from `pydantic-settings`. + +```python +from pydantic_settings import BaseSettings, SettingsConfigDict + +class Settings(BaseSettings): + # env_prefix maps LANGUARD_ARMA_EXE -> arma_exe, LANGUARD_ENCRYPTION_KEY -> encryption_key, etc. + model_config = SettingsConfigDict(env_prefix="LANGUARD_", env_file=".env") + + secret_key: str + encryption_key: str # Fernet base64 key (NOT hex) + db_path: str = "./languard.db" + servers_dir: str = "./servers" + arma_exe: str = "C:/Arma3Server/arma3server_x64.exe" + host: str = "0.0.0.0" + port: int = 8000 + cors_origins: list[str] = ["http://localhost:5173"] + log_retention_days: int = 7 + metrics_retention_days: int = 30 + player_history_retention_days: int = 90 + jwt_expire_hours: int = 24 + login_rate_limit: str = "5/minute" # per IP + +settings = Settings() # singleton +``` + +--- + +### `database.py` +Database engine setup and session management.
+ +```python +# Responsibilities: +# - Create SQLAlchemy engine with WAL + FK + busy_timeout pragmas +# - Provide get_db() dependency for FastAPI routes (sync session) +# - Provide get_thread_db() for background threads (thread-local sessions) +# - run_migrations(): apply pending .sql migration files at startup +# - Migration rollback: if a migration fails, the schema_migrations table +# is NOT updated; re-running applies only unapplied migrations (idempotent) + +# Pragma setup: +# PRAGMA journal_mode=WAL +# PRAGMA foreign_keys=ON +# PRAGMA busy_timeout=5000 # 5s wait before "database is locked" error + +Key functions: + get_engine() -> Engine + get_db() -> Generator[Connection, None, None] # FastAPI dependency + get_thread_db() -> Connection # for threads + run_migrations(engine: Engine) +``` + +--- + +### `dependencies.py` +Reusable FastAPI dependencies. + +```python +# Responsibilities: +# - get_current_user(token) -> User (JWT validation) +# - require_admin(user) -> User (role check) +# - get_server_or_404(server_id, db) -> ServerRow + +Key functions: + get_current_user(credentials: HTTPAuthorizationCredentials) -> User + require_admin(user: User = Depends(get_current_user)) -> User + get_server_or_404(server_id: int, db: Connection) -> dict +``` + +--- + +### `auth/` + +**`router.py`** — FastAPI router for auth endpoints. 
+- `POST /auth/login` +- `POST /auth/logout` +- `GET /auth/me` +- `PUT /auth/password` +- `GET /auth/users` (admin) +- `POST /auth/users` (admin) +- `DELETE /auth/users/{user_id}` (admin) + +**`service.py`** — `AuthService` +```python +class AuthService: + def login(username, password) -> TokenResponse + def create_user(username, password, role) -> User + def change_password(user_id, current_pw, new_pw) -> bool + def list_users() -> list[User] + def delete_user(user_id) -> bool +``` + +**`utils.py`** +```python +def hash_password(password: str) -> str # bcrypt +def verify_password(plain, hashed) -> bool +def create_access_token(data: dict) -> str # JWT sign +def decode_access_token(token: str) -> dict # JWT verify +``` + +--- + +### `servers/` + +**`router.py`** — All server CRUD + lifecycle endpoints. +- `GET /servers` +- `POST /servers` +- `GET /servers/{id}` +- `PUT /servers/{id}` +- `DELETE /servers/{id}` +- `POST /servers/{id}/start` +- `POST /servers/{id}/stop` +- `POST /servers/{id}/restart` +- `POST /servers/{id}/kill` +- `GET /servers/{id}/config` +- `PUT /servers/{id}/config/server` +- `PUT /servers/{id}/config/basic` +- `PUT /servers/{id}/config/profile` +- `PUT /servers/{id}/config/launch` +- `PUT /servers/{id}/config/rcon` +- `GET /servers/{id}/config/preview` +- `GET /servers/{id}/config/download/{filename}` + +**`service.py`** — `ServerService` +```python +class ServerService: + def list_servers() -> list[ServerSummary] + def get_server(server_id) -> ServerDetail + def create_server(data: CreateServerRequest) -> Server + def update_server(server_id, data) -> Server + def delete_server(server_id) -> bool + + def start(server_id) -> StatusResponse + # 1. Load config from DB + # 2. Validate exe_path exists and basename matches allowlist + # (arma3server_x64.exe, arma3server.exe) — prevents executing arbitrary binaries + # 3. Check ALL derived ports not in use (game_port through game_port+3 + rcon_port) + # 4. 
ConfigGenerator.write_all(server_id) + # — if write fails: DB status='error', return error (no process launch) + # 5. Build launch args + # 6. ProcessManager.start(server_id, exe, args, cwd=servers/{id}/) + # 7. DB: status = 'starting' + # 8. ThreadRegistry.start_server_threads(server_id) + # 9. Broadcast status update + + def stop(server_id, force=False) -> StatusResponse + # 1. If not force: RConService.send_command('#shutdown') + # 2. Wait up to 30s for process exit + # 3. If still running: ProcessManager.kill(server_id) + # 4. DB: status = 'stopped', pid = NULL + # 5. ThreadRegistry.stop_server_threads(server_id) + # 6. PlayerRepository.clear(server_id) + # 7. Broadcast status update + + def restart(server_id) -> StatusResponse + + def update_config_server(server_id, data) -> ServerConfig + def update_config_basic(server_id, data) -> BasicConfig + def update_config_profile(server_id, data) -> ServerProfile + def update_config_launch(server_id, data) -> LaunchParams + def update_config_rcon(server_id, data) -> RConConfig + # Updates rcon_configs row (rcon_password, max_ping, enabled) via ConfigRepository. + # If data includes rcon_port, also updates servers.rcon_port via ServerRepository — + # rcon_port lives in the servers table, not rcon_configs. + # Regenerates battleye/beserver.cfg immediately after saving. 
+ def get_config_preview(server_id) -> str +``` + +**`process_manager.py`** — `ProcessManager` singleton +```python +class ProcessManager: + _instance = None + _processes: dict[int, subprocess.Popen] = {} + _lock: threading.Lock + + @classmethod + def get() -> ProcessManager + + def start(server_id, exe_path, args: list[str], cwd: str) -> int # returns PID + # subprocess.Popen([exe_path, *args], cwd=cwd, stdout=DEVNULL, stderr=DEVNULL) + # Server output is read from the .rpt file by LogTailThread; an unread PIPE + # can fill up and block the child process, so do not use stdout=PIPE here. + # cwd = servers/{server_id}/ so relative config paths resolve correctly + + def stop(server_id, timeout=30) -> bool + # On Windows: Popen.terminate() = TerminateProcess (hard kill, no SIGTERM) + # Graceful shutdown is handled by ServerService via RCon #shutdown first. + # This method is the forceful fallback: terminate() → wait(timeout) + + def kill(server_id) -> bool + # terminate() immediately (hard kill on Windows) + + def is_running(server_id) -> bool + + def get_pid(server_id) -> int | None + + def get_process(server_id) -> subprocess.Popen | None + + def list_running() -> list[int] # list of server_ids + + def recover_on_startup(db) + # At app startup: query DB for servers with status='running' + # Check if pid still alive AND verify process name matches arma3server + # (prevents PID reuse by unrelated processes) + # If alive: re-attach monitoring threads (skip process start) + # If dead or wrong process: mark crashed, clear players +``` + +**`config_generator.py`** — `ConfigGenerator` +```python +class ConfigGenerator: + def write_all(server_id: int, db: Connection) -> None + # Writes server.cfg, basic.cfg, server.Arma3Profile, battleye/beserver.cfg + # Creates directories if they don't exist + # Sets restrictive file permissions on files containing passwords: + # Unix: chmod 0600 + # Windows: use icacls to grant only the service account read/write access + # Raises IOError if write fails — caller must handle (set DB status='error') + + def write_server_cfg(server_id, config: dict, path: Path) -> None + # Uses structured
builder — NOT f-strings or string.Template + # Escapes double quotes in all string values (replace " with \"") + # Validates no newline injection in string fields + # Renders mission rotation as class Missions { class Mission1 { ... }; }; + + def write_basic_cfg(server_id, config: dict, path: Path) -> None + + def write_arma3profile(server_id, profile: dict, path: Path) -> None + # Writes to servers/{id}/server/server.Arma3Profile (profile subdirectory) + + def write_beserver_cfg(server_id, rcon_config: dict, path: Path) -> None + # Generates servers/{id}/battleye/beserver.cfg + # Content: "RConPassword \nRConPort \n" + # Without this file BattlEye will not open an RCon port. + + def build_launch_args(server_id, config: dict, launch: dict, mod_string: str) -> list[str] + # Returns list of command-line arguments for arma3server_x64.exe + # e.g. ['-port=2302', '-config=server.cfg', '-cfg=basic.cfg', + # '-profiles=./', '-name=server', '-world=empty', + # '-mod=@CBA;@ACE', '-serverMod=@ACE_server', + # '-bepath=./battleye', + # '-limitFPS=50', '-autoInit', '-loadMissionToMemory', ...] + # NOTE: -profiles is relative to cwd (which is set to servers/{id}/) + # -bepath is required for BattlEye to find beserver.cfg + + def _escape_config_string(value: str) -> str + # Escapes backslashes FIRST, then double quotes and newlines for safe Arma 3 config interpolation. + # Order matters: backslash → \\, then " → \", then newline → \\n + # If backslashes are not escaped first, input "test\\" produces "test\\" + # which Arma 3 reads as an escaped backslash + unescaped closing quote = injection. 
+ value = value.replace('\\', '\\\\') # backslash FIRST + value = value.replace('"', '\\"') # then double-quote + value = value.replace('\n', '\\n') # then newline + value = value.replace('\r', '') # strip carriage returns + value = value.replace('\t', ' ') # tabs → spaces + return value + + def _render_mission_class(rotation: list[dict]) -> str + # Renders the class Missions {} block for server.cfg + # class Missions { class Mission1 { template="..."; difficulty="..."; }; ... }; +``` + +--- + +### `rcon/` + +**`client.py`** — `BERConClient` +```python +class BERConClient: + """ + Implements BattlEye RCon protocol over UDP. + Packet type bytes: + Client → Server: 0xFF 0x00 [password] → login + Client → Server: 0xFF 0x01 [seq] [command] → send command + Client → Server: 0xFF 0x02 → keepalive (empty payload) + Server → Client: 0xFF 0x00 [0x00|0x01] → login response (0x01=ok) + Server → Client: 0xFF 0x01 [seq] [response] → command response + Server → Client: 0xFF 0x02 [seq] [message] → unsolicited server message (chat/kill events) + Note: 0x01 is the type byte for BOTH outgoing commands AND incoming command responses. + """ + def __init__(host: str, port: int, password: str) + + # Request multiplexer: prevents response misrouting when + # RConPollerThread and API-request RConService share the same socket. 
+ _pending_requests: dict[int, threading.Event] = {} # seq → Event + _responses: dict[int, str] = {} # seq → response + _seq_counter: int = 0 + _lock: threading.Lock + + def connect() -> bool + def disconnect() + def login() -> bool + def send_command(command: str, timeout: float = 5.0) -> str | None + # Sends command with sequence number, creates Event, waits for response + # Routes response to correct caller by matching sequence byte + def keepalive() # send empty packet every 30s + def is_connected() -> bool + + # Background receiver thread: + def _receiver_loop() + # Reads all incoming UDP packets + # For type 0x01 (command response): sets Event + stores response for matching seq + # For type 0x02 (server message): enqueues for processing (player events, chat) + + def parse_players_response(response: str) -> list[PlayerInfo] + # Parse output of 'players' command + # Format: "Players on server:\n[#] [IP:Port] [Ping] [GUID] [Name]\n..." +``` + +**`service.py`** — `RConService` +```python +class RConService: + def __init__(server_id: int) + + def send_command(command: str) -> str | None + # Gets or creates BERConClient, sends command, returns response + + def kick_player(player_num: int, reason: str = "") -> bool + + def ban_player(player_num: int, duration_minutes: int, reason: str) -> bool + + def unban(guid: str) -> bool + + def say_all(message: str) -> bool + + def get_players() -> list[PlayerInfo] + + def send_mission_command(mission_name: str) -> bool + + def shutdown() -> bool + + def restart() -> bool + + def lock() -> bool + + def unlock() -> bool +``` + +--- + +### `missions/` + +**`service.py`** — `MissionService` +```python +class MissionService: + def list_missions(server_id) -> list[Mission] + + def upload_mission(server_id, filename: str, file_data: bytes) -> Mission + # 1. Validate .pbo extension + # 2. Parse mission_name and terrain from filename + # 3. Write to servers/{server_id}/mpmissions/{filename} + # 4. Insert into missions table + # 5. 
Return Mission object + + def delete_mission(server_id, mission_id) -> bool + # 1. Check not in active rotation + # 2. Delete file from disk + # 3. Delete from DB + + def get_rotation(server_id) -> list[RotationEntry] + + def update_rotation(server_id, rotation: list[RotationEntry]) -> bool + # 1. Delete existing rotation rows + # 2. Insert new ordered list + # 3. Trigger config regeneration +``` + +--- + +### `mods/` + +**`service.py`** — `ModService` +```python +class ModService: + def list_all() -> list[Mod] + def register_mod(name, folder_path, workshop_id, description) -> Mod + # Validates folder exists + def delete_mod(mod_id) -> bool + # Check not in use by any server + def get_server_mods(server_id) -> list[ServerMod] + def update_server_mods(server_id, mods: list) -> bool + # Replaces server_mods rows, regenerates mod string + def build_mod_string(server_id) -> tuple[str, str] + # Returns (-mod=..., -serverMod=...) strings +``` + +--- + +### `players/` + +**`service.py`** — `PlayerService` +```python +class PlayerService: + def get_current_players(server_id) -> list[Player] + def kick(server_id, player_num, reason) -> bool + # RConService.kick_player() + log event + def ban(server_id, player_num, duration_minutes, reason) -> bool + # RConService.ban_player() + insert into bans table + def get_history(server_id, limit, offset, search) -> PaginatedResult + def update_from_rcon(server_id, rcon_players: list) -> None + # Upsert players table; detect disconnections; insert player_history rows +``` + +--- + +### `logs/` + +**`parser.py`** — `RPTParser` +```python +class RPTParser: + # Parses Arma 3 RPT log format + # Example line: "10:05:23 BattlEye Server: Initialized (v1.240)" + # With timestamp format "short": "10:05:23" + # With timestamp format "full": "2026/04/16, 10:05:23" + + def parse_line(line: str) -> LogEntry | None + # Returns: {timestamp, level, message} + # level detection: 'error' if 'Error' in msg, 'warning' if 'Warning', else 'info' + + def 
parse_timestamp(raw: str) -> datetime +``` + +**`service.py`** — `LogService` +```python +class LogService: + def query(server_id, limit, offset, level, since, search) -> PaginatedLogs + def clear(server_id) -> int # returns deleted count + def get_rpt_path(server_id) -> Path | None + # Delegates to file_utils.get_rpt_path() — globs for latest timestamped .rpt + def cleanup_old_logs() # called by APScheduler +``` + +--- + +### `metrics/` + +**`service.py`** — `MetricsService` +```python +class MetricsService: + def query(server_id, from_dt, to_dt, resolution) -> list[MetricPoint] + # Aggregates by resolution ('1m', '5m', '1h') + def insert(server_id, cpu, ram, player_count) -> None + def cleanup_old_metrics() # called by APScheduler + def get_latest(server_id) -> MetricPoint | None +``` + +--- + +### `websocket/` + +**`manager.py`** — `ConnectionManager` +```python +class ConnectionManager: + """ + Manages active WebSocket connections grouped by server_id. + 'all' is a special server_id that receives events for all servers. + """ + _connections: dict[str, set[WebSocket]] + _lock: asyncio.Lock + + async def connect(ws: WebSocket, server_id: str, channels: list[str]) + async def disconnect(ws: WebSocket, server_id: str) + async def broadcast(server_id: str, message: dict) + # Sends to all connections subscribed to server_id + 'all' + async def send_personal(ws: WebSocket, message: dict) +``` + +**`broadcaster.py`** — `BroadcastThread` +```python +class BroadcastThread(threading.Thread): + """ + Runs in background thread. + Reads from a queue (put by background threads). + Posts messages to asyncio event loop via run_coroutine_threadsafe(). 
+ """ + _queue: queue.Queue + _loop: asyncio.AbstractEventLoop + _manager: ConnectionManager + _running: bool + + def run() # main loop: get from queue, schedule broadcast coroutine + + @staticmethod + def enqueue(server_id: int, msg_type: str, data: dict) + # Thread-safe: called from any background thread +``` + +**`router.py`** — WebSocket endpoint +```python +@router.websocket("/ws/{server_id}") +async def websocket_endpoint(ws: WebSocket, server_id: str, token: str = Query(...)): + # 1. Validate JWT token from query param + # 2. Accept WebSocket connection + # 3. Register with ConnectionManager + # 4. Loop: receive messages (ping/subscribe/unsubscribe) + # 5. On disconnect: deregister from ConnectionManager +``` + +--- + +### `threads/` + +**`base_thread.py`** — `BaseServerThread` +```python +class BaseServerThread(threading.Thread): + def __init__(server_id: int, interval: float) + def stop() # sets _stop_event + def is_stopped() -> bool + def run() # creates thread-local DB connection, calls setup(), + # then loops: tick() + wait(interval) + # finally calls teardown() in finally block + def setup() # override for init work (uses self._db) + def tick() # override for per-interval work (uses self._db) + def teardown() # override for cleanup (close files, sockets) + def on_error(e: Exception) # default: log, continue + # if same error repeats 5x in a row: escalate + self.stop() +``` + +**`process_monitor.py`** — `ProcessMonitorThread` +```python +class ProcessMonitorThread(BaseServerThread): + interval = 1.0 # seconds + + def tick(): + # 1. Check if process is still alive: Popen.poll() is None, or + # psutil.pid_exists(pid) for processes re-attached after a restart. + # Do NOT use os.kill(pid, 0): on Windows any signal other than + # CTRL_C_EVENT/CTRL_BREAK_EVENT terminates the process. + # 2. If dead: + # a. Get exit code + # b. DB: status = 'crashed', stopped_at = now + # c. Clear players from DB + # d. Broadcast: {type: 'status', status: 'crashed'} + # e. Insert server_events: {event_type: 'crashed', exit_code} + # f.
If auto_restart enabled and restart_count < max_restarts: + # DB: increment restart_count + # Schedule restart after 10s (threading.Timer) + # g. self.stop() +``` + +**`log_tail.py`** — `LogTailThread` +```python +class LogTailThread(BaseServerThread): + interval = 0.1 # 100ms + + def setup(): + # Find .rpt file path + # Open file, seek to end (tail behavior) + self._file = open(rpt_path, 'r', encoding='utf-8', errors='replace') + self._file.seek(0, 2) # seek to end + + def tick(): + # 1. Read all new lines from self._file + # 2. For each line: + # a. RPTParser.parse_line(line) -> LogEntry + # b. LogRepository.insert(server_id, entry) + # c. BroadcastThread.enqueue(server_id, 'log', entry) + + def on_rpt_rotate(): + # Close and reopen if file was rotated (new server start) +``` + +**`metrics_collector.py`** — `MetricsCollectorThread` +```python +class MetricsCollectorThread(BaseServerThread): + interval = 5.0 # seconds + + def tick(): + # 1. Get PID from ProcessManager + # 2. psutil.Process(pid).cpu_percent(interval=0.5) + # 3. psutil.Process(pid).memory_info().rss / (1024*1024) # MB + # 4. PlayerRepository.count(server_id) -> player_count + # 5. MetricsRepository.insert(server_id, cpu, ram, player_count) + # 6. 
BroadcastThread.enqueue(server_id, 'metrics', {cpu, ram, player_count}) +``` + +**`rcon_poller.py`** — `RConPollerThread` +```python +class RConPollerThread(BaseServerThread): + interval = 10.0 # seconds + startup_delay = 30.0 # wait 30s after server start before first poll + _rcon_ready = False # flag: set True only after successful setup + + def setup(self): + # Use _stop_event.wait() instead of time.sleep() so the thread + # can be interrupted immediately during shutdown + if self._stop_event.wait(self.startup_delay): + self._rcon_ready = False + return # stop was requested during startup delay + self._rcon = RConService(self.server_id) + self._rcon_ready = True + + def tick(self): + if not self._rcon_ready: + return # setup() failed or was interrupted — skip tick + # 1. RConService.get_players() -> list[PlayerInfo] + # 2. PlayerService.update_from_rcon(server_id, players) + # 3. BroadcastThread.enqueue(server_id, 'players', {players, count}) + # 4. BERConClient.keepalive() if needed +``` + +**`thread_registry.py`** — `ThreadRegistry` +```python +class ThreadRegistry: + """ + Singleton. Manages all background threads per server.
+ """ + _threads: dict[int, dict[str, BaseServerThread]] + _lock: threading.Lock + + @classmethod + def get() -> ThreadRegistry + + def start_server_threads(server_id: int) -> None + # Instantiates and starts: + # ProcessMonitorThread, LogTailThread, + # MetricsCollectorThread, RConPollerThread + + def stop_server_threads(server_id: int) -> None + # Calls stop() on each thread; joins with timeout + + def get_thread(server_id, thread_type: str) -> BaseServerThread | None + + def list_active(server_id) -> list[str] # thread names + + def stop_all() -> None # on app shutdown +``` + +--- + +### `dal/` + +**`base_repository.py`** +```python +class BaseRepository: + def __init__(db: Connection) + + def execute(sql: str, params: tuple = ()) -> CursorResult + def fetchone(sql: str, params: tuple = ()) -> dict | None + def fetchall(sql: str, params: tuple = ()) -> list[dict] + def insert(table: str, data: dict) -> int # returns last_insert_rowid + def update(table: str, data: dict, where: str, params: tuple) -> int + def delete(table: str, where: str, params: tuple) -> int + def row_to_dict(row) -> dict +``` + +**`server_repository.py`** +```python +class ServerRepository(BaseRepository): + def get_all() -> list[dict] + def get_by_id(server_id) -> dict | None + def create(data: dict) -> int + def update_status(server_id, status, pid=None, started_at=None) -> None + def update(server_id, data: dict) -> None + def delete(server_id) -> None + def get_running() -> list[dict] # for startup recovery + def increment_restart_count(server_id) -> None + def reset_restart_count(server_id) -> None +``` + +**`event_repository.py`** +```python +class ServerEventRepository(BaseRepository): + def insert(server_id: int, event_type: str, actor: str, detail: dict) -> int + def get_events(server_id: int, limit: int, offset: int, event_type: str | None) -> list[dict] + def get_recent(server_id: int, limit: int = 20) -> list[dict] +``` + +**`config_repository.py`** +```python +class 
ConfigRepository(BaseRepository): + def get_server_config(server_id) -> dict | None + def upsert_server_config(server_id, data: dict) -> None + def get_basic_config(server_id) -> dict | None + def upsert_basic_config(server_id, data: dict) -> None + def get_profile(server_id) -> dict | None + def upsert_profile(server_id, data: dict) -> None + def get_launch_params(server_id) -> dict | None + def upsert_launch_params(server_id, data: dict) -> None + def get_rcon_config(server_id) -> dict | None + def upsert_rcon_config(server_id, data: dict) -> None + def get_full_config(server_id) -> dict # all sections combined +``` + +--- + +### `system/` + +**`router.py`** — System-level endpoints (no auth required for health check). +```python +# GET /system/status → running_servers, total_servers, uptime, version +# GET /system/health → 200 OK if app is alive (for load balancer / Docker healthcheck) + +@router.get("/system/status") +async def system_status() -> APIResponse: + # Returns: {version, running_servers, total_servers, uptime_seconds} + +@router.get("/system/health") +async def health_check() -> dict: + # Returns: {"status": "ok"} +``` + +--- + +### `utils/` + +**`crypto.py`** +```python +# Field-level encryption for sensitive values (passwords, RCon pw) +# Uses cryptography.fernet.Fernet (AES-128-CBC with HMAC-SHA256, not AES-256) + +def encrypt(plaintext: str) -> str +def decrypt(ciphertext: str) -> str +def get_fernet() -> Fernet # from settings.encryption_key +``` + +**`file_utils.py`** +```python +def ensure_server_dirs(server_id: int) -> None + # Creates servers/{id}/, servers/{id}/server/ (profile dir), + # servers/{id}/mpmissions/, servers/{id}/battleye/ + +def get_server_dir(server_id: int) -> Path +def get_profile_dir(server_id: int) -> Path + # Returns servers/{id}/server/ — Arma 3 profile dir (matches -name=server) +def get_missions_dir(server_id: int) -> Path +def get_rpt_path(server_id: int) -> Path | None + # Arma 3 creates timestamped RPT files in the profile dir: + #
servers/{id}/server/arma3server_YYYY-MM-DD_HH-MM-SS.rpt + # Uses rglob('*.rpt') to search recursively within profile dir. + # Returns the most-recently-modified one. + # Returns None if no .rpt file exists yet (server still starting up). +def safe_delete_file(path: Path) -> bool +def sanitize_filename(filename: str) -> str + # Returns PureWindowsPath(filename).name, which splits on both / and \ on any host OS + # and so prevents path traversal on both Unix and Windows. + # Path(filename).name and os.path.basename() do NOT split on backslashes when the + # backend runs on Unix, so they are only safe on a Windows host. +``` + +**`port_checker.py`** +```python +def is_port_in_use(port: int, host: str = "0.0.0.0") -> bool + # UDP bind check: Arma 3 and BattlEye RCon ports are UDP, so a TCP connect() probe would not detect them + +def check_server_ports_available(game_port: int, rcon_port: int | None = None, host: str = "0.0.0.0") -> list[int] + # Checks ALL ports: game_port, game_port+1 (Steam query), + # game_port+2 (VON), game_port+3 (Steam auth), + # plus the actual rcon_port (user-configurable, defaults to game_port+4) + # If rcon_port is None, defaults to game_port+4 + # Returns list of ports that are in use (empty = all available) + +def find_available_port(start: int = 2302, step: int = 100) -> int + # Find next available game port (checking all 5 derived ports per candidate) +``` + +--- + +## Key Dependencies (requirements.txt) + +``` +fastapi==0.111.0 +uvicorn[standard]==0.29.0 +pydantic==2.7.0 +pydantic-settings==2.2.1 +sqlalchemy==2.0.30 +python-jose[cryptography]==3.3.0 # JWT +passlib[bcrypt]==1.7.4 # password hashing +cryptography==42.0.5 # field-level encryption (Fernet) +psutil==5.9.8 # process metrics +apscheduler==3.10.4 # scheduled jobs (log/metrics/player_history cleanup) +python-multipart==0.0.9 # file upload support +slowapi==0.1.9 # rate limiting middleware +uvloop==0.19.0; sys_platform != "win32" # faster event loop (Linux/macOS only — skip on Windows) +``` diff --git a/THREADING.md b/THREADING.md new file mode 100644 index 0000000..8d38eaf --- /dev/null +++ b/THREADING.md @@
-0,0 +1,600 @@ +# Languard Server Manager — Threading & Concurrency Design + +## Overview + +The system uses a hybrid concurrency model: +- **FastAPI (asyncio)** handles HTTP requests and WebSocket connections +- **Python threads** (`threading.Thread`) handle long-running background work per server +- **Queue** bridges the thread world → asyncio world for WebSocket broadcasting +- **SQLAlchemy sync sessions** are used in threads (thread-local connections) + +--- + +## Thread Map + +``` +Main Process (FastAPI / asyncio event loop) +│ +├── [uvicorn] HTTP/WS event loop (asyncio) +│ ├── REST request handlers (async def) +│ └── WebSocket handlers (async def) +│ +├── BroadcastThread (daemon thread, 1 global) +│ └── Reads from broadcast_queue (thread-safe) +│ Calls asyncio.run_coroutine_threadsafe() +│ → ConnectionManager.broadcast() +│ +└── Per-running-server thread group (started when server starts, stopped when server stops): + ├── ProcessMonitorThread (1 per server, 1s interval) + ├── LogTailThread (1 per server, 100ms interval) + ├── MetricsCollectorThread (1 per server, 5s interval) + └── RConPollerThread (1 per server, 10s interval, 30s startup delay) +``` + +For **N running servers**, there are: +- `4*N` background threads + 1 BroadcastThread = `4N+1` background threads total + +--- + +## Thread Safety Rules + +| Resource | Access Pattern | Protection | +|----------|---------------|------------| +| `ProcessManager._processes` | read/write from multiple threads | `threading.Lock` | +| `ThreadRegistry._threads` | read/write from main + shutdown | `threading.Lock` | +| `broadcast_queue` | multi-writer, single reader | `queue.Queue` (thread-safe built-in) | +| `ConnectionManager._connections` | async, single event loop | `asyncio.Lock` | +| SQLite connections | one connection per thread | Thread-local via `threading.local()` | +| Config files on disk | write on start, read-only during run | No lock needed (regenerated before start) | + +### SQLite Thread Safety 
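The "one connection per thread" rule from the table above could be implemented roughly as follows. This is a minimal sketch that uses the standard-library `sqlite3` module rather than SQLAlchemy so that it stays self-contained. The `get_thread_db(path)` signature is an assumption (the design names `get_thread_db` but does not define its parameters), and the pragmas mirror the WAL and `busy_timeout=5000` settings mentioned in this document.

```python
import sqlite3
import threading

# One storage slot per thread; attributes set here are invisible to other threads.
_local = threading.local()


def get_thread_db(path: str) -> sqlite3.Connection:
    """Return the calling thread's connection, creating it on first use.

    Hypothetical helper: the path parameter and the use of sqlite3 are
    assumptions made for illustration only.
    """
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(path)
        # WAL lets readers proceed while another connection writes.
        conn.execute("PRAGMA journal_mode=WAL")
        # Wait up to 5 s for a lock instead of failing immediately.
        conn.execute("PRAGMA busy_timeout=5000")
        _local.conn = conn
    return conn
```

Each thread that calls the helper gets its own connection and reuses it on later calls, so no connection object is ever shared across threads.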
+```python +# Each background thread creates its own SQLAlchemy connection +# from the same engine (WAL mode allows concurrent reads) +# PRAGMA busy_timeout=5000 prevents "database is locked" errors + +class BaseServerThread(threading.Thread): + def run(self): + # Create thread-local DB connection — single connection per thread + engine = get_engine() + self._db = engine.connect() + try: + self.setup() + while not self._stop_event.is_set(): + try: + self.tick() + except Exception as e: + self.on_error(e) + self._stop_event.wait(self.interval) + except Exception as e: + logger.error(f"{self.name} setup error: {e}") + finally: + self.teardown() # always release resources (even on setup failure) + self._db.close() # always close connection +``` + +--- + +## BroadcastThread — Asyncio Bridge + +This is the critical bridge between background threads and the asyncio WebSocket layer. + +``` +Background Thread Asyncio Event Loop +───────────────── ────────────────── +BroadcastThread.enqueue( uvicorn runs here + server_id=1, + msg_type='log', + data={...} +) + │ + ▼ +broadcast_queue.put({ loop = asyncio.get_event_loop() + 'server_id': 1, (stored at app startup) + 'type': 'log', + 'data': {...} +}) + │ + ▼ +BroadcastThread.run() ──────────────────► asyncio.run_coroutine_threadsafe( + while True: connection_manager.broadcast( + msg = queue.get() server_id=1, + fut = run_coroutine_threadsafe( message={type, data} + broadcast_coro, ), + self._loop loop=self._loop + ) ) + fut.result(timeout=5) +``` + +### Implementation Sketch +```python +# broadcaster.py +import asyncio +import queue +import threading + +_broadcast_queue: queue.Queue = queue.Queue(maxsize=10000) +_event_loop: asyncio.AbstractEventLoop | None = None + +class BroadcastThread(threading.Thread): + daemon = True + + def __init__(self, loop: asyncio.AbstractEventLoop, manager): + super().__init__(name="BroadcastThread") + self._loop = loop + self._manager = manager + self._running = True + + def run(self): + while 
self._running: + try: + msg = _broadcast_queue.get(timeout=1.0) + server_id = msg['server_id'] + # Build the outgoing WebSocket message envelope. + # Include server_id so clients subscribed to 'all' can identify the source. + # API contract: {type, server_id, data} + outgoing = { + 'type': msg['type'], + 'server_id': server_id, + 'data': msg['data'], + } + future = asyncio.run_coroutine_threadsafe( + self._manager.broadcast(str(server_id), outgoing, channel=msg['type']), + self._loop + ) + try: + future.result(timeout=5.0) + except TimeoutError: + # Don't block the queue — log and continue + logger.warning(f"Broadcast timeout for server {server_id} msg type {msg['type']}") + except queue.Empty: + continue + except Exception as e: + logger.error(f"BroadcastThread error: {e}") + + def stop(self): + self._running = False + + @staticmethod + def enqueue(server_id: int, msg_type: str, data: dict): + """Thread-safe. Called from any background thread.""" + try: + _broadcast_queue.put_nowait({ + 'server_id': server_id, + 'type': msg_type, + 'data': data, + }) + except queue.Full: + logger.warning(f"Broadcast queue full, dropping {msg_type} for server {server_id}") +``` + +--- + +## ProcessMonitorThread — Crash Detection & Auto-Restart + +```python +class ProcessMonitorThread(BaseServerThread): + interval = 1.0 + + def tick(self): + proc = ProcessManager.get().get_process(self.server_id) + if proc is None: + self.stop() + return + + exit_code = proc.poll() + if exit_code is not None: + # Process has exited + self._handle_process_exit(exit_code) + self.stop() + + def _handle_process_exit(self, exit_code: int): + is_crash = (exit_code != 0) + status = 'crashed' if is_crash else 'stopped' + + server = ServerRepository(self._db).get_by_id(self.server_id) + ServerRepository(self._db).update_status( + self.server_id, status, pid=None, + stopped_at=datetime.utcnow().isoformat() + ) + PlayerRepository(self._db).clear(self.server_id) + ServerEventRepository(self._db).insert( + 
self.server_id, status, + actor='system', + detail={'exit_code': exit_code} + ) + + BroadcastThread.enqueue(self.server_id, 'status', {'status': status}) + BroadcastThread.enqueue(self.server_id, 'event', { + 'event_type': status, + 'detail': {'exit_code': exit_code} + }) + + # Stop other threads for this server. Must NOT be called synchronously + # from within this thread's own run() if stop_server_threads() joins threads, + # as a thread cannot join itself. Use a daemon thread to do the cleanup + # after this thread's run() returns naturally. + # IMPORTANT: The auto-restart Timer must be started AFTER thread cleanup + # completes. The cleanup daemon thread starts the restart timer when done. + import threading as _threading + + def _cleanup_and_maybe_restart(): + try: + ThreadRegistry.get().stop_server_threads(self.server_id) + # Only schedule restart after threads are fully cleaned up + if is_crash and server.get('auto_restart'): + self._schedule_auto_restart(server) + except Exception as e: + logger.error(f"Cleanup/restart failed for server {self.server_id}: {e}") + BroadcastThread.enqueue(self.server_id, 'event', { + 'event_type': 'auto_restart_failed', + 'detail': {'error': str(e)} + }) + + _threading.Thread( + target=_cleanup_and_maybe_restart, + daemon=True, + name=f"StopCleanup-{self.server_id}" + ).start() + + def _schedule_auto_restart(self, server: dict): + # IMPORTANT: This method runs in the daemon cleanup thread, NOT the + # ProcessMonitorThread. Must create its own DB connection — do NOT + # use self._db (it belongs to the ProcessMonitorThread's thread context + # and may be closed by teardown() already). 
+ from database import get_thread_db + db = get_thread_db() + + restart_count = server['restart_count'] + max_restarts = server['max_restarts'] + window = server['restart_window_seconds'] + last_restart = server.get('last_restart_at') + + # Reset restart_count if last restart was outside the window + if last_restart: + last_dt = datetime.fromisoformat(last_restart) + elapsed = (datetime.utcnow() - last_dt).total_seconds() + if elapsed > window: + ServerRepository(db).reset_restart_count(self.server_id) + restart_count = 0 + + if restart_count < max_restarts: + delay = min(10 * (restart_count + 1), 60) # linear backoff: 10s, 20s, ... capped at 60s + logger.info(f"Auto-restarting server {self.server_id} in {delay}s (attempt {restart_count+1}/{max_restarts})") + threading.Timer(delay, self._auto_restart).start() + else: + logger.warning(f"Server {self.server_id} exceeded max auto-restarts ({max_restarts})") + BroadcastThread.enqueue(self.server_id, 'event', { + 'event_type': 'max_restarts_exceeded', + 'detail': {'restart_count': restart_count} + }) + + def _auto_restart(self): + from servers.service import ServerService + try: + ServerService().start(self.server_id) + except Exception as e: + logger.error(f"Auto-restart failed for server {self.server_id}: {e}") +``` + +--- + +## LogTailThread — RPT File Tailing + +The Arma 3 RPT file grows while the server runs. This thread tails it like `tail -f`. + +```python +class LogTailThread(BaseServerThread): + interval = 0.1 # 100ms + + def setup(self): + self._file = None + self._current_path: Path | None = None + self._last_size: int = 0 + self._open_latest_rpt() + + def _open_latest_rpt(self): + """ + Arma 3 writes timestamped RPT files in the profile subdirectory: + servers/{id}/server/arma3server_YYYY-MM-DD_HH-MM-SS.rpt + + Use rglob('*.rpt') to search recursively within the server dir. + The profile subdirectory is determined by -profiles + -name flags. 
+ + NOTE: Do NOT use os.stat().st_ino for rotation detection — on Windows/NTFS + st_ino is always 0, making inode comparison completely non-functional. + Instead, track the filename and file size. If a newer .rpt appears or the + current file shrinks (truncated/replaced), reopen. + """ + rpt_files = list(Path(get_server_dir(self.server_id)).rglob("*.rpt")) + if not rpt_files: + return # Server hasn't created RPT yet; retry in next tick + + latest = max(rpt_files, key=lambda p: p.stat().st_mtime) + try: + self._file = open(latest, 'r', encoding='utf-8', errors='replace') + self._file.seek(0, 2) # seek to end — tail, don't replay old output + self._current_path = latest + self._last_size = self._file.tell() + except OSError: + self._file = None + + def tick(self): + if self._file is None: + self._open_latest_rpt() + return + + # Rotation detection: only re-glob every 5 seconds (not every 100ms tick) + # to avoid excessive filesystem I/O with large mpmissions directories. + now = time.monotonic() + if now - getattr(self, '_last_glob_time', 0) > 5.0: + self._last_glob_time = now + rpt_files = list(Path(get_server_dir(self.server_id)).rglob("*.rpt")) + if rpt_files: + latest = max(rpt_files, key=lambda p: p.stat().st_mtime) + if latest != self._current_path: + # A new RPT file was created — switch to it + self._file.close() + self._open_latest_rpt() + return + + try: + current_size = self._current_path.stat().st_size + except OSError: + return + + if current_size < self._last_size: + # File shrank — truncated or replaced; reopen + self._file.close() + self._open_latest_rpt() + return + + # Read new lines + while True: + line = self._file.readline() + if not line: + break + self._last_size = self._file.tell() + line = line.rstrip('\n') + if not line: + continue + + entry = RPTParser.parse_line(line) + if entry: + LogRepository(self._db).insert(self.server_id, entry) + BroadcastThread.enqueue(self.server_id, 'log', entry) + + def teardown(self): + """Close the open RPT 
file handle when the thread stops.""" + if self._file is not None: + try: + self._file.close() + except OSError: + pass + self._file = None +``` + +--- + +## RConPollerThread — Player List Synchronization + +```python +class RConPollerThread(BaseServerThread): + interval = 10.0 + STARTUP_DELAY = 30.0 # wait for server to fully initialize + _rcon_ready = False # flag: True only after successful setup + + def setup(self): + # Wait for server to start up before attempting RCon + if self._stop_event.wait(self.STARTUP_DELAY): + self._rcon_ready = False + return # stop was requested during wait + self._rcon = RConService(self.server_id) + self._connected = self._rcon.connect() + self._rcon_ready = True + + def tick(self): + if not self._rcon_ready: + return # setup() failed or was interrupted + if not self._connected: + self._reconnect_attempts = getattr(self, '_reconnect_attempts', 0) + 1 + delay = min(10 * 2 ** self._reconnect_attempts, 120) # exponential backoff + if self._reconnect_attempts > 1: + logger.info(f"RCon reconnect attempt {self._reconnect_attempts} for server {self.server_id} (next in {delay}s)") + if self._stop_event.wait(delay): + return + self._connected = self._rcon.connect() + if not self._connected: + return + self._reconnect_attempts = 0 # reset on successful connection + + try: + players = self._rcon.get_players() + PlayerService(self._db).update_from_rcon(self.server_id, players) + BroadcastThread.enqueue(self.server_id, 'players', { + 'players': [p.dict() for p in players], + 'count': len(players) + }) + except ConnectionError: + self._connected = False + logger.warning(f"RCon connection lost for server {self.server_id}") +``` + +--- + +## Thread Lifecycle + +### Start Server Flow +``` +POST /servers/{id}/start + │ + ├── ServerService.start() + │ ├── ConfigGenerator.write_all() + │ ├── ProcessManager.start() ← creates subprocess.Popen + │ └── ThreadRegistry.start_server_threads(id) + │ ├── ProcessMonitorThread(id).start() + │ ├── 
LogTailThread(id).start() + │ ├── MetricsCollectorThread(id).start() + │ └── RConPollerThread(id).start() + │ + └── BroadcastThread.enqueue(id, 'status', {status: 'starting'}) +``` + +### Stop Server Flow +``` +POST /servers/{id}/stop + │ + ├── RConService.shutdown() ← sends #shutdown via RCon + ├── Wait up to 30s for process exit (ProcessManager.stop(timeout=30)) + ├── If still running: ProcessManager.kill() + ├── ThreadRegistry.stop_server_threads(id) + │ ├── ProcessMonitorThread.stop() (sets _stop_event) + │ ├── LogTailThread.stop() + │ ├── MetricsCollectorThread.stop() + │ └── RConPollerThread.stop() + │ └── Thread.join(timeout=5) for each + │ + └── BroadcastThread.enqueue(id, 'status', {status: 'stopped'}) +``` + +### App Shutdown Flow +``` +FastAPI shutdown event + │ + ├── ThreadRegistry.stop_all() ← stop all threads for all servers + ├── BroadcastThread.stop() + ├── ConnectionManager.close_all() + └── database engine dispose +``` + +--- + +## Stop Event Pattern + +All background threads use a `threading.Event` for graceful shutdown: + +```python +class BaseServerThread(threading.Thread): + def __init__(self, server_id: int, interval: float): + super().__init__(name=f"{self.__class__.__name__}-{server_id}", daemon=True) + self.server_id = server_id + self.interval = interval + self._stop_event = threading.Event() + + def stop(self): + self._stop_event.set() + + def is_stopped(self) -> bool: + return self._stop_event.is_set() + + def teardown(self): + """Override to release resources (close files, sockets) after the loop ends.""" + pass + + def run(self): + try: + self.setup() + except Exception as e: + logger.error(f"{self.name} setup error: {e}") + return # setup failed completely — no partial resources to clean + + try: + while not self._stop_event.is_set(): + try: + self.tick() + except Exception as e: + self.on_error(e) + # Use wait() instead of sleep() — responds immediately to stop() + self._stop_event.wait(self.interval) + finally: + self.teardown() # 
always runs; subclasses close files/sockets here +``` + +--- + +## WebSocket Connection Manager (asyncio) + +```python +# websocket/manager.py +class ConnectionManager: + def __init__(self): + # server_id → set[WebSocket] + # Use set (not list) so .add()/.discard() work correctly. + self._connections: dict[str, set[WebSocket]] = defaultdict(set) + # Per-connection channel subscriptions: ws → set[str] + self._channel_subs: dict[WebSocket, set[str]] = defaultdict(set) + self._lock = asyncio.Lock() + + async def connect(self, ws: WebSocket, server_id: str): + await ws.accept() + async with self._lock: + self._connections[server_id].add(ws) + self._channel_subs[ws].add('status') # default channel + # Only add to 'all' bucket if server_id is explicitly 'all' + if server_id == 'all': + self._connections['all'].add(ws) + + async def disconnect(self, ws: WebSocket, server_id: str): + async with self._lock: + self._connections[server_id].discard(ws) + self._connections['all'].discard(ws) + self._channel_subs.pop(ws, None) + + async def subscribe(self, ws: WebSocket, channels: list[str]): + async with self._lock: + self._channel_subs[ws].update(channels) + + async def unsubscribe(self, ws: WebSocket, channels: list[str]): + async with self._lock: + self._channel_subs[ws].difference_update(channels) + + async def broadcast(self, server_id: str, message: dict, channel: str = None): + """Send to all clients subscribed to server_id AND the message's channel.""" + targets: set[WebSocket] = set() + async with self._lock: + # Collect clients for this server_id + 'all' subscribers + server_clients = self._connections.get(server_id, set()) + all_clients = self._connections.get('all', set()) + candidates = server_clients | all_clients + + # Filter by channel subscription if specified + if channel: + targets = {ws for ws in candidates + if channel in self._channel_subs.get(ws, set())} + else: + targets = candidates + + dead = [] + for ws in targets: + try: + await ws.send_json(message) 
+ except Exception: + dead.append(ws) + + # Clean up dead connections + if dead: + async with self._lock: + for ws in dead: + for bucket in self._connections.values(): + bucket.discard(ws) + self._channel_subs.pop(ws, None) +``` + +--- + +## Memory & Performance Considerations + +| Thread | Memory Impact | CPU Impact | +|--------|--------------|-----------| +| ProcessMonitorThread | Minimal (one `proc.poll()` check) | Negligible | +| LogTailThread | Buffer for unread log lines | Low (file I/O) | +| MetricsCollectorThread | psutil subprocess scan | Low-Medium | +| RConPollerThread | UDP socket + response buffer | Low | +| BroadcastThread | Queue buffer (max 10000 entries) | Low | + +### Recommendations +- Set all threads as `daemon=True` — they die automatically if main process exits +- `broadcast_queue.maxsize=10000` — backpressure; drop on Full (log warning) +- `LogTailThread` buffers max ~100 lines per tick before writing to DB in batch +- `MetricsCollectorThread` uses `psutil.Process.cpu_percent(interval=0.5)` — blocks 500ms, acceptable at 5s interval +- For N=10 servers: 41 background threads — well within Python's thread limits
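
The batching recommendation for `LogTailThread` is not reflected in the per-line insert shown in the tailing sketch earlier. One possible shape for it is below, as a rough illustration only: `read_batch` and `MAX_LINES_PER_TICK` are hypothetical names, and the cap of 100 is taken from the "~100 lines per tick" figure above.

```python
import io
from typing import TextIO

MAX_LINES_PER_TICK = 100  # assumption: mirrors the "~100 lines per tick" note


def read_batch(handle: TextIO, max_lines: int = MAX_LINES_PER_TICK) -> list[str]:
    """Read up to max_lines complete, non-empty lines from the current position."""
    batch: list[str] = []
    while len(batch) < max_lines:
        pos = handle.tell()
        line = handle.readline()
        if not line:
            break  # reached end of file; nothing more to read this tick
        if not line.endswith("\n"):
            # Partial line: the writer is mid-line. Rewind and retry next tick.
            handle.seek(pos)
            break
        line = line.rstrip("\r\n")
        if line:
            batch.append(line)
    return batch


# Usage: simulate an RPT file with a blank line and an unfinished last line.
rpt = io.StringIO("first\n\nsecond\nthird\nunfinished")
print(read_batch(rpt, max_lines=2))
print(read_batch(rpt))
```

In `tick()`, the returned batch could then be parsed and written in a single transaction (for example through a hypothetical `LogRepository.insert_many`), which reduces one commit per line to one commit per tick.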