# Languard Servers Manager: Implementation Plan

## Prerequisites

Before starting, ensure the following are available:
- Python 3.11+
- A working Arma 3 dedicated server installation (for testing)
- Node.js 18+ (for frontend dev server)
- The reference docs: ARCHITECTURE.md, DATABASE.md, API.md, MODULES.md, THREADING.md

---

## Phase 1: Foundation (Start Here)

**Goal:** Running FastAPI server with DB, auth, and basic server CRUD.

### Step 1.1: Project scaffold

```
mkdir backend
cd backend
python -m venv venv
venv\Scripts\activate  # Windows; on Linux/macOS use: source venv/bin/activate
pip install fastapi uvicorn[standard] sqlalchemy python-jose[cryptography] passlib[bcrypt] cryptography psutil apscheduler python-multipart slowapi pytest pytest-asyncio httpx
# uvloop (faster event loop) is Linux/macOS only; skip on Windows:
# pip install uvloop  # only on Linux/macOS
pip freeze > requirements.txt
```

Create:
- `backend/config.py`: Settings class (see MODULES.md)
- `backend/main.py`: FastAPI app factory, startup/shutdown hooks
- `backend/conftest.py`: pytest fixtures (in-memory SQLite, test client)
- `.env.example`: all env vars documented

### Step 1.2: Database + Migrations

1. Create `backend/migrations/001_initial_schema.sql` with all tables from DATABASE.md
   - Include all CHECK constraints (role, status, verify_signatures, von_codec_quality, etc.)
   - Include `PRAGMA busy_timeout=5000` in engine setup
   - **Important:** Put `CREATE TABLE IF NOT EXISTS schema_migrations` as the very first statement; the migration runner queries this table before it can track anything.
2. Create `backend/dal/event_repository.py` with `ServerEventRepository` (needed by Phase 3 threads)
3. Create `backend/database.py`:
   - `get_engine()` with WAL + FK pragma
   - `run_migrations()`: reads and applies `.sql` files from migrations/
   - `get_db()`: FastAPI dependency (sync session)
   - `get_thread_db()`: thread-local session factory
4. Call `run_migrations()` in `main.py:on_startup()`

**Test:** Start app, confirm `languard.db` created with all tables.
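
The migration runner from Step 1.2 can be sketched roughly as follows. This is a minimal illustration using the standard-library `sqlite3` module rather than the SQLAlchemy engine the plan calls for; the `filename` column of `schema_migrations`, the helper `applied_migrations()`, and the two-argument signature of `run_migrations()` are assumptions made for the example, and DATABASE.md remains the authority on the real schema.

```python
import sqlite3
from pathlib import Path
from typing import List, Set


def applied_migrations(conn: sqlite3.Connection) -> Set[str]:
    """Return filenames already recorded, or an empty set on a fresh database."""
    try:
        rows = conn.execute("SELECT filename FROM schema_migrations").fetchall()
    except sqlite3.OperationalError:
        # The tracking table does not exist yet, so nothing has been applied.
        return set()
    return {row[0] for row in rows}


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> List[str]:
    """Apply pending .sql files in filename order and return the names applied."""
    done = applied_migrations(conn)
    newly_applied = []
    for path in sorted(migrations_dir.glob("*.sql")):
        if path.name in done:
            continue
        # The first migration is expected to create schema_migrations itself.
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT INTO schema_migrations (filename) VALUES (?)", (path.name,)
        )
        conn.commit()
        newly_applied.append(path.name)
    return newly_applied
```

On a fresh database the `SELECT` fails because `schema_migrations` does not exist yet, which is why the first migration file has to create that table before the runner can record anything. Running the function a second time should apply nothing.
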
Run `pytest` with in-memory SQLite to verify schema creates cleanly.

### Step 1.3: Auth module

1. `backend/auth/utils.py`: `hash_password`, `verify_password`, `create_access_token`, `decode_access_token`
2. `backend/auth/schemas.py`: `LoginRequest`, `TokenResponse`, `UserResponse`
3. `backend/auth/service.py`: `AuthService` (create user, login, list users)
4. `backend/auth/router.py`: login, me, users CRUD
5. `backend/dependencies.py`: `get_current_user`, `require_admin`
6. `main.py`: seed default admin user on first startup if users table empty
   - **Generate a random password** and print it to stdout once (NOT admin/admin)
   - Add rate limiting to `POST /auth/login` (5 attempts/minute per IP via slowapi)
   - Add input sanitization for all string fields in auth schemas

**Test:** `POST /api/auth/login` returns JWT. `GET /api/auth/me` with token returns user. Rate limiting returns 429 after 5 failed attempts.

### Step 1.4: Server CRUD (no process management yet)

1. `backend/dal/server_repository.py`
2. `backend/dal/config_repository.py`
3. `backend/servers/schemas.py`
4. `backend/servers/router.py`: GET, POST, PUT, DELETE /servers and /servers/{id}
5. `backend/servers/service.py`: CRUD methods only (skip start/stop for now)
6. `backend/utils/file_utils.py`: `ensure_server_dirs()`, `sanitize_filename()`
7. `backend/utils/port_checker.py`: `is_port_in_use()`, `check_server_ports_available()`
8. Port validation on create/start: check game_port through game_port+4

**Test:** Create server via API, confirm DB row + directory created.

---

## Phase 2: Process Management

**Goal:** Start/stop actual `arma3server.exe` processes.

### Step 2.1: Config Generator

1. `backend/servers/config_generator.py`
2. **Use a structured builder** (NOT f-strings); escape double quotes and newlines in all user-supplied string values to prevent config injection
3. Write `server.cfg` covering all params from DATABASE.md, including mission rotation as `class Missions {}` block
4.
   Write `basic.cfg`
5. Write `server.Arma3Profile`, **written to `servers/{id}/server/server.Arma3Profile`** (Arma 3 reads from the `-name` subdirectory)
6. Write `BESERVER_CFG_TEMPLATE`, **required for BattlEye RCon to work**
   ```
   # servers/{id}/battleye/beserver.cfg
   RConPassword {rcon_password}
   RConPort {rcon_port}
   ```
   `write_beserver_cfg()` must create the `battleye/` directory and write this file. Without it BattlEye will not open an RCon port regardless of launch parameters.
7. `build_launch_args()`: assembles full CLI arg list
   - Include `-bepath=./battleye` to point BE at the generated config (relative to cwd)
   - Include `-profiles=./` and `-name=server` for profile directory
   - All relative paths resolve against `cwd=servers/{id}/` set in ProcessManager
8. Set file permissions 0600 on config files containing passwords (server.cfg, beserver.cfg)

**Test:** `ConfigGenerator.write_all(server_id)` → inspect all generated files for correctness.
Verify `servers/{id}/battleye/beserver.cfg` exists with the correct RCon password.
Verify `servers/{id}/server/server.Arma3Profile` exists.
Test config injection prevention: set hostname to `X"; passwordAdmin = "pwned"; //` and verify generated server.cfg does NOT contain the injected directive.
Validate generated `server.cfg` manually by running the server with it.

### Step 2.2: Process Manager

1. `backend/servers/process_manager.py`: `ProcessManager` singleton
2. `start(server_id, exe_path, args, cwd=servers/{id}/)`: subprocess.Popen with cwd set to server instance dir
3. `stop(server_id, timeout=30)`: on Windows, `terminate()` is a hard kill (no SIGTERM). Graceful shutdown is via RCon `#shutdown` in ServerService.
4. `kill()`, `is_running()`, `get_pid()`
5. `recover_on_startup()`: verify PID is alive AND process name matches arma3server (prevents PID reuse)
6. Wire `ServerService.start()` and `ServerService.stop()`
7.
   Add `POST /servers/{id}/start`, `POST /servers/{id}/stop`, `POST /servers/{id}/kill` endpoints

**Test:** Start a server via API → confirm process appears in Task Manager. Stop it → confirm process ends.

### Step 2.3: Config endpoints

1. `GET /servers/{id}/config`
2. `PUT /servers/{id}/config/server`
3. `PUT /servers/{id}/config/basic`
4. `PUT /servers/{id}/config/profile`
5. `PUT /servers/{id}/config/launch`
6. `GET /servers/{id}/config/preview`

**Test:** Update hostname via API → regenerate and start server → confirm new hostname appears in server browser.

---

## Phase 3: Background Threads

**Goal:** Live monitoring: process crash detection, log tailing, metrics.

### Step 3.1: Thread infrastructure

1. `backend/threads/base_thread.py`: `BaseServerThread`
2. `backend/threads/thread_registry.py`: `ThreadRegistry` singleton
3. Wire `start_server_threads()` / `stop_server_threads()` into `ServerService.start()` / `ServerService.stop()`

### Step 3.2: Process Monitor Thread

1. `backend/threads/process_monitor.py`
2. Crash detection + status update in DB
3. Auto-restart with exponential backoff

**Test:** Start server → kill process manually → confirm DB status changes to 'crashed'.

**Test:** Enable auto_restart → kill → confirm server restarts automatically.

### Step 3.3: Log Tail Thread

1. `backend/logs/parser.py`: `RPTParser`
2. `backend/dal/log_repository.py`
3. `backend/threads/log_tail.py`
4. `backend/logs/service.py`
5. `backend/logs/router.py`: `GET /servers/{id}/logs`

**Test:** Start server → `GET /api/servers/{id}/logs` returns recent RPT lines.

### Step 3.4: Metrics Collector Thread

1. `backend/metrics/service.py`
2. `backend/dal/metrics_repository.py`
3. `backend/threads/metrics_collector.py`
4. `backend/metrics/router.py`: `GET /servers/{id}/metrics`

**Test:** Running server → query metrics endpoint → see CPU/RAM data points.

---

## Phase 4: BattlEye RCon

**Goal:** Real-time player list, in-game admin commands.

### Step 4.1: RCon Client

1.
   `backend/rcon/client.py`: `BERConClient`
2. Implement BE RCon UDP protocol:
   - Packet structure: `'BE'` + CRC32 (little-endian) + `0xFF` + type byte + payload
   - Login: type `0x00`, payload = password
   - Command: type `0x01`, payload = sequence byte + command string
   - Keepalive: type `0x01` with a sequence byte and an empty command string (type `0x02` is reserved for server messages, see item 5)
3. **Request multiplexer**: track pending requests by sequence byte, route responses to correct caller via `threading.Event` per request. Background receiver thread reads all incoming packets.
4. `parse_players_response()`: parse `players` command output
5. Handle unsolicited server messages (type `0x02`): acknowledge each one with a type `0x02` packet carrying the received sequence byte, then enqueue it for event logging

BattlEye RCon packet format reference:

```
Login packet (client → server):
  42 45          # 'BE'
  [CRC32 LE]     # checksum of bytes after CRC
  FF             # packet type prefix
  00             # login type
  [password]     # ASCII password

Command packet:
  42 45 [CRC32 LE] FF
  01
  [seq byte]     # 0x00-0xFF, wraps around
  [command]      # ASCII command string

Command response (server → client):
  42 45 [CRC32 LE] FF
  01             # 0x01 = command response (same type byte as outgoing command)
  [seq byte]
  [response]     # ASCII response text

Server-pushed message (server → client, unsolicited):
  42 45 [CRC32 LE] FF
  02             # 0x02 = server message (chat events, kill events, etc.)
  [seq byte]
  [message]      # ASCII message text
```

**Test:** Connect BERConClient to a running server with BattlEye → successfully login → send `players` → receive response.

### Step 4.2: RCon Service + Poller Thread

1. `backend/rcon/service.py`: `RConService`
2. `backend/threads/rcon_poller.py`
3. `backend/dal/player_repository.py`
4. `backend/players/service.py`
5. `backend/players/router.py`: `GET /servers/{id}/players`

**Test:** Players join server → `GET /players` returns them with pings.

### Step 4.3: Admin Actions via RCon

1. `POST /servers/{id}/players/{num}/kick`
2. `POST /servers/{id}/players/{num}/ban`
3. `POST /servers/{id}/rcon/command`
4. `POST /servers/{id}/rcon/say`
5. `backend/dal/ban_repository.py`
6.
   `GET/POST/DELETE /servers/{id}/bans`
7. **ban.txt bidirectional sync**: on ban add/delete via API, write to `battleye/ban.txt`; on startup, read `ban.txt` and upsert into DB

**Test:** Kick a player via API → confirm player disconnected from server.

---

## Phase 5: WebSocket Real-Time

**Goal:** Live updates to React frontend without polling.

### Step 5.1: Broadcast infrastructure

1. `backend/websocket/broadcaster.py`: `BroadcastThread` + `enqueue()`
2. `backend/websocket/manager.py`: `ConnectionManager`
3. Store event loop reference in `main.py:on_startup()`:
   ```python
   import asyncio

   # on_startup() runs inside the asyncio event loop, so get_running_loop()
   # returns that loop; it raises RuntimeError if no loop is running, which
   # makes a misplaced call fail loudly instead of creating a second loop.
   _event_loop = asyncio.get_running_loop()
   # broadcaster and connection_manager are the objects from items 1 and 2.
   broadcaster.init(_event_loop, connection_manager)
   ```
4. Start `BroadcastThread` in `on_startup()`
5. Wire `BroadcastThread.enqueue()` calls into all background threads

### Step 5.2: WebSocket endpoint

1. `backend/websocket/router.py`
2. JWT validation from query param
3. Subscribe/unsubscribe message handling
4. Ping/pong keepalive

**Test:** Connect to `ws://localhost:8000/ws/1?token=...` → see live log lines stream in terminal.

### Step 5.3: Integrate all event sources

Wire `BroadcastThread.enqueue()` into:
- `ProcessMonitorThread` → status updates, crash events
- `LogTailThread` → log lines
- `MetricsCollectorThread` → metrics snapshots
- `RConPollerThread` → player list updates
- `ServerService.start/stop` → status transitions

**Test:** React frontend connects to WS → server starts → see status, logs, metrics all update in real time.

---

## Phase 6: Mission & Mod Management

### Step 6.1: Missions

1. `backend/missions/service.py`
2. `backend/missions/router.py`
3. Upload PBO validation (check `.pbo` extension, parse name)
4. Mission rotation CRUD

**Test:** Upload a `.pbo` → appears in `GET /missions` → set as rotation → start server → mission available.

### Step 6.2: Mods

1.
   `backend/mods/service.py`
2. `backend/mods/router.py`
3. `build_mod_string()`: assemble `-mod=` and `-serverMod=` args
4. Wire mod string into `ConfigGenerator.build_launch_args()`

**Test:** Register `@CBA_A3` → enable on server → start → server loads mod.

---

## Phase 7: Polish & Production

### Step 7.1: APScheduler jobs

Add to `on_startup()`:

```python
# Use BackgroundScheduler (not AsyncIOScheduler) because cleanup methods
# perform sync SQLite operations. AsyncIOScheduler would block the event loop.
from apscheduler.schedulers.background import BackgroundScheduler

scheduler = BackgroundScheduler()
scheduler.add_job(log_service.cleanup_old_logs, 'cron', hour=3)
scheduler.add_job(metrics_service.cleanup_old_metrics, 'cron', hour=3, minute=30)
scheduler.add_job(player_service.cleanup_old_history, 'cron', hour=4)  # 90-day retention
scheduler.start()
```

### Step 7.2: Startup recovery

In `on_startup()` → `ProcessManager.recover_on_startup()`:
- Query DB for servers with `status='running'`
- Check if PID still alive (`psutil.pid_exists(pid)`) and, as in Step 2.2, that the process name matches arma3server
- If alive: re-attach threads (skip process start, just start monitoring threads)
- If dead: mark as `crashed`, clear players

### Step 7.3: Events log

1. `backend/dal/event_repository.py` (created in Step 1.2)
2. Insert events for: start, stop, crash, kick, ban, config change, mission change
3. `GET /servers/{id}/events` endpoint

### Step 7.4: Security hardening (additional layers)

1. Encrypt sensitive DB fields: `password`, `password_admin`, `rcon_password`
   - `backend/utils/crypto.py` with Fernet
   - **Key format:** `LANGUARD_ENCRYPTION_KEY` must be a Fernet base64 key, NOT hex. Generate with:
     `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`
     Passing a hex string to `Fernet()` raises `ValueError` at startup.
   - Encrypt on write, decrypt on read in repositories
   - **NOTE:** Core security (rate limiting, input sanitization, config escaping, exe path validation) is already in Phases 1-2.
2.
   Additional penetration testing and security audit
3. Content-Security-Policy headers for frontend

### Step 7.5: Frontend integration checklist

Verify React app can:
- [ ] Login and store JWT
- [ ] List servers with live status
- [ ] Start/stop server and see status update via WebSocket (no page refresh)
- [ ] View streaming log output
- [ ] See player list update every 10s
- [ ] See CPU/RAM charts update every 5s
- [ ] Edit all config sections and see preview
- [ ] Upload a mission PBO
- [ ] Kick a player
- [ ] Send a message to all players

---

## Testing Strategy

### Unit tests (pytest)
- `ConfigGenerator.write_server_cfg()`: compare output against expected string; test config injection prevention
- `ConfigGenerator._escape_config_string()`: test double-quote and newline escaping
- `RPTParser.parse_line()`: test all log formats
- `BERConClient.parse_players_response()`: test with sample output
- `AuthService.login()`: correct password / wrong password / rate limiting
- Repository methods: use in-memory SQLite (`:memory:`)
- `check_server_ports_available()`: test derived port validation
- `sanitize_filename()`: test path traversal prevention
- In-memory SQLite setup in `conftest.py`: shared fixture for all repository tests

### Integration tests
- Full start/stop cycle with a real arma3server.exe (manual; requires licensed Arma 3 installation, not in CI)
- WebSocket message delivery (can be automated with FastAPI's `TestClient.websocket_connect()`, which is built on httpx)
- RCon command round-trip (manual; requires running server with BattlEye)

### Load notes
- SQLite with WAL handles concurrent reads from 4 threads per server well
- For >10 simultaneous servers, consider connection pool size tuning
- WebSocket broadcast is expected to scale to ~100 concurrent connections without issue

---

## Environment Setup (Developer)

```bash
# 1. Clone repo
git clone
cd languard-servers-manager

# 2. Backend
cd backend
python -m venv venv
source venv/bin/activate  # or venv\Scripts\activate on Windows
pip install -r requirements.txt

# 3. Environment
cp .env.example .env
# Edit .env: set LANGUARD_ARMA_EXE to your arma3server_x64.exe path

# 4. Run backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# 5. Frontend (separate)
cd ../frontend
npm install
npm run dev
```

Backend auto-creates `languard.db` and seeds an admin user on first run:
- Username: `admin`
- Password: **randomly generated** and printed to stdout once (e.g., `Initial admin password: a7b9c2d4e5f6...`)
- Change immediately via `PUT /api/auth/password`

---

## Phase Summary

| Phase | Deliverable | Est. Complexity |
|-------|-------------|-----------------|
| 1 | Foundation (auth + server CRUD) | Low |
| 2 | Process management + config gen | Medium |
| 3 | Background threads (monitor, logs, metrics) | Medium-High |
| 4 | BattlEye RCon (player list, admin cmds) | High |
| 5 | WebSocket real-time | Medium |
| 6 | Mission + mod management | Low-Medium |
| 7 | Polish, security, recovery | Medium |

Implement phases in order; each phase builds on the previous and is independently testable.
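
As a Phase 1 starting point, the first-run admin seeding described under Environment Setup could look something like the sketch below. It is illustrative only: the `users` table columns, the function names, and the use of stdlib PBKDF2 in place of the `passlib[bcrypt]` hashing the plan installs are all assumptions chosen to keep the example self-contained.

```python
import hashlib
import secrets
import sqlite3
from typing import Optional


def hash_password(password: str) -> str:
    """Stand-in hash using stdlib PBKDF2; the plan itself uses passlib[bcrypt]."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 200_000)
    return f"{salt.hex()}${digest.hex()}"


def seed_admin_if_empty(conn: sqlite3.Connection) -> Optional[str]:
    """Create the admin user only when the users table is empty.

    Returns the one-time plaintext password, or None if users already exist.
    The column names below are hypothetical; DATABASE.md defines the real ones.
    """
    (count,) = conn.execute("SELECT COUNT(*) FROM users").fetchone()
    if count:
        return None
    password = secrets.token_urlsafe(16)  # random, never a fixed default
    conn.execute(
        "INSERT INTO users (username, password_hash, role) VALUES (?, ?, ?)",
        ("admin", hash_password(password), "admin"),
    )
    conn.commit()
    # Printed once at first startup; only the hash is stored.
    print(f"Initial admin password: {password}")
    return password
```

Because the function returns `None` whenever a user already exists, calling it on every startup is safe: the password is generated and printed only on the very first run.
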