feat: initial system design documents for Languard Server Manager

Complete backend design for an Arma 3 dedicated server management panel:
- ARCHITECTURE.md: System architecture, tech stack, component responsibilities, data flows
- DATABASE.md: SQLite schema with WAL mode, CHECK constraints, 16+ tables
- API.md: REST + WebSocket API contract with auth, CRUD, and real-time channels
- MODULES.md: Python module breakdown with class definitions and dependencies
- THREADING.md: Concurrency model with thread safety, auto-restart, and WS bridge
- IMPLEMENTATION_PLAN.md: 7-phase implementation plan with security from Phase 1

Key design decisions:
- Sync SQLAlchemy only (no aiosqlite), thread-local DB connections
- Structured config builder (not f-strings) preventing config injection
- RCon request multiplexer for concurrent UDP access
- BackgroundScheduler for sync DB cleanup jobs
- ban.txt bidirectional sync with documented field mapping
- Auto-restart sequenced after thread cleanup

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Tran G. (Revernomad) Khoa
2026-04-16 13:54:30 +07:00
commit 473f585391
6 changed files with 3595 additions and 0 deletions

755
API.md Normal file

@@ -0,0 +1,755 @@
# Languard Server Manager — API Contract
## Base URL
```
http://localhost:8000/api
```
## Authentication
- All endpoints except `POST /auth/login` require: `Authorization: Bearer <JWT>`
- WebSocket: pass token as query param: `ws://localhost:8000/ws/{server_id}?token=<JWT>`
- JWT payload: `{ "sub": "user_id", "username": "string", "role": "admin|viewer", "exp": timestamp }`
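The payload layout above can be illustrated with a stdlib-only HS256 sketch. This is an assumption-laden illustration, not the project's code: the architecture document names python-jose for the real implementation, and the helper names `encode_jwt` and `decode_jwt` are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(raw: bytes) -> str:
    """Base64url-encode without padding, as JWT requires."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    """Restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode(), hashlib.sha256).digest()


def encode_jwt(payload: dict, secret: str) -> str:
    """Hypothetical helper: build header.payload.signature with HS256."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":")).encode("utf-8"))
        for part in (header, payload)
    )
    return signing_input + "." + _b64url(_sign(signing_input, secret))


def decode_jwt(token: str, secret: str, now: Optional[float] = None) -> dict:
    """Hypothetical helper: verify algorithm, signature and `exp`, then return the payload."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    header_b64, payload_b64, signature_b64 = parts
    if json.loads(_b64url_decode(header_b64)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    expected = _sign(header_b64 + "." + payload_b64, secret)
    if not hmac.compare_digest(expected, _b64url_decode(signature_b64)):
        raise ValueError("bad signature")
    payload = json.loads(_b64url_decode(payload_b64))
    if payload["exp"] <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return payload
```

A WebSocket handler could call the same `decode_jwt` on the `token` query parameter before accepting the connection.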
## Common Response Envelope
```json
{
"success": true,
"data": { ... },
"error": null
}
```
Error response:
```json
{
"success": false,
"data": null,
"error": {
"code": "NOT_FOUND",
"message": "Server with id 5 not found"
}
}
```
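As a minimal sketch, both envelope shapes can be produced by two small helpers. The function names `ok` and `fail` are hypothetical; only the field layout comes from the contract above.

```python
from typing import Any


def ok(data: Any = None) -> dict:
    """Success envelope: `error` is always null."""
    return {"success": True, "data": data, "error": None}


def fail(code: str, message: str) -> dict:
    """Error envelope: `data` is always null, `error` carries code and message."""
    return {"success": False, "data": None, "error": {"code": code, "message": message}}
```

Routing every response through helpers like these keeps the three top-level keys present in all cases, so the frontend can branch on `success` alone.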
## HTTP Status Codes
| Code | Meaning |
|------|---------|
| 200 | Success |
| 201 | Created |
| 204 | No content (DELETE) |
| 400 | Validation error |
| 401 | Unauthenticated |
| 403 | Forbidden (insufficient role) |
| 404 | Not found |
| 409 | Conflict (already running, duplicate) |
| 422 | Unprocessable (Pydantic validation) |
| 500 | Internal server error |
---
## Auth Endpoints
### POST /auth/login
Login and receive JWT.
**Request:**
```json
{
"username": "admin",
"password": "secret"
}
```
**Response 200:**
```json
{
"success": true,
"data": {
"access_token": "eyJhbGciOiJIUzI1NiIs...",
"token_type": "bearer",
"expires_in": 86400,
"user": {
"id": 1,
"username": "admin",
"role": "admin"
}
}
}
```
### POST /auth/logout
Invalidate token (client-side token deletion; server-side blacklist optional).
### GET /auth/me
Return current user info.
### PUT /auth/password
Change password. Admin only.
```json
{ "current_password": "old", "new_password": "new" }
```
### GET /auth/users
List all users. Admin only.
### POST /auth/users
Create user. Admin only.
```json
{ "username": "viewer1", "password": "pass", "role": "viewer" }
```
### DELETE /auth/users/{user_id}
Delete user. Admin only.
---
## Server Endpoints
### GET /servers
List all servers with current status. Supports pagination.
**Query params:** `?limit=50&offset=0`
**Response 200:**
```json
{
"success": true,
"data": [
{
"id": 1,
"name": "Main Server",
"description": "Primary COOP server",
"status": "running",
"pid": 12345,
"game_port": 2302,
"rcon_port": 2306,
"player_count": 15,
"max_players": 40,
"current_mission": "MyMission.Altis",
"uptime_seconds": 3600,
"cpu_percent": 34.2,
"ram_mb": 1850.5,
"started_at": "2026-04-16T10:00:00Z"
}
]
}
```
**Note:** `current_mission` is computed from the RCon `players` response or from `mission_rotation` plus server status. `uptime_seconds` is computed as `now - started_at` in the service layer.
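The `uptime_seconds` field is described as `now - started_at`. A minimal sketch of that computation, assuming `started_at` is stored as an ISO 8601 UTC string as in the example response (the function name is hypothetical):

```python
from datetime import datetime, timezone
from typing import Optional


def uptime_seconds(started_at: Optional[str], now: Optional[datetime] = None) -> Optional[int]:
    """Return whole seconds since `started_at`, or None for a server that is not running."""
    if started_at is None:
        return None
    # datetime.fromisoformat() on older Python versions does not accept a trailing "Z".
    started = datetime.fromisoformat(started_at.replace("Z", "+00:00"))
    current = now or datetime.now(timezone.utc)
    # Clamp at zero in case of clock skew between write and read.
    return max(0, int((current - started).total_seconds()))
```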
### POST /servers
Create a new server. Admin only.
**Request:**
```json
{
"name": "Main Server",
"description": "Primary COOP server",
"exe_path": "C:/Arma3Server/arma3server_x64.exe",
"game_port": 2302,
"rcon_port": 2306,
"auto_restart": true,
"max_restarts": 3
}
```
**Note:** `password_admin` and `rcon_password` are auto-generated when omitted from the request. Generated values are returned in the creation response only (shown once); later API responses never include them in plaintext.
**Response 201:** Returns full server object including auto-generated credentials.
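One way the auto-generation could work is sketched below. The helper names, the 24-character length, and the alphanumeric alphabet are assumptions; the contract only states that missing credentials are generated.

```python
import secrets
import string

# Assumption: alphanumeric only, so generated values never need escaping in config files.
_ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 24) -> str:
    """Draw each character from a cryptographically secure source."""
    return "".join(secrets.choice(_ALPHABET) for _ in range(length))


def fill_credentials(request: dict) -> dict:
    """Return a copy of the request with missing credentials generated."""
    filled = dict(request)
    for field in ("password_admin", "rcon_password"):
        if not filled.get(field):
            filled[field] = generate_password()
    return filled
```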
### GET /servers/{server_id}
Get server detail with full status.
**Response 200:**
```json
{
"success": true,
"data": {
"id": 1,
"name": "Main Server",
"status": "running",
"pid": 12345,
"game_port": 2302,
"rcon_port": 2306,
"auto_restart": true,
"restart_count": 0,
"player_count": 15,
"max_players": 40,
"cpu_percent": 34.2,
"ram_mb": 1850.5,
"started_at": "2026-04-16T10:00:00Z",
"uptime_seconds": 3600,
"current_mission": "MyMission.Altis"
}
}
```
### PUT /servers/{server_id}
Update server metadata (name, description, exe_path, ports). Admin only.
### DELETE /servers/{server_id}
Delete server (must be stopped first). Admin only. Removes DB rows and `servers/{id}/` directory.
### POST /servers/{server_id}/start
Start the server. Admin only.
**Response 200:**
```json
{ "success": true, "data": { "status": "starting", "pid": null } }
```
**Response 409:** Server already running.
### POST /servers/{server_id}/stop
Graceful stop: sends `#shutdown` via RCon, then force-kills the process if it is still running after 30s. Admin only.
**Request (optional):**
```json
{ "force": false, "reason": "Maintenance" }
```
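The stop sequence can be sketched as a function over three injected callables. Everything here is illustrative: `send_shutdown`, `is_running`, and `kill` stand in for whatever the RCon service and process manager expose, and the return strings are hypothetical.

```python
import time
from typing import Callable


def stop_server(
    send_shutdown: Callable[[], None],
    is_running: Callable[[], bool],
    kill: Callable[[], None],
    force: bool = False,
    grace_seconds: float = 30.0,
    poll_seconds: float = 0.5,
) -> str:
    """Try a graceful shutdown first; fall back to a hard kill after the grace period."""
    if not force:
        send_shutdown()  # e.g. RCon "#shutdown"
        deadline = time.monotonic() + grace_seconds
        while time.monotonic() < deadline:
            if not is_running():
                return "stopped"
            time.sleep(poll_seconds)
    if is_running():
        kill()
        return "killed"
    return "stopped"
```

With `force=true` in the request body, the sketch skips the RCon step and goes straight to the kill.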
### POST /servers/{server_id}/restart
Stop then start. Admin only.
### POST /servers/{server_id}/kill
Force-kill the process immediately. Admin only. Emergency use only.
---
## Server Config Endpoints
### GET /servers/{server_id}/config
Get all config sections combined.
**Response 200:**
```json
{
"success": true,
"data": {
"server": { /* server_configs row */ },
"basic": { /* basic_configs row */ },
"profile": { /* server_profiles row */ },
"launch": { /* launch_params row */ },
"rcon": { "rcon_password": "***", "max_ping": 200, "enabled": true }
}
}
```
### PUT /servers/{server_id}/config/server
Update server.cfg settings. Admin only.
**Request:** Partial object matching `server_configs` columns (snake_case). Any omitted field keeps current value.
```json
{
"hostname": "Updated Server Name",
"max_players": 64,
"battleye": 1,
"verify_signatures": 2,
"motd_lines": ["Welcome!", "Have fun"],
"motd_interval": 5.0
}
```
### PUT /servers/{server_id}/config/basic
Update basic.cfg (bandwidth) settings. Admin only.
```json
{
"max_bandwidth": 50000000,
"max_msg_send": 256
}
```
### PUT /servers/{server_id}/config/profile
Update difficulty profile. Admin only.
```json
{
"third_person_view": 0,
"weapon_crosshair": 0,
"ai_level_preset": 3,
"skill_ai": 0.7,
"precision_ai": 0.6
}
```
### PUT /servers/{server_id}/config/launch
Update launch parameters. Admin only.
```json
{
"world": "empty",
"limit_fps": 50,
"auto_init": false,
"load_mission_to_memory": true,
"bandwidth_alg": 2,
"enable_ht": true,
"huge_pages": false
}
```
### PUT /servers/{server_id}/config/rcon
Update BattlEye RCon settings. Admin only.
Regenerates `battleye/beserver.cfg` immediately. **Note:** BattlEye reads beserver.cfg only at server startup — RCon config changes require a server restart to take effect. The updated config file is ready for the next start.
**Note on `rcon_port`:** This field is stored in the `servers` table (not `rcon_configs`).
The service layer updates both tables as needed. Include only fields you want to change.
```json
{
"rcon_password": "newpassword",
"rcon_port": 2306,
"max_ping": 300,
"enabled": true
}
```
### GET /servers/{server_id}/config/preview
Returns rendered `server.cfg` as plain text string (for preview in UI). **Admin only** — contains plaintext credentials.
**Response 200:** `Content-Type: text/plain`
### GET /servers/{server_id}/config/download/{filename}
Download generated config file. Filename must be one of: `server.cfg` | `basic.cfg` | `server.Arma3Profile` (whitelist-validated, no path traversal). **Admin only** — config files contain plaintext passwords.
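A whitelist lookup is one simple way to get the "no path traversal" guarantee, since the user-supplied name never becomes part of the path. This is a sketch: the function name is hypothetical, and the `server/` subdirectory for the profile follows the layout in ARCHITECTURE.md.

```python
from pathlib import Path

# Map each allowed download name to its location inside servers/{id}/.
ALLOWED_CONFIG_FILES = {
    "server.cfg": Path("server.cfg"),
    "basic.cfg": Path("basic.cfg"),
    "server.Arma3Profile": Path("server") / "server.Arma3Profile",
}


def resolve_config_file(server_dir: Path, filename: str) -> Path:
    """Return the on-disk path for a whitelisted name; reject everything else."""
    try:
        relative = ALLOWED_CONFIG_FILES[filename]
    except KeyError:
        raise ValueError("unsupported config file: %r" % filename) from None
    return server_dir / relative
```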
---
## Mission Endpoints
### GET /servers/{server_id}/missions
List all mission PBOs for a server.
**Response 200:**
```json
{
"success": true,
"data": [
{
"id": 1,
"filename": "MyMission.Altis.pbo",
"mission_name": "MyMission.Altis",
"terrain": "Altis",
"file_size": 102400,
"uploaded_at": "2026-04-16T09:00:00Z"
}
]
}
```
### POST /servers/{server_id}/missions/upload
Upload a mission PBO. Admin only. `multipart/form-data`.
**Form fields:**
- `file`: the `.pbo` file (filename is sanitized with `os.path.basename()` to prevent path traversal; only `.pbo` extension allowed)
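The filename handling could look like the sketch below. The function names are hypothetical, and deriving `terrain` from the last dot-separated part of the mission name is an assumption based on the `NAME.TERRAIN.pbo` examples in this document.

```python
import os
from typing import Optional


def sanitize_pbo_filename(raw_name: str) -> str:
    """Strip any directory part and accept only `.pbo` files."""
    # os.path.basename only splits on the host separator, so normalise backslashes first.
    name = os.path.basename(raw_name.replace("\\", "/"))
    if name.lower() == ".pbo" or not name.lower().endswith(".pbo"):
        raise ValueError("only .pbo files are accepted")
    return name


def describe_mission(filename: str) -> dict:
    """Derive mission_name and terrain from a sanitized NAME.TERRAIN.pbo filename."""
    mission_name = filename[: -len(".pbo")]
    terrain: Optional[str] = mission_name.rsplit(".", 1)[1] if "." in mission_name else None
    return {"filename": filename, "mission_name": mission_name, "terrain": terrain}
```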
**Response 201:**
```json
{
"success": true,
"data": {
"id": 2,
"filename": "NewMission.Stratis.pbo",
"mission_name": "NewMission.Stratis",
"terrain": "Stratis",
"file_size": 51200
}
}
```
### DELETE /servers/{server_id}/missions/{mission_id}
Delete a mission PBO (removes file from disk). Admin only.
### GET /servers/{server_id}/missions/rotation
Get current mission rotation (ordered list).
**Response 200:**
```json
{
"success": true,
"data": [
{
"id": 1,
"sort_order": 0,
"mission": { "id": 1, "mission_name": "MyMission.Altis" },
"difficulty": "Regular",
"params": { "RespawnDelay": 15 }
}
]
}
```
### PUT /servers/{server_id}/missions/rotation
Replace the entire mission rotation. Admin only.
```json
{
"rotation": [
{ "mission_id": 1, "difficulty": "Regular", "params": {} },
{ "mission_id": 2, "difficulty": "Veteran", "params": { "RespawnDelay": 30 } }
]
}
```
---
## Mod Endpoints
### GET /mods
List all registered mods (global list).
### POST /mods
Register a mod folder. Admin only.
```json
{
"name": "@CBA_A3",
"folder_path": "C:/Arma3Server/@CBA_A3",
"workshop_id": "450814997",
"description": "Community Base Addons"
}
```
### DELETE /mods/{mod_id}
Delete mod registration. Admin only.
### GET /servers/{server_id}/mods
Get mods enabled for a server.
**Response 200:**
```json
{
"success": true,
"data": [
{
"mod_id": 1,
"name": "@CBA_A3",
"folder_path": "C:/Arma3Server/@CBA_A3",
"is_server_mod": false,
"sort_order": 0
}
]
}
```
### PUT /servers/{server_id}/mods
Replace the mod list for a server. Admin only.
```json
{
"mods": [
{ "mod_id": 1, "is_server_mod": false, "sort_order": 0 },
{ "mod_id": 2, "is_server_mod": true, "sort_order": 1 }
]
}
```
---
## Player Endpoints
### GET /servers/{server_id}/players
Get currently connected players.
**Response 200:**
```json
{
"success": true,
"data": [
{
"player_num": 1,
"name": "PlayerOne",
"guid": "abc123...",
"ping": 45,
"verified": true,
"joined_at": "2026-04-16T10:15:00Z"
}
]
}
```
### POST /servers/{server_id}/players/{player_num}/kick
Kick a player via RCon. Admin only.
```json
{ "reason": "AFK" }
```
### POST /servers/{server_id}/players/{player_num}/ban
Ban a player via RCon. Admin only.
```json
{
"reason": "Griefing",
"duration_minutes": 0
}
```
### GET /servers/{server_id}/players/history
Player connection history. Supports pagination.
**Query params:** `?limit=50&offset=0&search=PlayerName`
---
## Ban Endpoints
### GET /servers/{server_id}/bans
List all bans for a server.
**Query params:** `?active_only=true&limit=50&offset=0`
### POST /servers/{server_id}/bans
Add ban manually. Admin only.
```json
{
"guid": "abc123...",
"steam_uid": "76561198...",
"name": "PlayerName",
"reason": "Cheating",
"duration_minutes": 0
}
```
### DELETE /servers/{server_id}/bans/{ban_id}
Remove a ban. Admin only.
---
## Log Endpoints
### GET /servers/{server_id}/logs
Query stored log lines.
**Query params:** `?limit=200&offset=0&level=error&since=2026-04-16T10:00:00Z&search=BattlEye`
**Response 200:**
```json
{
"success": true,
"data": {
"total": 1542,
"logs": [
{
"id": 100,
"timestamp": "2026-04-16T10:05:23Z",
"level": "info",
"message": "Player PlayerOne connected"
}
]
}
}
```
### DELETE /servers/{server_id}/logs
Clear all stored log lines for a server. Admin only.
---
## Metrics Endpoints
### GET /servers/{server_id}/metrics
Get time-series metrics.
**Query params:** `?from=2026-04-16T00:00:00Z&to=2026-04-16T23:59:59Z&resolution=5m`
**Response 200:**
```json
{
"success": true,
"data": [
{
"timestamp": "2026-04-16T10:00:00Z",
"cpu_percent": 34.2,
"ram_mb": 1850.5,
"player_count": 15
}
]
}
```
---
## RCon Endpoints
### POST /servers/{server_id}/rcon/command
Send raw RCon/admin command. Admin only.
```json
{ "command": "#restart" }
```
**Available commands:**
- `#restart` — Restart mission
- `#reassign` — Restart with roles unassigned
- `#missions` — Open mission selection
- `#lock` / `#unlock` — Lock/unlock server
- `#mission NAME.TERRAIN [difficulty]` — Load specific mission
- `#shutdown` — Shut down server
- `#monitor N` — Toggle performance monitoring
- `say -1 MESSAGE` — Message all players
### POST /servers/{server_id}/rcon/say
Broadcast a message to all players. Admin only.
```json
{ "message": "Server restarting in 5 minutes!" }
```
---
## Event Log Endpoints
### GET /servers/{server_id}/events
Get server event history (audit trail).
**Query params:** `?limit=50&offset=0&event_type=crashed`
---
## System Endpoints
### GET /system/status
Overall system status. **Requires authentication** (admin or viewer).
```json
{
"success": true,
"data": {
"version": "1.0.0",
"running_servers": 2,
"total_servers": 3,
"uptime_seconds": 86400
}
}
```
### GET /system/health
Health check (for load balancer/Docker). Returns 200 if healthy.
---
## WebSocket API
### Connection
```
ws://localhost:8000/ws/{server_id}?token=<JWT>
```
Use `server_id = "all"` to subscribe to events from all servers.
### Client → Server Messages
```json
{ "type": "ping" }
{ "type": "subscribe", "channels": ["log", "players", "metrics", "status"] }
{ "type": "unsubscribe", "channels": ["metrics"] }
```
**Channel subscription**: The `ConnectionManager` tracks per-connection channel subscriptions. Only messages matching subscribed channels are delivered. Default subscriptions on connect: `["status"]`.
**Channel names match message types exactly:** `status`, `log`, `players`, `metrics`, `event`. Subscribe with channel names matching the `type` field in server→client messages.
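The per-connection bookkeeping might be sketched as follows. The class name `SubscriptionRegistry` and the choice to silently ignore unknown channel names are assumptions; the channel set and the default subscription come from the text above.

```python
from typing import Dict, List, Set

CHANNELS = frozenset({"status", "log", "players", "metrics", "event"})
DEFAULT_CHANNELS = frozenset({"status"})


class SubscriptionRegistry:
    """Tracks which channels each connection wants (transport-agnostic sketch)."""

    def __init__(self) -> None:
        self._subs: Dict[str, Set[str]] = {}

    def connect(self, conn_id: str) -> None:
        self._subs[conn_id] = set(DEFAULT_CHANNELS)

    def disconnect(self, conn_id: str) -> None:
        self._subs.pop(conn_id, None)

    def channels(self, conn_id: str) -> Set[str]:
        return set(self._subs[conn_id])

    def handle(self, conn_id: str, message: dict) -> None:
        """Apply a client subscribe/unsubscribe message; unknown channels are dropped."""
        requested = set(message.get("channels", [])) & CHANNELS
        if message.get("type") == "subscribe":
            self._subs[conn_id] |= requested
        elif message.get("type") == "unsubscribe":
            self._subs[conn_id] -= requested

    def recipients(self, message: dict) -> List[str]:
        """Connections whose subscriptions include the message's `type`."""
        return sorted(c for c, chans in self._subs.items() if message["type"] in chans)
```

A real `ConnectionManager` would also filter by `server_id` and guard the dictionary with a lock if it is touched from more than one thread.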
### Server → Client Messages
#### Status Update
Sent when server status changes (starting → running → stopped, etc.)
```json
{
"type": "status",
"server_id": 1,
"data": {
"status": "running",
"pid": 12345,
"started_at": "2026-04-16T10:00:00Z"
}
}
```
#### Log Line
Sent for each new RPT log line.
```json
{
"type": "log",
"server_id": 1,
"data": {
"timestamp": "2026-04-16T10:05:23Z",
"level": "info",
"message": "BattlEye Server: Initialized (v1.240)"
}
}
```
#### Player List Update
Sent after each RCon poll (every 10s).
```json
{
"type": "players",
"server_id": 1,
"data": {
"players": [
{ "player_num": 1, "name": "PlayerOne", "ping": 45, "verified": true }
],
"count": 1
}
}
```
#### Metrics Update
Sent every 5 seconds.
```json
{
"type": "metrics",
"server_id": 1,
"data": {
"cpu_percent": 34.2,
"ram_mb": 1850.5,
"player_count": 1,
"timestamp": "2026-04-16T10:05:25Z"
}
}
```
#### Server Event
Sent for significant events (crash, restart, etc.)
```json
{
"type": "event",
"server_id": 1,
"data": {
"event_type": "crashed",
"detail": { "exit_code": 1 },
"timestamp": "2026-04-16T10:30:00Z"
}
}
```
#### Pong
```json
{ "type": "pong" }
```
---
## Rate Limiting
- `POST /auth/login`: 5 attempts per minute per IP. Exceeded returns `429 Too Many Requests`.
- All other endpoints: 60 requests per minute per token. Exceeded returns `429`.
- Implemented via FastAPI middleware (e.g., `slowapi`).
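To make the limits concrete, here is a sliding-window limiter sketch using only the standard library. It illustrates the policy rather than `slowapi`'s internals; the class name is hypothetical and the sketch is not thread-safe as written.

```python
import time
from collections import defaultdict, deque
from typing import Deque, Dict, Optional


class SlidingWindowLimiter:
    """Allow at most `max_requests` per key within any `window_seconds` span."""

    def __init__(self, max_requests: int, window_seconds: float = 60.0) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: Optional[float] = None) -> bool:
        """Record the request and return True, or return False if the caller should get 429."""
        now = time.monotonic() if now is None else now
        hits = self._hits[key]
        # Forget requests that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Under these assumptions, the login limit would be `SlidingWindowLimiter(5)` keyed by client IP and the general limit `SlidingWindowLimiter(60)` keyed by token.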
---
## Error Codes Reference
| Code | Description |
|------|-------------|
| `UNAUTHORIZED` | Missing or invalid token |
| `FORBIDDEN` | Role insufficient |
| `NOT_FOUND` | Resource not found |
| `SERVER_ALREADY_RUNNING` | Start called on running server |
| `SERVER_NOT_RUNNING` | Stop/command on stopped server |
| `RCON_UNAVAILABLE` | RCon connection failed |
| `INVALID_CONFIG` | Config validation failed |
| `EXE_NOT_FOUND` | arma3server.exe not at configured path |
| `PORT_IN_USE` | Game port already occupied |
| `UPLOAD_FAILED` | Mission file upload error |
| `VALIDATION_ERROR` | Pydantic validation failure |
| `INTERNAL_ERROR` | Unexpected server error |
| `MOD_IN_USE` | Cannot delete mod — enabled on one or more servers |
| `MISSION_IN_ROTATION` | Cannot delete mission — in active rotation |
| `RATE_LIMITED` | Too many requests |

309
ARCHITECTURE.md Normal file

@@ -0,0 +1,309 @@
# Languard Server Manager — System Architecture
## Overview
Languard is a web-based management panel for Arma 3 dedicated servers. It provides a Python backend that manages one or more `arma3server_x64.exe` processes, exposes a REST + WebSocket API to a React frontend, and persists all state in SQLite.
---
## Technology Stack
| Layer | Technology | Rationale |
|-------|-----------|-----------|
| Backend framework | **FastAPI** (Python 3.11+) | Async-native, built-in WebSocket, OpenAPI docs auto-generated |
| Database | **SQLite** via `SQLAlchemy` (sync) | Zero-config, file-based, sufficient for single-host server manager; all access is synchronous (WAL mode for concurrent reads) |
| Process management | `subprocess` + `threading` | Wrap arma3server.exe, watch stdout/stderr, check exit codes; **cwd** set to server instance dir for relative paths; on Windows `terminate()` is a hard kill (no SIGTERM) |
| Real-time comms | **WebSocket** (FastAPI) | Push log lines, player lists, server status to React |
| RCon client | Custom UDP client | BattlEye RCon protocol for in-game admin commands |
| Config generation | Python structured builder | Generate server.cfg, basic.cfg, server.Arma3Profile with proper escaping (no f-string injection) |
| Scheduling | `APScheduler` (BackgroundScheduler) | Auto-restart, mission rotation timers, log/metrics cleanup (sync DB ops → BackgroundScheduler, not AsyncIOScheduler) |
| Auth | **JWT** (python-jose) + bcrypt | Secure the API; React stores token in localStorage |
| Frontend | React + TypeScript (external repo) | Connects to this backend's API |
---
## High-Level Architecture
```
┌─────────────────────────────────────────────────────────────┐
│ React Frontend │
│ Server List │ Server Detail │ Logs │ Players │ Config UI │
└────────────────────────┬────────────────────────────────────┘
│ HTTP REST + WebSocket
┌─────────────────────────────────────────────────────────────┐
│ FastAPI Application │
│ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │ Auth Router │ │ Server Router│ │ Config Router │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
│ ┌──────────────┐ ┌──────────────┐ ┌──────────────────┐ │
│ │Mission Router│ │ Mod Router │ │ WS Router │ │
│ └──────────────┘ └──────────────┘ └──────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Service Layer │ │
│ │ ServerService │ ConfigService │ RConService │ │
│ │ LogService │ MetricsService│ MissionService │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Thread Pool │ │
│ │ ProcessMonitorThread (per server) │ │
│ │ LogTailThread (per server) │ │
│ │ MetricsCollectorThread (per server) │ │
│ │ RConPollerThread (per server) │ │
│ │ BroadcastThread (global) │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ Data Access Layer (DAL) │ │
│ │ ServerRepository │ PlayerRepository │ │
│ │ LogRepository │ MetricsRepository │ │
│ └─────────────────────────────────────────────────────┘ │
│ │
│ ┌───────────────────┐ ┌────────────────────────────────┐ │
│ │ SQLite (DB) │ │ Filesystem │ │
│ │ languard.db │ │ servers/{id}/server.cfg │ │
│ │ │ │ servers/{id}/basic.cfg │ │
│ │ │ │ servers/{id}/server/ │ │ ← profile dir (Arma3 -name=server)
│ │ │ │ server.Arma3Profile │ │ ← profile settings
│ │ │ │ arma3server_*.rpt │ │ ← RPT logs (tailable)
│ │ │ │ servers/{id}/battleye/ │ │
│ │ │ │ beserver.cfg │ │ ← RCon config
│ │ │ │ servers/{id}/mpmissions/ │ │
│ └───────────────────┘ └────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
│ subprocess
┌─────────────────────────────────────────────────────────────┐
│ Arma 3 Server Processes (OS level) │
│ arma3server_x64.exe (port 2302) │
│ arma3server_x64.exe (port 2402) │
│ ... │
└─────────────────────────────────────────────────────────────┘
```
---
## Component Responsibilities
### FastAPI Routers
- Validate input (Pydantic models)
- Call service layer
- Return JSON responses
- Handle WebSocket connections
### Service Layer
- Orchestrate operations (start server = generate config + launch process + start threads)
- No direct DB access — delegates to repositories
- No direct process access — delegates to ProcessManager
### ProcessManager
- Singleton that owns all subprocess handles
- Thread-safe dict: `{server_id: subprocess.Popen}`
- `start()` sets `cwd=servers/{server_id}/` so relative config paths resolve correctly
- On Windows: `terminate()` = `TerminateProcess` (hard kill), no graceful SIGTERM — graceful shutdown must go through RCon `#shutdown` first
- Provides: `start()`, `stop()`, `restart()`, `is_running()`, `send_command()`
### Thread Pool (per running server)
| Thread | Interval | Purpose |
|--------|----------|---------|
| `ProcessMonitorThread` | 1s | Detect crash / unexpected exit; update DB status; trigger auto-restart |
| `LogTailThread` | 100ms | Read new lines from .rpt file; store in DB; push to WS clients |
| `MetricsCollectorThread` | 5s | Collect CPU%, RAM MB for the process via psutil; write to DB |
| `RConPollerThread` | 10s | Query connected players via BattlEye RCon; update DB player table |
| `BroadcastThread` | event-driven | Consume from internal queue; push JSON to all subscribed WS clients |
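The interval-driven threads in the table share one shape: do the work, then sleep until the next tick or until asked to stop. A minimal sketch of that shape, with a hypothetical `PeriodicThread` base class:

```python
import threading
from typing import Callable, Optional


class PeriodicThread(threading.Thread):
    """Runs `work()` every `interval` seconds until `stop()` is called."""

    def __init__(self, interval: float, work: Callable[[], None]) -> None:
        super().__init__(daemon=True)
        self._interval = interval
        self._work = work
        # Not named `_stop`: threading.Thread uses that name internally.
        self._stop_event = threading.Event()

    def run(self) -> None:
        while not self._stop_event.is_set():
            self._work()
            # Event.wait doubles as an interruptible sleep.
            self._stop_event.wait(self._interval)

    def stop(self, timeout: Optional[float] = None) -> None:
        self._stop_event.set()
        self.join(timeout)
```

Waiting on the event instead of calling `time.sleep()` lets a 10s poller exit promptly when the server is stopped.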
### RCon Client
- UDP socket to BattlEye RCon port (configured in `beserver.cfg` inside the server's `battleye/` directory)
- Implements BE RCon protocol: login, keepalive, send command, parse response
- **Request multiplexer**: tracks pending requests by sequence byte, routes responses to the correct caller via `threading.Event` per request. Prevents response misrouting when RConPollerThread and API-request RConService calls share the same UDP socket.
- Used by: `RConPollerThread`, `RConService` (for admin commands from UI)
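The request multiplexer described above can be sketched independently of the UDP socket. Class and method names are hypothetical; the one-byte sequence number and the `threading.Event` per request come from the description.

```python
import threading
from typing import Dict, Optional


class PendingRequest:
    def __init__(self) -> None:
        self.done = threading.Event()
        self.response: Optional[bytes] = None


class RequestMultiplexer:
    """Routes responses to callers by sequence byte (0-255)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: Dict[int, PendingRequest] = {}
        self._next_seq = 0

    def register(self) -> int:
        """Reserve a free sequence number for an outgoing command."""
        with self._lock:
            for _ in range(256):
                seq = self._next_seq
                self._next_seq = (self._next_seq + 1) % 256
                if seq not in self._pending:
                    self._pending[seq] = PendingRequest()
                    return seq
        raise RuntimeError("all 256 sequence numbers are in flight")

    def resolve(self, seq: int, payload: bytes) -> bool:
        """Called by the socket reader; returns False for unknown or timed-out requests."""
        with self._lock:
            request = self._pending.get(seq)
        if request is None:
            return False
        request.response = payload
        request.done.set()
        return True

    def wait(self, seq: int, timeout: float = 5.0) -> Optional[bytes]:
        """Block the caller until its response arrives; None on timeout."""
        with self._lock:
            request = self._pending[seq]
        try:
            return request.response if request.done.wait(timeout) else None
        finally:
            with self._lock:
                self._pending.pop(seq, None)
```

In this sketch a caller would `register()`, send the packet carrying that sequence byte, then `wait()`, while a single reader thread calls `resolve()` for every incoming datagram.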
### Config Generator
- Takes `ServerConfig` Pydantic model from DB
- Renders `server.cfg`, `basic.cfg`, `*.Arma3Profile` using a **structured builder** (NOT f-strings — prevents config injection)
- Escapes double quotes and newlines in all user-supplied string values
- Writes files to `servers/{server_id}/` directory
- `server.Arma3Profile` written to `servers/{server_id}/server/` (Arma 3 reads from the `-name` subdirectory)
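A structured builder might look like the sketch below. The function names are hypothetical, and two details are assumptions rather than statements from this document: embedded double quotes are escaped by doubling them, and line breaks in values are replaced with spaces.

```python
from typing import List, Mapping, Sequence, Union

Value = Union[str, int, float, bool, Sequence[str]]


def _quote(text: str) -> str:
    """Render a string literal that cannot break out of its quotes or its line."""
    cleaned = text.replace("\r", " ").replace("\n", " ")
    return '"' + cleaned.replace('"', '""') + '"'


def _scalar(value: Union[str, int, float, bool]) -> str:
    if isinstance(value, bool):  # check bool before int: bool is an int subclass
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return str(value)
    if isinstance(value, str):
        return _quote(value)
    raise TypeError("unsupported config value: %r" % (value,))


def render_cfg(settings: Mapping[str, Value]) -> str:
    """Emit `key = value;` lines, and `key[] = {...};` for lists of strings."""
    lines: List[str] = []
    for key, value in settings.items():
        if not key.isidentifier():
            raise ValueError("invalid config key: %r" % key)
        if isinstance(value, (list, tuple)):
            items = ", ".join(_quote(item) for item in value)
            lines.append(key + "[] = {" + items + "};")
        else:
            lines.append(key + " = " + _scalar(value) + ";")
    return "\n".join(lines) + "\n"
```

Because every string passes through `_quote`, a hostname containing `";` followed by a newline stays inside one quoted value instead of starting a new directive.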
### SQLite DAL
- Sync reads/writes using SQLAlchemy Core (not ORM — simpler for this use case)
- Thread-safe via SQLAlchemy's connection pooling
- One `languard.db` file at project root
- **PRAGMA busy_timeout=5000** — prevents "database is locked" errors under concurrent thread writes
- Thread-local connections via `get_thread_db()` — one connection per background thread
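The thread-local pattern behind `get_thread_db()` can be sketched with the standard library. This sketch uses `sqlite3` directly as a stand-in for the SQLAlchemy engine the design calls for, and the `db_path` parameter is an assumption.

```python
import sqlite3
import threading

_local = threading.local()


def get_thread_db(db_path: str) -> sqlite3.Connection:
    """Return this thread's connection, creating and configuring it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = sqlite3.connect(db_path)
        conn.execute("PRAGMA journal_mode=WAL")    # readers do not block the writer
        conn.execute("PRAGMA foreign_keys=ON")     # off by default, set per connection
        conn.execute("PRAGMA busy_timeout=5000")   # wait up to 5s instead of failing on a lock
        _local.conn = conn
    return conn
```

`foreign_keys` and `busy_timeout` are per-connection settings, which is why the sketch applies them whenever a new connection is created rather than once at startup.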
---
## Data Flow: Start Server
```
Frontend → POST /api/servers/{id}/start
→ ServerService.start(server_id)
├── Load ServerConfig from DB
├── ConfigGenerator.write_configs(server_id, config)
│ ├── server.cfg → servers/{id}/server.cfg
│ ├── basic.cfg → servers/{id}/basic.cfg
│ ├── server.Arma3Profile → servers/{id}/server/server.Arma3Profile
│ └── beserver.cfg → servers/{id}/battleye/beserver.cfg
├── ProcessManager.start(server_id, exe_path, args, cwd=servers/{id}/)
├── DB: update server.status = "starting"
├── Spawn ProcessMonitorThread(server_id)
├── Spawn LogTailThread(server_id) — tails servers/{id}/server/arma3server_*.rpt
├── Spawn MetricsCollectorThread(server_id)
├── Spawn RConPollerThread(server_id) [after 30s delay for server startup]
└── BroadcastThread pushes status update to WS clients
```
## Data Flow: Real-time Logs
```
arma3server.exe writes servers/{id}/server/arma3server_*.rpt
→ LogTailThread reads new lines (recursive glob for *.rpt in profile dir)
→ LogRepository.insert(server_id, line, timestamp)
→ BroadcastQueue.put({type: "log", server_id, line, timestamp})
→ BroadcastThread sends to all WS subscribers for this server
→ React frontend appends to log viewer
```
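The "reads new lines" step can be sketched as an offset-based read that returns only complete lines. The function name is hypothetical, and decoding as UTF-8 with replacement is an assumption about the RPT encoding.

```python
from pathlib import Path
from typing import List, Tuple


def read_new_lines(path: Path, offset: int) -> Tuple[List[str], int]:
    """Return complete lines written after `offset`, plus the offset to resume from."""
    with path.open("rb") as handle:
        handle.seek(offset)
        chunk = handle.read()
    # Stop at the last newline so a half-written line is picked up on the next poll.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8", errors="replace")
    lines = [line.rstrip("\r") for line in text.split("\n")[:-1]]
    return lines, offset + end
```

A tailing thread would keep the returned offset between polls and reset it to zero when it switches to a newer `.rpt` file.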
## Data Flow: Player List
```
RConPollerThread (every 10s)
→ RConClient.send("players")
→ Parse response: [{id, name, guid, ping, verified}]
→ PlayerRepository.upsert_all(server_id, players)
→ BroadcastQueue.put({type: "players", server_id, players})
→ React frontend updates player list
```
---
## Security Model
- All API routes (except `POST /api/auth/login`) require a valid **JWT Bearer token**
- JWT contains: `user_id`, `username`, `role` (`admin` | `viewer`)
- `viewer` role: read-only (GET endpoints, WebSocket)
- `admin` role: all operations
- CORS configured to accept only the frontend origin
- Passwords hashed with **bcrypt** (cost factor 12)
- `serverCommandPassword` and `passwordAdmin` stored encrypted in SQLite (Fernet from the `cryptography` library, which uses AES-128-CBC with HMAC-SHA256; key from `LANGUARD_ENCRYPTION_KEY`)
- **Port conflict validation** at server creation and start: checks game_port through game_port+4 (game, Steam query, Steam master, VON reserved, BattlEye/RCon default) against all existing servers
- **ban.txt sync**: bans table is source of truth for UI; on ban add/delete via API, also write to `battleye/ban.txt`; on startup, read `ban.txt` and upsert into DB. Without this sync, DB-only bans are not enforced by BattlEye.
- Generated config files containing passwords (server.cfg, beserver.cfg) have restrictive file permissions (0600 on Unix, restricted ACL on Windows)
- Input sanitization on all string fields before config generation — no shell injection or config directive injection
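The port conflict check in the list above can be sketched as set arithmetic. Function names are hypothetical, and including each server's configured `rcon_port` alongside the game_port..game_port+4 range is an assumption, since the RCon port is user-configurable.

```python
from typing import Iterable, Set, Tuple

PORTS_PER_SERVER = 5  # game_port through game_port + 4


def reserved_ports(game_port: int, rcon_port: int) -> Set[int]:
    """All ports one server instance is treated as occupying."""
    return set(range(game_port, game_port + PORTS_PER_SERVER)) | {rcon_port}


def find_port_conflicts(
    candidate: Tuple[int, int], existing: Iterable[Tuple[int, int]]
) -> Set[int]:
    """Return the ports of `candidate` (game_port, rcon_port) already claimed by other servers."""
    wanted = reserved_ports(*candidate)
    taken: Set[int] = set()
    for game_port, rcon_port in existing:
        taken |= reserved_ports(game_port, rcon_port)
    return wanted & taken
```

A non-empty result would map to the `PORT_IN_USE` error code; spacing servers 100 ports apart (2302, 2402, ...) keeps the ranges disjoint.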
---
## Configuration (Environment Variables)
```env
LANGUARD_SECRET_KEY=<jwt-signing-secret>
LANGUARD_ENCRYPTION_KEY=<Fernet-base64-key — generate with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())">
LANGUARD_DB_PATH=./languard.db
LANGUARD_SERVERS_DIR=./servers
LANGUARD_ARMA_EXE=C:/Arma3Server/arma3server_x64.exe
LANGUARD_HOST=0.0.0.0
LANGUARD_PORT=8000
LANGUARD_CORS_ORIGINS=http://localhost:5173,http://localhost:3000
LANGUARD_LOG_RETENTION_DAYS=7
```
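A settings loader for these variables might look like the following sketch. The `Settings` class and `load_settings` function are hypothetical, and using the example values above as defaults is an assumption; the two secrets deliberately have no default.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    secret_key: str
    encryption_key: str
    db_path: str
    servers_dir: str
    arma_exe: str
    host: str
    port: int
    cors_origins: List[str]
    log_retention_days: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read LANGUARD_* variables; raises KeyError if a secret is missing."""
    return Settings(
        secret_key=env["LANGUARD_SECRET_KEY"],
        encryption_key=env["LANGUARD_ENCRYPTION_KEY"],
        db_path=env.get("LANGUARD_DB_PATH", "./languard.db"),
        servers_dir=env.get("LANGUARD_SERVERS_DIR", "./servers"),
        arma_exe=env.get("LANGUARD_ARMA_EXE", "C:/Arma3Server/arma3server_x64.exe"),
        host=env.get("LANGUARD_HOST", "0.0.0.0"),
        port=int(env.get("LANGUARD_PORT", "8000")),
        cors_origins=[
            origin.strip()
            for origin in env.get("LANGUARD_CORS_ORIGINS", "").split(",")
            if origin.strip()
        ],
        log_retention_days=int(env.get("LANGUARD_LOG_RETENTION_DAYS", "7")),
    )
```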
---
## Directory Layout
```
languard-server-manager/
├── backend/
│ ├── main.py # FastAPI app factory
│ ├── config.py # Settings from env
│ ├── database.py # SQLAlchemy engine + session
│ ├── auth/
│ │ ├── router.py
│ │ ├── service.py
│ │ └── schemas.py
│ ├── servers/
│ │ ├── router.py # REST endpoints for servers
│ │ ├── service.py # ServerService
│ │ ├── process_manager.py # ProcessManager singleton
│ │ ├── config_generator.py # server.cfg / basic.cfg / beserver.cfg writer
│ │ └── schemas.py # Pydantic schemas
│ ├── rcon/
│ │ ├── client.py # BattlEye RCon UDP client
│ │ └── service.py # RConService
│ ├── players/
│ │ ├── router.py
│ │ ├── service.py
│ │ └── schemas.py
│ ├── missions/
│ │ ├── router.py
│ │ └── service.py
│ ├── mods/
│ │ ├── router.py
│ │ └── service.py
│ ├── logs/
│ │ ├── router.py
│ │ └── service.py
│ ├── metrics/
│ │ ├── router.py
│ │ └── service.py
│ ├── websocket/
│ │ ├── router.py # WS connection handler
│ │ ├── manager.py # ConnectionManager (per-server subscriptions)
│ │ └── broadcaster.py # BroadcastThread + queue
│ ├── threads/
│ │ ├── process_monitor.py # ProcessMonitorThread
│ │ ├── log_tail.py # LogTailThread
│ │ ├── metrics_collector.py # MetricsCollectorThread
│ │ └── rcon_poller.py # RConPollerThread
│ ├── system/
│ │ └── router.py # GET /system/status, GET /system/health
│ ├── dal/
│ │ ├── server_repository.py
│ │ ├── config_repository.py
│ │ ├── player_repository.py
│ │ ├── log_repository.py
│ │ ├── metrics_repository.py
│ │ ├── mission_repository.py
│ │ ├── mod_repository.py
│ │ ├── ban_repository.py
│ │ └── event_repository.py
│ └── migrations/
│ └── 001_initial_schema.sql
├── servers/ # Runtime data per server instance
│ └── {server_id}/
│ ├── server.cfg
│ ├── basic.cfg
│ ├── server/ # Arma 3 profile dir (matches -name=server)
│ │ ├── server.Arma3Profile
│ │ └── arma3server_*.rpt # Timestamped RPT logs
│ ├── battleye/
│ │ └── beserver.cfg # BattlEye RCon config (generated on start)
│ └── mpmissions/
├── frontend/ # React app (separate repo or subfolder)
├── requirements.txt
├── .env.example
├── ARCHITECTURE.md
├── DATABASE.md
├── API.md
├── MODULES.md
├── THREADING.md
└── IMPLEMENTATION_PLAN.md
```
---
## Key Design Decisions
| Decision | Choice | Reason |
|----------|--------|--------|
| Sync vs async DB | **Sync SQLAlchemy only** | All DB access is synchronous; background threads are non-async; `get_thread_db()` provides thread-local connections; no aiosqlite dependency |
| ORM vs Core | **SQLAlchemy Core** | Simpler SQL control, less magic for embedded use case |
| WebSocket auth | JWT in query param on connect | Browser WS API doesn't support headers; query param `?token=...` |
| Process ownership | **ProcessManager singleton** | Single source of truth; prevents duplicate launches |
| Log storage | **DB + rolling file** | DB for fast queries/streaming; raw .rpt preserved on disk |
| Config files | **Regenerate on each start** | Always fresh from DB; no sync drift between DB and filesystem; **structured builder** (not f-strings) prevents config injection |
| RCon port convention | **User-configurable** | BattlEye RCon port is set in `beserver.cfg` (inside `battleye/` dir). Default suggestion: game port + 4 (e.g., 2302 → 2306). Must not conflict with the game (2302), Steam query (2303), Steam master (2304), or reserved VON (2305) ports. **Note:** RCon config changes require a server restart, because BattlEye reads beserver.cfg only at startup. |

586
DATABASE.md Normal file

@@ -0,0 +1,586 @@
# Languard Server Manager — Database Design
## Engine
- **SQLite** via `SQLAlchemy Core` (sync for all access — routes and threads)
- File: `languard.db` at project root (configurable via `LANGUARD_DB_PATH`)
- WAL mode enabled: `PRAGMA journal_mode=WAL` — allows concurrent reads during writes
- Foreign keys enabled: `PRAGMA foreign_keys=ON`
- Busy timeout: `PRAGMA busy_timeout=5000` — prevents "database is locked" errors under concurrent thread writes
---
## Schema
### Table: `users`
Stores web UI admin accounts.
```sql
CREATE TABLE users (
id INTEGER PRIMARY KEY AUTOINCREMENT,
username TEXT NOT NULL UNIQUE,
password_hash TEXT NOT NULL, -- bcrypt hash
role TEXT NOT NULL DEFAULT 'viewer' -- 'admin' | 'viewer'
CHECK (role IN ('admin', 'viewer')), -- column constraint: SQLite rejects table constraints placed before later columns
created_at TEXT NOT NULL DEFAULT (datetime('now')),
last_login TEXT
);
```
---
### Table: `servers`
One row per managed Arma 3 server instance.
```sql
CREATE TABLE servers (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL, -- display name in UI
description TEXT,
-- CHECKs below are column constraints: SQLite rejects table constraints placed before later column definitions
status TEXT NOT NULL DEFAULT 'stopped'
CHECK (status IN ('stopped', 'starting', 'running', 'stopping', 'crashed', 'error')),
-- Process info
pid INTEGER, -- OS process ID when running
exe_path TEXT NOT NULL, -- path to arma3server_x64.exe
started_at TEXT, -- ISO datetime
stopped_at TEXT,
-- Network
game_port INTEGER NOT NULL DEFAULT 2302
CHECK (game_port BETWEEN 1024 AND 65535),
rcon_port INTEGER NOT NULL DEFAULT 2306 -- user-configurable; written to battleye/beserver.cfg
CHECK (rcon_port BETWEEN 1024 AND 65535),
steam_query_port INTEGER GENERATED ALWAYS AS (game_port + 1) VIRTUAL, -- convention, not enforced by engine
-- Auto-management
auto_restart INTEGER NOT NULL DEFAULT 0, -- 1 = restart on crash
max_restarts INTEGER NOT NULL DEFAULT 3, -- within restart_window_seconds
restart_window_seconds INTEGER NOT NULL DEFAULT 300,
restart_count INTEGER NOT NULL DEFAULT 0,
last_restart_at TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX idx_servers_status ON servers(status);
CREATE INDEX idx_servers_game_port ON servers(game_port);
CREATE INDEX idx_servers_rcon_port ON servers(rcon_port);
```
---
### Table: `server_configs`
Stores all parameters for generating `server.cfg`. One row per server.
```sql
CREATE TABLE server_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE,
-- Basic identity
hostname TEXT NOT NULL DEFAULT 'My Arma 3 Server',
password TEXT, -- join password (encrypted at app layer via Fernet)
password_admin TEXT NOT NULL, -- encrypted (no default — must be set on creation)
server_command_password TEXT, -- encrypted
-- Players
max_players INTEGER NOT NULL DEFAULT 40,
kick_duplicate INTEGER NOT NULL DEFAULT 1,
persistent INTEGER NOT NULL DEFAULT 1,
-- Voting
vote_threshold REAL NOT NULL DEFAULT 0.33,
vote_mission_players INTEGER NOT NULL DEFAULT 1,
vote_timeout INTEGER NOT NULL DEFAULT 60, -- seconds
role_timeout INTEGER NOT NULL DEFAULT 90,
briefing_timeout INTEGER NOT NULL DEFAULT 60,
debriefing_timeout INTEGER NOT NULL DEFAULT 45,
lobby_idle_timeout INTEGER NOT NULL DEFAULT 300,
-- Voice
disable_von INTEGER NOT NULL DEFAULT 0,
von_codec INTEGER NOT NULL DEFAULT 1 CHECK (von_codec IN (0, 1)), -- 1 = OPUS
von_codec_quality INTEGER NOT NULL DEFAULT 20, -- 1-30
-- Network quality kick thresholds
max_ping INTEGER NOT NULL DEFAULT 250,
max_packet_loss INTEGER NOT NULL DEFAULT 50,
max_desync INTEGER NOT NULL DEFAULT 200,
disconnect_timeout INTEGER NOT NULL DEFAULT 15,
kick_on_ping INTEGER NOT NULL DEFAULT 1,
kick_on_packet_loss INTEGER NOT NULL DEFAULT 1,
kick_on_desync INTEGER NOT NULL DEFAULT 1,
kick_on_timeout INTEGER NOT NULL DEFAULT 1,
-- Security
battleye INTEGER NOT NULL DEFAULT 1,
verify_signatures INTEGER NOT NULL DEFAULT 2, -- 0 | 1 | 2 (1 = check but don't kick)
allowed_file_patching INTEGER NOT NULL DEFAULT 0, -- 0 | 1 | 2
-- Difficulty
forced_difficulty TEXT NOT NULL DEFAULT 'Regular', -- Recruit | Regular | Veteran | Custom
-- Misc
timestamp_format TEXT NOT NULL DEFAULT 'short', -- none | short | full
auto_select_mission INTEGER NOT NULL DEFAULT 0,
random_mission_order INTEGER NOT NULL DEFAULT 0,
missions_to_restart INTEGER NOT NULL DEFAULT 0,
missions_to_shutdown INTEGER NOT NULL DEFAULT 0,
log_file TEXT NOT NULL DEFAULT 'server_console.log',
skip_lobby INTEGER NOT NULL DEFAULT 0,
drawing_in_map INTEGER NOT NULL DEFAULT 1,
upnp INTEGER NOT NULL DEFAULT 0,
loopback INTEGER NOT NULL DEFAULT 0,
statistics_enabled INTEGER NOT NULL DEFAULT 1,
force_rotor_lib INTEGER NOT NULL DEFAULT 0 CHECK (force_rotor_lib IN (0, 1, 2)), -- 0=player, 1=AFM, 2=SFM
required_build INTEGER NOT NULL DEFAULT 0,
steam_protocol_max_data_size INTEGER NOT NULL DEFAULT 1024,
-- MOTD
motd_lines TEXT NOT NULL DEFAULT '[]', -- JSON array of strings
motd_interval REAL NOT NULL DEFAULT 5.0,
-- Event scripts
on_user_connected TEXT DEFAULT '',
on_user_disconnected TEXT DEFAULT '',
on_unsigned_data TEXT DEFAULT 'kick (_this select 0)',
on_hacked_data TEXT DEFAULT 'kick (_this select 0)',
double_id_detected TEXT DEFAULT '',
-- Headless clients (JSON arrays)
headless_clients TEXT NOT NULL DEFAULT '[]', -- e.g. '["127.0.0.1"]'
local_clients TEXT NOT NULL DEFAULT '[]',
-- Admin UIDs whitelist
admin_uids TEXT NOT NULL DEFAULT '[]', -- JSON array of Steam UIDs
-- File extension whitelists (JSON arrays)
allowed_load_extensions TEXT NOT NULL DEFAULT '["hpp","sqs","sqf","fsm","cpp","paa","txt","xml","inc","ext","sqm","ods","fxy","lip","csv","kb","bik","bikb","html","htm","biedi"]',
allowed_preprocess_extensions TEXT NOT NULL DEFAULT '["hpp","sqs","sqf","fsm","cpp","paa","txt","xml","inc","ext","sqm","ods","fxy","lip","csv","kb","bik","bikb","html","htm","biedi"]',
allowed_html_extensions TEXT NOT NULL DEFAULT '["htm","html","xml","txt"]',
updated_at TEXT NOT NULL DEFAULT (datetime('now')),
CHECK (verify_signatures IN (0, 1, 2)),
CHECK (allowed_file_patching IN (0, 1, 2)),
CHECK (von_codec_quality BETWEEN 1 AND 30),
CHECK (forced_difficulty IN ('Recruit', 'Regular', 'Veteran', 'Custom')),
CHECK (vote_threshold >= 0.0 AND vote_threshold <= 1.0),
CHECK (max_players > 0)
);
```
---
### Table: `basic_configs`
Stores `basic.cfg` (bandwidth) settings. One row per server.
```sql
CREATE TABLE basic_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE,
min_bandwidth INTEGER NOT NULL DEFAULT 800000,
max_bandwidth INTEGER NOT NULL DEFAULT 25000000,
max_msg_send INTEGER NOT NULL DEFAULT 384, -- engine default is 128; higher values increase desync risk
max_size_guaranteed INTEGER NOT NULL DEFAULT 512,
max_size_non_guaranteed INTEGER NOT NULL DEFAULT 256,
min_error_to_send REAL NOT NULL DEFAULT 0.003,
max_custom_file_size INTEGER NOT NULL DEFAULT 100000,
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```
---
### Table: `server_profiles`
Stores `server.Arma3Profile` difficulty settings. One row per server.
```sql
CREATE TABLE server_profiles (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE,
-- Custom difficulty options (all 0/1 or 0/1/2)
reduced_damage INTEGER NOT NULL DEFAULT 0,
group_indicators INTEGER NOT NULL DEFAULT 0,
friendly_tags INTEGER NOT NULL DEFAULT 0,
enemy_tags INTEGER NOT NULL DEFAULT 0,
detected_mines INTEGER NOT NULL DEFAULT 0,
commands INTEGER NOT NULL DEFAULT 1,
waypoints INTEGER NOT NULL DEFAULT 1,
tactical_ping INTEGER NOT NULL DEFAULT 0,
weapon_info INTEGER NOT NULL DEFAULT 2,
stance_indicator INTEGER NOT NULL DEFAULT 2,
stamina_bar INTEGER NOT NULL DEFAULT 0,
weapon_crosshair INTEGER NOT NULL DEFAULT 0,
vision_aid INTEGER NOT NULL DEFAULT 0,
third_person_view INTEGER NOT NULL DEFAULT 0,
camera_shake INTEGER NOT NULL DEFAULT 1,
score_table INTEGER NOT NULL DEFAULT 1,
death_messages INTEGER NOT NULL DEFAULT 1,
von_id INTEGER NOT NULL DEFAULT 1,
map_content_friendly INTEGER NOT NULL DEFAULT 0,
map_content_enemy INTEGER NOT NULL DEFAULT 0,
map_content_mines INTEGER NOT NULL DEFAULT 0,
auto_report INTEGER NOT NULL DEFAULT 0,
multiple_saves INTEGER NOT NULL DEFAULT 0,
-- AI level
ai_level_preset INTEGER NOT NULL DEFAULT 3, -- 0=Low,1=Normal,2=High,3=Custom
skill_ai REAL NOT NULL DEFAULT 0.5,
precision_ai REAL NOT NULL DEFAULT 0.5,
updated_at TEXT NOT NULL DEFAULT (datetime('now')),
CHECK (ai_level_preset BETWEEN 0 AND 3),
CHECK (skill_ai BETWEEN 0.0 AND 1.0),
CHECK (precision_ai BETWEEN 0.0 AND 1.0),
CHECK (group_indicators BETWEEN 0 AND 2),
CHECK (weapon_info BETWEEN 0 AND 2),
CHECK (stance_indicator BETWEEN 0 AND 2)
);
```
---
### Table: `launch_params`
Extra command-line parameters added to the server launch command. One row per server.
```sql
CREATE TABLE launch_params (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE,
world TEXT NOT NULL DEFAULT 'empty',
extra_params TEXT NOT NULL DEFAULT '', -- raw extra params string
limit_fps INTEGER NOT NULL DEFAULT 50,
auto_init INTEGER NOT NULL DEFAULT 0,
load_mission_to_memory INTEGER NOT NULL DEFAULT 0,
bandwidth_alg INTEGER CHECK (bandwidth_alg IS NULL OR bandwidth_alg = 2), -- NULL | 2
enable_ht INTEGER NOT NULL DEFAULT 0,
huge_pages INTEGER NOT NULL DEFAULT 0,
cpu_count INTEGER, -- NULL = auto
ex_threads INTEGER NOT NULL DEFAULT 7,
max_mem INTEGER, -- NULL = auto
no_logs INTEGER NOT NULL DEFAULT 0,
netlog INTEGER NOT NULL DEFAULT 0,
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```
---
### Table: `mods`
Registered mods. Many-to-many with servers.
```sql
CREATE TABLE mods (
id INTEGER PRIMARY KEY AUTOINCREMENT,
name TEXT NOT NULL,
folder_path TEXT NOT NULL UNIQUE, -- absolute or relative path
workshop_id TEXT, -- Steam Workshop ID if applicable
description TEXT,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE TABLE server_mods (
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
mod_id INTEGER NOT NULL REFERENCES mods(id) ON DELETE CASCADE,
is_server_mod INTEGER NOT NULL DEFAULT 0, -- 1 = -serverMod (not broadcast to clients)
sort_order INTEGER NOT NULL DEFAULT 0,
PRIMARY KEY (server_id, mod_id)
);
CREATE INDEX idx_server_mods_server ON server_mods(server_id);
```
---
### Table: `missions`
Mission PBO files tracked per server.
```sql
CREATE TABLE missions (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
filename TEXT NOT NULL, -- e.g. "MyMission.Altis.pbo"
mission_name TEXT NOT NULL, -- e.g. "MyMission.Altis"
terrain TEXT NOT NULL, -- e.g. "Altis"
file_size INTEGER, -- bytes
uploaded_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE (server_id, filename)
);
CREATE INDEX idx_missions_server ON missions(server_id);
```
---
### Table: `mission_rotation`
Ordered mission cycle for a server.
```sql
CREATE TABLE mission_rotation (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
mission_id INTEGER NOT NULL REFERENCES missions(id) ON DELETE CASCADE,
sort_order INTEGER NOT NULL DEFAULT 0,
difficulty TEXT NOT NULL DEFAULT 'Regular' CHECK (difficulty IN ('Recruit', 'Regular', 'Veteran', 'Custom')),
params_json TEXT NOT NULL DEFAULT '{}', -- mission params override as JSON
UNIQUE (server_id, sort_order)
);
CREATE INDEX idx_mission_rotation_server ON mission_rotation(server_id);
```
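
The rotation rows end up in `server.cfg` as a `class Missions {}` block. The sketch below shows one way to render them. `render_missions_block` and `quoted` are hypothetical names, the `Mission_N` class labels are arbitrary, and escaping embedded quotes by doubling them is an assumption about the config syntax.

```python
def quoted(value: str) -> str:
    """Wrap a value in double quotes, doubling any embedded quote (assumed escape rule)."""
    return '"' + value.replace('"', '""') + '"'


def render_missions_block(rotation: list[tuple[int, str, str]]) -> str:
    """Render (sort_order, mission_name, difficulty) rows as a class Missions block."""
    lines = ["class Missions", "{"]
    # Sorting the tuples orders the entries by sort_order first.
    for index, (_, mission_name, difficulty) in enumerate(sorted(rotation), start=1):
        lines += [
            "    class Mission_" + str(index),
            "    {",
            "        template = " + quoted(mission_name) + ";",
            "        difficulty = " + quoted(difficulty) + ";",
            "    };",
        ]
    lines.append("};")
    return "\n".join(lines)
```

Values are passed through `quoted()` instead of being interpolated raw, in line with the structured-builder decision for config generation.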
---
### Table: `players`
Currently connected players (live state, refreshed by RConPollerThread).
```sql
CREATE TABLE players (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
player_num INTEGER NOT NULL, -- BE player# (slot number)
name TEXT NOT NULL,
guid TEXT, -- BattlEye GUID
steam_uid TEXT,
ip TEXT,
ping INTEGER,
verified INTEGER NOT NULL DEFAULT 0, -- 1 = signature verified
joined_at TEXT NOT NULL DEFAULT (datetime('now')),
updated_at TEXT NOT NULL DEFAULT (datetime('now')),
UNIQUE (server_id, player_num)
);
CREATE INDEX idx_players_server ON players(server_id);
```
---
### Table: `player_history`
Historical record of connections. Inserted when player disconnects.
```sql
CREATE TABLE player_history (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
name TEXT NOT NULL,
guid TEXT,
steam_uid TEXT,
ip TEXT,
joined_at TEXT NOT NULL,
left_at TEXT NOT NULL DEFAULT (datetime('now')),
session_duration_seconds INTEGER
);
CREATE INDEX idx_player_history_server ON player_history(server_id);
CREATE INDEX idx_player_history_steam ON player_history(steam_uid);
```
### Player History Retention Cleanup (run daily via APScheduler, keep 90 days)
```sql
DELETE FROM player_history
WHERE left_at < datetime('now', '-90 days');
```
---
### Table: `bans`
Local ban records (source of truth for the UI). **Must sync bidirectionally with `battleye/ban.txt`** — BattlEye reads only from ban.txt. On API ban add/delete: also write to ban.txt. On startup: read ban.txt and upsert into DB.
**ban.txt format** (one entry per line):
```
GUID|IP timestamp reason
```
Example: `a1b2c3d4e5f6|192.168.1.1 1713260000 Cheating`
**Sync caveats:** ban.txt does not store `banned_by`, `expires_at`, or `is_active`. Timed bans are represented by a future timestamp (not minutes); permanent bans have timestamp `0`. On startup, `banned_by` is set to `'ban.txt'` for entries read from file. Deactivated bans (`is_active=0`) are not written to ban.txt.
```sql
CREATE TABLE bans (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
guid TEXT,
steam_uid TEXT,
name TEXT,
reason TEXT,
banned_by TEXT, -- admin username
banned_at TEXT NOT NULL DEFAULT (datetime('now')),
expires_at TEXT, -- NULL = permanent
is_active INTEGER NOT NULL DEFAULT 1,
CHECK (is_active IN (0, 1))
);
CREATE INDEX idx_bans_server ON bans(server_id);
CREATE INDEX idx_bans_guid ON bans(guid);
CREATE INDEX idx_bans_steam_uid ON bans(steam_uid);
CREATE INDEX idx_bans_active ON bans(is_active);
```
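
A rough sketch of the ban.txt line mapping described above, assuming the `identity timestamp reason` layout from the example. `BanEntry`, `parse_ban_line` and `format_ban_line` are hypothetical names; the identity field is kept verbatim because the document does not specify how GUID and IP are split.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"  # same shape as SQLite's datetime('now')


@dataclass
class BanEntry:
    identity: str               # GUID and/or IP field, kept verbatim
    expires_at: Optional[str]   # UTC datetime string, None = permanent
    reason: str


def parse_ban_line(line: str) -> Optional[BanEntry]:
    """Parse one ban.txt line; return None for blank or malformed lines."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        return None
    try:
        timestamp = int(parts[1])
    except ValueError:
        return None
    reason = parts[2] if len(parts) == 3 else ""
    expires_at = None  # a non-positive timestamp is treated as a permanent ban
    if timestamp > 0:
        expires_at = datetime.fromtimestamp(timestamp, tz=timezone.utc).strftime(TIME_FORMAT)
    return BanEntry(parts[0], expires_at, reason)


def format_ban_line(entry: BanEntry) -> str:
    """Render a BanEntry back into the ban.txt layout."""
    timestamp = 0
    if entry.expires_at is not None:
        parsed = datetime.strptime(entry.expires_at, TIME_FORMAT).replace(tzinfo=timezone.utc)
        timestamp = int(parsed.timestamp())
    return f"{entry.identity} {timestamp} {entry.reason}".rstrip()
```

Fields that ban.txt cannot represent (`banned_by`, `is_active`) stay out of this mapping and would be filled in by the repository layer.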
---
### Table: `logs`
Parsed RPT log lines (rolling retention, default 7 days).
```sql
CREATE TABLE logs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
timestamp TEXT NOT NULL,
level TEXT NOT NULL DEFAULT 'info' CHECK (level IN ('info', 'warning', 'error')), -- 'info' | 'warning' | 'error'
message TEXT NOT NULL,
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX idx_logs_server_ts ON logs(server_id, timestamp);
CREATE INDEX idx_logs_level ON logs(level); -- for ?level= filter
CREATE INDEX idx_logs_created ON logs(created_at); -- for retention cleanup
```
---
### Table: `metrics`
Time-series CPU/RAM/player count snapshots.
```sql
CREATE TABLE metrics (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
timestamp TEXT NOT NULL DEFAULT (datetime('now')),
cpu_percent REAL,
ram_mb REAL,
player_count INTEGER
);
CREATE INDEX idx_metrics_server_ts ON metrics(server_id, timestamp);
```
---
### Table: `server_events`
Audit trail of all significant events (start, stop, crash, restart, admin actions).
```sql
CREATE TABLE server_events (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL REFERENCES servers(id) ON DELETE CASCADE,
event_type TEXT NOT NULL,
-- event_type values:
-- 'started' | 'stopped' | 'crashed' | 'restarted' | 'config_updated'
-- 'player_kicked' | 'player_banned' | 'mission_changed' | 'admin_login'
-- 'rcon_command' | 'auto_restarted'
actor TEXT, -- username or 'system'
detail TEXT, -- JSON with event-specific data
created_at TEXT NOT NULL DEFAULT (datetime('now'))
);
CREATE INDEX idx_events_server ON server_events(server_id, created_at);
```
---
### Table: `rcon_configs`
BattlEye RCon credentials per server.
```sql
CREATE TABLE rcon_configs (
id INTEGER PRIMARY KEY AUTOINCREMENT,
server_id INTEGER NOT NULL UNIQUE REFERENCES servers(id) ON DELETE CASCADE,
rcon_password TEXT NOT NULL, -- encrypted at app layer
max_ping INTEGER NOT NULL DEFAULT 200,
enabled INTEGER NOT NULL DEFAULT 1,
updated_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```
---
## Relationships Diagram
```
users (1) ──────────────────────────────────── (many) server_events.actor
servers (1) ──┬── (1) server_configs
├── (1) basic_configs
├── (1) server_profiles
├── (1) launch_params
├── (1) rcon_configs
├── (many) server_mods ──── (many) mods
├── (many) missions
├── (many) mission_rotation → missions
├── (many) players
├── (many) player_history
├── (many) bans
├── (many) logs
├── (many) metrics
└── (many) server_events
```
---
## Maintenance Queries
### Log Retention Cleanup (run daily via APScheduler)
```sql
DELETE FROM logs
WHERE created_at < datetime('now', '-7 days');
```
### Metrics Retention Cleanup (keep 30 days)
```sql
DELETE FROM metrics
WHERE timestamp < datetime('now', '-30 days');
```
### Clear disconnected players on server stop
```sql
DELETE FROM players WHERE server_id = ?;
```
### Vacuum (run weekly)
```sql
VACUUM;
```
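
The retention queries in this document could be driven from one scheduled function. The sketch below uses the standard-library `sqlite3` module; `RETENTION_RULES` and `run_retention_cleanup` are hypothetical names, not part of the planned service classes.

```python
import sqlite3

# (table, timestamp column, retention in days), mirroring the cleanup queries in this document.
RETENTION_RULES = [
    ("logs", "created_at", 7),
    ("metrics", "timestamp", 30),
    ("player_history", "left_at", 90),
]


def run_retention_cleanup(conn: sqlite3.Connection) -> dict[str, int]:
    """Delete expired rows and return the number removed per table."""
    deleted = {}
    for table, column, days in RETENTION_RULES:
        # Table and column names come from the constant above, never from user input;
        # only the time modifier is passed as a bound parameter.
        cursor = conn.execute(
            f"DELETE FROM {table} WHERE {column} < datetime('now', ?)",
            (f"-{days} days",),
        )
        deleted[table] = cursor.rowcount
    conn.commit()
    return deleted
```

Returning the per-table counts makes it easy to log how much each nightly run removed.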
---
## Migration Strategy
- Migrations are plain `.sql` files in `backend/migrations/`
- Naming: `001_initial_schema.sql`, `002_add_bans.sql`, etc.
- Tracked in a `schema_migrations` table:
```sql
CREATE TABLE schema_migrations (
version INTEGER PRIMARY KEY,
applied_at TEXT NOT NULL DEFAULT (datetime('now'))
);
```
- Applied automatically at app startup by `database.py:run_migrations()`
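
A minimal sketch of what `run_migrations()` might look like with the standard-library `sqlite3` module. The signature is an assumption, and so is the choice to have the runner create `schema_migrations` itself so that a fresh database can be queried before any file has been applied.

```python
import re
import sqlite3
from pathlib import Path


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[int]:
    """Apply unapplied NNN_name.sql files in version order; return the versions applied."""
    # Assumption: create the tracking table up front so the SELECT below cannot fail.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "version INTEGER PRIMARY KEY, "
        "applied_at TEXT NOT NULL DEFAULT (datetime('now')))"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_migrations")}
    newly_applied = []
    # Zero-padded prefixes make lexicographic order match version order.
    for path in sorted(migrations_dir.glob("*.sql")):
        match = re.match(r"(\d+)_", path.name)
        if not match or int(match.group(1)) in applied:
            continue
        version = int(match.group(1))
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (version) VALUES (?)", (version,))
        conn.commit()
        newly_applied.append(version)
    return newly_applied
```

Running it twice is a no-op the second time, which is what makes it safe to call on every startup.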

445
IMPLEMENTATION_PLAN.md Normal file

@@ -0,0 +1,445 @@
# Languard Server Manager — Implementation Plan
## Prerequisites
Before starting, ensure the following are available:
- Python 3.11+
- A working Arma 3 dedicated server installation (for testing)
- Node.js 18+ (for frontend dev server)
- The reference docs: ARCHITECTURE.md, DATABASE.md, API.md, MODULES.md, THREADING.md
---
## Phase 1 — Foundation (Start Here)
**Goal:** Running FastAPI server with DB, auth, and basic server CRUD.
### Step 1.1 — Project scaffold
```
mkdir backend
cd backend
python -m venv venv
# Windows (cmd/PowerShell); on Linux/macOS use: source venv/bin/activate
venv\Scripts\activate
pip install fastapi uvicorn[standard] sqlalchemy python-jose[cryptography] passlib[bcrypt] cryptography psutil apscheduler python-multipart slowapi pytest pytest-asyncio httpx
# uvloop (faster event loop) is Linux/macOS only — skip on Windows:
# pip install uvloop # only on Linux/macOS
pip freeze > requirements.txt
```
Create:
- `backend/config.py` — Settings class (see MODULES.md)
- `backend/main.py` — FastAPI app factory, startup/shutdown hooks
- `backend/conftest.py` — pytest fixtures (in-memory SQLite, test client)
- `.env.example` — All env vars documented
### Step 1.2 — Database + Migrations
1. Create `backend/migrations/001_initial_schema.sql` — all tables from DATABASE.md
- Include all CHECK constraints (role, status, verify_signatures, von_codec_quality, etc.)
- Include `PRAGMA busy_timeout=5000` in engine setup
- **Important:** Put `CREATE TABLE IF NOT EXISTS schema_migrations` as the very first
statement — the migration runner queries this table before it can track anything.
2. Create `backend/dal/event_repository.py` — `ServerEventRepository` (needed by Phase 3 threads)
3. Create `backend/database.py`:
- `get_engine()` with WAL + FK pragma
- `run_migrations()` — reads and applies `.sql` files from migrations/
- `get_db()` — FastAPI dependency (sync session)
- `get_thread_db()` — thread-local session factory
4. Call `run_migrations()` in `main.py:on_startup()`
**Test:** Start app, confirm `languard.db` created with all tables. Run `pytest` with in-memory SQLite to verify schema creates cleanly.
### Step 1.3 — Auth module
1. `backend/auth/utils.py` — `hash_password`, `verify_password`, `create_access_token`, `decode_access_token`
2. `backend/auth/schemas.py` — `LoginRequest`, `TokenResponse`, `UserResponse`
3. `backend/auth/service.py` — `AuthService` (create user, login, list users)
4. `backend/auth/router.py` — login, me, users CRUD
5. `backend/dependencies.py` — `get_current_user`, `require_admin`
6. `main.py` — seed default admin user on first startup if users table empty
- **Generate a random password** and print it to stdout once (NOT admin/admin)
- Add rate limiting to `POST /auth/login` (5 attempts/minute per IP via slowapi)
- Add input sanitization for all string fields in auth schemas
**Test:** `POST /api/auth/login` returns JWT. `GET /api/auth/me` with token returns user. Rate limiting returns 429 after 5 failed attempts.
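
The random initial password from item 6 can come from the standard-library `secrets` module. In this sketch `generate_initial_password` and `seed_admin_credentials` are hypothetical names, and hashing plus the DB insert are left to `AuthService`.

```python
import secrets
from typing import Optional


def generate_initial_password(num_bytes: int = 18) -> str:
    """Return a URL-safe random password drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def seed_admin_credentials(user_count: int) -> Optional[tuple[str, str]]:
    """Return (username, plaintext password) to seed, or None if users already exist."""
    if user_count > 0:
        return None
    password = generate_initial_password()
    # Printed exactly once; only the bcrypt hash should ever be stored.
    print(f"Initial admin password: {password}")
    return "admin", password
```

With 18 random bytes the password is 24 characters long, which is well beyond what the login rate limit would allow an attacker to brute-force.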
### Step 1.4 — Server CRUD (no process management yet)
1. `backend/dal/server_repository.py`
2. `backend/dal/config_repository.py`
3. `backend/servers/schemas.py`
4. `backend/servers/router.py` — GET, POST, PUT, DELETE /servers and /servers/{id}
5. `backend/servers/service.py` — CRUD methods only (skip start/stop for now)
6. `backend/utils/file_utils.py` — `ensure_server_dirs()`, `sanitize_filename()`
7. `backend/utils/port_checker.py` — `is_port_in_use()`, `check_server_ports_available()`
8. Port validation on create/start: check game_port through game_port+4
**Test:** Create server via API, confirm DB row + directory created.
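
A possible shape for the port checker, sketched with the standard-library `socket` module. The signatures and the `host` parameter are assumptions. The check binds UDP sockets because Arma 3 game traffic is UDP.

```python
import socket


def is_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket cannot be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):
            # OSError: the port is taken; OverflowError: the port is outside 0-65535.
            return True
    return False


def check_server_ports_available(game_port: int, host: str = "0.0.0.0") -> list[int]:
    """Return the ports in game_port..game_port+4 that cannot be used."""
    return [p for p in range(game_port, game_port + 5) if is_port_in_use(p, host)]
```

An empty list means the whole range is free. There is still a race between this check and the actual server start, so a bind failure at launch must be handled as well.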
---
## Phase 2 — Process Management
**Goal:** Start/stop actual `arma3server.exe` processes.
### Step 2.1 — Config Generator
1. `backend/servers/config_generator.py`
2. **Use a structured builder** (NOT f-strings) — escape double quotes and newlines in all user-supplied string values to prevent config injection
3. Write `server.cfg` covering all params from DATABASE.md, including mission rotation as `class Missions {}` block
4. Write `basic.cfg`
5. Write `server.Arma3Profile` — **written to `servers/{id}/server/server.Arma3Profile`** (Arma 3 reads from the `-name` subdirectory)
6. Write `BESERVER_CFG_TEMPLATE` — **required for BattlEye RCon to work**
```
# servers/{id}/battleye/beserver.cfg
RConPassword {rcon_password}
RConPort {rcon_port}
```
`write_beserver_cfg()` must create the `battleye/` directory and write this file.
Without it BattlEye will not open an RCon port regardless of launch parameters.
7. `build_launch_args()` — assembles full CLI arg list
- Include `-bepath=./battleye` to point BE at the generated config (relative to cwd)
- Include `-profiles=./` and `-name=server` for profile directory
- All relative paths resolve against `cwd=servers/{id}/` set in ProcessManager
8. Set file permissions 0600 on config files containing passwords (server.cfg, beserver.cfg)
**Test:** `ConfigGenerator.write_all(server_id)` → inspect all generated files for correctness.
Verify `servers/{id}/battleye/beserver.cfg` exists with the correct RCon password.
Verify `servers/{id}/server/server.Arma3Profile` exists.
Test config injection prevention: set hostname to `X"; passwordAdmin = "pwned"; //` — verify generated server.cfg does NOT contain the injected directive.
Validate generated `server.cfg` manually by running the server with it.
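
The escaping from item 2 could look roughly like the sketch below. `escape_config_string` and `render_string_setting` are hypothetical names, and doubling embedded quotes is an assumption about how the config parser escapes them, so verify it against a real server before relying on it.

```python
def escape_config_string(value: str) -> str:
    """Make a user-supplied value safe to place between double quotes in a .cfg file."""
    # A newline would let a value start a new directive, so collapse lines into one.
    flattened = " ".join(value.splitlines())
    # Assumption: an embedded quote is escaped by doubling it.
    return flattened.replace('"', '""')


def render_string_setting(key: str, value: str) -> str:
    """Render one `key = "value";` line with the value escaped."""
    return key + ' = "' + escape_config_string(value) + '";'
```

With the injection attempt from the test above, the payload stays inside the hostname string: every quote in the value is doubled, so nothing after `hostname = "` can terminate the literal early.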
### Step 2.2 — Process Manager
1. `backend/servers/process_manager.py` — `ProcessManager` singleton
2. `start(server_id, exe_path, args, cwd=servers/{id}/)` — subprocess.Popen with cwd set to server instance dir
3. `stop(server_id, timeout=30)` — on Windows: `terminate()` = hard kill (no SIGTERM). Graceful shutdown is via RCon `#shutdown` in ServerService.
4. `kill()`, `is_running()`, `get_pid()`
5. `recover_on_startup()` — verify PID is alive AND process name matches arma3server (prevents PID reuse)
6. Wire `ServerService.start()` and `ServerService.stop()`
7. Add `POST /servers/{id}/start`, `POST /servers/{id}/stop`, `POST /servers/{id}/kill` endpoints
**Test:** Start a server via API → confirm process appears in Task Manager. Stop it → confirm process ends.
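
A stripped-down sketch of the process bookkeeping, using only `subprocess`. It is not the planned `ProcessManager` singleton: locking, DB updates and the RCon `#shutdown` step are omitted, and the method signatures are assumptions.

```python
import subprocess


class ProcessManager:
    """Track one child process per server id."""

    def __init__(self) -> None:
        self._processes: dict[int, subprocess.Popen] = {}

    def start(self, server_id: int, exe_path: str, args: list[str], cwd: str) -> int:
        if self.is_running(server_id):
            raise RuntimeError(f"server {server_id} is already running")
        # cwd is the server instance directory, so relative launch paths resolve there.
        proc = subprocess.Popen([exe_path, *args], cwd=cwd)
        self._processes[server_id] = proc
        return proc.pid

    def is_running(self, server_id: int) -> bool:
        proc = self._processes.get(server_id)
        return proc is not None and proc.poll() is None

    def stop(self, server_id: int, timeout: float = 30.0) -> None:
        proc = self._processes.pop(server_id, None)
        if proc is None or proc.poll() is not None:
            return
        proc.terminate()  # on Windows this is already a hard kill
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.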
### Step 2.3 — Config endpoints
1. `GET /servers/{id}/config`
2. `PUT /servers/{id}/config/server`
3. `PUT /servers/{id}/config/basic`
4. `PUT /servers/{id}/config/profile`
5. `PUT /servers/{id}/config/launch`
6. `GET /servers/{id}/config/preview`
**Test:** Update hostname via API → regenerate and start server → confirm new hostname appears in server browser.
---
## Phase 3 — Background Threads
**Goal:** Live monitoring — process crash detection, log tailing, metrics.
### Step 3.1 — Thread infrastructure
1. `backend/threads/base_thread.py` — `BaseServerThread`
2. `backend/threads/thread_registry.py` — `ThreadRegistry` singleton
3. Wire `start_server_threads()` / `stop_server_threads()` into `ServerService.start()` / `ServerService.stop()`
### Step 3.2 — Process Monitor Thread
1. `backend/threads/process_monitor.py`
2. Crash detection + status update in DB
3. Auto-restart with exponential backoff
**Test:** Start server → kill process manually → confirm DB status changes to 'crashed'.
**Test:** Enable auto_restart → kill → confirm server restarts automatically.
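
The restart decision can be kept as a pure function over the `restart_count`, `last_restart_at`, `max_restarts` and `restart_window_seconds` columns. In this sketch `next_restart` is a hypothetical name and the 5 s base delay and 60 s cap are illustrative values, not figures from the design.

```python
from typing import Optional


def next_restart(
    restart_count: int,
    last_restart_at: Optional[float],
    now: float,
    max_restarts: int = 3,
    window_seconds: int = 300,
    base_delay: float = 5.0,
    max_delay: float = 60.0,
) -> tuple[bool, int, float]:
    """Return (allowed, new_restart_count, delay_seconds) for a crashed server."""
    # A quiet period longer than the window resets the counter.
    if last_restart_at is None or now - last_restart_at > window_seconds:
        restart_count = 0
    if restart_count >= max_restarts:
        return False, restart_count, 0.0
    # Exponential backoff: base, 2x base, 4x base, ... capped at max_delay.
    delay = min(max_delay, base_delay * 2 ** restart_count)
    return True, restart_count + 1, delay
```

Keeping this free of I/O lets the unit tests cover the window and backoff logic without starting a real process.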
### Step 3.3 — Log Tail Thread
1. `backend/logs/parser.py` — `RPTParser`
2. `backend/dal/log_repository.py`
3. `backend/threads/log_tail.py`
4. `backend/logs/service.py`
5. `backend/logs/router.py` — `GET /servers/{id}/logs`
**Test:** Start server → `GET /api/servers/{id}/logs` returns recent RPT lines.
### Step 3.4 — Metrics Collector Thread
1. `backend/metrics/service.py`
2. `backend/dal/metrics_repository.py`
3. `backend/threads/metrics_collector.py`
4. `backend/metrics/router.py` — `GET /servers/{id}/metrics`
**Test:** Running server → query metrics endpoint → see CPU/RAM data points.
---
## Phase 4 — BattlEye RCon
**Goal:** Real-time player list, in-game admin commands.
### Step 4.1 — RCon Client
1. `backend/rcon/client.py` — `BERConClient`
2. Implement BE RCon UDP protocol:
- Packet structure: `'BE'` + CRC32 (little-endian) + `0xFF` + type byte + payload
- Login: type `0x00`, payload = password
- Command: type `0x01`, payload = sequence byte + command string
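- Keepalive: an empty command packet (type `0x01`, payload = sequence byte only); per the BattlEye RCon protocol spec, send one at least every 45 seconds when idle. Type `0x02` is reserved for server messages and their acknowledgements.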
3. **Request multiplexer**: track pending requests by sequence byte, route responses to correct caller via `threading.Event` per request. Background receiver thread reads all incoming packets.
4. `parse_players_response()` — parse `players` command output
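5. Handle unsolicited server messages (type 0x02) — acknowledge each one with a type `0x02` packet echoing its sequence byte (the server retransmits and eventually drops clients that do not), then enqueue it for event logging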
BattlEye RCon packet format reference:
```
Login packet (client → server):
42 45 # 'BE'
[CRC32 LE] # checksum of bytes after CRC
FF # packet type prefix
00 # login type
[password] # ASCII password
Command packet:
42 45
[CRC32 LE]
FF
01
[seq byte] # 0x00-0xFF, wraps around
[command] # ASCII command string
Command response (server → client):
42 45
[CRC32 LE]
FF
01 # 0x01 = command response (same type byte as outgoing command)
[seq byte]
[response] # ASCII response text
Server-pushed message (server → client, unsolicited):
42 45
[CRC32 LE]
FF
02 # 0x02 = server message (chat events, kill events, etc.)
[seq byte]
[message] # ASCII message text
```
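
The layout above maps onto `struct` and `zlib.crc32` fairly directly. The following is a sketch of the framing only, with hypothetical function names; multi-packet responses and the UDP socket handling are left out.

```python
import struct
import zlib

LOGIN, COMMAND, SERVER_MESSAGE = 0x00, 0x01, 0x02


def build_packet(packet_type: int, payload: bytes = b"") -> bytes:
    """Frame a payload as 'BE' + CRC32 (little-endian) + 0xFF + type + payload."""
    body = b"\xff" + bytes([packet_type]) + payload
    # The checksum covers everything after the CRC field, starting at 0xFF.
    crc = zlib.crc32(body) & 0xFFFFFFFF
    return b"BE" + struct.pack("<I", crc) + body


def build_login(password: str) -> bytes:
    return build_packet(LOGIN, password.encode("ascii"))


def build_command(seq: int, command: str) -> bytes:
    return build_packet(COMMAND, bytes([seq & 0xFF]) + command.encode("ascii"))


def parse_packet(data: bytes) -> tuple[int, bytes]:
    """Validate the header and CRC, then return (packet_type, payload)."""
    if len(data) < 8 or data[:2] != b"BE" or data[6] != 0xFF:
        raise ValueError("not a BattlEye RCon packet")
    (expected_crc,) = struct.unpack("<I", data[2:6])
    if zlib.crc32(data[6:]) & 0xFFFFFFFF != expected_crc:
        raise ValueError("CRC mismatch")
    return data[7], data[8:]
```

For command responses and server messages, the first payload byte returned by `parse_packet` is the sequence byte.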
**Test:** Connect BERConClient to a running server with BattlEye → successfully login → send `players` → receive response.
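
The request multiplexer from item 3 could be sketched as below: callers reserve a sequence byte, send their command, and block on a `threading.Event` until the receiver thread delivers the matching response. `RequestMultiplexer` and its method names are hypothetical.

```python
import threading
from typing import Optional


class RequestMultiplexer:
    """Match command responses to waiting callers by sequence byte."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._next_seq = 0
        self._pending: dict[int, tuple[threading.Event, list[bytes]]] = {}

    def register(self) -> int:
        """Reserve the next free sequence byte and return it."""
        with self._lock:
            for _ in range(256):
                seq = self._next_seq
                self._next_seq = (seq + 1) & 0xFF  # wraps around after 0xFF
                if seq not in self._pending:
                    self._pending[seq] = (threading.Event(), [])
                    return seq
        raise RuntimeError("all 256 sequence numbers are in flight")

    def resolve(self, seq: int, payload: bytes) -> bool:
        """Called by the receiver thread; returns False if nobody waits on seq."""
        with self._lock:
            entry = self._pending.get(seq)
        if entry is None:
            return False
        event, slot = entry
        slot.append(payload)
        event.set()
        return True

    def wait(self, seq: int, timeout: float = 5.0) -> Optional[bytes]:
        """Block until the response for seq arrives; return None on timeout."""
        with self._lock:
            event, slot = self._pending[seq]
        arrived = event.wait(timeout)
        with self._lock:
            self._pending.pop(seq, None)
        return slot[0] if arrived and slot else None
```

A late response for a sequence byte that has already timed out is reported by `resolve()` returning `False`, so the receiver can log and discard it.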
### Step 4.2 — RCon Service + Poller Thread
1. `backend/rcon/service.py` — `RConService`
2. `backend/threads/rcon_poller.py`
3. `backend/dal/player_repository.py`
4. `backend/players/service.py`
5. `backend/players/router.py` — `GET /servers/{id}/players`
**Test:** Players join server → `GET /players` returns them with pings.
### Step 4.3 — Admin Actions via RCon
1. `POST /servers/{id}/players/{num}/kick`
2. `POST /servers/{id}/players/{num}/ban`
3. `POST /servers/{id}/rcon/command`
4. `POST /servers/{id}/rcon/say`
5. `backend/dal/ban_repository.py`
6. `GET/POST/DELETE /servers/{id}/bans`
7. **ban.txt bidirectional sync**: on ban add/delete via API, write to `battleye/ban.txt`; on startup, read `ban.txt` and upsert into DB
**Test:** Kick a player via API → confirm player disconnected from server.
---
## Phase 5 — WebSocket Real-Time
**Goal:** Live updates to React frontend without polling.
### Step 5.1 — Broadcast infrastructure
1. `backend/websocket/broadcaster.py` — `BroadcastThread` + `enqueue()`
2. `backend/websocket/manager.py` — `ConnectionManager`
3. Store event loop reference in `main.py:on_startup()`:
```python
import asyncio
# on_startup() runs inside the asyncio event loop — use get_running_loop(),
# not get_event_loop() (deprecated in Python 3.10+ from async context).
_event_loop = asyncio.get_running_loop()
broadcaster.init(_event_loop, connection_manager)
```
4. Start `BroadcastThread` in `on_startup()`
5. Wire `BroadcastThread.enqueue()` calls into all background threads
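
The thread-to-event-loop bridge can be sketched with `queue.Queue` and `asyncio.run_coroutine_threadsafe`. This is an illustration only: the constructor arguments here differ from the `broadcaster.init()` call shown above, and `send_coro` stands in for whatever async method `ConnectionManager` ends up exposing.

```python
import asyncio
import queue
import threading


class BroadcastThread(threading.Thread):
    """Drain a thread-safe queue and hand each message to the asyncio event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop, send_coro) -> None:
        super().__init__(daemon=True)
        self._loop = loop
        self._send = send_coro  # async callable taking one message dict
        self._queue: queue.Queue = queue.Queue()
        self._stop_event = threading.Event()

    def enqueue(self, message: dict) -> None:
        """Safe to call from any background thread."""
        self._queue.put(message)

    def stop(self) -> None:
        self._stop_event.set()

    def run(self) -> None:
        while not self._stop_event.is_set():
            try:
                message = self._queue.get(timeout=0.2)
            except queue.Empty:
                continue  # wake up regularly to notice stop()
            # Schedule the coroutine on the loop's own thread; this thread never awaits.
            asyncio.run_coroutine_threadsafe(self._send(message), self._loop)
```

Background threads only ever touch the queue, so none of them needs to know about the event loop or about WebSocket connections.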
### Step 5.2 — WebSocket endpoint
1. `backend/websocket/router.py`
2. JWT validation from query param
3. Subscribe/unsubscribe message handling
4. Ping/pong keepalive
**Test:** Connect to `ws://localhost:8000/ws/1?token=...` → see live log lines stream in terminal.
### Step 5.3 — Integrate all event sources
Wire `BroadcastThread.enqueue()` into:
- `ProcessMonitorThread` → status updates, crash events
- `LogTailThread` → log lines
- `MetricsCollectorThread` → metrics snapshots
- `RConPollerThread` → player list updates
- `ServerService.start/stop` → status transitions
**Test:** React frontend connects to WS → server starts → see status, logs, metrics all update in real time.
---
## Phase 6 — Mission & Mod Management
### Step 6.1 — Missions
1. `backend/missions/service.py`
2. `backend/missions/router.py`
3. Upload PBO validation (check `.pbo` extension, parse name)
4. Mission rotation CRUD
**Test:** Upload a `.pbo` → appears in `GET /missions` → set as rotation → start server → mission available.
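
Name parsing for uploads can follow the `filename` / `mission_name` / `terrain` examples in DATABASE.md. `parse_mission_filename` is a hypothetical helper, and the rule that the terrain is the last dot-separated part before `.pbo` is inferred from those examples.

```python
def parse_mission_filename(filename: str) -> tuple[str, str]:
    """Return (mission_name, terrain) for a name like 'MyMission.Altis.pbo'."""
    # Keep only the final path component so an upload name cannot carry directories.
    name = filename.replace("\\", "/").rsplit("/", 1)[-1]
    if not name.lower().endswith(".pbo"):
        raise ValueError("mission file must have a .pbo extension")
    mission_name = name[:-4]
    base, sep, terrain = mission_name.rpartition(".")
    if not sep or not base or not terrain:
        raise ValueError("expected <name>.<terrain>.pbo")
    return mission_name, terrain
```

Rejecting names without a terrain part early gives the API a clear 400 response instead of a mission the server cannot load.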
### Step 6.2 — Mods
1. `backend/mods/service.py`
2. `backend/mods/router.py`
3. `build_mod_string()` — assemble `-mod=` and `-serverMod=` args
4. Wire mod string into `ConfigGenerator.build_launch_args()`
**Test:** Register `@CBA_A3` → enable on server → start → server loads mod.
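
A sketch of assembling the mod arguments from `server_mods` rows. It returns a list of arguments rather than one string, uses the hypothetical name `build_mod_args`, and assumes `;` as the separator between mod folders.

```python
def build_mod_args(mods: list[tuple[str, bool, int]], separator: str = ";") -> list[str]:
    """Build -mod= / -serverMod= arguments from (folder_path, is_server_mod, sort_order) rows."""
    ordered = sorted(mods, key=lambda row: row[2])  # honour sort_order for load order
    client_mods = [path for path, is_server, _ in ordered if not is_server]
    server_mods = [path for path, is_server, _ in ordered if is_server]
    args = []
    if client_mods:
        args.append("-mod=" + separator.join(client_mods))
    if server_mods:
        args.append("-serverMod=" + separator.join(server_mods))
    return args
```

Returning separate list items fits `subprocess.Popen`, which takes the launch arguments as a list and so needs no extra quoting around paths with spaces.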
---
## Phase 7 — Polish & Production
### Step 7.1 — APScheduler jobs
Add to `on_startup()`:
```python
# Use BackgroundScheduler (not AsyncIOScheduler) because cleanup methods
# perform sync SQLite operations. AsyncIOScheduler would block the event loop.
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
scheduler.add_job(log_service.cleanup_old_logs, 'cron', hour=3)
scheduler.add_job(metrics_service.cleanup_old_metrics, 'cron', hour=3, minute=30)
scheduler.add_job(player_service.cleanup_old_history, 'cron', hour=4) # 90-day retention
scheduler.start()
```
### Step 7.2 — Startup recovery
In `on_startup()` → `ProcessManager.recover_on_startup()`:
- Query DB for servers with `status='running'`
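- Check if PID still alive (`psutil.pid_exists(pid)`) and that the process name still matches arma3server, so a reused PID is not mistaken for the server (see Step 2.2)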
- If alive: re-attach threads (skip process start, just start monitoring threads)
- If dead: mark as `crashed`, clear players
### Step 7.3 — Events log
1. `backend/dal/event_repository.py`
2. Insert events for: start, stop, crash, kick, ban, config change, mission change
3. `GET /servers/{id}/events` endpoint
### Step 7.4 — Security hardening (additional layers)
1. Encrypt sensitive DB fields: `password`, `password_admin`, `server_command_password`, `rcon_password`
- `backend/utils/crypto.py` with Fernet
- **Key format:** `LANGUARD_ENCRYPTION_KEY` must be a Fernet base64 key, NOT hex.
Generate with: `python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())"`
Passing a hex string to `Fernet()` raises `ValueError` at startup.
- Encrypt on write, decrypt on read in repositories
- **NOTE:** Core security (rate limiting, input sanitization, config escaping, exe path validation) is already in Phases 1-2.
2. Additional penetration testing and security audit
3. Content-Security-Policy headers for frontend
### Step 7.5 — Frontend integration checklist
Verify React app can:
- [ ] Login and store JWT
- [ ] List servers with live status
- [ ] Start/stop server and see status update via WebSocket (no page refresh)
- [ ] View streaming log output
- [ ] See player list update every 10s
- [ ] See CPU/RAM charts update every 5s
- [ ] Edit all config sections and see preview
- [ ] Upload a mission PBO
- [ ] Kick a player
- [ ] Send a message to all players
---
## Testing Strategy
### Unit tests (pytest)
- `ConfigGenerator.write_server_cfg()` — compare output against expected string; test config injection prevention
- `ConfigGenerator._escape_config_string()` — test double-quote and newline escaping
- `RPTParser.parse_line()` — test all log formats
- `BERConClient.parse_players_response()` — test with sample output
- `AuthService.login()` — correct password / wrong password / rate limiting
- Repository methods — use in-memory SQLite (`:memory:`)
- `check_server_ports_available()` — test derived port validation
- `sanitize_filename()` — test path traversal prevention
- In-memory SQLite setup in `conftest.py` — shared fixture for all repository tests
### Integration tests
- Full start/stop cycle with a real arma3server.exe (manual — requires licensed Arma 3 installation, not in CI)
- WebSocket message delivery (can be automated with FastAPI's `TestClient.websocket_connect()`; httpx itself has no WebSocket support)
- RCon command round-trip (manual — requires running server with BattlEye)
### Load notes
- SQLite with WAL handles concurrent reads from 4 threads per server well
- For >10 simultaneous servers, consider connection pool size tuning
- WebSocket broadcast is expected to handle around 100 concurrent connections; load-test before relying on higher counts
---
## Environment Setup (Developer)
```bash
# 1. Clone repo
git clone <repo>
cd languard-server-manager
# 2. Backend
cd backend
python -m venv venv
source venv/bin/activate # or venv\Scripts\activate on Windows
pip install -r requirements.txt
# 3. Environment
cp .env.example .env
# Edit .env: set LANGUARD_ARMA_EXE to your arma3server_x64.exe path
# 4. Run backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000
# 5. Frontend (separate)
cd ../frontend
npm install
npm run dev
```
Backend auto-creates `languard.db` and seeds an admin user on first run:
- Username: `admin`
- Password: **randomly generated** and printed to stdout once (e.g., `Initial admin password: a7b9c2d4e5f6...`)
- Change immediately via `PUT /api/auth/password`
---
## Phase Summary
| Phase | Deliverable | Est. Complexity |
|-------|-------------|----------------|
| 1 | Foundation (auth + server CRUD) | Low |
| 2 | Process management + config gen | Medium |
| 3 | Background threads (monitor, logs, metrics) | Medium-High |
| 4 | BattlEye RCon (player list, admin cmds) | High |
| 5 | WebSocket real-time | Medium |
| 6 | Mission + mod management | Low-Medium |
| 7 | Polish, security, recovery | Medium |

Implement phases in order: each phase builds on the previous one and is independently testable.

MODULES.md Normal file

@@ -0,0 +1,900 @@
# Languard Server Manager — Python Module Breakdown
## Project Structure
```
backend/
├── main.py
├── config.py
├── database.py
├── dependencies.py
├── auth/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ ├── schemas.py
│ └── utils.py
├── servers/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ ├── schemas.py
│ ├── process_manager.py
│ └── config_generator.py
├── rcon/
│ ├── __init__.py
│ ├── client.py
│ └── service.py
├── missions/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ └── schemas.py
├── mods/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ └── schemas.py
├── players/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ └── schemas.py
├── logs/
│ ├── __init__.py
│ ├── router.py
│ ├── service.py
│ └── parser.py
├── metrics/
│ ├── __init__.py
│ ├── router.py
│ └── service.py
├── websocket/
│ ├── __init__.py
│ ├── router.py
│ ├── manager.py
│ └── broadcaster.py
├── threads/
│ ├── __init__.py
│ ├── base_thread.py
│ ├── process_monitor.py
│ ├── log_tail.py
│ ├── metrics_collector.py
│ ├── rcon_poller.py
│ └── thread_registry.py
├── system/
│ ├── __init__.py
│ └── router.py
├── dal/
│ ├── __init__.py
│ ├── base_repository.py
│ ├── server_repository.py
│ ├── config_repository.py
│ ├── player_repository.py
│ ├── log_repository.py
│ ├── metrics_repository.py
│ ├── mission_repository.py
│ ├── mod_repository.py
│ ├── ban_repository.py
│ └── event_repository.py
├── migrations/
│ ├── runner.py
│ └── 001_initial_schema.sql
└── utils/
  ├── __init__.py
  ├── crypto.py
  ├── file_utils.py
  └── port_checker.py
```
---
## Module Details
### `main.py`
Entry point. Creates and configures the FastAPI application.
```python
# Responsibilities:
# - Create FastAPI app instance
# - Register all routers with prefix /api
# - Configure CORS middleware
# - Add JWT auth middleware
# - Register startup/shutdown event handlers:
#     startup:  run DB migrations, init ProcessManager, restore running servers
#     shutdown: stop all per-server threads (ThreadRegistry.stop_all()),
#               stop the BroadcastThread, close DB
# - Mount static files if serving frontend
# - NOTE: Route handlers that perform blocking I/O (subprocess, file writes,
#   socket checks) MUST be declared as plain `def` (not `async def`).
#   FastAPI automatically runs plain-def handlers in a thread pool,
#   preventing event loop blocking. Only truly async operations
#   (WebSocket send, async library calls) should use `async def`.
Key functions:
    create_app() -> FastAPI
    on_startup() # DB migrations, recover state
    on_shutdown() # Clean up threads, close connections
```
---
### `config.py`
Loads and validates environment variables using `BaseSettings` from `pydantic-settings` (in Pydantic v2, `BaseSettings` lives in that separate package).
```python
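from pydantic_settings import BaseSettings, SettingsConfigDict


class Settings(BaseSettings):
    # Variables carry the LANGUARD_ prefix (e.g. LANGUARD_ARMA_EXE) and are
    # read from the process environment or from .env
    model_config = SettingsConfigDict(env_prefix="LANGUARD_", env_file=".env")

    secret_key: str
    encryption_key: str # Fernet base64 key (NOT hex)
    db_path: str = "./languard.db"
    servers_dir: str = "./servers"
    arma_exe: str = "C:/Arma3Server/arma3server_x64.exe"
    host: str = "0.0.0.0"
    port: int = 8000
    cors_origins: list[str] = ["http://localhost:5173"]
    log_retention_days: int = 7
    metrics_retention_days: int = 30
    player_history_retention_days: int = 90
    jwt_expire_hours: int = 24
    login_rate_limit: str = "5/minute" # per IP


settings = Settings() # singleton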
```
---
### `database.py`
Database engine setup and session management.
```python
# Responsibilities:
# - Create SQLAlchemy engine with WAL + FK + busy_timeout pragmas
# - Provide get_db() dependency for FastAPI routes (sync session)
# - Provide get_thread_db() for background threads (thread-local sessions)
# - run_migrations(): apply pending .sql migration files at startup
# - Migration rollback: if a migration fails, the schema_migrations table
#   is NOT updated; re-running applies only unapplied migrations (idempotent)
# Pragma setup (run on every new connection; foreign_keys and busy_timeout
# are per-connection settings, journal_mode=WAL persists in the DB file):
#   PRAGMA journal_mode=WAL
#   PRAGMA foreign_keys=ON
#   PRAGMA busy_timeout=5000 # wait up to 5s for a lock before raising "database is locked"
Key functions:
    get_engine() -> Engine
    get_db() -> Generator[Connection, None, None] # FastAPI dependency
    get_thread_db() -> Connection # for threads
    run_migrations(engine: Engine)
```
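The pragma setup and the thread-local connection rule can be sketched as follows. This is a minimal illustration only: it uses the standard-library `sqlite3` module instead of SQLAlchemy to stay self-contained, and `open_connection` and the `db_path` parameter are hypothetical names, not part of the design.

```python
import sqlite3
import threading

# One slot per thread; each thread sees only its own `conn` attribute.
_local = threading.local()


def open_connection(db_path: str) -> sqlite3.Connection:
    """Open a connection and apply the three pragmas (hypothetical helper)."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")   # persistent: stored in the DB file
    conn.execute("PRAGMA foreign_keys=ON")    # per-connection: must be re-applied
    conn.execute("PRAGMA busy_timeout=5000")  # per-connection: milliseconds
    return conn


def get_thread_db(db_path: str) -> sqlite3.Connection:
    """Return the calling thread's connection, creating it on first use."""
    conn = getattr(_local, "conn", None)
    if conn is None:
        conn = open_connection(db_path)
        _local.conn = conn
    return conn
```

Because `foreign_keys` and `busy_timeout` do not persist, an engine-based implementation would need to apply them whenever the pool opens a new connection, not once at startup.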
---
### `dependencies.py`
Reusable FastAPI dependencies.
```python
# Responsibilities:
# - get_current_user(token) -> User (JWT validation)
# - require_admin(user) -> User (role check)
# - get_server_or_404(server_id, db) -> ServerRow
Key functions:
    get_current_user(credentials: HTTPAuthorizationCredentials) -> User
    require_admin(user: User = Depends(get_current_user)) -> User
    get_server_or_404(server_id: int, db: Connection) -> dict
```
---
### `auth/`
**`router.py`** — FastAPI router for auth endpoints.
- `POST /auth/login`
- `POST /auth/logout`
- `GET /auth/me`
- `PUT /auth/password`
- `GET /auth/users` (admin)
- `POST /auth/users` (admin)
- `DELETE /auth/users/{user_id}` (admin)
**`service.py`** — `AuthService`
```python
class AuthService:
    def login(username, password) -> TokenResponse
    def create_user(username, password, role) -> User
    def change_password(user_id, current_pw, new_pw) -> bool
    def list_users() -> list[User]
    def delete_user(user_id) -> bool
```
**`utils.py`**
```python
def hash_password(password: str) -> str # bcrypt
def verify_password(plain, hashed) -> bool
def create_access_token(data: dict) -> str # JWT sign
def decode_access_token(token: str) -> dict # JWT verify
```
---
### `servers/`
**`router.py`** — All server CRUD + lifecycle endpoints.
- `GET /servers`
- `POST /servers`
- `GET /servers/{id}`
- `PUT /servers/{id}`
- `DELETE /servers/{id}`
- `POST /servers/{id}/start`
- `POST /servers/{id}/stop`
- `POST /servers/{id}/restart`
- `POST /servers/{id}/kill`
- `GET /servers/{id}/config`
- `PUT /servers/{id}/config/server`
- `PUT /servers/{id}/config/basic`
- `PUT /servers/{id}/config/profile`
- `PUT /servers/{id}/config/launch`
- `PUT /servers/{id}/config/rcon`
- `GET /servers/{id}/config/preview`
- `GET /servers/{id}/config/download/{filename}`
**`service.py`** — `ServerService`
```python
class ServerService:
    def list_servers() -> list[ServerSummary]
    def get_server(server_id) -> ServerDetail
    def create_server(data: CreateServerRequest) -> Server
    def update_server(server_id, data) -> Server
    def delete_server(server_id) -> bool
    def start(server_id) -> StatusResponse
        # 1. Load config from DB
        # 2. Validate exe_path exists and basename matches allowlist
        #    (arma3server_x64.exe, arma3server.exe); prevents executing arbitrary binaries
        # 3. Check ALL derived ports not in use (game_port through game_port+3 + rcon_port)
        # 4. ConfigGenerator.write_all(server_id)
        #    If write fails: DB status='error', return error (no process launch)
        # 5. Build launch args
        # 6. ProcessManager.start(server_id, exe, args, cwd=servers/{id}/)
        # 7. DB: status = 'starting'
        # 8. ThreadRegistry.start_server_threads(server_id)
        # 9. Broadcast status update
    def stop(server_id, force=False) -> StatusResponse
        # 1. If not force: RConService.send_command('#shutdown')
        # 2. Wait up to 30s for process exit
        # 3. If still running: ProcessManager.kill(server_id)
        # 4. DB: status = 'stopped', pid = NULL
        # 5. ThreadRegistry.stop_server_threads(server_id)
        # 6. PlayerRepository.clear(server_id)
        # 7. Broadcast status update
    def restart(server_id) -> StatusResponse
    def update_config_server(server_id, data) -> ServerConfig
    def update_config_basic(server_id, data) -> BasicConfig
    def update_config_profile(server_id, data) -> ServerProfile
    def update_config_launch(server_id, data) -> LaunchParams
    def update_config_rcon(server_id, data) -> RConConfig
        # Updates rcon_configs row (rcon_password, max_ping, enabled) via ConfigRepository.
        # If data includes rcon_port, also updates servers.rcon_port via ServerRepository:
        # rcon_port lives in the servers table, not rcon_configs.
        # Regenerates battleye/beserver.cfg immediately after saving.
    def get_config_preview(server_id) -> str
```
**`process_manager.py`** — `ProcessManager` singleton
```python
class ProcessManager:
    _instance = None
    _processes: dict[int, subprocess.Popen] = {}
    _lock: threading.Lock
    @classmethod
    def get() -> ProcessManager
    def start(server_id, exe_path, args: list[str], cwd: str) -> int # returns PID
        # subprocess.Popen([exe_path, *args], cwd=cwd, stdout=PIPE, stderr=STDOUT)
        # cwd = servers/{server_id}/ so relative config paths resolve correctly
        # NOTE: a PIPE that nobody reads blocks the child once the OS pipe buffer
        # fills; drain stdout in a thread, or pass subprocess.DEVNULL instead
    def stop(server_id, timeout=30) -> bool
        # On Windows: Popen.terminate() calls TerminateProcess (hard kill, no SIGTERM)
        # Graceful shutdown is handled by ServerService via RCon #shutdown first.
        # This method is the forceful fallback: terminate() → wait(timeout)
    def kill(server_id) -> bool
        # terminate() immediately (hard kill on Windows, where Popen.kill() is an alias)
    def is_running(server_id) -> bool
    def get_pid(server_id) -> int | None
    def get_process(server_id) -> subprocess.Popen | None
    def list_running() -> list[int] # list of server_ids
    def recover_on_startup(db)
        # At app startup: query DB for servers with status='running'
        # Check if pid still alive AND verify process name matches arma3server
        # (prevents PID reuse by unrelated processes)
        # If alive: re-attach monitoring threads (skip process start)
        # If dead or wrong process: mark crashed, clear players
```
**`config_generator.py`** — `ConfigGenerator`
```python
class ConfigGenerator:
    def write_all(server_id: int, db: Connection) -> None
        # Writes server.cfg, basic.cfg, server.Arma3Profile, battleye/beserver.cfg
        # Creates directories if they don't exist
        # Sets restrictive file permissions on files containing passwords:
        #   Unix: chmod 0600
        #   Windows: use icacls to grant only the service account read/write access
        # Raises IOError if write fails; caller must handle (set DB status='error')
    def write_server_cfg(server_id, config: dict, path: Path) -> None
        # Uses structured builder, NOT f-strings or string.Template
        # Escapes double quotes in all string values (replace " with \")
        # Validates no newline injection in string fields
        # Renders mission rotation as class Missions { class Mission1 { ... }; };
    def write_basic_cfg(server_id, config: dict, path: Path) -> None
    def write_arma3profile(server_id, profile: dict, path: Path) -> None
        # Writes to servers/{id}/server/server.Arma3Profile (profile subdirectory)
    def write_beserver_cfg(server_id, rcon_config: dict, path: Path) -> None
        # Generates servers/{id}/battleye/beserver.cfg
        # Content: "RConPassword <password>\nRConPort <port>\n"
        # Without this file BattlEye will not open an RCon port.
    def build_launch_args(server_id, config: dict, launch: dict, mod_string: str) -> list[str]
        # Returns list of command-line arguments for arma3server_x64.exe
        # e.g. ['-port=2302', '-config=server.cfg', '-cfg=basic.cfg',
        #       '-profiles=./', '-name=server', '-world=empty',
        #       '-mod=@CBA;@ACE', '-serverMod=@ACE_server',
        #       '-bepath=./battleye',
        #       '-limitFPS=50', '-autoInit', '-loadMissionToMemory', ...]
        # NOTE: -profiles is relative to cwd (which is set to servers/{id}/)
        # -bepath is required for BattlEye to find beserver.cfg
    def _escape_config_string(value: str) -> str
        # Escapes backslashes FIRST, then double quotes and newlines for safe Arma 3 config interpolation.
        # Order matters: backslash → \\, then " → \", then newline → \n
        # If backslashes are not escaped first, the input  test\"  becomes  test\\"
        # which Arma 3 reads as an escaped backslash + unescaped closing quote = injection.
        value = value.replace('\\', '\\\\') # backslash FIRST
        value = value.replace('"', '\\"') # then double-quote
        value = value.replace('\n', '\\n') # then newline
        value = value.replace('\r', '') # strip carriage returns
        value = value.replace('\t', ' ') # tabs → spaces
        return value
    def _render_mission_class(rotation: list[dict]) -> str
        # Renders the class Missions {} block for server.cfg
        # class Missions { class Mission1 { template="..."; difficulty="..."; }; ... };
```
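The argument list described for `build_launch_args` could be assembled roughly as below. This is a hedged sketch, not the project's implementation: the function name `build_launch_args_sketch`, the `server_mod_string` parameter, and the dictionary keys (`game_port`, `limit_fps`, `auto_init`, `load_mission_to_memory`) are assumptions made for illustration. Only flags that appear in the example above are used.

```python
def build_launch_args_sketch(config: dict, launch: dict,
                             mod_string: str = "",
                             server_mod_string: str = "") -> list:
    """Build the argv list (without the executable) for one server."""
    args = [
        f"-port={int(config['game_port'])}",  # int() rejects non-numeric input
        "-config=server.cfg",
        "-cfg=basic.cfg",
        "-profiles=./",        # relative to cwd, i.e. servers/{id}/
        "-name=server",
        "-bepath=./battleye",  # lets BattlEye find beserver.cfg
    ]
    # Optional flags are appended only when configured.
    if mod_string:
        args.append(f"-mod={mod_string}")
    if server_mod_string:
        args.append(f"-serverMod={server_mod_string}")
    if launch.get("limit_fps"):
        args.append(f"-limitFPS={int(launch['limit_fps'])}")
    if launch.get("auto_init"):
        args.append("-autoInit")
    if launch.get("load_mission_to_memory"):
        args.append("-loadMissionToMemory")
    return args
```

Passing this list to `subprocess.Popen([exe_path, *args])` without `shell=True` means no shell ever sees the values, so the `;` inside a mod string stays an ordinary character.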
---
### `rcon/`
**`client.py`** — `BERConClient`
```python
class BERConClient:
    """
    Implements BattlEye RCon protocol over UDP.
    Every packet starts with the header 'B' 'E' + CRC32 of the remaining bytes;
    the bytes listed below follow that header.
    Packet type bytes:
    Client → Server: 0xFF 0x00 [password] → login
    Client → Server: 0xFF 0x01 [seq] [command] → send command
    Client → Server: 0xFF 0x01 [seq] → keepalive (command packet with an empty command)
    Client → Server: 0xFF 0x02 [seq] → acknowledge a server message
    Server → Client: 0xFF 0x00 [0x00|0x01] → login response (0x01=ok)
    Server → Client: 0xFF 0x01 [seq] [response] → command response
    Server → Client: 0xFF 0x02 [seq] [message] → unsolicited server message (chat/kill events)
    Note: 0x01 is the type byte for BOTH outgoing commands AND incoming command responses.
    """
    def __init__(host: str, port: int, password: str)
        # Request multiplexer: prevents response misrouting when
        # RConPollerThread and API-request RConService share the same socket.
        # These are per-instance fields (one client per server), not class attributes.
        self._pending_requests: dict[int, threading.Event] = {} # seq → Event
        self._responses: dict[int, str] = {} # seq → response
        self._seq_counter: int = 0 # one byte on the wire: wraps 255 → 0
        self._lock = threading.Lock()
    def connect() -> bool
    def disconnect()
    def login() -> bool
    def send_command(command: str, timeout: float = 5.0) -> str | None
        # Sends command with sequence number, creates Event, waits for response
        # Routes response to correct caller by matching sequence byte
    def keepalive() # send an empty command packet every 30s (the server drops clients idle for 45s)
    def is_connected() -> bool
    # Background receiver thread:
    def _receiver_loop()
        # Reads all incoming UDP packets
        # For type 0x01 (command response): sets Event + stores response for matching seq
        # For type 0x02 (server message): acknowledges it (0xFF 0x02 [seq]), then
        # enqueues it for processing (player events, chat)
    def parse_players_response(response: str) -> list[PlayerInfo]
        # Parse output of 'players' command
        # Format: "Players on server:\n[#] [IP:Port] [Ping] [GUID] [Name]\n..."
```
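The request multiplexer described for `BERConClient` can be sketched without any networking. The class below is a hypothetical illustration (`RequestMultiplexer`, `register`, `resolve` and `wait` are names invented here): callers reserve a sequence number, the receiver thread resolves it, and each caller gets only its own response.

```python
import threading
from typing import Optional


class RequestMultiplexer:
    """Routes responses to waiting callers by 1-byte sequence number."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._pending: dict = {}    # seq -> threading.Event
        self._responses: dict = {}  # seq -> response text
        self._seq = 0

    def register(self) -> int:
        """Reserve the next sequence number and create its Event."""
        with self._lock:
            seq = self._seq
            self._seq = (self._seq + 1) % 256  # the sequence is a single byte
            self._pending[seq] = threading.Event()
            return seq

    def resolve(self, seq: int, response: str) -> None:
        """Called by the receiver thread when a command response arrives."""
        with self._lock:
            event = self._pending.get(seq)
            if event is None:
                return  # late or unknown response: drop it
            self._responses[seq] = response
        event.set()

    def wait(self, seq: int, timeout: float = 5.0) -> Optional[str]:
        """Block until the response for `seq` arrives or the timeout expires."""
        event = self._pending[seq]
        event.wait(timeout)
        with self._lock:
            # Remove both entries so a timed-out request leaves nothing behind.
            self._pending.pop(seq, None)
            return self._responses.pop(seq, None)
```

In this sketch `send_command()` would call `register()`, send the packet with that sequence byte, and then call `wait()`, while `_receiver_loop()` calls `resolve()` for each type `0x01` packet.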
**`service.py`** — `RConService`
```python
class RConService:
    def __init__(server_id: int)
    def send_command(command: str) -> str | None
        # Gets or creates BERConClient, sends command, returns response
    def kick_player(player_num: int, reason: str = "") -> bool
    def ban_player(player_num: int, duration_minutes: int, reason: str) -> bool
    def unban(guid: str) -> bool
    def say_all(message: str) -> bool
    def get_players() -> list[PlayerInfo]
    def send_mission_command(mission_name: str) -> bool
    def shutdown() -> bool
    def restart() -> bool
    def lock() -> bool
    def unlock() -> bool
```
---
### `missions/`
**`service.py`** — `MissionService`
```python
class MissionService:
    def list_missions(server_id) -> list[Mission]
    def upload_mission(server_id, filename: str, file_data: bytes) -> Mission
        # 1. Validate .pbo extension
        # 2. Parse mission_name and terrain from filename
        # 3. Write to servers/{server_id}/mpmissions/{filename}
        # 4. Insert into missions table
        # 5. Return Mission object
    def delete_mission(server_id, mission_id) -> bool
        # 1. Check not in active rotation
        # 2. Delete file from disk
        # 3. Delete from DB
    def get_rotation(server_id) -> list[RotationEntry]
    def update_rotation(server_id, rotation: list[RotationEntry]) -> bool
        # 1. Delete existing rotation rows
        # 2. Insert new ordered list
        # 3. Trigger config regeneration
```
---
### `mods/`
**`service.py`** — `ModService`
```python
class ModService:
    def list_all() -> list[Mod]
    def register_mod(name, folder_path, workshop_id, description) -> Mod
        # Validates folder exists
    def delete_mod(mod_id) -> bool
        # Check not in use by any server
    def get_server_mods(server_id) -> list[ServerMod]
    def update_server_mods(server_id, mods: list) -> bool
        # Replaces server_mods rows, regenerates mod string
    def build_mod_string(server_id) -> tuple[str, str]
        # Returns (-mod=..., -serverMod=...) strings
```
---
### `players/`
**`service.py`** — `PlayerService`
```python
class PlayerService:
    def get_current_players(server_id) -> list[Player]
    def kick(server_id, player_num, reason) -> bool
        # RConService.kick_player() + log event
    def ban(server_id, player_num, duration_minutes, reason) -> bool
        # RConService.ban_player() + insert into bans table
    def get_history(server_id, limit, offset, search) -> PaginatedResult
    def update_from_rcon(server_id, rcon_players: list) -> None
        # Upsert players table; detect disconnections; insert player_history rows
```
---
### `logs/`
**`parser.py`** — `RPTParser`
```python
class RPTParser:
    # Parses Arma 3 RPT log format
    # Example line: "10:05:23 BattlEye Server: Initialized (v1.240)"
    # With timestamp format "short": "10:05:23"
    # With timestamp format "full": "2026/04/16, 10:05:23"
    def parse_line(line: str) -> LogEntry | None
        # Returns: {timestamp, level, message}
        # level detection: 'error' if 'Error' in msg, 'warning' if 'Warning', else 'info'
    def parse_timestamp(raw: str) -> datetime
```
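A minimal sketch of `parse_line`, assuming only the two timestamp formats and the level rule listed above. The `LogEntry` tuple and the regular expression are illustrative assumptions; real RPT files may contain line shapes this pattern does not cover, and this version keeps the timestamp as raw text rather than converting it.

```python
import re
from typing import NamedTuple, Optional


class LogEntry(NamedTuple):
    timestamp: str  # raw timestamp text as it appeared in the line
    level: str      # 'error', 'warning' or 'info'
    message: str


# Optional "YYYY/MM/DD, " date prefix ("full" format), then HH:MM:SS, then text.
_LINE = re.compile(r"^(?:(\d{4}/\d{2}/\d{2}), )?(\d{2}:\d{2}:\d{2}) (.*)$")


def detect_level(message: str) -> str:
    """Apply the substring rule from the design: Error, then Warning, else info."""
    if "Error" in message:
        return "error"
    if "Warning" in message:
        return "warning"
    return "info"


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None when the line has no leading timestamp."""
    match = _LINE.match(line.rstrip("\r\n"))
    if match is None:
        return None
    date, clock, message = match.groups()
    timestamp = f"{date}, {clock}" if date else clock
    return LogEntry(timestamp, detect_level(message), message)
```

Lines that return `None` here (for example continuation lines without a timestamp) would need a policy in the caller, such as appending them to the previous entry or skipping them.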
**`service.py`** — `LogService`
```python
class LogService:
    def query(server_id, limit, offset, level, since, search) -> PaginatedLogs
    def clear(server_id) -> int # returns deleted count
    def get_rpt_path(server_id) -> Path | None
        # Delegates to file_utils.get_rpt_path(): globs for latest timestamped .rpt
    def cleanup_old_logs() # called by APScheduler
```
---
### `metrics/`
**`service.py`** — `MetricsService`
```python
class MetricsService:
    def query(server_id, from_dt, to_dt, resolution) -> list[MetricPoint]
        # Aggregates by resolution ('1m', '5m', '1h')
    def insert(server_id, cpu, ram, player_count) -> None
    def cleanup_old_metrics() # called by APScheduler
    def get_latest(server_id) -> MetricPoint | None
```
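Aggregation by resolution could work by bucketing samples into fixed time windows, as in the sketch below. The input shape (tuples of epoch seconds, CPU, RAM, player count), the averaging of CPU/RAM, and the use of the maximum for player count are all assumptions for illustration; the design does not specify the aggregate functions.

```python
from collections import defaultdict

# Bucket widths in seconds for the three resolutions named in the design.
RESOLUTION_SECONDS = {"1m": 60, "5m": 300, "1h": 3600}


def aggregate(points, resolution: str) -> list:
    """Group (epoch_seconds, cpu, ram, player_count) samples into buckets."""
    width = RESOLUTION_SECONDS[resolution]
    buckets = defaultdict(list)
    for ts, cpu, ram, players in points:
        # Round the timestamp down to the start of its bucket.
        buckets[ts - ts % width].append((cpu, ram, players))

    result = []
    for start in sorted(buckets):
        rows = buckets[start]
        count = len(rows)
        result.append({
            "bucket_start": start,
            "cpu": sum(r[0] for r in rows) / count,      # mean CPU %
            "ram": sum(r[1] for r in rows) / count,      # mean RAM (MB)
            "player_count": max(r[2] for r in rows),     # peak players
        })
    return result
```

In a real implementation the same grouping would more likely be done in SQL so that only the aggregated rows leave the database.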
---
### `websocket/`
**`manager.py`** — `ConnectionManager`
```python
class ConnectionManager:
    """
    Manages active WebSocket connections grouped by server_id.
    'all' is a special server_id that receives events for all servers.
    """
    _connections: dict[str, set[WebSocket]]
    _lock: asyncio.Lock
    async def connect(ws: WebSocket, server_id: str, channels: list[str])
    async def disconnect(ws: WebSocket, server_id: str)
    async def broadcast(server_id: str, message: dict)
        # Sends to all connections subscribed to server_id + 'all'
    async def send_personal(ws: WebSocket, message: dict)
```
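The grouping and the special `'all'` subscription might look like the sketch below. It is an assumption-laden illustration: `ConnectionManagerSketch` ignores channels, and it only requires that each connection object has an async `send_json()` method (as Starlette's `WebSocket` does). Dropping connections whose send fails is a design choice of this sketch, not something the document specifies.

```python
import asyncio


class ConnectionManagerSketch:
    """Tracks connections per server_id and fans messages out to them."""

    def __init__(self) -> None:
        self._connections: dict = {}  # server_id -> set of connections
        self._lock = asyncio.Lock()

    async def connect(self, ws, server_id: str) -> None:
        async with self._lock:
            self._connections.setdefault(server_id, set()).add(ws)

    async def disconnect(self, ws, server_id: str) -> None:
        async with self._lock:
            self._connections.get(server_id, set()).discard(ws)

    async def broadcast(self, server_id: str, message: dict) -> None:
        async with self._lock:
            # Snapshot under the lock so the sends happen without holding it.
            targets = set(self._connections.get(server_id, set()))
            targets |= self._connections.get("all", set())
        dead = []
        for ws in targets:
            try:
                await ws.send_json(message)
            except Exception:
                dead.append(ws)  # remember broken connections
        if dead:
            async with self._lock:
                for group in self._connections.values():
                    group.difference_update(dead)
```

Sending outside the lock keeps one slow client from blocking `connect()` and `disconnect()` calls for everyone else.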
**`broadcaster.py`** — `BroadcastThread`
```python
class BroadcastThread(threading.Thread):
    """
    Runs in background thread.
    Reads from a queue (put by background threads).
    Posts messages to asyncio event loop via run_coroutine_threadsafe().
    """
    _queue: queue.Queue
    _loop: asyncio.AbstractEventLoop
    _manager: ConnectionManager
    _running: bool
    def run() # main loop: get from queue, schedule broadcast coroutine
    @staticmethod
    def enqueue(server_id: int, msg_type: str, data: dict)
        # Thread-safe: called from any background thread
```
**`router.py`** — WebSocket endpoint
```python
@router.websocket("/ws/{server_id}")
async def websocket_endpoint(ws: WebSocket, server_id: str, token: str = Query(...)):
    # 1. Validate JWT token from query param
    # 2. Accept WebSocket connection
    # 3. Register with ConnectionManager
    # 4. Loop: receive messages (ping/subscribe/unsubscribe)
    # 5. On disconnect: deregister from ConnectionManager
```
---
### `threads/`
**`base_thread.py`** — `BaseServerThread`
```python
class BaseServerThread(threading.Thread):
    def __init__(server_id: int, interval: float)
    def stop() # sets _stop_event
    def is_stopped() -> bool
    def run() # creates thread-local DB connection (self._db), calls setup(),
              # then loops: tick() + wait(interval);
              # always calls teardown() and closes self._db in a finally block
    def setup() # override for init work (uses self._db)
    def tick() # override for per-interval work (uses self._db)
    def teardown() # override for cleanup (close files, sockets)
    def on_error(e: Exception) # default: log, continue
                               # if same error repeats 5x in a row: escalate + self.stop()
```
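The "same error repeats 5x in a row" rule in `on_error` can be sketched as a consecutive-error counter. The class name `ErrorEscalationSketch`, the `MAX_REPEATS` constant, and the choice to compare errors by type and message are assumptions for illustration; the DB connection handling of the real base class is left out.

```python
import threading


class ErrorEscalationSketch(threading.Thread):
    MAX_REPEATS = 5  # stop after this many identical errors in a row

    def __init__(self, interval: float = 0.01) -> None:
        super().__init__(daemon=True)
        self.interval = interval
        self._stop_event = threading.Event()
        self._last_error = None
        self._repeat_count = 0

    def stop(self) -> None:
        self._stop_event.set()

    def tick(self) -> None:
        raise NotImplementedError  # subclasses do the per-interval work

    def on_error(self, exc: Exception) -> None:
        key = (type(exc), str(exc))
        if key == self._last_error:
            self._repeat_count += 1
        else:
            self._last_error, self._repeat_count = key, 1
        if self._repeat_count >= self.MAX_REPEATS:
            self.stop()  # escalate: give up instead of looping forever

    def run(self) -> None:
        while not self._stop_event.is_set():
            try:
                self.tick()
                # A successful tick breaks the streak.
                self._last_error, self._repeat_count = None, 0
            except Exception as exc:
                self.on_error(exc)
            # Event.wait() doubles as an interruptible sleep.
            self._stop_event.wait(self.interval)
```

Resetting the counter after a successful tick means only an unbroken run of identical failures stops the thread; intermittent errors are logged and tolerated.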
**`process_monitor.py`** — `ProcessMonitorThread`
```python
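class ProcessMonitorThread(BaseServerThread):
    interval = 1.0 # seconds
    def tick():
        # 1. Check if the process is still alive via Popen.poll()
        #    (a process re-attached after an app restart has no Popen object;
        #    use psutil.pid_exists(pid) for it).
        #    Do NOT use os.kill(pid, 0): on Windows any signal other than
        #    CTRL_C_EVENT/CTRL_BREAK_EVENT terminates the process.
        # 2. If dead:
        #    a. Get exit code
        #    b. DB: status = 'crashed' (or 'stopped' if exit code is 0), stopped_at = now
        #    c. Clear players from DB
        #    d. Broadcast: {type: 'status', status: 'crashed'}
        #    e. Insert server_events: {event_type: 'crashed', exit_code}
        #    f. If auto_restart enabled and restart_count < max_restarts:
        #       DB: increment restart_count
        #       Schedule restart after 10s (threading.Timer), started only
        #       after the other threads for this server have been cleaned up
        #    g. self.stop()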
```
**`log_tail.py`** — `LogTailThread`
```python
class LogTailThread(BaseServerThread):
    interval = 0.1 # 100ms
    def setup():
        # Find .rpt file path. get_rpt_path() returns None while the server is
        # still starting; in that case leave self._file = None and retry in tick()
        # instead of failing setup().
        # Open file, seek to end (tail behavior)
        self._file = open(rpt_path, 'r', encoding='utf-8', errors='replace')
        self._file.seek(0, 2) # seek to end
    def tick():
        # 1. Read all new lines from self._file
        # 2. For each line:
        #    a. RPTParser.parse_line(line) -> LogEntry
        #    b. LogRepository.insert(server_id, entry)
        #    c. BroadcastThread.enqueue(server_id, 'log', entry)
    def on_rpt_rotate():
        # Close and reopen if file was rotated (new server start)
```
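The tail-and-rotate behavior can be sketched independently of the thread class. In this hypothetical `TailReaderSketch`, `find_path` stands in for `get_rpt_path()` and may return `None`; the first file opened is read from its end (tail behavior), while a newer file that appears later is read from its start so that no lines of the new run are skipped. That second choice is an assumption of this sketch.

```python
import os
from pathlib import Path
from typing import Callable, Optional


class TailReaderSketch:
    def __init__(self, find_path: Callable[[], Optional[Path]]) -> None:
        self._find_path = find_path  # returns the newest .rpt path, or None
        self._path: Optional[Path] = None
        self._file = None
        self._partial = ""  # text after the last newline, not yet a full line

    def poll(self) -> list:
        """Return the complete lines written since the previous call."""
        newest = self._find_path()
        if newest is None:
            return []  # no .rpt yet: the server is still starting
        if newest != self._path:
            first_open = self._path is None
            self.close()
            self._file = open(newest, "r", encoding="utf-8", errors="replace")
            if first_open:
                self._file.seek(0, os.SEEK_END)  # skip existing content
            self._path = newest
            self._partial = ""
        chunk = self._file.read()
        if not chunk:
            return []
        lines = (self._partial + chunk).split("\n")
        self._partial = lines.pop()  # keep the unfinished last line for later
        return [line.rstrip("\r") for line in lines]

    def close(self) -> None:
        if self._file is not None:
            self._file.close()
            self._file = None
```

Buffering the trailing partial line matters at a 100 ms polling interval, because the server can be mid-write when the file is read.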
**`metrics_collector.py`** — `MetricsCollectorThread`
```python
class MetricsCollectorThread(BaseServerThread):
    interval = 5.0 # seconds
    def tick():
        # 1. Get PID from ProcessManager
        # 2. psutil.Process(pid).cpu_percent(interval=0.5)
        # 3. psutil.Process(pid).memory_info().rss / (1024*1024) # MB
        # 4. PlayerRepository.count(server_id) -> player_count
        # 5. MetricsRepository.insert(server_id, cpu, ram, player_count)
        # 6. BroadcastThread.enqueue(server_id, 'metrics', {cpu, ram, player_count})
```
**`rcon_poller.py`** — `RConPollerThread`
```python
class RConPollerThread(BaseServerThread):
    interval = 10.0 # seconds
    startup_delay = 30.0 # wait 30s after server start before first poll
    _rcon_ready = False # flag: set True only after successful setup
    def setup():
        # Use _stop_event.wait() instead of time.sleep() so the thread
        # can be interrupted immediately during shutdown
        if self._stop_event.wait(self.startup_delay):
            self._rcon_ready = False
            return # stop was requested during startup delay
        self._rcon = RConService(self.server_id)
        self._rcon_ready = True
    def tick():
        if not self._rcon_ready:
            return # setup() failed or was interrupted: skip tick
        # 1. RConService.get_players() -> list[PlayerInfo]
        # 2. PlayerService.update_from_rcon(server_id, players)
        # 3. BroadcastThread.enqueue(server_id, 'players', {players, count})
        # 4. BERConClient.keepalive() if needed
```
**`thread_registry.py`** — `ThreadRegistry`
```python
class ThreadRegistry:
    """
    Singleton. Manages all background threads per server.
    """
    _threads: dict[int, dict[str, BaseServerThread]]
    _lock: threading.Lock
    @classmethod
    def get() -> ThreadRegistry
    def start_server_threads(server_id: int) -> None
        # Instantiates and starts:
        # ProcessMonitorThread, LogTailThread,
        # MetricsCollectorThread, RConPollerThread
    def stop_server_threads(server_id: int) -> None
        # Calls stop() on each thread; joins with timeout
    def get_thread(server_id, thread_type: str) -> BaseServerThread | None
    def list_active(server_id) -> list[str] # thread names
    def stop_all() -> None # on app shutdown
```
---
### `dal/`
**`base_repository.py`**
```python
class BaseRepository:
    def __init__(db: Connection)
    def execute(sql: str, params: tuple = ()) -> CursorResult
    def fetchone(sql: str, params: tuple = ()) -> dict | None
    def fetchall(sql: str, params: tuple = ()) -> list[dict]
    def insert(table: str, data: dict) -> int # returns last_insert_rowid
    def update(table: str, data: dict, where: str, params: tuple) -> int
    def delete(table: str, where: str, params: tuple) -> int
    def row_to_dict(row) -> dict
```
**`server_repository.py`**
```python
class ServerRepository(BaseRepository):
    def get_all() -> list[dict]
    def get_by_id(server_id) -> dict | None
    def create(data: dict) -> int
    def update_status(server_id, status, pid=None, started_at=None, stopped_at=None) -> None
        # stopped_at is passed by ProcessMonitorThread when a process exits
    def update(server_id, data: dict) -> None
    def delete(server_id) -> None
    def get_running() -> list[dict] # for startup recovery
    def increment_restart_count(server_id) -> None
    def reset_restart_count(server_id) -> None
```
**`event_repository.py`**
```python
class ServerEventRepository(BaseRepository):
    def insert(server_id: int, event_type: str, actor: str, detail: dict) -> int
    def get_events(server_id: int, limit: int, offset: int, event_type: str | None) -> list[dict]
    def get_recent(server_id: int, limit: int = 20) -> list[dict]
```
**`config_repository.py`**
```python
class ConfigRepository(BaseRepository):
    def get_server_config(server_id) -> dict | None
    def upsert_server_config(server_id, data: dict) -> None
    def get_basic_config(server_id) -> dict | None
    def upsert_basic_config(server_id, data: dict) -> None
    def get_profile(server_id) -> dict | None
    def upsert_profile(server_id, data: dict) -> None
    def get_launch_params(server_id) -> dict | None
    def upsert_launch_params(server_id, data: dict) -> None
    def get_rcon_config(server_id) -> dict | None
    def upsert_rcon_config(server_id, data: dict) -> None
    def get_full_config(server_id) -> dict # all sections combined
```
---
### `system/`
**`router.py`** — System-level endpoints (no auth required for health check).
```python
# GET /system/status → running_servers, total_servers, uptime, version
# GET /system/health → 200 OK if app is alive (for load balancer / Docker healthcheck)
@router.get("/system/status")
def system_status() -> APIResponse:
    # Plain def: the server counts come from a blocking DB query
    # Returns: {version, running_servers, total_servers, uptime_seconds}
    ...
@router.get("/system/health")
async def health_check() -> dict:
    # Returns: {"status": "ok"}
    return {"status": "ok"}
```
---
### `utils/`
**`crypto.py`**
```python
# Field-level encryption for sensitive values (passwords, RCon pw)
# Uses cryptography.fernet.Fernet (AES-128-CBC with HMAC-SHA256, not AES-256)
def encrypt(plaintext: str) -> str
def decrypt(ciphertext: str) -> str
def get_fernet() -> Fernet # from settings.encryption_key
```
**`file_utils.py`**
```python
def ensure_server_dirs(server_id: int) -> None
    # Creates servers/{id}/, servers/{id}/server/ (profile dir),
    # servers/{id}/mpmissions/, servers/{id}/battleye/
def get_server_dir(server_id: int) -> Path
def get_profile_dir(server_id: int) -> Path
    # Returns servers/{id}/server/, the Arma 3 profile dir (matches -name=server)
def get_missions_dir(server_id: int) -> Path
def get_rpt_path(server_id: int) -> Path | None
    # Arma 3 creates timestamped RPT files in the profile dir:
    # servers/{id}/server/arma3server_YYYY-MM-DD_HH-MM-SS.rpt
    # Uses rglob('*.rpt') to search recursively within profile dir.
    # Returns the most-recently-modified one.
    # Returns None if no .rpt file exists yet (server still starting up).
def safe_delete_file(path: Path) -> bool
def sanitize_filename(filename: str) -> str
    # Returns PureWindowsPath(filename).name, which splits on both "/" and "\"
    # regardless of the host OS, so directory parts are stripped on Unix and Windows.
    # Path(filename).name and os.path.basename() follow the host OS rules:
    # on POSIX they treat "\" as an ordinary character and keep "..\..\x" intact.
    # Rejects an empty result and "..".
```
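A sketch of `sanitize_filename` that behaves the same on every host OS. It relies on `pathlib.PureWindowsPath`, which treats both `/` and `\` as separators without touching the filesystem; raising `ValueError` for unusable names is an assumption of this sketch rather than documented behavior.

```python
from pathlib import PureWindowsPath


def sanitize_filename(filename: str) -> str:
    """Strip every directory component from an uploaded filename."""
    # PureWindowsPath splits on "/" and "\" and drops drive prefixes like "C:".
    name = PureWindowsPath(filename).name
    if name in ("", "..") or "\x00" in name:
        raise ValueError(f"unusable filename: {filename!r}")
    return name
```

For example, `sanitize_filename("..\\..\\evil.pbo")` and `sanitize_filename("../../evil.pbo")` both reduce to `evil.pbo`, so the result can be joined onto `servers/{id}/mpmissions/` without escaping that directory.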
**`port_checker.py`**
```python
def is_port_in_use(port: int, host: str = "0.0.0.0") -> bool
    # Tries to bind a UDP socket to the port. Arma 3 game, query and RCon
    # traffic is UDP, so a TCP connect() check would not see a running server.
def check_server_ports_available(game_port: int, rcon_port: int | None = None, host: str = "0.0.0.0") -> list[int]
    # Checks ALL ports: game_port, game_port+1 (Steam query),
    # game_port+2 (VON), game_port+3 (Steam auth),
    # plus the actual rcon_port (user-configurable, defaults to game_port+4)
    # If rcon_port is None, defaults to game_port+4
    # Returns list of ports that are in use (empty = all available)
def find_available_port(start: int = 2302, step: int = 100) -> int
    # Find next available game port (checking all 5 derived ports per candidate)
```
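Because the ports in question carry UDP traffic, one way to test availability is to try binding a UDP socket, as sketched below. The name `is_udp_port_in_use` is hypothetical, and treating every `OSError` on bind as "in use" is a simplification (a permission error on a privileged port would be reported the same way).

```python
import socket


def is_udp_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket cannot be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
        return False


def check_server_ports_available(game_port: int, rcon_port=None,
                                 host: str = "0.0.0.0") -> list:
    """Return the subset of derived ports that are already taken."""
    if rcon_port is None:
        rcon_port = game_port + 4
    # game, Steam query, VON, Steam auth, then the RCon port.
    ports = [game_port + offset for offset in range(4)] + [rcon_port]
    # dict.fromkeys() removes duplicates while keeping the order.
    return [p for p in dict.fromkeys(ports) if is_udp_port_in_use(p, host)]
```

A bind test is inherently racy: another process can take the port between the check and the server launch, so the launch itself still needs error handling.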
---
## Key Dependencies (requirements.txt)
```
fastapi==0.111.0
uvicorn[standard]==0.29.0
pydantic==2.7.0
pydantic-settings==2.2.1
sqlalchemy==2.0.30
python-jose[cryptography]==3.3.0 # JWT
passlib[bcrypt]==1.7.4 # password hashing
cryptography==42.0.5 # field-level encryption (Fernet)
psutil==5.9.8 # process metrics
apscheduler==3.10.4 # scheduled jobs (log/metrics/player_history cleanup)
python-multipart==0.0.9 # file upload support
slowapi==0.1.9 # rate limiting middleware
uvloop==0.19.0; sys_platform != "win32" # faster event loop (Linux/macOS only — skip on Windows)
```

THREADING.md Normal file

@@ -0,0 +1,600 @@
# Languard Server Manager — Threading & Concurrency Design
## Overview
The system uses a hybrid concurrency model:
- **FastAPI (asyncio)** handles HTTP requests and WebSocket connections
- **Python threads** (`threading.Thread`) handle long-running background work per server
- **Queue** bridges the thread world → asyncio world for WebSocket broadcasting
- **SQLAlchemy sync sessions** are used in threads (thread-local connections)
---
## Thread Map
```
Main Process (FastAPI / asyncio event loop)
├── [uvicorn] HTTP/WS event loop (asyncio)
│   ├── REST request handlers (async def; handlers with blocking I/O are plain def and run in the thread pool)
│   └── WebSocket handlers (async def)
├── BroadcastThread (daemon thread, 1 global)
│   └── Reads from broadcast_queue (thread-safe)
│       Calls asyncio.run_coroutine_threadsafe()
│       → ConnectionManager.broadcast()
└── Per-running-server thread group (started when server starts, stopped when server stops):
    ├── ProcessMonitorThread (1 per server, 1s interval)
    ├── LogTailThread (1 per server, 100ms interval)
    ├── MetricsCollectorThread (1 per server, 5s interval)
    └── RConPollerThread (1 per server, 10s interval, 30s startup delay)
```
For **N running servers**, there are:
- `4*N` per-server threads + 1 BroadcastThread = `4N+1` background threads total (not counting each `BERConClient`'s receiver thread or short-lived cleanup and timer threads)
---
## Thread Safety Rules
| Resource | Access Pattern | Protection |
|----------|---------------|------------|
| `ProcessManager._processes` | read/write from multiple threads | `threading.Lock` |
| `ThreadRegistry._threads` | read/write from main + shutdown | `threading.Lock` |
| `broadcast_queue` | multi-writer, single reader | `queue.Queue` (thread-safe built-in) |
| `ConnectionManager._connections` | async, single event loop | `asyncio.Lock` |
| SQLite connections | one connection per thread | Thread-local via `threading.local()` |
| Config files on disk | write on start, read-only during run | No lock needed (regenerated before start) |
### SQLite Thread Safety
```python
# Each background thread creates its own SQLAlchemy connection
# from the same engine (WAL mode allows concurrent readers alongside one writer)
# PRAGMA busy_timeout=5000 makes a blocked connection wait up to 5s
# before raising "database is locked"
class BaseServerThread(threading.Thread):
    def run(self):
        # Create thread-local DB connection: a single connection per thread
        engine = get_engine()
        self._db = engine.connect()
        try:
            self.setup()
            while not self._stop_event.is_set():
                try:
                    self.tick()
                except Exception as e:
                    self.on_error(e)
                self._stop_event.wait(self.interval)
        except Exception as e:
            logger.error(f"{self.name} setup error: {e}")
        finally:
            try:
                self.teardown() # always release resources (even on setup failure)
            finally:
                self._db.close() # always close connection, even if teardown() raises
```
---
## BroadcastThread — Asyncio Bridge
This is the critical bridge between background threads and the asyncio WebSocket layer.
```
Background Thread                           Asyncio Event Loop
─────────────────                           ──────────────────
BroadcastThread.enqueue(                    uvicorn runs here
    server_id=1,
    msg_type='log',
    data={...}
)
broadcast_queue.put({                       loop = asyncio.get_running_loop()
    'server_id': 1,                         (stored at app startup)
    'type': 'log',
    'data': {...}
})
BroadcastThread.run() ────────────────────► asyncio.run_coroutine_threadsafe(
    while True:                                 connection_manager.broadcast(
        msg = queue.get()                           server_id=1,
        fut = run_coroutine_threadsafe(             message={type, data}
            broadcast_coro,                     ),
            self._loop                          loop=self._loop
        )                                   )
        fut.result(timeout=5)
```
### Implementation Sketch
```python
# broadcaster.py
import asyncio
import concurrent.futures
import logging
import queue
import threading

logger = logging.getLogger(__name__)

_broadcast_queue: queue.Queue = queue.Queue(maxsize=10000)
_event_loop: asyncio.AbstractEventLoop | None = None


class BroadcastThread(threading.Thread):
    def __init__(self, loop: asyncio.AbstractEventLoop, manager):
        # daemon=True must go through Thread.__init__; a class attribute
        # named `daemon` would shadow the Thread.daemon property instead.
        super().__init__(name="BroadcastThread", daemon=True)
        self._loop = loop
        self._manager = manager
        self._running = True

    def run(self):
        while self._running:
            try:
                msg = _broadcast_queue.get(timeout=1.0)
                server_id = msg['server_id']
                # Build the outgoing WebSocket message envelope.
                # Include server_id so clients subscribed to 'all' can identify the source.
                # API contract: {type, server_id, data}
                outgoing = {
                    'type': msg['type'],
                    'server_id': server_id,
                    'data': msg['data'],
                }
                future = asyncio.run_coroutine_threadsafe(
                    self._manager.broadcast(str(server_id), outgoing, channel=msg['type']),
                    self._loop
                )
                try:
                    future.result(timeout=5.0)
                except concurrent.futures.TimeoutError:
                    # Only an alias of the builtin TimeoutError on Python 3.11+.
                    # Don't block the queue: log and continue
                    logger.warning(f"Broadcast timeout for server {server_id} msg type {msg['type']}")
            except queue.Empty:
                continue
            except Exception as e:
                logger.error(f"BroadcastThread error: {e}")

    def stop(self):
        self._running = False

    @staticmethod
    def enqueue(server_id: int, msg_type: str, data: dict):
        """Thread-safe. Called from any background thread."""
        try:
            _broadcast_queue.put_nowait({
                'server_id': server_id,
                'type': msg_type,
                'data': data,
            })
        except queue.Full:
            logger.warning(f"Broadcast queue full, dropping {msg_type} for server {server_id}")
```
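The thread-to-loop hand-off can be exercised without uvicorn or real WebSockets. The following is a minimal sketch under stated assumptions: `run_demo` and `fake_broadcast` are hypothetical names, `fake_broadcast` stands in for `ConnectionManager.broadcast`, and the event loop runs in a helper thread instead of being owned by uvicorn.

```python
import asyncio
import queue
import threading


def run_demo(messages: list) -> list:
    """Push messages through a thread -> event loop bridge and return what the loop saw."""
    received = []
    work: queue.Queue = queue.Queue(maxsize=100)

    async def fake_broadcast(message: dict) -> None:
        # Stand-in for ConnectionManager.broadcast (assumption, not the real class).
        received.append(message)

    # In the real app uvicorn owns the loop; here a helper thread runs it.
    loop = asyncio.new_event_loop()
    loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
    loop_thread.start()

    def drain() -> None:
        while True:
            msg = work.get()
            if msg is None:  # sentinel: stop draining
                return
            future = asyncio.run_coroutine_threadsafe(fake_broadcast(msg), loop)
            future.result(timeout=5.0)  # wait, so delivery order matches queue order

    worker = threading.Thread(target=drain, daemon=True)
    worker.start()
    for m in messages:
        work.put_nowait(m)
    work.put(None)
    worker.join(timeout=10.0)

    # Stop the loop from outside its thread, then close it once it is idle.
    loop.call_soon_threadsafe(loop.stop)
    loop_thread.join(timeout=5.0)
    loop.close()
    return received
```

Waiting on `future.result()` before taking the next message keeps per-server ordering, at the cost of one slow client delaying the whole queue for up to the timeout.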
---
## ProcessMonitorThread — Crash Detection & Auto-Restart
```python
class ProcessMonitorThread(BaseServerThread):
    interval = 1.0

    def tick(self):
        proc = ProcessManager.get().get_process(self.server_id)
        if proc is None:
            self.stop()
            return
        exit_code = proc.poll()
        if exit_code is not None:
            # Process has exited
            self._handle_process_exit(exit_code)
            self.stop()

    def _handle_process_exit(self, exit_code: int):
        is_crash = (exit_code != 0)
        status = 'crashed' if is_crash else 'stopped'
        server = ServerRepository(self._db).get_by_id(self.server_id)
        ServerRepository(self._db).update_status(
            self.server_id, status, pid=None,
            stopped_at=datetime.utcnow().isoformat()
        )
        PlayerRepository(self._db).clear(self.server_id)
        ServerEventRepository(self._db).insert(
            self.server_id, status,
            actor='system',
            detail={'exit_code': exit_code}
        )
        BroadcastThread.enqueue(self.server_id, 'status', {'status': status})
        BroadcastThread.enqueue(self.server_id, 'event', {
            'event_type': status,
            'detail': {'exit_code': exit_code}
        })

        # Stop other threads for this server. Must NOT be called synchronously
        # from within this thread's own run() if stop_server_threads() joins threads,
        # as a thread cannot join itself. Use a daemon thread to do the cleanup
        # after this thread's run() returns naturally.
        # IMPORTANT: The auto-restart Timer must be started AFTER thread cleanup
        # completes. The cleanup daemon thread starts the restart timer when done.
        def _cleanup_and_maybe_restart():
            try:
                ThreadRegistry.get().stop_server_threads(self.server_id)
                # Only schedule restart after threads are fully cleaned up
                if is_crash and server.get('auto_restart'):
                    self._schedule_auto_restart(server)
            except Exception as e:
                logger.error(f"Cleanup/restart failed for server {self.server_id}: {e}")
                BroadcastThread.enqueue(self.server_id, 'event', {
                    'event_type': 'auto_restart_failed',
                    'detail': {'error': str(e)}
                })

        threading.Thread(
            target=_cleanup_and_maybe_restart,
            daemon=True,
            name=f"StopCleanup-{self.server_id}"
        ).start()

    def _schedule_auto_restart(self, server: dict):
        # IMPORTANT: This method runs in the daemon cleanup thread, NOT the
        # ProcessMonitorThread. It must create its own DB connection; do NOT
        # use self._db (it belongs to the ProcessMonitorThread's thread context
        # and may be closed by teardown() already).
        from database import get_thread_db
        db = get_thread_db()

        restart_count = server['restart_count']
        max_restarts = server['max_restarts']
        window = server['restart_window_seconds']
        last_restart = server.get('last_restart_at')

        # Reset restart_count if last restart was outside the window
        if last_restart:
            last_dt = datetime.fromisoformat(last_restart)
            elapsed = (datetime.utcnow() - last_dt).total_seconds()
            if elapsed > window:
                ServerRepository(db).reset_restart_count(self.server_id)
                restart_count = 0

        if restart_count < max_restarts:
            # Linear backoff: 10s, 20s, 30s, ... capped at 60s
            delay = min(10 * (restart_count + 1), 60)
            logger.info(f"Auto-restarting server {self.server_id} in {delay}s (attempt {restart_count+1}/{max_restarts})")
            threading.Timer(delay, self._auto_restart).start()
        else:
            logger.warning(f"Server {self.server_id} exceeded max auto-restarts ({max_restarts})")
            BroadcastThread.enqueue(self.server_id, 'event', {
                'event_type': 'max_restarts_exceeded',
                'detail': {'restart_count': restart_count}
            })

    def _auto_restart(self):
        from servers.service import ServerService
        try:
            ServerService().start(self.server_id)
        except Exception as e:
            logger.error(f"Auto-restart failed for server {self.server_id}: {e}")
```
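The restart decision in `_schedule_auto_restart` is easier to reason about, and to unit test, when separated from the database and the timer. A minimal sketch, assuming a hypothetical pure function `plan_restart` that takes the same fields the server row carries and returns the effective restart count together with the delay (or `None` once the limit is reached):

```python
from datetime import datetime, timedelta
from typing import Optional, Tuple


def plan_restart(
    restart_count: int,
    max_restarts: int,
    window_seconds: int,
    last_restart_at: Optional[datetime],
    now: datetime,
) -> Tuple[int, Optional[float]]:
    """Return (effective_restart_count, delay_seconds), delay is None when the limit is hit."""
    # Forget earlier restarts once the last one falls outside the window.
    if last_restart_at is not None:
        if (now - last_restart_at).total_seconds() > window_seconds:
            restart_count = 0
    if restart_count >= max_restarts:
        return restart_count, None
    # Same formula as the thread above: 10s, 20s, 30s, ... capped at 60s.
    return restart_count, float(min(10 * (restart_count + 1), 60))


now = datetime(2026, 1, 1, 12, 0, 0)
recent = now - timedelta(seconds=30)
stale = now - timedelta(seconds=601)

first_crash = plan_restart(0, 3, 600, None, now)      # no previous restart
third_crash = plan_restart(2, 3, 600, recent, now)    # still inside the window
limit_hit = plan_restart(3, 3, 600, recent, now)      # limit reached, no restart
window_reset = plan_restart(3, 3, 600, stale, now)    # window expired, count resets
```

Passing `now` in explicitly, instead of calling `datetime.utcnow()` inside, is a design choice made here so the window logic can be tested with fixed timestamps.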
---
## LogTailThread — RPT File Tailing
The Arma 3 RPT file grows while the server runs. This thread tails it like `tail -f`.
```python
import time
from pathlib import Path


class LogTailThread(BaseServerThread):
    interval = 0.1  # 100ms

    def setup(self):
        self._file = None
        self._current_path: Path | None = None
        self._last_size: int = 0
        self._last_glob_time: float = 0.0
        self._open_latest_rpt()

    def _open_latest_rpt(self):
        """
        Arma 3 writes timestamped RPT files in the profile subdirectory:
            servers/{id}/server/arma3server_YYYY-MM-DD_HH-MM-SS.rpt
        Use rglob('*.rpt') to search recursively within the server dir.
        The profile subdirectory is determined by -profiles + -name flags.

        NOTE: Do NOT rely on os.stat().st_ino for rotation detection. On Windows,
        Python reports a file index as st_ino only when one is available (older
        Python versions reported 0), so inode comparison is not portable.
        Instead, track the filename and file size. If a newer .rpt appears or the
        current file shrinks (truncated/replaced), reopen.
        """
        rpt_files = list(Path(get_server_dir(self.server_id)).rglob("*.rpt"))
        if not rpt_files:
            return  # Server hasn't created RPT yet; retry in next tick
        latest = max(rpt_files, key=lambda p: p.stat().st_mtime)
        try:
            self._file = open(latest, 'r', encoding='utf-8', errors='replace')
            self._file.seek(0, 2)  # seek to end: tail, don't replay old output
            self._current_path = latest
            self._last_size = self._file.tell()
        except OSError:
            self._file = None

    def tick(self):
        if self._file is None:
            self._open_latest_rpt()
            return

        # Rotation detection: only re-glob every 5 seconds (not every 100ms tick)
        # to avoid excessive filesystem I/O with large mpmissions directories.
        now = time.monotonic()
        if now - self._last_glob_time > 5.0:
            self._last_glob_time = now
            rpt_files = list(Path(get_server_dir(self.server_id)).rglob("*.rpt"))
            if rpt_files:
                latest = max(rpt_files, key=lambda p: p.stat().st_mtime)
                if latest != self._current_path:
                    # A new RPT file was created; switch to it
                    self._file.close()
                    self._open_latest_rpt()
                    return

        try:
            current_size = self._current_path.stat().st_size
        except OSError:
            return
        if current_size < self._last_size:
            # File shrank (truncated or replaced); reopen
            self._file.close()
            self._open_latest_rpt()
            return

        # Read new lines
        while True:
            line = self._file.readline()
            if not line:
                break
            self._last_size = self._file.tell()
            line = line.rstrip('\n')
            if not line:
                continue
            entry = RPTParser.parse_line(line)
            if entry:
                LogRepository(self._db).insert(self.server_id, entry)
                BroadcastThread.enqueue(self.server_id, 'log', entry)

    def teardown(self):
        """Close the open RPT file handle when the thread stops."""
        if self._file is not None:
            try:
                self._file.close()
            except OSError:
                pass
            self._file = None
```
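The size-based truncation check can be tried in isolation. The sketch below is not the thread above: it is a simplified, hypothetical `TailReader` that tracks a byte offset and reopens the file on every poll (the thread keeps one handle open in text mode). It also holds back a trailing partial line until its newline arrives, which is an assumption about desired behavior rather than something the design states.

```python
import os
import tempfile


class TailReader:
    """Follow one file by byte offset; restart from 0 if the file shrinks."""

    def __init__(self, path: str, from_end: bool = True):
        self.path = path
        self.offset = os.path.getsize(path) if from_end else 0

    def read_new_lines(self) -> list:
        size = os.path.getsize(self.path)
        if size < self.offset:
            # File was truncated or replaced: start over from the beginning.
            self.offset = 0
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            chunk = f.read()
        # Consume only complete lines; a trailing partial line waits for the next poll.
        end = chunk.rfind(b"\n") + 1
        self.offset += end
        text = chunk[:end].decode("utf-8", errors="replace")
        return [line for line in text.splitlines() if line]


def demo() -> list:
    """Write to a temporary file and collect what each poll returns."""
    results = []
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "demo.rpt")
        with open(path, "w", encoding="utf-8") as f:
            f.write("old line\n")
        tail = TailReader(path)                 # starts at end, skips old output
        results.append(tail.read_new_lines())
        with open(path, "a", encoding="utf-8") as f:
            f.write("12:00:01 Hello\n12:00:02 par")
        results.append(tail.read_new_lines())   # partial line is held back
        with open(path, "a", encoding="utf-8") as f:
            f.write("tial\n")
        results.append(tail.read_new_lines())
        with open(path, "w", encoding="utf-8") as f:
            f.write("new\n")                    # truncate: file shrinks
        results.append(tail.read_new_lines())
    return results
```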
---
## RConPollerThread — Player List Synchronization
```python
class RConPollerThread(BaseServerThread):
    interval = 10.0
    STARTUP_DELAY = 30.0  # wait for server to fully initialize
    _rcon_ready = False   # flag: True only after successful setup

    def setup(self):
        # Wait for server to start up before attempting RCon
        if self._stop_event.wait(self.STARTUP_DELAY):
            self._rcon_ready = False
            return  # stop was requested during wait
        self._rcon = RConService(self.server_id)
        self._connected = self._rcon.connect()
        self._rcon_ready = True

    def tick(self):
        if not self._rcon_ready:
            return  # setup() failed or was interrupted

        if not self._connected:
            self._reconnect_attempts = getattr(self, '_reconnect_attempts', 0) + 1
            delay = min(10 * 2 ** self._reconnect_attempts, 120)  # exponential backoff
            if self._reconnect_attempts > 1:
                logger.info(f"RCon reconnect attempt {self._reconnect_attempts} for server {self.server_id} (next in {delay}s)")
            if self._stop_event.wait(delay):
                return
            self._connected = self._rcon.connect()
            if not self._connected:
                return
            self._reconnect_attempts = 0  # reset on successful connection

        try:
            players = self._rcon.get_players()
            PlayerService(self._db).update_from_rcon(self.server_id, players)
            BroadcastThread.enqueue(self.server_id, 'players', {
                'players': [p.dict() for p in players],
                'count': len(players)
            })
        except ConnectionError:
            self._connected = False
            logger.warning(f"RCon connection lost for server {self.server_id}")
```
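The reconnect formula is worth tabulating, because the attempt counter is incremented before the delay is computed, so the first wait is 20 seconds rather than 10. A small sketch, with `reconnect_delay` as a hypothetical helper that mirrors `min(10 * 2 ** attempts, 120)`:

```python
def reconnect_delay(attempt: int, base: float = 10.0, cap: float = 120.0) -> float:
    """Delay before reconnect attempt number `attempt` (1-based), doubling each time."""
    return min(base * 2 ** attempt, cap)


# Delays for the first five consecutive failed attempts.
schedule = [reconnect_delay(n) for n in range(1, 6)]
print(schedule)  # [20.0, 40.0, 80.0, 120.0, 120.0]
```

These waits come on top of the regular 10 second tick interval, since `run()` still waits for `interval` after each `tick()`.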
---
## Thread Lifecycle
### Start Server Flow
```
POST /servers/{id}/start
  ├── ServerService.start()
  │     ├── ConfigGenerator.write_all()
  │     ├── ProcessManager.start()          ← creates subprocess.Popen
  │     └── ThreadRegistry.start_server_threads(id)
  │           ├── ProcessMonitorThread(id).start()
  │           ├── LogTailThread(id).start()
  │           ├── MetricsCollectorThread(id).start()
  │           └── RConPollerThread(id).start()
  └── BroadcastThread.enqueue(id, 'status', {status: 'starting'})
```
### Stop Server Flow
```
POST /servers/{id}/stop
  ├── RConService.shutdown()                ← sends #shutdown via RCon
  ├── Wait up to 30s for process exit (ProcessManager.stop(timeout=30))
  ├── If still running: ProcessManager.kill()
  ├── ThreadRegistry.stop_server_threads(id)
  │     ├── ProcessMonitorThread.stop()     (sets _stop_event)
  │     ├── LogTailThread.stop()
  │     ├── MetricsCollectorThread.stop()
  │     └── RConPollerThread.stop()
  │           └── Thread.join(timeout=5) for each
  └── BroadcastThread.enqueue(id, 'status', {status: 'stopped'})
```
### App Shutdown Flow
```
FastAPI shutdown event
├── ThreadRegistry.stop_all() ← stop all threads for all servers
├── BroadcastThread.stop()
├── ConnectionManager.close_all()
└── database engine dispose
```
---
## Stop Event Pattern
All background threads use a `threading.Event` for graceful shutdown:
```python
class BaseServerThread(threading.Thread):
    # Default tick period in seconds. Subclasses override this as a class
    # attribute (e.g. `interval = 1.0`) and are constructed with only server_id.
    interval: float = 1.0

    def __init__(self, server_id: int, interval: float | None = None):
        super().__init__(name=f"{self.__class__.__name__}-{server_id}", daemon=True)
        self.server_id = server_id
        if interval is not None:
            self.interval = interval  # optional per-instance override
        self._stop_event = threading.Event()

    def stop(self):
        self._stop_event.set()

    def is_stopped(self) -> bool:
        return self._stop_event.is_set()

    def teardown(self):
        """Override to release resources (close files, sockets) after the loop ends."""
        pass

    def run(self):
        try:
            self.setup()
        except Exception as e:
            logger.error(f"{self.name} setup error: {e}")
            return  # setup failed completely; no partial resources to clean
        try:
            while not self._stop_event.is_set():
                try:
                    self.tick()
                except Exception as e:
                    self.on_error(e)
                # Use wait() instead of sleep(): it returns immediately on stop()
                self._stop_event.wait(self.interval)
        finally:
            self.teardown()  # always runs; subclasses close files/sockets here
```
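The benefit of `Event.wait()` over `time.sleep()` shows up when the interval is long. A minimal sketch, assuming a hypothetical `TickCounter` worker with a 30 second interval and a helper `stop_latency` that measures how long `stop()` plus `join()` takes:

```python
import threading
import time


class TickCounter(threading.Thread):
    """Hypothetical worker: counts ticks until stop() is called."""

    def __init__(self, interval: float):
        super().__init__(daemon=True)
        self.interval = interval
        self.ticks = 0
        self.torn_down = False
        self._stop_event = threading.Event()

    def stop(self) -> None:
        self._stop_event.set()

    def run(self) -> None:
        try:
            while not self._stop_event.is_set():
                self.ticks += 1
                # wait() returns as soon as stop() sets the event, so shutdown
                # does not have to sit out the rest of the interval.
                self._stop_event.wait(self.interval)
        finally:
            self.torn_down = True  # mirrors teardown() always running


def stop_latency(interval: float = 30.0) -> float:
    """Start a worker with a long interval and time how long stop + join takes."""
    worker = TickCounter(interval)
    worker.start()
    time.sleep(0.05)  # give the first tick a chance to run
    started = time.monotonic()
    worker.stop()
    worker.join(timeout=5.0)
    assert not worker.is_alive() and worker.torn_down
    return time.monotonic() - started
```

With `time.sleep(30)` in the loop, the same `join(timeout=5.0)` would time out and leave the thread running.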
---
## WebSocket Connection Manager (asyncio)
```python
# websocket/manager.py
import asyncio
from collections import defaultdict

from fastapi import WebSocket


class ConnectionManager:
    def __init__(self):
        # server_id → set[WebSocket]
        # Use set (not list) so .add()/.discard() work correctly.
        self._connections: dict[str, set[WebSocket]] = defaultdict(set)
        # Per-connection channel subscriptions: ws → set[str]
        self._channel_subs: dict[WebSocket, set[str]] = defaultdict(set)
        self._lock = asyncio.Lock()

    async def connect(self, ws: WebSocket, server_id: str):
        await ws.accept()
        async with self._lock:
            self._connections[server_id].add(ws)
            self._channel_subs[ws].add('status')  # default channel
            # Only add to 'all' bucket if server_id is explicitly 'all'
            if server_id == 'all':
                self._connections['all'].add(ws)

    async def disconnect(self, ws: WebSocket, server_id: str):
        async with self._lock:
            self._connections[server_id].discard(ws)
            self._connections['all'].discard(ws)
            self._channel_subs.pop(ws, None)

    async def subscribe(self, ws: WebSocket, channels: list[str]):
        async with self._lock:
            self._channel_subs[ws].update(channels)

    async def unsubscribe(self, ws: WebSocket, channels: list[str]):
        async with self._lock:
            self._channel_subs[ws].difference_update(channels)

    async def broadcast(self, server_id: str, message: dict, channel: str | None = None):
        """Send to all clients subscribed to server_id AND the message's channel."""
        targets: set[WebSocket] = set()
        async with self._lock:
            # Collect clients for this server_id + 'all' subscribers
            server_clients = self._connections.get(server_id, set())
            all_clients = self._connections.get('all', set())
            candidates = server_clients | all_clients  # new set, safe to use outside the lock
            # Filter by channel subscription if specified
            if channel:
                targets = {ws for ws in candidates
                           if channel in self._channel_subs.get(ws, set())}
            else:
                targets = candidates

        # Send outside the lock so a slow client cannot block connect/disconnect
        dead = []
        for ws in targets:
            try:
                await ws.send_json(message)
            except Exception:
                dead.append(ws)

        # Clean up dead connections
        if dead:
            async with self._lock:
                for ws in dead:
                    for bucket in self._connections.values():
                        bucket.discard(ws)
                    self._channel_subs.pop(ws, None)
```
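The target selection inside `broadcast` (server bucket plus the `'all'` bucket, then filtered by channel) can be checked without any sockets. A minimal sketch, assuming a hypothetical standalone function `select_targets` and plain strings standing in for `WebSocket` objects:

```python
from typing import Dict, Optional, Set


def select_targets(
    connections: Dict[str, Set[str]],
    channel_subs: Dict[str, Set[str]],
    server_id: str,
    channel: Optional[str] = None,
) -> Set[str]:
    """Clients watching server_id or 'all', narrowed to subscribers of `channel` if given."""
    candidates = connections.get(server_id, set()) | connections.get("all", set())
    if channel is None:
        return candidates
    return {ws for ws in candidates if channel in channel_subs.get(ws, set())}


# Strings stand in for WebSocket objects (assumption for illustration only).
connections = {"1": {"a", "b"}, "2": {"c"}, "all": {"d"}}
channel_subs = {
    "a": {"status"},
    "b": {"status", "log"},
    "c": {"status", "log"},
    "d": {"log"},
}

log_for_1 = select_targets(connections, channel_subs, "1", "log")
status_for_1 = select_targets(connections, channel_subs, "1", "status")
unfiltered_1 = select_targets(connections, channel_subs, "1")
```

Client `c` never receives server 1 traffic, while `d`, connected to `'all'`, receives it only on the channels it subscribed to.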
---
## Memory & Performance Considerations
| Thread | Memory Impact | CPU Impact |
|--------|--------------|-----------|
| ProcessMonitorThread | Minimal (one `Popen.poll()` call per tick) | Negligible |
| LogTailThread | Buffer for unread log lines | Low (file I/O) |
| MetricsCollectorThread | psutil subprocess scan | Low-Medium |
| RConPollerThread | UDP socket + response buffer | Low |
| BroadcastThread | Queue buffer (max 10000 entries) | Low |
### Recommendations
- Set all threads as `daemon=True` — they die automatically if main process exits
- `broadcast_queue.maxsize=10000` — backpressure; drop on Full (log warning)
- `LogTailThread` should buffer up to ~100 lines per tick and write them to the DB in one batch, rather than inserting line by line
- `MetricsCollectorThread` uses `psutil.Process.cpu_percent(interval=0.5)` — blocks 500ms, acceptable at 5s interval
- For N=10 servers: 41 long-lived background threads (4 per server plus one `BroadcastThread`), well within Python's thread limits