Khoa (Revenovich) Tran Gia a60b94c20c fix: address santa-loop review findings (round 1)
Update remaining old-name references in body text:
- ARCHITECTURE.md:219 directory layout: languard-server-manager/ → languard-servers-manager/
- IMPLEMENTATION_PLAN.md:405 setup instructions: cd languard-server-manager → cd languard-servers-manager
2026-04-16 14:04:57 +07:00

Languard Servers Manager — Implementation Plan

Prerequisites

Before starting, ensure the following are available:

  • Python 3.11+
  • A working Arma 3 dedicated server installation (for testing)
  • Node.js 18+ (for frontend dev server)
  • The reference docs: ARCHITECTURE.md, DATABASE.md, API.md, MODULES.md, THREADING.md

Phase 1 — Foundation (Start Here)

Goal: Running FastAPI server with DB, auth, and basic server CRUD.

Step 1.1 — Project scaffold

mkdir backend
cd backend
python -m venv venv
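# Windows shown; on Linux/macOS use: source venv/bin/activate
venv\Scripts\activate
# Extras are quoted so shells that expand [] (e.g. zsh) pass them through unchanged
pip install fastapi "uvicorn[standard]" sqlalchemy "python-jose[cryptography]" "passlib[bcrypt]" cryptography psutil apscheduler python-multipart slowapi pytest pytest-asyncio httpx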
# uvloop (faster event loop) is Linux/macOS only — skip on Windows:
# pip install uvloop  # only on Linux/macOS
pip freeze > requirements.txt

Create:

  • backend/config.py — Settings class (see MODULES.md)
  • backend/main.py — FastAPI app factory, startup/shutdown hooks
  • backend/conftest.py — pytest fixtures (in-memory SQLite, test client)
  • .env.example — All env vars documented

Step 1.2 — Database + Migrations

  1. Create backend/migrations/001_initial_schema.sql — all tables from DATABASE.md
    • Include all CHECK constraints (role, status, verify_signatures, von_codec_quality, etc.)
    • Include PRAGMA busy_timeout=5000 in engine setup
    • Important: Put CREATE TABLE IF NOT EXISTS schema_migrations as the very first statement — the migration runner queries this table before it can track anything.
  2. Create backend/dal/event_repository.py: ServerEventRepository (needed by Phase 3 threads)
  3. Create backend/database.py:
    • get_engine() with WAL + FK pragma
    • run_migrations() — reads and applies .sql files from migrations/
    • get_db() — FastAPI dependency (sync session)
    • get_thread_db() — thread-local session factory
  4. Call run_migrations() in main.py:on_startup()

Test: Start app, confirm languard.db created with all tables. Run pytest with in-memory SQLite to verify schema creates cleanly.
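
The migration flow in Step 1.2 could look roughly like the sketch below. It uses the standard-library sqlite3 module instead of SQLAlchemy so that it runs standalone, and the schema_migrations column layout and the open_connection() helper name are assumptions for illustration; the real definitions belong in DATABASE.md and MODULES.md. Here the runner creates the tracking table defensively before querying it.

```python
import sqlite3
from pathlib import Path


def open_connection(db_path: str) -> sqlite3.Connection:
    """Open SQLite with the pragmas listed in Step 1.2."""
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")   # has no effect on :memory: databases
    conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("PRAGMA busy_timeout=5000")
    return conn


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[str]:
    """Apply unapplied .sql files in filename order; return the names applied."""
    # Hypothetical tracking-table layout (assumption, not taken from DATABASE.md).
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "filename TEXT PRIMARY KEY, "
        "applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    applied = {row[0] for row in conn.execute("SELECT filename FROM schema_migrations")}
    newly_applied = []
    for sql_file in sorted(migrations_dir.glob("*.sql")):
        if sql_file.name in applied:
            continue
        conn.executescript(sql_file.read_text(encoding="utf-8"))
        conn.execute(
            "INSERT INTO schema_migrations (filename) VALUES (?)", (sql_file.name,)
        )
        conn.commit()
        newly_applied.append(sql_file.name)
    return newly_applied
```

Running it a second time against the same database returns an empty list, which is the property the pytest schema check can assert.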

Step 1.3 — Auth module

  1. backend/auth/utils.py: hash_password, verify_password, create_access_token, decode_access_token
  2. backend/auth/schemas.py: LoginRequest, TokenResponse, UserResponse
  3. backend/auth/service.py: AuthService (create user, login, list users)
  4. backend/auth/router.py — login, me, users CRUD
  5. backend/dependencies.py: get_current_user, require_admin
  6. main.py — seed default admin user on first startup if users table empty
    • Generate a random password and print it to stdout once (NOT admin/admin)
    • Add rate limiting to POST /auth/login (5 attempts/minute per IP via slowapi)
    • Add input sanitization for all string fields in auth schemas

Test: POST /api/auth/login returns JWT. GET /api/auth/me with token returns user. Rate limiting returns 429 after 5 failed attempts.
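
Generating the one-time admin password could be as small as the sketch below. Both function names are hypothetical, and the 18-byte default is an arbitrary choice; the only fixed points from this plan are that the password is random and printed once.

```python
import secrets


def generate_initial_admin_password(num_bytes: int = 18) -> str:
    """Return a URL-safe random password drawn from the OS CSPRNG."""
    return secrets.token_urlsafe(num_bytes)


def format_initial_password_notice(password: str) -> str:
    """Build the line printed once at first startup; only the hash is stored."""
    return "Initial admin password: " + password
```

The caller would hash the password with hash_password() before inserting the user row and print the notice exactly once.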

Step 1.4 — Server CRUD (no process management yet)

  1. backend/dal/server_repository.py
  2. backend/dal/config_repository.py
  3. backend/servers/schemas.py
  4. backend/servers/router.py — GET, POST, PUT, DELETE /servers and /servers/{id}
  5. backend/servers/service.py — CRUD methods only (skip start/stop for now)
  6. backend/utils/file_utils.py: ensure_server_dirs(), sanitize_filename()
  7. backend/utils/port_checker.py: is_port_in_use(), check_server_ports_available()
  8. Port validation on create/start: check game_port through game_port+4

Test: Create server via API, confirm DB row + directory created.
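
A minimal sketch of the port check from item 8, assuming the Arma 3 ports in question are UDP and that "in use" can be detected by attempting to bind. The host parameter and the choice to return the list of busy ports are illustrative assumptions, not the signatures from MODULES.md.

```python
import socket


def is_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a UDP socket cannot be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except (OSError, OverflowError):
            # OSError: address already in use; OverflowError: port outside 0-65535.
            return True
    return False


def check_server_ports_available(game_port: int, host: str = "0.0.0.0") -> list[int]:
    """Return the ports in game_port..game_port+4 that are already taken."""
    return [p for p in range(game_port, game_port + 5) if is_port_in_use(p, host)]
```

An empty list means the whole range is free. A bind probe is inherently racy, so the check is a pre-flight validation rather than a reservation.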


Phase 2 — Process Management

Goal: Start/stop actual arma3server.exe processes.

Step 2.1 — Config Generator

  1. backend/servers/config_generator.py
  2. Use a structured builder (NOT f-strings) — escape double quotes and newlines in all user-supplied string values to prevent config injection
  3. Write server.cfg covering all params from DATABASE.md, including mission rotation as class Missions {} block
  4. Write basic.cfg
  5. Write server.Arma3Profile: written to servers/{id}/server/server.Arma3Profile (Arma 3 reads from the -name subdirectory)
  6. Write BESERVER_CFG_TEMPLATE: required for BattlEye RCon to work
    # servers/{id}/battleye/beserver.cfg
    RConPassword {rcon_password}
    RConPort {rcon_port}
    
    write_beserver_cfg() must create the battleye/ directory and write this file. Without it BattlEye will not open an RCon port regardless of launch parameters.
  7. build_launch_args() — assembles full CLI arg list
    • Include -bepath=./battleye to point BE at the generated config (relative to cwd)
    • Include -profiles=./ and -name=server for profile directory
    • All relative paths resolve against cwd=servers/{id}/ set in ProcessManager
  8. Set file permissions 0600 on config files containing passwords (server.cfg, beserver.cfg)

Test: ConfigGenerator.write_all(server_id) → inspect all generated files for correctness. Verify servers/{id}/battleye/beserver.cfg exists with the correct RCon password. Verify servers/{id}/server/server.Arma3Profile exists. Test config injection prevention: set hostname to X"; passwordAdmin = "pwned"; // — verify generated server.cfg does NOT contain the injected directive. Validate generated server.cfg manually by running the server with it.
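
The escaping rule from item 2 might be sketched as follows. It assumes that a doubled quote ("") is the literal-quote escape inside an Arma config string and that line breaks can simply be flattened to spaces; render_setting() is a hypothetical helper, not the ConfigGenerator API.

```python
def escape_config_string(value: str) -> str:
    """Make a user-supplied value safe inside a double-quoted config string."""
    # Flatten CR/LF so a value cannot start a new config line.
    flattened = value.replace("\r", " ").replace("\n", " ")
    # Assumption: "" is a literal quote inside an Arma config string.
    return flattened.replace('"', '""')


def render_setting(key: str, value) -> str:
    """Render one `key = value;` line; every string passes through the escaper."""
    if isinstance(value, bool):
        rendered = str(int(value))          # booleans become 0/1
    elif isinstance(value, (int, float)):
        rendered = str(value)
    else:
        rendered = '"' + escape_config_string(str(value)) + '"'
    return key + " = " + rendered + ";"
```

With the hostname from the injection test, the whole input stays inside one quoted string, so the parser sees a single hostname assignment and no separate passwordAdmin directive.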

Step 2.2 — Process Manager

  1. backend/servers/process_manager.py: ProcessManager singleton
  2. start(server_id, exe_path, args, cwd=servers/{id}/) — subprocess.Popen with cwd set to server instance dir
  3. stop(server_id, timeout=30) — on Windows: terminate() = hard kill (no SIGTERM). Graceful shutdown is via RCon #shutdown in ServerService.
  4. kill(), is_running(), get_pid()
  5. recover_on_startup() — verify PID is alive AND process name matches arma3server (prevents PID reuse)
  6. Wire ServerService.start() and ServerService.stop()
  7. Add POST /servers/{id}/start, POST /servers/{id}/stop, POST /servers/{id}/kill endpoints

Test: Start a server via API → confirm process appears in Task Manager. Stop it → confirm process ends.

Step 2.3 — Config endpoints

  1. GET /servers/{id}/config
  2. PUT /servers/{id}/config/server
  3. PUT /servers/{id}/config/basic
  4. PUT /servers/{id}/config/profile
  5. PUT /servers/{id}/config/launch
  6. GET /servers/{id}/config/preview

Test: Update hostname via API → regenerate and start server → confirm new hostname appears in server browser.


Phase 3 — Background Threads

Goal: Live monitoring — process crash detection, log tailing, metrics.

Step 3.1 — Thread infrastructure

  1. backend/threads/base_thread.py: BaseServerThread
  2. backend/threads/thread_registry.py: ThreadRegistry singleton
  3. Wire start_server_threads() / stop_server_threads() into ServerService.start() / ServerService.stop()

Step 3.2 — Process Monitor Thread

  1. backend/threads/process_monitor.py
  2. Crash detection + status update in DB
  3. Auto-restart with exponential backoff

Test: Start server → kill process manually → confirm DB status changes to 'crashed'. Test: Enable auto_restart → kill → confirm server restarts automatically.
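
One way to compute the auto-restart delay is shown below. The 5-second base and 300-second cap are placeholder values, not settings taken from this plan, and restart_delay() is a hypothetical helper name.

```python
def restart_delay(attempt: int, base: float = 5.0, cap: float = 300.0) -> float:
    """Seconds to wait before restart number `attempt` (1-based), doubling each time."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    # Clamp the exponent so very large attempt counts cannot overflow a float.
    exponent = min(attempt - 1, 32)
    return min(cap, base * 2 ** exponent)
```

The monitor thread would reset the attempt counter once a server has stayed up for some grace period, so that an isolated crash days later starts again from the base delay.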

Step 3.3 — Log Tail Thread

  1. backend/logs/parser.py: RPTParser
  2. backend/dal/log_repository.py
  3. backend/threads/log_tail.py
  4. backend/logs/service.py
  5. backend/logs/router.py: GET /servers/{id}/logs

Test: Start server → GET /api/servers/{id}/logs returns recent RPT lines.
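
The core of a log tail loop is reading only what was appended since the last poll. The sketch below shows that step in isolation; read_new_lines() is a hypothetical helper, and the RPT line format itself is left to RPTParser.

```python
import os


def read_new_lines(path: str, offset: int) -> tuple[list[str], int]:
    """Return complete lines appended since `offset`, plus the new offset."""
    if os.path.getsize(path) < offset:
        offset = 0   # file was truncated or replaced; start over
    with open(path, "rb") as fh:
        fh.seek(offset)
        chunk = fh.read()
    # Consume only up to the last newline so a half-written line is retried later.
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset
    complete = chunk[: end + 1]
    lines = [
        raw.decode("utf-8", errors="replace").rstrip("\r")
        for raw in complete.split(b"\n")[:-1]
    ]
    return lines, offset + len(complete)
```

LogTailThread would call this on each tick, pass the lines to the parser, and keep the returned offset for the next call.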

Step 3.4 — Metrics Collector Thread

  1. backend/metrics/service.py
  2. backend/dal/metrics_repository.py
  3. backend/threads/metrics_collector.py
  4. backend/metrics/router.py: GET /servers/{id}/metrics

Test: Running server → query metrics endpoint → see CPU/RAM data points.


Phase 4 — BattlEye RCon

Goal: Real-time player list, in-game admin commands.

Step 4.1 — RCon Client

  1. backend/rcon/client.py: BERConClient
  2. Implement BE RCon UDP protocol:
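    • Packet structure: 'BE' + CRC32 (little-endian) + 0xFF + type byte + payload
    • Login: type 0x00, payload = password
    • Command: type 0x01, payload = sequence byte + command string
    • Keepalive: type 0x01 with a sequence byte and an empty command string, sent at least every 45 seconds (type 0x02 is reserved for server messages)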
  3. Request multiplexer: track pending requests by sequence byte, route responses to correct caller via threading.Event per request. Background receiver thread reads all incoming packets.
  4. parse_players_response() — parse players command output
  5. Handle unsolicited server messages (type 0x02): acknowledge each one by replying with type 0x02 and the same sequence byte, then enqueue it for event logging

BattlEye RCon packet format reference:

Login packet (client → server):
  42 45          # 'BE'
  [CRC32 LE]     # checksum of bytes after CRC
  FF             # packet type prefix
  00             # login type
  [password]     # ASCII password

Command packet:
  42 45
  [CRC32 LE]
  FF
  01
  [seq byte]     # 0x00-0xFF, wraps around
  [command]      # ASCII command string

Command response (server → client):
  42 45
  [CRC32 LE]
  FF
  01             # 0x01 = command response (same type byte as outgoing command)
  [seq byte]
  [response]     # ASCII response text

Server-pushed message (server → client, unsolicited):
  42 45
  [CRC32 LE]
  FF
  02             # 0x02 = server message (chat events, kill events, etc.)
  [seq byte]
  [message]      # ASCII message text

Test: Connect BERConClient to a running server with BattlEye → successfully login → send players → receive response.
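
The packet layouts above translate into a few lines of framing code. This is a sketch using zlib.crc32 for the checksum; the function and constant names are illustrative rather than the BERConClient API.

```python
import struct
import zlib

HEADER = b"BE"
TYPE_LOGIN, TYPE_COMMAND, TYPE_MESSAGE = 0x00, 0x01, 0x02


def build_packet(packet_type: int, payload: bytes) -> bytes:
    """Frame a payload: 'BE' + CRC32 (little-endian) + 0xFF + type + payload."""
    body = b"\xff" + bytes([packet_type]) + payload
    crc = zlib.crc32(body) & 0xFFFFFFFF   # checksum covers everything after the CRC
    return HEADER + struct.pack("<I", crc) + body


def parse_packet(data: bytes) -> tuple[int, bytes]:
    """Validate header and checksum; return (packet_type, payload)."""
    if len(data) < 8 or data[:2] != HEADER or data[6] != 0xFF:
        raise ValueError("not a BattlEye RCon packet")
    (expected_crc,) = struct.unpack("<I", data[2:6])
    if (zlib.crc32(data[6:]) & 0xFFFFFFFF) != expected_crc:
        raise ValueError("CRC mismatch")
    return data[7], data[8:]


def login_packet(password: str) -> bytes:
    return build_packet(TYPE_LOGIN, password.encode("ascii"))


def command_packet(seq: int, command: str) -> bytes:
    # The sequence byte wraps at 0xFF.
    return build_packet(TYPE_COMMAND, bytes([seq & 0xFF]) + command.encode("ascii"))
```

For command and server-message packets, the first payload byte returned by parse_packet() is the sequence byte, which the multiplexer uses to find the waiting caller.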

Step 4.2 — RCon Service + Poller Thread

  1. backend/rcon/service.py: RConService
  2. backend/threads/rcon_poller.py
  3. backend/dal/player_repository.py
  4. backend/players/service.py
  5. backend/players/router.py: GET /servers/{id}/players

Test: Players join server → GET /players returns them with pings.

Step 4.3 — Admin Actions via RCon

  1. POST /servers/{id}/players/{num}/kick
  2. POST /servers/{id}/players/{num}/ban
  3. POST /servers/{id}/rcon/command
  4. POST /servers/{id}/rcon/say
  5. backend/dal/ban_repository.py
  6. GET/POST/DELETE /servers/{id}/bans
  7. ban.txt bidirectional sync: on ban add/delete via API, write to battleye/ban.txt; on startup, read ban.txt and upsert into DB

Test: Kick a player via API → confirm player disconnected from server.


Phase 5 — WebSocket Real-Time

Goal: Live updates to React frontend without polling.

Step 5.1 — Broadcast infrastructure

  1. backend/websocket/broadcaster.py: BroadcastThread + enqueue()
  2. backend/websocket/manager.py: ConnectionManager
  3. Store event loop reference in main.py:on_startup():
    import asyncio
    # on_startup() runs inside the asyncio event loop, so use get_running_loop().
    # get_event_loop() is discouraged: outside a running loop it may create a new
    # loop, and it has emitted deprecation warnings there since Python 3.10.
    _event_loop = asyncio.get_running_loop()
    broadcaster.init(_event_loop, connection_manager)
    
  4. Start BroadcastThread in on_startup()
  5. Wire BroadcastThread.enqueue() calls into all background threads
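
The hand-off from worker threads to the event loop could be sketched as below. The constructor arguments and the stop() method are assumptions (the plan itself only names BroadcastThread, enqueue(), and broadcaster.init()); the part that matters is asyncio.run_coroutine_threadsafe, which is the supported way to schedule a coroutine from another thread.

```python
import asyncio
import queue
import threading


class BroadcastThread(threading.Thread):
    """Drains a thread-safe queue and hands each event to the asyncio loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop, send) -> None:
        super().__init__(daemon=True)
        self._loop = loop
        self._send = send                    # async callable: await send(event)
        self._queue: queue.Queue = queue.Queue()
        self._stop_marker = object()

    def enqueue(self, event: dict) -> None:
        """Safe to call from any background thread."""
        self._queue.put(event)

    def stop(self) -> None:
        self._queue.put(self._stop_marker)

    def run(self) -> None:
        while True:
            event = self._queue.get()
            if event is self._stop_marker:
                return
            # Schedule the coroutine on the loop thread; nothing is awaited here.
            asyncio.run_coroutine_threadsafe(self._send(event), self._loop)
```

In the real application, send would be something like a ConnectionManager broadcast coroutine, and the loop would be the reference captured in on_startup().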

Step 5.2 — WebSocket endpoint

  1. backend/websocket/router.py
  2. JWT validation from query param
  3. Subscribe/unsubscribe message handling
  4. Ping/pong keepalive

Test: Connect to ws://localhost:8000/ws/1?token=... → see live log lines stream in terminal.

Step 5.3 — Integrate all event sources

Wire BroadcastThread.enqueue() into:

  • ProcessMonitorThread → status updates, crash events
  • LogTailThread → log lines
  • MetricsCollectorThread → metrics snapshots
  • RConPollerThread → player list updates
  • ServerService.start/stop → status transitions

Test: React frontend connects to WS → server starts → see status, logs, metrics all update in real time.


Phase 6 — Mission & Mod Management

Step 6.1 — Missions

  1. backend/missions/service.py
  2. backend/missions/router.py
  3. Upload PBO validation (check .pbo extension, parse name)
  4. Mission rotation CRUD

Test: Upload a .pbo → appears in GET /missions → set as rotation → start server → mission available.

Step 6.2 — Mods

  1. backend/mods/service.py
  2. backend/mods/router.py
  3. build_mod_string() — assemble -mod= and -serverMod= args
  4. Wire mod string into ConfigGenerator.build_launch_args()

Test: Register @CBA_A3 → enable on server → start → server loads mod.


Phase 7 — Polish & Production

Step 7.1 — APScheduler jobs

Add to on_startup():

# Use BackgroundScheduler (not AsyncIOScheduler): it runs jobs on its own worker
# threads, which keeps the sync SQLite cleanup calls off the event loop thread.
from apscheduler.schedulers.background import BackgroundScheduler
scheduler = BackgroundScheduler()
scheduler.add_job(log_service.cleanup_old_logs, 'cron', hour=3)
scheduler.add_job(metrics_service.cleanup_old_metrics, 'cron', hour=3, minute=30)
scheduler.add_job(player_service.cleanup_old_history, 'cron', hour=4)  # 90-day retention
scheduler.start()

Step 7.2 — Startup recovery

In on_startup(), ProcessManager.recover_on_startup() does the following:

  • Query DB for servers with status='running'
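  • Check if PID still alive (psutil.pid_exists(pid)) and that the process name still matches arma3server, as required in Step 2.2, so a reused PID is not mistaken for a server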
  • If alive: re-attach threads (skip process start, just start monitoring threads)
  • If dead: mark as crashed, clear players

Step 7.3 — Events log

  1. backend/dal/event_repository.py (already created in Step 1.2)
  2. Insert events for: start, stop, crash, kick, ban, config change, mission change
  3. GET /servers/{id}/events endpoint

Step 7.4 — Security hardening (additional layers)

  1. Encrypt sensitive DB fields: password, password_admin, rcon_password
    • backend/utils/crypto.py with Fernet
    • Key format: LANGUARD_ENCRYPTION_KEY must be a Fernet key (URL-safe base64 encoding of 32 bytes), NOT hex. Generate one with: python -c "from cryptography.fernet import Fernet; print(Fernet.generate_key().decode())". Passing a hex string to Fernet() raises ValueError at startup.
    • Encrypt on write, decrypt on read in repositories
    • NOTE: Core security (rate limiting, input sanitization, config escaping, exe path validation) is already in Phases 1-2.
  2. Additional penetration testing and security audit
  3. Content-Security-Policy headers for frontend

Step 7.5 — Frontend integration checklist

Verify React app can:

  • Login and store JWT
  • List servers with live status
  • Start/stop server and see status update via WebSocket (no page refresh)
  • View streaming log output
  • See player list update every 10s
  • See CPU/RAM charts update every 5s
  • Edit all config sections and see preview
  • Upload a mission PBO
  • Kick a player
  • Send a message to all players

Testing Strategy

Unit tests (pytest)

  • ConfigGenerator.write_server_cfg() — compare output against expected string; test config injection prevention
  • ConfigGenerator._escape_config_string() — test double-quote and newline escaping
  • RPTParser.parse_line() — test all log formats
  • BERConClient.parse_players_response() — test with sample output
  • AuthService.login() — correct password / wrong password / rate limiting
  • Repository methods — use in-memory SQLite (:memory:)
  • check_server_ports_available() — test derived port validation
  • sanitize_filename() — test path traversal prevention
  • In-memory SQLite setup in conftest.py — shared fixture for all repository tests
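
As an illustration of what the sanitize_filename() path traversal tests would exercise, here is one possible implementation. The allow-list of characters and the decision to raise on an empty result are assumptions, not behaviour specified in MODULES.md.

```python
import re
from pathlib import PurePosixPath, PureWindowsPath


def sanitize_filename(name: str) -> str:
    """Reduce an uploaded filename to a safe basename or raise ValueError."""
    # Keep only the last path component under both separator conventions.
    base = PureWindowsPath(PurePosixPath(name).name).name
    # Allow-list: letters, digits, dot, underscore, hyphen; anything else becomes "_".
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", base).lstrip(".")
    if not cleaned:
        raise ValueError("filename is empty after sanitization")
    return cleaned
```

A matching pytest case would assert that inputs such as "../../etc/passwd" and "..\\..\\server.cfg" come back as bare basenames and that ".." is rejected.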

Integration tests

  • Full start/stop cycle with a real arma3server.exe (manual — requires licensed Arma 3 installation, not in CI)
  • WebSocket message delivery (can be automated with FastAPI's TestClient and its websocket_connect(); plain httpx has no WebSocket support)
  • RCon command round-trip (manual — requires running server with BattlEye)

Load notes

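  • SQLite in WAL mode lets the 4 threads per server read concurrently while a write is in progress; writes are still serialized, so busy_timeout=5000 (Step 1.2) absorbs brief lock contention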
  • For >10 simultaneous servers, consider connection pool size tuning
  • WebSocket broadcast scales to ~100 concurrent connections without issue

Environment Setup (Developer)

# 1. Clone repo
git clone <repo>
cd languard-servers-manager

# 2. Backend
cd backend
python -m venv venv
source venv/bin/activate   # or venv\Scripts\activate on Windows
pip install -r requirements.txt

# 3. Environment
cp .env.example .env
# Edit .env: set LANGUARD_ARMA_EXE to your arma3server_x64.exe path

# 4. Run backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# 5. Frontend (separate)
cd ../frontend
npm install
npm run dev

Backend auto-creates languard.db and seeds an admin user on first run:

  • Username: admin
  • Password: randomly generated and printed to stdout once (e.g., Initial admin password: a7b9c2d4e5f6...)
  • Change immediately via PUT /api/auth/password

Phase Summary

Phase   Deliverable                                    Est. Complexity
1       Foundation (auth + server CRUD)                Low
2       Process management + config gen                Medium
3       Background threads (monitor, logs, metrics)    Medium-High
4       BattlEye RCon (player list, admin cmds)        High
5       WebSocket real-time                            Medium
6       Mission + mod management                       Low-Medium
7       Polish, security, recovery                     Medium

Implement phases in order — each phase builds on the previous and is independently testable.