languard-servers-manager/IMPLEMENTATION_PLAN.md
Tran G. (Revernomad) Khoa b17d199301 fix: address design review ACT NOW items (6 risk gaps)
- Add migrate_config() to ConfigGenerator protocol for schema version upgrades
- Add per-server operation lock to ProcessManager to prevent start/stop races
- Add busy_timeout retry/backoff strategy (exponential: 1s, 2s, 4s) for DB lock exhaustion
- Add ConfigForm testing strategy and error boundary for malformed schemas
- Add schema cache invalidation on adapter version change
- Add ConfigMigrationError to typed adapter exceptions
2026-04-16 17:29:19 +07:00

Languard Servers Manager — Implementation Plan

Prerequisites

Before starting, ensure the following are available:

  • Python 3.11+
  • A working Arma 3 dedicated server installation (for testing the first adapter)
  • Node.js 18+ (for frontend dev server)
  • The reference docs: ARCHITECTURE.md, DATABASE.md, API.md, MODULES.md, THREADING.md

Phase 0 — Adapter Framework (New)

Goal: Build the adapter protocol + registry system before any other code. This is the foundation that makes every subsequent phase modular.

Step 0.1 — Adapter protocols, exceptions, and registry

  1. Create backend/adapters/__init__.py — auto-imports built-in adapters
  2. Create backend/adapters/protocols.py — all capability Protocol definitions:
    • ConfigGenerator (merged: schema + generation), ProcessConfig, LogParser
    • RemoteAdmin, RemoteAdminClient
    • MissionManager, ModManager, BanManager
    • GameAdapter (composite protocol with has_capability() method)
    • ConfigGenerator includes get_sections(), get_sensitive_fields(section), get_config_version(), migrate_config(old_version, config_json)
    • RemoteAdmin includes get_player_data_schema() -> type[BaseModel] | None
    • MissionManager includes get_mission_data_schema() -> type[BaseModel] | None
    • ModManager includes get_mod_data_schema() -> type[BaseModel] | None
    • BanManager includes get_ban_data_schema() -> type[BaseModel] | None
  3. Create backend/adapters/exceptions.py — typed adapter exceptions:
    • AdapterError (base)
    • ConfigWriteError — atomic write failed (tmp file cleanup done)
    • ConfigValidationError — adapter Pydantic validation failed
    • LaunchArgsError — invalid launch arguments
    • RemoteAdminError — admin protocol communication failed
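    • ExeNotAllowedError — executable not in adapter allowlist
    • ConfigMigrationError — migrate_config() could not upgrade a stored config (e.g. invalid old_version)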
  4. Create backend/adapters/registry.py — GameAdapterRegistry singleton
  5. Add has_capability(name) -> bool method to GameAdapter protocol — core uses explicit capability probes instead of scattered None checks
  6. Write unit tests: register adapter, get adapter, list game types, missing adapter raises error, exceptions are catchable by type, has_capability returns correct bools
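
The registry and capability probe described in this step could be sketched roughly as follows. This is a minimal illustration, not the project's actual code: UnknownGameTypeError, DemoAdapter, and the capability names are hypothetical, and class-level state stands in for the singleton.

```python
from typing import Protocol, runtime_checkable


class AdapterError(Exception):
    """Base class for adapter failures (named in the plan)."""


class UnknownGameTypeError(AdapterError):
    """Hypothetical: raised when no adapter is registered for a game type."""


@runtime_checkable
class GameAdapter(Protocol):
    game_type: str
    display_name: str

    def has_capability(self, name: str) -> bool: ...


class GameAdapterRegistry:
    """Process-wide registry; class-level state stands in for a singleton."""

    _adapters: dict[str, GameAdapter] = {}

    @classmethod
    def register(cls, adapter: GameAdapter) -> None:
        cls._adapters[adapter.game_type] = adapter

    @classmethod
    def get(cls, game_type: str) -> GameAdapter:
        try:
            return cls._adapters[game_type]
        except KeyError:
            raise UnknownGameTypeError(game_type) from None

    @classmethod
    def list_game_types(cls) -> list[dict[str, str]]:
        return [
            {"game_type": a.game_type, "display_name": a.display_name}
            for a in cls._adapters.values()
        ]


class DemoAdapter:
    """Stub adapter used only to exercise the registry."""

    game_type = "demo"
    display_name = "Demo Game"
    _capabilities = frozenset({"config_generator", "process_config", "log_parser"})

    def has_capability(self, name: str) -> bool:
        return name in self._capabilities


GameAdapterRegistry.register(DemoAdapter())
```

Core code can then branch on `adapter.has_capability("remote_admin")` instead of checking whether an optional attribute is None.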

Step 0.2 — Arma 3 adapter skeleton

  1. Create backend/adapters/arma3/__init__.py — exports and registers ARMA3_ADAPTER
  2. Create backend/adapters/arma3/adapter.py — Arma3Adapter class (all methods return stubs initially)
  3. Create backend/adapters/arma3/process_config.py — Arma3ProcessConfig (full implementation)
  4. Create backend/adapters/arma3/config_generator.py — Pydantic models (ServerConfig, BasicConfig, ProfileConfig, LaunchConfig, RConConfig) + Arma3ConfigGenerator (schema + generation merged)
  5. Third-party adapter loading: add languard.adapters entry_point group to pyproject.toml:
    [project.entry-points."languard.adapters"]
    arma3 = "adapters.arma3:ARMA3_ADAPTER"
    
    Core scans entry_points at startup via importlib.metadata in addition to built-in imports.
  6. Write unit tests: adapter registers, protocols satisfied, config schema produces valid JSON Schema

Test: Import adapters module → GameAdapterRegistry.get("arma3") returns a valid adapter. GameAdapterRegistry.list_game_types() returns [{"game_type": "arma3", "display_name": "Arma 3", ...}].


Phase 1 — Foundation

Goal: Running FastAPI server with DB, auth, and basic server CRUD using the adapter framework.

Step 1.1 — Project scaffold

mkdir backend
cd backend
python -m venv venv
venv\Scripts\activate   # Windows; on Linux/macOS use: source venv/bin/activate
pip install fastapi "uvicorn[standard]" sqlalchemy "python-jose[cryptography]" "passlib[bcrypt]" cryptography psutil apscheduler python-multipart slowapi pytest pytest-asyncio httpx
pip freeze > requirements.txt

Create:

  • backend/config.py — Settings class
  • backend/main.py — FastAPI app factory, startup/shutdown hooks
  • backend/conftest.py — pytest fixtures (in-memory SQLite, test client)
  • .env.example — All env vars documented

Step 1.2 — Database + Migrations

  1. Create backend/core/migrations/001_initial_schema.sql — all core tables:
    • schema_migrations, users, servers (with game_type), game_configs
    • mods (with game_type, game_data), server_mods
    • missions, mission_rotation (with game_data)
    • players (with slot_id TEXT, game_data), player_history
    • bans (with game_data), logs, metrics, server_events
    • Include all CHECK constraints and indexes
    • Include PRAGMA busy_timeout=5000 in engine setup
  2. Create backend/core/dal/event_repository.py
  3. Create backend/database.py:
    • get_engine() with WAL + FK pragma
    • run_migrations()
    • get_db() — FastAPI dependency
    • get_thread_db() — thread-local session factory
  4. Call run_migrations() in main.py:on_startup()

Test: Start app, confirm languard.db created with all tables. Run pytest with in-memory SQLite.
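
As a rough illustration of run_migrations() and the pragmas listed above, here is a sketch using the standard-library sqlite3 module instead of SQLAlchemy, purely to keep it self-contained. It assumes the runner owns the schema_migrations tracking table and that its columns (name, applied_at) are free to choose; neither is confirmed by the plan.

```python
import sqlite3
from pathlib import Path


def connect(db_path: str) -> sqlite3.Connection:
    conn = sqlite3.connect(db_path)
    conn.execute("PRAGMA journal_mode=WAL")   # has no effect on :memory: databases
    conn.execute("PRAGMA foreign_keys=ON")
    conn.execute("PRAGMA busy_timeout=5000")  # wait up to 5 s on a locked database
    return conn


def run_migrations(conn: sqlite3.Connection, migrations_dir: Path) -> list[str]:
    """Apply not-yet-applied *.sql files in name order; return the names applied."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_migrations ("
        "name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {row[0] for row in conn.execute("SELECT name FROM schema_migrations")}
    applied: list[str] = []
    for path in sorted(migrations_dir.glob("*.sql")):
        if path.name in done:
            continue  # already recorded, which makes repeated runs harmless
        conn.executescript(path.read_text(encoding="utf-8"))
        conn.execute("INSERT INTO schema_migrations (name) VALUES (?)", (path.name,))
        conn.commit()
        applied.append(path.name)
    return applied
```

Calling run_migrations() a second time returns an empty list, which is the property the startup hook relies on.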

Step 1.3 — Auth module

  1. backend/core/auth/utils.py — hash_password, verify_password, create_access_token, decode_access_token
  2. backend/core/auth/schemas.py — LoginRequest, TokenResponse, UserResponse
  3. backend/core/auth/service.py — AuthService
  4. backend/core/auth/router.py — login, me, users CRUD
  5. backend/dependencies.py — get_current_user, require_admin, get_adapter_for_server
  6. main.py — seed default admin user on first startup (random password printed to stdout)
  7. Add rate limiting to POST /auth/login (5 attempts/minute per IP via slowapi)

Test: POST /api/auth/login returns JWT. GET /api/auth/me with token returns user. After 5 login attempts from the same IP within a minute, the next attempt returns 429.

Step 1.4 — Server CRUD (no process management yet)

  1. backend/core/dal/server_repository.py
  2. backend/core/dal/config_repository.py — manages game_configs table
  3. backend/core/servers/schemas.py — CreateServerRequest (includes game_type)
  4. backend/core/servers/router.py — GET, POST, PUT, DELETE /servers
  5. backend/core/servers/service.py — CRUD methods + create_server seeds config sections from adapter defaults
  6. backend/core/utils/file_utils.py — ensure_server_dirs() (uses adapter's get_server_dir_layout())
  7. backend/core/utils/port_checker.py — is_port_in_use(), check_server_ports_available()
    • Full cross-game port checking: query ALL running servers, resolve each adapter, get port conventions for each, check the full derived port set
    • Example: Arma 3 uses game port + 1 (Steam query), BattlEye RCon port; another game may use different conventions — all checked

Test: Create server via API with game_type: "arma3" → confirm DB row + game_configs rows + directory created. Create a second server with a port that conflicts with derived ports of the first → confirm 409 error.
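
The derived-port check can be illustrated with a small sketch. Only the ports named in this step are modelled (game port, game port + 1, RCon port); the function names arma3_derived_ports and find_port_conflicts are hypothetical, and probing with a UDP bind is an assumption about how is_port_in_use() would work.

```python
import socket


def arma3_derived_ports(game_port: int, rcon_port: int) -> set[int]:
    """Ports named in the plan: game port, game port + 1 (Steam query), RCon port."""
    return {game_port, game_port + 1, rcon_port}


def find_port_conflicts(candidate: set[int], running: dict[int, set[int]]) -> dict[int, set[int]]:
    """Map each running server id to the ports it shares with the candidate."""
    return {sid: candidate & ports for sid, ports in running.items() if candidate & ports}


def is_port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """True if a UDP bind fails (assumption: the game traffic of interest is UDP)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return True
    return False
```

A non-empty result from find_port_conflicts() is what the API layer would turn into the 409 response described in the test.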

Step 1.5 — Game type discovery endpoints

  1. backend/core/games/router.py — GET /games, GET /games/{type}, GET /games/{type}/config-schema, GET /games/{type}/defaults

Test: GET /api/games returns [{"game_type": "arma3", ...}]. GET /api/games/arma3/config-schema returns JSON Schema for all 5 Arma 3 config sections.

Step 1.6 — Migration script for existing Arma 3 data

If upgrading from the single-game schema, create a migration script:

  1. Create backend/core/migrations/002_migrate_arma3_config.py
  2. Column type map: max_players INT→JSON maxPlayers, hostname TEXT→JSON hostname, etc.
  3. migrate_config_table(): read old Arma 3 config table rows → build game_configs JSON blobs → insert into new table → delete old rows
  4. migrate_player_data(): convert player_num INTEGER → slot_id TEXT
  5. Transaction + rollback: all migration runs inside a single DB transaction; on failure, full rollback
  6. Row count verification: after migration, assert row counts match between old and new tables
  7. Idempotent: safe to run multiple times (checks if migration already applied)

Test: Create test DB with old single-game schema + sample data → run migration script → verify all data in new tables → verify old tables dropped.


Phase 2 — Arma 3 Adapter Implementation

Goal: Complete the Arma 3 adapter with config generation and process management. This phase proves the adapter architecture works end-to-end with the primary game.

Step 2.1 — Config Generator (Arma 3 adapter)

  1. backend/adapters/arma3/config_generator.py — Arma3ConfigGenerator
  2. Use a structured builder (NOT f-strings) — escape double quotes and newlines in all user-supplied string values
  3. Write server.cfg covering all params from config schema, including mission rotation as class Missions {} block
  4. Write basic.cfg
  5. Write server.Arma3Profile — written to servers/{id}/server/server.Arma3Profile
  6. Write beserver.cfg — creates battleye/ directory, writes RCon config
  7. build_launch_args() — assembles full CLI arg list including -bepath=./battleye
  8. preview_config() — renders all files without writing to disk, returns dict[str, str] of label→content (filenames for file-based, variable names for env-var, argument names for CLI)
  9. Set file permissions 0600 on config files containing passwords
  10. Atomic write pattern: all config files written to .tmp files first, then os.replace() for atomic rename. On any write failure, all .tmp files are cleaned up and original files remain untouched. Raises ConfigWriteError on failure.

Test: Arma3ConfigGenerator.write_configs(server_id, dir, config) → inspect all generated files. Test config injection prevention: set hostname to X"; passwordAdmin = "pwned"; // — verify generated server.cfg does NOT contain the injected directive. Test atomic write: mock os.replace() to raise OSError → confirm .tmp files are cleaned up and original files are untouched.
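
The escaping and atomic-write steps could be sketched as below. The escaping rules (doubling embedded quotes, flattening line breaks to spaces) are an assumption about what the structured builder should do, and write_configs_atomically is a hypothetical helper name; ConfigWriteError here is a local stand-in for the adapter exception.

```python
import os
from pathlib import Path


class ConfigWriteError(Exception):
    """Stands in for the adapter exception of the same name."""


def escape_config_string(value: str) -> str:
    """Assumed rules: double embedded quotes, flatten CR/LF to spaces."""
    return value.replace('"', '""').replace("\r", " ").replace("\n", " ")


def write_configs_atomically(files: dict[Path, str]) -> None:
    """Write every file to <name>.tmp first, then swap each one into place."""
    tmp_paths: list[Path] = []
    try:
        for target, content in files.items():
            tmp = target.with_name(target.name + ".tmp")
            tmp_paths.append(tmp)
            tmp.write_text(content, encoding="utf-8")
        for target, tmp in zip(files, tmp_paths):
            os.replace(tmp, target)  # atomic when tmp and target share a filesystem
    except OSError as exc:
        for tmp in tmp_paths:
            tmp.unlink(missing_ok=True)  # leave no .tmp files behind
        raise ConfigWriteError(str(exc)) from exc
```

Note that each os.replace() is atomic per file, not across the whole set: if the swap of a later file fails, earlier files have already been replaced. Writing all .tmp files before the first swap keeps that window as small as this approach allows.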

Step 2.2 — Process Manager (core)

  1. backend/core/servers/process_manager.py — ProcessManager singleton (game-agnostic)
  2. start(server_id, exe_path, args, cwd=servers/{id}/)
  3. stop(server_id, timeout=30) — on Windows, terminate() is a hard kill (TerminateProcess), so the process gets no chance to shut down gracefully
  4. kill(), is_running(), get_pid()
  5. recover_on_startup() — verify PID is alive AND process name matches adapter allowlist (prevents PID reuse)
  6. Wire ServerService.start() and ServerService.stop() — both delegate to adapter for exe validation and config generation
  7. Add POST /servers/{id}/start, POST /servers/{id}/stop, POST /servers/{id}/kill endpoints
  8. Typed exception handling in start flow: catch and map adapter exceptions to HTTP responses:
    • ConfigWriteError → 500 (atomic write failed, tmp cleaned)
    • ConfigValidationError → 422 (invalid config values)
    • LaunchArgsError → 400 (invalid launch arguments)
    • ExeNotAllowedError → 403 (executable not in adapter allowlist)

Test: Start a server via API → confirm process appears in Task Manager. Stop it → confirm process ends. Test error paths: set invalid exe path → confirm 403 ExeNotAllowedError response.
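
The exception-to-status mapping in item 8 can be expressed as a small lookup table. This is a sketch: the exception classes are redeclared locally so the block runs on its own, http_status_for is a hypothetical helper, and falling back to 500 for unmapped adapter errors is an assumption.

```python
class AdapterError(Exception): ...
class ConfigWriteError(AdapterError): ...
class ConfigValidationError(AdapterError): ...
class LaunchArgsError(AdapterError): ...
class ExeNotAllowedError(AdapterError): ...


# Status codes as listed in the plan.
STATUS_BY_EXCEPTION: dict[type[AdapterError], int] = {
    ConfigWriteError: 500,
    ConfigValidationError: 422,
    LaunchArgsError: 400,
    ExeNotAllowedError: 403,
}


def http_status_for(exc: AdapterError) -> int:
    """Return the HTTP status for an adapter exception, honouring subclasses."""
    for exc_type, status in STATUS_BY_EXCEPTION.items():
        if isinstance(exc, exc_type):
            return status
    return 500  # assumption: unmapped adapter errors surface as 500
```

In FastAPI, a single handler registered with `@app.exception_handler(AdapterError)` could call this function, which keeps the start flow free of per-exception try/except blocks.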

Step 2.3 — Config endpoints (core + adapter validation)

  1. GET /servers/{id}/config — reads all sections from game_configs
  2. GET /servers/{id}/config/{section} — reads single section, response includes _meta with config_version and schema_version
  3. PUT /servers/{id}/config/{section} — validates against adapter's Pydantic model, encrypts sensitive fields via adapter.get_sensitive_fields(section), stores in game_configs
    • Optimistic locking: client must send config_version in request body; if it doesn't match the current row's config_version, return 409 Conflict with CONFIG_VERSION_CONFLICT error code
    • On successful write, increment config_version in the row
  4. GET /servers/{id}/config/preview — delegates to adapter's preview_config(), returns dict[str, str] of label→content
  5. GET /servers/{id}/config/download/{filename} — filename validated against adapter allowlist

Test: Update hostname via API → regenerate and start server → confirm new hostname appears in server browser. Test optimistic locking: two concurrent PUT requests with same config_version → one succeeds (200), one fails (409 Conflict).
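
Optimistic locking of this kind is usually done as a conditional UPDATE. The sketch below uses sqlite3 directly; the column names (section, config_json) and the ConfigVersionConflict exception are assumptions for illustration, since the actual game_configs layout lives in DATABASE.md.

```python
import json
import sqlite3


class ConfigVersionConflict(Exception):
    """Hypothetical; the API layer would map it to 409 CONFIG_VERSION_CONFLICT."""


def update_section(conn: sqlite3.Connection, server_id: int, section: str,
                   data: dict, expected_version: int) -> int:
    """Write the section only if the stored version still matches; return the new version."""
    cur = conn.execute(
        "UPDATE game_configs SET config_json = ?, config_version = config_version + 1 "
        "WHERE server_id = ? AND section = ? AND config_version = ?",
        (json.dumps(data), server_id, section, expected_version),
    )
    conn.commit()
    if cur.rowcount == 0:  # stale version (or the row does not exist)
        raise ConfigVersionConflict(f"{section}: expected version {expected_version}")
    return expected_version + 1


# Tiny in-memory table to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE game_configs "
    "(server_id INTEGER, section TEXT, config_json TEXT, config_version INTEGER)"
)
conn.execute("INSERT INTO game_configs VALUES (1, 'server', '{}', 1)")
```

Because the version check and the increment happen in one statement, two concurrent PUT requests carrying the same config_version cannot both succeed.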


Phase 3 — Background Threads (Core + Adapter)

Goal: Live monitoring — process crash detection, log tailing, metrics.

Step 3.1 — Thread infrastructure

  1. backend/core/threads/base_thread.py — BaseServerThread
  2. backend/core/threads/thread_registry.py — ThreadRegistry (adapter-aware)
  3. Wire start_server_threads() / stop_server_threads() into ServerService.start() / ServerService.stop()

Step 3.2 — Process Monitor Thread (core)

  1. backend/core/threads/process_monitor.py
  2. Crash detection + status update in DB
  3. Auto-restart with exponential backoff (daemon cleanup thread pattern)

Test: Start server → kill process manually → confirm DB status changes to 'crashed'. Test: Enable auto_restart → kill → confirm server restarts automatically.
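
The backoff schedule for auto-restart might be computed as in this sketch. The base delay of 1 second and the 60-second cap are illustrative assumptions; the plan only states that the backoff is exponential.

```python
def restart_delay(attempt: int, base: float = 1.0, cap: float = 60.0) -> float:
    """Delay before restart number `attempt` (1-based): base * 2**(attempt - 1), capped."""
    if attempt < 1:
        raise ValueError("attempt must be >= 1")
    return min(cap, base * 2 ** (attempt - 1))
```

The monitor thread would sleep for restart_delay(n) before the n-th restart and reset n once the server has stayed up for some grace period.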

Step 3.3 — Log Parser (Arma 3 adapter) + Log Tail Thread (core)

  1. backend/adapters/arma3/log_parser.py — RPTParser implementing LogParser protocol
  2. backend/core/threads/log_tail.py — LogTailThread (generic, takes adapter's LogParser)
  3. backend/core/dal/log_repository.py
  4. backend/core/logs/service.py
  5. backend/core/logs/router.py — GET /servers/{id}/logs

Test: Start server → GET /api/servers/{id}/logs returns recent RPT lines.
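
The split between a generic tail loop and an adapter-supplied parser could look like the following sketch. read_new_lines and passthrough_parser are hypothetical names, and the parser is a placeholder: real RPT line parsing is deliberately not shown.

```python
import io
from typing import Callable, Iterator, Optional, TextIO


def read_new_lines(stream: TextIO, parse: Callable[[str], Optional[dict]]) -> Iterator[dict]:
    """Yield parsed entries for complete lines appended since the last call."""
    while True:
        pos = stream.tell()
        line = stream.readline()
        if not line.endswith("\n"):  # EOF or a half-written line: rewind and retry later
            stream.seek(pos)
            return
        entry = parse(line.rstrip("\r\n"))
        if entry is not None:
            yield entry


def passthrough_parser(line: str) -> Optional[dict]:
    """Placeholder for an adapter LogParser; blank lines are dropped."""
    return {"message": line} if line else None
```

A LogTailThread would call read_new_lines() on a timer with the adapter's parser, so the core never needs to know the log format. Rewinding on a partial line avoids emitting half-written entries while the game server is still flushing.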

Step 3.4 — Metrics Collector Thread (core)

  1. backend/core/metrics/service.py
  2. backend/core/dal/metrics_repository.py
  3. backend/core/threads/metrics_collector.py
  4. backend/core/metrics/router.py — GET /servers/{id}/metrics

Test: Running server → query metrics endpoint → see CPU/RAM data points.


Phase 4 — Remote Admin (Arma 3: BattlEye RCon)

Goal: Real-time player list, in-game admin commands via the adapter's RemoteAdmin protocol.

Step 4.1 — RCon Client (Arma 3 adapter)

  1. backend/adapters/arma3/rcon_client.py — BERConClient
  2. Implement BE RCon UDP protocol:
    • Packet structure: 'BE' + CRC32 (little-endian, computed over all following bytes) + 0xFF + type byte + payload
    • Login: type 0x00, payload = password
    • Command: type 0x01, payload = sequence byte + command string
    • Keepalive: a command packet (type 0x01) with a sequence byte and an empty command string, sent at least every 45 seconds
  3. Request multiplexer: track pending requests by sequence byte, route responses to correct caller via threading.Event per request
  4. parse_players_response() — parse players command output
  5. Handle unsolicited server messages (type 0x02, payload = sequence byte + message) and acknowledge each with type 0x02 + the received sequence byte

Test: Connect BERConClient to a running server with BattlEye → successfully login → send players → receive response.
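
Packet framing for the BE RCon protocol can be sketched with the standard library alone. This follows the published BattlEye RCon specification as I understand it, in which a 0xFF byte sits between the checksum and the type byte and the CRC32 covers everything after the checksum; the function names are illustrative.

```python
import struct
import zlib

LOGIN, COMMAND, SERVER_MESSAGE = 0x00, 0x01, 0x02


def build_packet(payload: bytes) -> bytes:
    """'BE' + CRC32 (little-endian) of everything after it + 0xFF + payload."""
    body = b"\xff" + payload
    return b"BE" + struct.pack("<I", zlib.crc32(body) & 0xFFFFFFFF) + body


def login_packet(password: str) -> bytes:
    return build_packet(bytes([LOGIN]) + password.encode("ascii"))


def command_packet(seq: int, command: str = "") -> bytes:
    """A command packet; with an empty command it serves as the keepalive."""
    return build_packet(bytes([COMMAND, seq & 0xFF]) + command.encode("ascii"))


def ack_packet(seq: int) -> bytes:
    """Acknowledge a server message by echoing its sequence byte."""
    return build_packet(bytes([SERVER_MESSAGE, seq & 0xFF]))


def parse_packet(data: bytes) -> tuple[int, bytes]:
    """Validate header and checksum; return (type byte, remaining payload)."""
    if len(data) < 8 or data[:2] != b"BE" or data[6] != 0xFF:
        raise ValueError("not a BattlEye RCon packet")
    (crc,) = struct.unpack("<I", data[2:6])
    if crc != zlib.crc32(data[6:]) & 0xFFFFFFFF:
        raise ValueError("checksum mismatch")
    return data[7], data[8:]
```

For a command response, the first byte of the returned payload is the sequence number, which is what the request multiplexer would use to wake the matching caller.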

Step 4.2 — RCon Service (Arma 3 adapter) + Remote Admin Poller Thread (core)

  1. backend/adapters/arma3/rcon_service.py — Arma3RConService implementing RemoteAdmin protocol
  2. backend/core/threads/remote_admin_poller.py — RemoteAdminPollerThread (generic, takes adapter's RemoteAdmin)
  3. backend/core/dal/player_repository.py
  4. backend/core/players/service.py
  5. backend/core/players/router.py — GET /servers/{id}/players

Test: Players join server → GET /players returns them with pings.

Step 4.3 — Admin Actions via Remote Admin

  1. POST /servers/{id}/players/{slot_id}/kick — delegates to adapter's remote_admin.kick_player()
  2. POST /servers/{id}/players/{slot_id}/ban — delegates to adapter's remote_admin.ban_player()
  3. POST /servers/{id}/remote-admin/command — delegates to adapter's remote_admin.send_command()
  4. POST /servers/{id}/remote-admin/say — delegates to adapter's remote_admin.say_all()
  5. backend/core/dal/ban_repository.py
  6. GET/POST/DELETE /servers/{id}/bans

Step 4.4 — Ban Manager (Arma 3 adapter)

  1. backend/adapters/arma3/ban_manager.py — Arma3BanManager implementing BanManager protocol
  2. bans.txt bidirectional sync: on ban add/delete via API, also write to battleye/bans.txt; on startup, read bans.txt and upsert into DB

Test: Kick a player via API → confirm player disconnected from server.


Phase 5 — WebSocket Real-Time

Goal: Live updates to React frontend without polling. Fully game-agnostic.

Step 5.1 — Broadcast infrastructure

  1. backend/core/websocket/broadcaster.py — BroadcastThread + enqueue()
  2. backend/core/websocket/manager.py — ConnectionManager
  3. Store event loop reference in main.py:on_startup()
  4. Start BroadcastThread in on_startup()
  5. Wire BroadcastThread.enqueue() calls into all background threads
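
The reason for storing the event loop reference is that background threads cannot await WebSocket sends directly. One common way to bridge the two, sketched here under the assumption that ConnectionManager exposes an async broadcast callable, is a queue drained by a thread that schedules coroutines on the loop. The constructor signature and the stop() sentinel are illustrative.

```python
import asyncio
import queue
import threading


class BroadcastThread(threading.Thread):
    """Drains a thread-safe queue and hands each message to the event loop."""

    def __init__(self, loop: asyncio.AbstractEventLoop, send) -> None:
        super().__init__(daemon=True)
        self._loop = loop
        self._send = send  # async callable, e.g. ConnectionManager.broadcast
        self._queue: queue.Queue = queue.Queue()

    def enqueue(self, message: dict) -> None:
        self._queue.put(message)  # safe to call from any thread

    def stop(self) -> None:
        self._queue.put(None)  # sentinel wakes the thread so it can exit

    def run(self) -> None:
        while (message := self._queue.get()) is not None:
            future = asyncio.run_coroutine_threadsafe(self._send(message), self._loop)
            future.result(timeout=5)  # surface send errors in this thread


async def demo() -> list[dict]:
    received: list[dict] = []

    async def fake_send(message: dict) -> None:  # stands in for the WebSocket fan-out
        received.append(message)

    thread = BroadcastThread(asyncio.get_running_loop(), fake_send)
    thread.start()
    thread.enqueue({"type": "status", "server_id": 1, "status": "running"})
    thread.stop()
    await asyncio.to_thread(thread.join)  # keep the loop free while the thread drains
    return received
```

Running `asyncio.run(demo())` shows one message travelling from a plain thread into a coroutine on the loop.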

Step 5.2 — WebSocket endpoint

  1. backend/core/websocket/router.py
  2. JWT validation from query param
  3. Subscribe/unsubscribe message handling
  4. Ping/pong keepalive

Test: Connect to ws://localhost:8000/ws/1?token=... → see live log lines stream in terminal.

Step 5.3 — Integrate all event sources

Wire BroadcastThread.enqueue() into:

  • ProcessMonitorThread → status updates, crash events
  • LogTailThread → log lines
  • MetricsCollectorThread → metrics snapshots
  • RemoteAdminPollerThread → player list updates
  • ServerService.start/stop → status transitions

Test: React frontend connects to WS → server starts → see status, logs, metrics all update in real time.


Phase 6 — Mission & Mod Management (Arma 3 Adapter)

Step 6.1 — Missions

  1. backend/adapters/arma3/mission_manager.py — Arma3MissionManager implementing MissionManager protocol
  2. backend/core/missions/router.py — generic endpoints (delegate to adapter if capability supported)
  3. Upload file validation (extension from adapter's MissionManager.file_extension)
  4. Mission rotation CRUD

Test: Upload a .pbo → appears in GET /missions → set as rotation → start server → mission available.

Step 6.2 — Mods

  1. backend/adapters/arma3/mod_manager.py — Arma3ModManager implementing ModManager protocol
  2. backend/core/mods/router.py — generic endpoints (delegate to adapter if capability supported)
  3. build_mod_args() — assemble -mod= and -serverMod= args
  4. Wire mod args into Arma3ConfigGenerator.build_launch_args()

Test: Register @CBA_A3 → enable on server → start → server loads mod.
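
A minimal sketch of build_mod_args(), assuming mods are passed as folder names and joined with semicolons, which is the separator Arma 3 uses for -mod= and -serverMod=. The exact signature is an assumption.

```python
def build_mod_args(mods: list[str], server_mods: list[str]) -> list[str]:
    """Join mod folders with ';' into -mod= and -serverMod= arguments."""
    args: list[str] = []
    if mods:
        args.append("-mod=" + ";".join(mods))
    if server_mods:
        args.append("-serverMod=" + ";".join(server_mods))
    return args
```

Returning a list rather than a single string lets the process manager pass the arguments to the executable without shell quoting, which matters because `;` is a command separator in most shells.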


Phase 7 — Polish & Production

Step 7.1 — APScheduler jobs

from apscheduler.schedulers.background import BackgroundScheduler

# log_service, metrics_service and player_service are the service instances created at startup
scheduler = BackgroundScheduler()
scheduler.add_job(log_service.cleanup_old_logs, 'cron', hour=3)                    # daily at 03:00
scheduler.add_job(metrics_service.cleanup_old_metrics, 'cron', hour=3, minute=30)  # daily at 03:30
scheduler.add_job(player_service.cleanup_old_history, 'cron', hour=4)              # daily at 04:00
scheduler.start()

Step 7.2 — Startup recovery

In on_startup() → ProcessManager.recover_on_startup():

  • Query DB for servers with status='running'
  • Check if PID still alive (psutil.pid_exists(pid))
  • Validate process name against adapter's get_allowed_executables()
  • If alive: re-attach threads (skip process start, just start monitoring threads)
  • If dead: mark as crashed, clear players

Step 7.3 — Events log

  1. backend/core/dal/event_repository.py
  2. Insert events for: start, stop, crash, kick, ban, config change, mission change
  3. GET /servers/{id}/events endpoint

Step 7.4 — Security hardening

  1. Encrypt sensitive DB fields in game_configs JSON (passwords, rcon_password)
    • backend/core/utils/crypto.py with Fernet
    • LANGUARD_ENCRYPTION_KEY must be a Fernet base64 key
    • Adapter declares sensitive fields: adapter.get_sensitive_fields(section) -> list[str]
    • ConfigRepository handles Fernet encrypt/decrypt transparently: encrypts declared fields on write, decrypts on read
  2. Content-Security-Policy headers for frontend
  3. Penetration testing and security audit

Step 7.5 — Frontend integration checklist

Verify React app can:

  • Login and store JWT
  • See list of supported game types
  • Create server with game type selection
  • List servers with live status (any game type)
  • Start/stop server and see status update via WebSocket
  • View streaming log output (parsed by adapter)
  • See player list update (via adapter's remote admin)
  • See CPU/RAM charts update
  • Edit config sections (dynamic form from adapter's JSON Schema)
  • Upload a mission file (if adapter supports missions)
  • Manage mods (if adapter supports mods)
  • Kick/ban a player (if adapter supports remote admin)
  • Send a message to all players (if adapter supports remote admin)

Phase 8 — Second Adapter (Validation)

Goal: Prove the architecture works by adding a second game adapter. This validates that new games require zero core changes.

Choose a second game (examples):

  • Minecraft Java Edition — Has RCON (Source protocol), server.properties config, JAR executable, world/ directory, plugins/ mods
  • Rust — Has RCON (websocket-based), server.cfg, RustDedicated.exe, oxide/mods
  • Valheim — Has no RCON, start_server.sh config, valheim_server.exe, mods via BepInEx

Steps for a new adapter:

  1. Create backend/adapters/<game_type>/ directory (built-in) or separate Python package (third-party)
  2. Implement required protocols: ConfigGenerator (schema + generation), ProcessConfig, LogParser
  3. Implement optional protocols as needed: RemoteAdmin, MissionManager, ModManager, BanManager
  4. Create adapter class implementing GameAdapter
  5. Register adapter:
    • Built-in: add to backend/adapters/<game_type>/__init__.py and auto-import in adapters/__init__.py
    • Third-party: add languard.adapters entry_point in pyproject.toml:
      [project.entry-points."languard.adapters"]
      mygame = "my_package.adapters:MYGAME_ADAPTER"
      
      Core discovers these via importlib.metadata at startup.
  6. No core code changes needed
  7. No DB migrations needed
  8. Test: create a server with the new game_type, start it, monitor it

Testing Strategy

Unit tests (pytest)

  • GameAdapterRegistry — register, get, list, missing adapter
  • Arma3ConfigGenerator — Pydantic model validation for each section (merged schema + generation)
  • Arma3ConfigGenerator.write_server_cfg() — compare output against expected string; test config injection prevention
  • Arma3ConfigGenerator._escape_config_string() — test double-quote and newline escaping
  • RPTParser.parse_line() — test all log formats
  • BERConClient.parse_players_response() — test with sample output
  • AuthService.login() — correct/wrong password / rate limiting
  • Repository methods — use in-memory SQLite (:memory:)
  • check_server_ports_available() — test derived port validation (via adapter conventions)
  • sanitize_filename() — test path traversal prevention
  • Protocol conformance — verify Arma3Adapter satisfies all GameAdapter protocol methods

Integration tests

  • Full start/stop cycle with a real arma3server.exe (manual — requires licensed Arma 3)
  • WebSocket message delivery (can be automated with fastapi.testclient.TestClient and its websocket_connect(); the httpx client itself has no WebSocket support)
  • RCon command round-trip (manual — requires running server with BattlEye)
  • Adapter resolution: create server with game_type, verify correct adapter is used throughout

Adapter contract tests

  • Template test suite that any new adapter should pass
  • Tests: ConfigGenerator produces valid sections and valid config files, ProcessConfig returns allowed executables, LogParser parses sample lines
  • ConfigGenerator migration test: migrate_config(old_version, config_json) returns valid migrated dict; ConfigMigrationError on invalid old_version
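
The migration contract named in the last bullet could be satisfied by chaining one-step upgrades. The sketch below uses the migrate_config(old_version, config_json) signature from that bullet; the version numbers and the renamed key are hypothetical examples, and ConfigMigrationError is a local stand-in for the adapter exception.

```python
class ConfigMigrationError(Exception):
    """Stands in for the adapter exception of the same name."""


def _v1_to_v2(cfg: dict) -> dict:
    # Hypothetical change: a key renamed between schema versions.
    out = dict(cfg)
    if "max_players" in out:
        out["maxPlayers"] = out.pop("max_players")
    return out


MIGRATIONS = {1: _v1_to_v2}  # maps a version to the step that upgrades it by one
CURRENT_VERSION = 2


def migrate_config(old_version: int, config_json: dict) -> dict:
    """Apply each one-step migration from old_version up to CURRENT_VERSION."""
    if not 1 <= old_version <= CURRENT_VERSION:
        raise ConfigMigrationError(f"unknown config version {old_version}")
    cfg = dict(config_json)
    for version in range(old_version, CURRENT_VERSION):
        cfg = MIGRATIONS[version](cfg)
    return cfg
```

A contract test can then assert that every version from 1 to CURRENT_VERSION migrates to a dict the current Pydantic model accepts, and that out-of-range versions raise ConfigMigrationError.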

Load notes

  • SQLite in WAL mode lets readers proceed alongside a single writer, which should cover concurrent reads from 4 threads per server
  • For >10 simultaneous servers, consider tuning the connection pool size
  • WebSocket broadcast is expected to handle ~100 concurrent connections without issue

Environment Setup (Developer)

# 1. Clone repo
git clone <repo>
cd languard-servers-manager

# 2. Backend
cd backend
python -m venv venv
source venv/bin/activate   # or venv\Scripts\activate on Windows
pip install -r requirements.txt

# 3. Environment
cp .env.example .env
# Edit .env: set game-specific paths (LANGUARD_ARMA3_DEFAULT_EXE, etc.)

# 4. Run backend
uvicorn main:app --reload --host 0.0.0.0 --port 8000

# 5. Frontend (separate)
cd ../frontend
npm install
npm run dev

Backend auto-creates languard.db, seeds an admin user on first run, and registers the Arma 3 adapter automatically.


Phase Summary

| Phase | Deliverable | Key Change from Single-Game |
|-------|-------------|-----------------------------|
| 0 | Adapter framework (protocols + exceptions + registry) | NEW — foundation for modularity |
| 1 | Foundation (auth + server CRUD + game discovery + migration) | Core tables, game_type field, game_configs JSON, migration from old schema |
| 2 | Arma 3 adapter: config gen + process mgmt | Config generation in adapter, atomic writes, typed exceptions, optimistic locking |
| 3 | Background threads (core + adapter injection) | Generic threads + adapter parsers/clients, per-server lock for RemoteAdmin |
| 4 | Remote admin (Arma 3: BattlEye RCon) | RCon in adapter, generic poller in core |
| 5 | WebSocket real-time | No change — fully game-agnostic |
| 6 | Mission + mod management (Arma 3 adapter) | In adapter, generic endpoints in core |
| 7 | Polish, security, recovery | Adapter-declared sensitive fields, Fernet encryption |
| 8 | Second game adapter | NEW — validates zero core changes, entry_points for third-party |

Implement phases in order — each phase builds on the previous and is independently testable. Phase 0 must come first as it defines the contract that all subsequent code depends on.