# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

## Common Commands

```bash
# Run all tests (no network required)
python test_suite.py

# Check Python version and dependencies
python check_deps.py

# Full pipeline (parse → compare → fetch → link)
python run.py

# Parse + compare only (no download, no linking)
python run.py --skip-fetch --skip-link

# Diagnose mod folder name / steam_id issues
python check_names.py
python check_names.py --fix --fix-ids

# Launch the GUI
python gui.py
```

There is no build step, linter config, or package install beyond `pip install -r requirements.txt`.

## Architecture

### Package vs CLI layer

`arma_modlist_tools/` is a pure library: no I/O side effects, no `sys.exit`, no `print`. All CLI scripts (`run.py`, `fetch_mods.py`, `link_mods.py`, etc.) sit at the project root and call into the package. New functionality goes in the package first, then a CLI script wraps it.

### Data flow

```
modlist_html/*.html
  └─ parser.parse_modlist_dir()
       └─ compare.compare_presets()
            └─ comparison.json  ←─ source of truth for groups + mod identity
                 ├─ fetcher.build_server_index()  ←─ Caddy JSON API
                 │    └─ fetcher.find_mod_folder()  (steam_id first, name fallback)
                 │         └─ downloads/{group}/@ModName/
                 │              └─ linker.link_group()
                 │                   └─ arma_dir/@ModName  (junction/symlink)
                 └─ reporter.build_missing_report() → missing_report.json
```

### Group naming convention

- `"shared"`: mods present in **all** compared presets
- `""`: mods unique to one preset (key from `comparison["unique"]`)

This group label is stored in `missing_report.json` per-mod so `sync_missing.py` knows where to place newly available mods without re-reading `comparison.json`.

### Server index structure

`build_server_index()` returns:

```python
{
    "by_steam_id": {"450814997": "https://server/@cba_a3/"},  # primary lookup
    "by_name":     {"cbaa3": "https://server/@cba_a3/"},      # normalized fallback
    "folders":     [...]                                       # raw Caddy listing
}
```

`_normalize_name` strips `@`, lowercases, and removes all non-alphanumeric characters: `"@CBA_A3"` → `"cbaa3"`. It is used in both the index builder and every lookup.

### Junction / symlink critical rules

**Detection:** `os.path.islink()` returns `False` for Windows junctions. Always use `_is_junction()` from `linker.py`, which checks `st_file_attributes & 0x400` (`FILE_ATTRIBUTE_REPARSE_POINT`) on Windows.

**Removal:** Use `os.rmdir()` on Windows and `os.unlink()` on Linux. **Never** call `shutil.rmtree()`: it follows the junction and deletes the target mod files.

**Creation:** `cmd /c mklink /J ` on Windows, `os.symlink()` on Linux.

### check_names.py classification (two-pass)

Pass 1 collects raw `(server_name, local_steam_id)` for every disk folder. Pass 2 builds `ok_disk_names`, the set of disk names that already match the server exactly. Any MISMATCH whose proposed server name is in `ok_disk_names` is reclassified as `ID_COLLISION` (the local `meta.cpp` has a wrong `publishedid` that belongs to a different mod). This prevents false rename suggestions caused by shared/duplicate steam IDs on the server.

`--fix-ids` corrects `meta.cpp` using steam IDs from `comparison.json` (sourced from Steam Workshop URLs in the HTML presets) as the authoritative source.

### GUI package

`gui/` is a CustomTkinter desktop application wrapping the CLI toolchain. Entry point is `gui.py` at the project root, which calls `gui.run_app()`.

**Key files:**

- `gui/__init__.py`: sets dark theme + blue color scheme; exports `run_app()`
- `gui/app.py`: `ArmaModManagerApp` main window; manages view routing, config loading, thread-safe log queue, and background pipeline execution
- `gui/wizard.py`: `SetupWizard` dialog shown on first launch when no `config.json` exists
- `gui/_constants.py`: window dimensions, status color constants, file paths
- `gui/_io.py`: `_QueueWriter` redirects stdout/stderr to a thread-safe queue so pipeline output streams into the Logs view.
  `write()` strips ANSI/CSI escape codes and converts bare `\r` to `\n` before enqueuing, so `tqdm` progress output is legible in the textbox.

**Views** (`gui/views/`): each inherits `BaseView`; `build()` runs once on creation, `refresh()` runs on each navigation:

- `dashboard.py`: overview, status, quick stats
- `mods.py`: browse and manage downloaded mods by group
- `tools.py`: link/unlink, rename folders, sync missing mods, check server
- `logs.py`: real-time log viewer fed from the stdout/stderr queue
- `settings.py`: in-app editor for `config.json` (server URL, paths, credentials)

**`_find_folder` (mods.py), four-level name matching:** The mods view resolves a mod's local folder by mod name from `comparison.json`, which may differ from the server-canonical folder name used by the fetcher. Lookup order:

1. Exact: `@{mod_name}`
2. Case-insensitive: `@CBA_A3` matches `CBA_A3`
3. Normalized (`_normalize_name`): strips all non-alphanumeric characters, which handles punctuation/spacing differences, e.g. `@US GEAr- Units (IFA3)` matches `US GEAr: Units (IFA3)` (both → `usgearunitsifa3`)
4. Steam ID via `meta.cpp`: reads `publishedid` from each folder's `meta.cpp` and matches against `mod["steam_id"]`, which handles the case where the folder name bears no resemblance to the modlist name but the mod content is correct

**`selection.json`:** GUI selection state file, tracked in git. Persists which mods/groups are selected between GUI sessions. Written by the GUI; safe to delete (the GUI recreates it on next save).

**`run_tool` subprocess streaming:** Tool scripts are launched via `subprocess.Popen` (not `subprocess.run`) with `stdout=PIPE, stderr=STDOUT`, read line-by-line via `iter(proc.stdout.readline, "")`, and posted to the log queue immediately. Python's own output buffering is disabled with the `-u` flag and `PYTHONUNBUFFERED=1` in the environment; without these, output would batch inside the pipe and only appear when the script exits.
The `Popen` call uses `encoding="utf-8", errors="replace"` and sets `PYTHONUTF8=1` in the child environment so that tqdm's Unicode block characters (e.g. `▉`) don't crash the pipe reader on Windows, where the default `charmap` codec cannot decode them.

**GUI threading model:** Every network or long-running operation runs in a `threading.Thread(daemon=True)` so the Tkinter event loop is never blocked. The only safe way to update widgets from a background thread is `self.after(0, callback)`; never touch widgets directly from a worker thread.

`_poll_log` drains the entire log queue in one `after(80, ...)` tick and does a single batched `CTkTextbox.insert()` call rather than one per log entry, keeping the UI smooth even when `tqdm` emits many rapid updates during downloads.

The wizard's "Test Connection" button follows the same pattern: `requests.get` runs in a daemon thread; the result is posted back via `self.after(0, ...)` with widget references captured *before* the thread starts, so stale references cannot update the wrong widgets if the user navigates away mid-request.

**`run_pipeline` worker, import guard:** `from run import step_fetch, step_link` is performed inside its own `try/except` *before* stdout is redirected. If this import fails for any reason, the exception is posted to the log via `self.after(0, ...)` and `_pipeline_done` is called so the UI resets cleanly. Previously an import failure would silently kill the worker thread and leave the pipeline button disabled forever.

**`build_server_index` progress callback:** Accepts an optional `progress_fn(current, total, name)` callback. `step_fetch` in `run.py` uses this to print `Indexing N/M: @FolderName` every 25 folders so the log never goes silent during the server scan phase. The library itself never calls `print`; the caller owns the I/O.
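
The callback contract described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's code: `index_folders` is a hypothetical stand-in for `build_server_index` that only walks a list of folder names, and `make_throttled_printer` is an assumed helper showing how a caller such as `step_fetch` might report every 25 folders. Emitting a final line at `current == total` is a choice made here, not something the source specifies.

```python
from typing import Callable, Optional

# Signature taken from the text: progress_fn(current, total, name)
ProgressFn = Callable[[int, int, str], None]


def index_folders(names: list, progress_fn: Optional[ProgressFn] = None) -> dict:
    """Hypothetical stand-in for build_server_index(): never prints."""
    index = {}
    total = len(names)
    for current, name in enumerate(names, start=1):
        # Placeholder indexing work; the real function queries the server.
        index[name.lstrip("@").lower()] = name
        if progress_fn is not None:
            progress_fn(current, total, name)
    return index


def make_throttled_printer(every: int = 25,
                           emit: Callable[[str], None] = print) -> ProgressFn:
    """Build a caller-side callback that reports every `every` folders."""
    def report(current: int, total: int, name: str) -> None:
        # Also report the last folder so the log shows completion (assumption).
        if current % every == 0 or current == total:
            emit(f"Indexing {current}/{total}: {name}")
    return report
```

A CLI caller would pass `make_throttled_printer()` to print to the console, while a test or the GUI could pass a different `emit` (for example `some_list.append`) without the library function changing.
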
### `update_mods.py`: orphan file removal

After downloading updated files, `update_mods.py` compares every file in the local mod folder against the server's file list and **deletes any local files that no longer exist on the server**. This prevents stale `.pbo` or `.bisign` files from accumulating when a mod's content changes upstream. Each removed file is logged as `[-] orphan removed: ` and the final summary line includes an orphan count. The orphan check runs even when no files need downloading (e.g. timestamps match but the local folder has extras).

### GUI localization (`gui/locales.py`)

All user-facing strings are centralised in `gui/locales.py`. Two languages are supported: English (`"en"`) and Vietnamese (`"vi"`).

**API:**

```python
from gui.locales import t, set_language, get_language

t("nav.dashboard")                          # → "Dashboard" or "Tổng quan"
t("dashboard.stats", total=42, shared=10)   # → "42 mods · 10 shared"
set_language("vi")                          # switch active language
get_language()                              # → "vi"
```

**Key naming:** flat dot-notation (`"."`), e.g. `"dashboard.run_btn"`, `"wizard.step1_title"`, `"tools.cn_warn"`.

**Dynamic strings** use `str.format_map` with keyword args. The dict value contains `{placeholder}` and the caller passes `t("key", placeholder=value)`.

**Hot-swap:** `app.switch_language(lang)` calls `set_language()`, saves the preference to `config.json` under `"ui": {"language": "..."}`, retranslates sidebar nav buttons, then calls `view.refresh()` on every cached view. Views that build all content in `refresh()` (Settings, Mods) update automatically. Views with static `build()`-time widgets (Dashboard, Logs, Tools) store widget references and retranslate them at the top of `refresh()`.

**Constraints:**

- `CTkTabview` tab names in `tools.py` are kept in English: they double as frame lookup keys (`tv.tab("Check Names")`) and cannot be renamed after creation.
- Segmented button values in `tools.py` (`"Status"`, `"Link"`, `"Unlink"`) are kept in English: they drive the logic in `_lm_on_change()`.
- `_VIEW_NAMES` routing keys (`"Dashboard"`, `"Mods"`, etc.) are kept in English: they are `_view_cache` dict keys.

**Adding a new string:** Add the key to both `_EN` and `_VI` dicts in `locales.py`. The `assert set(_EN.keys()) == set(_VI.keys())` guard at module load will catch any mismatch.

## Python Version Compatibility

Minimum is Python **3.9**. All files that use `X | Y` union type annotations **must** have `from __future__ import annotations` as the first import. Without it, the `|` syntax raises `TypeError` at runtime on Python < 3.10. Every module in `arma_modlist_tools/` already has it; any new CLI script you add must include it too.

### `fix_console_encoding`: `None` stdout guard

When the GUI is launched via `pythonw.exe` (no console window), Python sets `sys.stdout` and `sys.stderr` to `None`. `fix_console_encoding()` must check `if sys.stdout is None or sys.stderr is None: return` **before** accessing `.encoding`, otherwise it raises `AttributeError: 'NoneType' object has no attribute 'encoding'`. This error surfaces in the GUI as *"Failed to load pipeline"* because `run.py` calls `fix_console_encoding()` at module level and the exception is caught by the pipeline import guard.

## Test Suite

`test_suite.py` uses a custom harness (no pytest/unittest dependency). Structure:

```python
group("section name")            # prints header
test("description", callable)    # runs fn, catches exceptions, tracks pass/fail
skip("description", "reason")    # marks skipped
```

Tests that exercise the linker use `tempfile.TemporaryDirectory()`, never the real `arma_dir`. Tests that would require network calls mock `list_mod_files` with `unittest.mock.patch`.
## Key Files Not in Git

- `config.json`: credentials + paths (copy from `config.template.json`)
- `downloads/`: downloaded mod files, can be several GB
- `modlist_json/`: generated JSON output

The `.html` preset files in `modlist_html/` **are** tracked as example inputs.