arma-modlist-tools/CLAUDE.md
Tran G. (Revernomad) Khoa ecfa5fa636 fix: match mod folder by steam_id when folder name diverges from modlist name
_find_folder in mods.py now has a fourth fallback: reads publishedid from
meta.cpp inside each candidate folder and matches against mod["steam_id"].
Fixes mods appearing as "not downloaded" when the folder name on disk differs
from the name in the modlist but the mod content (meta.cpp) is correct.

Also adds 8 tests covering all four match strategies and edge cases.
2026-04-14 14:37:32 +07:00


# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Common Commands
```bash
# Run all tests (no network required)
python test_suite.py
# Check Python version and dependencies
python check_deps.py
# Full pipeline (parse → compare → fetch → link)
python run.py
# Parse + compare only (no download, no linking)
python run.py --skip-fetch --skip-link
# Diagnose mod folder name / steam_id issues
python check_names.py
python check_names.py --fix --fix-ids
# Launch the GUI
python gui.py
```
There is no build step, linter config, or package install beyond `pip install -r requirements.txt`.
## Architecture
### Package vs CLI layer
`arma_modlist_tools/` is a pure library — no I/O side effects, no `sys.exit`, no `print`. All CLI scripts (`run.py`, `fetch_mods.py`, `link_mods.py`, etc.) sit at the project root and call into the package. New functionality goes in the package first, then a CLI script wraps it.
### Data flow
```
modlist_html/*.html
└─ parser.parse_modlist_dir()
└─ compare.compare_presets()
└─ comparison.json ←─ source of truth for groups + mod identity
├─ fetcher.build_server_index() ←─ Caddy JSON API
│ └─ fetcher.find_mod_folder() (steam_id first, name fallback)
│ └─ downloads/{group}/@ModName/
│ └─ linker.link_group()
│ └─ arma_dir/@ModName (junction/symlink)
└─ reporter.build_missing_report() → missing_report.json
```
### Group naming convention
- `"shared"` — mods present in **all** compared presets
- `"<preset_name>"` — mods unique to one preset (key from `comparison["unique"]`)
This group label is stored in `missing_report.json` per-mod so `sync_missing.py` knows where to place newly available mods without re-reading `comparison.json`.
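A minimal sketch of how a group label could be derived from `comparison.json`. Only the `"unique"` key and the `steam_id` field are named in this document; the `"shared"` key, the list-of-dicts layout, and the helper name `group_for_mod` are assumptions made for illustration.

```python
from typing import Optional


def group_for_mod(comparison: dict, steam_id: str) -> Optional[str]:
    """Return "shared", a preset name, or None for an unknown mod."""
    # Assumed layout: comparison["shared"] is a list of mod dicts.
    for mod in comparison.get("shared", []):
        if mod.get("steam_id") == steam_id:
            return "shared"
    # comparison["unique"] maps preset name -> mods unique to that preset.
    for preset_name, mods in comparison.get("unique", {}).items():
        for mod in mods:
            if mod.get("steam_id") == steam_id:
                return preset_name
    return None
```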
### Server index structure
`build_server_index()` returns:
```python
{
"by_steam_id": {"450814997": "https://server/@cba_a3/"}, # primary lookup
"by_name": {"cbaa3": "https://server/@cba_a3/"}, # normalized fallback
"folders": [...] # raw Caddy listing
}
```
`_normalize_name` strips `@`, lowercases, and removes all non-alphanumeric characters: `"@CBA_A3"` → `"cbaa3"`. It is used in both the index builder and every lookup.
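The normalization rule and the steam_id-first lookup can be sketched as follows. This is not the package's code: `normalize_name_sketch` and `lookup_folder_url` are hypothetical names, and treating "alphanumeric" as ASCII `a-z0-9` is an assumption.

```python
import re
from typing import Optional


def normalize_name_sketch(name: str) -> str:
    """Lowercase, then drop everything that is not a-z or 0-9 (this also removes '@')."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def lookup_folder_url(index: dict, steam_id: Optional[str], name: str) -> Optional[str]:
    """Primary lookup by steam_id, falling back to the normalized name."""
    if steam_id and steam_id in index["by_steam_id"]:
        return index["by_steam_id"][steam_id]
    return index["by_name"].get(normalize_name_sketch(name))
```

Because the same function feeds both the index keys and the lookup, punctuation or case differences between the preset and the server cancel out.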
### Junction / symlink critical rules
**Detection:** `os.path.islink()` returns `False` for Windows junctions. Always use `_is_junction()` from `linker.py`, which checks `st_file_attributes & 0x400` (`FILE_ATTRIBUTE_REPARSE_POINT`) on Windows.
**Removal:** Use `os.rmdir()` on Windows and `os.unlink()` on Linux. **Never** `shutil.rmtree()` — it follows the junction and deletes the target mod files.
**Creation:** `cmd /c mklink /J <link> <target>` on Windows, `os.symlink()` on Linux.
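The detection and removal rules above might look roughly like this. It is a sketch under assumptions, not the contents of `linker.py`: the function names are hypothetical, and only the `0x400` attribute check and the `os.rmdir` / `os.unlink` split come from this document.

```python
import os
import sys

FILE_ATTRIBUTE_REPARSE_POINT = 0x400  # Windows reparse-point flag


def is_link_or_junction(path: str) -> bool:
    """True for symlinks everywhere and for junctions on Windows."""
    if os.path.islink(path):
        return True
    if sys.platform == "win32":
        try:
            attrs = os.lstat(path).st_file_attributes
        except OSError:
            return False
        return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)
    return False


def remove_link(path: str) -> None:
    """Remove only the link itself; the target directory is left untouched."""
    if not is_link_or_junction(path):
        raise ValueError(f"refusing to remove non-link path: {path}")
    if sys.platform == "win32":
        os.rmdir(path)   # removes the junction entry, not the target's files
    else:
        os.unlink(path)  # removes the symlink, not the target
```

Refusing to act on anything that is not a link is a deliberate guard in this sketch: it keeps a mistaken call from ever touching a real mod folder.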
### check_names.py classification (two-pass)
Pass 1 collects raw `(server_name, local_steam_id)` for every disk folder.
Pass 2 builds `ok_disk_names` — the set of disk names that already match the server exactly. Any MISMATCH whose proposed server name is in `ok_disk_names` is reclassified as `ID_COLLISION` (the local `meta.cpp` has a wrong `publishedid` that belongs to a different mod). This prevents false rename suggestions caused by shared/duplicate steam IDs on the server.
`--fix-ids` corrects `meta.cpp` using steam IDs from `comparison.json` (sourced from Steam Workshop URLs in the HTML presets) as the authoritative source.
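The two-pass classification can be illustrated with plain dictionaries. The inputs, the function name, and the `UNKNOWN` label are assumptions for this sketch; only `MISMATCH`, `ID_COLLISION`, and `ok_disk_names` appear in this document.

```python
from typing import Dict, Optional


def classify_folders(disk_ids: Dict[str, str], server_by_id: Dict[str, str]) -> Dict[str, str]:
    """disk_ids: disk folder -> local publishedid; server_by_id: steam_id -> server folder."""
    # Pass 1: the server name each disk folder's steam_id points at.
    proposed: Dict[str, Optional[str]] = {
        disk: server_by_id.get(steam_id) for disk, steam_id in disk_ids.items()
    }
    # Pass 2: disk names that already match the server exactly.
    ok_disk_names = {disk for disk, server in proposed.items() if server == disk}

    result: Dict[str, str] = {}
    for disk, server in proposed.items():
        if server is None:
            result[disk] = "UNKNOWN"        # hypothetical label: id not on server
        elif server == disk:
            result[disk] = "OK"
        elif server in ok_disk_names:
            result[disk] = "ID_COLLISION"   # rename target is already taken
        else:
            result[disk] = "MISMATCH"       # safe rename suggestion
    return result
```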
### GUI package
`gui/` is a CustomTkinter desktop application wrapping the CLI toolchain. Entry point is `gui.py` at the project root, which calls `gui.run_app()`.
**Key files:**
- `gui/__init__.py` — sets dark theme + blue color scheme; exports `run_app()`
- `gui/app.py` — `ArmaModManagerApp` main window; manages view routing, config loading, thread-safe log queue, and background pipeline execution
- `gui/wizard.py` — `SetupWizard` dialog shown on first launch when no `config.json` exists
- `gui/_constants.py` — window dimensions, status color constants, file paths
- `gui/_io.py` — `_QueueWriter` redirects stdout/stderr to a thread-safe queue so pipeline output streams into the Logs view. `write()` strips ANSI/CSI escape codes and converts bare `\r` to `\n` before enqueuing, so `tqdm` progress output is legible in the textbox.
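
The `write()` behaviour described for `_QueueWriter` could be approximated as below. The class name `QueueWriterSketch` and the exact regular expressions are assumptions; the real class may differ in detail.

```python
import queue
import re

# CSI sequence: ESC [ parameter bytes, intermediate bytes, one final byte.
_CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# A carriage return that is not part of a CRLF pair.
_BARE_CR_RE = re.compile(r"\r(?!\n)")


class QueueWriterSketch:
    """File-like object that forwards cleaned text to a thread-safe queue."""

    def __init__(self, log_queue: "queue.Queue[str]") -> None:
        self._queue = log_queue

    def write(self, text: str) -> int:
        cleaned = _CSI_RE.sub("", text)
        cleaned = _BARE_CR_RE.sub("\n", cleaned)
        if cleaned:
            self._queue.put(cleaned)
        return len(text)

    def flush(self) -> None:
        # Nothing is buffered, but callers such as print() expect flush() to exist.
        pass
```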
**Views** (`gui/views/`): each inherits `BaseView`; `build()` runs once on creation, `refresh()` runs on each navigation:
- `dashboard.py` — overview, status, quick stats
- `mods.py` — browse and manage downloaded mods by group
- `tools.py` — link/unlink, rename folders, sync missing mods, check server
- `logs.py` — real-time log viewer fed from the stdout/stderr queue
- `settings.py` — in-app editor for `config.json` (server URL, paths, credentials)
**`_find_folder` (mods.py) — four-level name matching:** The mods view resolves a mod's local folder by mod name from `comparison.json`, which may differ from the server-canonical folder name used by the fetcher. Lookup order:
1. Exact: `@{mod_name}`
2. Case-insensitive: `@cba_a3` matches `CBA_A3`
3. Normalized (`_normalize_name`): strips all non-alphanumeric — handles punctuation/spacing differences, e.g. `@US GEAr- Units (IFA3)` matches `US GEAr: Units (IFA3)` (both → `usgearunitsifa3`)
4. Steam ID via `meta.cpp`: reads `publishedid` from each folder's `meta.cpp` and matches against `mod["steam_id"]` — handles the case where the folder name bears no resemblance to the modlist name but the mod content is correct
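
The four-level lookup above can be sketched as a single function. This is an illustration, not the code in `mods.py`: the `mod["name"]` key, the `publishedid = <digits>` pattern assumed for `meta.cpp`, and all function names are hypothetical.

```python
import os
import re
from typing import Optional

_PUBLISHED_ID_RE = re.compile(r"publishedid\s*=\s*(\d+)", re.IGNORECASE)


def _norm(name: str) -> str:
    return re.sub(r"[^a-z0-9]", "", name.lower())


def read_published_id(folder: str) -> Optional[str]:
    """Return the publishedid found in folder/meta.cpp, or None."""
    try:
        with open(os.path.join(folder, "meta.cpp"), encoding="utf-8", errors="replace") as fh:
            match = _PUBLISHED_ID_RE.search(fh.read())
    except OSError:
        return None
    return match.group(1) if match else None


def find_folder_sketch(group_dir: str, mod: dict) -> Optional[str]:
    """Try exact, case-insensitive, normalized, then steam_id matching."""
    try:
        candidates = sorted(
            entry for entry in os.listdir(group_dir)
            if os.path.isdir(os.path.join(group_dir, entry))
        )
    except OSError:
        return None
    wanted = "@" + mod["name"]
    if wanted in candidates:                      # 1. exact
        return wanted
    for entry in candidates:                      # 2. case-insensitive
        if entry.lower() == wanted.lower():
            return entry
    for entry in candidates:                      # 3. normalized
        if _norm(entry) == _norm(wanted):
            return entry
    steam_id = str(mod.get("steam_id") or "")
    if steam_id:                                  # 4. steam_id via meta.cpp
        for entry in candidates:
            if read_published_id(os.path.join(group_dir, entry)) == steam_id:
                return entry
    return None
```

Ordering the cheap string comparisons first means `meta.cpp` files are only opened when every name-based strategy has failed.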
**`selection.json`** — GUI selection state file, tracked in git. Persists which mods/groups are selected between GUI sessions. Written by the GUI; safe to delete (GUI recreates it on next save).
**`run_tool` subprocess streaming:** Tool scripts are launched via `subprocess.Popen` (not `subprocess.run`) with `stdout=PIPE, stderr=STDOUT`, read line-by-line via `iter(proc.stdout.readline, "")`, and posted to the log queue immediately. Python's own output buffering is disabled with the `-u` flag and `PYTHONUNBUFFERED=1` in the environment — without these, output would batch inside the pipe and only appear when the script exits. The `Popen` call uses `encoding="utf-8", errors="replace"` and sets `PYTHONUTF8=1` in the child environment so that tqdm's Unicode block characters (e.g. `▉`) don't crash the pipe reader on Windows, where the default `charmap` codec cannot decode them.
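A stripped-down version of this streaming pattern, with the log queue replaced by a plain callback, might look like the following. `stream_tool` and `on_line` are hypothetical names; the `Popen` arguments and environment variables are the ones described above.

```python
import os
import subprocess
import sys
from typing import Callable, List


def stream_tool(args: List[str], on_line: Callable[[str], None]) -> int:
    """Run a Python child process and hand each output line to on_line as it arrives."""
    env = dict(os.environ, PYTHONUNBUFFERED="1", PYTHONUTF8="1")
    proc = subprocess.Popen(
        [sys.executable, "-u", *args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,   # merge stderr into the same stream
        encoding="utf-8",
        errors="replace",           # undecodable bytes never crash the reader
        env=env,
    )
    assert proc.stdout is not None
    # readline() returns "" only at EOF, which ends the iteration.
    for line in iter(proc.stdout.readline, ""):
        on_line(line)
    proc.stdout.close()
    return proc.wait()
```

In the GUI the callback would post to the log queue rather than collect lines in memory.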
**GUI threading model:** Every network or long-running operation runs in a `threading.Thread(daemon=True)` so the Tkinter event loop is never blocked. The only safe way to update widgets from a background thread is `self.after(0, callback)` — never touch widgets directly from a worker thread. `_poll_log` drains the entire log queue in one `after(80, ...)` tick and does a single batched `CTkTextbox.insert()` call rather than one per log entry, keeping the UI smooth even when `tqdm` emits many rapid updates during downloads. The wizard's "Test Connection" button follows the same pattern: `requests.get` runs in a daemon thread; the result is posted back via `self.after(0, ...)` with widget references captured *before* the thread starts, so stale references cannot update the wrong widgets if the user navigates away mid-request.
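The batching idea behind `_poll_log` can be shown without any Tk dependency. The helper name `drain_log_queue` is an assumption; the comments indicate where the described `after(80, ...)` rescheduling and the single `insert()` call would sit.

```python
import queue


def drain_log_queue(log_queue: "queue.Queue[str]") -> str:
    """Empty the queue without blocking and return everything as one string."""
    chunks = []
    while True:
        try:
            chunks.append(log_queue.get_nowait())
        except queue.Empty:
            break
    return "".join(chunks)


# Inside the main window (Tk thread only), the polling method would roughly be:
#
#     def _poll_log(self):
#         text = drain_log_queue(self._log_queue)
#         if text:
#             self._textbox.insert("end", text)   # one insert per tick
#         self.after(80, self._poll_log)          # reschedule on the event loop
```

Worker threads only ever call `put()` on the queue, so no widget is touched off the main thread.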
**`run_pipeline` worker — import guard:** `from run import step_fetch, step_link` is performed inside its own `try/except` *before* stdout is redirected. If this import fails for any reason the exception is posted to the log via `self.after(0, ...)` and `_pipeline_done` is called so the UI resets cleanly. Previously an import failure would silently kill the worker thread and leave the pipeline button disabled forever.
**`build_server_index` progress callback:** Accepts an optional `progress_fn(current, total, name)` callback. `step_fetch` in `run.py` uses this to print `Indexing N/M: @FolderName` every 25 folders so the log never goes silent during the server scan phase. The library itself never calls `print` — the caller owns the I/O.
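The split between a silent library function and a printing caller can be sketched as follows. `index_folders` is a hypothetical stand-in for the indexing loop; only the `progress_fn(current, total, name)` signature and the every-25-folders message come from this document.

```python
from typing import Callable, Dict, Iterable, Optional

ProgressFn = Callable[[int, int, str], None]


def index_folders(folders: Iterable[str], progress_fn: Optional[ProgressFn] = None) -> Dict[str, str]:
    """Library side: builds a lowercase-name index and never prints."""
    names = list(folders)
    index: Dict[str, str] = {}
    for current, name in enumerate(names, start=1):
        index[name.lstrip("@").lower()] = name
        if progress_fn is not None:
            progress_fn(current, len(names), name)
    return index


def print_every_25(current: int, total: int, name: str) -> None:
    """Caller side: owns the I/O and throttles it to every 25th folder."""
    if current % 25 == 0:
        print(f"Indexing {current}/{total}: {name}")
```

A CLI script would pass `print_every_25`; a test can pass a list's `append`-style collector instead and assert on the calls.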
### `update_mods.py` — orphan file removal
After downloading updated files, `update_mods.py` compares every file in the local mod folder against the server's file list and **deletes any local files that no longer exist on the server**. This prevents stale `.pbo` or `.bisign` files from accumulating when a mod's content changes upstream. Each removed file is logged as `[-] orphan removed: <rel_path>` and the final summary line includes an orphan count. The orphan check runs even when no files need downloading (e.g. timestamps match but the local folder has extras).
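A minimal sketch of the orphan check, assuming the server's file list is available as relative paths with forward slashes. `remove_orphans` is a hypothetical name, and the comparison here is case-sensitive, which may not match the real script's behaviour on Windows.

```python
import os
from typing import Iterable, List


def remove_orphans(local_dir: str, server_rel_paths: Iterable[str]) -> List[str]:
    """Delete local files absent from the server list; return their relative paths."""
    keep = {path.replace("\\", "/") for path in server_rel_paths}
    removed: List[str] = []
    for root, _dirs, files in os.walk(local_dir):
        for filename in files:
            full_path = os.path.join(root, filename)
            rel_path = os.path.relpath(full_path, local_dir).replace(os.sep, "/")
            if rel_path not in keep:
                os.remove(full_path)
                removed.append(rel_path)
    return sorted(removed)
```

The caller can then log one `[-] orphan removed: <rel_path>` line per returned entry and use `len(removed)` for the summary count.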
### GUI localization (`gui/locales.py`)
All user-facing strings are centralised in `gui/locales.py`. Two languages are supported: English (`"en"`) and Vietnamese (`"vi"`).
**API:**
```python
from gui.locales import t, set_language, get_language
t("nav.dashboard") # → "Dashboard" or "Tổng quan"
t("dashboard.stats", total=42, shared=10) # → "42 mods · 10 shared"
set_language("vi") # switch active language
get_language() # → "vi"
```
**Key naming:** flat dot-notation — `"<view>.<widget_purpose>"`, e.g. `"dashboard.run_btn"`, `"wizard.step1_title"`, `"tools.cn_warn"`.
**Dynamic strings** use `str.format_map` with keyword args. The dict value contains `{placeholder}` and the caller passes `t("key", placeholder=value)`.
**Hot-swap:** `app.switch_language(lang)` calls `set_language()`, saves the preference to `config.json` under `"ui": {"language": "..."}`, retranslates sidebar nav buttons, then calls `view.refresh()` on every cached view. Views that build all content in `refresh()` (Settings, Mods) update automatically. Views with static `build()`-time widgets (Dashboard, Logs, Tools) store widget references and retranslate them at the top of `refresh()`.
**Constraints:**
- `CTkTabview` tab names in `tools.py` are kept in English — they double as frame lookup keys (`tv.tab("Check Names")`) and cannot be renamed after creation.
- Segmented button values in `tools.py` (`"Status"`, `"Link"`, `"Unlink"`) are kept in English — they drive the logic in `_lm_on_change()`.
- `_VIEW_NAMES` routing keys (`"Dashboard"`, `"Mods"`, etc.) are kept in English — they are `_view_cache` dict keys.
**Adding a new string:** Add the key to both `_EN` and `_VI` dicts in `locales.py`. The `assert set(_EN.keys()) == set(_VI.keys())` guard at module load will catch any mismatch.
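A compact sketch of how such a locale module could be organised, including the key-parity guard. The two-entry dictionaries are illustrative only: the Vietnamese `dashboard.stats` text is a placeholder, and falling back to the key itself for an unknown key is an assumption.

```python
_EN = {
    "nav.dashboard": "Dashboard",
    "dashboard.stats": "{total} mods · {shared} shared",
}
_VI = {
    "nav.dashboard": "Tổng quan",
    "dashboard.stats": "{total} mod · {shared} dùng chung",  # placeholder translation
}

# Fails at import time if a key exists in only one language.
assert set(_EN.keys()) == set(_VI.keys())

_TABLES = {"en": _EN, "vi": _VI}
_current = "en"


def set_language(lang: str) -> None:
    global _current
    if lang not in _TABLES:
        raise ValueError(f"unsupported language: {lang}")
    _current = lang


def get_language() -> str:
    return _current


def t(key: str, **kwargs: object) -> str:
    """Look up key in the active language and fill any {placeholders}."""
    template = _TABLES[_current].get(key, key)
    return template.format_map(kwargs) if kwargs else template
```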
## Python Version Compatibility
Minimum is Python **3.9**. All files that use `X | Y` union type annotations **must** have `from __future__ import annotations` as the first import. Without it, the `|` syntax raises `TypeError` at runtime on Python < 3.10. Every module in `arma_modlist_tools/` already has it; any new CLI script you add must include it too.
### `fix_console_encoding` — `None` stdout guard
When the GUI is launched via `pythonw.exe` (no console window), Python sets `sys.stdout` and `sys.stderr` to `None`. `fix_console_encoding()` must check `if sys.stdout is None or sys.stderr is None: return` **before** accessing `.encoding`, otherwise it raises `AttributeError: 'NoneType' object has no attribute 'encoding'`. This error surfaces in the GUI as *"Failed to load pipeline"* because `run.py` calls `fix_console_encoding()` at module level and the exception is caught by the pipeline import guard.
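The guard itself is small. In the sketch below only the `None` check comes from this document; what the real `fix_console_encoding()` does afterwards is not shown here, so the `reconfigure` step and the function name are assumptions.

```python
import sys


def fix_console_encoding_sketch() -> None:
    """Switch console streams to UTF-8, tolerating pythonw.exe's missing streams."""
    # Under pythonw.exe both streams can be None, so bail out before any attribute access.
    if sys.stdout is None or sys.stderr is None:
        return
    for stream in (sys.stdout, sys.stderr):
        encoding = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
        # Assumed behaviour: re-encode non-UTF-8 text streams in place.
        if encoding != "utf8" and hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding="utf-8", errors="replace")
```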
## Test Suite
`test_suite.py` uses a custom harness (no pytest/unittest dependency). Structure:
```python
group("section name") # prints header
test("description", callable) # runs fn, catches exceptions, tracks pass/fail
skip("description", "reason") # marks skipped
```
Tests that exercise the linker use `tempfile.TemporaryDirectory()` — never the real `arma_dir`. Tests that would require network calls mock `list_mod_files` with `unittest.mock.patch`.
## Key Files Not in Git
- `config.json` — credentials + paths (copy from `config.template.json`)
- `downloads/` — downloaded mod files, can be several GB
- `modlist_json/` — generated JSON output
The `.html` preset files in `modlist_html/` **are** tracked as example inputs.