Tran G. (Revernomad) Khoa b24828ac68 docs: update CLAUDE.md, README, and Vietnamese guide for migration step

- CLAUDE.md: document migrator.py algorithm and junction-removal rationale
- README.md: pipeline now 5 steps, add --skip-migrate flag, add migrator.py
  to folder structure, update test count 142 -> 158
- docs/huong-dan-su-dung.md: 5-step pipeline table, new glossary entry,
  updated footer version note

2026-04-14 15:11:39 +07:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

Common Commands

# Run all tests (no network required)
python test_suite.py

# Check Python version and dependencies
python check_deps.py

# Full pipeline (parse → compare → migrate → fetch → link)
python run.py

# Parse + compare only (no download, no linking)
python run.py --skip-fetch --skip-link

# Diagnose mod folder name / steam_id issues
python check_names.py
python check_names.py --fix --fix-ids

# Launch the GUI
python gui.py

There is no build step, linter config, or package install beyond pip install -r requirements.txt.

Architecture

Package vs CLI layer

arma_modlist_tools/ is a pure library: no console I/O (no print) and no sys.exit. All CLI scripts (run.py, fetch_mods.py, link_mods.py, etc.) sit at the project root and call into the package. New functionality goes in the package first, then a CLI script wraps it.

Data flow

modlist_html/*.html
    └─ parser.parse_modlist_dir()
           └─ compare.compare_presets()
                  └─ comparison.json  ←─ source of truth for groups + mod identity
                         ├─ fetcher.build_server_index()  ←─ Caddy JSON API
                         │      └─ fetcher.find_mod_folder()  (steam_id first, name fallback)
                         │             └─ downloads/{group}/@ModName/
                         │                    └─ linker.link_group()
                         │                           └─ arma_dir/@ModName  (junction/symlink)
                         └─ reporter.build_missing_report()  →  missing_report.json

Group naming convention

"shared" — mods present in all compared presets
"<preset_name>" — mods unique to one preset (key from comparison["unique"])

This group label is stored in missing_report.json per-mod so sync_missing.py knows where to place newly available mods without re-reading comparison.json.

Server index structure

build_server_index() returns:

{
    "by_steam_id": {"450814997": "https://server/@cba_a3/"},  # primary lookup
    "by_name":     {"cbaa3":     "https://server/@cba_a3/"},  # normalized fallback
    "folders":     [...]                                       # raw Caddy listing
}

_normalize_name strips @, lowercases, removes all non-alphanumeric: "@CBA_A3" → "cbaa3". Used in both the index builder and every lookup.
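
The normalization rule and the two-tier lookup can be sketched as follows. This is a minimal illustration, not the package's code: normalize_name and lookup_folder_url are hypothetical stand-ins for _normalize_name and fetcher.find_mod_folder(), and the ASCII-only character class is an assumption.

```python
import re
from typing import Optional


def normalize_name(name: str) -> str:
    """Lowercase, then drop every character that is not a-z or 0-9 (this also removes '@')."""
    return re.sub(r"[^a-z0-9]", "", name.lower())


def lookup_folder_url(index: dict, steam_id: Optional[str], mod_name: str) -> Optional[str]:
    """Try the steam_id map first, then fall back to the normalized-name map."""
    if steam_id and steam_id in index["by_steam_id"]:
        return index["by_steam_id"][steam_id]
    return index["by_name"].get(normalize_name(mod_name))


# Same shape as the structure shown above; "folders" is left empty here.
index = {
    "by_steam_id": {"450814997": "https://server/@cba_a3/"},
    "by_name": {"cbaa3": "https://server/@cba_a3/"},
    "folders": [],
}
```

Because the same function feeds both the index builder and every lookup, a name only has to survive normalization once to match on either side.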

Junction / symlink critical rules

Detection: os.path.islink() returns False for Windows junctions. Always use _is_junction() from linker.py, which checks st_file_attributes & 0x400 (FILE_ATTRIBUTE_REPARSE_POINT) on Windows.

Removal: Use os.rmdir() on Windows and os.unlink() on Linux. Never shutil.rmtree() — it follows the junction and deletes the target mod files.

Creation: cmd /c mklink /J <link> <target> on Windows, os.symlink() on Linux.
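
The three rules above can be sketched together. This is an illustration under assumptions, not the code in linker.py: is_link, create_link and remove_link are hypothetical names, and the guard that refuses to remove a real directory is an added safety check rather than documented behaviour.

```python
import os
import subprocess
import sys

FILE_ATTRIBUTE_REPARSE_POINT = 0x400


def is_link(path: str) -> bool:
    """True for symlinks on any platform and for junctions on Windows."""
    if os.path.islink(path):
        return True
    if sys.platform == "win32":
        try:
            attrs = os.lstat(path).st_file_attributes
        except OSError:
            return False
        return bool(attrs & FILE_ATTRIBUTE_REPARSE_POINT)
    return False


def create_link(link: str, target: str) -> None:
    """Create a junction on Windows, a directory symlink elsewhere."""
    if sys.platform == "win32":
        subprocess.run(["cmd", "/c", "mklink", "/J", link, target], check=True)
    else:
        os.symlink(target, link, target_is_directory=True)


def remove_link(link: str) -> None:
    """Remove only the link itself; the target directory is left untouched."""
    if not is_link(link):
        raise ValueError(f"refusing to remove a real directory: {link}")
    if sys.platform == "win32":
        os.rmdir(link)
    else:
        os.unlink(link)
```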

check_names.py classification (two-pass)

Pass 1 collects raw (server_name, local_steam_id) for every disk folder. Pass 2 builds ok_disk_names — the set of disk names that already match the server exactly. Any MISMATCH whose proposed server name is in ok_disk_names is reclassified as ID_COLLISION (the local meta.cpp has a wrong publishedid that belongs to a different mod). This prevents false rename suggestions caused by shared/duplicate steam IDs on the server.
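
A minimal sketch of the reclassification step, assuming pass 1 has already been reduced to a mapping from each disk folder name to the server name its steam ID resolves to. The function name, the input shape, and the "OK" and "NOT_ON_SERVER" labels are hypothetical; only MISMATCH and ID_COLLISION come from the description above.

```python
from typing import Dict, Optional


def classify_folders(proposed: Dict[str, Optional[str]]) -> Dict[str, str]:
    """proposed maps disk folder name -> server folder name for its steam_id (or None)."""
    # Pass 2: disk names that already match the server exactly.
    ok_disk_names = {disk for disk, server in proposed.items() if server == disk}

    result = {}
    for disk, server in proposed.items():
        if server is None:
            result[disk] = "NOT_ON_SERVER"  # hypothetical label
        elif server == disk:
            result[disk] = "OK"  # hypothetical label
        elif server in ok_disk_names:
            # The proposed name is already taken by a correct folder, so the
            # local steam_id must belong to that other mod.
            result[disk] = "ID_COLLISION"
        else:
            result[disk] = "MISMATCH"
    return result
```

Without the second pass, the colliding folder would be offered a rename onto a name that another folder legitimately owns.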

--fix-ids corrects meta.cpp using steam IDs from comparison.json (sourced from Steam Workshop URLs in the HTML presets) as the authoritative source.

GUI package

gui/ is a CustomTkinter desktop application wrapping the CLI toolchain. Entry point is gui.py at the project root, which calls gui.run_app().

Key files:

gui/__init__.py — sets dark theme + blue color scheme; exports run_app()
gui/app.py — ArmaModManagerApp main window; manages view routing, config loading, thread-safe log queue, and background pipeline execution
gui/wizard.py — SetupWizard dialog shown on first launch when no config.json exists
gui/_constants.py — window dimensions, status color constants, file paths
gui/_io.py — _QueueWriter redirects stdout/stderr to a thread-safe queue so pipeline output streams into the Logs view. write() strips ANSI/CSI escape codes and converts bare \r to \n before enqueuing, so tqdm progress output is legible in the textbox.
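
The cleaning done by _QueueWriter.write() can be sketched like this. It is an illustration only: the class name QueueWriter, the regular expressions, and the choice to skip empty strings are assumptions, not the contents of gui/_io.py.

```python
import queue
import re

# CSI sequences: ESC [ parameters, intermediates, final byte.
_ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")
# A carriage return that is not part of a \r\n pair.
_BARE_CR = re.compile(r"\r(?!\n)")


class QueueWriter:
    """File-like object that forwards cleaned text to a thread-safe queue."""

    def __init__(self, log_queue: "queue.Queue[str]") -> None:
        self._queue = log_queue

    def write(self, text: str) -> int:
        cleaned = _BARE_CR.sub("\n", _ANSI_CSI.sub("", text))
        if cleaned:
            self._queue.put(cleaned)
        return len(text)

    def flush(self) -> None:
        # Nothing is buffered, so there is nothing to flush.
        pass
```

Turning a bare \r into \n means each tqdm redraw lands on its own line in the textbox instead of overwriting the previous one.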

Views (gui/views/): each inherits BaseView; build() runs once on creation, refresh() runs on each navigation:

dashboard.py — overview, status, quick stats
mods.py — browse and manage downloaded mods by group
tools.py — link/unlink, rename folders, sync missing mods, check server
logs.py — real-time log viewer fed from the stdout/stderr queue
settings.py — in-app editor for config.json (server URL, paths, credentials)

_find_folder (mods.py) — four-level name matching: The mods view resolves a mod's local folder by mod name from comparison.json, which may differ from the server-canonical folder name used by the fetcher. Lookup order:

Exact: @{mod_name}
Case-insensitive: @cba_a3 matches CBA_A3
Normalized (_normalize_name): strips all non-alphanumeric — handles punctuation/spacing differences, e.g. @US GEAr- Units (IFA3) matches US GEAr: Units (IFA3) (both → usgearunitsifa3)
Steam ID via meta.cpp: reads publishedid from each folder's meta.cpp and matches against mod["steam_id"] — handles the case where the folder name bears no resemblance to the modlist name but the mod content is correct
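
The lookup order above can be sketched as one function. This is a hedged illustration rather than the code in mods.py: find_folder and read_published_id are hypothetical names, and the meta.cpp pattern assumes a line of the form publishedid = 123456;.

```python
import re
from pathlib import Path
from typing import Optional

_PUBLISHED_ID = re.compile(r"publishedid\s*=\s*(\d+)", re.IGNORECASE)


def _norm(name: str) -> str:
    return re.sub(r"[^a-z0-9]", "", name.lower())


def read_published_id(folder: Path) -> Optional[str]:
    """Return the publishedid from folder/meta.cpp, or None if unavailable."""
    try:
        text = (folder / "meta.cpp").read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None
    match = _PUBLISHED_ID.search(text)
    return match.group(1) if match else None


def find_folder(group_dir: Path, mod_name: str, steam_id: Optional[str] = None) -> Optional[Path]:
    folders = sorted(p for p in group_dir.iterdir() if p.is_dir())
    wanted = "@" + mod_name
    for p in folders:  # 1. exact
        if p.name == wanted:
            return p
    for p in folders:  # 2. case-insensitive
        if p.name.lower() == wanted.lower():
            return p
    for p in folders:  # 3. normalized
        if _norm(p.name) == _norm(mod_name):
            return p
    if steam_id:  # 4. steam ID from meta.cpp
        for p in folders:
            if read_published_id(p) == str(steam_id):
                return p
    return None
```

The levels run from cheapest to most expensive: only the last one touches the disk beyond a directory listing.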

selection.json — GUI selection state file, tracked in git. Persists which mods/groups are selected between GUI sessions. Written by the GUI; safe to delete (GUI recreates it on next save).

run_tool subprocess streaming: Tool scripts are launched via subprocess.Popen (not subprocess.run) with stdout=PIPE, stderr=STDOUT, read line-by-line via iter(proc.stdout.readline, ""), and posted to the log queue immediately. Python's own output buffering is disabled with the -u flag and PYTHONUNBUFFERED=1 in the environment — without these, output would batch inside the pipe and only appear when the script exits. The Popen call uses encoding="utf-8", errors="replace" and sets PYTHONUTF8=1 in the child environment so that tqdm's Unicode block characters (e.g. ▉) don't crash the pipe reader on Windows, where the default charmap codec cannot decode them.
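
A minimal sketch of the streaming pattern described above. The function name stream_tool and its signature are hypothetical; the Popen arguments and environment variables are the ones named in the paragraph.

```python
import os
import queue
import subprocess
import sys
from typing import List


def stream_tool(script_args: List[str], log_queue: "queue.Queue[str]") -> int:
    """Run a Python script unbuffered and post each output line to the queue as it arrives."""
    env = dict(os.environ, PYTHONUNBUFFERED="1", PYTHONUTF8="1")
    proc = subprocess.Popen(
        [sys.executable, "-u", *script_args],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the same stream
        encoding="utf-8",
        errors="replace",  # never crash on undecodable bytes
        env=env,
    )
    # readline returns "" only at EOF, which ends the iteration.
    for line in iter(proc.stdout.readline, ""):
        log_queue.put(line)
    proc.stdout.close()
    return proc.wait()
```

In the GUI this would run inside a worker thread, with the queue drained by the polling loop on the Tk side.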

GUI threading model: Every network or long-running operation runs in a threading.Thread(daemon=True) so the Tkinter event loop is never blocked. The only safe way to update widgets from a background thread is self.after(0, callback) — never touch widgets directly from a worker thread. _poll_log drains the entire log queue in one after(80, ...) tick and does a single batched CTkTextbox.insert() call rather than one per log entry, keeping the UI smooth even when tqdm emits many rapid updates during downloads. The wizard's "Test Connection" button follows the same pattern: requests.get runs in a daemon thread; the result is posted back via self.after(0, ...) with widget references captured before the thread starts, so stale references cannot update the wrong widgets if the user navigates away mid-request.
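
The batched drain in _poll_log can be sketched without any Tk dependency. The helper name drain_log_queue is hypothetical; the comments describe how the result would be used with after() and a single insert() call, as stated above.

```python
import queue


def drain_log_queue(log_queue: "queue.Queue[str]") -> str:
    """Pull every pending entry off the queue and join them into one string."""
    chunks = []
    while True:
        try:
            chunks.append(log_queue.get_nowait())
        except queue.Empty:
            break
    return "".join(chunks)


# Inside the app, a polling method would look roughly like this (not runnable here,
# because it needs a live Tk window):
#
#     def _poll_log(self):
#         text = drain_log_queue(self._log_queue)
#         if text:
#             self._textbox.insert("end", text)   # one insert per tick
#         self.after(80, self._poll_log)          # reschedule on the Tk thread
```

One insert per tick keeps the number of widget updates bounded by the poll interval rather than by how fast the worker produces output.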

run_pipeline worker — import guard: from run import step_fetch, step_link is performed inside its own try/except before stdout is redirected. If this import fails for any reason the exception is posted to the log via self.after(0, ...) and _pipeline_done is called so the UI resets cleanly. Previously an import failure would silently kill the worker thread and leave the pipeline button disabled forever.
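
The guard can be sketched independently of Tkinter. This is an assumption-laden illustration: load_pipeline_steps is a hypothetical helper, post_log stands in for posting to the log via self.after(0, ...), and on_done stands in for _pipeline_done.

```python
import importlib
from typing import Callable, Optional, Tuple


def load_pipeline_steps(
    post_log: Callable[[str], None],
    on_done: Callable[[], None],
    module_name: str = "run",
) -> Optional[Tuple[Callable, Callable]]:
    """Import the pipeline steps; on any failure, report it and reset the UI."""
    try:
        module = importlib.import_module(module_name)
        steps = (module.step_fetch, module.step_link)
    except Exception as exc:  # deliberately broad: any failure must reach the log
        post_log(f"Failed to load pipeline: {exc}")
        on_done()
        return None
    return steps
```

The worker would call this before redirecting stdout and return early when it gets None, so the pipeline button is always re-enabled.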

build_server_index progress callback: Accepts an optional progress_fn(current, total, name) callback. step_fetch in run.py uses this to print Indexing N/M: @FolderName every 25 folders so the log never goes silent during the server scan phase. The library itself never calls print — the caller owns the I/O.
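
The callback contract can be sketched as follows. It assumes the library invokes progress_fn once per folder and the caller decides how often to print; the source does not say which side throttles. index_folders, make_throttled_printer and the extra report on the last folder are all hypothetical.

```python
from typing import Callable, Dict, Iterable, Optional

ProgressFn = Callable[[int, int, str], None]


def index_folders(folder_names: Iterable[str], progress_fn: Optional[ProgressFn] = None) -> Dict[str, str]:
    """Library side: does the work and reports progress, but never prints."""
    names = list(folder_names)
    index = {}
    for current, name in enumerate(names, start=1):
        index[name.lower()] = name  # placeholder for the real per-folder work
        if progress_fn is not None:
            progress_fn(current, len(names), name)
    return index


def make_throttled_printer(every: int = 25, emit: Callable[[str], None] = print) -> ProgressFn:
    """Caller side: owns the I/O and prints every N folders, plus the last one."""
    def report(current: int, total: int, name: str) -> None:
        if current % every == 0 or current == total:
            emit(f"Indexing {current}/{total}: {name}")
    return report
```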

`migrator.py` — mod group migration

Before step_fetch runs, step_migrate moves locally-downloaded mod folders to match the group assignments in the new comparison.json. This avoids re-downloading mods that already exist on disk under a different group when presets are switched (e.g. A → A_v1).

Algorithm:

_build_local_index(downloads) — scans every downloads/{group}/@Folder, reads meta.cpp to extract publishedid, builds {steam_id → (group, path)} and {norm_name → (group, path)} maps.
_build_target_list(comparison) — flattens comparison.json into [(new_group, steam_id, mod_name)].
For each target: locate mod on disk (steam_id first, normalised name fallback); skip if already in correct group or destination exists; remove stale junction from arma_dir if present; move folder with shutil.move.

Junction removal is critical: a stale junction (target moved away) still has the reparse point attribute, so _is_junction() returns True and link_group would skip it as already_linked without recreating it at the new path. Removing the junction before the move lets step_link recreate it correctly.
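
The algorithm can be sketched end to end. This is a simplified illustration, not migrator.py itself: the function names drop the leading underscore, comparison.json is assumed to hold a "shared" list and a "unique" mapping of preset name to mod list, each mod is assumed to carry "name" and "steam_id" keys, and junction removal is delegated to a caller-supplied remove_stale_link callback.

```python
import re
import shutil
from pathlib import Path

_PUBLISHED_ID = re.compile(r"publishedid\s*=\s*(\d+)", re.IGNORECASE)


def _norm(name):
    return re.sub(r"[^a-z0-9]", "", name.lower())


def _published_id(folder):
    try:
        text = (folder / "meta.cpp").read_text(encoding="utf-8", errors="replace")
    except OSError:
        return None
    match = _PUBLISHED_ID.search(text)
    return match.group(1) if match else None


def build_local_index(downloads):
    """Scan downloads/{group}/@Folder into steam_id and normalized-name maps."""
    by_id, by_name = {}, {}
    for group_dir in sorted(p for p in downloads.iterdir() if p.is_dir()):
        for mod_dir in sorted(p for p in group_dir.iterdir() if p.is_dir()):
            if not mod_dir.name.startswith("@"):
                continue
            entry = (group_dir.name, mod_dir)
            steam_id = _published_id(mod_dir)
            if steam_id:
                by_id.setdefault(steam_id, entry)
            by_name.setdefault(_norm(mod_dir.name), entry)
    return by_id, by_name


def build_target_list(comparison):
    """Flatten the comparison into (new_group, steam_id, mod_name) tuples."""
    targets = [("shared", m.get("steam_id"), m["name"]) for m in comparison.get("shared", [])]
    for preset, mods in comparison.get("unique", {}).items():
        targets += [(preset, m.get("steam_id"), m["name"]) for m in mods]
    return targets


def migrate(downloads, comparison, remove_stale_link=None):
    by_id, by_name = build_local_index(downloads)
    moved = []
    for new_group, steam_id, mod_name in build_target_list(comparison):
        found = by_id.get(str(steam_id)) if steam_id else None
        found = found or by_name.get(_norm(mod_name))
        if found is None:
            continue  # not on disk; the fetch step will download it
        old_group, src = found
        dest = downloads / new_group / src.name
        if old_group == new_group or dest.exists():
            continue
        if remove_stale_link is not None:
            remove_stale_link(src.name)  # drop the junction before the target moves
        dest.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dest))
        moved.append((src.name, old_group, new_group))
    return moved
```

Calling the link-removal hook before shutil.move mirrors the ordering described above, so the link step never sees a junction pointing at a path that no longer exists.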

CLI: python run.py --skip-migrate bypasses the step if needed.

`update_mods.py` — orphan file removal

After downloading updated files, update_mods.py compares every file in the local mod folder against the server's file list and deletes any local files that no longer exist on the server. This prevents stale .pbo or .bisign files from accumulating when a mod's content changes upstream. Each removed file is logged as [-] orphan removed: <rel_path> and the final summary line includes an orphan count. The orphan check runs even when no files need downloading (e.g. timestamps match but the local folder has extras).
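
A minimal sketch of the orphan check, assuming the server's file list is available as a set of POSIX-style paths relative to the mod folder. The function name remove_orphans and the log callback are hypothetical; the log line format is the one quoted above.

```python
from pathlib import Path
from typing import Callable, List, Set


def remove_orphans(mod_dir: Path, server_files: Set[str], log: Callable[[str], None] = print) -> List[str]:
    """Delete local files that the server no longer lists; return their relative paths."""
    removed = []
    # sorted() materializes the listing first, so deleting while looping is safe.
    for path in sorted(mod_dir.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(mod_dir).as_posix()
        if rel not in server_files:
            path.unlink()
            removed.append(rel)
            log(f"[-] orphan removed: {rel}")
    return removed
```

Because the check only needs the two file lists, it can run even when the download step had nothing to do.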

GUI localization (`gui/locales.py`)

All user-facing strings are centralised in gui/locales.py. Two languages are supported: English ("en") and Vietnamese ("vi").

API:

from gui.locales import t, set_language, get_language

t("nav.dashboard")                          # → "Dashboard" or "Tổng quan"
t("dashboard.stats", total=42, shared=10)   # → "42 mods · 10 shared"
set_language("vi")                          # switch active language
get_language()                              # → "vi"

Key naming: flat dot-notation — "<view>.<widget_purpose>", e.g. "dashboard.run_btn", "wizard.step1_title", "tools.cn_warn".

Dynamic strings use str.format_map with keyword args. The dict value contains {placeholder} and the caller passes t("key", placeholder=value).
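
The lookup and formatting behaviour can be sketched in a few lines. This is an illustration, not gui/locales.py: the two-entry tables are placeholders, the Vietnamese text for "dashboard.stats" is invented for the example, and returning the key itself for an unknown key is an assumption.

```python
_EN = {
    "nav.dashboard": "Dashboard",
    "dashboard.stats": "{total} mods · {shared} shared",
}
_VI = {
    "nav.dashboard": "Tổng quan",
    "dashboard.stats": "{total} mod · {shared} dùng chung",  # illustrative wording
}

# Both tables must define exactly the same keys.
assert set(_EN.keys()) == set(_VI.keys())

_TABLES = {"en": _EN, "vi": _VI}
_current = "en"


def set_language(lang: str) -> None:
    global _current
    if lang not in _TABLES:
        raise ValueError(f"unsupported language: {lang}")
    _current = lang


def get_language() -> str:
    return _current


def t(key: str, **kwargs) -> str:
    """Look up key in the active table and fill any {placeholder} fields."""
    template = _TABLES[_current].get(key, key)  # assumption: unknown keys echo back
    return template.format_map(kwargs) if kwargs else template
```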

Hot-swap: app.switch_language(lang) calls set_language(), saves the preference to config.json under "ui": {"language": "..."}, retranslates sidebar nav buttons, then calls view.refresh() on every cached view. Views that build all content in refresh() (Settings, Mods) update automatically. Views with static build()-time widgets (Dashboard, Logs, Tools) store widget references and retranslate them at the top of refresh().

Constraints:

CTkTabview tab names in tools.py are kept in English — they double as frame lookup keys (tv.tab("Check Names")) and cannot be renamed after creation.
Segmented button values in tools.py ("Status", "Link", "Unlink") are kept in English — they drive the logic in _lm_on_change().
_VIEW_NAMES routing keys ("Dashboard", "Mods", etc.) are kept in English — they are _view_cache dict keys.

Adding a new string: Add the key to both _EN and _VI dicts in locales.py. The assert set(_EN.keys()) == set(_VI.keys()) guard at module load will catch any mismatch.

Python Version Compatibility

Minimum is Python 3.9. All files that use X | Y union type annotations must have from __future__ import annotations as the first import. Without it, the | syntax raises TypeError at runtime on Python < 3.10. Every module in arma_modlist_tools/ already has it; any new CLI script you add must include it too.

`fix_console_encoding` — `None` stdout guard

When the GUI is launched via pythonw.exe (no console window), Python sets sys.stdout and sys.stderr to None. fix_console_encoding() must check if sys.stdout is None or sys.stderr is None: return before accessing .encoding, otherwise it raises AttributeError: 'NoneType' object has no attribute 'encoding'. This error surfaces in the GUI as "Failed to load pipeline" because run.py calls fix_console_encoding() at module level and the exception is caught by the pipeline import guard.
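
The guard can be sketched as below. Only the None check comes from the description above; what the function does afterwards (reconfiguring both streams to UTF-8) and its boolean return value are assumptions made so the example is complete.

```python
import sys


def fix_console_encoding() -> bool:
    """Return False when there is no console to fix, True otherwise."""
    # Under pythonw.exe both streams are None, so bail out before touching .encoding.
    if sys.stdout is None or sys.stderr is None:
        return False
    for stream in (sys.stdout, sys.stderr):
        encoding = (getattr(stream, "encoding", None) or "").lower().replace("-", "")
        # Assumed behaviour: switch non-UTF-8 streams to UTF-8 where supported.
        if encoding != "utf8" and hasattr(stream, "reconfigure"):
            stream.reconfigure(encoding="utf-8", errors="replace")
    return True
```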

Test Suite

test_suite.py uses a custom harness (no pytest/unittest dependency). Structure:

group("section name")   # prints header
test("description", callable)  # runs fn, catches exceptions, tracks pass/fail
skip("description", "reason")  # marks skipped

Tests that exercise the linker use tempfile.TemporaryDirectory() — never the real arma_dir. Tests that would require network calls mock list_mod_files with unittest.mock.patch.

Key Files Not in Git

config.json — credentials + paths (copy from config.template.json)
downloads/ — downloaded mod files, can be several GB
modlist_json/ — generated JSON output

The .html preset files in modlist_html/ are tracked as example inputs.
