JSv4 commented Feb 10, 2026

Summary

  • Adds DocxodusEngine as an alternative to XmlPowerToolsEngine, wrapping Docxodus — a modernized .NET 8.0 fork of Open-XML-PowerTools with better move detection
  • Extracts BaseEngine class with all shared binary extraction and subprocess logic; both engines are thin 3-line subclasses setting DIST_DIR_NAME, BIN_DIR_NAME, and BINARY_BASE_NAME
  • Adds Docxodus as a git submodule, with guarded build (skips with warning if submodule not initialized)
  • Refactors build_differ.py into a reusable build_engine() function called for both engines (also fixes pre-existing bug where win-arm64 was compiled but never compressed)
  • Updates CI workflow to checkout submodules and install .NET 8.0 SDK
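
As a rough illustration of the "thin subclass" layout described in the summary, the sketch below shows a base class that owns the shared logic while each engine sets only three class constants. The constant names come from this PR; the constant values and the `binary_filename()` helper are hypothetical placeholders, not the project's actual code.

```python
import platform


class BaseEngine:
    """Shared logic lives here; subclasses only set the three constants."""

    DIST_DIR_NAME = ""
    BIN_DIR_NAME = ""
    BINARY_BASE_NAME = ""

    @classmethod
    def binary_filename(cls, system=None):
        """Hypothetical helper: platform-specific executable name."""
        system = system or platform.system()
        suffix = ".exe" if system == "Windows" else ""
        return f"{cls.BINARY_BASE_NAME}{suffix}"


class XmlPowerToolsEngine(BaseEngine):
    # All three values are illustrative placeholders.
    DIST_DIR_NAME = "dist"
    BIN_DIR_NAME = "bin"
    BINARY_BASE_NAME = "redlines"


class DocxodusEngine(BaseEngine):
    # All three values are illustrative placeholders.
    DIST_DIR_NAME = "dist_docxodus"
    BIN_DIR_NAME = "bin_docxodus"
    BINARY_BASE_NAME = "redline"
```

With this shape, adding another engine would only require a new subclass with its own three constants, assuming its binary follows the same calling convention.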

Test plan

  • Existing test_openxml_differ.py passes (backward compatibility)
  • New test_docxodus_engine.py passes (Docxodus integration)
  • New test_engine_contract.py passes (parametrized contract tests over both engines)
  • hatch build produces wheel with both sets of binaries

JSv4 added 5 commits February 10, 2026 00:52
Introduce Docxodus (a modernized .NET 8.0 fork of Open-XML-PowerTools with
better move detection) as an alternative engine alongside XmlPowerToolsEngine.

- Extract BaseEngine class with shared binary extraction and subprocess logic
- XmlPowerToolsEngine and DocxodusEngine are thin subclasses setting 3 constants
- Add Docxodus as a git submodule at docxodus/
- Refactor build_differ.py into reusable build_engine() function (also fixes
  missing win-arm64 compression)
- Update CI workflow for submodules and .NET SDK
- Add integration tests and parametrized contract tests for both engines

Rewrite README to prominently feature Docxodus as the recommended comparison
engine, with a link back to the Docxodus repo. Reorganize sections around the
dual-engine architecture and add a quick example.

Thread WmlComparerSettings options from Python kwargs through CLI
flags to the Docxodus C# binary. Supports detail_threshold,
case_insensitive, detect_moves, simplify_move_markup,
move_similarity_threshold, move_minimum_word_count,
detect_format_changes, conflate_spaces, and date_time.

- Extract _build_command() in BaseEngine, override in DocxodusEngine
- Add input validation for thresholds and word count
- Update Docxodus CLI to parse --flags (backward compat with legacy format)
- Rebuild all platform binaries with new flag support
- Add 13 new tests (integration, validation, unit)
- Update README with Comparison Settings section
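
The kwargs-to-flags threading and validation described above might look roughly like the following. This is a minimal sketch, not the code from this PR: the setting names are taken from the commit message, but the `build_flags()` helper, the `--name=value` flag spelling, and the accepted ranges (thresholds in [0, 1], word count of at least 1) are assumptions.

```python
# Setting names are from the commit message; their grouping is assumed.
THRESHOLDS = ("detail_threshold", "move_similarity_threshold")
BOOLEANS = (
    "case_insensitive",
    "detect_moves",
    "simplify_move_markup",
    "detect_format_changes",
    "conflate_spaces",
)


def build_flags(**settings):
    """Hypothetical helper: validate settings and render them as CLI flags."""
    flags = []
    for name, value in settings.items():
        flag = "--" + name.replace("_", "-")  # assumed flag spelling
        if name in THRESHOLDS:
            is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
            if not is_number or not 0.0 <= value <= 1.0:  # assumed range
                raise ValueError(f"{name} must be a number between 0 and 1")
            flags.append(f"{flag}={value}")
        elif name == "move_minimum_word_count":
            is_int = isinstance(value, int) and not isinstance(value, bool)
            if not is_int or value < 1:  # assumed lower bound
                raise ValueError(f"{name} must be a positive integer")
            flags.append(f"{flag}={value}")
        elif name in BOOLEANS:
            flags.append(f"{flag}={'true' if value else 'false'}")
        elif name == "date_time":
            flags.append(f"{flag}={value}")
        else:
            raise TypeError(f"unknown setting: {name}")
    return flags
```

Rejecting unknown keys early keeps a misspelled kwarg from being silently ignored by the subprocess call.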

Run tests across 3 OSes x 3 Python versions on push to main
and on pull requests. Includes package build verification.

Add global.json to pin the .NET SDK to 8.0.x, preventing CI runners
with .NET 10 pre-installed from using the wrong compiler (which breaks
Docxodus due to List<T>.Reverse() vs LINQ Reverse() resolution).

Also fix build_differ.py run_command() to raise on non-zero exit codes
instead of silently continuing past build failures.
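
One common way to get the fail-fast behavior described for `run_command()` is to pass `check=True` to `subprocess.run`. The sketch below assumes the function takes an argument list and returns the exit code; the real signature in build_differ.py may differ.

```python
import subprocess
import sys


def run_command(args):
    """Run a build step; raise CalledProcessError on a non-zero exit code."""
    completed = subprocess.run(args, check=True)
    return completed.returncode


if __name__ == "__main__":
    # A succeeding step returns normally.
    run_command([sys.executable, "-c", "pass"])
    try:
        # A failing step raises instead of letting the build continue.
        run_command([sys.executable, "-c", "import sys; sys.exit(3)"])
    except subprocess.CalledProcessError as exc:
        print(f"build step failed with exit code {exc.returncode}")
```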