🚀 Transpiler-Pro

Transpiler-Pro is an enterprise-grade documentation pipeline designed to transform Markdown into Antora-compliant AsciiDoc. Tailored specifically for SUSE technical standards, it goes beyond simple conversion by utilizing Natural Language Processing (NLP) to "heal" linguistic errors, shift tenses, and enforce branding.

📌 Core Mission

Transitioning legacy Markdown to AsciiDoc often results in "broken" UI components (tabs, collapsibles) and inconsistent grammar. Transpiler-Pro automates the tedious parts of this migration through four key pillars:

  1. Structural Integrity & SEO Stability - Converts complex Markdown (Admonitions, Collapsibles, Tables) into Antora-compliant AsciiDoc while "freezing" headers with hardcoded, SEO-friendly anchors to prevent broken links during renames.
  2. Style Validation - Checks content against the official SUSE Vale Style Guide.
  3. Linguistic Healing - Uses AI to automatically fix future tense and wordiness while maintaining subject-verb agreement.
  4. Content Parity Audit - (New) Automatically validates that no text, code blocks, or headings were lost during the conversion process via a high-fidelity parity engine.
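As a rough illustration of the first pillar, the sketch below rewrites a Docusaurus-style admonition (`:::note ... :::`) as an AsciiDoc admonition block. The function name `convert_admonitions`, the regex, and the keyword mapping are assumptions made for this example, not code taken from converter.py.

```python
import re

# Docusaurus admonition keywords mapped to AsciiDoc labels.
# This mapping is an assumption for the sketch.
ADMONITION_LABELS = {
    "note": "NOTE",
    "tip": "TIP",
    "info": "NOTE",
    "caution": "CAUTION",
    "warning": "WARNING",
    "danger": "WARNING",
}

# Matches ":::type", a body, and a closing ":::" on its own line.
ADMONITION_RE = re.compile(
    r"^:::(\w+)[ \t]*\n(.*?)\n:::[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def convert_admonitions(markdown: str) -> str:
    """Rewrite :::type blocks as AsciiDoc admonition blocks (hypothetical helper)."""

    def replace(match: re.Match) -> str:
        # Unknown keywords fall back to NOTE rather than being dropped.
        label = ADMONITION_LABELS.get(match.group(1).lower(), "NOTE")
        body = match.group(2).strip()
        return f"[{label}]\n====\n{body}\n===="

    return ADMONITION_RE.sub(replace, markdown)
```

Text outside the `:::` fences passes through unchanged, so the helper can run over a whole file.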

⚙️ The "Shield-Convert-Repair-Audit" Architecture

Transpiler-Pro operates using a multi-stage "Transformation and Healing" process:

Phase X - Structural Conversion (The Converter)

Standard converters often mangle Docusaurus-style components or generate unstable IDs.
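One way to picture a stable ID: derive a slug from the heading text once, then write it into the AsciiDoc as an explicit ID so that later title edits do not change the link target. This is a minimal sketch, assuming a simple lowercase-and-hyphenate slug rule. `slugify` and `freeze_heading` are hypothetical names, and the real converter may use a different scheme.

```python
import re


def slugify(title: str) -> str:
    """Lowercase the title and collapse non-alphanumeric runs into hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def freeze_heading(markdown_heading: str) -> str:
    """Turn '## Title' into an AsciiDoc heading preceded by an explicit [#id]."""
    match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", markdown_heading)
    if match is None:
        return markdown_heading  # not a heading; leave untouched
    level = len(match.group(1))  # '#' count maps to the same number of '='
    title = match.group(2)
    return f"[#{slugify(title)}]\n{'=' * level} {title}"
```

Because the ID is written out explicitly, it no longer depends on whatever ID the renderer would auto-generate from the title.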

Phase Y - Linguistic Repair (The NLP Engine)

Unlike simple find-and-replace tools, Transpiler-Pro understands context using the spaCy en_core_web_sm model.
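To show what "shift tense while keeping agreement" means in practice, here is a deliberately small, dependency-free sketch. It does not use spaCy. It only rewrites "will + verb" when the subject is a pronoun from a closed list and the verb is in a tiny lexicon, and it leaves everything else unchanged. The lexicon, the pronoun sets, and the name `to_present` are all assumptions for illustration.

```python
import re

# Pronouns that take the base verb form in the present tense.
BASE_FORM_SUBJECTS = {"i", "you", "we", "they"}

# Toy lexicon: base form -> third-person singular form.
THIRD_PERSON = {
    "start": "starts",
    "restart": "restarts",
    "apply": "applies",
    "fetch": "fetches",
    "create": "creates",
}

FUTURE_RE = re.compile(
    r"\b(I|you|we|they|it|he|she|this)\s+will\s+(\w+)\b",
    re.IGNORECASE,
)


def to_present(sentence: str) -> str:
    """Rewrite 'pronoun will verb' as present tense for known verbs only."""

    def replace(match: re.Match) -> str:
        subject, verb = match.group(1), match.group(2)
        if verb.lower() not in THIRD_PERSON:
            return match.group(0)  # unknown word (adverb, 'not', ...): skip
        if subject.lower() in BASE_FORM_SUBJECTS:
            return f"{subject} {verb}"
        return f"{subject} {THIRD_PERSON[verb.lower()]}"

    return FUTURE_RE.sub(replace, sentence)
```

The closed lists are what keep this toy from producing wrong output. Handling noun subjects, adverbs, and negation needs the part-of-speech and dependency information that a parser provides.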

Phase Z - Content Parity Audit (The Validator)

To guarantee zero data loss, the pipeline concludes with a high-velocity validation engine optimized for technical documentation:

📂 Project Structure

.
├── src/transpiler_pro/
│   ├── core/
│   │   ├── converter.py    # Structural transformation & block restoration (Phase X)
│   │   ├── linter.py       # Style sensing via Vale CLI
│   │   ├── repair.py       # NLP-driven Tense & Subject-Verb Agreement (Phase Y)
│   │   ├── validator.py    # Content Parity & Audit logic (Phase Z)
│   │   └── fixer.py        # Rule-based repair (Spelling & Branding)
│   └── cli.py              # Typer orchestration (The Entry Point)
├── styles/suse-styles/     # Official SUSE Vale rulesets (Synced via Git)
├── data/
│   ├── inputs/             # Place your .md files here
│   ├── intermediate/       # Raw .adoc conversions (Pre-repair)
│   ├── audit-logs/         # Detailed parity reports (Phase Z evidence)
│   ├── outputs/            # Final "healed" .adoc files
│   └── knowledge_base.json # Branding & Technical Term dictionary
└── pyproject.toml          # Central configuration for the entire pipeline
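The tree lists data/knowledge_base.json as the branding and technical term dictionary. Its schema is not documented here, so the sketch below simply assumes a flat JSON object that maps a lowercase variant to the preferred spelling. `load_terms`, `apply_branding`, and the sample entries in the usage note are hypothetical.

```python
import json
import re


def load_terms(raw_json: str) -> dict[str, str]:
    """Parse a JSON object mapping lowercase variants to preferred spellings."""
    return json.loads(raw_json)


def apply_branding(text: str, terms: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches with the preferred spelling."""
    if not terms:
        return text
    # Longest keys first so multi-word terms win over their prefixes.
    keys = sorted(terms, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    return pattern.sub(lambda m: terms[m.group(1).lower()], text)
```

For example, with the assumed entries `{"suse": "SUSE", "asciidoc": "AsciiDoc"}`, the text "Suse docs use Asciidoc." becomes "SUSE docs use AsciiDoc.", while a longer word such as "opensuse" is left alone because the match must fall on word boundaries.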

🛠 Installation & Setup

Follow these steps to set up the environment locally. Transpiler-Pro uses uv for lightning-fast, reproducible builds.

1. Prerequisites

Ensure you have the following installed on your system:

2. Environment Setup

# Clone the repository
git clone https://github.com/your-org/transpiler-pro.git
cd transpiler-pro

# Install Python dependencies and create virtual environment
uv sync

# Download the NLP Linguistic Model (Required for Phase Y & Z)
uv run python -m spacy download en_core_web_sm

3. Initialize Styles

Sync the official SUSE Vale style guide to your local machine:

uv run transpiler-pro sync

🚀 Usage Guide

Transpiler-Pro is designed for high portability. While it defaults to the internal data/ directory structure, every command supports custom path flags, allowing you to target any external documentation repository.

1. Full Pipeline (The "Golden" Path)

The full-run command orchestrates the entire sequence (Sync ➜ Convert ➜ Repair ➜ Audit). This is the recommended way to ensure your content is structurally stable, linguistically "healed," and verified for zero content loss.

# Option A: Standard run using default data/ folders
uv run transpiler-pro full-run

# Option B: Target external directories (Enterprise Portability)
uv run transpiler-pro full-run --input ~/my-project/docs --output ~/my-project/dist

# Option C: Bypass the audit for large-scale rapid prototyping
uv run transpiler-pro full-run --no-audit

2. Individual Phase Control

For granular debugging or specialized workflows, you can trigger individual phases of the transformation engine.

Phase X: Structural Conversion

Converts Markdown to AsciiDoc, injects SEO-friendly persistent IDs, and mirrors assets (images, .yml) to the output path.

# Convert Markdown to AsciiDoc by providing input and output directories
uv run transpiler-pro x-convert --input ./raw-md --output ./intermediate-adoc

# If you want to use the default data/ folders, simply run:
uv run transpiler-pro x-convert

Phase Y: Linguistic Healing

Processes AsciiDoc files through the NLP engine to fix future tense, apply branding rules, and resolve subject-verb agreement.

# Run the repair phase with custom paths
uv run transpiler-pro y-repair --input ./intermediate-adoc --output ./final-adoc

# If you want to use the default data/ folders, simply run:
uv run transpiler-pro y-repair

Phase S: Style Synchronization

Force-updates the local SUSE Vale style guides from the remote repository.

uv run transpiler-pro sync

📊 Verification & Build Integrity

Transpiler-Pro includes two distinct layers of quality control to ensure "Technical Parity" and "Syntax Perfection."

1. Content Parity Audit (Phase Z)

This verifies that no technical information was lost. It performs a high-fidelity token comparison between the source Markdown and the generated AsciiDoc, filtering out formatting noise.

# Verify integrity between any two directories
uv run transpiler-pro audit --input ./source-md --output ./converted-adoc

# If you want to use the default data/ folders, simply run:
uv run transpiler-pro audit
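The token comparison behind the audit can be approximated in a few lines. The sketch below reduces both documents to lowercase alphanumeric tokens, which discards markup such as `#`, `*`, `=` and backticks, and then reports tokens that appear more often in the Markdown than in the AsciiDoc. The tokenizer and the names `tokens` and `missing_tokens` are assumptions. The actual parity engine in validator.py may filter noise differently.

```python
import re
from collections import Counter

# Word-like tokens only; markup characters fall outside this pattern.
TOKEN_RE = re.compile(r"[A-Za-z0-9]+")


def tokens(text: str) -> Counter:
    """Count lowercase alphanumeric tokens, ignoring formatting characters."""
    return Counter(t.lower() for t in TOKEN_RE.findall(text))


def missing_tokens(markdown: str, asciidoc: str) -> Counter:
    """Return tokens that occur more often in the source than in the output."""
    # Counter subtraction keeps only positive differences.
    return tokens(markdown) - tokens(asciidoc)
```

An empty result means every source token survived at least as many times as it appeared. A non-empty result names the words that went missing and how many occurrences were lost.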

2. Asciidoctor Build Check (The "Check" Command)

This is the final syntax test. It renders your .adoc files into a mirrored HTML preview folder using the official asciidoctor parser, and it is configured to fail on WARN so that duplicate IDs and broken macros are caught.

# Generate a complete HTML preview in a sandbox directory
uv run transpiler-pro check --input ./final-adoc --build-dir ./preview-html

# Target a specific file for rapid syntax debugging
uv run transpiler-pro check --file instance.adoc --input ./data/outputs

# If you want to use the default data/ folders, simply run:
uv run transpiler-pro check --file instance.adoc
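For readers who want to reproduce the fail-on-WARN behavior outside the tool, the sketch below assembles an equivalent asciidoctor call from Python. It relies on the asciidoctor CLI options `--failure-level` and `--destination-dir`. The helper names `build_command` and `check_file` are hypothetical, and how the check command actually invokes asciidoctor is not shown in this README.

```python
import subprocess
from pathlib import Path


def build_command(adoc_file: Path, build_dir: Path) -> list[str]:
    """Assemble an asciidoctor call that exits non-zero on warnings."""
    return [
        "asciidoctor",
        "--failure-level", "WARN",            # treat WARN and above as failure
        "--destination-dir", str(build_dir),  # write HTML into the preview dir
        str(adoc_file),
    ]


def check_file(adoc_file: Path, build_dir: Path) -> bool:
    """Run asciidoctor and report whether the file rendered cleanly."""
    result = subprocess.run(
        build_command(adoc_file, build_dir),
        capture_output=True,
        text=True,
    )
    return result.returncode == 0
```

Keeping command construction separate from execution makes the flags easy to inspect or test without asciidoctor installed.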

Targeted Processing

If you are working on a specific document and do not want to process the entire library, use the --file (or -f) flag. This works across full-run, x-convert, y-repair, and check.

# Run the entire pipeline for a single file
uv run transpiler-pro full-run --file security-guide.md

# Build a preview for just one file
uv run transpiler-pro check --file security-guide.adoc
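The choice between one file and the whole library can be sketched as a small selection helper. This is an assumption about how such a flag might be resolved, not the tool's own code. `select_sources` is a hypothetical name, and the recursive directory scan is a guess.

```python
from pathlib import Path
from typing import Optional


def select_sources(
    input_dir: Path,
    file: Optional[str] = None,
    suffix: str = ".md",
) -> list[Path]:
    """Return one named file, or every file with the suffix under input_dir."""
    if file is not None:
        target = input_dir / file
        if not target.is_file():
            raise FileNotFoundError(target)  # fail early on a bad --file value
        return [target]
    # No file given: process the whole tree in a stable order.
    return sorted(input_dir.rglob(f"*{suffix}"))
```

Failing early on a missing file avoids a silent no-op when the name passed to --file is mistyped.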

📊 Audit & Quality Control

Transpiler-Pro provides a two-layered validation system to ensure your documentation is both linguistically polished and structurally complete.

1. Linguistic Healing Logs (Phase Y)

During the repair phase, the tool tracks automated improvements and identifies manual tasks:

2. Content Parity Dashboard (Phase Z)

After conversion, the tool runs a strict comparison between the Markdown source and the AsciiDoc result:

🧪 Development & Testing

To verify the NLP logic, structural regex, and parity engine:

# Run the test suite (Unit tests for Shields and NLP)
uv run pytest

# Generate the API Reference (Project Portal)
uv run python docs.py

Portal Last Updated: 2026-04-18 19:43:23