Extensions
This document defines the supported extension mechanism for the DITA Package Processor.
Extensions are implemented as new pipeline steps. There are no plugins, hooks, or dynamic loaders. If behavior is not expressed as a ProcessingStep, it is not a supported extension.
Extension Model (At a Glance)
- Extensions are new ProcessingStep classes
- Steps participate in the pipeline in an explicit order
- Each step has one responsibility
- Shared state flows only through ProcessingContext
- Side effects are limited to the package directory
If an extension cannot be expressed this way, it does not belong in this project.
Architectural Roles
| Component | Responsibility |
|---|---|
| Pipeline | Owns execution order, lifecycle, and logging |
| ProcessingContext | Shared, explicit runtime state |
| ProcessingStep | One discrete transformation |
| dita_xml | Centralized XML parsing and rewriting |
| utils | Stateless helper functions |
Extensions integrate only at the ProcessingStep level.
Step Contracts
Each step operates under a clear contract: preconditions may be assumed; postconditions must be guaranteed.
| Step | Preconditions | Postconditions |
|---|---|---|
| RemoveIndexMapStep | index.ditamap exists and references a .ditamap | Main map resolved; index.ditamap deleted |
| RenameMainMapStep | Main map path resolved | Main map renamed to <docx_stem>.ditamap |
| ProcessMapsStep | Renamed main map exists | Abstract topic injected; maps numbered; wrapper topics created; topicrefs normalized |
| RefactorGlossaryStep | Definition map configured and exists | Definition child topics transformed to glossentry |
Failure semantics
- Structural violations fail fast
- Content inconsistencies log warnings and continue
- Failures stop the pipeline immediately
- No rollback is performed
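The failure semantics above can be sketched as a small precondition check. This is a minimal illustration, not the project's code: `StepError` and `collect_topics` are hypothetical names, and treating an empty topic file as a content inconsistency is an assumption made for the example.

```python
import logging
from pathlib import Path


class StepError(RuntimeError):
    """Hypothetical exception type for structural violations."""


def collect_topics(topics_dir: Path, logger: logging.Logger) -> list[Path]:
    """Return usable topic files, applying the failure semantics above."""
    if not topics_dir.is_dir():
        # Structural violation: fail fast. Nothing is rolled back.
        raise StepError(f"Topics directory not found: {topics_dir}")

    topics = []
    for path in sorted(topics_dir.glob("*.dita")):
        if path.stat().st_size == 0:
            # Content inconsistency (assumed example): warn and continue.
            logger.warning("Skipping empty topic file: %s", path.name)
            continue
        topics.append(path)
    return topics
```

Because the exception is not caught inside the helper, it propagates out of the calling step, which is what lets the pipeline stop immediately.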
ProcessingContext Usage
ProcessingContext is the only supported shared state mechanism.
Stable Attributes
Always present:
- package_dir
- docx_stem
- topics_dir (derived)
- media_dir (derived)
Derived Attributes
Populated by specific steps:
- main_map_path: set by RemoveIndexMapStep
- renamed_main_map_path: set by RenameMainMapStep
Steps may only read derived attributes after the responsible step has executed.
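One way to picture the attribute rules is a stand-in context with optional derived fields and a guard that fails fast when a step reads an attribute too early. This is a sketch under assumptions: the real ProcessingContext may be structured differently, the `topics` and `media` directory names are guesses, and `require` is a hypothetical helper.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ProcessingContext:
    """Stand-in for the real context; the field layout is assumed."""

    package_dir: Path
    docx_stem: str
    # Derived attributes stay None until the responsible step has run.
    main_map_path: Optional[Path] = None          # set by RemoveIndexMapStep
    renamed_main_map_path: Optional[Path] = None  # set by RenameMainMapStep

    @property
    def topics_dir(self) -> Path:
        return self.package_dir / "topics"  # directory name is an assumption

    @property
    def media_dir(self) -> Path:
        return self.package_dir / "media"   # directory name is an assumption


def require(context: ProcessingContext, attribute: str):
    """Return a derived attribute, or fail fast if no prior step set it."""
    value = getattr(context, attribute, None)
    if value is None:
        raise RuntimeError(
            f"{attribute} is not set; has the responsible step run?"
        )
    return value
```

A step that depends on the main map could then start with `main_map = require(context, "main_map_path")`, which turns an ordering mistake into an immediate, readable error.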
Adding Context Attributes (Extensions)
Extensions may introduce new context attributes if they follow these rules:
- Use explicit, descriptive names
- Document the attribute in the step docstring
- Do not shadow existing attributes
- Keep attributes optional unless enforced by a prior step
Example:
context.regex_cleanup_applied = True
Context is shared state, not a general-purpose key-value store.
Creating a New Step
Step Definition
from dita_package_processor.steps.base import ProcessingStep


class MyNewStep(ProcessingStep):
    name = "my-new-step"

    def run(self, context, logger):
        # Implement exactly one responsibility
        ...
Rules
- Inherit from ProcessingStep
- Implement run(context, logger)
- Declare a stable, unique name
- Do not invoke other steps
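A fuller step that follows these rules might look like the sketch below. Everything here is illustrative: `RemoveStrayFilesStep`, its `stray_files_removed` context attribute, and the file names it deletes are hypothetical, and the `ProcessingStep` class is a local stand-in so the example runs without the package installed.

```python
class ProcessingStep:
    """Stand-in for dita_package_processor.steps.base.ProcessingStep."""

    name = ""

    def run(self, context, logger):
        raise NotImplementedError


class RemoveStrayFilesStep(ProcessingStep):
    """Delete OS metadata files from the media directory.

    Context attributes:
        stray_files_removed (int): number of files deleted by this step.
    """

    name = "remove-stray-files"
    STRAY_NAMES = frozenset({".DS_Store", "Thumbs.db"})

    def run(self, context, logger):
        removed = 0
        # Side effects stay inside the package directory (media_dir).
        for path in sorted(context.media_dir.iterdir()):
            if path.is_file() and path.name in self.STRAY_NAMES:
                path.unlink()
                logger.info("Removed %s", path.name)
                removed += 1
        # The new context attribute is documented in the class docstring.
        context.stray_files_removed = removed
```

The step has one responsibility, does not call other steps, and records its result on the context under an explicit name.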
Step Registration
Steps are registered explicitly when constructing the pipeline.
Execution order is intentional and visible:
Pipeline(
    steps=[
        RemoveIndexMapStep(),
        RenameMainMapStep(),
        MyNewStep(),
        ProcessMapsStep(),
        RefactorGlossaryStep(),
    ],
    logger=logger,
)
There is no automatic discovery.
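To make the ordering guarantee concrete, here is a minimal stand-in runner. The real Pipeline's internals are not shown in this document, so the `run(context)` method and the `RecordingStep` class are assumptions for illustration only.

```python
import logging


class Pipeline:
    """Stand-in runner: executes steps strictly in the order given."""

    def __init__(self, steps, logger):
        self.steps = list(steps)
        self.logger = logger

    def run(self, context):  # method name is an assumption
        for step in self.steps:
            self.logger.info("Running step: %s", step.name)
            # An exception raised here is not caught, so later steps never run.
            step.run(context, self.logger)


class RecordingStep:
    """Hypothetical step that records its name, or fails on request."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def run(self, context, logger):
        if self.fail:
            raise RuntimeError(f"Step {self.name} failed")
        context.ran.append(self.name)
```

With steps `a`, `boom` (failing), and `c`, only `a` is recorded before the exception propagates. That matches the fail-fast, no-rollback behavior described earlier.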
Placement Guidelines
Use these guidelines when inserting a new step:
| Step Type | Recommended Position |
|---|---|
| File discovery / deletion | Early |
| Map restructuring | Before topic-level steps |
| Topic generation | Middle |
| Content rewriting | Late |
| Validation / cleanup | Last |
If placement is ambiguous, the step is probably doing too much.
XML Safety Rules
- Do not perform ad-hoc XML manipulation inside steps
- Reuse helpers from dita_xml.py
- Centralize XPath and tree logic in the facade module
Pattern to follow:
doc = read_xml(path)
doc = transform_function(doc)
write_xml(doc)
This keeps XML behavior consistent and maintainable.
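The read, transform, write pattern can be sketched with the standard library. The actual signatures in dita_xml.py are not documented here, so `XmlDocument`, `read_xml`, `write_xml`, and `set_title` below are assumed shapes, not the project's API. Note also that `xml.etree.ElementTree` does not preserve DOCTYPE declarations, so the real helpers may well use a different parser.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from pathlib import Path


@dataclass
class XmlDocument:
    """Hypothetical wrapper that keeps a parsed tree with its source path."""

    path: Path
    tree: ET.ElementTree


def read_xml(path: Path) -> XmlDocument:
    return XmlDocument(path=path, tree=ET.parse(str(path)))


def set_title(doc: XmlDocument, title: str) -> XmlDocument:
    """Example transform: changes the tree only, performs no file I/O."""
    root = doc.tree.getroot()
    node = root.find("title")
    if node is None:
        node = ET.SubElement(root, "title")
    node.text = title
    return doc


def write_xml(doc: XmlDocument) -> None:
    # Writes back to the path the document was read from.
    doc.tree.write(str(doc.path), encoding="utf-8", xml_declaration=True)
```

Keeping the transform free of file I/O means it can be unit-tested on an in-memory tree, while steps only ever call the three-line pattern.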
Explicitly Unsupported Anti-Patterns
The following are not supported and should not be introduced:
- Steps invoking other steps
- Feature flags inside steps to simulate ordering
- Orchestration logic in the CLI
- Shared globals
- Implicit dependencies between steps
If you need conditional behavior, add a new step and make it explicit.
Summary
Extensions in this project are deliberately constrained:
- Linear pipeline
- Explicit execution order
- One responsibility per step
- Controlled shared state
These constraints are what keep the system predictable, testable, and maintainable at scale.