Materialization

Materialization is responsible for turning execution intent into concrete outputs. It bridges the gap between abstract actions and tangible artifacts, such as files, directories, or reports.

This module does not decide what should happen. It only concerns itself with how declared actions are realized in a controlled environment. Safety guarantees, sandboxing, and idempotence are central concerns.

Materialization logic is designed to be deterministic and auditable, ensuring that the same plan always produces the same results under the same conditions.

dita_package_processor.materialization.builder

builder.py

Materialization target builder for the DITA Package Processor.

This module is a thin adapter responsible for preparing a concrete, publication-ready target directory for materialization output.

Materialization is intentionally separated from execution and validation:

  • Planning answers: "What should happen?"
  • Execution answers: "What actually ran?"
  • Materialization answers: "Is it safe to publish the results, and where?"
  • Builder answers: "Can we write to the target location?"

This module does NOT:

  • perform discovery
  • emit plans
  • execute handlers
  • validate execution semantics
  • detect collisions
  • decide safety

It operates strictly on a MaterializationManifest and filesystem targets.

Builder contract: "I prepare the place. I do not decide if it’s safe."

MaterializationError

Bases: RuntimeError

Raised when the target filesystem location cannot be prepared safely.

TargetMaterializationBuilder

Prepares the target directory for materialization output.

This builder is intentionally thin. It assumes upstream orchestration has already validated safety (execution state, layout, collisions, etc.) and provides a complete MaterializationManifest.

Design constraints:

  • deterministic
  • idempotent
  • no implicit deletion
  • no implicit overwrites
  • filesystem readiness only

__init__(*, manifest)

Initialize the builder.

Parameters:

  • manifest (MaterializationManifest, required): MaterializationManifest describing the target root and intended outputs.

build()

Prepare the target directory for materialization output.

This method creates the target root directory if missing and performs minimal sanity checks that the location is usable.

Raises:

  • MaterializationError: If the target location cannot be prepared.
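
The builder contract above can be illustrated with a minimal sketch. This is not the package's implementation: `SketchManifest` and `SketchTargetBuilder` are hypothetical stand-ins, and the only behavior taken from the description is "create the target root if missing, check that it is usable, never delete or overwrite".

```python
from dataclasses import dataclass
from pathlib import Path


class MaterializationError(RuntimeError):
    """Raised when the target location cannot be prepared."""


@dataclass(frozen=True)
class SketchManifest:
    """Hypothetical stand-in for MaterializationManifest."""

    target_root: Path


class SketchTargetBuilder:
    """Illustrative builder: prepares the target root and nothing else."""

    def __init__(self, *, manifest: SketchManifest) -> None:
        self._manifest = manifest

    def build(self) -> None:
        root = self._manifest.target_root
        # Refuse to proceed if something other than a directory is in the way.
        if root.exists() and not root.is_dir():
            raise MaterializationError(f"Target root is not a directory: {root}")
        try:
            # exist_ok=True keeps repeated calls idempotent; nothing is deleted.
            root.mkdir(parents=True, exist_ok=True)
        except OSError as exc:
            raise MaterializationError(f"Cannot prepare target root: {root}") from exc
```

Calling `build()` twice on the same manifest leaves the directory unchanged, which is one way to read the "idempotent" constraint.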

dita_package_processor.materialization.collision

collision.py

Target collision detection for materialization.

This module detects filesystem conflicts before execution occurs.

It ensures:

  • No two planned outputs resolve to the same target location.
  • Collision detection is deterministic and explicit.
  • No filesystem mutation occurs here.

This module supports two APIs:

1) Legacy style (stateful):

    detector = CollisionDetector(artifacts=[...])
    detector.detect()

2) Injectable orchestrator style (stateless):

    detector = MaterializationCollisionDetector()
    detector.detect(artifacts=[...])

CollisionDetector

Legacy collision detector API (stateful).

This class remains for backward compatibility with tests and older code.

Usage:

    detector = CollisionDetector(artifacts=[...])
    detector.detect()

__init__(*, artifacts)

Initialize the detector.

Parameters:

  • artifacts (Iterable[TargetArtifact], required): Iterable of TargetArtifact objects.

detect()

Detect collisions among target artifacts.

Raises:

  • MaterializationCollisionError: If collisions are detected.

MaterializationCollisionDetector

Injectable collision detector API (stateless).

This is the preferred interface for orchestration wiring.

Usage:

    detector = MaterializationCollisionDetector()
    detector.detect(artifacts=[...])

detect(*, artifacts)

Detect collisions among target artifacts.

Parameters:

  • artifacts (Iterable[TargetArtifact], required): Iterable of TargetArtifact objects.

Raises:

  • MaterializationCollisionError: If collisions are detected.

MaterializationCollisionError

Bases: RuntimeError

Raised when target path collisions are detected.

TargetArtifact dataclass

Represents a single materialized output artifact.

Attributes:

  • path: Absolute or resolved target path.
  • source_action_id: ID of the plan action producing this artifact.
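
A rough sketch of how such a detector could work is shown below. It assumes a collision means "two artifacts with an identical path" and groups claims by path. The names `find_collisions`, `SketchCollisionDetector`, and `SketchLegacyDetector` are hypothetical, and the real module may normalize paths or report collisions differently.

```python
from collections import defaultdict
from dataclasses import dataclass
from pathlib import Path
from typing import Dict, Iterable, List


class MaterializationCollisionError(RuntimeError):
    """Raised when two or more artifacts claim the same target path."""


@dataclass(frozen=True)
class TargetArtifact:
    path: Path
    source_action_id: str


def find_collisions(artifacts: Iterable[TargetArtifact]) -> Dict[Path, List[str]]:
    """Return {path: [action ids]} for every path claimed more than once."""
    claims: Dict[Path, List[str]] = defaultdict(list)
    for artifact in artifacts:
        claims[artifact.path].append(artifact.source_action_id)
    # Sorting by path keeps the report order independent of input order.
    return {path: ids for path, ids in sorted(claims.items()) if len(ids) > 1}


class SketchCollisionDetector:
    """Stateless style: artifacts are passed to detect()."""

    def detect(self, *, artifacts: Iterable[TargetArtifact]) -> None:
        collisions = find_collisions(artifacts)
        if collisions:
            details = "; ".join(
                f"{path} <- {', '.join(ids)}" for path, ids in collisions.items()
            )
            raise MaterializationCollisionError(f"Target path collisions: {details}")


class SketchLegacyDetector:
    """Stateful style: artifacts are captured at construction time."""

    def __init__(self, *, artifacts: Iterable[TargetArtifact]) -> None:
        self._artifacts = tuple(artifacts)

    def detect(self) -> None:
        SketchCollisionDetector().detect(artifacts=self._artifacts)
```

No filesystem call appears anywhere in the sketch, which matches the "no filesystem mutation" guarantee stated for this module.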

dita_package_processor.materialization.layout

layout.py

Materialization layout rules for the DITA Package Processor.

This module defines deterministic rules for mapping artifact-relative paths (from plans, handlers, or reports) into a concrete target package layout.

Materialization layout rules answer: "Given an artifact path, where should it live in the target package?"

Design constraints:

  • deterministic: same input produces same output
  • conservative: refuses ambiguous or unsafe paths
  • pure mapping: no copying, no filesystem mutation
  • explicit: rules are readable and testable

Default policy:

  • .ditamap files are placed at target root (flattened to filename)
  • .dita files are placed under target_root/topics/ (unless already under topics/)
  • all other files are placed under target_root/media/
      • if already under media/, preserve relative structure
      • if under images/, nest under media/images/
      • otherwise flatten to filename under media/
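
The default policy can be sketched as a pure function. This is an illustration under several assumptions, not the package's code: `.dita` files outside `topics/` are flattened to their filename, "unsafe" means absolute, empty, or containing `..`, suffix matching is case-insensitive, and `PurePosixPath` is used (instead of `Path`) only to keep the example platform-independent.

```python
from pathlib import PurePosixPath


class LayoutError(ValueError):
    """Raised when a path cannot be mapped safely."""


def map_relative_path(rel_path: PurePosixPath) -> PurePosixPath:
    """Map an artifact-relative path into the default target layout."""
    if rel_path.is_absolute() or ".." in rel_path.parts or not rel_path.parts:
        raise LayoutError(f"Unsafe or empty relative path: {rel_path}")

    suffix = rel_path.suffix.lower()
    top = rel_path.parts[0]

    if suffix == ".ditamap":
        # Maps live at the target root, flattened to the filename.
        return PurePosixPath(rel_path.name)
    if suffix == ".dita":
        # Topics already under topics/ keep their relative structure.
        if top == "topics":
            return rel_path
        return PurePosixPath("topics") / rel_path.name
    if top == "media":
        return rel_path
    if top == "images":
        return PurePosixPath("media") / rel_path
    # Everything else is flattened to its filename under media/.
    return PurePosixPath("media") / rel_path.name
```

Because the function only inspects the path object, it satisfies the "pure mapping" constraint: nothing is copied and the filesystem is never touched.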

DefaultDitaLayoutPolicy dataclass

Default deterministic layout policy for DITA-like packages.

This policy is intentionally conservative and predictable.

map_relative_path(rel_path)

Map a relative artifact path into the default target layout.

Parameters:

  • rel_path (Path, required): Relative artifact path.

Returns:

  • Path: Relative target path.

Raises:

  • LayoutError: If input is unsafe.

LayoutError

Bases: ValueError

Raised when an input path cannot be mapped safely into the target layout.

LayoutPolicy

Bases: Protocol

Strategy interface for layout policies.

Implementations must be deterministic and must not touch the filesystem.

map_relative_path(rel_path)

Map a relative artifact path into a normalized relative target path.

Parameters:

  • rel_path (Path, required): Relative path for an artifact (e.g., "topics/a.dita").

Returns:

  • Path: Relative target path (e.g., "topics/a.dita" or "media/a.png").

Raises:

  • LayoutError: If the path is unsafe or cannot be mapped.

MaterializationLayoutEngine

Orchestrates layout resolution for materialization.

This engine provides a stable coordination surface for the materialization orchestrator without performing filesystem mutation.

Responsibilities:

  • own TargetLayout
  • expose resolution as an explicit phase

__init__(*, target_root, policy=None)

Initialize the layout engine.

Parameters:

  • target_root (Path, required): Root of the materialized target package.
  • policy (LayoutPolicy | None, default None): Optional layout policy override.

resolve_path(*, rel_path)

Resolve an artifact-relative path into its target location.

Parameters:

  • rel_path (Path, required): Relative artifact path.

Returns:

  • Path: Concrete target path.

Raises:

  • LayoutError: If mapping fails.

TargetLayout dataclass

Target layout resolver.

Converts artifact-relative paths into concrete target paths under a provided target root, using a policy object.

Pattern: Strategy

resolve(*, rel_path)

Resolve a relative artifact path to a concrete path under target_root.

Parameters:

  • rel_path (Path, required): Relative artifact path (must be safe).

Returns:

  • Path: Concrete target path.

Raises:

  • LayoutError: If rel_path is invalid or unsafe.
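
To make the Strategy pattern concrete, here is a small sketch of a layout resolver that delegates mapping to an injected policy. `SketchTargetLayout` and `FlattenPolicy` are hypothetical names, the safety checks are assumptions about what "unsafe" means, and `PurePosixPath` stands in for `Path` so the example behaves the same on every platform.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Protocol


class LayoutError(ValueError):
    """Raised when a path cannot be mapped safely."""


class LayoutPolicy(Protocol):
    def map_relative_path(self, rel_path: PurePosixPath) -> PurePosixPath:
        ...


@dataclass(frozen=True)
class FlattenPolicy:
    """Hypothetical policy: drop all directories, keep only the filename."""

    def map_relative_path(self, rel_path: PurePosixPath) -> PurePosixPath:
        return PurePosixPath(rel_path.name)


def _is_unsafe(path: PurePosixPath) -> bool:
    return path.is_absolute() or ".." in path.parts or not path.parts


@dataclass(frozen=True)
class SketchTargetLayout:
    target_root: PurePosixPath
    policy: LayoutPolicy

    def resolve(self, *, rel_path: PurePosixPath) -> PurePosixPath:
        if _is_unsafe(rel_path):
            raise LayoutError(f"Unsafe relative path: {rel_path}")
        mapped = self.policy.map_relative_path(rel_path)
        # Check the policy's answer too, so a faulty policy cannot escape the root.
        if _is_unsafe(mapped):
            raise LayoutError(f"Policy produced an unsafe path: {mapped}")
        return self.target_root / mapped
```

Swapping `FlattenPolicy` for another object with a `map_relative_path` method changes the layout without touching the resolver, which is the point of keeping the policy behind a Protocol.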

dita_package_processor.materialization.models

Materialization domain models.

These models describe the intended final shape of a materialized DITA package after execution has completed.

They do NOT:

  • perform filesystem operations
  • execute handlers
  • infer missing artifacts

They DO:

  • represent resolved, final target paths
  • preserve traceability to source actions
  • guarantee collision-free materialization
  • serve as the execution → publishing contract

MaterializationManifest dataclass

Declarative manifest describing a fully materialized target package.

This object is the authoritative contract between materialization and execution.

Invariants guaranteed by this model:

  • All target paths are resolved and absolute
  • No duplicate target paths exist
  • Every file is intentional and traceable

__post_init__()

Enforce collision-free and structurally valid manifests.

iter_files()

Iterate over materialized files.

Returns:

  • Iterable[MaterializedFile]: Iterable of MaterializedFile.

to_dict()

Serialize the materialization manifest.

Returns:

  • Dict[str, object]: JSON-safe dictionary.
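
The manifest invariants lend themselves to enforcement in `__post_init__`. The sketch below is an assumption-laden illustration: `SketchFile` and `SketchManifest` are hypothetical, the field names `target_root` and `files` are guesses, `ValueError` is used because the documentation does not say which exception the real model raises, and `PurePosixPath` replaces `Path` for platform independence.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from typing import Dict, Iterator, Optional, Set, Tuple


@dataclass(frozen=True)
class SketchFile:
    path: PurePosixPath
    source_action_id: str
    role: Optional[str] = None

    def to_dict(self) -> Dict[str, object]:
        return {
            "path": str(self.path),
            "source_action_id": self.source_action_id,
            "role": self.role,
        }


@dataclass(frozen=True)
class SketchManifest:
    target_root: PurePosixPath
    files: Tuple[SketchFile, ...] = ()

    def __post_init__(self) -> None:
        seen: Set[PurePosixPath] = set()
        for file in self.files:
            # Invariant: all target paths are absolute.
            if not file.path.is_absolute():
                raise ValueError(f"Target path is not absolute: {file.path}")
            # Invariant: no duplicate target paths.
            if file.path in seen:
                raise ValueError(f"Duplicate target path: {file.path}")
            seen.add(file.path)

    def iter_files(self) -> Iterator[SketchFile]:
        return iter(self.files)

    def to_dict(self) -> Dict[str, object]:
        # Paths become strings so the result can go through json.dumps.
        return {
            "target_root": str(self.target_root),
            "files": [file.to_dict() for file in self.files],
        }
```

Validating at construction time means an invalid manifest can never exist as an object, so downstream code does not need to re-check these conditions.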

MaterializedFile dataclass

Declarative record of a single materialized file.

This represents a resolved target artifact, not a filesystem operation.

Attributes:

  • path: Absolute path of the file in the target package.
  • source_action_id: ID of the execution action that produced this file.
  • role: Optional semantic role (e.g. "map", "topic", "media").
  • layout_metadata: Deterministic layout annotations explaining how this file was placed (policy name, original relative path, etc.).

to_dict()

Serialize the materialized file record.

Returns:

  • Dict[str, object]: JSON-safe dictionary.

dita_package_processor.materialization.orchestrator

orchestrator.py

Materialization orchestration for the DITA Package Processor.

Materialization is a first-class pipeline phase, and its preflight stage MUST run before execution. This prevents publishing unsafe or ambiguous target layouts.

This orchestrator is intentionally split into two phases:

Preflight (MANDATORY, pre-execution):

  • validate target_root readiness
  • validate layout mapping rules
  • detect collisions among planned outputs
  • emit a materialization manifest (optional, deterministic)

Finalize (post-execution):

  • validate execution results are safe for publication (optional)
  • write finalized manifest / reports (optional)

Design constraints:

  • deterministic
  • idempotent
  • no hidden filesystem mutation
  • no inference
  • explicit failure on ambiguity

Pattern notes:

  • Orchestrator coordinates collaborators (Builder, Validator, CollisionDetector, ManifestWriter). Collaborators are dependency-injected for testability and stability.

Key compatibility note

The default builder (TargetMaterializationBuilder) may change its __init__ signature over time (e.g., requiring a keyword-only manifest). This orchestrator adapts via runtime signature inspection and passes only the keyword arguments the builder supports.
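
One way to do this kind of runtime signature inspection is with the standard library's `inspect.signature`. The helper below is a sketch, not the orchestrator's actual code; `call_with_supported_kwargs`, `LegacyBuilder`, and `KeywordOnlyBuilder` are hypothetical names introduced for illustration.

```python
import inspect
from typing import Any, Callable


def call_with_supported_kwargs(factory: Callable[..., Any], **candidates: Any) -> Any:
    """Call factory with only those keyword arguments its signature accepts."""
    params = inspect.signature(factory).parameters
    # A **kwargs parameter accepts everything, so no filtering is needed.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return factory(**candidates)
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    supported = {
        name: value
        for name, value in candidates.items()
        if name in params and params[name].kind in keyword_kinds
    }
    return factory(**supported)


class LegacyBuilder:
    """Hypothetical older builder that only knows about target_root."""

    def __init__(self, target_root: str) -> None:
        self.target_root = target_root


class KeywordOnlyBuilder:
    """Hypothetical newer builder that requires a keyword-only manifest."""

    def __init__(self, *, manifest: dict) -> None:
        self.manifest = manifest
```

With this helper, the caller can offer both `target_root` and `manifest` and let each builder class receive only what its constructor declares.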

Builder

Bases: Protocol

Protocol for target filesystem readiness builders.

build()

Prepare the filesystem target for materialization.

CollisionDetectorProtocol

Bases: Protocol

Protocol for collision detection implementations.

detect()

Detect collisions and raise if any are found.

ManifestWriter

Bases: Protocol

Protocol for manifest writers.

write_final(*, execution_report)

Optionally write a final manifest artifact.

write_preflight()

Optionally write a preflight manifest artifact.

MaterializationManifest dataclass

Deterministic materialization manifest.

This is a conservative, minimal representation of what materialization needs to prepare for (resolved target paths and their origin actions).

Parameters:

  • target_root: Target root directory for materialization.
  • artifacts: Concrete resolved target artifacts derived from the plan.

MaterializationOrchestrationError

Bases: RuntimeError

Raised when orchestration of materialization fails.

MaterializationOrchestrator

Coordinate the materialization layer.

This orchestrator is a strict coordinator: it does not implement mapping logic or collision semantics itself. It delegates those concerns to collaborators.

Parameters:

  • plan: Immutable plan that will be executed.
  • target_root: Target directory where the materialized package will live.
  • builder: Prepares filesystem destination. If omitted, a compatible default TargetMaterializationBuilder is created using a derived manifest.
  • validator: Preflight validator. Defaults to NoOpValidator.
  • collision_detector: Detects collisions among resolved target artifacts. Defaults to a CollisionDetector built from derived target artifacts.
  • manifest_writer: Optional manifest writer. Defaults to NoOpManifestWriter.

finalize(*, execution_report)

Finalize materialization after execution completes.

This is a post-execution hook. It may write manifests or perform additional publication-safety checks based on execution outcomes.

Parameters:

  • execution_report: Immutable forensic execution report.

Raises:

  • MaterializationOrchestrationError: If finalization fails.

preflight()

Execute the pre-execution materialization safety gate.

This MUST run before execution. If this fails, execution must not run.

Raises:

  • MaterializationOrchestrationError: If any preflight stage fails.
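
The coordinator role can be sketched as follows. This is only an illustration of dependency-injected collaborators behind Protocols: `SketchOrchestrator` is hypothetical, and the stage order (build, validate, detect, write) is an assumption derived from the order in which the preflight responsibilities are listed, not a documented guarantee.

```python
from typing import Any, Protocol


class MaterializationOrchestrationError(RuntimeError):
    """Raised when orchestration of materialization fails."""


class Builder(Protocol):
    def build(self) -> None: ...


class Validator(Protocol):
    def validate_preflight(self) -> None: ...


class CollisionDetectorProtocol(Protocol):
    def detect(self) -> None: ...


class ManifestWriter(Protocol):
    def write_preflight(self) -> None: ...
    def write_final(self, *, execution_report: Any) -> None: ...


class SketchOrchestrator:
    """Coordinates collaborators; implements no mapping or collision logic."""

    def __init__(
        self,
        *,
        builder: Builder,
        validator: Validator,
        collision_detector: CollisionDetectorProtocol,
        manifest_writer: ManifestWriter,
    ) -> None:
        self._builder = builder
        self._validator = validator
        self._collision_detector = collision_detector
        self._manifest_writer = manifest_writer

    def preflight(self) -> None:
        # Assumed stage order; the first failure stops the remaining stages.
        stages = [
            ("build", self._builder.build),
            ("validate", self._validator.validate_preflight),
            ("detect_collisions", self._collision_detector.detect),
            ("write_preflight", self._manifest_writer.write_preflight),
        ]
        for name, stage in stages:
            try:
                stage()
            except Exception as exc:
                raise MaterializationOrchestrationError(
                    f"Preflight stage '{name}' failed: {exc}"
                ) from exc

    def finalize(self, *, execution_report: Any) -> None:
        try:
            self._manifest_writer.write_final(execution_report=execution_report)
        except Exception as exc:
            raise MaterializationOrchestrationError(
                f"Finalization failed: {exc}"
            ) from exc
```

Wrapping every collaborator failure in a single exception type gives the caller one thing to catch when deciding whether execution may proceed.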

NoOpManifestWriter dataclass

Default manifest writer that performs no I/O.

The system can later be extended to emit JSON manifests deterministically.

write_final(*, execution_report)

No-op.

write_preflight()

No-op.
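
As an illustration of the extension mentioned above, a deterministic JSON writer might look like the sketch below. Everything here is hypothetical: the class name, the file names, and the choice to take the manifest as an already-serialized dictionary. The sketch also assumes the output directory already exists (for instance, prepared by the builder), so it never creates directories on its own.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict


@dataclass(frozen=True)
class JsonManifestWriter:
    """Hypothetical writer that emits byte-stable JSON manifests."""

    output_dir: Path
    manifest: Dict[str, Any]

    def _write(self, name: str, payload: Dict[str, Any]) -> None:
        # sort_keys plus fixed indentation gives identical bytes for equal input.
        text = json.dumps(payload, sort_keys=True, indent=2) + "\n"
        (self.output_dir / name).write_text(text, encoding="utf-8")

    def write_preflight(self) -> None:
        self._write("manifest.preflight.json", self.manifest)

    def write_final(self, *, execution_report: Dict[str, Any]) -> None:
        self._write(
            "manifest.final.json",
            {"manifest": self.manifest, "execution_report": execution_report},
        )
```

Sorting keys is what makes the output independent of dictionary insertion order, so repeated runs over the same manifest can be compared byte for byte.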

NoOpValidator dataclass

Default validator that performs no validation.

This exists so materialization can be wired before all validators are implemented. Replace with real validators as the safety surface grows.

validate_preflight()

No-op validation.

Validator

Bases: Protocol

Protocol for materialization validators.

validate_preflight()

Validate pre-execution safety invariants.

dita_package_processor.materialization.validation

validation.py

Materialization validation rules for the DITA Package Processor.

This module enforces preconditions required for safe materialization. It performs pure validation only.

Materialization validation answers:

  • "Is this plan suitable for producing a target package?"
  • "Is this target location safe to use?"

This module does NOT:

  • create directories
  • modify the filesystem
  • inspect execution reports
  • infer intent

It exists to fail loudly before irreversible work begins.

MaterializationValidationError

Bases: RuntimeError

Raised when materialization preconditions are not satisfied.

These are hard failures that indicate unsafe or invalid input.

MaterializationValidator

Validates materialization preconditions.

This class encapsulates all rules required to decide whether a target directory may be materialized safely.

Design constraints:

  • deterministic
  • side-effect free
  • explicit failure modes

__init__(*, plan, target_root)

Initialize the validator.

Parameters:

  • plan (Plan, required): Validated execution plan.
  • target_root (Path, required): Intended materialization root directory.

validate()

Validate all materialization preconditions.

Raises:

  • MaterializationValidationError: If any precondition fails.
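
The documentation does not list the individual rules, so the sketch below uses three assumed preconditions purely for illustration: the plan has at least one action, the target root is an absolute path, and an existing target root must be a directory. `SketchPlan` and `SketchValidator` are hypothetical stand-ins. The checks only read filesystem state, so the sketch stays side-effect free.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Tuple


class MaterializationValidationError(RuntimeError):
    """Raised when materialization preconditions are not satisfied."""


@dataclass(frozen=True)
class SketchPlan:
    """Hypothetical stand-in for Plan; only the action ids matter here."""

    actions: Tuple[str, ...]


class SketchValidator:
    def __init__(self, *, plan: SketchPlan, target_root: Path) -> None:
        self._plan = plan
        self._target_root = target_root

    def validate(self) -> None:
        # Assumed rule 1: an empty plan cannot produce a package.
        if not self._plan.actions:
            raise MaterializationValidationError("Plan contains no actions")
        # Assumed rule 2: the target root must be unambiguous.
        if not self._target_root.is_absolute():
            raise MaterializationValidationError(
                f"Target root must be absolute: {self._target_root}"
            )
        # Assumed rule 3: never materialize onto an existing non-directory.
        if self._target_root.exists() and not self._target_root.is_dir():
            raise MaterializationValidationError(
                f"Target root exists and is not a directory: {self._target_root}"
            )
```

Each rule raises with its own message, which is one way to meet the "explicit failure modes" constraint.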