Knowledge
The Knowledge module provides pattern definitions and classification logic used during discovery and planning. It encodes reusable heuristics about DITA structures, naming conventions, and common package layouts.
This layer is descriptive, not prescriptive. It does not execute transformations or mutate data. Instead, it supplies structured knowledge that other stages can reference when making decisions.
By isolating heuristics in this module, the system avoids hard-coding assumptions into discovery or planning logic and makes those assumptions explicit, testable, and replaceable.
dita_package_processor.knowledge.invariants
Invariant definitions for DITA package processing.
This module defines invariants: conditions that must hold true for a DITA package to be safely processed by the pipeline.
Invariants differ from validation rules:
- Validation answers: "Is this document well-formed or schema-valid?"
- Invariants answer: "Is this package structurally processable?"
Invariant violations are always fatal and must halt execution.
InvariantViolation
dataclass
Represents a violation of a structural invariant.
Violations are immutable and descriptive. They do not attempt to recover or suggest fixes.
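As a minimal sketch of what "immutable and descriptive" could look like, the following uses a frozen dataclass. The field names `invariant` and `message` are assumptions made for illustration; the fields of the real `InvariantViolation` are not documented here.

```python
from dataclasses import FrozenInstanceError, dataclass


@dataclass(frozen=True)
class InvariantViolation:
    # Hypothetical fields; the real dataclass may carry different ones.
    invariant: str  # name of the invariant that failed
    message: str    # human-readable description of the failure


violation = InvariantViolation(
    invariant="contains_ditamap",
    message="No .ditamap file found in package root.",
)

try:
    # A frozen dataclass rejects attribute assignment after construction.
    violation.message = "patched"
except FrozenInstanceError:
    mutation_rejected = True
else:
    mutation_rejected = False
```

Freezing the dataclass means a violation, once reported, cannot be altered by later stages.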
assert_invariants(package_dir)
Assert that all filesystem invariants hold for the DITA package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package_dir | Path | Root directory of the DITA package. | required |
Raises:
| Type | Description |
|---|---|
| RuntimeError | If any invariant is violated. |
evaluate_invariants(package_dir)
Evaluate all filesystem-level invariants for a DITA package.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package_dir | Path | Root directory of the DITA package. | required |
Returns:
| Type | Description |
|---|---|
| List[InvariantViolation] | List of invariant violations. |
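One plausible way the two entry points relate is sketched below: an evaluating function collects violations without raising, and an asserting function raises `RuntimeError` when that list is non-empty. This is an assumption about the design, not the module's confirmed implementation. The names `evaluate`, `assert_all`, `always_ok`, and `always_fails` are hypothetical, and violations are plain strings here rather than `InvariantViolation` objects.

```python
from pathlib import Path
from typing import Callable, List

# Stand-in type: each check takes the package root and returns violations.
Check = Callable[[Path], List[str]]


def evaluate(package_dir: Path, checks: List[Check]) -> List[str]:
    """Run every check and collect all violations without raising."""
    violations: List[str] = []
    for check in checks:
        violations.extend(check(package_dir))
    return violations


def assert_all(package_dir: Path, checks: List[Check]) -> None:
    """Raise RuntimeError if any check reports a violation."""
    violations = evaluate(package_dir, checks)
    if violations:
        raise RuntimeError("Invariant violations: " + "; ".join(violations))


def always_ok(package_dir: Path) -> List[str]:
    # Toy check that never reports a violation.
    return []


def always_fails(package_dir: Path) -> List[str]:
    # Toy check that always reports one violation.
    return [f"{package_dir.name}: example violation"]
```

Separating evaluation from assertion lets a caller inspect every violation for reporting, while the pipeline itself can still halt on the first call to the asserting variant.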
invariant_contains_ditamap(package_dir)
Ensure the package contains at least one .ditamap file.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package_dir | Path | Root directory of the DITA package. | required |
Returns:
| Type | Description |
|---|---|
| List[InvariantViolation] | List of invariant violations (empty if none). |
invariant_package_root_exists(package_dir)
Ensure the package root directory exists and is a directory.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package_dir | Path | Root directory of the DITA package. | required |
Returns:
| Type | Description |
|---|---|
| List[InvariantViolation] | List of invariant violations (empty if none). |
invariant_topics_directory_present(package_dir)
Ensure the topics/ directory exists under the package root.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| package_dir | Path | Root directory of the DITA package. | required |
Returns:
| Type | Description |
|---|---|
| List[InvariantViolation] | List of invariant violations (empty if none). |
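The three filesystem checks described above can be approximated with `pathlib`. The sketch below is illustrative only: the function names are shortened stand-ins, violations are plain strings, and it assumes that `.ditamap` files are searched only at the top level of the package, which the documentation does not state.

```python
import tempfile
from pathlib import Path
from typing import List


def root_exists(package_dir: Path) -> List[str]:
    """Report a violation if the root is missing or is not a directory."""
    if not package_dir.is_dir():
        return [f"Package root is missing or not a directory: {package_dir}"]
    return []


def contains_ditamap(package_dir: Path) -> List[str]:
    """Report a violation if no .ditamap file is found."""
    # Assumption: only the top level of the package is searched.
    if not package_dir.is_dir() or not any(package_dir.glob("*.ditamap")):
        return [f"No .ditamap file found in: {package_dir}"]
    return []


def topics_directory_present(package_dir: Path) -> List[str]:
    """Report a violation if topics/ is absent under the root."""
    if not (package_dir / "topics").is_dir():
        return [f"Missing topics/ directory under: {package_dir}"]
    return []


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # An empty directory fails the ditamap and topics checks.
    before = contains_ditamap(root) + topics_directory_present(root)
    (root / "main.ditamap").write_text("<map/>", encoding="utf-8")
    (root / "topics").mkdir()
    # After adding both, all three checks pass.
    after = root_exists(root) + contains_ditamap(root) + topics_directory_present(root)
```

Each check returns a list rather than a boolean, so results from several checks can be concatenated directly.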
validate_single_main_map(inventory)
Validate that exactly one main map exists in the discovery inventory.
Accepts either:
- MapType enum values
- string contract values ("MAIN_MAP")
This keeps invariants tolerant of internal vs serialized classification representations.
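The tolerance for both representations could work roughly as follows. This is a sketch under several assumptions: that `MapType` is a `str`-based enum, that the inventory can be reduced to a flat sequence of classifications, and that the result is a boolean. The `OTHER` member and the helpers `is_main_map` and `has_single_main_map` are hypothetical.

```python
from enum import Enum
from typing import Iterable, Union


class MapType(str, Enum):
    # Only MAIN_MAP is named in the text; OTHER is a placeholder member.
    MAIN_MAP = "MAIN_MAP"
    OTHER = "OTHER"


def is_main_map(classification: Union[MapType, str]) -> bool:
    """Treat the enum member and its serialized string as equivalent."""
    if isinstance(classification, MapType):
        return classification is MapType.MAIN_MAP
    return classification == MapType.MAIN_MAP.value


def has_single_main_map(classifications: Iterable[Union[MapType, str]]) -> bool:
    """Return True when exactly one entry is classified as the main map."""
    return sum(1 for c in classifications if is_main_map(c)) == 1
```

Normalizing at the comparison point means the same check works on an in-memory inventory and on one that has been round-tripped through serialization.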
dita_package_processor.knowledge.known_patterns
Known structural patterns for DITA discovery.
This module loads and validates declarative discovery patterns defined in known_patterns.yaml. Patterns are normalized into immutable Pattern objects suitable for deterministic evaluation.
This module performs:
- no classification
- no inference
- no filesystem inspection beyond loading the YAML file
Its responsibility is limited to: YAML → validated data → normalized Pattern objects
load_normalized_patterns()
Load, validate, and normalize all discovery patterns.
This is the canonical entry point used by discovery classifiers.
Returns:
| Type | Description |
|---|---|
| List[Pattern] | List of normalized Pattern objects. |
Raises:
| Type | Description |
|---|---|
| ValueError | If any pattern is invalid. |
load_patterns()
Load declarative discovery patterns from known_patterns.yaml.
Expected YAML structure::
    version: 1
    patterns:
      - id: ...
        applies_to: ...
        signals: ...
        asserts:
          role: ...
          confidence: ...
        rationale:
          - ...
This function only loads and validates the raw YAML structure.
Pattern normalization is handled by load_normalized_patterns().
Returns:
| Type | Description |
|---|---|
| Dict[str, Any] | Parsed YAML document. |
Raises:
| Type | Description |
|---|---|
| ValueError | If structure is invalid. |
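The "validated data → normalized Pattern objects" step might look like the sketch below, which starts from an already parsed document so that no YAML library is required. The `Pattern` fields, the `normalize` helper, and the placement of `role` and `confidence` under `asserts` are assumptions inferred from the expected YAML structure, not the module's actual definitions. The sample values are invented for illustration.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Tuple


@dataclass(frozen=True)
class Pattern:
    # Hypothetical fields mirroring keys from the expected YAML structure.
    id: str
    applies_to: str
    role: str
    confidence: float
    rationale: Tuple[str, ...]


def normalize(document: Dict[str, Any]) -> List[Pattern]:
    """Validate a parsed document and convert entries to Pattern objects."""
    raw_patterns = document.get("patterns")
    if not isinstance(raw_patterns, list):
        raise ValueError("'patterns' must be a list")
    patterns: List[Pattern] = []
    for raw in raw_patterns:
        asserts = raw.get("asserts") if isinstance(raw, dict) else None
        if not isinstance(asserts, dict) or "id" not in raw or "role" not in asserts:
            raise ValueError(f"Invalid pattern entry: {raw!r}")
        patterns.append(
            Pattern(
                id=str(raw["id"]),
                applies_to=str(raw.get("applies_to", "")),
                role=str(asserts["role"]),
                confidence=float(asserts.get("confidence", 0.0)),
                # A tuple keeps the normalized object fully immutable.
                rationale=tuple(raw.get("rationale", [])),
            )
        )
    return patterns


# Illustrative document; the values are made up for this example.
document = {
    "version": 1,
    "patterns": [
        {
            "id": "single_root_map",
            "applies_to": "map",
            "asserts": {"role": "MAIN_MAP", "confidence": 0.9},
            "rationale": ["Only map at package root"],
        }
    ],
}
patterns = normalize(document)
```

Keeping loading and normalization separate, as the module does, means malformed YAML and invalid pattern content surface as distinct failures.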
dita_package_processor.knowledge.map_types
Canonical DITA artifact classification types.
This module defines the authoritative classification enums used during discovery and planning.
Design Principles
- Deterministic
- Explicit
- Stable string values
- No inference logic
- No transformation logic
- No mutation
These types represent observed structural intent only. They do not imply correctness.
ArtifactCategory
Bases: str, Enum
High-level artifact categories recognized by the processor.
Categories distinguish maps, topics, and other structural artifacts during discovery and inventory construction.
__str__()
Return stable string value.
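A `str`-based enum with an overridden `__str__` can be sketched as follows. The member names and values (`MAP`, `TOPIC`) are illustrative placeholders; the real members of `ArtifactCategory` are not listed in this reference.

```python
from enum import Enum


class ArtifactCategory(str, Enum):
    # Member names and values are illustrative placeholders.
    MAP = "map"
    TOPIC = "topic"

    def __str__(self) -> str:
        """Return the stable string value."""
        return self.value


# str() yields the raw value, which is convenient for serialization.
label = str(ArtifactCategory.MAP)

# The value round-trips back to the same member.
restored = ArtifactCategory("topic")
```

Returning the value from `__str__` keeps serialized output independent of the enum's class name, which supports the "stable string values" principle stated above.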