recovery pipeline feature
@@ -0,0 +1,66 @@
# Feature Design Map: Recovery Pipeline

## Bounded Contexts

- `ingest-snapshot` — owns deterministic upstream bundle ingest, segment boundaries, canonical source projection, and run manifests.
- `dependency-recovery` — owns vendored package identification, dependency decisions, externalization, and bundled fallback preservation.
- `static-context-evidence` — owns deterministic context packets, binding graphs, and usage evidence for downstream consumers.
- `snapshot-lineage` — owns adjacent-run matching, durable lineage, change classification, relabel eligibility, and upstream summary facts.
- `iterative-naming` — owns relabel queue planning, batch execution handoff, wave reconciliation, safe rename acceptance, and naming memory updates.
- `codebase-regularization` — owns deterministic file placement, structural splitting, import/export reconstruction, and canonical editable tree emission.
- `maintained-transform-replay` — owns replay of long-lived maintained transforms and replay conflict reporting.
- `release-packaging` — owns release artifact assembly, provenance manifests, and publication-ready outputs.

## Feature Step to Workflow Slice Map

| Feature Step | Bounded Context | Workflow Slice | Notes |
| :----------- | :-------------- | :------------- | :---- |
| Ingest upstream bundle snapshot into deterministic recovery artifacts | `ingest-snapshot` | `deterministic-bundle-ingest` | Produces the canonical per-run source of truth used by all later slices. |
| Identify vendored package boundaries and confidence decisions | `dependency-recovery` | `identify-vendored-packages` | Consumes ingest artifacts and records accepted, rejected, and unresolved dependency decisions. |
| Replace accepted vendored packages with external dependencies while keeping fallbacks | `dependency-recovery` | `externalize-accepted-dependencies` | Depends on identified package decisions; unresolved packages stay bundled. |
| Extract deterministic context packets for each segment | `static-context-evidence` | `extract-segment-context` | Consumes ingest output after dependency treatment to emit machine-readable evidence. |
| Compare adjacent runs and classify lineage-aware changes | `snapshot-lineage` | `diff-adjacent-runs` | Consumes current and previous run manifests plus Phase 3 context. |
| Rank relabel candidates into deterministic queue packets | `iterative-naming` | `plan-relabel-queue` | Uses only new and modified segments from snapshot-lineage. |
| Execute queued relabel batches against the model provider in waves | `iterative-naming` | `execute-wave-batches` | Owns outbound API execution only; no naming decisions are applied here. |
| Evaluate responses, accept safe names, and update queue state | `iterative-naming` | `evaluate-and-apply-renames` | Reconciles at wave boundary and updates naming memory. |
| Emit the canonical editable recovered tree | `codebase-regularization` | `regularize-editable-tree` | Must preserve build-first while improving navigability. |
| Replay long-lived maintained transforms onto the regularized tree | `maintained-transform-replay` | `replay-maintained-transforms` | Carries durable local changes across upgrades. |
| Build release artifacts and publication metadata | `release-packaging` | `build-and-publish-artifacts` | Packages processed and unmodified artifacts for traceable release output. |

## Cross-Context Handoffs

- `ingest-snapshot` -> `dependency-recovery` via run manifest, segments, and canonical projection because vendored matching starts from deterministic ingest evidence.
- `ingest-snapshot` -> `static-context-evidence` via stable segment records because context extraction depends on canonical segment boundaries.
- `dependency-recovery` -> `static-context-evidence` via accepted externalization decisions and preserved fallbacks because context packets must describe the post-decision code surface.
- `static-context-evidence` -> `snapshot-lineage` via deterministic context packets because fuzzy matching and summary facts need machine-readable evidence.
- `snapshot-lineage` -> `iterative-naming` via relabel-eligible changed/new segments and ambiguity reports because only safe changed material should enter naming work.
- `iterative-naming` -> `codebase-regularization` via safely renamed generated source and naming memory because regularization should operate on the best accepted recovered names.
- `codebase-regularization` -> `maintained-transform-replay` via canonical editable tree and placement mappings because replay targets the regularized tree, not the pre-regularized source.
- `maintained-transform-replay` -> `release-packaging` via replay outcomes and transformed tree state because releases must reflect which maintained transforms were applied, skipped, or conflicted.

## Recommended Slice Order

1. `ingest-snapshot/deterministic-bundle-ingest` — all later slices depend on deterministic ingest artifacts and canonical segment boundaries.
2. `dependency-recovery/identify-vendored-packages` — shrinks the app-authored surface before later evidence and naming work.
3. `dependency-recovery/externalize-accepted-dependencies` — completes dependency treatment before downstream evidence extraction.
4. `static-context-evidence/extract-segment-context` — provides deterministic evidence used by diffing, summaries, and transform anchoring.
5. `snapshot-lineage/diff-adjacent-runs` — identifies changed/new material and durable lineage needed for iterative naming.
6. `iterative-naming/plan-relabel-queue` — transforms changed material into deterministic naming work packets.
7. `iterative-naming/execute-wave-batches` — executes persisted batches without applying names yet.
8. `iterative-naming/evaluate-and-apply-renames` — applies only accepted names after wave reconciliation.
9. `codebase-regularization/regularize-editable-tree` — emits the canonical browsable tree once safe names are available.
10. `maintained-transform-replay/replay-maintained-transforms` — reapplies durable local changes onto the regularized tree.
11. `release-packaging/build-and-publish-artifacts` — packages the final tree and release metadata last.
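
The recommended order can be checked mechanically against the cross-context handoffs. The following is a minimal sketch, not part of the design artifacts: the context and slice identifiers come from this document, while `handoffs`, `recommendedOrder`, and `findOrderViolations` are hypothetical names chosen for illustration. It reports any handoff whose consumer context would start before its producer context has finished.

```typescript
// Each pair is [producer context, consumer context], one per handoff bullet.
const handoffs: ReadonlyArray<readonly [string, string]> = [
  ["ingest-snapshot", "dependency-recovery"],
  ["ingest-snapshot", "static-context-evidence"],
  ["dependency-recovery", "static-context-evidence"],
  ["static-context-evidence", "snapshot-lineage"],
  ["snapshot-lineage", "iterative-naming"],
  ["iterative-naming", "codebase-regularization"],
  ["codebase-regularization", "maintained-transform-replay"],
  ["maintained-transform-replay", "release-packaging"],
];

// Recommended slice order, written as "context/slice".
const recommendedOrder: ReadonlyArray<string> = [
  "ingest-snapshot/deterministic-bundle-ingest",
  "dependency-recovery/identify-vendored-packages",
  "dependency-recovery/externalize-accepted-dependencies",
  "static-context-evidence/extract-segment-context",
  "snapshot-lineage/diff-adjacent-runs",
  "iterative-naming/plan-relabel-queue",
  "iterative-naming/execute-wave-batches",
  "iterative-naming/evaluate-and-apply-renames",
  "codebase-regularization/regularize-editable-tree",
  "maintained-transform-replay/replay-maintained-transforms",
  "release-packaging/build-and-publish-artifacts",
];

// Returns "producer -> consumer" for every handoff the order would break.
function findOrderViolations(
  order: ReadonlyArray<string>,
  edges: ReadonlyArray<readonly [string, string]>
): string[] {
  const firstIndex: Record<string, number | undefined> = {};
  const lastIndex: Record<string, number | undefined> = {};
  order.forEach((sliceId, index) => {
    const slash = sliceId.indexOf("/");
    const context = slash === -1 ? sliceId : sliceId.slice(0, slash);
    if (firstIndex[context] === undefined) firstIndex[context] = index;
    lastIndex[context] = index;
  });

  const violations: string[] = [];
  edges.forEach(([producer, consumer]) => {
    const producerDone = lastIndex[producer];
    const consumerStart = firstIndex[consumer];
    // A missing context counts as a violation, as does a consumer that
    // starts at or before the producer's last slice.
    if (
      producerDone === undefined ||
      consumerStart === undefined ||
      producerDone >= consumerStart
    ) {
      violations.push(producer + " -> " + consumer);
    }
  });
  return violations;
}
```

Calling `findOrderViolations(recommendedOrder, handoffs)` on the data above returns an empty array, which is one way to keep the order and the handoff list from drifting apart as slices are added.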

## Orchestration Notes

- The feature-level pipeline is linear by default, but review-needed findings do not automatically halt later safe slices in MVP.
- `iterative-naming` contains three slices inside one bounded context; only wave orchestration crosses those slice boundaries.
- Cross-context decisions stay at handoff seams: each slice makes decisions only over state owned by its context.
- Build-first remains the feature-level acceptance rule, especially across `codebase-regularization`, `maintained-transform-replay`, and `release-packaging`.
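
A linear runner that follows these notes could look roughly like the sketch below. It is illustrative only: `SliceOutcome`, `SliceStep`, `PipelineResult`, and `runLinear` are hypothetical names, and two behaviours are assumptions rather than stated design. First, every review-needed finding is treated as safe to continue past. Second, a build failure is treated as the only outcome that halts the run.

```typescript
// Hypothetical outcome shape; the design does not define one.
type SliceOutcome =
  | { kind: "ok" }
  | { kind: "review-needed"; report: string }
  | { kind: "build-failed"; report: string };

interface SliceStep {
  id: string;
  run: () => SliceOutcome;
}

interface PipelineResult {
  completed: string[];
  reviewNeeded: string[];
  haltedAt: string | null;
}

// Runs slices in order. Review-needed findings are recorded and the run
// continues; a build failure stops the run (assumed halting rule).
function runLinear(steps: ReadonlyArray<SliceStep>): PipelineResult {
  const result: PipelineResult = { completed: [], reviewNeeded: [], haltedAt: null };
  for (const step of steps) {
    const outcome = step.run();
    if (outcome.kind === "build-failed") {
      result.haltedAt = step.id;
      break;
    }
    if (outcome.kind === "review-needed") {
      result.reviewNeeded.push(step.id + ": " + outcome.report);
    }
    result.completed.push(step.id);
  }
  return result;
}
```

Keeping the halt decision in the runner rather than in the slices matches the note that each slice decides only over state owned by its own context.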

## Open Questions

- `static-context-evidence` will consume post-externalization source as its canonical input; if pre-externalization review becomes necessary later, treat it as a secondary review artifact rather than the main slice input.
- The release docs imply publication is optional; the exact publication handoff seam inside `release-packaging` is still open.
- Build verification is a hard invariant, but the repository-wide command set for that verification is not yet frozen in the design artifacts.
@@ -0,0 +1,95 @@
# Feature Discovery: Recovery Pipeline

## 1. Commands (User Intents)

- Pipeline operator wants to ingest an upstream bundle snapshot because they need a deterministic base for recovery work.
- Pipeline operator wants to identify and externalize vendored dependencies because they want to shrink the app-authored surface that later phases must understand.
- Pipeline operator wants to extract deterministic context because later phases need machine-readable evidence without relying on an LLM as the source of truth.
- Pipeline operator wants to diff the current snapshot against the previous snapshot because they want durable lineage, compact upstream summaries, and to avoid resending unchanged material for naming.
- Pipeline operator wants to iteratively relabel changed and new code because they want a more browsable recovered tree with readable names across modules, functions, locals, and parameters.
- Pipeline operator wants to regularize recovered output into a canonical editable tree because they care most about a browsable codebase.
- Pipeline operator wants the recovered tree to build because buildability is the current hard success invariant.
- Pipeline operator wants uncertain areas surfaced in manifests and reports because uncertainty should not block MVP progress.
- Pipeline operator wants manual runtime rescue patches captured as formal maintained transforms because repeated upgrades should become replayable.
- Pipeline operator wants to publish processed and unmodified artifacts with provenance because releases should remain traceable to the upstream snapshot.

## 2. Events (Domain Facts)

- Upstream snapshot ingested (payload: run ID, upstream snapshot identity, emitted manifest, emitted segments).
- Dependency candidate identified (payload: candidate package, evidence, recovered segment boundary).
- Dependency decision recorded (payload: accepted|rejected|unresolved, confidence, rationale, fallback reference).
- Context packet extracted (payload: segment ID, bindings, links, evidence, heuristics).
- Run diff completed (payload: unchanged|modified|new|deleted|split|merged|ambiguous classifications, lineage updates).
- Relabel candidate queued (payload: candidate ID, pass kind, evidence score, difficulty score, priority score).
- Batch wave executed (payload: wave ID, batch IDs, model/config, execution outcomes).
- Rename proposal evaluated (payload: accepted|deferred|stalled|exhausted outcomes, rejection reasons, counters).
- Accepted names applied (payload: candidate fields renamed, updated source/metadata, naming-memory updates).
- Regularized tree emitted (payload: canonical repo-root tree, regularization manifest, placement mappings).
- Review-needed artifact emitted (payload: phase, machine-readable report, concise human summary).
- Maintained transform replayed (payload: applied|conflict|skipped outcome, transform metadata, replay report).
- Release artifact set emitted (payload: processed-source artifact, unmodified-source artifact, release manifest, release notes).
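
Several of these payloads carry closed sets of outcomes, which map naturally onto a discriminated union. The sketch below covers three of the events as an illustration. The outcome literals are taken from the list above, but every type name and field name (`PipelineEvent`, `fallbackRef`, `transformId`, and so on) is an assumption, as is the rule in `needsReview` that unresolved decisions, ambiguous classifications, and replay conflicts are the cases worth flagging.

```typescript
// Outcome sets copied from the event payload descriptions.
type DependencyDecision = "accepted" | "rejected" | "unresolved";
type ChangeClass =
  | "unchanged" | "modified" | "new" | "deleted"
  | "split" | "merged" | "ambiguous";
type ReplayOutcome = "applied" | "conflict" | "skipped";

// Hypothetical event shapes; field names are illustrative only.
type PipelineEvent =
  | {
      type: "DependencyDecisionRecorded";
      decision: DependencyDecision;
      confidence: number;
      rationale: string;
      fallbackRef: string | null;
    }
  | {
      type: "RunDiffCompleted";
      classifications: ReadonlyArray<{ segmentId: string; change: ChangeClass }>;
    }
  | {
      type: "MaintainedTransformReplayed";
      outcome: ReplayOutcome;
      transformId: string;
    };

// Assumed rule: flag events that carry an unresolved, ambiguous, or
// conflicting outcome. The switch is exhaustive over the union.
function needsReview(pipelineEvent: PipelineEvent): boolean {
  switch (pipelineEvent.type) {
    case "DependencyDecisionRecorded":
      return pipelineEvent.decision === "unresolved";
    case "RunDiffCompleted":
      return pipelineEvent.classifications.some((c) => c.change === "ambiguous");
    case "MaintainedTransformReplayed":
      return pipelineEvent.outcome === "conflict";
  }
}
```

With this shape, adding a new event variant without handling it in `needsReview` becomes a compile-time error under strict settings, which fits the goal of keeping safety decisions deterministic.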

## 3. Business Rules & Invariants

- Rule: The repo root is always the latest canonical editable recovered tree.
- Rule: Per-run artifacts, evidence, queue state, and review reports live under `runs/`.
- Rule: Buildability outranks readability; risky naming or regularization must not be accepted if it jeopardizes correctness.
- Rule: Runtime completeness is desirable but not required for MVP progression if the output still builds and remains browsable.
- Rule: Uncertainty should be surfaced in manifests and reports instead of silently guessed away.
- Rule: For MVP, review-needed states should not halt the entire pipeline if later phases can proceed safely.
- Rule: Later phases must consume deterministic machine-readable artifacts as source of truth.
- Rule: LLM output may assist naming and ambiguous ranking, but must not become the source of truth for deterministic structure, matching, or safety decisions.
- Rule: The root recovered tree is generated, not hand-maintained between runs.
- Rule: Upgrades should start from raw ingest, reuse deterministic prior evidence where valid, then replay maintained transforms.
- Rule: If manual fixes are needed because the code is not runnable, those fixes should become formal Phase 9 maintained transforms.
- Invariant: Build-first is the current formal verification bar for successful regularization/publishing.
- Invariant: If a more navigable regularization attempt breaks the build, the failed attempt must be surfaced for review rather than silently degraded.
- Invariant: Review surfacing must include both machine-readable artifacts and concise human-readable summaries.
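
The build-first invariant and the rule that failed attempts are surfaced rather than silently degraded can be sketched as a small selection function. This is an assumption-heavy illustration: the design does not say that regularization produces several ranked attempts, and `TreeAttempt`, `navigabilityScore`, `BuildFirstSelection`, and `selectBuildFirst` are hypothetical. The `builds` callback stands in for whatever build command set is eventually frozen.

```typescript
// Hypothetical description of one candidate regularized tree.
interface TreeAttempt {
  label: string;
  navigabilityScore: number; // higher means more browsable (assumed metric)
}

interface BuildFirstSelection {
  accepted: TreeAttempt | null;
  reviewNeeded: string[]; // one entry per attempt that failed to build
}

// Tries attempts from most to least navigable and accepts the first one
// that builds. Every attempt that fails is recorded for review.
function selectBuildFirst(
  attempts: ReadonlyArray<TreeAttempt>,
  builds: (attempt: TreeAttempt) => boolean
): BuildFirstSelection {
  const ranked = attempts
    .slice()
    .sort((a, b) => b.navigabilityScore - a.navigabilityScore);
  const reviewNeeded: string[] = [];
  for (const attempt of ranked) {
    if (builds(attempt)) {
      return { accepted: attempt, reviewNeeded: reviewNeeded };
    }
    reviewNeeded.push("build failed for attempt: " + attempt.label);
  }
  return { accepted: null, reviewNeeded: reviewNeeded };
}
```

The point of returning `reviewNeeded` alongside the accepted attempt is that a fallback to a less navigable tree is always visible in the result, never implicit.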

## 4. Edge Cases Handled

- Case: Dependency match confidence is low or colliding -> record as unresolved or review-needed instead of forcing externalization.
- Case: Vendored replacement may drift from bundled behavior -> preserve bundled fallback implementations for validation and safety.
- Case: Diff matching remains contested -> emit `ambiguous` artifacts and exclude those segments from automated lineage-dependent actions.
- Case: Rename candidates lack sufficient evidence -> keep them visible in queue state, defer and retry deterministically, then allow terminal `stalled` or `exhausted` outcomes rather than retrying forever.
- Case: Model response is low confidence, insufficiently specific, invalid, or collision-prone -> reject deterministically and feed structured reasons back into queue state.
- Case: A more readable split or placement would make the tree fail to build -> surface the failed regularization attempt for review.
- Case: Runtime behavior is incomplete after recovery -> allow manual rescue patches, but capture durable fixes as maintained transforms when they must persist across upgrades.
- Case: Publication fails after artifacts are built -> keep local built artifacts and separate publication failure from build failure.
- Case: Review-needed findings appear in MVP -> continue later safe phases while recording artifacts for later inspection.
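
The rename-candidate case describes a bounded retry loop with terminal outcomes. The sketch below shows one possible reducer for it. The status names come from this document, but their exact meaning is not defined here, so the following are assumptions: `exhausted` means the attempt budget (`MAX_ATTEMPTS`, a made-up value) is used up, and `stalled` means a retry produced no better evidence score than the previous attempt. All type and field names are hypothetical.

```typescript
type CandidateStatus = "queued" | "deferred" | "accepted" | "stalled" | "exhausted";

interface CandidateState {
  candidateId: string;
  status: CandidateStatus;
  attempts: number;
  lastEvidenceScore: number;
  rejectionReasons: string[];
}

// Hypothetical result of evaluating one model response for a candidate.
interface WaveEvaluation {
  accepted: boolean;
  evidenceScore: number;
  rejectionReason: string | null;
}

const MAX_ATTEMPTS = 3; // assumed retry budget, not from the design

// Pure reducer: returns the next queue state for one candidate.
function applyEvaluation(state: CandidateState, evaluation: WaveEvaluation): CandidateState {
  // Terminal states never change again, so retries cannot run forever.
  if (state.status === "accepted" || state.status === "stalled" || state.status === "exhausted") {
    return state;
  }
  const attempts = state.attempts + 1;
  let status: CandidateStatus;
  if (evaluation.accepted) {
    status = "accepted";
  } else if (attempts >= MAX_ATTEMPTS) {
    status = "exhausted";
  } else if (state.attempts > 0 && evaluation.evidenceScore <= state.lastEvidenceScore) {
    status = "stalled"; // a retry that brought no better evidence
  } else {
    status = "deferred";
  }
  return {
    candidateId: state.candidateId,
    status: status,
    attempts: attempts,
    lastEvidenceScore: evaluation.evidenceScore,
    // Structured rejection reasons accumulate so they stay visible in queue state.
    rejectionReasons:
      evaluation.rejectionReason === null
        ? state.rejectionReasons
        : state.rejectionReasons.concat([evaluation.rejectionReason]),
  };
}
```

Because the function is pure and depends only on the previous state and the evaluation, replaying the same wave results yields the same queue state, which is the deterministic retry behaviour the edge case asks for.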

## 5. Candidate Bounded Contexts

- Ingest & Snapshot Evidence: owns deterministic bundle ingest, segment records, and canonical projections.
- Dependency Recovery: owns vendored package identification, confidence decisions, externalization, and fallback preservation.
- Static Context Evidence: owns deterministic context extraction artifacts and evidence packets.
- Snapshot Lineage & Change Detection: owns run-to-run matching, lineage, change classification, and upstream summaries.
- Iterative Naming: owns relabel queue planning, batch execution handoff, semantic acceptance, safe rename application, and naming memory.
- Codebase Regularization: owns deterministic file/folder placement, structural splitting, import/export reconstruction, and editable-tree emission.
- Maintained Transform Replay: owns deterministic replay of long-lived transforms and replay conflict reporting.
- Release Packaging: owns artifact packaging, provenance manifests, and optional publication.

## 6. Candidate Workflow Slices

- ingest-snapshot/deterministic-bundle-ingest: turn an upstream bundle into deterministic segment records and canonical source projection.
- dependency-recovery/identify-vendored-packages: score dependency candidates and recover package boundaries.
- dependency-recovery/externalize-accepted-dependencies: replace accepted vendored code with npm imports while preserving fallbacks.
- static-context-evidence/extract-segment-context: emit canonical context packets and binding/link evidence.
- snapshot-lineage/diff-adjacent-runs: classify changes, mint lineage, and produce relabel queues plus upstream summaries.
- iterative-naming/plan-relabel-queue: compute candidate evidence, difficulty, priority, and batch-ready work items.
- iterative-naming/execute-wave-batches: send persisted batch artifacts to the model provider in parallel waves.
- iterative-naming/evaluate-and-apply-renames: validate wave results, accept safe names, update queue state, and refresh naming memory.
- codebase-regularization/regularize-editable-tree: produce the canonical repo-root tree with deterministic placement and mappings.
- maintained-transform-replay/replay-maintained-transforms: apply stored transforms safely and emit replay outcomes.
- release-packaging/build-and-publish-artifacts: package processed and unmodified artifacts with release metadata.

## 7. Shared Language Notes

- Preferred term: Recovery Pipeline = the full release-oriented workflow that turns an upstream bundle snapshot into a buildable, browsable recovered tree plus release artifacts.
- Preferred term: Recovered Tree = the canonical editable source tree emitted at repo root.
- Preferred term: Build-first = the current formal invariant that the recovered tree must build even if runtime completeness is still partial.
- Preferred term: Review-needed artifact = a machine-readable report plus concise human summary describing uncertainty, failure, or conflict that requires later inspection.
- Preferred term: Maintained Transform = a durable replayable change stored outside the numbered upstream-processing pipeline and reapplied in Phase 9.
- Preferred term: Naming Memory = accepted-name history reused to improve future relabel iterations.
- Avoid: “original repo layout” when you mean the deterministic regularized editable tree.
- Avoid: “runtime complete” when you only mean “buildable and browsable enough to inspect.”
@@ -0,0 +1,124 @@
# Design Status: Recovery Pipeline

## Feature

- Name: `Recovery Pipeline`
- Feature slug: `recovery-pipeline`
- Current phase: `Context & Workflow Decomposition`
- Overall status: `Decomposition In Progress`
- Security verification status: `Not Started`
- Current workflow slice: `ingest-snapshot/deterministic-bundle-ingest`

## Feature Artifacts

- [x] `design/feature/recovery-pipeline/discovery.md`
- [x] `design/feature/recovery-pipeline/design.md`
- [x] `design/feature/recovery-pipeline/status.md`

## Feature Discovery Gate

- [x] feature goal and actor intents captured
- [x] commands and events identified at feature level
- [x] business rules and invariants captured at feature level
- [x] edge cases captured at feature level
- [x] candidate bounded contexts identified
- [x] candidate workflow inventory identified
- [x] project-wide shared-language updates captured
- [x] approved for context and workflow decomposition

## Context & Workflow Decomposition Gate

- [x] bounded contexts confirmed
- [x] feature steps mapped to workflow slices
- [x] cross-context handoffs recorded
- [x] per-context shared-language files created or updated
- [x] workflow folders created with `01-decomposition.md`
- [x] recommended slice order recorded
- [ ] approved to begin slice discovery

## Workflow Slice Tracker

| Bounded Context | Workflow Slice | Slice Discovery | Core Sketch | Blueprint | Design Security | Assembly | Impl Security | Refactor | Notes |
| :-------------- | :------------- | :-------------- | :---------- | :-------- | :-------------- | :------- | :------------ | :------- | :---- |
| `ingest-snapshot` | `deterministic-bundle-ingest` | `Complete` | `Complete` | `Ready` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Foundational source-of-truth slice.` |
| `dependency-recovery` | `identify-vendored-packages` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Shrinks app-authored surface before later phases.` |
| `dependency-recovery` | `externalize-accepted-dependencies` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Depends on package identification decisions.` |
| `static-context-evidence` | `extract-segment-context` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Produces deterministic evidence for downstream consumers.` |
| `snapshot-lineage` | `diff-adjacent-runs` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Owns lineage and changed/new segment routing.` |
| `iterative-naming` | `plan-relabel-queue` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Queue planning only.` |
| `iterative-naming` | `execute-wave-batches` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Outbound model execution only.` |
| `iterative-naming` | `evaluate-and-apply-renames` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Safe deterministic acceptance and application.` |
| `codebase-regularization` | `regularize-editable-tree` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Must preserve build-first invariant.` |
| `maintained-transform-replay` | `replay-maintained-transforms` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Carries maintained changes across upgrades.` |
| `release-packaging` | `build-and-publish-artifacts` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Not Started` | `Release-oriented output only.` |

## Current Slice Gates

### Slice Discovery Gate

- [x] selected slice named explicitly
- [x] happy path captured
- [x] edge cases captured
- [x] business rules and invariants captured
- [x] handoff assumptions captured
- [x] context shared-language updates captured
- [x] approved for core sketch

### Core Sketch Gate

- [x] required state is explicit
- [x] command and events are explicit
- [x] policy signature is explicit
- [x] slice boundaries are explicit
- [x] no cross-context decision logic inside the slice
- [x] approved for blueprint

### Blueprint Gate

- [ ] domain types make illegal states harder to express
- [ ] shared concepts reused appropriately
- [ ] policy is pure
- [ ] reducer/apply shape is explicit
- [ ] workflow contract is explicit
- [ ] approved for design security review or assembly

### Design Security Gate

- [ ] trust boundaries reviewed
- [ ] authority and least privilege reviewed
- [ ] sink and data-flow risks reviewed
- [ ] blocking findings resolved or explicitly accepted
- [ ] approved for assembly

### Assembly Gate

- [ ] tests added
- [ ] implementation completed
- [ ] types pass
- [ ] tests passing
- [ ] effect AST checks run for modified Effect files
- [ ] approved for implementation security review or next slice

### Implementation Security Gate

- [ ] implementation security review completed or explicitly deferred
- [ ] blocking findings resolved or explicitly accepted
- [ ] approved for refactor consideration or next slice

### Refactor Gate

- [ ] diagnosis completed if structural changes were needed
- [ ] execution completed if approved
- [ ] verification rerun after refactor
- [ ] slice complete
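
The gate checklists above imply a position for the current slice, which a small helper can derive. The sketch below is illustrative: `GateProgress`, `ingestSliceGates`, and `currentGate` are hypothetical names, the counts are transcribed from the checkboxes above, and the helper assumes gates are strictly sequential even though some approval items mention skipping ahead (for example "design security review or assembly").

```typescript
interface GateProgress {
  gate: string;
  checked: number; // boxes ticked in the checklist
  total: number;   // boxes in the checklist
}

// Checkbox counts for `ingest-snapshot/deterministic-bundle-ingest`.
const ingestSliceGates: ReadonlyArray<GateProgress> = [
  { gate: "Slice Discovery", checked: 7, total: 7 },
  { gate: "Core Sketch", checked: 6, total: 6 },
  { gate: "Blueprint", checked: 0, total: 6 },
  { gate: "Design Security", checked: 0, total: 5 },
  { gate: "Assembly", checked: 0, total: 6 },
  { gate: "Implementation Security", checked: 0, total: 3 },
  { gate: "Refactor", checked: 0, total: 4 },
];

// Returns the first gate with unchecked items, or null when all are done.
function currentGate(gates: ReadonlyArray<GateProgress>): string | null {
  for (const g of gates) {
    if (g.checked < g.total) return g.gate;
  }
  return null;
}
```

For the counts above, `currentGate(ingestSliceGates)` returns `"Blueprint"`, which agrees with the `Ready` entry in the Blueprint column of the Workflow Slice Tracker.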

## Open Questions / Blockers

- Build-first is selected, but the exact build command set is still implementation-specific.
- The release docs imply publication is optional; the exact publication handoff seam inside `release-packaging` is still open.

## Context Handoff Notes

- Read first: `design/feature/recovery-pipeline/discovery.md`
- Current focus: `Context & Workflow Decomposition`
- Do not change: `Buildability outranks readability, repo root is the latest editable tree, review-needed states continue in MVP, and uncertainty is surfaced through manifests and reports.`