T1 Research Findings: Asset Extraction Readiness
Date: 2026-05-11
Agent: researcher
Task: 20260511-T001 - Asset Extraction Readiness
Executive Summary
This T1 research phase analyzes the current asset extraction workflow, reviews the historical and current manifests, documents tooling capabilities, and identifies the specific gaps that must be closed before title-menu assets can be extracted.
1. Historical Asset Evidence Analysis
Stage Briefs Review
Finding: No dedicated asset stage brief exists. Asset handling is distributed across:
- SB-003 Render System: Mentions texture loading but no D2-specific formats
- SB-004 Font System: Focuses on font assets, not PVR/PVM textures
- SB-010 Build System: References asset pipeline but lacks D2-specific details
Gap: No comprehensive asset extraction workflow documentation exists.
Historical Manifest Analysis
File: /work/repo/asset-staging/raw/d2-title/extract-manifest.json
Key Findings:
1. Early Speculative Manifest: Contains 8 assets with placeholder entry names
2. Different Symbol Naming: Uses d2_asset_title_snow vs current d2_asset_snow
3. Partial Coverage: Missing menu, copyright, and some background layers
4. Notes Field: Provides useful context about asset roles
Speculative Entry Names Found:
- Q_TITLEBGMT0.PVM:BGMT0 (background)
- Q_DJSNOW.PVM:DJSNOW (snow particles)
- Q_TITLE2D.PVM:TITLE2D (logo)
- Q_TITLEMENU.PVM:TITLEMENU (menu)
- Q_TITLEMENU.PVM:COPYRIGHT (copyright)
- SAKA_MNSNOW1A.PVM:MNSNOW1A (snow variant A)
- SAKA_MNSNOW1B.PVM:MNSNOW1B (snow variant B)
- SAKA_MNSNOW1C.PVM:MNSNOW1C (snow variant C)
Current Manifest Analysis
File: /work/repo/tools/title_menu_manifest.json
Key Findings:
1. Expanded Coverage: 11 assets covering all required roles
2. Consistent Naming: Every symbol uses the d2_asset_* prefix
3. Role Classification: Explicit role field (background, logo, menu, overlay, particles, copyright)
4. Speculative Entry Names: Still uses placeholder entry names that need validation
Current Speculative Entry Names:
- Q_TITLEBGMT0.PVM:TITLEBGMT0
- Q_TITLEBGMT1.PVM:TITLEBGMT1
- Q_TITLEBGMT2.PVM:TITLEBGMT2
- Q_DMTITLE.PVM:DMTITLE
- Q_TITLEMENU.PVM:TITLEMENU
- Q_TITLE2D.PVM:TITLE2D
- Q_DJSNOW.PVM:DJSNOW
- SAKA_MNSNOW1A.PVM:SAKA_MNSNOW1A
- SAKA_MNSNOW1B.PVM:SAKA_MNSNOW1B
- SAKA_MNSNOW1C.PVM:SAKA_MNSNOW1C
- P_COMTIT.PVR: (no entry, standalone PVR)
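The two manifests disagree on several entry names (for example BGMT0 versus TITLEBGMT0 for the same file). A minimal sketch of how that drift could be listed mechanically, assuming entries are written in the FILE:ENTRY form used above; the helper names split_entry_spec, entries_by_file, and diff_entry_names are hypothetical and not part of the existing tool:

```python
from __future__ import annotations

from typing import Optional


def split_entry_spec(spec: str) -> tuple[str, Optional[str]]:
    """Split 'FILE.PVM:ENTRY' into (file, entry); an empty entry becomes None."""
    source_file, _, entry = spec.partition(":")
    return source_file, (entry or None)


def entries_by_file(specs: list[str]) -> dict[str, set]:
    """Group entry names by the source file they belong to."""
    grouped: dict[str, set] = {}
    for spec in specs:
        source_file, entry = split_entry_spec(spec)
        grouped.setdefault(source_file, set()).add(entry)
    return grouped


def diff_entry_names(historical: list[str], current: list[str]) -> dict[str, tuple[set, set]]:
    """Return {file: (historical entries, current entries)} where the two differ."""
    old, new = entries_by_file(historical), entries_by_file(current)
    return {
        name: (old.get(name, set()), new.get(name, set()))
        for name in sorted(set(old) | set(new))
        if old.get(name) != new.get(name)
    }
```

Files that appear in only one manifest show up with an empty set on the other side, so the output doubles as a coverage diff. Neither side is authoritative: both lists are speculative until checked against real files.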
Gap Analysis Document
File: /work/repo/notes/2026-05-10-d2-title-menu-gap.md
Key Findings:
1. Current State: Renders old decoder-evidence screen (W/A/R/P + 0GDTEX)
2. Target State: D2 title menu with 5 components (background, snow, logo, menu, copyright)
3. Pipeline Status:
   - ✅ PVM parser implemented (supports multiple formats)
   - ✅ Manifest-driven extraction available
   - ❌ Title-menu PVMs not extracted from disc
   - ❌ Entry names not validated against real files
   - ❌ .CTS animation files not handled
4. Build Gap: No target for title menu validation image
2. Extraction Tool Capabilities Analysis
Supported Formats
Pixel Formats (PF_*):
- 0x01 PF_RGB565: RGB 5-6-5 format
- 0x02 PF_ARGB4444: ARGB 4-4-4-4 format
- 0x03 PF_ARGB1555: ARGB 1-5-5-5 format
- 0x04 PF_YUV422: YUV 4:2:2 format (not implemented)
- 0x05 PF_BUMP: Bump map format (not implemented)
- 0x06 PF_PAL4: 4-bit palettized (not implemented)
- 0x07 PF_PAL8: 8-bit palettized (not implemented)
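For reference, the three implemented 16-bit pixel formats can be unpacked with plain bit shifts. The sketch below is illustrative only: the function names are hypothetical, and widening each channel to 8 bits by bit replication is an assumption, since the tool's actual expansion method is not described in this report.

```python
from __future__ import annotations


def _expand(value: int, bits: int) -> int:
    """Widen a `bits`-wide channel to 8 bits by replicating its high bits (assumed method)."""
    return ((value << (8 - bits)) | (value >> (2 * bits - 8))) & 0xFF


def rgb565_to_rgba(word: int) -> tuple[int, int, int, int]:
    """RGB 5-6-5: red in bits 15-11, green in bits 10-5, blue in bits 4-0; always opaque."""
    return (
        _expand((word >> 11) & 0x1F, 5),
        _expand((word >> 5) & 0x3F, 6),
        _expand(word & 0x1F, 5),
        255,
    )


def argb4444_to_rgba(word: int) -> tuple[int, int, int, int]:
    """ARGB 4-4-4-4: four bits per channel, alpha in the top nibble."""
    return (
        _expand((word >> 8) & 0xF, 4),
        _expand((word >> 4) & 0xF, 4),
        _expand(word & 0xF, 4),
        _expand((word >> 12) & 0xF, 4),
    )


def argb1555_to_rgba(word: int) -> tuple[int, int, int, int]:
    """ARGB 1-5-5-5: one alpha bit in bit 15, then five bits per color channel."""
    return (
        _expand((word >> 10) & 0x1F, 5),
        _expand((word >> 5) & 0x1F, 5),
        _expand(word & 0x1F, 5),
        255 if word & 0x8000 else 0,
    )
```

Each function returns an (R, G, B, A) tuple, which keeps the three formats interchangeable for whatever writes the generated C arrays.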
Data Formats (DF_*):
- 0x01 DF_SQUARE_TWIDDLED: Square twiddled pixel data
- 0x03 DF_VQ: Vector quantization (256-entry codebook)
- 0x10 DF_SMALLVQ: Small VQ variant
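"Twiddled" data is stored in Morton (Z-order) sequence rather than row by row. The sketch below shows the index calculation for a square texture. It assumes the commonly described Dreamcast bit order, with y in the even bits and x in the odd bits; this should be checked against the project's own decoder, which is not reproduced in this report. The function names are hypothetical.

```python
from __future__ import annotations


def _spread_bits(value: int) -> int:
    """Insert a zero bit between each bit of value (0b11 -> 0b0101)."""
    result = 0
    position = 0
    while value:
        result |= (value & 1) << (2 * position)
        value >>= 1
        position += 1
    return result


def twiddled_index(x: int, y: int) -> int:
    """Morton index of pixel (x, y). Assumption: y fills even bits, x fills odd bits."""
    return _spread_bits(y) | (_spread_bits(x) << 1)


def detwiddle(pixels: list, size: int) -> list:
    """Reorder a size x size twiddled pixel list into row-major order."""
    return [pixels[twiddled_index(x, y)] for y in range(size) for x in range(size)]
```

Under this assumption the first four stored pixels are (0,0), (0,1), (1,0), (1,1) in (x, y) terms, so a decoder that swaps the two bit positions produces a transposed image, which is an easy symptom to spot in a preview.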
Decoder Coverage
Implemented Decoders:
1. ✅ _decode_square_twiddled_rgb565() - RGB565 square twiddled
2. ✅ _decode_square_twiddled_argb4444() - ARGB4444 square twiddled
3. ✅ _decode_square_twiddled_argb1555() - ARGB1555 square twiddled
4. ✅ _decode_vq_rgb565() - VQ/SMALLVQ RGB565
5. ✅ decode_pvrt_auto() - Auto-detection dispatcher
Missing Decoders:
1. ❌ YUV422 formats
2. ❌ Bump map formats
3. ❌ Palettized formats (PAL4, PAL8)
4. ❌ .CTS animation sequence parsing
Error Handling Analysis
Current Error Messages:
# Missing PVMH magic
raise ValueError("PVMH header missing")
# Missing PVRT magic
raise ValueError(f"PVRT header missing at 0x{offset:x}")
# Wrong format for decoder
raise ValueError(f"{source_name} is not RGB565 square twiddled")
# Unsupported format
raise ValueError(f"unsupported PVRT format pixel_format={pf_name}...")
# File not found
print(f"SKIP {symbol}: {source_path} not found ({notes})")
# PVM entry not found
print(f"SKIP {symbol}: entry '{entry}' not in {source}. Available: {available}")
# Decode errors
print(f"SKIP {symbol}: decode error: {e}")
Improvement Opportunities:
1. Actionable Guidance: Errors don't explain how to fix the underlying issue
2. Path Suggestions: Messages don't indicate where to place assets
3. Format Documentation: Messages don't list the supported formats
4. Discovery Hint: Messages don't suggest using --discover mode
5. Manifest Validation: No pre-flight validation of manifest schema
Discovery Mode Analysis
Current --discover Output:
- ✅ Lists all PVM/PVR/CTS files recursively
- ✅ Shows PVM entry names and offsets
- ✅ Displays PVRT format info (PF/DF labels, dimensions)
- ✅ Indicates CTS companions
- ❌ No comparison with manifest expectations
- ❌ No indication of missing expected files
- ❌ No format validation warnings
- ❌ Output is text-only, not machine-readable
Improvement Opportunities:
1. Add manifest comparison mode
2. Highlight missing expected files
3. Warn about unsupported formats
4. Add JSON output option for tooling integration
5. Show expected vs. actual asset coverage
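One possible shape for a machine-readable discovery report is sketched below. The record layout (file and pixel_format keys) and the function names are hypothetical, since the tool's internal discovery structures are not described here. The list of undecodable formats is taken from the pixel-format table in this section.

```python
from __future__ import annotations

import json

# Pixel formats listed above as "not implemented".
UNIMPLEMENTED_PIXEL_FORMATS = {"PF_YUV422", "PF_BUMP", "PF_PAL4", "PF_PAL8"}


def build_discovery_report(records: list[dict], expected_files: set[str]) -> dict:
    """Combine discovered files with manifest expectations into one report."""
    found = {record["file"] for record in records}
    warnings = [
        f"{record['file']}: no decoder for {record['pixel_format']}"
        for record in records
        if record.get("pixel_format") in UNIMPLEMENTED_PIXEL_FORMATS
    ]
    return {
        "files": records,
        "missing": sorted(expected_files - found),
        "warnings": warnings,
    }


def render_report(report: dict) -> str:
    """Serialize the report as stable, diff-friendly JSON."""
    return json.dumps(report, indent=2, sort_keys=True)
```

Sorting keys and the missing list keeps the output deterministic, which matters if the report is ever committed or compared between runs.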
3. Manifest Validation Analysis
Current Manifest Schema
Required Fields:
- symbol: C identifier for generated asset
- source_file: Source filename
- source_entry: Entry name (for PVM) or null (for PVR)
- decoder: "auto" (only supported value)
- role: Asset role (background, logo, menu, overlay, particles, copyright)
- description: Human-readable description
Validation Rules:
- ✅ Symbol must start with d2_asset_ prefix
- ✅ Source file must exist
- ✅ PVM files require entry name
- ✅ PVR files require null entry
- ❌ No validation of role values
- ❌ No validation of symbol uniqueness
- ❌ No validation of source file extensions
- ❌ No pre-flight validation without actual extraction
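Two of the unchecked rules, role values and required fields, need no file access at all. A minimal sketch of a per-entry check, using the field and role names listed in the schema above; the function name check_entry and the message wording are hypothetical:

```python
from __future__ import annotations

ALLOWED_ROLES = {"background", "logo", "menu", "overlay", "particles", "copyright"}
REQUIRED_FIELDS = ("symbol", "source_file", "source_entry", "decoder", "role", "description")


def check_entry(item: dict) -> list[str]:
    """Return a list of schema problems for one manifest entry (empty if none)."""
    problems = []

    # source_entry may be null for PVR files, but the key itself must be present.
    for field in REQUIRED_FIELDS:
        if field not in item:
            problems.append(f"missing field '{field}'")

    role = item.get("role")
    if role is not None and role not in ALLOWED_ROLES:
        problems.append(f"unknown role '{role}'")

    decoder = item.get("decoder")
    if decoder is not None and decoder != "auto":
        problems.append(f"unsupported decoder '{decoder}'")

    return problems
```

Returning a list rather than raising lets a pre-flight pass report every problem in a manifest in one run.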
Placeholder Clarity Analysis
Current Placeholder Issues:
1. Speculative Entry Names: All PVM entry names are guesses
2. No Confidence Indicators: No way to mark speculative vs. confirmed
3. No Validation Status: No field to track validation state
4. No Historical Context: No reference to where names came from
Suggested Improvements:
1. Add validation_status field: "speculative", "confirmed", "extracted"
2. Add source field: "historical_manifest", "discovery", "extracted"
3. Add confidence field: 0-100% confidence score
4. Add warnings in generated headers for speculative assets
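The header-warning idea could be as small as emitting one C comment per unconfirmed asset. The sketch below assumes the proposed validation_status field and treats an entry without that field as speculative; both the function name and the comment wording are hypothetical.

```python
from __future__ import annotations


def speculative_warnings(assets: list[dict]) -> list[str]:
    """Build C comment lines flagging assets whose entry names are unconfirmed."""
    lines = []
    for asset in assets:
        # Assumption: a missing status means the entry has never been validated.
        status = asset.get("validation_status", "speculative")
        if status == "speculative":
            lines.append(
                f"/* WARNING: {asset['symbol']} uses unconfirmed entry "
                f"'{asset.get('source_entry')}' */"
            )
    return lines
```

Defaulting to "speculative" means existing manifest entries are flagged until someone explicitly confirms them, which errs on the side of visibility.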
4. Workflow Documentation Gaps
Missing Documentation
Critical Missing Documents:
1. Asset Extraction Workflow: Step-by-step guide from disc to C code
2. Private Asset Setup: Where to place assets, directory structure
3. Discovery Process: How to use --discover to validate entry names
4. Manifest Editing: How to update manifest with confirmed names
5. Troubleshooting Guide: Common issues and solutions
Existing Documentation Review
BUILD.md:
- ✅ Explains build targets
- ❌ No asset extraction section
- ❌ No private asset setup instructions
- ❌ No troubleshooting for missing assets

README.md:
- ✅ Project overview
- ❌ No asset pipeline mention
- ❌ No setup prerequisites
- ❌ No quickstart for asset extraction

docs/ASSET_EXTRACTION_MAP.md:
- ✅ Lists candidate files
- ✅ Explains extraction path priority
- ❌ No workflow steps
- ❌ No tool usage examples
- ❌ No error handling guidance
5. Tooling Enhancement Requirements
High-Priority Enhancements
- --dry-run Mode
  - Validate manifest schema without requiring files
  - Check symbol naming conventions
  - Verify role values
  - Report potential issues before extraction
- Improved Error Messages
  - Explain how to resolve each error type
  - Provide path suggestions for missing assets
  - Reference documentation sections
  - Suggest --discover for validation
- Manifest Comparison in Discovery
  - Show which manifest assets are missing
  - Highlight assets with wrong formats
  - Indicate speculative vs. confirmed entries
  - Provide coverage percentage
- Better Discovery Output
  - Add JSON output option
  - Show expected manifest assets
  - Highlight format compatibility issues
  - Provide actionable next steps
Medium-Priority Enhancements
- Placeholder Management
  - Add placeholder validation status
  - Generate warnings for speculative assets
  - Track confidence levels
  - Provide upgrade path from placeholders to real assets
- Format Documentation
  - Add --list-formats option
  - Explain supported PVRT formats
  - Show format compatibility matrix
  - Provide examples of each format
- Asset Preview
  - Generate PNG previews of extracted assets
  - Show asset dimensions and format
  - Provide visual validation
  - Help with debugging
6. Current Build System Analysis
Build Targets
Current Targets:
- make d2-assets: Runs extraction tool with manifest
- make elf: Builds ELF executable
- make flycast-image: Builds Flycast-compatible ELF
- make verify-flycast: Launches Flycast for validation
Build System Status:
- ✅ KOS toolchain detection works
- ✅ ELF validation (entry point check)
- ✅ Manifest generation
- ❌ No asset validation before build
- ❌ No warning if using placeholder assets
- ❌ No title-menu specific target
Asset Integration
Current Integration:
- Generated d2_menu_assets.c included in build
- Assets accessible via d2_menu_assets.h header
- Asset metadata in asset_metadata.c
- No runtime asset validation
- No missing asset warnings
7. Specific Recommendations
Documentation Improvements
- Create docs/ASSET_EXTRACTION_WORKFLOW.md:
  ```markdown
  - Overview of asset extraction process
  - Prerequisites (Python, private asset location)
  - Private asset setup (directory structure)
  - Discovery phase (--discover usage)
  - Manifest validation and editing
  - Extraction execution
  - Build integration
  - Troubleshooting common issues
  ```
- Update README.md:
  - Add Asset Extraction section
  - List prerequisites clearly
  - Provide quickstart guide
  - Link to detailed workflow
- Update BUILD.md:
  - Add Asset Extraction prerequisites
  - Explain placeholder vs. real assets
  - Document troubleshooting steps
  - Add common error resolutions
- Update docs/ASSET_EXTRACTION_MAP.md:
  - Add workflow steps
  - Include tool usage examples
  - Explain error handling
  - Provide discovery guidance
Tooling Improvements
- Add --dry-run Mode:
  ```python
  from pathlib import Path


  def validate_manifest_only(d2_dir: Path, manifest: dict) -> list[str]:
      """Validate manifest schema and naming without requiring files.

      d2_dir is unused here: a dry run never touches the filesystem.
      """
      warnings = []
      seen_symbols = set()

      for item in manifest.get("assets", []):
          # Use .get() so an entry without a symbol is reported, not a KeyError
          symbol = item.get("symbol", "")

          # Validate symbol naming
          if not symbol.startswith("d2_asset_"):
              warnings.append(f"Symbol '{symbol}' should start with 'd2_asset_'")

          # Check for duplicates
          if symbol in seen_symbols:
              warnings.append(f"Duplicate symbol: '{symbol}'")
          seen_symbols.add(symbol)

          # Validate source file extension.
          # NOTE: the "source" fallback would collide with the proposed
          # provenance field that is also named "source".
          source = item.get("source_file") or item.get("source")
          if not source:
              warnings.append(f"Missing source file for '{symbol}'")
              continue  # the PVM/PVR checks below need a source name
          if not source.upper().endswith((".PVM", ".PVR")):
              warnings.append(f"Invalid source extension for '{symbol}': {source}")

          # Validate PVM/PVR requirements
          if source.upper().endswith(".PVM"):
              if not item.get("source_entry"):
                  warnings.append(f"PVM source requires entry name: '{symbol}'")
          elif source.upper().endswith(".PVR"):
              if item.get("source_entry"):
                  warnings.append(f"PVR source should not have entry name: '{symbol}'")

      return warnings
  ```
- Enhance Error Messages:
  ```python
  # Before
  print(f"SKIP {symbol}: {source_path} not found ({notes})")

  # After
  print(
      f"ERROR: Asset '{symbol}' not found at {source_path}\n"
      f"Expected location: {source_path}\n"
      f"Solution: Place private D2 assets at {d2_dir}/\n"
      f"See docs/ASSET_EXTRACTION_WORKFLOW.md for setup instructions\n"
      f"Use --discover to see available assets"
  )
  ```
- Improve Discovery Output:
  ```python
  from pathlib import Path


  def discover_with_manifest(d2_dir: Path, manifest: dict) -> None:
      """Enhanced discovery that compares with manifest expectations."""
      # Get manifest expectations, keyed by source file
      # (counts below are therefore per source file, not per symbol)
      expected = {item["source_file"]: item for item in manifest.get("assets", [])}

      # Track coverage
      found_assets = set()
      missing_assets = set(expected.keys())

      # Existing discovery logic... (this listing stands in for it)
      all_paths = sorted(p for p in d2_dir.rglob("*") if p.is_file())
      for path in all_paths:
          # Compare as a string: a Path never equals the manifest's str keys.
          # Assumes source_file is written relative to d2_dir.
          rel = path.relative_to(d2_dir).as_posix()
          if rel in expected:
              found_assets.add(rel)
              missing_assets.discard(rel)

      # Existing discovery output...

      # Add summary
      print("\n" + "=" * 60)
      print("MANIFEST COVERAGE SUMMARY")
      print("=" * 60)
      print(f"Expected assets: {len(expected)}")
      print(f"Found assets: {len(found_assets)}")
      print(f"Missing assets: {len(missing_assets)}")
      if missing_assets:
          print("\nMissing assets:")
          for asset in sorted(missing_assets):
              print(f" - {asset}")
      if expected:  # avoid dividing by zero on an empty manifest
          percent = 100 * len(found_assets) // len(expected)
          print(f"\nCoverage: {len(found_assets)}/{len(expected)} ({percent}%)")
  ```
Manifest Improvements
- Add Validation Status Field (JSON manifest entry):

      {
        "symbol": "d2_asset_title_bg",
        "source_file": "Q_TITLEBGMT0.PVM",
        "source_entry": "TITLEBGMT0",
        "validation_status": "speculative",
        "confidence": 50,
        "source": "historical_manifest",
        "role": "background",
        "description": "Snowy mountain background layer 0"
      }

- Add Schema Validation (Python):

      MANIFEST_SCHEMA = {
          "type": "object",
          "properties": {
              "version": {"type": "string"},
              "target": {"type": "string"},
              "description": {"type": "string"},
              "source_dir": {"type": "string"},
              "assets": {
                  "type": "array",
                  "items": {
                      "type": "object",
                      "properties": {
                          "symbol": {"type": "string", "pattern": "^d2_asset_"},
                          "source_file": {"type": "string"},
                          "source_entry": {"anyOf": [{"type": "string"}, {"type": "null"}]},
                          "validation_status": {
                              "type": "string",
                              "enum": ["speculative", "confirmed", "extracted"],
                          },
                          "confidence": {"type": "integer", "minimum": 0, "maximum": 100},
                          "source": {"type": "string"},
                          "role": {
                              "type": "string",
                              "enum": ["background", "logo", "menu", "overlay", "particles", "copyright"],
                          },
                          "description": {"type": "string"},
                      },
                      "required": ["symbol", "source_file", "role"],
                  },
              },
          },
          "required": ["version", "assets"],
      }
8. Implementation Roadmap
Phase 1: Documentation (T2)
- Draft docs/ASSET_EXTRACTION_WORKFLOW.md
- Update README.md with asset requirements
- Update BUILD.md with troubleshooting
- Update docs/ASSET_EXTRACTION_MAP.md with workflow
Phase 2: Tooling Enhancements (T3)
- Implement --dry-run mode
- Enhance error messages with actionable guidance
- Add manifest comparison to discovery
- Improve discovery output formatting
Phase 3: Manifest Improvements (T3)
- Add validation status field
- Add confidence indicators
- Add source tracking
- Update placeholders with historical context
Phase 4: Testing (T5)
- Test --dry-run with current manifest
- Test discovery with manifest comparison
- Verify error messages are clear
- Test existing build still works
9. Risk Assessment
Low Risk
- ✅ Documentation improvements (no code changes)
- ✅ Error message enhancements (user experience only)
- ✅ Discovery mode improvements (read-only operations)
- ✅ Manifest schema validation (data validation only)
Medium Risk
- ⚠️ Manifest structure changes (backward compatibility)
- ⚠️ New command-line flags (need testing)
- ⚠️ Build system integration (need validation)
High Risk (Avoided)
- ❌ No licensed content handling
- ❌ No actual asset extraction
- ❌ No visual target changes
- ❌ No build system modifications
10. Conclusion
The T1 research phase has identified clear, actionable improvements that can be made to the asset extraction workflow without requiring access to private D2 assets. The proposed enhancements focus on:
- Documentation: Creating comprehensive workflow guides
- Tooling: Improving error handling and discovery capabilities
- Manifest: Adding validation status and clarity
- User Experience: Making the workflow fail gracefully and provide clear guidance
These improvements will prepare the project for successful asset extraction when private D2 assets become available, while maintaining the current build functionality with placeholder assets.
11. References
- Historical Manifest: /work/repo/asset-staging/raw/d2-title/extract-manifest.json
- Current Manifest: /work/repo/tools/title_menu_manifest.json
- Extraction Tool: /work/repo/tools/extract_d2_menu_assets.py
- Gap Analysis: /work/repo/notes/2026-05-10-d2-title-menu-gap.md
- Asset Extraction Map: /work/repo/docs/ASSET_EXTRACTION_MAP.md
- Build Documentation: /work/repo/docs/BUILD.md
12. Next Steps
Proceed to T2 (Design) phase to:
1. Create detailed documentation outlines
2. Design tooling enhancements
3. Plan manifest improvements
4. Develop testing strategy