T1 Research Findings: Asset Extraction Readiness

Date: 2026-05-11

Agent: researcher

Task: 20260511-T001 - Asset Extraction Readiness

Executive Summary

This T1 research phase reviews the current asset extraction workflow, traces how the asset manifests have changed, documents what the extraction tool supports today, and identifies the specific gaps that must be closed before title-menu asset extraction is ready.

1. Historical Asset Evidence Analysis

Stage Briefs Review

Finding: No dedicated asset stage brief exists. Asset handling is distributed across:

- SB-003 Render System: Mentions texture loading but no D2-specific formats
- SB-004 Font System: Focuses on font assets, not PVR/PVM textures
- SB-010 Build System: References asset pipeline but lacks D2-specific details

Gap: No comprehensive asset extraction workflow documentation exists.

Historical Manifest Analysis

File: /work/repo/asset-staging/raw/d2-title/extract-manifest.json

Key Findings:

1. Early Speculative Manifest: Contains 8 assets with placeholder entry names
2. Different Symbol Naming: Uses d2_asset_title_snow vs current d2_asset_snow
3. Partial Coverage: Missing menu, copyright, and some background layers
4. Notes Field: Provides useful context about asset roles

Speculative Entry Names Found:

- Q_TITLEBGMT0.PVM:BGMT0 (background)
- Q_DJSNOW.PVM:DJSNOW (snow particles)
- Q_TITLE2D.PVM:TITLE2D (logo)
- Q_TITLEMENU.PVM:TITLEMENU (menu)
- Q_TITLEMENU.PVM:COPYRIGHT (copyright)
- SAKA_MNSNOW1A.PVM:MNSNOW1A (snow variant A)
- SAKA_MNSNOW1B.PVM:MNSNOW1B (snow variant B)
- SAKA_MNSNOW1C.PVM:MNSNOW1C (snow variant C)

Current Manifest Analysis

File: /work/repo/tools/title_menu_manifest.json

Key Findings:

1. Expanded Coverage: 11 assets covering all required roles
2. Consistent Naming: Uses d2_asset_* prefix consistently
3. Role Classification: Explicit role field (background, logo, menu, overlay, particles, copyright)
4. Speculative Entry Names: Still uses placeholder entry names that need validation

Current Speculative Entry Names:

- Q_TITLEBGMT0.PVM:TITLEBGMT0
- Q_TITLEBGMT1.PVM:TITLEBGMT1
- Q_TITLEBGMT2.PVM:TITLEBGMT2
- Q_DMTITLE.PVM:DMTITLE
- Q_TITLEMENU.PVM:TITLEMENU
- Q_TITLE2D.PVM:TITLE2D
- Q_DJSNOW.PVM:DJSNOW
- SAKA_MNSNOW1A.PVM:SAKA_MNSNOW1A
- SAKA_MNSNOW1B.PVM:SAKA_MNSNOW1B
- SAKA_MNSNOW1C.PVM:SAKA_MNSNOW1C
- P_COMTIT.PVR: (no entry, standalone PVR)
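The naming drift between the two manifests can be made explicit with a set comparison. The following is a minimal sketch that hard-codes the source/entry pairs listed above rather than reading either JSON file; the variable and function names are illustrative and not part of the extraction tool.

```python
# Source/entry pairs copied from the two lists above.
# None marks a standalone PVR with no entry name.
HISTORICAL = {
    ("Q_TITLEBGMT0.PVM", "BGMT0"),
    ("Q_DJSNOW.PVM", "DJSNOW"),
    ("Q_TITLE2D.PVM", "TITLE2D"),
    ("Q_TITLEMENU.PVM", "TITLEMENU"),
    ("Q_TITLEMENU.PVM", "COPYRIGHT"),
    ("SAKA_MNSNOW1A.PVM", "MNSNOW1A"),
    ("SAKA_MNSNOW1B.PVM", "MNSNOW1B"),
    ("SAKA_MNSNOW1C.PVM", "MNSNOW1C"),
}
CURRENT = {
    ("Q_TITLEBGMT0.PVM", "TITLEBGMT0"),
    ("Q_TITLEBGMT1.PVM", "TITLEBGMT1"),
    ("Q_TITLEBGMT2.PVM", "TITLEBGMT2"),
    ("Q_DMTITLE.PVM", "DMTITLE"),
    ("Q_TITLEMENU.PVM", "TITLEMENU"),
    ("Q_TITLE2D.PVM", "TITLE2D"),
    ("Q_DJSNOW.PVM", "DJSNOW"),
    ("SAKA_MNSNOW1A.PVM", "SAKA_MNSNOW1A"),
    ("SAKA_MNSNOW1B.PVM", "SAKA_MNSNOW1B"),
    ("SAKA_MNSNOW1C.PVM", "SAKA_MNSNOW1C"),
    ("P_COMTIT.PVR", None),
}


def label(pair):
    """Render a pair in the SOURCE:ENTRY notation used in this report."""
    source, entry = pair
    return f"{source}:{entry}" if entry else f"{source} (standalone PVR)"


def sorted_labels(pairs):
    """Sort by source then entry, treating a missing entry as empty."""
    return [label(p) for p in sorted(pairs, key=lambda p: (p[0], p[1] or ""))]


unchanged = HISTORICAL & CURRENT
only_historical = HISTORICAL - CURRENT
only_current = CURRENT - HISTORICAL

for title, group in (
    ("Unchanged", unchanged),
    ("Only in historical manifest", only_historical),
    ("Only in current manifest", only_current),
):
    print(f"{title} ({len(group)}):")
    for text in sorted_labels(group):
        print(f"  {text}")
```

With the names as listed, three pairs carry over unchanged (DJSNOW, TITLE2D, TITLEMENU), five historical pairs have no exact match, and eight current pairs are new. The background and SAKA snow entries differ only in prefix, which is a reminder that every one of these names is still a guess until checked against real files.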

Gap Analysis Document

File: /work/repo/notes/2026-05-10-d2-title-menu-gap.md

Key Findings:

1. Current State: Renders old decoder-evidence screen (W/A/R/P + 0GDTEX)
2. Target State: D2 title menu with 5 components (background, snow, logo, menu, copyright)
3. Pipeline Status:
   - ✅ PVM parser implemented (supports multiple formats)
   - ✅ Manifest-driven extraction available
   - ❌ Title-menu PVMs not extracted from disc
   - ❌ Entry names not validated against real files
   - ❌ .CTS animation files not handled
4. Build Gap: No target for title menu validation image

2. Extraction Tool Capabilities Analysis

Supported Formats

Pixel Formats (PF_*):

- 0x01 PF_RGB565: RGB 5-6-5 format
- 0x02 PF_ARGB4444: ARGB 4-4-4-4 format
- 0x03 PF_ARGB1555: ARGB 1-5-5-5 format
- 0x04 PF_YUV422: YUV 4:2:2 format (not implemented)
- 0x05 PF_BUMP: Bump map format (not implemented)
- 0x06 PF_PAL4: 4-bit palettized (not implemented)
- 0x07 PF_PAL8: 8-bit palettized (not implemented)

Data Formats (DF_*):

- 0x01 DF_SQUARE_TWIDDLED: Square twiddled pixel data
- 0x03 DF_VQ: Vector quantization (256-entry codebook)
- 0x10 DF_SMALLVQ: Small VQ variant

Decoder Coverage

Implemented Decoders:

1. ✅ _decode_square_twiddled_rgb565() - RGB565 square twiddled
2. ✅ _decode_square_twiddled_argb4444() - ARGB4444 square twiddled
3. ✅ _decode_square_twiddled_argb1555() - ARGB1555 square twiddled
4. ✅ _decode_vq_rgb565() - VQ/SMALLVQ RGB565
5. ✅ decode_pvrt_auto() - Auto-detection dispatcher

Missing Decoders:

1. ❌ YUV422 formats
2. ❌ Bump map formats
3. ❌ Palettized formats (PAL4, PAL8)
4. ❌ .CTS animation sequence parsing
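For readers unfamiliar with the term, "square twiddled" means texels are stored in a bit-interleaved (Morton) order rather than row by row. The sketch below illustrates the idea for RGB565. It is not the tool's `_decode_square_twiddled_rgb565()`, whose source is not reproduced in this report, and it assumes the usual Dreamcast ordering in which the y coordinate occupies the even index bits and x the odd bits.

```python
import struct


def twiddled_index(x: int, y: int) -> int:
    """Interleave coordinate bits: y goes to even positions, x to odd ones."""
    index = 0
    bit = 0
    while (x >> bit) or (y >> bit):
        index |= ((y >> bit) & 1) << (2 * bit)
        index |= ((x >> bit) & 1) << (2 * bit + 1)
        bit += 1
    return index


def rgb565_to_rgb888(value: int) -> tuple[int, int, int]:
    """Expand a 16-bit RGB565 texel to 8 bits per channel."""
    r = (value >> 11) & 0x1F
    g = (value >> 5) & 0x3F
    b = value & 0x1F
    # Replicate the high bits into the low bits so full scale maps to 255.
    return ((r << 3) | (r >> 2), (g << 2) | (g >> 4), (b << 3) | (b >> 2))


def decode_square_twiddled_rgb565(data: bytes, size: int) -> list[list[tuple[int, int, int]]]:
    """Return rows of RGB tuples for a size x size twiddled RGB565 texture."""
    if len(data) < size * size * 2:
        raise ValueError("texture data shorter than size * size * 2 bytes")
    rows = []
    for y in range(size):
        row = []
        for x in range(size):
            (texel,) = struct.unpack_from("<H", data, 2 * twiddled_index(x, y))
            row.append(rgb565_to_rgb888(texel))
        rows.append(row)
    return rows


# 2x2 example: stored order is (0,0), (0,1), (1,0), (1,1) as (x, y).
pixels = decode_square_twiddled_rgb565(
    struct.pack("<4H", 0xF800, 0x07E0, 0x001F, 0xFFFF), 2
)
print(pixels)
```

In the 2x2 example the second stored texel lands at x=0, y=1, directly below the first, which is what distinguishes this layout from plain row-major storage.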

Error Handling Analysis

Current Error Messages:

# Missing PVMH magic
raise ValueError("PVMH header missing")

# Missing PVRT magic  
raise ValueError(f"PVRT header missing at 0x{offset:x}")

# Wrong format for decoder
raise ValueError(f"{source_name} is not RGB565 square twiddled")

# Unsupported format
raise ValueError(f"unsupported PVRT format pixel_format={pf_name}...")

# File not found
print(f"SKIP {symbol}: {source_path} not found ({notes})")

# PVM entry not found
print(f"SKIP {symbol}: entry '{entry}' not in {source}. Available: {available}")

# Decode errors
print(f"SKIP {symbol}: decode error: {e}")

Improvement Opportunities:

1. Actionable Guidance: Errors don't explain how to fix issues
2. Path Suggestions: Don't indicate where to place assets
3. Format Documentation: Don't explain supported formats
4. Discovery Hint: Don't suggest using --discover mode
5. Manifest Validation: No pre-flight validation of manifest schema

Discovery Mode Analysis

Current --discover Output:

- ✅ Lists all PVM/PVR/CTS files recursively
- ✅ Shows PVM entry names and offsets
- ✅ Displays PVRT format info (PF/DF labels, dimensions)
- ✅ Indicates CTS companions
- ❌ No comparison with manifest expectations
- ❌ No indication of missing expected files
- ❌ No format validation warnings
- ❌ Output is text-only, not machine-readable

Improvement Opportunities:

1. Add manifest comparison mode
2. Highlight missing expected files
3. Warn about unsupported formats
4. Add JSON output option for tooling integration
5. Show expected vs. actual asset coverage
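One possible shape for the JSON output option is sketched below. The record types and field names (`DiscoveredFile`, `DiscoveredEntry`, `discovery_to_json`) are hypothetical; they mirror what the text output is described as showing (entry names, offsets, PF/DF labels, dimensions) and are not taken from the tool. The sample values are placeholders, not results from a real disc.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class DiscoveredEntry:
    name: str
    offset: int
    pixel_format: str
    data_format: str
    width: int
    height: int


@dataclass
class DiscoveredFile:
    path: str
    kind: str  # "PVM", "PVR", or "CTS"
    entries: list[DiscoveredEntry] = field(default_factory=list)


def discovery_to_json(files: list[DiscoveredFile]) -> str:
    """Serialize discovery results; asdict() recurses into nested entries."""
    payload = {"files": [asdict(f) for f in files]}
    return json.dumps(payload, indent=2, sort_keys=True)


# Placeholder record for illustration only.
example = [
    DiscoveredFile(
        path="Q_TITLE2D.PVM",
        kind="PVM",
        entries=[
            DiscoveredEntry("TITLE2D", 0x20, "PF_ARGB4444", "DF_SQUARE_TWIDDLED", 256, 256)
        ],
    )
]
print(discovery_to_json(example))
```

Sorting keys keeps the output stable between runs, which makes it easier to diff two discovery passes or feed the result to a later manifest comparison step.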

3. Manifest Validation Analysis

Current Manifest Schema

Required Fields:

- symbol: C identifier for generated asset
- source_file: Source filename
- source_entry: Entry name (for PVM) or null (for PVR)
- decoder: "auto" (only supported value)
- role: Asset role (background, logo, menu, overlay, particles, copyright)
- description: Human-readable description

Validation Rules:

- ✅ Symbol must start with d2_asset_ prefix
- ✅ Source file must exist
- ✅ PVM files require entry name
- ✅ PVR files require null entry
- ❌ No validation of role values
- ❌ No validation of symbol uniqueness
- ❌ No validation of source file extensions
- ❌ No pre-flight validation without actual extraction

Placeholder Clarity Analysis

Current Placeholder Issues:

1. Speculative Entry Names: All PVM entry names are guesses
2. No Confidence Indicators: No way to mark speculative vs. confirmed
3. No Validation Status: No field to track validation state
4. No Historical Context: No reference to where names came from

Suggested Improvements:

1. Add validation_status field: "speculative", "confirmed", "extracted"
2. Add source field: "historical_manifest", "discovery", "extracted"
3. Add confidence field: 0-100% confidence score
4. Add warnings in generated headers for speculative assets
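The last suggestion, warnings in generated headers, could look roughly like the sketch below. `header_warning_lines` is a hypothetical helper, and treating an asset with no `validation_status` as speculative is an assumption made here so that today's manifest, which has no such field, would be flagged in full.

```python
def header_warning_lines(asset: dict) -> list[str]:
    """Return C comment lines to emit above a speculative asset, else []."""
    # Assumption: a missing status means the entry name was never confirmed.
    status = asset.get("validation_status", "speculative")
    if status != "speculative":
        return []

    symbol = asset["symbol"]
    entry = asset.get("source_entry")
    source = asset["source_file"] + (f":{entry}" if entry else "")
    confidence = asset.get("confidence")
    detail = f" (confidence {confidence}%)" if confidence is not None else ""
    return [
        f"/* WARNING: {symbol} uses speculative source {source}{detail}. */",
        "/* Confirm the entry name with --discover before relying on this asset. */",
    ]


example_asset = {
    "symbol": "d2_asset_title_bg",
    "source_file": "Q_TITLEBGMT0.PVM",
    "source_entry": "TITLEBGMT0",
    "validation_status": "speculative",
    "confidence": 50,
}
for line in header_warning_lines(example_asset):
    print(line)
```

Emitting plain comments rather than a preprocessor `#warning` keeps the placeholder build quiet; a stricter variant could switch once confirmed names are expected.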

4. Workflow Documentation Gaps

Missing Documentation

Critical Missing Documents:

1. Asset Extraction Workflow: Step-by-step guide from disc to C code
2. Private Asset Setup: Where to place assets, directory structure
3. Discovery Process: How to use --discover to validate entry names
4. Manifest Editing: How to update manifest with confirmed names
5. Troubleshooting Guide: Common issues and solutions

Existing Documentation Review

BUILD.md:

- ✅ Explains build targets
- ❌ No asset extraction section
- ❌ No private asset setup instructions
- ❌ No troubleshooting for missing assets

README.md:

- ✅ Project overview
- ❌ No asset pipeline mention
- ❌ No setup prerequisites
- ❌ No quickstart for asset extraction

docs/ASSET_EXTRACTION_MAP.md:

- ✅ Lists candidate files
- ✅ Explains extraction path priority
- ❌ No workflow steps
- ❌ No tool usage examples
- ❌ No error handling guidance

5. Tooling Enhancement Requirements

High-Priority Enhancements

1. --dry-run Mode
   - Validate manifest schema without requiring files
   - Check symbol naming conventions
   - Verify role values
   - Report potential issues before extraction
2. Improved Error Messages
   - Explain how to resolve each error type
   - Provide path suggestions for missing assets
   - Reference documentation sections
   - Suggest --discover for validation
3. Manifest Comparison in Discovery
   - Show which manifest assets are missing
   - Highlight assets with wrong formats
   - Indicate speculative vs. confirmed entries
   - Provide coverage percentage
4. Better Discovery Output
   - Add JSON output option
   - Show expected manifest assets
   - Highlight format compatibility issues
   - Provide actionable next steps

Medium-Priority Enhancements

1. Placeholder Management
   - Add placeholder validation status
   - Generate warnings for speculative assets
   - Track confidence levels
   - Provide upgrade path from placeholders to real assets
2. Format Documentation
   - Add --list-formats option
   - Explain supported PVRT formats
   - Show format compatibility matrix
   - Provide examples of each format
3. Asset Preview
   - Generate PNG previews of extracted assets
   - Show asset dimensions and format
   - Provide visual validation
   - Help with debugging
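As an illustration of the proposed --list-formats option, the sketch below prints a compatibility matrix built from the format tables and decoder list in section 2. The `SUPPORTED` set is derived from the implemented decoders named there (three square twiddled decoders plus VQ/SMALLVQ for RGB565); `format_matrix` and `main` are hypothetical names, and the flag does not exist in the tool today.

```python
import argparse

PIXEL_FORMATS = {
    0x01: "PF_RGB565", 0x02: "PF_ARGB4444", 0x03: "PF_ARGB1555",
    0x04: "PF_YUV422", 0x05: "PF_BUMP", 0x06: "PF_PAL4", 0x07: "PF_PAL8",
}
DATA_FORMATS = {0x01: "DF_SQUARE_TWIDDLED", 0x03: "DF_VQ", 0x10: "DF_SMALLVQ"}

# (pixel format, data format) pairs covered by the implemented decoders.
SUPPORTED = {(0x01, 0x01), (0x02, 0x01), (0x03, 0x01), (0x01, 0x03), (0x01, 0x10)}


def format_matrix() -> str:
    """Build a fixed-width table: one row per pixel format, one column per data format."""
    header = f"{'pixel format':<16}" + "".join(f"{name:<20}" for name in DATA_FORMATS.values())
    lines = [header]
    for pf_code, pf_name in PIXEL_FORMATS.items():
        cells = "".join(
            f"{('yes' if (pf_code, df_code) in SUPPORTED else 'no'):<20}"
            for df_code in DATA_FORMATS
        )
        lines.append(f"{pf_name:<16}{cells}")
    return "\n".join(lines)


def main(argv=None) -> int:
    """Parse arguments and print the matrix when --list-formats is given."""
    parser = argparse.ArgumentParser(description="Sketch of a --list-formats option")
    parser.add_argument(
        "--list-formats",
        action="store_true",
        help="print which PVRT pixel/data format pairs can be decoded",
    )
    args = parser.parse_args(argv)
    if args.list_formats:
        print(format_matrix())
    return 0
```

Calling `main(["--list-formats"])` prints the table and returns 0. Keeping the supported pairs in one set means the same data could also drive the unsupported-format warnings proposed for discovery.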

6. Current Build System Analysis

Build Targets

Current Targets:

- make d2-assets: Runs extraction tool with manifest
- make elf: Builds ELF executable
- make flycast-image: Builds Flycast-compatible ELF
- make verify-flycast: Launches Flycast for validation

Build System Status:

- ✅ KOS toolchain detection works
- ✅ ELF validation (entry point check)
- ✅ Manifest generation
- ❌ No asset validation before build
- ❌ No warning if using placeholder assets
- ❌ No title-menu specific target
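The missing placeholder warning could be covered by a small pre-build check that a make target runs before compiling. The sketch below is one way to do it, assuming the manifest carries the `validation_status` field proposed in section 3; `placeholder_summary`, `prebuild_check`, and the `strict` switch are hypothetical and not part of the current build.

```python
from collections import Counter


def placeholder_summary(manifest: dict) -> Counter:
    """Count assets per validation status; a missing status counts as speculative."""
    return Counter(
        asset.get("validation_status", "speculative")
        for asset in manifest.get("assets", [])
    )


def prebuild_check(manifest: dict, strict: bool = False) -> int:
    """Warn about speculative assets; return a non-zero exit code only in strict mode."""
    counts = placeholder_summary(manifest)
    speculative = counts.get("speculative", 0)
    total = sum(counts.values())
    if speculative:
        print(f"WARNING: {speculative}/{total} assets are still speculative placeholders")
    return 1 if (strict and speculative) else 0


# Illustrative manifest fragment; the statuses are made up for the example.
demo_manifest = {
    "assets": [
        {"symbol": "d2_asset_logo", "validation_status": "confirmed"},
        {"symbol": "d2_asset_snow"},
    ]
}
print(prebuild_check(demo_manifest))
```

By default the check only warns, so the existing placeholder build keeps working; strict mode is meant for the point where confirmed names are expected.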

Asset Integration

Current Integration:

- Generated d2_menu_assets.c included in build
- Assets accessible via d2_menu_assets.h header
- Asset metadata in asset_metadata.c
- No runtime asset validation
- No missing asset warnings

7. Specific Recommendations

Documentation Improvements

1. Create docs/ASSET_EXTRACTION_WORKFLOW.md with the following outline:
   - Overview of asset extraction process
   - Prerequisites (Python, private asset location)
   - Private asset setup (directory structure)
   - Discovery phase (--discover usage)
   - Manifest validation and editing
   - Extraction execution
   - Build integration
   - Troubleshooting common issues
2. Update README.md:
   - Add Asset Extraction section
   - List prerequisites clearly
   - Provide quickstart guide
   - Link to detailed workflow
3. Update BUILD.md:
   - Add Asset Extraction prerequisites
   - Explain placeholder vs. real assets
   - Document troubleshooting steps
   - Add common error resolutions
4. Update docs/ASSET_EXTRACTION_MAP.md:
   - Add workflow steps
   - Include tool usage examples
   - Explain error handling
   - Provide discovery guidance

Tooling Improvements

Add --dry-run Mode:

```python
from pathlib import Path


def validate_manifest_only(d2_dir: Path, manifest: dict) -> list[str]:
    """Validate manifest schema and naming without requiring files.

    d2_dir is not read here, so the check runs without private assets.
    """
    warnings: list[str] = []
    seen_symbols: set[str] = set()

    for item in manifest.get("assets", []):
        # A missing symbol is reported by the prefix check below
        symbol = item.get("symbol", "<missing symbol>")

        # Validate symbol naming
        if not symbol.startswith("d2_asset_"):
            warnings.append(f"Symbol '{symbol}' should start with 'd2_asset_'")

        # Check for duplicates
        if symbol in seen_symbols:
            warnings.append(f"Duplicate symbol: '{symbol}'")
        seen_symbols.add(symbol)

        # Validate source file extension
        source = item.get("source_file") or item.get("source")
        if not source:
            warnings.append(f"Missing source file for '{symbol}'")
            continue  # the PVM/PVR checks below need a source name
        if not source.upper().endswith((".PVM", ".PVR")):
            warnings.append(f"Invalid source extension for '{symbol}': {source}")

        # Validate PVM/PVR requirements
        if source.upper().endswith(".PVM"):
            if not item.get("source_entry"):
                warnings.append(f"PVM source requires entry name: '{symbol}'")
        elif source.upper().endswith(".PVR"):
            if item.get("source_entry"):
                warnings.append(f"PVR source should not have entry name: '{symbol}'")

    return warnings
```

Enhance Error Messages:

```python
# Before
print(f"SKIP {symbol}: {source_path} not found ({notes})")

# After
print(
    f"ERROR: Asset '{symbol}' not found at {source_path}\n"
    f"Expected location: {source_path}\n"
    f"Solution: Place private D2 assets at {d2_dir}/\n"
    f"See docs/ASSET_EXTRACTION_WORKFLOW.md for setup instructions\n"
    f"Use --discover to see available assets"
)
```

Improve Discovery Output:

```python
from pathlib import Path


def discover_with_manifest(d2_dir: Path, manifest: dict) -> None:
    """Enhanced discovery that compares with manifest expectations."""
    # Get manifest expectations, keyed by source file name
    expected = {item["source_file"]: item for item in manifest.get("assets", [])}

    # Track coverage
    found_assets = set()
    missing_assets = set(expected.keys())

    # Stand-in for the existing discovery walk: every file under d2_dir
    all_paths = sorted(p for p in d2_dir.rglob("*") if p.is_file())
    for path in all_paths:
        # Compare as a string; a Path object never equals a str key
        rel = path.relative_to(d2_dir).as_posix()
        if rel in expected:
            found_assets.add(rel)
            missing_assets.discard(rel)

        # Existing discovery output...

    # Add summary
    print("\n" + "=" * 60)
    print("MANIFEST COVERAGE SUMMARY")
    print("=" * 60)
    print(f"Expected assets: {len(expected)}")
    print(f"Found assets: {len(found_assets)}")
    print(f"Missing assets: {len(missing_assets)}")
    if missing_assets:
        print("\nMissing assets:")
        for asset in sorted(missing_assets):
            print(f"  - {asset}")
    if expected:  # avoid dividing by zero on an empty manifest
        percent = 100 * len(found_assets) // len(expected)
        print(f"\nCoverage: {len(found_assets)}/{len(expected)} ({percent}%)")
```

Manifest Improvements

Add Validation Status Field:

```json
{
  "symbol": "d2_asset_title_bg",
  "source_file": "Q_TITLEBGMT0.PVM",
  "source_entry": "TITLEBGMT0",
  "validation_status": "speculative",
  "confidence": 50,
  "source": "historical_manifest",
  "role": "background",
  "description": "Snowy mountain background layer 0"
}
```

Add Schema Validation:

```python
MANIFEST_SCHEMA = {
    "type": "object",
    "properties": {
        "version": {"type": "string"},
        "target": {"type": "string"},
        "description": {"type": "string"},
        "source_dir": {"type": "string"},
        "assets": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "pattern": "^d2_asset_"},
                    "source_file": {"type": "string"},
                    "source_entry": {"anyOf": [{"type": "string"}, {"type": "null"}]},
                    "validation_status": {
                        "type": "string",
                        "enum": ["speculative", "confirmed", "extracted"],
                    },
                    "confidence": {"type": "integer", "minimum": 0, "maximum": 100},
                    "source": {"type": "string"},
                    "role": {
                        "type": "string",
                        "enum": ["background", "logo", "menu", "overlay", "particles", "copyright"],
                    },
                    "description": {"type": "string"},
                },
                "required": ["symbol", "source_file", "role"],
            },
        },
    },
    "required": ["version", "assets"],
}
```

8. Implementation Roadmap

Phase 1: Documentation (T2)

Draft docs/ASSET_EXTRACTION_WORKFLOW.md
Update README.md with asset requirements
Update BUILD.md with troubleshooting
Update docs/ASSET_EXTRACTION_MAP.md with workflow

Phase 2: Tooling Enhancements (T3)

Implement --dry-run mode
Enhance error messages with actionable guidance
Add manifest comparison to discovery
Improve discovery output formatting

Phase 3: Manifest Improvements (T3)

Add validation status field
Add confidence indicators
Add source tracking
Update placeholders with historical context

Phase 4: Testing (T5)

Test --dry-run with current manifest
Test discovery with manifest comparison
Verify error messages are clear
Test existing build still works

9. Risk Assessment

Low Risk

✅ Documentation improvements (no code changes)
✅ Error message enhancements (user experience only)
✅ Discovery mode improvements (read-only operations)
✅ Manifest schema validation (data validation only)

Medium Risk

⚠️ Manifest structure changes (backward compatibility)
⚠️ New command-line flags (need testing)
⚠️ Build system integration (need validation)

High Risk (Avoided)

❌ No licensed content handling
❌ No actual asset extraction
❌ No visual target changes
❌ No build system modifications

10. Conclusion

The T1 research phase has identified clear, actionable improvements that can be made to the asset extraction workflow without requiring access to private D2 assets. The proposed enhancements focus on:

Documentation: Creating comprehensive workflow guides
Tooling: Improving error handling and discovery capabilities
Manifest: Adding validation status and clarity
User Experience: Making the workflow fail gracefully and provide clear guidance

These improvements will prepare the project for successful asset extraction when private D2 assets become available, while maintaining the current build functionality with placeholder assets.

11. References

Historical Manifest: /work/repo/asset-staging/raw/d2-title/extract-manifest.json
Current Manifest: /work/repo/tools/title_menu_manifest.json
Extraction Tool: /work/repo/tools/extract_d2_menu_assets.py
Gap Analysis: /work/repo/notes/2026-05-10-d2-title-menu-gap.md
Asset Extraction Map: /work/repo/docs/ASSET_EXTRACTION_MAP.md
Build Documentation: /work/repo/docs/BUILD.md

12. Next Steps

Proceed to T2 (Design) phase to:

1. Create detailed documentation outlines
2. Design tooling enhancements
3. Plan manifest improvements
4. Develop testing strategy