Chapter 15 — DataRecovery Pro
Based on: SIH1749 — Recovery of Deleted Data from XFS and Btrfs Filesystems (NCIIPC)
Skills Required:
- Digital forensics
- Filesystem internal structure (XFS, Btrfs)
- Linux system internals
- C and Python programming
- Data recovery techniques
- Low-level disk analysis and metadata parsing
Project Description
DataRecovery Pro is a forensic recovery tool designed to recover deleted and corrupted data specifically from modern Linux filesystems like XFS and Btrfs. It parses filesystem metadata structures such as B-trees, extent maps, and journals to reconstruct deleted files. The tool aims to support rapid and thorough recovery of forensic artifacts and user data for legal and incident response purposes.
Tech Stack & Tools
- C for low-level filesystem metadata parsing
- Python for higher-level analysis and UI scripting
- Libbtrfs/libxfs (if applicable) for interfacing with filesystems
- Linux debugfs and ext tools
- SQLite or NoSQL database for recovered file cataloging
- CLI and optional simple GUI interface (Qt or Tkinter)
Week-wise Roadmap (9 Weeks)
Week 1 — Filesystem Study & Environment Setup
- Setup Linux forensic lab with disk images and test targets
- Study XFS and Btrfs on-disk structures and metadata layouts
- Setup dev environment with necessary libraries and tools
- Deliverable: Documented specs and environment ready
Week 2 — Metadata Parser for XFS
- Implement C code to parse XFS allocation groups, B-trees, and extent maps
- Extract inode and directory information
- Deliverable: Functional XFS metadata parser CLI tool
Week 3 — Metadata Parser for Btrfs
- Build Btrfs metadata parsing modules (tree roots, extent references)
- Support snapshots and subvolumes enumeration
- Deliverable: Btrfs metadata parser with CLI interface
Week 4 — Journal and Log Parsing
- Parse journaling data structures to recover recent file operations
- Correlate journal entries with metadata for enhanced reconstruction
- Deliverable: Extensible journaling parser module
Week 5 — File Carving & Recovery Logic
- Implement algorithms for carving deleted files from free space extents
- Rebuild partial file data using extent maps and snapshots
- Deliverable: File recovery engine supporting partial and complete files
Week 6 — Database and Cataloging
- Build recovery catalog to store recovered file metadata
- Enable tagging, filtering, and searching of recovered files
- Deliverable: Catalog database with queryable interface
Week 7 — User Interface Design
- Develop CLI and basic GUI for operating the recovery toolkit
- Support export of recovered files and reports
- Deliverable: Usable interface for forensic analysts
Week 8 — Testing & Validation
- Test recovery on sample forensic images with known deleted data
- Measure recovery success rates and accuracy
- Deliverable: Testing report and bug fixes
Week 9 — Final Documentation & Packaging
- Write detailed developer and user documentation
- Package complete toolkit with build scripts and installation guide
- Prepare demonstration walkthrough video
- Deliverable: Release-ready DataRecovery Pro toolkit
Testing and Deliverables
- Controlled tests with pre-deleted files on XFS/Btrfs images
- Validation of recovery completeness and metadata accuracy
- Final deliverables include source code repo, test datasets, user manuals, and demo videos