Initialization vectors: Normalizing iTunes Backups - Squeeze more data out of them, possibly...

Saturday, May 2, 2020

Normalizing iTunes Backups - Squeeze more data out of them, possibly...

Even though iOS full file system extractions are fairly commonplace the need to parse iTunes backups has not subsided. In many situations the only piece of evidence available could be an iTunes backup located in an external drive or one acquired from an iCloud data set. Most forensic tools are designed to parse these backups for relevant artifacts. What would happen if we took these backups and fed it to our tools in the same file structure they would be in if the backup had been restored to the iOS device? In other words what would our tools do if the backup file path structure was normalize to the file path structure found within the iOS that created them?

Video version


iTunes Backups - A super, ultra, mega short primer.

For those not familiar with iTunes backups I highly recommend reading Jack Farley's blog post on the topic. It can be found here:
http://farleyforensics.com/2019/04/14/forensic-analysis-of-itunes-backups/
An iTunes backup takes files from the device, renames them, and puts them in different folders where the key to recreate the almost original file structure is to look at the contents of the manifest.db database that is also created at backup time. I say almost because instead of keeping track of the whole path as it would have been on the device it abstract part of it under a series of category names called Domains. I think a comparison of paths would be a better way of understanding such transformations.

Filename: DataUsage.sqlite

Path as created by the backup:
0d/0d609c54856a9bb2d56729df1d68f2958a88426b/
Path as tracked in manifest.db:
WirelessDomain/Library/Databases/DataUsage.sqlite
Path within the device:
private/var/wireless/Library/Databases/DataUsage.sqlite
Notice how the farther away we are from the path as found on the device the more abstract it becomes. I bet you are wondering how could I tell that the WirelessDomain stand for /var/wireless within the file system. The domain to path is contained within the device in the following location:
System/Backup/Locations/Domain.plist.
The plist is not part of the backup and can only be accessed from an iOS full file system extraction. In my testing the domains seem to be the same across all iOS devices. Again, highly recommend reading the previous link to understand the how and the why of these transformations.

If you are interested in recreating an iTunes backup as defined in the manifest.db, using the domains, check out Jack Farley's python script here:
https://github.com/jfarley248/iTunes_Backup_Reader
The problem

I can't count how many times I have parsed an iTunes backup to realize the following:
- Not all relevant items are parsed.
- For items that are parsed the file path provided is the one as created by the backup. This tells me nothing in regards to where it was the file originally located on the phone. I don't like having to query the manifest.db data store by hand one bit thank you very much.

What if we normalized the backups to the full path as found within the device and then have our tools parse it? Would it possibly parse more if it believed it was a full iOS file system extraction?

Edward Greybeard (great pseudonym) made a script for easy normalization of these backups in Python. You can get it here:
https://github.com/edward-greybeard/iOS-UNF
For my testing I parsed an iTunes backup two times.
First run as a file system using domains.


Second run with fully normalized iOS paths.

The tool I used was my own, iLEAPP. I ran a limited set of artifacts for this example. The artifacts searched for where the same in both runs. You can find it here:
https://github.com/abrignoni/iLEAPP
Results

The first run provided 7 artifacts.


 The second run provided 9 artifacts.


Caveats

It is one thing to use a tool that has not been explicitly designed to parse iTunes backups and have it show this type of results. Would the same happen with a commercial tool? The answer, as everything in forensics, is it depends. I have done this type of double runs with commercial tools and found that they either parsed more artifacts, some times less, and sometimes both less and more. If the last one doesn't make sense think of a report that parsed less artifacts overall but included a few that weren't in the first run.

Takeaways

Is this a type of analysis you should do on every iTunes backup in every case? No. Commercially available digital forensic tools do a great job as is. But if the case hangs on that one iTunes backup it doesn't hurt to try different approaches.

In regards to vendors my hope is that when parsing iTunes backups they can add the fully normalized path for the artifact as part of the metadata panes in their tool user interface. Normalized paths for all backup files. The analytical benefit you can get from knowing where on the device was a file originally located can be incalculable.

As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.