For the last couple of days I've been working on a parser for Discord JSON chat files using iLEAPP. If you are not familiar with iLEAPP, it is a Python 3 framework designed to parse useful forensic artifacts from iOS devices. More on iLEAPP here. I wanted to validate some findings on a case I am working with the amazing @i_am_the_gia, and as part of the process I ran the newly created parser on @Josh_Hickman1's excellent iOS testing images. You can get his testing images here.
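For readers who want a feel for what a parser like this does, here is a minimal sketch. It is not the iLEAPP code. It assumes the cached chat file holds a JSON list of message objects shaped like Discord's public API messages (timestamp, author, content, attachments); the sample data and the extract_messages name are made up for illustration.

```python
import json

# Hypothetical sample shaped like a Discord API message list; all values are invented.
SAMPLE = '''[
  {"id": "1",
   "timestamp": "2020-06-01T12:00:00+00:00",
   "author": {"id": "42", "username": "example_user"},
   "content": "hello",
   "attachments": [{"url": "https://cdn.example.invalid/file.png"}]}
]'''


def extract_messages(raw_json):
    """Return one dict per message with timestamp, author, content, and attachment URLs."""
    rows = []
    for msg in json.loads(raw_json):
        author = msg.get("author") or {}
        rows.append({
            "timestamp": msg.get("timestamp", ""),
            "author_id": author.get("id", ""),
            "username": author.get("username", ""),
            "content": msg.get("content", ""),
            # Assumption: each attachment object carries its link under "url".
            "attachments": [a.get("url", "") for a in msg.get("attachments", [])],
        })
    return rows


for row in extract_messages(SAMPLE):
    print(row["timestamp"], row["username"], row["content"], row["attachments"])
```

Using .get() with defaults keeps the sketch from failing on messages that lack a field, which is common in cached data.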
Here is iLEAPP's HTML report for the chat:
Here is the output for the Discord user's email and user ID:
In that same moment I watched the most amazing trailer for The Mandalorian Season #2 thanks to @KevinPagano3. As you all should know by now, the Child just steals every scene with just how cute it is.
Going back to my report, I copied one of the URLs in the attachment column and pasted it into an internet-connected browser to see if it would come up. In the past (2017) I did some testing on Discord for Android and found that the links in chats could be copied and pasted into a browser and be accessed from anywhere by anyone.
With Josh's image I confirmed that was still the case. And what did the URL image in the chat contain?
Coincidence? I think not. :-D
As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.
Search for user_id_cache and email_cache. It's only the user ID, and not the username. Search the messages in the cache.db (iOS) or 50.json (Windows) to match up the user ID with the username.
Thank you so much TheKateCain. Super useful information!
Greetings! Below is a list of assignments from recent classes.
Reminder: Assignments listed below indicate what to complete before class; make sure that you are signed in to Discord in order to access the practice files.
🐍
Class 10 on 06/25/2020
No homework / study hall
Class 11 on 06/30/2020
No homework / study hall
Class 12 on 07/02/2020
Conduct online research of argparse and make a script that takes two arguments and prints them to screen
Interested in learning Python? Here's the syllabus from our DFIR Python Study Group course. Follow along by getting the book, doing the homework, and watching the YouTube videos.
Note: Assignments listed below indicate what to complete before class; make sure that you are signed in to Discord in order to access the exercises via the links
Textbook: Head First Python: A Brain-Friendly Guide, 2nd edition
🐍
Class 0 on 05/21/2020
Ch. 1 pp. 1-19 until “What We Already Know”
Class 1 on 05/26/2020
Ch. 1 pp. 20-46 until “Chapter 1’s Code”
Class 2 on 05/28/2020
Ch. 2 pp. 47-55 until “Creating Lists Literally”
Class 3 on 06/02/2020
Ch. 2 pp. 56-94 until “Chapter 2’s Code, 2 of 2” / blank page
Calls a function that selects timestamp, partner_jid, and body from the messagesTable in the Kik database; prints them to screen; and calls a function to generate a text file report
Calls a function that extracts the timestamp, author, and content from the Discord JSON file; prints them to screen; and calls a function to generate a text file report
Class 10 on 06/25/2020
No homework / study hall
Class 11 on 06/30/2020
No homework / study hall
Class 12 on 07/02/2020
Research information online about argparse
Research information online about dunders for name and main
Make a script that takes two arguments and prints them to screen
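For anyone following along with the Class 12 assignment, here is one possible shape for it, offered as a sketch and not as the course's answer key. The argument names first and second, and their default values, are my own choices for illustration. The nargs="?" defaults only exist so the script also runs with no arguments.

```python
import argparse


def build_parser():
    """Create a parser that accepts two positional arguments."""
    parser = argparse.ArgumentParser(description="Print two arguments to screen.")
    # nargs="?" plus a default makes each argument optional for this demo;
    # remove both keywords to make the arguments required.
    parser.add_argument("first", nargs="?", default="hello")
    parser.add_argument("second", nargs="?", default="world")
    return parser


def main(argv=None):
    """Parse argv (or the real command line when argv is None) and print both values."""
    args = build_parser().parse_args(argv)
    print(args.first)
    print(args.second)
    return args


# The __name__ / __main__ dunder check: this only runs when the file is
# executed as a script, not when it is imported as a module.
if __name__ == "__main__":
    main()
```

Saved as, say, two_args.py (a hypothetical file name), it would be run as python two_args.py alpha beta.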
Even though iOS full file system extractions are fairly commonplace, the need to parse iTunes backups has not subsided. In many situations the only piece of evidence available could be an iTunes backup located on an external drive or one acquired from an iCloud data set. Most forensic tools are designed to parse these backups for relevant artifacts. What would happen if we took these backups and fed them to our tools in the same file structure they would have if the backup had been restored to the iOS device? In other words, what would our tools do if the backup file path structure was normalized to the file path structure found within the iOS device that created it?
Video version
iTunes Backups - A super, ultra, mega short primer.
For those not familiar with iTunes backups I highly recommend reading Jack Farley's blog post on the topic. It can be found here:
An iTunes backup takes files from the device, renames them, and puts them in different folders. The key to recreating the almost original file structure is the manifest.db database that is also created at backup time. I say almost because instead of keeping track of the whole path as it would have been on the device, it abstracts part of it under a series of category names called Domains. I think a comparison of paths is a better way of understanding these transformations.
Notice how the farther away we are from the path as found on the device, the more abstract it becomes. I bet you are wondering how I could tell that the WirelessDomain stands for /var/wireless within the file system. The domain-to-path mapping is contained within the device in the following location:
System/Backup/Locations/Domain.plist.
The plist is not part of the backup and can only be accessed from an iOS full file system extraction. In my testing the domains seem to be the same across all iOS devices. Again, highly recommend reading the previous link to understand the how and the why of these transformations.
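To make the domain idea concrete, here is a rough sketch of how the manifest.db rows could be turned back into device paths. It is not Jack Farley's or Edward Greybeard's script. It assumes the commonly documented layout: a Files table with fileID, domain, relativePath, and flags columns, flags = 1 for regular files, and each file stored in the backup under the first two characters of its fileID. Only the WirelessDomain mapping comes from this post; any other domain has to be added from Domain.plist. The demo values are invented.

```python
import os
import sqlite3
import tempfile

# Only WirelessDomain -> /var/wireless is taken from the post.
# Add further domains from System/Backup/Locations/Domain.plist as needed.
DOMAIN_ROOTS = {"WirelessDomain": "/var/wireless"}


def normalized_paths(manifest_path, domain_roots=DOMAIN_ROOTS):
    """Yield (path_inside_backup, path_on_device) for files in mapped domains."""
    conn = sqlite3.connect(manifest_path)
    try:
        # Assumed schema: Files(fileID, domain, relativePath, flags, file).
        rows = conn.execute(
            "SELECT fileID, domain, relativePath FROM Files WHERE flags = 1"
        ).fetchall()
    finally:
        conn.close()
    for file_id, domain, relative_path in rows:
        root = domain_roots.get(domain)
        if root is None:
            continue  # unmapped domain: skip it instead of guessing a path
        stored_as = f"{file_id[:2]}/{file_id}"
        yield stored_as, f"{root}/{relative_path}"


def demo():
    """Build a tiny stand-in manifest.db with invented values and normalize it."""
    with tempfile.TemporaryDirectory() as tmp:
        db_path = os.path.join(tmp, "Manifest.db")
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE Files (fileID TEXT, domain TEXT, "
            "relativePath TEXT, flags INTEGER, file BLOB)"
        )
        conn.execute(
            "INSERT INTO Files VALUES (?, ?, ?, ?, ?)",
            ("ab12cd", "WirelessDomain", "Library/example.db", 1, None),
        )
        conn.commit()
        conn.close()
        return list(normalized_paths(db_path))


print(demo())
```

Skipping unmapped domains is a deliberate choice: a wrong normalized path is worse than a missing one when the point is to document where a file lived on the device.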
If you are interested in recreating an iTunes backup as defined in the manifest.db, using the domains, check out Jack Farley's python script here:
I can't count how many times I have parsed an iTunes backup to realize the following:
- Not all relevant items are parsed.
- For items that are parsed, the file path provided is the one created by the backup. This tells me nothing about where the file was originally located on the phone. I don't like having to query the manifest.db data store by hand one bit, thank you very much.
What if we normalized the backups to the full path as found within the device and then have our tools parse it? Would it possibly parse more if it believed it was a full iOS file system extraction?
Edward Greybeard (great pseudonym) made a script for easy normalization of these backups in Python. You can get it here:
For my testing I parsed an iTunes backup two times.
First run as a file system using domains.
Second run with fully normalized iOS paths.
The tool I used was my own, iLEAPP. I ran a limited set of artifacts for this example. The artifacts searched for were the same in both runs. You can find it here:
It is one thing to use a tool that has not been explicitly designed to parse iTunes backups and have it show these types of results. Would the same happen with a commercial tool? The answer, as with everything in forensics, is it depends. I have done these double runs with commercial tools and found that they sometimes parsed more artifacts, sometimes fewer, and sometimes both fewer and more. If the last one doesn't make sense, think of a report that parsed fewer artifacts overall but included a few that weren't in the first run.
Takeaways
Is this a type of analysis you should do on every iTunes backup in every case? No. Commercially available digital forensic tools do a great job as is. But if the case hangs on that one iTunes backup it doesn't hurt to try different approaches.
In regards to vendors, my hope is that when parsing iTunes backups they can add the fully normalized path for the artifact as part of the metadata panes in their tool's user interface. Normalized paths for all backup files. The analytical benefit of knowing where on the device a file was originally located can be incalculable.
Make no mistake, M.E.A.T. had me at hello. It is open source Python software for iOS extraction by Jack Farley (@JackFarley248). It creates a file system extraction of jailbroken iOS devices using Apple File Conduit 2.
I love Twitter but sometimes the algorithm, or just plain bad timing, has me miss some of the juiciest tweets and announcements. Thanks to Phill Moore (@phillmoore) I was made aware of the newly released Mobile Evidence Acquisition Toolkit (M.E.A.T.) by Jack Farley. I've been following his work and using his code for a while now. I cannot encourage you enough to give his scripts a shot. From learning and training to data acquisition and validation, these tools are worth your time.
M.E.A.T. will give you a full file system extraction from iOS devices with a single command. If you had the pleasure (or pain) of extracting a full file system over WiFi using SSH you will appreciate the speed and simplicity of this USB based method. The following is a quick guide and review on the scripts and how to execute them.
Pre-requisites & download
Go to the GitHub repo for M.E.A.T. and download the scripts. As stated in the repo's readme you will need a Windows machine with Python 3.7.4 or 3.7.2 installed. Before running the scripts make sure you unzip the downloaded file and run the following command from the root directory of the scripts.
pip install -r requirements.txt
This will make sure all the dependencies are installed. Also your target iOS device will need to be jailbroken and have Apple File Conduit 2 (AFC2) installed via Cydia.
For a great guide on what jailbreaking entails, with emphasis on the Checkra1n implementation of Checkm8, see Ian Whiffin & Shafik G. Punja's blog post here.
Assuming the target device is jailbroken with AFC2 and your computer has Python with all the proper dependencies, you are ready to go.
Let's beat it
Connect your iOS device to your computer. As expected make sure to select Trust Device at the prompt on your iOS device and provide the proper pin/code/pass as needed. With trust established we can connect to the phone.
Navigate to the script's root directory. It will look like this:
Open a command line interface at this location and run the following command to examine the help documentation.
python MEAT.py -h
You will see the following. Pretty self-explanatory.
Sick MEAT ascii art. I think it is cured by now.
As seen in the help you can generate MD5 and SHA1 hashes for your extracted files. Delicious!
For this example I will run a file system extraction that will pull everything from the root of the device. The logical option will only extract data from the \private\var\mobile\Media directory. Be aware that the -v option will add some additional time to the extraction since you will be getting a lot of output sent to the screen.
Before starting the extraction I create the output folder in the same directory from where I am running the script. You don't have to do this of course. I do because it shortens how long the extraction command will be. In this example I typed the following to create my output directory:
mkdir output
To start the extraction process in verbose mode without hashing type this:
python MEAT.py -iOS -filesystem -o ./output -v
At the start of the process you will see your target device information.
Since we are running in verbose mode you will see a Matrix-movie-like stream of text fly by on screen.
These are files being extracted. The process on my device took around 90 minutes for a 15 GB extraction in verbose mode. When done you will have an iOS file system folder and a log. If you selected the hashing option you will also have a csv file with all the calculated hashes.
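If you want to spot check the hashes independently, a few lines of Python will do it. The following is a sketch of my own and not how M.E.A.T. computes or lays out its csv; the column names and the hash_tree and hash_file names are assumptions made for illustration.

```python
import csv
import hashlib
import os
import tempfile


def hash_file(path, chunk_size=1024 * 1024):
    """Return (md5_hex, sha1_hex) for one file, reading it in chunks."""
    md5, sha1 = hashlib.md5(), hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return md5.hexdigest(), sha1.hexdigest()


def hash_tree(root, csv_path):
    """Hash every regular file under root and write the results to csv_path.

    Keep csv_path outside of root so the report itself is not hashed.
    Returns the number of files hashed.
    """
    count = 0
    with open(csv_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(["relative_path", "md5", "sha1"])  # assumed column names
        for folder, _dirs, files in os.walk(root):
            for name in sorted(files):
                full = os.path.join(folder, name)
                if os.path.islink(full) or not os.path.isfile(full):
                    continue  # skip symlinks and anything that is not a regular file
                writer.writerow([os.path.relpath(full, root), *hash_file(full)])
                count += 1
    return count


def demo():
    """Hash a throwaway folder holding one small file and return the csv rows."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "extraction")
        os.makedirs(os.path.join(root, "private"))
        with open(os.path.join(root, "private", "sample.bin"), "wb") as handle:
            handle.write(b"abc")
        report = os.path.join(tmp, "hashes.csv")
        count = hash_tree(root, report)
        with open(report, newline="", encoding="utf-8") as handle:
            return count, list(csv.reader(handle))


print(demo())
```

Reading in chunks matters here because extractions hold large files, and loading each one fully into memory would be slow and wasteful.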
Be aware that you might find some difficulty running the scripts on some older iOS devices. This is OK. No software executes 100% of the time. That is to be expected, as is these scripts getting better with time. The fact that these were released only a few days ago and you are already able to extract so much from so many iOS devices is amazing.
Here are the contents of the iOS-Filesystem directory.
Now what?
iLEAPP for M.E.A.T.
With the extraction done you can parse it with your favorite digital forensics commercial tools. In order to keep the open source vibe going I will parse it with my own tool, the iOS Logs Events And Properties Parser (iLEAPP), and see what we can get. You can get iLEAPP here:
As expected the tool parses the artifacts and provides a report. Currently I am updating the reporting function in iLEAPP so it has searching by report section and an overall more polished look. Credit goes to Yogesh Khatri (@swiftforensics) for his work on the reporting features in ALEAPP that I am now porting to iLEAPP.
iLEAPP command line execution:
iLEAPP reporting from M.E.A.T. extraction:
Conclusion:
We are truly living in the golden age of mobile digital forensics in the midst of a vibrant community of practitioners that work together to make the industry, tools, and knowledge more useful and accessible. Again, a big thank you to Jack Farley for his work on this tool. It is greatly appreciated.
For details on how to jailbreak an iOS device see here: https://www.doubleblak.com/blogPosts.php?id=12. Lots of detail on how to use Checkra1n so a full file system data dump can be extracted for analysis.
Due to Covid-19, and the fact that social interactions in person have been limited because of it, a slew of group video chat applications have taken off in popularity. One of those is Houseparty for all major operating systems. This post will deal with the iOS version of the app.
For this analysis I used the excellent public test image created by Josh Hickman (@josh_hickman1). His images have detailed documentation regarding what apps were used, what user activity was generated, and when. This process is key when dealing with an unknown app or one that is not parsed by commercial tools. You can get these excellent test images here:
In order to investigate a non-parsed app, the process I recommend is to generate a known data set. That way one is aware of what to look for while trying to decipher how the data is stored. In this case Josh's image, since it is so well documented, will serve as our research platform.
Our test image has the following documented activity:
This is the data we will be searching for in our app data store. The first step is to locate the app data folder in the iOS full file system extraction. To do this I ran the extraction through iLEAPP, a collection of Python 3 scripts designed to extract interesting artifacts from iOS images. You can download iLEAPP here:
After processing a report is generated. For simplicity I limited the report to the applicationstate.db artifact. This is the database that iOS uses to keep track of what apps are installed and where.
Using the search feature in the report I was able to locate the app and the location where the user generated activity is kept. If you are not sure what the bundle ID of the app is you can easily find it here:
The path to follow is under the Sandbox Path column. Notice how app directories in iOS are identified by a long GUID number. This is why querying the applicationstate.db is so important. It is the fastest way to determine what GUID name directory corresponds to the app of interest.
After arriving at the target directory we find the usual structure for iOS apps.
Inside the documents folder is our data store of interest: a Realm file named houseparty.rocky.realm.
In order to view the contents of this data store one has to have Realm Studio installed on the analysis computer. Realm Studio can be found here:
After opening the data store three classes are of interest. The first one is RealNote. This one contains the expected chats with recipient IDs and timestamps.
The second one is RealmPublicUser. This class contains information about the message recipients.
The third one is RealLocalContact. It has additional information about the local user account for the app.
One way of reporting the contents of these data stores is to export the contents to JSON.
With the data in JSON format one can extract whatever classes are needed for reporting purposes.
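As a sketch of that extraction step, the snippet below pulls selected classes out of an exported file. It assumes the export is a single JSON object whose keys are class names and whose values are lists of records; check your own export before relying on that. The class and field names in the sample are placeholders, not the real Houseparty schema.

```python
import json

# Invented stand-in for an exported data store: {class name: [records]}.
SAMPLE_EXPORT = json.dumps({
    "ExampleNote": [
        {"id": "n1", "text": "hi", "createdAt": "2020-04-01T10:00:00Z"},
    ],
    "ExampleUser": [
        {"id": "u1", "name": "tester"},
    ],
})


def extract_classes(export_text, wanted):
    """Return {class_name: records} for each requested class (empty list if absent)."""
    data = json.loads(export_text)
    return {name: data.get(name, []) for name in wanted}


selected = extract_classes(SAMPLE_EXPORT, ["ExampleNote", "MissingClass"])
for class_name, records in selected.items():
    print(class_name, len(records))
```

Returning an empty list for a missing class keeps reporting code simple, at the cost of hiding a misspelled class name, so double check names against what Realm Studio shows.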
A quick triage way to visualize the data without needing Realm Studio is to run the exported JSON file through a JSON to HTML converter. One can be found here:
This conversion helps, in my opinion, the user see delimiting lines between keys and values more easily.
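If you prefer to keep case data off online converters, the same idea takes only a few lines locally. This is a minimal sketch of my own, not the converter linked above: it turns a list of flat JSON records into one HTML table, and the records_to_html name is an invention for this example.

```python
import html


def records_to_html(records):
    """Render a list of dicts as a bordered HTML table, one row per record."""
    # Collect column names in first-seen order so the table is stable.
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    header = "".join(f"<th>{html.escape(str(col))}</th>" for col in columns)
    rows = []
    for record in records:
        cells = "".join(
            f"<td>{html.escape(str(record.get(col, '')))}</td>" for col in columns
        )
        rows.append(f"<tr>{cells}</tr>")
    body = "".join(rows)
    return f'<table border="1"><tr>{header}</tr>{body}</table>'


# Invented records; chat text is escaped so it cannot inject markup into the report.
print(records_to_html([{"sender": "u1", "text": "<b>hi</b>"}, {"sender": "u2"}]))
```

Escaping every value is the important part: chat content is untrusted input and should never be rendered as live HTML in a report.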
Conclusion
Realm databases are becoming more prevalent in mobile analysis. We will be well served in practicing how to approach these new data stores. I believe they could possibly replace SQLite databases in the future.
One of the most important aspects of digital forensics is the need to validate tool output. Sadly it is also one of the most overlooked by practitioners. The reason validation matters is obvious: NO TOOL IS PERFECT. No matter the vendor, no matter how gifted the developer is, no tool is perfect. Heck, nothing is (except the wife, of course). But when perfection, or as close as we can get to it, is the goal, then we have to test.
One of the easiest ways to conduct a quick sanity check validation is by comparing the output of one tool to the output of another. But what happens when you only have the output of one tool and nothing else? When the extraction is only parsed by the tool set that created it?
You actually do.
For some years now Cellebrite has been using the disk archiver format (http://dar.linux.free.fr/) to collect full file system data from target devices. This format is known as DAR. It goes without saying that UFED4PC and Physical Analyzer (PA) are THE industry leading tools with incredible technology behind them. Anyone that has used their products can attest that they are one step away from magic (and that step, I contend, is above not below). Any comments on the tool are made in the context of full praise for the great work they do as developers, technicians, and experts. Folks that have a zeal for truth and clarity in the service of justice.
Today a new version of PA (7.31.0.222) came out. It addresses a bug in the way creation times were being presented after processing a DAR file with previous PA versions. For full disclosure, it is well documented that I am not a big fan of the format, or any other obscure format for that matter.
Then again I am all for progress and advancement if something is better, faster, and/or more modern. Still, with the new come risks. I remember back in the day when Microsoft redid the whole network stack for Windows Vista from scratch and some old network vulnerabilities resurfaced. The point of this analogy is that new implementations will always require extra work and vigilance due to their newness, both in implementation and application.
The new release addresses many things but mainly, in my opinion, the following:
PA was not parsing the creation date from the DAR file.
There was some issue with access and modified dates.
Below are a couple of screenshots from the release.
Birth - Creation time
Access and modified
Per the release notes, older versions of PA were not presenting the changed category of timestamps and were giving erroneous dates for the creation category. We infer this from the fact that only the change, modify, and access timestamps were supported by DAR as seen above.
This confused me. If DAR itself only supports 3 types of timestamps, how can just reprocessing, and not re-extracting, fix the problem? I will assume that the release meant to say that DAR extractions do contain all timestamps but that older versions of PA were ignoring the change times and putting a wrong timestamp under the creation label.
Here is how that looks when we compare the new PA version with the previous one. The following is a screenshot of some image metadata parsed from a DAR by PA 7.28.0.203.
As explained in the release there are only 3 timestamps visible. These are created, accessed and modified. There is no changed timestamp. Let's now look at the same image from the same DAR extraction as parsed with PA 7.31.0.222.
Notice how there is now a 4th timestamp, changed. Even more interesting is to note the difference in creation times between PA versions. The following seems to be the issue in the old PA output:
The changed time is the modified time.
The creation time is the accessed time.
In this particular case the difference between the creation and access times is 17 days. It goes without saying how critical the misplacement of this information can be when dealing with geolocation data or any file type that is tied to user generated activity. From when an image was taken to the time it was last accessed, timestamps can be the determining factor between freedom and incarceration. Just on a timestamp! If your case depended on a key creation timestamp you need to go back, as the release notes state, and reprocess that case.
Conclusions:
I love Cellebrite. The previous is not a dig at them, their technology, and much less their people. They came out with a fix and release notes about it, as one expects from industry leading vendors when the inevitable bug surfaces. They aren't the first, won't be the last, and that is ok. I appreciate and value the cutting edge work they do and how they push the industry forward with newer, better, and faster things.
Trust but verify. This is hard when presented with somewhat of a black box scenario. In this case the only way to test that I could see was to get a test phone, create known data, generate a DAR extraction, jailbreak the device, SSH into the device, and compare timestamps between the device and the tool. Not an easy nor quick task for most users. How this can be approached as a community exercise, since no one person can do this alone, is something we should all think about and share ideas for.
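The comparison step at the end of that workflow is the easy part to automate. Below is a small sketch under stated assumptions: you have already collected, by whatever means, one set of timestamps per file from the device and one from the tool report, keyed by path. The field names, the sample path, and all the dates are invented for the example and do not come from any real case or tool export.

```python
FIELDS = ("created", "modified", "accessed", "changed")


def compare_timestamps(device, tool):
    """Return {path: {field: (device_value, tool_value)}} for every mismatch.

    Both inputs map a file path to a dict of timestamp strings. A field that is
    missing on one side shows up as None, so absent timestamps are flagged too.
    """
    mismatches = {}
    for path in sorted(set(device) | set(tool)):
        dev = device.get(path, {})
        out = tool.get(path, {})
        diffs = {
            field: (dev.get(field), out.get(field))
            for field in FIELDS
            if dev.get(field) != out.get(field)
        }
        if diffs:
            mismatches[path] = diffs
    return mismatches


# Invented values: the tool shows the access time under "created" and has no "changed".
DEVICE = {"/example/IMG_0001.JPG": {
    "created": "2020-01-01T10:00:00Z", "modified": "2020-01-01T10:00:05Z",
    "accessed": "2020-01-18T09:30:00Z", "changed": "2020-01-01T10:00:05Z"}}
TOOL = {"/example/IMG_0001.JPG": {
    "created": "2020-01-18T09:30:00Z", "modified": "2020-01-01T10:00:05Z",
    "accessed": "2020-01-18T09:30:00Z"}}

for path, diffs in compare_timestamps(DEVICE, TOOL).items():
    print(path, diffs)
```

Comparing plain strings only works if both sources are normalized to the same time zone and format first; that normalization is left out of this sketch on purpose.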
Vendors should not move to a new technology without providing some backward compatibility to more established formats. It could be temporary acting as a bridge to newer ones. For example E01s are still supported even when newer formats, like AFF4, are now around. If the new format is needed because there is no other way to do the thing then the way it was implemented should be shared with the community. For an example see Blackbag (a Cellebrite company) and how they provided the specs to their APFS implementations. Incredible work.
If nothing else, I hope the previous motivates you to read all release notes that have to do with the tools you depend on for work. As an examiner you need to understand what the tool does, what new things it does, and what things it has failed on and how they were fixed. Own your data. Own your tools. Research, test and validate. And whatever you find, make it known.
With UFED support of Checkm8 for iOS extractions Cellebrite uses the DAR (Disk Archiver) format as their archiving file type of choice. It works great and captures the necessary data but it is not easy to work with nor does it have widespread third party support.
Cellebrite? Yes!!! - DAR? ok...
That being said the nice folks at Cellebrite promised additional image support in the not too distant future.
If you speaketh they will listenth
In the meantime you might want to validate the Cellebrite tool output or run a third party tool to generate a particular visualization. What to do?
The solution is pretty easy. First get the DAR binaries for your favorite platform. For this example we will use Windows. The files can be found here:
When the files are extracted add dar.exe to your Windows Path so you can access the executable from any command line window. For help on how to do that go here:
After that is done go to your UFED Checkm8 extraction and identify the file/s that end in .dar.
In my case I decided to move the FullFileSystem.1.dar to the same directory as the dar.exe program to make my extraction command as short as possible. To extract make sure to be in the directory that will hold all the extracted files coming from the dar file, the destination directory. From there run dar.exe with the -x argument and the location of the dar archive. Since I placed it in the same directory as the dar.exe the path is as short as it can be.
Notice how the executable picks up on the #1 in the filename and assumes there are more parts to the dar file. When that warning shows up just press enter and let it move forward. After a little bit you will see files and directory locations fly by the command prompt as everything is being unarchived. When done you should see the following:
Success!!!
As seen in the image with the command line execution, the data is now in the extracted directory. Now you can point third party tools that can traverse directories (Apollo, iLEAPP, KAPE, etc...) at it and get the needed validations and/or visualizations.
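If you do this often, the extraction step can be wrapped in a short script. The sketch below mirrors the manual approach described earlier: it runs dar with the -x argument from inside the destination directory. It assumes dar is on your Path, passes the archive argument through unchanged, and the function names are my own. How the Windows build of dar handles path formats is something to verify on your own machine.

```python
import subprocess
from pathlib import Path


def build_dar_command(archive, dar_exe="dar"):
    """Return the argument list for extracting an archive with dar's -x option."""
    return [dar_exe, "-x", str(archive)]


def extract_dar(archive, destination, dar_exe="dar"):
    """Run dar from inside destination and return its exit code.

    stdin is left attached to the console on purpose: dar may pause with a
    warning that needs the Enter key before it continues.
    """
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    command = build_dar_command(Path(archive).resolve(), dar_exe)
    completed = subprocess.run(command, cwd=destination)
    return completed.returncode


# Show the command without running it; call extract_dar() to actually extract.
print(build_dar_command("FullFileSystem.1.dar"))
```

Document the exact command and exit code in your notes either way, since the extracted directory becomes the input for every tool you run afterwards.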
For testing I pointed iLEAPP to the extracted files directory.
Notice the Extraction location and Extraction type entries. The scripts were able to parse all the data with no issues.
As examiners we will be well served to live in the spirit of the survivalist mentality. To always improvise, adapt, and overcome (while documenting of course.) Find a way to get the data, make the correct interpretations, fulfill the mission.
As always remember to validate all findings and be aware I can be reached on twitter @AlexisBrignoni and via email 4n6[at]abrignoni[dot]com.
From the department of unimaginative names comes ALEAPP, the sister script to iLEAPP. For additional information on iLEAPP go here.
ALEAPP will aggregate all my previous Android parsing scripts as well as be the framework for future script development. Previous users of iLEAPP will recognize the same interface and workflow in ALEAPP. The script can parse logical file systems as well as tar and zip extractions, and provides reports in html and csv formats.
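To illustrate the idea of treating a folder, a tar, and a zip the same way, here is a standalone sketch. It is not ALEAPP's code; the function names, the sample paths, and the search pattern are all hypothetical. It lists file names from any of the three input types and filters them with a wildcard pattern.

```python
import fnmatch
import os
import tarfile
import tempfile
import zipfile


def list_members(source):
    """Yield forward-slash relative file paths from a directory, zip, or tar."""
    if os.path.isdir(source):
        for folder, _dirs, files in os.walk(source):
            for name in files:
                rel = os.path.relpath(os.path.join(folder, name), source)
                yield rel.replace(os.sep, "/")
    elif zipfile.is_zipfile(source):
        with zipfile.ZipFile(source) as archive:
            for name in archive.namelist():
                if not name.endswith("/"):  # skip directory entries
                    yield name
    elif tarfile.is_tarfile(source):
        with tarfile.open(source) as archive:
            for member in archive.getmembers():
                if member.isfile():
                    yield member.name
    else:
        raise ValueError(f"Unsupported extraction type: {source}")


def find_artifacts(source, pattern):
    """Return sorted member paths matching a case-sensitive wildcard pattern."""
    return sorted(
        name for name in list_members(source) if fnmatch.fnmatchcase(name, pattern)
    )


def demo():
    """Build a throwaway zip with invented paths and search it."""
    with tempfile.TemporaryDirectory() as tmp:
        zip_path = os.path.join(tmp, "extraction.zip")
        with zipfile.ZipFile(zip_path, "w") as archive:
            archive.writestr("data/com.example.app/databases/example.db", b"")
            archive.writestr("data/com.example.app/notes.txt", b"")
        return find_artifacts(zip_path, "*/databases/*.db")


print(demo())
```

Normalizing every path to forward slashes up front means one search pattern works for all three input types, regardless of the analysis machine's operating system.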
This first release only parses events and accounts from the Wellbeing Android database. I can't thank Josh Hickman enough for sharing his research on the Wellbeing database and allowing me to use it to make the first ALEAPP artifacts. His research is a must read if you do Android digital forensics and can be found here:
The next artifact to be supported will be UsageStats events both in XML and protobuf formats. For details on this artifact go here and here. The standalone script that parses UsageStats can be found here. ALEAPP will absorb that functionality. Many thanks to Yogesh Khatri for his UsageStats research and coding.
The prerequisites for ALEAPP are:
Python 3.7.4 and above
pip install six
pip install PySimpleGUI
The next screenshots illustrate the Wellbeing artifacts output.
The Wellbeing Account report normalizes a protobuf file for account information. The data is shown in both parsed and unparsed formats.
Account data
The Wellbeing Events report has tons of useful data. Josh Hickman's post has all the details. Great investigative data source.
A csv report example.