Initialization vectors: 2020

Tuesday, September 15, 2020

It's alive! - Attachment links in Discord

What happens to the URL links inside Discord chats if you copy-paste them into an internet-connected browser? You might be surprised to know that...

In the past I have written about the structure of Discord chats in the following platforms:

Windows:

https://abrignoni.blogspot.com/2018/03/finding-discord-app-chats-in-windows.html

macOS:

https://abrignoni.blogspot.com/2018/03/finding-discord-chats-in-os-x.html

iOS: 

https://abrignoni.blogspot.com/2018/08/finding-discord-chats-in-ios.html

Linux:

https://abrignoni.blogspot.com/2018/08/finding-discord-chats-in-linux-dfir.html

Android:

https://abrignoni.blogspot.com/2017/07/discord-app-forensic-artifacts-in.html

Viewing extracted data using an Android emulator:

https://abrignoni.blogspot.com/2017/08/viewing-extracted-android-app-data.html

Timely updates to the research have been provided by generous folks, like @TheKateCain, here:

https://abrignoni.blogspot.com/2020/08/update-on-discord-forensic-artifacts.html

For the last couple of days I've been working on creating a parser for Discord JSON chat files in iLEAPP. If you are not familiar with iLEAPP, it is a Python 3 framework designed to parse useful forensic artifacts from iOS devices. More on iLEAPP here. I wanted to validate some findings on a case I am working with the amazing @i_am_the_gia, and as part of the process I used the newly created parser on @Josh_Hickman1's excellent iOS testing images. You can get his testing images here.
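For those curious about what such a parser does under the hood, here is a minimal sketch (not the actual iLEAPP module) that pulls the author, timestamp, content, and attachment URLs from a Discord chat JSON file. The key names follow the Discord message structure discussed in this and earlier posts; adjust them if your cached files differ.

import json

def parse_discord_chat(json_path):
    # The cached chat file is a list of message objects.
    with open(json_path, encoding='utf-8') as f:
        messages = json.load(f)
    rows = []
    for message in messages:
        author = message.get('author', {}).get('username', '')
        timestamp = message.get('timestamp', '')
        content = message.get('content', '')
        # Attachment URLs are the links we will test in a browser below.
        urls = [a.get('url', '') for a in message.get('attachments', [])]
        rows.append((timestamp, author, content, ', '.join(urls)))
    return rows

if __name__ == '__main__':
    for row in parse_discord_chat('discord.json'):
        print(row)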

Here is iLEAPP's HTML report for the chat:


Here is the output for the Discord user's email and user ID:


At that same moment I watched the most amazing trailer for The Mandalorian Season #2 thanks to @KevinPagano3. As you all should know by now, the Child steals every scene with just how cute it is.

Going back to my report, I copied one of the URLs in the attachment column and pasted it into an internet-connected browser to see if it would come up. In the past (2017) I did some testing on Discord for Android and found out that the links in chats could be copy-pasted into a browser and be accessible from anywhere by anyone.

With Josh's image I confirmed that was still the case. And what did the image at that URL contain?


Coincidence? I think not. :-D
As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

May the Force be with you too.
-Brigs




Monday, August 3, 2020

Update on Discord forensic artifacts for iOS & Windows

Thanks to @TheKateCain for the following artifacts we can find on Discord for iOS. All the artifacts are located within the application folder for the app. For details on how to identify and extract the application folder see here: https://abrignoni.blogspot.com/2018/08/finding-discord-chats-in-ios.html

The email address used to download the app to the device can be found here:
RCTAsyncLocalStorage_V1/manifest.json

The user ID and email can be found here on an iOS device: /private/var/mobile/Containers/Data/Application/*UUID*/Documents/mmkv/mmkv.default
Search for user_id_cache and email_cache. 

In Windows they can be found here: 
USER/appdata/roaming/discord/Local Storage/leveldb/000003.log 
Search for user_id_cache and email_cache. Note that this gives only the user ID, not the username. Search the messages in the cache.db (iOS) or 50.json (Windows) to match up the user ID with the username.
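If you want to script this search instead of eyeballing a hex editor, here is a minimal triage sketch (assuming nothing beyond the file names above) that scans a binary data store for the user_id_cache and email_cache keys and prints the bytes that follow each hit:

import sys

# Keys described above; the data stores are binary, so we search raw bytes.
KEYS = [b'user_id_cache', b'email_cache']

def find_keys(path, context=64):
    # Read the whole file (e.g., mmkv.default or 000003.log) and print the
    # bytes that follow every occurrence of each key.
    with open(path, 'rb') as f:
        data = f.read()
    for key in KEYS:
        hit = data.find(key)
        while hit != -1:
            chunk = data[hit:hit + len(key) + context]
            print(f'{key.decode()} @ offset {hit}: {chunk}')
            hit = data.find(key, hit + 1)

if __name__ == '__main__':
    find_keys(sys.argv[1])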


Thank you so much TheKateCain. Super useful information!

Sunday, July 26, 2020

DFIR Python Study Group Syllabus Part 2

Greetings! Below is a list of assignments from recent classes.

Reminder: Assignments listed below indicate what to complete before class; make sure that you are signed in to Discord in order to access the practice files

🐍

Class 10 on 06/25/2020
  • No homework / study hall

Class 11 on 06/30/2020
  • No homework / study hall

Class 12 on 07/02/2020
  • Conduct online research on argparse and make a script that takes two arguments and prints them to screen (see the sketch after this list)
  • Research dunders for name and main
  • Kik_Discord_Parser.py: review for argparse and main() implementations
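For reference, a minimal sketch of what the Class 12 assignment is asking for, using argparse and the name/main dunders (the file and argument names are just placeholders). Run it as python two_args.py hello world.

import argparse

def main():
    # Two positional arguments, printed to screen.
    parser = argparse.ArgumentParser(description='Print two arguments to screen')
    parser.add_argument('first', help='first value to print')
    parser.add_argument('second', help='second value to print')
    args = parser.parse_args()
    print(args.first)
    print(args.second)

# The __name__ dunder check keeps main() from running when the file is imported.
if __name__ == '__main__':
    main()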

Class 13 on 07/07/2020
  • Ch. 5 pp. 195-211 until “There Are No Dumb Questions”
  • json_in_sqlite.zip: Download for class

Class 14 on 07/09/2020
  • Slack_Messages.sql: Add to query to parse fields from the Slack database from previous class

Class 15 on 07/14/2020
  • Ch. 6 pp. 243-264 until “Test Drive”
  • LastBuildInfo.plist: Write script that pulls out every key and value

Class 16 on 07/21/2020
  • Ch. 6 pp. 265-280 until “Chapter 6’s Code”
  • nskeyedarchive_files.zip: Look for UNNotificationUserInfo and pull out screen_name, full_name, and video url using the Deserializer library

Class 17 on 07/23/2020

Class 18 on 07/30/2020
  • Ch. 8 pp. 309-334 “Chapter 8’s Code” / blank page

Tuesday, June 30, 2020

DFIR Python Study Group Syllabus

Interested in learning Python? Here's the syllabus from our DFIR Python Study Group course. Follow along by getting the book, doing the homework, and watching the YouTube videos.

Note: Assignments listed below indicate what to complete before class; make sure that you are signed in to Discord in order to access the exercises via the links

Textbook: Head First Python: A Brain-Friendly Guide, 2nd edition

🐍

Class 0 on 05/21/2020
  • Ch. 1 pp. 1-19 until “What We Already Know”


Class 1 on 05/26/2020
  • Ch. 1 pp. 20-46 until “Chapter 1’s Code”
 

Class 2 on 05/28/2020
  • Ch. 2 pp. 47-55 until “Creating Lists Literally”
 

Class 3 on 06/02/2020
  • Ch. 2 pp. 56-94 until “Chapter 2’s Code, 2 of 2” / blank page


Class 4 on 06/04/2020
  • build.prop
    • Open and read file using a for loop
    • Use if and elif to select interesting items in the file
    • Select the values to the right of = using start, stop, step, or split()
    • Write the extracted data to a text file (see the sketch after this list)
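For reference, a minimal sketch of the build.prop assignment above (the chosen properties are just examples; any key in the file works the same way):

# Read build.prop line by line, keep a few interesting properties, and write
# them to a report file.
results = []

with open('build.prop') as props:
    for line in props:
        line = line.strip()
        if '=' not in line:
            continue
        key, value = line.split('=', 1)
        if key == 'ro.product.model':
            results.append('Model: ' + value)
        elif key == 'ro.product.manufacturer':
            results.append('Manufacturer: ' + value)
        elif key == 'ro.build.version.release':
            results.append('Android version: ' + value)

with open('build_prop_report.txt', 'w') as report:
    report.write('\n'.join(results))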
 

Class 5 on 06/09/2020
  • Ch. 3 pp. 95-121 until “Test Drive”
    • Note: p. 98 is outdated and new info can be found here


Class 6 on 06/11/2020
  • Ch. 3 pp. 122-144 until “Chapter 3’s Code, 2 of 2”
  • discord.json
    • Pull the chats (content key) and user identifiers


Class 7 on 06/16/2020
  • Ch. 4 pp. 145-169 until “Test Drive”


Class 8 on 06/18/2020


Class 9 on 06/23/2020
  • homework_files.zip: Create a script that does the following
    • Calls a function that selects timestamp, partner_jid, and body from the messagesTable in the Kik database; prints them to screen; and calls a function to generate a text file report
    • Calls a function that extracts the timestamp, author, and content from the Discord JSON file; prints them to screen; and calls a function to generate a text file report


Class 10 on 06/25/2020
  • No homework / study hall


Class 11 on 06/30/2020
  • No homework / study hall


Class 12 on 07/02/2020
  • Research information online about argparse
  • Research information online about dunders for name and main
  • Make a script that takes two arguments and prints them to screen

Saturday, May 2, 2020

Normalizing iTunes Backups - Squeeze more data out of them, possibly...

Even though iOS full file system extractions are fairly commonplace, the need to parse iTunes backups has not subsided. In many situations the only piece of evidence available could be an iTunes backup located on an external drive or one acquired from an iCloud data set. Most forensic tools are designed to parse these backups for relevant artifacts. What would happen if we took these backups and fed them to our tools in the same file structure they would be in if the backup had been restored to the iOS device? In other words, what would our tools do if the backup file path structure was normalized to the file path structure found within the iOS device that created them?

Video version


iTunes Backups - A super, ultra, mega short primer.

For those not familiar with iTunes backups I highly recommend reading Jack Farley's blog post on the topic. It can be found here:
http://farleyforensics.com/2019/04/14/forensic-analysis-of-itunes-backups/
An iTunes backup takes files from the device, renames them, and puts them in different folders; the key to recreating the almost original file structure is the contents of the manifest.db database that is also created at backup time. I say almost because instead of keeping track of the whole path as it would have been on the device, it abstracts part of it under a series of category names called Domains. I think a comparison of paths is the best way to understand such transformations.

Filename: DataUsage.sqlite

Path as created by the backup:
0d/0d609c54856a9bb2d56729df1d68f2958a88426b/
Path as tracked in manifest.db:
WirelessDomain/Library/Databases/DataUsage.sqlite
Path within the device:
private/var/wireless/Library/Databases/DataUsage.sqlite
Notice how the farther away we are from the path as found on the device, the more abstract it becomes. I bet you are wondering how I could tell that WirelessDomain stands for /var/wireless within the file system. The domain-to-path mapping is contained within the device in the following location:
System/Backup/Locations/Domain.plist.
The plist is not part of the backup and can only be accessed from an iOS full file system extraction. In my testing the domains seem to be the same across all iOS devices. Again, I highly recommend reading the previous link to understand the how and the why of these transformations.

If you are interested in recreating an iTunes backup as defined in the manifest.db, using the domains, check out Jack Farley's python script here:
https://github.com/jfarley248/iTunes_Backup_Reader
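To make the domain mapping concrete before moving on, here is a minimal sketch (not Jack's script) that reads the Files table in Manifest.db and prints the backup path next to a device-style path. The domain_map dictionary only contains the WirelessDomain example from above; real normalization work would need the complete domain mapping from the plist mentioned earlier.

import sqlite3

# Example domain-to-device-path entry from this post only.
domain_map = {'WirelessDomain': 'private/var/wireless'}

conn = sqlite3.connect('Manifest.db')
rows = conn.execute('SELECT fileID, domain, relativePath FROM Files')
for file_id, domain, rel_path in rows:
    backup_path = f'{file_id[:2]}/{file_id}'      # path as created by the backup
    device_root = domain_map.get(domain, domain)  # fall back to the raw domain name
    device_path = f'{device_root}/{rel_path}' if rel_path else device_root
    print(backup_path, '->', device_path)
conn.close()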
The problem

I can't count how many times I have parsed an iTunes backup to realize the following:
- Not all relevant items are parsed.
- For items that are parsed, the file path provided is the one created by the backup. This tells me nothing about where the file was originally located on the phone. I don't like having to query the manifest.db data store by hand one bit, thank you very much.

What if we normalized the backups to the full paths as found within the device and then had our tools parse them? Would they possibly parse more if they believed the input was a full iOS file system extraction?

Edward Greybeard (great pseudonym) made a script for easy normalization of these backups in Python. You can get it here:
https://github.com/edward-greybeard/iOS-UNF
For my testing I parsed an iTunes backup two times.
First run as a file system using domains.


Second run with fully normalized iOS paths.

The tool I used was my own, iLEAPP. I ran a limited set of artifacts for this example. The artifacts searched for were the same in both runs. You can find it here:
https://github.com/abrignoni/iLEAPP
Results

The first run provided 7 artifacts.


 The second run provided 9 artifacts.


Caveats

It is one thing to use a tool that has not been explicitly designed to parse iTunes backups and have it show this type of result. Would the same happen with a commercial tool? The answer, as with everything in forensics, is it depends. I have done this type of double run with commercial tools and found that they either parsed more artifacts, sometimes fewer, and sometimes both fewer and more. If the last one doesn't make sense, think of a report that parsed fewer artifacts overall but included a few that weren't in the first run.

Takeaways

Is this a type of analysis you should do on every iTunes backup in every case? No. Commercially available digital forensic tools do a great job as is. But if the case hangs on that one iTunes backup it doesn't hurt to try different approaches.

In regard to vendors, my hope is that when parsing iTunes backups they can add the fully normalized path for the artifact as part of the metadata panes in their tool user interface. Normalized paths for all backup files. The analytical benefit you can get from knowing where on the device a file was originally located can be incalculable.

As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Thursday, April 23, 2020

Meet M.E.A.T. - It's really well done!

Short version

Make no mistake, M.E.A.T. had me at hello. It is open source Python software for iOS extraction by Jack Farley (@JackFarley248). It creates a file system extraction from jailbroken iOS devices using Apple File Conduit 2.

Meat puns are the wurst

Get some M.E.A.T. here:
https://github.com/jfarley248/MEAT
I also highly recommend Jack's iTunes Backup Reader. It takes unencrypted iTunes backups and recreates the full file system structure. Get it here:
https://github.com/jfarley248/iTunes_Backup_Reader
Update:

Video version


Long version

I love Twitter but sometimes the algorithm, or just plain bad timing, has me miss some of the juiciest tweets and announcements. Thanks to Phill Moore (@phillmoore) I was made aware of the newly released Mobile Evidence Acquisition Toolkit (M.E.A.T.) by Jack Farley. I've been following his work and using his code for a while now. I cannot encourage you enough to give his scripts a shot. From a learning/training use to a data acquisition and validation perspective, these tools are worth your time.

M.E.A.T. will give you a full file system extraction from iOS devices with a single command. If you had the pleasure (or pain) of extracting a full file system over WiFi using SSH you will appreciate the speed and simplicity of this USB based method. The following is a quick guide and review on the scripts and how to execute them.

Pre-requisites & download

Go to the GitHub repo for M.E.A.T. and download the scripts. As stated in the repo's readme you will need a Windows machine with Python 3.7.4 or 3.7.2 installed. Before running the scripts make sure you unzip the downloaded file and run the following command from the root directory of the scripts.
pip install -r requirements.txt
This will make sure all the dependencies are installed. Also your target iOS device will need to be jailbroken and have Apple File Conduit 2 (AFC2) installed via Cydia.

For a great guide on what jailbreaking entails, with emphasis on the Checkra1n implementation of Checkm8, see Ian Whiffin & Shafik G. Punja's blog post here.

Assuming the target device is jailbroken with AFC2 and your analysis machine has Python with all the proper dependencies, you are ready to go.

Let's beat it

Connect your iOS device to your computer. As expected make sure to select Trust Device at the prompt on your iOS device and provide the proper pin/code/pass as needed. With trust established we can connect to the phone.

Navigate to the script's root directory. It will look like this:


 Open a command line interface at this location and run the following command to examine the help documentation.
python MEAT.py -h
You will see the following. Pretty self-explanatory.

Sick MEAT ascii art. I think it is cured by now.

As seen in the help you can generate MD5 and SHA1 hashes for your extracted files. Delicious!
For this example I will run a file system extraction that will pull everything from the root of the device. The logical option will only extract data from the /private/var/mobile/Media directory. Be aware that the -v option will add some additional time to the extraction since you will be getting a lot of output sent to the screen.

Before starting the extraction I create the output folder in the same directory from where I am running the script. You don't have to do this of course. I do because it shortens how long the extraction command will be. In this example I typed the following to create my output directory:
mkdir output
To start the extraction process in verbose mode without hashing type this:
python MEAT.py -iOS -filesystem -o ./output -v
At the start of the process you will see your target device information.


Since we are running in verbose mode you will see a Matrix-movie-like flood of text fly by on screen.


These are files being extracted. The process on my device took around 90 minutes for a 15 GB extraction in verbose mode. When done you will have an iOS file system folder and a log. If you selected the hashing option you will also have a csv file with all the calculated hashes.


Be aware that you might find some difficulty running the scripts on some older iOS devices. This is OK. No software executes 100% of the time. That is to be expected, as is these scripts getting better with time. The fact that, even though these were released a few days ago, you are able to extract so much from so many iOS devices is amazing.

Here are the contents of the iOS-Filesystem directory.

Now what?
iLEAPP for M.E.A.T.

With the extraction done you can parse it with your favorite digital forensics commercial tools. In order to keep the open source vibe going I will parse it with my own tool, the iOS Logs Events And Properties Parser (iLEAPP), and see what we can get. You can get iLEAPP here:
https://github.com/abrignoni/iLEAPP
As expected the tool parses the artifacts and provides a report. Currently I am updating the reporting function in iLEAPP so it has searching by report section and an overall more polished look. Credit goes to Yogesh Khatri (@swiftforensics) for his work on the reporting features in ALEAPP that I am now porting to iLEAPP.

iLEAPP command line execution:


iLEAPP reporting from M.E.A.T. extraction:


Conclusion:

We are truly living in the golden age of mobile digital forensics in the midst of a vibrant community of practitioners that work together to make the industry, tools, and knowledge more useful and accessible. Again, a big thank you to Jack Farley for his work on this tool. It is greatly appreciated.

As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.



Sunday, April 19, 2020

iOS Houseparty app: More Realm

Short version:

The Houseparty app keeps user-generated data in the following Realm database:
/private/var/mobile/Containers/Data/Application/*GUID*/Documents/houseparty.rocky.realm
For details on how to jailbreak an iOS device see here: https://www.doubleblak.com/blogPosts.php?id=12. Lots of detail on how to use Checkra1n so a full file system data dump can be extracted for analysis.

For details on Realm databases and how to approach their examination see here: https://abrignoni.blogspot.com/2019/11/realm-database-storage-primer-for.html
It is of note that Cellebrite Physical Analyzer has a database browser that is compatible with Realm databases.

Update

Video version of this blog post:



Long version:

Due to Covid-19, and the fact that in-person social interactions have been limited because of it, a slew of group video chat applications have taken off in popularity. One of those is Houseparty, available for all major operating systems. This post will deal with the iOS version of the app.


For this analysis I used the excellent public test image created by Josh Hickman (@josh_hickman1). His images have detailed documentation regarding what apps were used, what user activity was generated, and when. This documentation is key when dealing with an unknown app or one that is not parsed by commercial tools. You can get these excellent test images here:
https://thebinaryhick.blog/2020/04/16/ios-13-images-images-now-available/

In order to investigate a non-parsed app, the process I recommend is to generate a known data set collection. That way one is aware of what to look for while trying to decipher how the data is stored. In this case Josh's image, since it is so well documented, will serve as our research platform.

Our test image has the following documented activity:
This is the data we will be searching for in our app data store. The first step is to locate the app data folder in the iOS full file system extraction. To do this I ran the extraction through iLEAPP. This is a collection of Python 3 scripts designed to extract interesting artifacts from iOS images. You can download iLEAPP here:
https://github.com/abrignoni/iLEAPP

After processing a report is generated. For simplicity I limited the report to the applicationstate.db artifact. This is the database that iOS uses to keep track of what apps are installed and where.


Using the search feature in the report I was able to locate the app and the location where the user generated activity is kept. If you are not sure what the bundle ID of the app is you can easily find it here:
https://offcornerdev.com/bundleid.html

The path to follow is under the Sandbox Path column. Notice how app directories in iOS are identified by a long GUID. This is why querying the applicationstate.db is so important. It is the fastest way to determine which GUID-named directory corresponds to the app of interest.
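For those who prefer to script the lookup, here is a minimal triage sketch assuming the common layout of this database (application_identifier_tab, key_tab, and kvs tables with a keyed-archive compatibilityInfo blob). It lists each bundle ID alongside any sandbox path string found inside that blob. iLEAPP does this more thoroughly; treat this as an illustration only.

import plistlib
import sqlite3

QUERY = """
SELECT application_identifier_tab.application_identifier, kvs.value
FROM kvs
JOIN application_identifier_tab ON kvs.application_identifier = application_identifier_tab.id
JOIN key_tab ON kvs.key = key_tab.id
WHERE key_tab.key = 'compatibilityInfo'
"""

conn = sqlite3.connect('applicationState.db')
for bundle_id, blob in conn.execute(QUERY):
    if not blob:
        continue
    try:
        archive = plistlib.loads(blob)
    except plistlib.InvalidFileException:
        continue
    # The sandbox path shows up as a plain string inside the keyed archive.
    paths = [o for o in archive.get('$objects', [])
             if isinstance(o, str) and '/Containers/Data/Application/' in o]
    print(bundle_id, paths)
conn.close()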

After arriving at the target directory we find the usual app structure for iOS apps.


Inside the Documents folder is our data store of interest: a Realm file named houseparty.rocky.realm.
In order to view the contents of this data store one has to have Realm Studio installed on the analysis computer. Realm Studio can be found here:
https://realm.io/products/realm-studio
After opening the data store three classes are of interest. The first one is RealNote. This one contains the expected chats with recipient IDs and timestamps.


The second one is RealmPublicUser. This class contains information about the message recipients.


The third one is RealLocalContact. It has additional information of the local user account for the app.


One way of reporting the contents of these data stores is to export the contents to JSON.


With the data in JSON format one can extract whatever classes are needed for reporting purposes.
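As an illustration, here is a minimal sketch assuming Realm Studio's JSON export is a dictionary keyed by class name, with the class name taken from this post and a hypothetical export file name. It pulls one class out and prints each entry:

import json

# Hypothetical export file name; the export is assumed to be a dictionary
# keyed by class name with a list of objects per class.
with open('houseparty_export.json', encoding='utf-8') as f:
    realm_data = json.load(f)

# Pull the note/chat class named in this post and print each entry.
for note in realm_data.get('RealNote', []):
    for key, value in note.items():
        print(f'{key}: {value}')
    print('-' * 40)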
A quick triage way to visualize the data without needing Realm Studio is to process the exported JSON file through a JSON to HTML converter. One can be found here:
https://github.com/abrignoni/JSON-to-HTML-and-XLS
This conversion helps the user, in my opinion, see the delimiting lines between keys and values more easily.


Conclusion

Realm databases are becoming more prevalent in mobile analysis. We will be well served by practicing how to approach these new data stores. I believe they could possibly replace SQLite databases in the future.

As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Tuesday, March 17, 2020

Trust but verify: Formats, timestamps, and validation

One of the most important aspects of digital forensics is the need to validate tool output. Sadly it is also one of the most overlooked by practitioners. The reason for this is obvious: NO TOOL IS PERFECT. No matter the vendor, no matter how gifted the developer is. No tool is perfect. Heck, nothing is (except the wife of course.) But when perfection, or as close as we can get to it, is the goal, then we have to test.

One of the easiest ways to conduct a quick sanity check validation is by comparing the output of one tool to the output of another. But what happens when you only have the output of one tool and nothing else? When the extraction is only parsed by the tool set that created it?

You actually do.
For some years now Cellebrite has been using the disk archiver format (http://dar.linux.free.fr/) to collect full file system data from target devices. This format is known as DAR. It goes without saying that UFED4PC and Physical Analyzer (PA) are THE industry leading tools with incredible technology behind them. Anyone who has used their products can attest that they are one step away from magic (and that step I contend is above not below.) Any comments on the tool are done in the context of full praise and illustrate the great work they do as developers, technicians, and experts. Folks that have a zeal for truth and clarity in the service of justice.

Today a new version of PA (7.31.0.222) came out. It addresses a bug in the way creation times were being presented after processing a DAR file with previous PA versions. For full disclosure, it is well documented that I am not a big fan of the format, or any other obscure format for that matter.


Then again I am all for progress and advancement if something is better, faster, and/or more modern. Still, with the new there come risks. I remember back in the day when Microsoft re-did the whole network stack for Windows Vista from scratch and some old network vulnerabilities resurfaced. The point of this analogy is that new implementations will always require extra work and vigilance due to their newness. Both in implementation and application.

The new release addresses many things but mainly, in my opinion, the following:

  • PA was not parsing the creation date from the DAR file.
  • There was some issue with access and modified dates.
Below are a couple of screenshots from the release.
Birth - Creation time
Access and modified
Per the release notes, older versions of PA were not presenting the changed category of timestamps as well as giving erroneous dates for the creation category. We infer this from the fact that only the change, modify and access timestamps were supported by DAR as seen above.

This confused me. If DAR itself only supports 3 types of timestamps, how can just reprocessing, and not re-extracting, fix the problem? I will assume that the release meant to say that DAR extractions do contain all timestamps but that older versions of PA were ignoring the change times and putting a wrong timestamp under the creation label.

Here is how that looks when we compare the new PA version with the previous one. The following is a screenshot of some image metadata parsed from a DAR by PA 7.28.0.203.


As explained in the release there are only 3 timestamps visible. These are created, accessed and modified. There is no changed timestamp. Let's now look at the same image from the same DAR extraction as parsed with PA 7.31.0.222.


Notice how there is now a 4th timestamp, changed. Even more interesting is to note the difference in creation times between PA versions. The following seems to be the issue in the old PA output:
  • The changed time is the modified time.
  • The creation time is the accessed time.
In this particular case the difference between the creation and access times is 17 days. It goes without saying how critical the misplacement of this information can be when dealing with geolocation data or any file type that is tied to user generated activity. From when an image was taken to the time it was last accessed, timestamps can be the determining factor between freedom and incarceration. Just on a timestamp! If your case depended on a key creation timestamp you need to go back, as the release notes state, and reprocess that case again.
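If you want to keep the four timestamp categories straight while validating against a file you control, here is a minimal sketch using Python's os.stat. Accessed, modified, and changed times are available everywhere; the birth/creation time is only exposed as st_birthtime on macOS/BSD, and the file name below is hypothetical.

import os
from datetime import datetime, timezone

def show_timestamps(path):
    st = os.stat(path)

    def fmt(epoch):
        return datetime.fromtimestamp(epoch, tz=timezone.utc).isoformat()

    print('Accessed :', fmt(st.st_atime))
    print('Modified :', fmt(st.st_mtime))
    print('Changed  :', fmt(st.st_ctime))  # metadata change time on POSIX systems
    birth = getattr(st, 'st_birthtime', None)
    print('Created  :', fmt(birth) if birth else 'not exposed on this platform')

show_timestamps('example.jpg')  # hypothetical file name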

Conclusions:
  1. I love Cellebrite. The previous is not a dig at them, their technology, and much less their people. They came out with a fix and release notes about it, as one expects from industry leading vendors when the inevitable bug surfaces. They aren't the first, won't be the last, and that is ok. I appreciate and value the cutting edge work they do and how they push the industry forward with newer, better, and faster things.
  2. Trust but verify. This is hard when presented with somewhat of a black box scenario. In this case the only way to test that I could see was to get a test phone, create known data, generate a DAR extraction, jailbreak the device, SSH into the device, and compare timestamps between the device and the tool. Not an easy nor quick task for most users. How this can be approached as a community exercise, since no one person can do it alone, is something we should all think about and share ideas for.
  3. Vendors should not move to a new technology without providing some backward compatibility to more established formats. It could be temporary, acting as a bridge to newer ones. For example, E01s are still supported even when newer formats, like AFF4, are now around. If the new format is needed because there is no other way to do the thing, then the way it was implemented should be shared with the community. For an example see Blackbag (a Cellebrite company) and how they provided the specs for their APFS implementation. Incredible work.
If anything, I hope the previous motivates you to read all release notes that have to do with the tools you depend on for work. As an examiner you need to understand what the tool does, what new things it does, and what things it has failed on and how they were fixed. Own your data. Own your tools. Research, test and validate. And whatever you find, make it known.

As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Tuesday, March 3, 2020

So you have a DAR file...

With UFED support of Checkm8 for iOS extractions Cellebrite uses the DAR (Disk Archiver) format as their archiving file type of choice. It works great and captures the necessary data but it is not easy to work with nor does it have widespread third party support.

Cellebrite? Yes!!! - DAR? ok...
That being said the nice folks at Cellebrite promised additional image support in the not too distant future.

If you speaketh they will listenth
In the meantime you might want to validate the Cellebrite tool output or run a third party tool to generate a particular visualization. What to do?

The solution is pretty easy. First get the DAR binaries for your favorite platform. For this example we will use Windows. The files can be found here:
https://sourceforge.net/projects/dar/files/dar/2.6.8/
Get that (ironic) Windows ZIP
When the files are extracted add dar.exe to your Windows Path so you can access the executable from any command line window. For help on how to do that go here:
https://www.howtogeek.com/118594/how-to-edit-your-system-path-for-easy-command-line-access/
After that is done go to your UFED Checkm8 extraction and identify the file/s that end in .dar.



In my case I decided to move the FullFileSystem.1.dar to the same directory as the dar.exe program to make my extraction command as short as possible. To extract make sure to be in the directory that will hold all the extracted files coming from the dar file, the destination directory. From there run dar.exe with the -x argument and the location of the dar archive. Since I placed it in the same directory as the dar.exe the path is as short as it can be.


Notice how the executable detects the #1 in the filename and assumes there are more parts to the dar file. When that warning shows up just press enter and let it move forward. After a little bit you will see files and directory locations fly by the command prompt as everything is being unarchived. When done you should see the following:

Success!!!
As seen in the image with the command line execution, the data is now in the extracted directory. Now you can point third party tools that can traverse directories (Apollo, iLEAPP, KAPE, etc...) at it and get the needed validations and/or visualizations.

For testing I pointed iLEAPP to the extracted files directory.


Notice the Extraction location and Extraction type entries. The scripts were able to parse all the data with no issues.

As examiners we will be well served to live in the spirit of the survivalist mentality. To always improvise, adapt, and overcome (while documenting of course.) Find a way to get the data, make the correct interpretations, fulfill the mission.


As always remember to validate all findings and be aware I can be reached on twitter @AlexisBrignoni and via email 4n6[at]abrignoni[dot]com.

Saturday, February 22, 2020

ALEAPP - Android Logs Events And Protobuf Parser

From the department of unimaginative names comes ALEAPP, the sister script to iLEAPP. For additional information on iLEAPP go here.

ALEAPP will aggregate all my previous Android parsing scripts as well as be the framework for future script development. Previous users of iLEAPP will recognize the same interface and workflow in ALEAPP. The script can parse logical file systems, tar and zip extractions, as well as provide reports in html and csv formats.

ALEAPP can be downloaded from:

https://github.com/abrignoni/ALEAPP

GUI interface
This first release only parses events and accounts from the Wellbeing Android database. I can't thank Josh Hickman enough for sharing his research on the Wellbeing database and allowing me to use it to make the first ALEAPP artifacts. His research is a must read if you do Android digital forensics and can be found here:

https://thebinaryhick.blog/2020/02/22/walking-the-android-timeline-using-androids-digital-wellbeing-to-timeline-android-activity/

The next artifact to be supported will be UsageStats events both in XML and protobuf formats. For details on this artifact go here and here. The standalone script that parses UsageStats can be found here. ALEAPP will absorb that functionality. Many thanks to Yogesh Khatri for his UsageStats research and coding.

The prerequisites for ALEAPP are:

  • Python 3.7.4 and above
  • pip install six
  • pip install PySimpleGUI
The next screenshots illustrate the Wellbeing artifacts output.

The Wellbeing Account report normalizes a protobuf file for account information. The data is shown in both parsed and unparsed formats.

Account data
The Wellbeing Events report has tons of useful data. Josh Hickman's post has all the details. Great investigative data source.

A csv report example.



As always, I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Saturday, February 15, 2020

Initial thoughts on Android 10 parsing

When Josh Hickman (@josh_hickman1) told me he was working on creating an Android 10 full file system image as part of his testing images series I was stoked. After suggesting some apps to test he diligently worked on it and made the image public for all to use. Go get it here. Before I continue I want to thank Josh for putting this work out and to express how useful it is to everybody. Thank you!

After running the image through two commercial digital forensic tools I noted a few things.
  • When parsing the image with commercial DFIR tools you will see 99% of what you expect to see. This is good and speaks to the maturity of Android as an operating system and the responsiveness of vendors in this space. Still, as expected, a new OS version will break artifact parsers for third party apps and native files. It is our job to figure out where the known but now lost items are, as well as finding new artifacts we weren't aware of. This is how toolmakers can focus effectively on what needs to be done; it is us doing the work and telling them what's important to us. For example, chat messages from Discord and TikTok seem to be missing even though they are there. In the case of TikTok the old database query to extract chats still works. SQL queries can be found here.
  • One example of a native OS file changing format is the UsageStats files. These keep track of application usage. They are similar to the KnowledgeC database entries in iOS. For details see here. Traditionally these UsageStats files were XML formatted. With Android 10 they are now protobuf encoded. All credit goes to Yogesh Khatri since he did all the heavy research work on it. His blog post is required reading. It can be found here. Not only did he identify the change in format, he also updated my old UsageStats XML parser to make it capable of handling the protobuf encoding. The script can be found here. These protobuf encoded files were not decoded by the digital forensic tools. As said before, this is not a bash on digital forensic tool developers. It is a call to action to the community to test, discover, and help focus our development efforts on the artifacts we need and deem to be relevant.
  • It is rare. I haven't seen it happen on a case yet, but never assume you never will. Multiple user accounts on an Android device. Artifacts left behind by a second account seem to be missing or come out jumbled together after parsing. For example, if the examiner looks at app data she might find that in one case a parsed report for a database might show data for both accounts, while in another artifact the data available is from the currently active account only. It is important that we identify the presence of multiple user accounts on the device and take steps to validate our output accordingly. A quick check for multiple user accounts can be done by looking at the contents of the /data/user_de/ directory (see the sketch below). If you see a folder other than folder 0 then you have multiple user accounts on the device.
Multiple user accounts. Usually account #2 is 10 but who knows why it went to 11.
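As a quick illustration of the /data/user_de/ check mentioned above, here is a minimal sketch that lists the numeric user folders in a full file system extraction and flags anything beyond user 0 (the extraction root path is hypothetical):

import os

extraction_root = '/cases/android10_extraction'  # hypothetical extraction root
user_de = os.path.join(extraction_root, 'data', 'user_de')

# Numeric folder names correspond to Android user accounts; user 0 is the owner.
user_ids = sorted(d for d in os.listdir(user_de)
                  if d.isdigit() and os.path.isdir(os.path.join(user_de, d)))
print('User directories found:', user_ids)
if any(u != '0' for u in user_ids):
    print('Multiple user accounts present on this device.')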


As an example of how tool design might affect report output, I will show how my own UsageStats parser script commingles in one report the data from the two Android user accounts on the device.

After extracting the UsageStats directory the script is run.



Notice it processed 11099 records from the files.
Next I separately processed the data from each user directory. To do so I processed the usagestats directory with either directory 0 or directory 11 present.

Directory 0 and directory 11.
Data processed from user directory 0:


The number of records processed went down to 8796.
Now user directory 11:


The number of records processed is 2303.
Even without looking at the contents of each report we can easily determine which account was used the most. This insight would have been lost if the data was shown all together in one report.

As examiners we own the data we are tasked with processing and it is our responsibility to verify that any inferences gathered from it are exact and backed up by the source. We are uniquely positioned to identify gaps in knowledge, to work on filling them, and to share that knowledge with others who can automate the process to the benefit of the greater community of practitioners. If you feel bored while working in this field you are definitely not paying attention. Your perspective is needed, your expertise is essential. Make it known.