Wednesday, June 12, 2019

Android - Samsung My Files App

Short version

Samsung mobile devices keep a list of stored media files in the following location and database:
data/data/com.sec.android.app.myfiles/databases/FileCache.db
These same devices also keep track of recently accessed media in the following location and database:
data/data/com.sec.android.app.myfiles/databases/myfiles.db 
The following queries at https://github.com/abrignoni/DFIR-SQL-Query-Repo/ can be used as templates to extract data from the aforementioned databases.

  • FileCache.db
    • Table: Filecache
    • Fields: storage, path, size, date, latest_date
  • myfiles.db
    • Table: recent_files
    • Fields: name, size, date, _data, ext, _source, description, recent_date
Long version

Samsung devices come preinstalled with the Samsung My Files app. The app can also be used on other branded devices by download and install of the app via the Google Play store.

Samsung My Files app
The app description tells us the main software features.
[Key features]
- Browse and manage files stored on your smartphone, SD card, or USB drive conveniently. Users can create folders; move, copy, share, compress, and decompress files; and view file details.
- Try our user-friendly features. The Recent Files list: files the user has downloaded, run, and/or opened. The Categories list: types of files, including downloaded, document, image, audio, video, and installation files (.APK).
Stored files analysis

The My Files app directory data resides in the data/data/com.sec.android.app.myfiles directory as seen in the next image.

App directory contents
Within this directory the SQLite FileCache.db file can be found. In the FileCache table one can find information on stored media, including path, size in bytes, a date timestamp, and a latest-date timestamp.


A simple query can be produced to extract this data. One can be found here.
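For illustration, here is a minimal Python sketch of such a query (not the exact one in the repository). It assumes the date and latest_date columns hold Unix epoch timestamps in milliseconds, which should be validated against known activity on a test device:

import sqlite3

# Connect to a local copy of the database exported from the device.
db = sqlite3.connect('FileCache.db')

# Column names as listed in the short version above; the millisecond
# epoch assumption for the timestamps must be verified with test data.
query = """
SELECT storage,
       path,
       size,
       datetime(date / 1000, 'unixepoch') AS date_utc,
       datetime(latest_date / 1000, 'unixepoch') AS latest_date_utc
FROM FileCache
ORDER BY date_utc DESC
"""

for row in db.execute(query):
    print(row)

db.close()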

Recent files list analysis

Within the same database directory one can also find the SQLite myfiles.db file. The recent_files table keeps information on recently accessed files, as explained in the app description from the Google Play store. This table tracks file name, size in bytes, date, path, extension, source, description, and recent date.


A simple query can be produced to extract this data. One can be found here.
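As a rough sketch of what that query looks like (again assuming millisecond epoch timestamps, to be validated on test data), the recently accessed files can be pulled into a simple timeline:

import sqlite3

db = sqlite3.connect('myfiles.db')

# Field names as listed in the short version; _data holds the full path.
query = """
SELECT name,
       size,
       _data AS path,
       ext,
       datetime(recent_date / 1000, 'unixepoch') AS recent_date_utc
FROM recent_files
ORDER BY recent_date DESC
"""

for row in db.execute(query):
    print(row)

db.close()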

Why does this matter?

A list of files as recorded by the app can give us clues about what files once existed on the device, even if those files were later deleted. The utility of the recent files list is even more apparent since we can correlate particular real-world events with the last usage of pertinent media on the device. User-generated artifacts should be of interest to the analyst, even more so when they intersect with other parts of the case we are working. Only by knowing that such artifacts exist can we make use of them.

As always I can be reached on Twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Tuesday, June 11, 2019

Android - Predictive text exclusions in Samsung devices

Short version

Samsung keyboard predictive text exclusions are located in the following location and database:
data/data/com.sec.android.inputmethod/databases/RemoveListManager
The following query at https://github.com/abrignoni/DFIR-SQL-Query-Repo
can be used as a template to extract text exclusion entries from the RemoveListManager database.
  • RemoveList
    • Fields: removed_word, time_word_added
SwiftKey predictive text exclusions are located in the following location and text file:
data/data/com.sec.android.inputmethod/app_SwiftKey/user/blacklist
  • Blacklist
    • Text file contents are composed of one excluded word per line. 

Long Version

The following discussion shows how excluded words from the Samsung keyboard's predictive text are stored in Samsung Android devices. 

Predictive text options in Samsung Android phones
Predictive text is an Android feature that learns the user's most used typed words and presents them as options for autocomplete. For example, if I type the city name "San Juan" with regularity on my device, the next time I start typing "San" the predictive text feature will volunteer the full name "San Juan" as an option to complete the word for me. Predictive text saves the user time since I can type the full word in just 3 taps (San + tap on the suggestion or tap the spacebar for autocomplete) instead of the 8 taps needed for the full name to be spelled out.

What happens when the keyboard keeps giving you a suggestion you don't want for a particular set of initial letters? Imagine that instead of typing "San Juan" constantly you now find yourself typing "San Lorenzo" because you moved to a new city. Every time you type "San" you still get the suggestion "San Juan" instead of "San Lorenzo". By long-pressing the suggestion box the keyboard gives you the option to stop suggesting the pressed word going forward.

What happens to the long-pressed word that will no longer be suggested? The same exclusion process applies to the SwiftKey keyboard app. Where do these excluded or blacklisted words reside? Why would finding these items be of importance to the forensic analyst?

Samsung Keyboard Analysis

On Samsung Android devices data related to keyboard configurations reside in the data/data/com.sec.android.inputmethod directory. The following image shows the contents of the aforementioned directory.

Databases folder exists when a blacklisted word exists

Notice in the image above how the databases folder is highlighted. This directory did not exist until I added a word to be excluded on the Samsung keyboard. Within this directory resides the SQLite database named RemoveListManager. Within the database the RemovedList table keeps the excluded word list.


In the previous image the word EMBASSIES was excluded. Notice the added time. During testing the actual time of addition was 2019-06-11 09:13:51. There is a difference of 4 hours. It is my assumption that the date shown is UTC time in human readable format. This underlines how important it is to test your conclusions. In your case work it is key to duplicate the environment by getting a phone similar to the original and doing your own testing.
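A minimal sketch of the template query, run from Python, looks like the following. The table and field names come from the short version above (confirm whether your copy names the table RemoveList or RemovedList), and the conversion assumes the stored value is a UTC datetime string, consistent with the 4 hour offset observed during testing:

import sqlite3

# Local copy of the keyboard database exported from the device.
db = sqlite3.connect('RemoveListManager')

# datetime(..., 'localtime') treats the stored text as UTC and converts
# it to the examination machine's local time; verify against test data.
query = """
SELECT removed_word,
       time_word_added,
       datetime(time_word_added, 'localtime') AS time_added_local
FROM RemoveList
"""

for word, added_utc, added_local in db.execute(query):
    print(word, added_utc, added_local)

db.close()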

Samsung SwiftKey Analysis

Just like the Samsung Keyboard, the SwiftKey keyboard app keeps pertinent data in the data/data/com.sec.android.inputmethod directory. 


As seen in the previous image the user directory is where the excluded word list resides. The following image shows the contents of the blacklist text file.

For this analysis the file's creation and modified dates can be used to show when the list was first created and last modified. Words excluded between the first and last entries on the list lack a timestamp, and there is no way to infer one.
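Since the blacklist is just a text file with one word per line, a few lines of Python are enough to list its contents alongside the file system timestamp that brackets the list:

import datetime
import os

# Local copy of the exported SwiftKey blacklist file.
blacklist_path = 'blacklist'

with open(blacklist_path, 'r', encoding='utf-8') as f:
    excluded_words = [line.strip() for line in f if line.strip()]

# Only the file timestamps are available; individual words between the
# first and last entries carry no timestamp of their own.
modified = datetime.datetime.fromtimestamp(os.stat(blacklist_path).st_mtime)
print('Excluded words:', excluded_words)
print('File last modified:', modified)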

Why does this matter?

Excluded words are voluntary user-generated events. When a user decides to exclude a word it is because the keyboard keeps suggesting it for letter combinations the user constantly types. What would a list of excluded words that mostly contains terms related to child exploitation tell the digital forensic analyst? What can the analyst infer the user was typing? When was the exclusion made? Can the timestamp be correlated to a location by the use of another type of artifact on the device? User-generated events tend to have relevance to our analysis and should be sought out and aggregated. We can make out the forest by getting at all those trees.

As always I can be reached on Twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com. 

Monday, June 3, 2019

Finding Badoo chats in Android using SQL queries and the MAGNET App Simulator

Short version

The Badoo Free Chat and Dating app keeps user generated chats in the following SQLite database:
userdata/data/com.badoo.mobile/databases/ChatComDatabase
The following queries at https://github.com/abrignoni/DFIR-SQL-Query-Repo can be used as templates to extract chats from the Badoo database:

  • Messages
    • Sender name, recipient name, chat message, create time, modified time, server status, payload.
  • User data
    • User ID, username, gender, age, user image url, photo url, max unanswered messages, sending multimedia enabled, user deleted.
By using the MAGNET App Simulator the messages can be easily seen in their native format. The simulator can be downloaded from here:
https://www.magnetforensics.com/resources/magnet-app-simulator/
Long version

The Badoo application is a chat and dating platform for Android and iOS. The app website claims to have over 425,000,000 users and counting.

Large install base
The app seems to be fairly popular in the Google Play store with over 4 million reviews.


The following analysis came about due to a request from a digital forensics examiner who was not able to parse the app data using commercial mobile forensic tools. I procured consent from my colleague to use the data sets in the creation of the queries and the accompanying blog post. That said, I will obscure usernames and chat content in the data sets because they are in French, which I do not speak, and I want to avoid publishing something without knowing what it says.

Analysis via SQL queries

The data is kept in the SQLite ChatComDatabase file located in the userdata/data/com.badoo.mobile/databases/ directory. Within the database there are 2 tables containing data of interest.

Conversation_info
This table contains the user IDs, gender, user names, age and profile photo URLs for all the users that chatted with the local Badoo app user. It is of note that the local app user information is not contained within this table. To identify the local user information I emulated the app with the Magnet App Simulator (more on that later) and was able to see the name and age of the local user.

Username obscured
With that information on hand I processed the app directory with Autopsy and did a text search for the user name which had a hit in the following path and filename:
userdata/data/com.badoo.mobile/files/c2V0dGluZ3M=
Note the base64-formatted filename. Using CyberChef it was easy to convert the base64 filename to ASCII as seen in the next image.
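The same conversion can be done in a line of Python if CyberChef is not at hand:

import base64

# The file name observed in the app directory is a base64 string.
print(base64.b64decode('c2V0dGluZ3M=').decode('utf-8'))  # prints: settings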


By looking at the contents of the settings file with Autopsy the following data can be obtained regarding the local user:

  • Username
  • Birth date
  • Telephone numbers
  • Weight & height
  • Body type
  • Workplace
  • Sexual orientation
  • Political orientation

It is of note that this user generated data surely would vary depending on how much the user adds to their profile. Further testing would be required to confirm.

Regarding the user data of individuals that exchanged messages with the local user the User data query can be used to get the following column values as seen in the next image.



Messages
This table contains the user IDs, timestamps, and chat messages. The chat messages are contained in a field labeled payload that holds them in JSON format. It is really easy to extract them using SQLite's json_extract function. For an example of how to use the json_extract function see the following post on parsing Slack app messages:
https://abrignoni.blogspot.com/2018/09/finding-slack-messages-in-android-and.html
 Since the messages are referenced by their user IDs a join select of the messages and conversation_info tables had to be used to determine the sender and recipient names. To do this the select query had to take into account that the local user information was not found within the conversation_info table. This fact made it difficult to join the tables by user_ids since the most important user (the local user) did not have user name data to join. To overcome that obstacle I used two separate query conditions.

  1. Left join conversation_info on sender_id = user_id
    This condition gave me all sender user names, including null rows that had data but no corresponding user name (i.e., the rows for messages sent by the local user).
  2. Left join conversation_info on recipient_id = user_id
    This condition gave me all recipient user names, including null rows that had data but no corresponding user name (i.e., the rows for messages received by the local user).
With these two queries in hand the idea was to join both selects by each row's unique ID. This guarantees that there isn't a one-to-many selection, which would cause rows to be unnecessarily repeated. Then a simple order by created time puts all the messages in their proper order. I have also added an ifnull condition to the query so that every null username value reads 'local user' instead. The query and the result look as follows:

To see the full query see the previously provided link
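For illustration only, the sketch below shows the general shape of that query using two direct left joins; created_timestamp, user_name, and the 'text' key inside the payload JSON are assumed names that must be checked against the actual schema (only sender_id, recipient_id, user_id, and payload are confirmed above), and json_extract requires a SQLite build with the JSON1 extension:

import sqlite3

db = sqlite3.connect('ChatComDatabase')

# Two left joins against conversation_info resolve sender and recipient
# names; ifnull substitutes 'local user' for the missing local account.
query = """
SELECT m.created_timestamp,
       ifnull(s.user_name, 'local user') AS sender_name,
       ifnull(r.user_name, 'local user') AS recipient_name,
       json_extract(m.payload, '$.text') AS chat_text,
       m.payload
FROM messages AS m
LEFT JOIN conversation_info AS s ON m.sender_id = s.user_id
LEFT JOIN conversation_info AS r ON m.recipient_id = r.user_id
ORDER BY m.created_timestamp
"""

for row in db.execute(query):
    print(row)

db.close()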

It is of note that I have added the payload data field with all the JSON content in it. This was important since some of the JSON content might not be a chat message but data regarding a shared image. When the chat_text field is null in the query results the examiner can simply go to the contents of the payload field to determine additional information like upload ID, expiration timestamp, and the URL of the image itself. In the preceding image notice how the rows with a null chat_text field say "type":"permanent_image" in the payload field.

I plan to have these queries submitted to the MAGNET Artifact Exchange Portal soon.

MAGNET App Simulator
Main screen
As stated previously I used the simulator to identify local user data by visualizing the app data through the app itself. The process is simple and straightforward.

The first thing to do is extract the app APK from the device.

Load the APK

Then load the app directory.
Load app directory

The simulator brings up an Android instance within VirtualBox, installs the APK, and injects the app data into this new virtualized app instance.

Installing, importing, & injecting

The results are incredible.
Chats viewed in the app itself, as intended

Conclusion
This analysis was interesting to me for a couple of reasons. The first underlines the importance of always doing a manual visual check of which apps are present in our extractions and how many of those are parsed by our tools. The difference requires manual attention since the most important piece of data might reside where it is not readily found. The second is that simulation or virtualization of apps does not substitute for manual database analysis; both techniques can and should be used together to guide a deeper analysis of the application data. Without the combination of both techniques the rich repository of local user data might have gone unnoticed, since it was not accessible in the databases nor in the virtualized screens.

To end I would like to thank not only those who contribute to the DFIR field with tools, scripts and data sets but also those who reach out to ask questions because they want to learn and grow. Truly there is no better way to learn than by trying to fill the gaps of things yet to be known.

As always I can be reached on Twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Friday, May 31, 2019

Android SystemPanel2 - App usage tracking

Short version

The 3rd party app SystemPanel2 keeps timestamped system wide app usage statistics in the following database:
userdata/data/nextapp.sp/databases/SystemRecord.db
The following queries at https://github.com/abrignoni/DFIR-SQL-Query-Repo can be used as templates to extract information from the SystemPanel2 database:

  • History Master
    • Timestamps start & end, battery data, CPU usage, screen usage, cell and WiFi signal strength, CPU clock, and load information.
  • History Process
    • Timestamps start & end, PID, process name, type, CPU time, total CPU time, and foreground time.
  • History Foreground
    • Timestamps start & end, interval, process name, type, battery usage, and CPU usage.
  • History Power
    • Timestamps start & end, name, power use, flags, wake count, and wake time.

Long version

Since Sarah Edwards came out with her earth-shattering iOS analysis of the KnowledgeC database and then her industry-leading APOLLO framework, I have been obsessed with finding ways of getting similar data sets from Android devices.

To this end one of the most enjoyable research projects I embarked on this year involved working on some scripts that parse the UsageStats and Recent Tasks XML data stores in Android. These scripts took as their foundation the incredible research done by Jessica Hyde, and they were made far more usable by the magical Python GUI work of Chris Weber. This work for Android matters because these XML files reside on most, if not all, Android devices. Still, I knew there had to be better captures of app usage in third-party apps, based on some work I had done last year on the Android CCleaner app.

This blog post will be the first of a series looking into pattern-of-life, app usage, and system data that can be extracted from Android third-party apps. Some of these apps are installed by the user via the Play Store while others come by default on branded devices. Some will require root access to extract the data while others will spill their usage secrets via plain and simple ADB. Here is the first of these...

Testing Platform

For my testing platform see here.
For this particular post a rooted Samsung S7 Edge was used.

Analysis


The best explanation of what data this app records can be found in its Google Play store description. It reads as follows:
SystemPanel is a tool to let you view and manage just about everything possible about the goings-on of your device and visualize it in an easy-to-understand graphical format. 

Features include:

* Show active apps, record app battery, CPU, and wake lock usage over time to show potential battery drain issues
* Draw plots showing how you used your phone over time and how much battery disappeared as a result
* Analyze recent battery consumption and device wakeups (wakelocks), showing potential problem apps
* Manage installed apps, backup app APKs, uninstall apps, and re-install archived versions
* View apps categorized by the permissions they require
* Disable system packages [ROOT required]
* Disable individual services of apps (e.g. OTA updates) [ROOT required]
* Browse all the technical nitty-gritty about your phone

This app uses Accessibility services. SystemPanel's "Usage" feature can optionally use an "accessibility service" to show you how much time you're spending in each app on your phone, and when you use them throughout the day. This is useful for those with addiction disorders (and/or their parents or legal guardians) to avoid addictive use of the device/specific applications. Use of this service is optional, and like the rest of SystemPanel, no collected data is sent from the device, it is only displayed to the user.
It goes without saying that the amount of data this app tracks is incredible. The data is kept in the SQLite SystemRecord.db file located in the userdata/data/nextapp.sp/databases/ directory. Within the database there were 4 tables housing the data of interest.

History_Master Table

This table contains information on timestamps, battery data, CPU usage, screen usage, cell and WiFi signal strength, CPU clock, and load information.


A simple scroll down the results shows how, when a device is in use, the screen usage, load, and CPU stats tend to be high while battery charge goes down (assuming it is not plugged in). In my test data it was easy to see that most heavy usage took place during daytime hours. It would be trivial to export the data into Excel to create usage graphs for all the parameters being tracked; this is true of the data in all the tables. Each row of data in the table covers a 7 second interval between the start and end timestamps.
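Since every column in the table is worth graphing, a quick way to get the data into Excel is to dump the whole table to CSV. The sketch below takes the column names from the cursor itself, so no schema assumptions are needed beyond the table name:

import csv
import sqlite3

db = sqlite3.connect('SystemRecord.db')

# Dump History_Master as-is; column headers are read from the cursor.
cursor = db.execute('SELECT * FROM History_Master')
with open('history_master.csv', 'w', newline='') as out:
    writer = csv.writer(out)
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)

db.close()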

History_Process

In this table we get our first set of data related to both app bundle IDs and Android OS processes.


As seen in the previous screenshot, bundle IDs have a type classification of 2 whereas non-reverse-URL designations have a type classification of 1. Each row of data in the table covers a 3 second interval between the start and end timestamps.

History_Foreground

The table name is self-explanatory.


It is of note that the time interval between the start and end timestamps is variable and is contained in the interval field. The interval values are in seconds.
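Because the interval field already holds the elapsed seconds, total foreground time per process can be aggregated with a short query. The column names name and interval below are assumptions based on the screenshots and must be verified against the actual History_Foreground schema:

import sqlite3

db = sqlite3.connect('SystemRecord.db')

# Hypothetical column names; confirm them before relying on the output.
query = """
SELECT name,
       sum(interval) AS total_foreground_secs
FROM History_Foreground
GROUP BY name
ORDER BY total_foreground_secs DESC
"""

for process_name, seconds in db.execute(query):
    print(process_name, seconds)

db.close()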

History_Power

Power usage aggregation by app / process.


Each row of data in the table has a 7 second interval between the start and end timestamps.

So what?

What use would such data have? For starters it can tell us not only when a device was in use but also which apps were in use and which ones weren't, in 3 and 7 second time frames. Was the screen on (higher energy consumption) or not? Was the phone being charged at a particular time? Was the device on or off? How long was an app used for? The best part is knowing when all of these events were happening. The forensic applicability of such data cannot be overstated. To that end I will be submitting these queries as custom artifacts to the Magnet Forensics Artifact Exchange.

Conclusion

The preceding analysis of a third-party app demonstrates again the need to always manually review the app directories for items our forensic tools might not be aware of yet. A few extra minutes of work can yield incalculable benefits.

As always I can be reached on Twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Friday, March 8, 2019

UsRT - Graphical interface for Android Usagestats and Recent Tasks XML parsers.


Introducing UsRT

Thanks to the hard work of Chris Weber (@RD4N6) we now have a way to parse the essential data contained in the Android Usage Stats and Recent Tasks XML files through a graphical interface. Like Eric Zimmerman says it is agent proof. Chris took my scripts, based on the research done by Jessica Hyde (@B1N2H3X), and made them accessible to all. Point and click goodness.

The application can be run as an executable (UsRT.exe) via the provided installer or through the python scripts directly. The installer has all dependencies included and is the easiest and fastest way to use the parser.

For details on the original research that motivated these scripts and the interface see Jessica Hyde's research at the SANS DFIR Summit 2018. For details on the parsing scripts see my previous blog posts for Usage Stats and Recent Tasks.

Script and installer links at the end of the blog post.

Features for Usage Stats:
  • Case information fields

  • Visual listing of files as they are processed in the left bottom corner of the interface



  • Rows and columns format with the ability to hide columns and to select all rows, check rows, or uncheck rows.



  • HTML reporting



  • Ability to open already processed cases through the application-generated case JSON file.
  • Included Read Me file that has a quick overview on usage with related screenshots. The Read Me can be accessed via the Help menu options.
Features for Recent Tasks:
  • Same features as Usage Stats with the addition of the recent images and snapshot fields. Pressing on the images will show them in your system's default image viewer. HTML reporting includes images as well.


Repository and installer

To get the scripts go to the following repository:
https://github.com/abrignoni/UsRT
The installer is in the same repository in the release tab.
https://github.com/abrignoni/UsRT/releases 
Conclusion

As said at the beginning of the post I am indebted to Jessica Hyde for doing the original research and to Chris Weber for putting in all the work and effort to maximize the usefulness of the parsing scripts by making an awesome graphical interface for them.

As always I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.



Saturday, March 2, 2019

iOS Bplist Inception

Update 03/21/2019:
Script now decodes NSdata contents. See details at 
https://thinkdfir.com/2019/03/21/updating-the-knowledgec-parser

Short version:


A Python 3 script that exports compound bplists from a specific field in an iOS knowledgeC database, extracts the internal bplist, and creates a triage HTML report of its contents. Two versions are provided, for iOS 11 and iOS 12, due to a slight difference in how the internal bplist is referenced within the external bplist that holds it.

The scripts can be found in the following location:
https://github.com/abrignoni/iOS-KnowledgeC-StructuredMetadata-Bplists
It's recommended that you load these plists into your viewer of choice to examine them directly.

Long version:

Like most DFIR things lately this one also started with Phill Moore. He reached out to the community on the following:


Since I've been on a data parsing binge lately I was happy to try and assist. As I was reading the replies to Phill's tweet I was reminded of how, of all the data structures utilized by Apple products, bplists are one of the most prevalent. So prevalent that they can be contained within SQLite databases and can themselves contain other bplists within them. Total data storage inception. At this point there was no doubt...


Thanks to kind souls like @i_am_the_gia, @ScottVance, and others who will remain anonymous, we got test data to see if we could do the following:

  1. Export the bplists intact from the SQLite DB.
  2. Extract a bplist (clean) from the bplist that holds it (dirty).
  3. Access the clean bplist and create a file that could be used in forensic tools for analysis.
  4. Generate a triage report of clean bplist data contents to easily evaluate relevance before importing to forensic tools.
There are many tools that let us view the contents of bplists, but when they are nested this way, getting to the internal content requires some manual work. And as any examiner the world over knows, manual work is just the universe telling you there is a need to automate and scale.

The database selected for our testing was the iOS knowledgeC database. I highly recommend everyone read Sarah Edwards' article on it, THE article on it. By looking at the Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION field within the ZSTRUCTUREDMETADATA table one can see how these bplists look when nested.


Notice how there are two bplist headers in the same SQLite database content. 

Export

Exporting the data was straightforward: a regular SELECT, assigning the content of the field to a variable, and writing that variable out to a file. For this to work the receiving file has to be opened for binary content. As seen in the next image the extracted bplists are named with the following convention:
  • D/C = Dirty or clean. There is nothing wrong or dirty about the shell bplist; the term is simply shorthand, in contrast to the internal bplist, which I call clean after extraction because it lacks its bplist shell.
  • Z_PK = The field name in the table that contained the primary key for the row that contained the exported bplist.
  • Numeric value = Integer contained in the Z_PK field for the row that contained the exported bplist.

By establishing this filename convention the examiner can easily backtrack to the proper row from the target table if additional fields are of interest or if there is a question on the validity of the exported bplist.
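A stripped-down sketch of that export step looks like this (the published scripts differ in detail, such as the exact output file names):

import sqlite3

# Local copy of the knowledgeC database.
db = sqlite3.connect('knowledgeC.db')

query = """
SELECT Z_PK, Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION
FROM ZSTRUCTUREDMETADATA
WHERE Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION IS NOT NULL
"""

# Write each blob out in binary mode, keyed to its Z_PK value so the
# source row can always be backtracked.
for z_pk, blob in db.execute(query):
    with open('D_Z_PK_{}.bplist'.format(z_pk), 'wb') as out:
        out.write(blob)

db.close()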

Extraction

Now that we had exported the bplists we had to get to the clean ones in an automated way. Thanks to @firmsky I was reminded of an article by Sarah Edwards on using ccl_bplist to parse NSKeyedArchiver bplists in Python. These bplist objects are beyond the scope of this blog, but just know that I am grateful Alex Caithness came up with this module; it saved me from a painful headache. You can find this great module here:
https://github.com/cclgroupltd/ccl-bplist
With this module in hand and some test data we figured out that:
  1. In iOS 11 one has only to deserialize the bplist at the root which gives you the clean bplist.
  2. In iOS 12 one has to deserialize the bplist at the NS.data level since the clean bplist is contained within it.
The previous was a long way of saying that in iOS 11 the following key ccl_bplist function call
CleanBplistFile = ccl_bplist.deserialise_NsKeyedArchiver(DirtyBplistFile)
would give you the clean bplist ready to write out, whereas the following code
ns_keyed_archiver_objg = ccl_bplist.deserialise_NsKeyedArchiver(DirtyBplistFile)
CleanBplistFile = (ns_keyed_archiver_objg["NS.data"])
would give you the clean bplist after accessing the NS.data portion. It would be good to have further confirmation that these types of incepted bplists truly vary per iOS version and that this is not just a coincidence of the data sets we had available.
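Put together, a minimal version-aware helper might look like the sketch below. It assumes, as in our data sets, that the deserialized object (or its NS.data member on iOS 12) is the raw inner bplist ready to be written out; the published scripts handle each iOS version in a separate file:

import ccl_bplist  # https://github.com/cclgroupltd/ccl-bplist

def extract_clean(dirty_path, clean_path, ios_version):
    # Load the exported "dirty" bplist and deserialize its
    # NSKeyedArchiver structure.
    with open(dirty_path, 'rb') as f:
        dirty = ccl_bplist.load(f)
    deserialized = ccl_bplist.deserialise_NsKeyedArchiver(dirty)
    # iOS 12 nests the clean bplist under NS.data; iOS 11 exposes it
    # at the root.
    clean = deserialized['NS.data'] if ios_version >= 12 else deserialized
    with open(clean_path, 'wb') as out:
        out.write(clean)

# Hypothetical usage:
# extract_clean('D_Z_PK_42.bplist', 'C_Z_PK_42.bplist', ios_version=12)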

Originally the purpose of this exercise was to find a way to easily extract the clean bplists in order to import them into forensic tools with minimum effort and no manual extraction. It became clear that a triage report was needed when one of my data sets contained 1565 extracted bplists. Be aware that the script developed will keep both the dirty and clean bplists in separate folders within a timestamped directory. In this way one can backtrack the whole process for validation purposes.



Reporting

With a triage report that shows the content one can decide which set of bplists should be drilled down more or just retained due to work or case relevance. The fields on the html formatted report are the following:
  • Filename = Same format as stated before.
  • Intent Class = A value taken from a field in the table where the dirty bplists were stored in the knowledgeC database. This value is key because it gives you a clue about the purpose of the contents of the bplist.
  • Intent Verb = Another value taken from one of the table fields. Further description of the bplist's purpose and/or type of content.
  • NSstartDate = Time stamp.
  • NSsendDate = Time stamp.
  • NSduration = Float value.
  • NSdata = Binary data store of activity.
Since this is a triage report the NSdata values are just a string representation of the binary values in the field. Although it contains many non-human-readable characters, it is pretty easy to key in on the ASCII values one can easily read. The report is a testament to my ignorance of how to convert these values into something more pleasing to the eyes, but for triage purposes, helping the examiner decide what to process further with a forensic tool, it is perfect. Some of the values can be cleaned up a little with UTF-8 decoding, but many, especially those that contain a lot of data, cannot.

The next picture is an example of the report format. The particular data in the report was shared on the condition that it would not be shared further, hence the redaction.


It is up to the reader to test it out and discover for herself what awesome data resides in these structures. Things that are, things that were in one form and changed to another, and things that are no more.

Future work

I was surprised by the amount of data contained in just one field from one table in one database. I can only imagine what relevant data resides in incepted SQLite-held bplists in other tables and other databases. The next step is to evolve the script so it can extract any bplist blob from any SQLite table and generate dirty and clean instances as needed, with complementing reports for triage. A key part is to better understand how the NSdata fields work and to see if anyone in the community knows how to parse them. If only the days had more hours and our bodies less need for sleep.

As always I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Tuesday, February 19, 2019

Android Recent Tasks XML Parser

This post is a continuation of my last blog post where I introduced a simple parser for the Android usagestats XML files.
https://abrignoni.blogspot.com/2019/02/android-usagestats-xml-parser.html
In this entry I am introducing a parser for the Android recent tasks XML files. Like the previous parser it is based on the research done by Jessica Hyde that she presented at the SANS DFIR Summit 2018. You can see her excellent presentation here:
YouTube: Every Step You Take: Application and Network Usage in Android 
The presentation slides, in PDF format, can be found here:
PDF Slides: Every Step You Take: Application and Network Usage in Android 
As explained in the presentation the Recent Tasks XML files record the following activities for recently used apps:

  • Task ID number = Used to correlate snapshot and recent image files.
  • Effective UID = App identifier.
  • First active time = Timestamp in millisecond epoch time.
  • Last active time = Timestamp in millisecond epoch time.
  • Last time moved = Timestamp in millisecond epoch time. 
  • Affinity = Bundle ID name.
  • Calling package = Bundle ID or process that called the referenced recent task.
  • Real activity = Gives information on app usage at time of recording and snapshot creation.
These XML files are located in the following directory:
\system_ce\0\recent_tasks
In addition to these XML files, recent tasks can produce snapshot images as well as recent images. Details about these are contained in the previously referenced presentation. These images can be found in the corresponding directories:
\system_ce\0\shortcut_service\snapshots
\system_ce\0\recent_images
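For a quick manual look before running the full parser, the sketch below dumps every attribute of each recent_tasks XML file as JSON (assuming the files are plain-text XML, as they were in my data sets):

import json
import os
import xml.etree.ElementTree as ET

task_dir = os.path.join('system_ce', '0', 'recent_tasks')

for filename in sorted(os.listdir(task_dir)):
    full_path = os.path.join(task_dir, filename)
    try:
        root = ET.parse(full_path).getroot()
    except ET.ParseError:
        print('Skipping non-XML or damaged file:', filename)
        continue
    # Print every attribute of the task element as JSON.
    print(filename, json.dumps(root.attrib, indent=2))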

In order to leverage the data contained in these XML files and images I made a parser in Python 3 that takes the XML information and puts it in a SQLite database for ease of querying. The script can be found here: 
https://github.com/abrignoni/Android-Recent-Tasks-XML-Parser
Warning:

The script has been tested and found to be accurate on my own data sets. Not all recent tasks will contain all data events or related images. Additional testing and validation of the script is humbly requested and more than welcomed.

Script usage

1. Extract from your Android source device the three directories mentioned previously. The extraction should be logical and should not contain forensic-tool-generated recovered items such as deleted files and/or file slack.

2. Place the script and the noimage.jpg file from the repository in the same root directory as the extracted directories.

Have this before running script.
3. Run the script with no arguments.

Script is done.
4. When completed the script will generate two files, a SQLite database named RecentAct.db and a report file named Recent_Activity.html.

What you should see after a successful run of the script.

Note that the RecentAct.db SQLite file will contain two fields populated with all the XML attributes in JSON format. The analyst can use JSON_extract to build custom queries with any of the attributes within the XML.

5. Open the Recent_Activity.html report.

Sample report entry.
For every recent task there will be a table with pertinent information as well as the snapshot and recent image files that correspond to it. To view the images full size just click on them. Be aware of the importance of the creation times of these image files within the source media. For details see the presentation previously mentioned.

It is of note that, in some of my test data samples, not all recent tasks had corresponding images or full sets of attributes. When a recent task lacks corresponding images the script will reference the noimage.jpg file.

Missing image and missing attributes.
For missing attributes the report will state 'NO DATA' and/or 'NO IMAGE' in the Key and Values columns as needed. Be aware that the SQLite database has all attributes in JSON format for custom query generation.

Conclusion

I want to thank again Jessica Hyde for her research and for making the community aware of these artifacts. Hopefully this script can make it easier to give much needed context to these images and apps whose value might not be found anywhere else on the source device.

As always I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.

Sunday, February 17, 2019

Android Usagestats XML Parser

As I've been testing and using Sarah Edwards' excellent APOLLO pattern of life framework for iOS I was reminded of the great work done by Jessica Hyde on a similar set of files for Android called usagestats. These files provide insight into what apps were being used, whether they were in the foreground or background, and how long the apps had been active, among many other forensically interesting data points.

In her presentation for the SANS DFIR Summit 2018, Jessica Hyde explains how the usagestats XML files record the following activity from Android devices:
  • User interaction
  • Move to foreground
  • Move to background
  • Configuration changes
I highly recommend the reader check out her Every Step You Take DFIR Summit video and presentation slides in PDF format. The rest of the blog post will make more sense after viewing and/or reading her work.

In order to leverage the data contained in these XML files I made a parser in Python 3 that takes the XML information and puts it in a SQLite database for ease of querying. The script can be found here: 
https://github.com/abrignoni/Android-Usagestats-XML-Parser
Warning:
The script has been tested and found to be accurate on my own data sets. Additional testing and validation of the script is humbly requested and more than welcomed.

Script usage

Extract from your source Android device the following directory:
\data\system\usagestats\
Export the usagestats directory
Place the script in the same root directory as the just-extracted usagestats directory.

Side by side as such.
Run the script with no arguments from the root directory that contains both the script and the usagestats directory. The script will parse all the internal directories and files for you.

No arguments are needed.
The script will also alert you to files whose content is not parseable XML. After the script ends a SQLite database named usagestats.db is generated.

Data is served.
Use your favorite SQLite application to view the parsed data. Notice the timestamps are in epoch time.

Notice the fields.
The generated database contains a table named data with the following fields:

Usage_type: 

Each XML file contains a description of what type of data it is recording. The values can be event-log, configuration, and packages.

Lastime:

Records when an app (package) was last active or when a configuration change took place. The XML files themselves keep track of these times in two ways. Most usage events maintain a count of how many milliseconds passed between the creation of the XML file and the occurrence of the event. To calculate the timestamp of the event the script takes the XML filename, which is itself an epoch time in milliseconds, and adds to it the milliseconds it took for the event to occur. This provides the event time as an epoch timestamp. Events that are not millisecond offsets from the epoch-time filename of the XML file keep the time as an epoch timestamp preceded by a minus sign; the script eliminates the minus sign to keep the epoch timestamp. My testing has shown this way of calculating times to be accurate to activity I have taken on the device.
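In code form, the calculation described above boils down to something like this (a simplified sketch of the logic, not the script itself):

import datetime

def event_epoch_ms(xml_filename_ms, lastime_value):
    # Resolve a usagestats time value to epoch milliseconds following
    # the two cases described above.
    value = int(lastime_value)
    if value < 0:
        # Negative values are already absolute epoch timestamps stored
        # with a leading minus sign; strip the sign.
        return -value
    # Positive values are millisecond offsets from the XML filename,
    # which is itself an epoch timestamp in milliseconds.
    return int(xml_filename_ms) + value

# Hypothetical example values:
ms = event_epoch_ms('1550000000000', '73000')
print(datetime.datetime.fromtimestamp(ms / 1000, tz=datetime.timezone.utc))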

The following image was included in an app review I did in November for the TikTok Android application. Notice the time some of the chat activity took place.

Notice the created time
As seen in the image above the TikTok app is sending and receiving chat messages. The parsed XML SQLite database shows the same activity at the same time being totally consistent with TikTok chatting.

Notice the classs values
Shortly I will provide a SQL query that will format the dates and type values in the same human readable format seen above.

Time_active:
Certain events keep track of their length in milliseconds.

Package:
Application name.

Types:
Activity types as integer values. These represent activity like move to background or move to foreground. The list of interactions can be found here:
https://developer.android.com/reference/android/app/usage/UsageEvents.Event
Classs:
Application name and corresponding modules in use.

Source:
Usagestat originating XML category. They are daily, weekly, monthly, and yearly.

Fullatt:
Contains the full attributes for the XML event, in other words all the data for the event in JSON format. With the data in this field the analyst can easily select any key:value pair and make it its own column in a SQL query by using JSON_Extract. For an example of how this SQL query function works see here:
https://abrignoni.blogspot.com/2018/09/finding-slack-messages-in-android-and.html
SQL query

The following query can be run against the script-generated database to format the timestamps from UTC to local time, add a field for time_active in seconds, and change the types integer values to readable activity descriptions. Be aware that I have not added all the case types from the link provided previously in the Types section. Add them as needed.

SELECT
  usage_type,
  datetime(lastime/1000, 'UNIXEPOCH', 'localtime') as lasttimeactive,
  timeactive as time_Active_in_msecs,
  timeactive/1000 as timeactive_in_secs,
  package,
CASE types
     WHEN '1' THEN 'MOVE_TO_FOREGROUND'
     WHEN '2' THEN 'MOVE_TO_BACKGROUND'
     WHEN '5' THEN 'CONFIGURATION_CHANGE'
     WHEN '7' THEN 'USER_INTERACTION'
     WHEN '8' THEN 'SHORTCUT_INVOCATION'
     ELSE types
END types,
  classs,
  source,
  fullatt
FROM data
ORDER BY lasttimeactive DESC

Conclusion

My hope with this script is to make the data contained in the usagestats XML files accessible for digital forensic case work. Additional testing of the script and suggestions on how to optimize it are welcomed. I hope to create additional scripts that will parse the Android battery status and recent tasks XML files as shown by Jessica Hyde in her presentation.

As always I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.



Wednesday, January 16, 2019

QuickPic for Android - Don't forget external/emulated storage!

QuickPic Gallery for Android

QuickPic is an image gallery app for Android devices that used to be fairly popular before it was taken down from the Google Play Store. Since the app is still available via third-party APK repositories we might still come across it in our case work.

Logo

The main reason for this blog post on the QuickPic app is that it illustrates the value of checking related external, emulated, and adoptable storage in Android devices. In this particular app all the databases that track user generated activity were kept outside of the main app directory.

QuickPic app keeps relevant databases in the following directory:

/storage/emulated/0/Android/data/com.alensw.PicFolder/cache

SQLite Queries

Queries that can be used to extract pertinent data can be found at:

https://github.com/abrignoni/DFIR-SQL-Query-Repo/tree/master/Android/QUICKPIC

The following queries are provided:
Quick analysis of SQLite databases

Thumbnails
The app provides thumbnails for all the images it scans as seen in the next image.
Thumbnails
These thumbnails are kept in a database titled thumb_numeric-value.db, where the portion in red represents a series of numbers as seen below.

Notice the multiple thumbnail databases
In the image we can see that the databases of interest are named thumb_123904.db and thumb_220900.db. It appears that when one database reaches a particular point (size? directory paths? number of records? year?) it continues registering thumbnail data in a second database.

The thumbnail databases contain a table named thumbs with the following columns: path, thumb, and modified.

Thumbnail database content
The columns provide the path where the original image is located on the device, a modified timestamp, and a thumbnail of the original image. To view an image, export the blob within the thumb column and rename it with the proper extension as seen in the path column. The usefulness of having a copy of the data in thumbnail form is apparent, even more so when the examiner has access to multiple data backups of the same device, whether made by Android itself or by third-party backup apps.
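A short Python sketch can automate that export, reusing the extension recorded in the path column (the database name will vary, per the directory listing above):

import os
import sqlite3

db = sqlite3.connect('thumb_123904.db')
os.makedirs('exported_thumbs', exist_ok=True)

# Export every thumbnail blob, naming each file after its row number
# and the extension of the original image it represents.
rows = db.execute('SELECT path, thumb FROM thumbs')
for i, (path, thumb) in enumerate(rows):
    if not thumb:
        continue
    ext = os.path.splitext(path)[1] or '.jpg'
    out_name = os.path.join('exported_thumbs', 'thumb_{}{}'.format(i, ext))
    with open(out_name, 'wb') as out:
        out.write(thumb)

db.close()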

Previews
When the user presses one of the thumbnail images in the application in order to view a larger version of the image, the app keeps track of that activity in the preview database. The database can be seen in the cache directory image above. The preview database contains a table named cache with the following columns: document_id, _data, _size, accessed_time, and last_modified.

Preview database
As seen in the previous image, the document_id column keeps the location path for the original image. The _data column keeps the location path for a copy of the original image that was selected by the user. These copies are kept, as seen in the path, in the .preview directory. The preview images themselves have alphanumeric names with no extensions. Just like the images in the thumbnail databases, one can add the proper extension to the preview files to view them. The rest of the fields are for preview image size, accessed time, and last modified time.

It is of note that the files in the .preview directory tell us that the user had to select/press the thumbnail of a media item of interest in order to view a larger version of it. This requires user interaction.

Conclusions

This quick analysis made me think of how many times we can be in a hurry and overlook related app directories that are contained in SD cards or emulated storage space. Databases that tell us about user intent, what was selected for viewing, and what was actually seen can be missed if we don't make it a habit to always look for unfamiliar app ids in app directories AND emulated/external storage. A one minute check can provide us with amazing returns we did not expect. 

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.