Saturday, November 10, 2018

Finding TikTok messages in iOS

Short version

The iOS TikTok app keeps message related data in the TIMMessageORM table of the following SQLite database:
/private/var/mobile/Containers/Data/Application/UU-ID/Library/Application Support/ChatFiles/User-ID/db.sqlite
Be aware that in the path UU-ID should be replaced by the application identifier and User-ID by the TikTok user identifier of interest. The process to obtain the correct UU-ID number is presented in brief in the long version section. For a detailed example see here. To obtain the correct User-ID directory number for a user of interest see the contents of the awemecontacts table in the following SQLite database:
/private/var/mobile/Containers/Data/Application/UU-ID/Documents/AwemeIM.db
Queries that can be used as templates to extract messages from the database can be found at:
https://github.com/abrignoni/DFIR-SQL-Query-Repo/tree/master/iOS/TIKTOK
In order to view the public TikTok profiles of the users found in the awemecontacts table in the AwemeIM.db database add the user name ID to the end of the following URL:
https://m.tiktok.com/h5/share/usr/(insert username ID number from DB).html
For one of the test accounts used in this blog post the URL looks like this:
https://m.tiktok.com/h5/share/usr/6619782258185388037.html 
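When a database holds many user ids, building these URLs by hand gets tedious. A minimal sketch of the same idea in Python follows. The function name profile_url is my own for illustration; only the URL pattern comes from this post.

```python
# URL pattern taken from this post; the helper name is hypothetical.
PROFILE_URL_TEMPLATE = "https://m.tiktok.com/h5/share/usr/{uid}.html"


def profile_url(uid):
    """Return the public profile URL for a numeric TikTok user id."""
    uid_text = str(uid).strip()
    # The ids seen in the databases are numeric, so reject anything else.
    if not uid_text.isdigit():
        raise ValueError(f"expected a numeric user id, got {uid!r}")
    return PROFILE_URL_TEMPLATE.format(uid=uid_text)


print(profile_url(6619782258185388037))
```

Passing the id of the test account above reproduces the example URL.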
Videos created with the app can be found in the following directory with the .mp4 extension:
/private/var/mobile/Containers/Data/Application/UU-ID/temp/
 Long version

TikTok is one of the most popular apps in the iOS App Store.

For an example on how to obtain a file system extraction from a rooted iOS device see here.

Testing Platform

For analysis I am using the following device and equipment:
  • iPhone SE - A1662
  • iOS 11.2.1
  • Jailbroken - Electra
  • Forensic workstation with Windows 10 and SSH software.
Acquisition

A brief overview of the identification and extraction of the TikTok app user data directory is as follows. For a detailed example see here.

1. Locate bundle id name.


2. Access the 'applicationState.db' file located at:
/private/var/mobile/Library/FrontBoard/ 

This SQLite database provides the connection between the bundle id and the UUID numbers in the 'Application' directory.

Open the SQLite database with a SQLite browser. Look for the bundle id name in the 'application_identifier_tab' table. Take note of the corresponding id number.


3. Look for that id number in the 'application_identifier' field of the 'kvs' table. Export the blob in the value field for that id. The exported data is a bplist that maps all pertinent UUID numbers to the application name and/or bundle id. The data can also be seen in the preview pane in binary mode without the need to export the blob content. If the bplist is exported, a viewer like Sanderson Forensics BPlister can be used to see the relationship between the UUID and the application we are looking for.


4. With the correct application directory identified I copied it via SSH to the forensic workstation.
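Steps 2 and 3 above can also be scripted. What follows is a rough sketch, not a validated tool. It assumes the 'application_identifier_tab' table has 'id' and 'application_identifier' columns and that 'kvs' has 'application_identifier' and 'value' columns, as described in the steps. The function name is hypothetical, and the bundle id must be supplied by the examiner. Python's standard plistlib module reads binary plists.

```python
import plistlib
import sqlite3
from pathlib import Path


def load_app_state_plists(db_path, bundle_id):
    """Return the decoded plist blobs stored in 'kvs' for one bundle id."""
    # Open read-only so the working copy of the evidence file is not modified.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        # Step 2: bundle id -> numeric id (column names are assumptions).
        row = con.execute(
            "SELECT id FROM application_identifier_tab "
            "WHERE application_identifier = ?",
            (bundle_id,),
        ).fetchone()
        if row is None:
            return []
        # Step 3: numeric id -> blobs in the value field.
        blobs = con.execute(
            "SELECT value FROM kvs WHERE application_identifier = ?",
            (row[0],),
        ).fetchall()
    finally:
        con.close()
    # plistlib.loads detects the binary plist format on its own.
    return [plistlib.loads(blob) for (blob,) in blobs if blob]


# Example call (path and bundle id are placeholders):
# plists = load_app_state_plists("applicationState.db", "<bundle id>")
```

The decoded structures can then be searched for the UUID values, the same way one would read them in a bplist viewer.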


In this particular instance I did not make a full file system extraction of the device. I only copied the app directory of interest for testing purposes. Do follow generally accepted forensic principles when doing similar work on your own cases.

Chats and Media

As stated in the Short version portion of the blog post the message data can be accessed by joining the contents of two different databases and two tables in them.

1. The TIMMessageORM table in db.sqlite
2. The awemecontacts table in AwemeIM.db

The paths for these files are respectively:

1. Support/ChatFiles/User-ID/db.sqlite
2. /private/var/mobile/Containers/Data/Application/UU-ID/Documents/AwemeIM.db

The messages query found in the DFIR SQL Query Repo for TikTok produces the following results:


The column data is as follows:
1. sender = The numeric user id. The value is used to join the two tables in order to access usernames.

2. profilepicURL = Is the link for the user profile pic.

3. customID = Account username.

4. nickname = Precisely what it says.

5. Local_create_Time = Local/device time for a particular message.

6. servercreatedat = Server/remote time for a particular message. A value of zero indicates the message did not leave the device.

7. message = The content of the message.

8. localresponse = Additional information for a particular message. For example for messages that did not leave the device this field will provide some diagnostic information.

9. links_display_name = If the user responds with an image or a gif this field will have the display name of the file.

10. links_gif_url = The link for the sent image or gif. Contents can be accessed without authentication.

The user data query found in the DFIR SQL Query Repo for TikTok produces the following results:


The column data is as follows:
1. uid = Numeric user id. 

2. customID = Account username.

3. nickname = Precisely what it says.

4. latestchattimestamp = Last timestamp for a chat.

5. url1 = Link for the profile pic of the user.

One can use the uid number to access the public profile of the user in a browser. Just use the following URL and fill it in with the uid of interest.
https://m.tiktok.com/h5/share/usr/(insert username ID number from DB).html
Here is one of my test account profiles as an example:

All publicly shared videos can be seen in the profile.

Videos created with the app can be found in the following directory with the .mp4 extension:
/private/var/mobile/Containers/Data/Application/UU-ID/temp/
Conclusion

The TikTok app, both in Android and iOS, stores JSON data within SQLite databases. Currently I don't know of any major forensic tool vendor that has the json_extract function enabled in their SQLite implementations. This means that queries that handle JSON data can't be incorporated into their artifact/template generation tools except via the use of more complex JSON handling Python scripts.

For the time being exporting the databases of interest and executing SQL queries on them via a third party SQLite browser tool will be my preferred choice.
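For readers who do go the script route, here is a minimal sketch of pulling JSON fields out of a SQLite column in Python without relying on json_extract. The function name, the demo table 'demo_messages', its 'content' column, and the 'text' key are all made up for illustration. They are not the real TikTok schema, so substitute the names seen in the actual database.

```python
import json
import sqlite3


def extract_json_fields(con, table, json_column, keys):
    """Yield one dict per row holding the requested top-level JSON keys."""
    # Table and column names come from the caller and are quoted as identifiers.
    query = f'SELECT "{json_column}" FROM "{table}"'
    for (raw,) in con.execute(query):
        try:
            doc = json.loads(raw) if raw else {}
        except (TypeError, ValueError):
            # Not valid JSON: keep the row but leave the fields empty.
            doc = {}
        if not isinstance(doc, dict):
            doc = {}
        yield {key: doc.get(key) for key in keys}


# Demo with an in-memory database and hypothetical names.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE demo_messages (content TEXT)")
con.execute("INSERT INTO demo_messages VALUES (?)", ('{"text": "hello"}',))
con.execute("INSERT INTO demo_messages VALUES (?)", ("not json",))
rows = list(extract_json_fields(con, "demo_messages", "content", ["text"]))
print(rows)
```

Parsing in Python keeps the SQL portable to SQLite builds that lack the JSON functions, at the cost of doing the column splitting outside the query.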

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Friday, November 9, 2018

Finding TikTok messages in Android

Short version

The Android TikTok app keeps message related data in SQLite databases located in the following location:
userdata/data/com.zhiliaoapp.musically/databases/
The database containing user data, both the local user and friends, is named db_im.xx.
The database containing the messages is named following this regex: ([0-9]{19})(_im.db)$. The filename is a 19 digit numeric sequence followed by the _im.db ending.
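As a small illustration, the pattern above can be used to pick the message database out of a directory listing and recover the 19 digit number from its name. This is a sketch with hypothetical function and variable names. Note that it escapes the dot and matches the whole filename, which is slightly stricter than the regex as written above.

```python
import re

# Same idea as the regex in the post, with the dot escaped.
MESSAGE_DB_PATTERN = re.compile(r"([0-9]{19})(_im\.db)")


def find_message_databases(filenames):
    """Map each matching database filename to the 19 digit number in its name."""
    found = {}
    for name in filenames:
        # fullmatch rejects names with extra leading or trailing characters.
        match = MESSAGE_DB_PATTERN.fullmatch(name)
        if match:
            found[name] = match.group(1)
    return found


# Illustrative listing; the 19 digit name is only an example value.
listing = ["db_im.xx", "6619791930123403269_im.db", "12345_im.db"]
print(find_message_databases(listing))
```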

Queries that can be used as templates to extract messages from the database can be found at:
github.com/abrignoni/DFIR-SQL-Query-Repo/tree/master/Android/TIKTOK


In order to view the public TikTok profiles of the users found in the db_im.xx database add the user id number to the end of the following URL:
https://m.tiktok.com/h5/share/usr/(insert username number from DB).html
For one of the test accounts used in this blog post the URL looks like this:
https://m.tiktok.com/h5/share/usr/6619791930123403269.html 
Multiple XML app files can be located at:
userdata/data/com.zhiliaoapp.musically/shared_prefs
Some of the app related info contained within the XML files includes:

  • Total traffic
  • Collect traffic time
  • Recent search history 
  • Mobile traffic 
  • Language
  • Region
  • First open time
  • App install time
  • Last update time
  • Mac address
  • Last wifi bssid
  • Last time check bssid
The previous are just a few examples of the type of content the XML stores. Additional user info can be found in the aweme_user.xml file.
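With dozens of XML files to review, a short script can flatten each one into name and value pairs. The sketch below assumes the usual Android shared_prefs layout, a map element holding string, int, long, float, boolean and set children. The key names in the sample are invented for the demo and are not taken from the TikTok files.

```python
import xml.etree.ElementTree as ET


def read_shared_prefs(xml_data):
    """Return a dict of name -> value from Android shared_prefs XML data."""
    prefs = {}
    root = ET.fromstring(xml_data)
    for node in root:
        name = node.get("name")
        if name is None:
            continue
        if node.tag == "string":
            prefs[name] = node.text or ""
        elif node.tag in ("int", "long"):
            prefs[name] = int(node.get("value"))
        elif node.tag == "float":
            prefs[name] = float(node.get("value"))
        elif node.tag == "boolean":
            prefs[name] = node.get("value") == "true"
        elif node.tag == "set":
            # String sets hold nested <string> children.
            prefs[name] = [child.text or "" for child in node]
    return prefs


# Sample data with made-up key names.
SAMPLE = b"""<?xml version='1.0' encoding='utf-8' standalone='yes' ?>
<map>
    <string name="example_region">US</string>
    <long name="example_first_open_time" value="1541279157702" />
    <boolean name="example_flag" value="true" />
</map>"""

print(read_shared_prefs(SAMPLE))
```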

Videos created with the app can be found in three files. One contains the video, another the audio for it and a third one combines both. The files are located at:
userdata/data/com.zhiliaoapp.musically/files/
The filenames are a dash separated timestamp followed by a numeric sequence that ends with -concat-v for the video and -concat-a for the audio. A sample audio filename would be something like this: 2018-11-03-210557702-mix-concat-a.

For the combined video and audio file the filename will follow the previous format with the addition of the synthetise_ prefix. For example: synthetise_2018-11-03-210556218-concat-v.
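The naming rules above are simple enough to sort a file listing automatically. Here is a sketch that labels a filename as video, audio or combined and reads the date portion at its start. The function name is hypothetical. Only the leading date is parsed, since I am not making assumptions here about the numeric sequence that follows it.

```python
from datetime import datetime

COMBINED_PREFIX = "synthetise_"


def describe_media_file(filename):
    """Return (kind, recorded_date) for a file name from the files/ directory."""
    combined = filename.startswith(COMBINED_PREFIX)
    # Drop the prefix so the date sits at the start of the name.
    stem = filename[len(COMBINED_PREFIX):] if combined else filename
    if combined:
        kind = "combined"
    elif stem.endswith("-concat-v"):
        kind = "video"
    elif stem.endswith("-concat-a"):
        kind = "audio"
    else:
        kind = "unknown"
    try:
        recorded = datetime.strptime(stem[:10], "%Y-%m-%d").date()
    except ValueError:
        recorded = None
    return kind, recorded


print(describe_media_file("2018-11-03-210557702-mix-concat-a"))
print(describe_media_file("synthetise_2018-11-03-210556218-concat-v"))
```

The prefix check comes first because the combined file also ends in -concat-v.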


Long version

The Android TikTok app is one of the more popular apps in the Google Play store with over 100,000,000 downloads. The app is used to create short videos where the user can easily edit the sounds and visuals and share them within the social media environment it provides.

This app is ridiculously popular with teens.
As with most social media platforms the app provides a way for users to send messages to each other.


Testing platform

Details of hardware and software used can be found here.

Extraction and processing

Magnet Forensics Acquire - Full physical extraction. 
Autopsy 4.8.0
DB Browser for SQLite

Analysis

The TikTok app directory structure looks as follows:

Usual Android app file structure.
As stated at the beginning of the post the main messaging content SQLite database is named by the following pattern: ([0-9]{19})(_im.db)$. The 19 digit number at the start of the file name is the user id of the account logged in to the app. The messages table does not contain the actual user names; that information resides in a second database called db_im.xx. The table for the messages is appropriately called msg.

The following image shows some of the more relevant fields in the msg table:

JSON, we meet again!!!!
As expected the creation time of the messages is in unix epoch format and the actual text content is in JSON format. The extract messages query at the top of the blog post uses the json_extract function to separate the relevant JSON into its own result columns.

It is of note that some of the messages have the read_status value in the last column set to zero. This means that the message did not reach the server. In my tests those messages were sent before the target account had followed the account initiating the message. The local info column contains relevant information that will help the analyst understand the reason for a read_status of zero. In this instance the local info message read as "This person hasn't followed you yet and may not be able to receive your messages."

Local_info column value. Long hand for sorry can't do that Dave.
Next are the contents of the user data table, named SIMPLE_USER, in the db_im.xx database.

Notice the user id, nickname and unique_id values. The avatar thumb url, in JSON format, is there as well. The SQL query for messages joins both the messages and user data tables to present a unified result for all messages sent and received. As always it converts the time from unix epoch to local time and extracts all the relevant JSON to its own result columns. For the query to work one of the databases needs to be attached so the query can have access to both databases and the necessary tables from each.
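To make the attach step concrete, here is a stripped down sketch of a cross database join using ATTACH DATABASE from Python. It is not the query from the repo. The column names (sender, created_time, uid, nickname) are simplified assumptions, the time is assumed to be stored in seconds, and the result is shown in UTC to keep the example deterministic. Add the 'localtime' modifier to datetime() for local time, and divide by 1000 first if the values turn out to be milliseconds.

```python
import sqlite3


def joined_messages(messages_db, users_db):
    """Join messages with user names across two SQLite files via ATTACH."""
    con = sqlite3.connect(messages_db)
    try:
        # Make the second database reachable under the schema name 'users'.
        con.execute("ATTACH DATABASE ? AS users", (users_db,))
        return con.execute(
            """
            SELECT m.sender,
                   u.nickname,
                   datetime(m.created_time, 'unixepoch') AS created_utc
            FROM msg AS m
            LEFT JOIN users.SIMPLE_USER AS u ON u.uid = m.sender
            ORDER BY m.created_time
            """
        ).fetchall()
    finally:
        con.close()


# Example call (file names are placeholders for exported working copies):
# rows = joined_messages("<19 digits>_im.db", "db_im.xx")
```

The LEFT JOIN keeps messages whose sender has no matching row in the user table, so nothing silently drops out of the results.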

The next image shows a portion of the extract messages query results. See how, if the user responds with a GIF, the URL and display name are extracted from the JSON and given their own columns. These URLs are accessible without authentication.

Love it when a plan comes together.
One useful trick is to take the UID values and insert them into a specific URL in the following manner:
https://m.tiktok.com/h5/share/usr/(insert username number from DB).html
 If the account is not private you will be able to see the shared content. Here is an example:

This is the account.

The app also maintains the shared content as described in the first section of the blog post as well as multiple XML files that can be of use to the analyst. In my sample data set there were 77 XML files. Going through them here is beyond scope but it is highly recommended to take the time and understand their content.

Next post will be about finding TikTok messages in iOS. When those queries are completed they will be added to the DFIR SQL Query Repo as well.

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.



Thursday, November 1, 2018

Quick DFIR review - CCleaner for Android

Short version

CCleaner for Android keeps application usage related data in the following location:
data/data/com.piriform.ccleaner/databases/
 The following queries at https://github.com/abrignoni/DFIR-SQL-Query-Repo can be used as templates to extract information from the CCleaner SQLite databases.
CCleaner installation and first usage timestamps can be found in the following XML file:
com.google.android.gms.measurement.prefs.xml
The XML file is located at:
data/com.piriform.ccleaner/shared_prefs/ 
Long version

Mobile extractions can provide different degrees of visibility into what type of artifacts can be gleaned from the device under examination. For example in most instances a physical dump will enable us to get at more artifacts than an Android backup extraction would. In situations where the extraction at hand provides limited visibility we can look at extracted apps to complement our view.  

Testing platform

For my testing platform see here.

Analysis

CCleaner for Android is a well known app with over 50 million downloads. 



The app is advertised as a simple way of reclaiming storage space and speeding up the device.

When the app is opened and the Analyze button is pressed, it presents a bar graphic stating that a scan is in progress. When the bar is completely filled, a list of apps and related directories that can be cleaned by emptying their corresponding cache folders is shown. It looks like so:

Apps after scanning
The list is divided in the following categories:

  • Hidden cache
  • Visible cache
  • Empty folders
  • Downloads
When the FINISH CLEANING button at the end is pressed the app presents how much space was saved.

Saved space.
Running the scan a second time the app list is shorter than the first go around.

Fewer apps.
By looking at the data/data/com.piriform.ccleaner/databases/ directory one sees that the database which contains the scanned app data is properly called scanner-cache.db. This database lists apps that have data in the /data/data/ and related /data/media/ directories. When a scan is responsive to any of those directories the package name and the timestamp is added to the scanner-cache.db in the appInfoCache table. Based on my testing the timestamps in the database will correspond to the latest scan of the apps and directories, hence why all the timestamps are separated at most by milliseconds. Is there an exception to the rule?

In the following image I sorted the contents of the database by timestamp. Notice the first entry.

Odd man out.

Notice the scan time in the image. The Flud application was not available on my sample Android backup extraction. I would not have known the device had the application installed at some point since my extraction was not able to pull the usual files used to determine installed applications (like packages.xml). The only way I found to replicate such behavior in CCleaner is to delete an app with the Android uninstall function, as opposed to using CCleaner itself to do the uninstall, while also making sure that the deleted app did not have any associated directories in the emulated storage directory.

Odd man out.
What does this tell us? Not only does the list tell us what installed applications were at some point scanned by CCleaner, but it can also give us a time range of when the odd man out app, scan time wise, was deleted. In the first image the Flud app was deleted at some point between 2018-09-24 08:00:00 and 2018-10-10 17:51:33. In the second image the Translator app was deleted at some point between 2018-10-30 17:30:04 and 2018-10-30 21:32:57.
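The odd man out reasoning can be expressed in a few lines of Python. This is a sketch under my own assumptions: the scan rows have already been read out of the database as (label, timestamp) pairs, and any entry more than a few seconds older than the latest scan is treated as an outlier. The function name, the tolerance value and the row labels are illustrative only.

```python
from datetime import datetime, timedelta


def deletion_windows(scan_rows, tolerance=timedelta(seconds=5)):
    """Return {label: (last_scan, latest_scan)} for entries the latest scan skipped."""
    if not scan_rows:
        return {}
    latest = max(ts for _, ts in scan_rows)
    # Entries scanned together sit within milliseconds of each other, so
    # anything older than the tolerance is an odd man out.
    return {
        label: (ts, latest)
        for label, ts in scan_rows
        if latest - ts > tolerance
    }


# Timestamps mirror the first image; the labels are placeholders.
rows = [
    ("flud (example label)", datetime(2018, 9, 24, 8, 0, 0)),
    ("app one (example label)", datetime(2018, 10, 10, 17, 51, 33)),
    ("app two (example label)", datetime(2018, 10, 10, 17, 51, 33)),
]
print(deletion_windows(rows))
```

As with the manual review, the window is only a range of possible deletion times and still needs to be validated against other artifacts.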

With this information the next steps would be to see if any other artifacts for Flud or Translator exist on the extractions under examination. For example in the case of Flud you might want to start looking for media player history files that include the application path for Flud downloads. As a popular BitTorrent client many individuals use it to download multimedia files. A media player that accesses such paths might retain the paths and filenames of media viewed from Flud directories even if those directories are not present in the extraction.  

Can apps that have never been downloaded and/or installed be on this list? I have not seen this to be the case for obvious reasons. Can deleted apps still be found on the list that have a last/current scan time? Yes! And this is why testing and validation is so important. To assume that all the apps in the list were present/installed at the most current scan time could lead us to make an unsupported statement. My current hypothesis on why deleted apps might still have the most current scan times rests on the fact that these apps have related directories in emulated storage or in another location on the device. My limited tests seem to support the hypothesis but further testing is needed to confirm.

Another CCleaner file of interest is the com.piriform.cleaner.db database. It contains automated scheduled cleans.

Schedule your cleans.
I have not found this database in a free CCleaner app installation.

The cleaner_apps_db contents seem to match a list of installed apps. I have seen deleted apps persist in this database.


A list of CCleaner usage stats can be found in the cleaner database. 
Usage stats
Items that start with SAFEC are generated when an Analyze and Clean option is selected. Items that start with ADVC_CLEARED are generated when an Analyze and Clean Memory option is selected. The statistics are grouped by day and do not contain details on what was deleted.

A list of media filenames in emulated storage can be found in the myroll database.

Media list in emulated storage. Notice the was_deleted column.
Installation and first use CCleaner timestamps can be found in the com.google.android.gms.measurement.prefs.xml file located at data/com.piriform.ccleaner/shared_prefs/.

Conclusion
 

Third-party apps can augment our visibility into device artifacts when we are working with extractions that have limited file or system information. Testing and validation keep being of paramount importance since any conclusions we reach need to be supported by evidence and not only hunch or opinion. 

Since information that is not shared is information that is lost, these SQL queries are now accessible at the DFIR SQL query repository.

https://github.com/abrignoni/DFIR-SQL-Query-Repo/tree/master/Android/CCLEANER

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.