Sunday, August 26, 2018

Android Nike Run app - Geolocation, SQLite views & self joins

Short version:

The Android Nike Run app keeps geolocation data in the following location and file:
  • /userdata/data/com.nike.plusgps/databases/com.nike.activitystore.database
    • SQLite database
    • Tracks latitude, longitude, distance, calories and a multitude of different metrics for runners.
To select the latitude and longitude coordinates for any recorded run the following query can be used:
CREATE VIEW sub1
as
SELECT * FROM activity_raw_metric, activity_metric_group
WHERE activity_raw_metric.rm_metric_group_id = activity_metric_group._id and activity_metric_group.mg_activity_id = integer_value_of_interest
ORDER BY activity_raw_metric.rm_start_utc_millis;  
SELECT datetime(lat.rm_start_utc_millis / 1000, 'unixepoch', 'localtime') as day, lat.rm_value as latitude, long.rm_value as longitude
FROM sub1 as lat left join sub1 as long on lat.rm_start_utc_millis = long.rm_start_utc_millis
WHERE lat.mg_metric_type = 'latitude' and long.mg_metric_type = 'longitude'
ORDER BY lat.rm_start_utc_millis; 
DROP VIEW sub1;
  • To identify which latitude and longitude values correspond to a particular run, a VIEW is used; the integer value of mg_activity_id identifies each run uniquely. In the query above, substitute integer_value_of_interest with the integer that represents the run we want to extract.
  • The second SELECT is a self join that places latitude and longitude values for a specific point in time in one row. This makes exporting the data to mapping applications possible since most of these expect a pair of geolocation values per row.
Below is a Python script that parses com.nike.activitystore.database and generates multiple CSV files with the date, time, latitude and longitude values for all runs. No need to manually change the integer_value_of_interest. The script produces one CSV file per run which can be imported to a mapping application like Google Earth for visualization.
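The script's general approach can be sketched as follows. This is a minimal version that assumes only the table and column names shown above; a WITH clause stands in for the temporary VIEW so the database file is not modified. The export_runs helper and the file naming are illustrative, not the actual script:

```python
# Sketch: one CSV per run from com.nike.activitystore.database.
# Assumes the activity_raw_metric / activity_metric_group schema
# described above; a CTE replaces the VIEW so nothing is written
# to the database. Work on a copy of the evidence file regardless.
import csv
import sqlite3

PER_RUN_QUERY = """
WITH sub1 AS (
    SELECT * FROM activity_raw_metric, activity_metric_group
    WHERE activity_raw_metric.rm_metric_group_id = activity_metric_group._id
      AND activity_metric_group.mg_activity_id = ?
)
SELECT datetime(lat.rm_start_utc_millis / 1000, 'unixepoch', 'localtime') AS day,
       lat.rm_value AS latitude, long.rm_value AS longitude
FROM sub1 AS lat LEFT JOIN sub1 AS long
     ON lat.rm_start_utc_millis = long.rm_start_utc_millis
WHERE lat.mg_metric_type = 'latitude' AND long.mg_metric_type = 'longitude'
ORDER BY lat.rm_start_utc_millis
"""

def export_runs(db_path, out_dir="."):
    """Write one CSV per run found in the database; return the run ids."""
    conn = sqlite3.connect(db_path)
    run_ids = [r[0] for r in conn.execute(
        "SELECT DISTINCT mg_activity_id FROM activity_metric_group")]
    for run_id in run_ids:
        rows = conn.execute(PER_RUN_QUERY, (run_id,)).fetchall()
        with open(f"{out_dir}/run_{run_id}.csv", "w", newline="") as f:
            writer = csv.writer(f)
            writer.writerow(["day", "latitude", "longitude"])
            writer.writerows(rows)
    conn.close()
    return run_ids
```

Running export_runs("com.nike.activitystore.database") would then produce one run_&lt;id&gt;.csv per recorded run.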

This post is the logical continuation of my previous iOS Nike Run app - Geolocation & self join query entry. The purpose is to find geolocation artifacts for app recorded runs. Be aware that the app contains many more metrics of possible interest beyond GPS coordinates. 

Testing platform

Details of hardware and software used can be found here.

Extraction and processing

Magnet Forensics Acquire - Full physical extraction. 
Autopsy 4.8.0

Analysis

The Nike Run user and application directories were located at:
userdata/data/com.nike.plusgps
Main Nike Run directory
The database that contained all the metrics for tracked runs was located at:
/userdata/data/com.nike.plusgps/databases/com.nike.activitystore.database
The main tables to consider in the com.nike.activitystore.database file are activity_raw_metric
and activity_metric_group.

The activity_raw_metric table contains all the values for longitude, latitude, time and more. As seen in the image below, there is no way to determine from the table itself which entries correspond to which values.

Activity_raw_metric sample data.
The activity_metric_group table does contain the required detail, but the connection between the tables is not readily apparent.

Activity_metric_group sample data.
The key was understanding that mg_activity_id identifies individual runs while the _id field links to the values in the activity_raw_metric table.

The following query establishes the initial relation we seek between a run and its time/geolocation values. In this case it is for run number 23.
SELECT * FROM activity_raw_metric, activity_metric_group
WHERE activity_raw_metric.rm_metric_group_id = activity_metric_group._id and activity_metric_group.mg_activity_id = 23
ORDER BY activity_raw_metric.rm_start_utc_millis; 
Latitude is identified by 67 and longitude by 68 for run 23.
As seen above, every run has its own set of rm_metric_group_id values we need to consider. In this example the values 67 and 68 in this field correspond to latitude and longitude only for run 23. Next to them are the actual values we can map. The query is ordered by the time field since every moment in time has a unique pair of geolocation values. The longer the run, the more values are stored in the table.
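This group-to-metric mapping can also be pulled programmatically. A minimal sketch, using the table and column names just described (verify them against your own copy of the database):

```python
# Sketch: list which activity_metric_group rows belong to one run and
# what metric type each group id holds. Column names are as described
# above; confirm them against your own copy of the database.
import sqlite3

def metric_groups_for_run(conn, activity_id):
    """Return (group_id, metric_type) pairs for one mg_activity_id."""
    cur = conn.execute(
        "SELECT _id, mg_metric_type FROM activity_metric_group "
        "WHERE mg_activity_id = ? ORDER BY _id",
        (activity_id,))
    return cur.fetchall()
```

For run 23 this would show, for example, that group 67 holds latitude and group 68 holds longitude.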

Since most third party mapping applications, like Google Earth, accept CSV files with latitude and longitude values to map, we have to make sure the end result of our query is compatible with such tools. The query above does not aggregate the latitude and longitude values for a particular point in time into one row. This matters because mapping tools expect one latitude and longitude pair per row in the CSV file.

Unlike the database for iOS, this first query produces the initial result set by combining two tables. Since this initial query is a join of two tables, in order to place the latitude and longitude value pairs in rows we make a VIEW out of it as so:
CREATE VIEW sub1
AS
SELECT * FROM activity_raw_metric, activity_metric_group
WHERE activity_raw_metric.rm_metric_group_id = activity_metric_group._id and activity_metric_group.mg_activity_id = 23
ORDER BY activity_raw_metric.rm_start_utc_millis; 
It is the same query as before but with the CREATE VIEW AS added at the start. Now that we have a view we can reference it by the name we gave it, sub1, and run a self join query on it.
SELECT datetime(lat.rm_start_utc_millis / 1000, 'unixepoch', 'localtime') as day, lat.rm_value as latitude, long.rm_value as longitude
FROM sub1 as lat left join sub1 as long on lat.rm_start_utc_millis = long.rm_start_utc_millis
WHERE lat.mg_metric_type = 'latitude' and long.mg_metric_type = 'longitude'
ORDER BY lat.rm_start_utc_millis; 
DROP VIEW sub1;
This query does a self join on the date-time values since for every point in time there will be a pair of coordinates to map. I also took the opportunity to add a field to the result where the time is converted from UTC to local time. When I'm done with the view I delete it.
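If this VIEW sequence is driven from a script, wrapping the create and drop in try/finally guarantees the view is removed even if the query fails. A minimal sketch (and, as always, run it against a working copy of the database, never the original evidence, since creating a view modifies the file):

```python
# Sketch: create the sub1 view for one run, execute a query against it,
# and always drop the view afterwards. Table/column names are from the
# schema described above. CREATE VIEW cannot take bound parameters, so
# the activity id is cast to int before interpolation.
import sqlite3

def with_run_view(conn, activity_id, query):
    """Run `query` against a temporary sub1 view scoped to one run."""
    conn.execute(
        "CREATE VIEW sub1 AS "
        "SELECT * FROM activity_raw_metric, activity_metric_group "
        "WHERE activity_raw_metric.rm_metric_group_id = activity_metric_group._id "
        f"AND activity_metric_group.mg_activity_id = {int(activity_id)}")
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.execute("DROP VIEW sub1")   # never leave a stray view behind
```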

The image above combines the two queries just described and shows how geolocation values, with their corresponding date-times, are consolidated by row.

Notice result at bottom.

What if we want to export all runs with their values as just seen and not just one run at a time? Changing the mg_activity_id value in the create view query run by run gets old pretty quick. In order to automate it a little I made a script that selects all run numbers first and then exports all values per run automatically. The values are exported as CSV files whose filenames are the date and time of the start of the run they represent.

The script can be found here:
https://github.com/abrignoni/NikeRunApp-Geoparser
Export the database and put it in the same directory as the script. Run the script and when it ends one should see the created CSV files as seen below, one per run for all runs.

Look at all those CSVs!
Take the CSV files and import them to a mapping application and done.

Google Earth
BIG BIG CAVEAT #1

This analysis tells us NOTHING about what device was used to record this data within the app.
As a matter of fact none of the runs in my test device were made with it. Just because an app registers geolocation data does not mean the device in which the data was found was the one used to generate it. App synchronization requires the examiner to be really careful before making causal determinations. As always careful testing and validation is necessary.

BIG BIG CAVEAT #2


Super important. In order to not miss evidence of possible deleted records do use a forensic SQLite tool to explore WAL records.

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Tuesday, August 21, 2018

Finding Discord chats in Linux - #DFIR review

This is the last entry in the Discord chat files analysis series.
 I have reviewed Discord on the following platforms:
For Debian based Linux distributions the chats are located in the following directory:
  • /home/testing/.config/discord/Cache/
Root structure

The cache folder follows the same structure as the ones in Windows and OS X.

Just like Windows and OS X.
The following links explain how to extract the json chat objects and how to convert them to html or xls files using a Python script. Although written originally for objects found in Windows, the extraction and conversion steps apply all the same.

Extraction of json objects:

Discord json chats conversion to html or xls.

As background on how I located the correct folder I took the following steps:

1. Installed Linux Mint using VMware.
2. Installed the Discord program.
3. Searched for Discord related files and folders with the following command:
sudo find / -name discord
/home/testing/.config/discord
4. The correct folder was found in:
  • /home/testing/.config/discord/Cache/
5. Followed the extraction and JSON conversion steps detailed in the previous links.
6. Chat JSON fields, format and content are the same as in Windows and OS X.
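Since the cached files carry opaque names, a quick way to separate candidate json chat objects from cached media is to sniff the first byte of each file. A rough sketch (the JSON test is a heuristic, not a guarantee):

```python
# Sketch: flag files in the Discord Cache directory whose first byte
# suggests JSON content (an array or object), as the chat files do.
# Media files such as PNGs start with other magic bytes.
import os

def find_json_candidates(cache_dir):
    """Return file names in cache_dir whose first byte suggests JSON."""
    hits = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if not os.path.isfile(path):
            continue
        with open(path, "rb") as f:
            first = f.read(1)
        if first in (b"[", b"{"):   # JSON array or object
            hits.append(name)
    return hits
```

On the Linux test system the argument would be /home/testing/.config/discord/Cache/.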

JSON content in HTML format.
As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Sunday, August 19, 2018

iOS Nike Run app - Geolocation & self join queries

Short version:

The iOS Nike Run app keeps geolocation data in the following location and files:
  • /private/var/mobile/Containers/Data/Application/*UUID*/Documents/activityStore.db
    • SQLite database
    • Contains metrics for each run to include calories, distance, elevation, latitude and longitude.
SQL self join query to extract GPS coordinates from activityStore.db in the metrics table:
SELECT lat.sourceid, datetime(lat.startDateInUtcSeconds, 'unixepoch', 'localtime') as day, lat.startDateInUtcSeconds, lat.value as latitude, long.value as long
FROM metrics as lat left join metrics as long on lat.startDateInUtcSeconds = long.startDateInUtcSeconds
WHERE lat.type = 'latitude' and long.type = 'longitude' and lat.sourceID = integer_value_of_interest
ORDER BY lat.startDateInUtcSeconds
  • Latitude and longitude values for a specific point in time reside in different rows of the same table. This makes exporting the data to a mapping application difficult since these programs expect both values to be in one row. In order to export both values in one row a SQL self join query like the one above should be used. 
  • By eliminating lat.sourceID from the query all GPS data from all runs are returned. Importing all that latitude/longitude data to a mapping application, like Google Earth, will either hang it or map overlapping tracks if the multiple runs covered the same path during different days.
  • By equating lat.sourceID to a sourceID of interest a single run (set of related GPS coordinates) can be selected and exported for mapping. In the query above, integer_value_of_interest is a placeholder that should be substituted. The script below selects the appropriate integer for all runs that have geolocation data automatically. 
The following link leads to a Python script that parses the activityStore.db SQLite database and generates multiple CSV files with date, time, latitude and longitude. One CSV file is generated per run; these can be imported into mapping applications like Google Earth for visualization.
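The per-run logic of that script can be sketched roughly like this, reusing the self join query above. The run ids come from the distinct sourceID values that carry latitude rows; the function and file names are illustrative, not the actual script:

```python
# Sketch: one CSV per run from activityStore.db, using the metrics
# table self join described above. Work on a copy of the database.
import csv
import sqlite3

RUN_QUERY = """
SELECT datetime(lat.startDateInUtcSeconds, 'unixepoch', 'localtime') AS day,
       lat.value AS latitude, long.value AS longitude
FROM metrics AS lat LEFT JOIN metrics AS long
     ON lat.startDateInUtcSeconds = long.startDateInUtcSeconds
WHERE lat.type = 'latitude' AND long.type = 'longitude'
  AND lat.sourceID = ?
ORDER BY lat.startDateInUtcSeconds
"""

def export_ios_runs(db_path, out_dir="."):
    """Write one CSV per sourceID that has latitude rows; return the ids."""
    conn = sqlite3.connect(db_path)
    ids = [r[0] for r in conn.execute(
        "SELECT DISTINCT sourceID FROM metrics WHERE type = 'latitude'")]
    for source_id in ids:
        rows = conn.execute(RUN_QUERY, (source_id,)).fetchall()
        with open(f"{out_dir}/run_{source_id}.csv", "w", newline="") as f:
            w = csv.writer(f)
            w.writerow(["day", "latitude", "longitude"])
            w.writerows(rows)
    conn.close()
    return ids
```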

Long version:

One of the apps I use to track my #DFIRFit runs is the Nike + Running app. As I train for the Dopey Challenge races, the app tracks all my progress. It tracks pace, elevation, time, weather and even the route per GPS coordinates. This last set of data points is the reason for this blog post.

Nike+ App in the App Store

Testing Platform

For the analysis I am using the following device:
  • iPhone SE - A1662
  • iOS 11.2.1
  • Jailbroken - Electra
  • Nike + app installed and logged into my own account.
  • Forensic workstation with Windows 10 and SANS SIFT in VMware Pro
The following links describe how to jailbreak a test device and how to make a logical extraction of the user generated data via SSH.
SSH command used:
ssh root@iphoneip -p 22 'tar -cf - /private/var' > data.tar
After obtaining the logical image in a tar file the contents were extracted and processed with Autopsy.

Analysis

Since the folders where apps store user data use a UUID for a name, I did the following to identify which of these were the ones in use by the Nike+ app.

1) Identify the bundle id.
Search the bundle id lookup here.

Bundle Id: com.nike.nikeplus-gps

2) Go to the following path and file:
 /private/var/mobile/Library/FrontBoard/applicationState.db 

3) Open the applicationState.db SQLite database. Look for the bundle id in the
'application_identifier_tab' table. Take note of the id.
The id is 103

4) Go to the 'kvs' table and export the blobs for id 103, or view their contents in the binary viewer included in the DB Browser for SQLite software, to get the target UUID numbers. If the blob is exported, use a bplist viewer like Sanderson Forensics Binary Plist Viewer to get the following:

UUID named directories that hold the Nike+ data.
UUID numbers can change from device to device and from installation to installation on the same device.
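Step 4 can also be scripted. A minimal sketch using Python's plistlib, which reads binary plists directly; it assumes the id column in the 'kvs' table is named application_identifier, as described for this same database in my Discord post below:

```python
# Sketch: pull the kvs blobs for one application id out of
# applicationState.db and parse any that are binary plists.
# The application_identifier column name is an assumption taken
# from the description of this database elsewhere in the blog.
import plistlib
import sqlite3

def kvs_plists(db_path, app_row_id):
    """Yield parsed plists from the kvs blobs stored for one app id."""
    conn = sqlite3.connect(db_path)
    for (blob,) in conn.execute(
            "SELECT value FROM kvs WHERE application_identifier = ?",
            (app_row_id,)):
        if blob and blob[:6] == b"bplist":   # binary plist magic
            yield plistlib.loads(blob)
    conn.close()
```

For the Nike+ app one would call it with the id noted in step 3, e.g. kvs_plists("applicationState.db", 103), and read the UUIDs out of the returned dictionaries.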

Now that we have the UUIDs that relate to the app we can analyze the contents of those directories.
The following image is an overview of the directory structure in use by the Nike+ app.

Nike+ app directories.
After navigating the contents of these folders, the following path and database contained the GPS coordinates for runs tracked by the app.
/private/var/mobile/Containers/Data/Application/BEDDAE13-7EEC-492B-94AD-16B2456D8AF2/Documents/activityStore.db
The activityStore.db SQLite database contains information beyond latitude and longitude. It also tracks pace, descent, ascent, distance, elevation, calories, nike fuel, speed, and horizontal_accuracy. These can be found in the 'metrics' table within the database. The following image shows the field names defined in this table.

Metrics table.
Thankfully the developers used self explanatory terms for the fields and the content within them. The most salient feature of this table is that all run metrics it tracks are contained in the same table.
The following query selects only rows where the field type contains longitude or latitude as a metric of interest. It also generates a field in the report that converts the UTC time to local time.

SELECT *, datetime(metrics.startDateInUtcSeconds, 'unixepoch', 'localtime') as loctime
FROM metrics
WHERE (metrics.type = 'latitude' or metrics.type = 'longitude') and metrics.sourceID = 4 ORDER BY metrics.startDateInUtcSeconds
Exporting is not that straightforward.
Notice how the longitude and latitude for a single point in time have their own individual rows. If the results of this query are exported to a CSV file we will have to find a way to make sure that the longitude and latitude coordinates share one row before importing them into a mapping application like Google Earth. Mapping applications will only take in coordinates from a point in time if latitude and longitude are provided in the same row, each point in time in its own row sequentially. Thankfully SQL gives us an easy way to format the data in the way we need it. In order to do this I came up with a self join query.
SELECT lat.sourceid, datetime(lat.startDateInUtcSeconds, 'unixepoch', 'localtime') as day, lat.startDateInUtcSeconds, lat.value as latitude, long.value as long
FROM metrics as lat left join metrics as long on lat.startDateInUtcSeconds = long.startDateInUtcSeconds
WHERE lat.type = 'latitude' and long.type = 'longitude' and lat.sourceID = integer_value_of_interest
ORDER BY lat.startDateInUtcSeconds
The query does various things.
  1. It references the metrics table twice in the FROM clause, effectively creating two temporary copies of the table named lat and long respectively.
  2. A left join is used to combine both tables by the startDateinUtcSeconds field. This is due to the fact that for a recorded location in time there will always be a set of corresponding latitude and longitude values.
  3. From this joined table the sourceid, startDateinUtcSeconds, startDateinUtcSeconds converted to local time, and the values for latitude and longitude are selected. The sourceid is important because all metrics tracked in a single run share the same sourceid.
  4. The selection is constrained to only latitude and longitude values in the WHERE section. Optionally one can constrain further by selecting only a single run for the results. This is accomplished by 'lat.sourceID = integer_value_of_interest' where the integer will be a single run. 
It is of note that if the lat.sourceID constraint is removed from the WHERE clause the query will return all the selected fields for all runs. Exporting so many entries will hang most mapping applications. It is better to select a few runs of interest based on case pertinent points in time and export them by individual run, or to script their extraction from the database. These data sets will be much smaller and easily ingested by the mapping applications. More on scripting at the end.

In the following example I selected all longitude and latitude data where lat.sourceID equaled 4. Based on the date I knew it had to reflect a half marathon I had run at Disney in Orlando some years ago. I exported the result as a CSV file and used Google Earth for visualization.

Here is how the resulting query looked:

Latitude and longitude share the same row now. SourceID = 4.

Here is how Google Earth mapped the coordinates:

Yellow pin on the right denotes starting point.

If we zoom in we can see Epcot Center.

Epcot Center
We can also see Cinderella's Castle at Magic Kingdom.

Castle right in the middle.
By changing the integer selected in the lat.sourceID field I could target any individual run and export those coordinates only. 


BIG BIG CAVEAT

This analysis tells us NOTHING about what device was used to record this data within the app.
As a matter of fact none of the runs in my test device were made with it. The Disney run above happened in 2012 and I don't even have the phone I used back then. We have to be careful and not jump to conclusions. Just because an app registers geolocation data does not mean the device in which the data was found was the one used to generate it. App synchronization requires the examiner to be really careful before making causal determinations. As always careful testing and validation is necessary.

BIG BIG CAVEAT #2


Super important. In order to not miss evidence of possible deleted records do use a forensic SQLite tool to explore WAL records before any manipulation of the database is attempted with non-forensic tools.

Automation

The following script will automate the extraction of geolocation data by creating a CSV file per run.
Execute the script in the same directory as the extracted database. CSV files will be generated automatically from all available runs.
https://github.com/abrignoni/NikeRunApp-Geoparser 
CSV file names are generated from the time the run started.
Exported and converted runs from SQLite to CSV
I've also made a parser for the Android version of the app which will be explained in the next blog post. The Android database schema differs a lot from the one we just discussed for iOS.

Conclusions
There are additional artifacts and a larger set of metrics recorded in the referenced database for future exploration. Self join queries are awesome. Automating data parsing/conversion is even better.



As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Monday, August 13, 2018

Finding Discord chats in iOS

Short version:

Discord for iOS keeps chat data in the following directories and files:
  • /private/var/mobile/Containers/Data/Application/*UUID*/Library/Caches/com.hammerandchisel.discord/fsCachedData/
    • The UUID formatted directory name in the path maps to the bundle id of the app. 
    • Chat and media files found in the fsCachedData directory. 
    • All chat content in JSON format.
  • /private/var/mobile/Library/FrontBoard/applicationState.db
    • SQLite database.
    • Contains a bplist that establishes the relationship between the bundle id (com.hammerandchisel.discord), app id (Discord.app) and the UUID folder name in the Applications folder.
    • Look for app name in 'application_identifier_tab' table. Take id number. Use id to extract blob field in the 'kvs' table. Open blob to map UUID to application name.
    • A more direct way to determine the correct application folder where the data resides is to open each one by one and look for the bundle id in the path, or to use commercial tools.
  • /private/var/mobile/Containers/Data/Application/*UUID*/Documents/RCTAsyncLocalStorage_V1/manifest.json
    • Key value pairs in JSON format. 
    • Contains items like 'LAST_VIEWED_PATH', 'first_run_date_key', and 'Emoji UsageHistory.'
Long version:

This post is a continuation of the Discord digital forensic analysis on all platforms on which it is available. The main purpose is to determine where user generated items, like chats and related multimedia files, are located. Discord is supported on browsers and multiple operating systems.

Supported platforms
So far I have reviewed Discord on the following platforms:

Testing Platform

For analysis I am using the following device and equipment:
  • iPhone SE - A1662
  • iOS 11.2.1
  • Jailbroken - Electra
  • Discord app installed and logged in to test account with test data.
  • Forensic workstation with Windows 10 and SANS SIFT in VMware Pro
Acquisition

My first attempt to extract Discord data from the device involved using Magnet Forensics Acquire and Cellebrite Physical Analyzer. None of the available extraction methods provided user generated Discord data. An iTunes backup did not include any Discord files either. The only solution was to try jailbreaking the device in order to access the file system. The instructions I followed for the jailbreak came from the ElcomSoft blog here. It worked.

It is of note that even though the device was jailbroken the extraction/imaging tools at my disposal did not recognize the device as such. My assumption is that the current version of the Apple File Conduit is not compatible with the jailbreak being used. Thankfully it was fairly easy to extract the file system data via SSH connection from the iPhone to the forensic workstation SIFT virtual machine. Sarah Edwards in her 'Getting Saucy with APFS' presentation explains how to make a tarball of the iPhone file system via SSH. It worked like a charm. All done in a single line.
ssh root@iphoneip -p 22 'tar -cf - /private/var' > data.tar
Page 8 of the presentation slides PDF.
Worth watching the whole presentation.
After decompressing the tar bundle the logical files were processed with Autopsy and Cellebrite Physical Analyzer. No user generated Discord artifacts were present in the tools' parsed artifacts sections.

Analysis

There are multiple ways of navigating the acquired logical files in order to find Discord data. The most basic one is to do an index/text/keyword search for the term 'Discord.' 

Another option is to navigate to the
/private/var/mobile/Containers/Data/Application 
directory and look through the UUID named folders for the one that corresponds to the Discord app.

Commercial tools can establish this association for you.

Cellebrite Physical Analyzer
See the application id field, identifier and the source file for these values in Physical Analyzer. Since one of the plists with the required information resides in the very directory we are trying to identify, this method did not seem the fastest when doing a manual examination or when using an open source tool like Autopsy.

My approach:

I looked up the Discord app bundle id and searched for that specific term within the data. There are various ways to obtain this bundle id. For this analysis I searched for it in a lookup web page here. For the Discord app the bundle id is 'com.hammerandchisel.discord'.

https://offcornerdev.com/bundleid.html

The bundle id keyword search in Autopsy led me to the 'applicationState.db' file located at
/private/var/mobile/Library/FrontBoard/ 
This SQLite database provided the connection between the bundle id and the UUID numbers in the 'Application' directory.

Open the SQLite database with a SQLite browser. Look for the bundle id name in the 'application_identifier_tab' table. Take note of the corresponding id number. In this instance it is 101.

Look at the id field with the 101 value.

Use the 101 id and look for it in the 'kvs' table in the 'application_identifier' field. Export the blob in the value field for the id. The exported data is a bplist that maps all pertinent UUID numbers to the application name and/or bundle id. The data can also be seen in the preview pane in binary mode without the need to export the blob content.
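The lookup can also be reversed: instead of reading ids by hand, scan every kvs blob for the bundle id bytes. A minimal sketch that assumes only the 'value' field described above:

```python
# Sketch: find the kvs rows whose blob mentions a given bundle id.
# Only the 'value' blob column is assumed; rowid is SQLite's built-in
# row identifier. Run this against a copy of applicationState.db.
import sqlite3

def blobs_mentioning(db_path, bundle_id):
    """Return kvs rowids whose blob contains the bundle id string."""
    needle = bundle_id.encode()
    conn = sqlite3.connect(db_path)
    hits = [rowid
            for rowid, value in conn.execute("SELECT rowid, value FROM kvs")
            if value and needle in value]
    conn.close()
    return hits
```

The matching blobs can then be exported and opened in a bplist viewer as described above.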

Blob as binary data
If the bplist is exported a viewer, like Sanderson Forensics BPlister, can be used to see the relationship between UUID and application we are looking for.

These are the UUIDs we are looking for.
Items of interest, including the chats, were located in the matching UUID directories.
The Discord user data directory, per the bplist on my test device, was the following:

/private/var/mobile/Containers/Data/Application/97E2A29D-B465-451B-B045-4D480FC7AFE2/
Be aware that the UUID folder name assigned on my device will differ from the one on another device for the same app. In some instances this UUID number can change, for example, if the app is uninstalled and installed again on the same device.

After finding the proper UUID named folder for Discord, here is the file structure for it:

Notice 'fsCachedData' right under 'com.hammerandchisel.discord'

Chats and media

The chats were located in the following directory:
/private/var/mobile/Containers/Data/Application/*UUID*/Library/Caches/com.hammerandchisel.discord/fsCachedData/
The chats are in the expected JSON format.

Chat data in JSON
In order to make the chats more human readable I made a simple Python script that gives the option to convert the JSON data into HTML or XLS format.

JSON to HTML/XLS conversion
The script can be found here:
https://github.com/abrignoni/Discord-JSON-chat-conversions
Here is a sample of how the JSON chat looks in HTML format:

Truncated for better viewing
The fields are pretty self explanatory. If the analyst has an urgent interest in the content field in particular and wants to quickly see how the chat content flows, the XLS format is ideal. The XLS format compacts some of the metadata fields in order to show the chat content line by line chronologically, most recent content at the top.
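The core of such a conversion can be sketched like this, assuming each cached chat file is a JSON list of message objects with timestamp, content and a nested author object, as in the samples above. The .get fallbacks are defensive guesses, not guaranteed fields:

```python
# Sketch: flatten a Discord JSON message list into simple rows,
# newest first, ready for CSV/XLS output. Field names are assumed
# from the cached chat samples; missing fields become empty strings.
import json

def chat_rows(json_text):
    """Return (timestamp, author, content) rows, most recent first."""
    messages = json.loads(json_text)
    rows = [(m.get("timestamp", ""),
             m.get("author", {}).get("username", ""),
             m.get("content", ""))
            for m in messages]
    # Most recent content at the top, as in the XLS output described above.
    return sorted(rows, reverse=True)
```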

Easy to see the content quickly. Even emojis!!!!
The 'fsCachedData' directory also contains media shared in the chat. In the next image the item line in yellow is the JSON chat file while the item line in blue is a picture of a pen that was shared in chat.

We got media too. UUID like names.
An additional Discord JSON file was located here:
/private/var/mobile/Containers/Data/Application/*UUID*/Documents/RCTAsyncLocalStorage_V1/manifest.json
The UUID in the path above is the same as the one for the location of the chats. It contains multiple items of interest like 'LAST_VIEWED_PATH', 'first_run_date_key', and even 'Emoji UsageHistory.' Here is how it looks after being converted to HTML.
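A quick way to review those key/value pairs without converting anything is to load the manifest with Python's json module. A small sketch (the key names shown earlier are examples from this test device):

```python
# Sketch: read manifest.json and return its key/value pairs
# sorted by key for easy review.
import json

def manifest_items(path):
    """Return the manifest's key/value pairs sorted by key."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    return sorted(data.items())
```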

Image truncated for visibility.
Conclusion

Be it via a jailbroken phone or alternate extractions that provide access to the iOS file system, knowing how and where to look for pertinent artifacts can be just as important as the extraction of the file system itself. Hopefully the paths, files and methodology here can speed up Discord artifact examinations.

I plan on taking a look at Discord for Linux soon. This will close out all platform analysis for the Discord service.

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Thanks!

Wednesday, August 1, 2018

JSON-ception and the need for mobile DFIR scripting courses

In my previous post I reviewed the Microsoft Translator (MT) app for Android. Usually I do reviews for apps that I have either worked with in the past or apps that are not supported by major forensic tools. In my limited experience it seems that recently many more apps are using XML/JSON data stores as opposed to the typical SQLite database structures we are used to, the ones that take up a big chunk of training time in mobile forensics courses.

In my particular analysis of the MT app I found that the relevant data was stored in an XML file whose 'string' tags contained JSON which itself referenced further JSON structures, some ending in a list. Jeez.

while true: list.append(JSON)
The data in my sample was limited to 4 OCR and 4 spoken phrase translations. It was small enough that it was easy to copy the relevant JSON from the XML file into a separate file that I then converted into HTML. It goes without saying that such output wasn't ready made for a report, and further copy-pasting would be needed to get there. It was obvious this would not scale with a large data set in the target XML file.

As I was thinking about this Jessica Hyde made a nice comment on my reuse of a JSON to HTML script I had put together. This spurred the following exchange:

Key items in large font.
Is the future of mobile forensics all JSON all the time? Might be. I hadn't even considered IoT (and thank goodness for Jessica's upcoming book on IoT forensics.) Are we as examiners getting ready for it? As trainers and subject matter experts, are we thinking of incorporating scripting and parsing how-to blocks into the training materials we develop?

With this in mind I tried coding a simple parser in Python to process this particular piece of incepted XML/JSON. The script parses the relevant JSON values within the XML and places them in a SQLite database. The database has two tables, one for the OCR content and another for the spoken phrases. The reason I decided to use SQLite as the end product of the script is that I like time formatting via SQL query, as seen here.
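The general pattern, pulling 'string' tags out of the XML, decoding their text with json.loads, and inserting the results into SQLite, can be sketched like this. The table layout and key handling here are placeholders for illustration, not the MT app's actual schema:

```python
# Sketch of the XML-with-embedded-JSON pattern: find <string> tags,
# try to parse their text as JSON, and store flat key/value rows in
# SQLite. The 'items' table and key handling are illustrative only.
import json
import sqlite3
import xml.etree.ElementTree as ET

def xml_json_to_sqlite(xml_text, conn):
    """Parse JSON hiding inside <string> tags and store it as table rows."""
    conn.execute("CREATE TABLE IF NOT EXISTS items (name TEXT, value TEXT)")
    root = ET.fromstring(xml_text)
    for tag in root.iter("string"):
        try:
            payload = json.loads(tag.text or "")
        except ValueError:
            continue            # this string tag held plain text, not JSON
        if isinstance(payload, dict):
            conn.executemany(
                "INSERT INTO items VALUES (?, ?)",
                [(k, json.dumps(v)) for k, v in payload.items()])
    conn.commit()
```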


It takes data that looks like this

And this is just a highlighted portion.
and turns it into the following two tables in the database:

Table 1: OCR

Notice translated text at the right.

Table 2: Phrases

Clean layout.

One does not have to be the ultimate programmer to achieve positive results on a case specific tasking. It is true that my script needs further refinement and proper error handling needs to be added in some sections, but for the purpose of getting the pertinent data out for my review it works. I know this because as an examiner I took the time to understand the data store formats, I analyzed the content for relevance, and I verified that my script output adequately represents the content of the data store in question. At the end of the day validation is king in all we do. 

My takeaway from this is that as DFIR instructors and examiners we need to focus more on foundational skills rather than just third-party tool usage. The first makes the second work to full capacity.

As time goes by and more apps depend on API returned JSON data the teaching on how to parse it will be as important as instructing folks how to join SQLite tables. With IoT possibly even more. 

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

PD:
In regards to Jessica's astute observation on using native tool support to view targeted data sets here is an example on how to do so in an Android emulator.