In my analysis of the MT app I found that the relevant data was stored in an XML file whose 'string' tags contained JSON, which in turn referenced further JSON structures, some of them ending in a list. Jeez.
while true: list.append(JSON)
The data in my sample was limited to 4 OCR translations and 4 spoken phrase translations. The data set was small enough that it was easy to copy the relevant JSON from the XML files into a separate file, which I then converted into HTML. It goes without saying that such output wasn't report ready, and further copy-pasting would be needed to get it there. It was obvious this approach would not scale to a large data set in the target XML file.
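To illustrate the layout described above, here is a minimal sketch of pulling JSON payloads out of the 'string' tags of an Android-style XML file. The sample XML, tag names, and JSON field names are my own assumptions for illustration; they are not the MT app's actual schema.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical sample mimicking the described layout: an Android-style
# shared-preferences XML file whose <string> tags hold serialized JSON.
SAMPLE_XML = """<?xml version="1.0" encoding="utf-8"?>
<map>
    <string name="ocrHistory">{"items": [{"text": "Hola", "translation": "Hello"}]}</string>
    <string name="phraseHistory">{"items": [{"text": "Gracias", "translation": "Thank you"}]}</string>
</map>"""

def extract_json_strings(xml_text):
    """Map each <string> tag's name attribute to its decoded JSON payload,
    skipping values that are not valid JSON."""
    root = ET.fromstring(xml_text)
    out = {}
    for node in root.iter("string"):
        try:
            out[node.get("name")] = json.loads(node.text)
        except (TypeError, ValueError):
            continue  # not every string value holds JSON
    return out

data = extract_json_strings(SAMPLE_XML)
print(data["ocrHistory"]["items"][0]["translation"])  # Hello
```

From here the decoded dictionaries can be walked programmatically instead of copy-pasted, which is what makes the approach scale.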
As I was thinking about this Jessica Hyde made a nice comment on my reuse of a JSON to HTML script I had put together. This spurred the following exchange:
Key items in large font.
With this in mind I tried coding a simple parser in Python to process this particular piece of incepted XML/JSON. The script parses the relevant JSON values within the XML and places them in a SQLite database. The database has two tables: one for the OCR content and another for the spoken phrases. I chose SQLite as the script's end product because I like formatting timestamps via SQL query, as seen here.
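The two-table approach and the SQL time formatting can be sketched as follows. The record fields, table columns, and sample timestamps are illustrative assumptions, not the script's actual schema.

```python
import sqlite3

# Illustrative records standing in for the JSON values pulled out of the
# app's XML; field names here are assumptions, not the app's real schema.
ocr_items = [{"text": "Hola", "translation": "Hello", "timestamp": 1568160000}]
phrase_items = [{"text": "Gracias", "translation": "Thank you", "timestamp": 1568160000}]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ocr (text TEXT, translation TEXT, timestamp INTEGER);
    CREATE TABLE phrases (text TEXT, translation TEXT, timestamp INTEGER);
""")
conn.executemany("INSERT INTO ocr VALUES (?, ?, ?)",
                 [(i["text"], i["translation"], i["timestamp"]) for i in ocr_items])
conn.executemany("INSERT INTO phrases VALUES (?, ?, ?)",
                 [(i["text"], i["translation"], i["timestamp"]) for i in phrase_items])

# Time formatting via SQL: SQLite's datetime() converts Unix epoch
# seconds into a readable UTC timestamp at query time.
row = conn.execute(
    "SELECT text, translation, datetime(timestamp, 'unixepoch') FROM phrases"
).fetchone()
print(row)  # ('Gracias', 'Thank you', '2019-09-11 00:00:00')
```

Keeping raw epoch values in the tables and converting them only at query time means the stored data stays unaltered while the report output stays readable.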
The script can be found here: https://github.com/abrignoni/MSFT-Translate-Android-Parser
It turns the data into the following two tables in the database:
Table 2: Phrases
One does not have to be the ultimate programmer to achieve positive results on a case-specific tasking. It is true that my script needs further refinement and proper error handling in some sections, but for the purpose of getting the pertinent data out for my review, it works. I know this because, as an examiner, I took the time to understand the data store formats, I analyzed the content for relevance, and I verified that my script's output adequately represents the content of the data store in question. At the end of the day, validation is king in all we do.
My takeaway from this is that as DFIR instructors and examiners we need to focus more on foundational skills rather than just third-party tool usage. The first makes the second work to full capacity.
As time goes by and more apps depend on API-returned JSON data, teaching how to parse it will be as important as teaching folks how to join SQLite tables. With IoT, possibly even more so.
As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.
P.S.:
Regarding Jessica's astute observation on using native tool support to view targeted data sets, here is an example of how to do so in an Android emulator.