Update 03/21/2019:
Script now decodes NSdata contents. See details at
https://thinkdfir.com/2019/03/21/updating-the-knowledgec-parser
Short version:
Python 3 script that exports compound bplists from a specific field in an iOS knowledgeC database, extracts the internal bplist and creates a triage html report of its contents. Two versions are provided, for iOS 11 and iOS 12, due to a slight difference in how the internal bplist is referenced within the external one that holds it.
The scripts can be found in the following location:
https://github.com/abrignoni/iOS-KnowledgeC-StructuredMetadata-Bplists
It's recommended that you load these plists into your viewer of choice to examine them directly.
Long version:
Like most DFIR things lately this one also started with Phill Moore. He reached out to the community on the following:
Since I've been on a data parsing binge lately I was happy to try and assist. As I was reading the replies to Phill's tweet I was reminded of how, of all the data structures utilized by Apple products, bplists are one of the most prevalent. So prevalent that they can be stored within SQLite databases and can themselves contain other bplists. Total data storage inception. At this point there was no doubt...
Thanks to kind souls like
@i_am_the_gia,
@ScottVance, and others who will remain anonymous, we got test data to see if we could do the following:
- Export the bplists intact from the SQLite DB.
- Extract a bplist (clean) from the bplist that holds it (dirty).
- Access the clean bplist and create a file that could be used in forensic tools for analysis.
- Generate a triage report of clean bplist data contents to easily evaluate relevance before importing to forensic tools.
There are many tools that let us view the contents of bplists, but when these are nested in such a way getting to the internal content requires some manual work. As examiners the world over know, manual work is just the universe telling you there is a need to automate and scale.
The database selected for our testing was the iOS knowledgeC database. I highly recommend everyone reads
Sarah Edwards' article on it, THE article on it. By looking at the Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION field within the ZSTRUCTUREDMETADATA table one can see how these bplists look when nested.
Notice how there are two bplist headers in the same SQLite database content.
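A quick way to check a blob for this kind of nesting is to look for more than one bplist header in it. The following is only a minimal sketch: the function name is hypothetical, and it assumes version 00 binary plists, which start with the bytes "bplist00". A chance match inside unrelated data is possible, so treat a hit as a hint and not as proof.

```python
BPLIST_MAGIC = b"bplist00"  # header bytes of a version 00 binary plist


def bplist_header_offsets(blob: bytes) -> list:
    """Return the byte offset of every bplist header found in blob."""
    offsets = []
    pos = blob.find(BPLIST_MAGIC)
    while pos != -1:
        offsets.append(pos)
        # Keep searching right after the start of the previous hit.
        pos = blob.find(BPLIST_MAGIC, pos + 1)
    return offsets
```

Two or more offsets for a single field value suggest a bplist held inside another bplist.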
Export
Exporting the data was a straightforward action: a regular SELECT, assigning the content of the field to a variable that is then written to a file. For this to work the receiving file has to be opened for binary content. As seen in the next image the extracted bplists are named using the following convention:
- D/C = Dirty or clean. There is nothing wrong or dirty about the shell bplist. "Dirty" is just shorthand to contrast it with the internal bplist, which I called clean because after extraction it no longer has its bplist shell.
- Z_PK = The field name in the table that contained the primary key for the row that contained the exported bplist.
- Numeric value = Integer contained in the Z_PK field for the row that contained the exported bplist.
By establishing this filename convention the examiner can easily backtrack to the proper row from the target table if additional fields are of interest or if there is a question on the validity of the exported bplist.
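The export step can be sketched roughly as follows. This is not the code from the repository, just an illustration using Python's built-in sqlite3 module. The table and field names are the ones mentioned above. The function name, the underscore separators and the .bplist extension are assumptions made for the example.

```python
import sqlite3
from pathlib import Path

# Table and field names as given in this post.
QUERY = """
    SELECT Z_PK, Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION
    FROM ZSTRUCTUREDMETADATA
    WHERE Z_DKINTENTMETADATAKEY__SERIALIZEDINTERACTION IS NOT NULL
"""


def export_dirty_bplists(db_path, out_dir):
    """Write every non-NULL blob to out_dir and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    conn = sqlite3.connect(db_path)
    try:
        for z_pk, blob in conn.execute(QUERY):
            # Assumed naming: D (dirty), field name, primary key value.
            target = out / f"D_Z_PK_{z_pk}.bplist"
            target.write_bytes(blob)  # binary write, content left intact
            written.append(target)
    finally:
        conn.close()
    return written
```

Working on a copy of the database, never the original evidence, is the safer choice when trying something like this.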
Extraction
Now that we had exported the bplists we had to get to the clean ones in an automated way. Thanks to
@firmsky I was reminded of an article by
Sarah Edwards on the use of ccl_bplist for the parsing of NSKeyedArchiver bplists in Python. These bplist objects are beyond the scope of this blog but just know that I am grateful that Alex Caithness came up with this module that saved me from experiencing a painful headache. You can find this great module here:
https://github.com/cclgroupltd/ccl-bplist
With this module in hand and some test data we figured out that:
- In iOS 11 one has only to deserialize the bplist at the root which gives you the clean bplist.
- In iOS 12 one has to deserialize the bplist at the NS.data level since the clean bplist is contained within it.
The previous was a long way of saying that in iOS 11 the following key ccl_bplist call
CleanBplistFile = ccl_bplist.deserialise_NsKeyedArchiver(DirtyBplistFile)
would give you the clean bplist ready to write out, whereas in iOS 12 the following code
ns_keyed_archiver_objg = ccl_bplist.deserialise_NsKeyedArchiver(DirtyBplistFile)
CleanBplistFile = (ns_keyed_archiver_objg["NS.data"])
would give you the clean bplist after accessing the NS.data portion. It would be good to have further confirmation that these types of incepted bplists truly vary per iOS version and that it is not only a crazy coincidence of the data sets we had available.
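Until that confirmation arrives, a version-agnostic cross-check can help. The sketch below does NOT use ccl_bplist and is not how the scripts work. It is a standard-library alternative built on plistlib (Python 3.8 or later, which is when UID support was added). It assumes the dirty bplist is an NSKeyedArchiver archive whose "$objects" list holds the clean bplist either directly or as a bytes value inside a dictionary such as an NS.data entry. That layout is an assumption based on the two cases described above, and the function name is hypothetical.

```python
import plistlib

BPLIST_MAGIC = b"bplist00"  # header bytes of a version 00 binary plist


def find_embedded_bplists(dirty: bytes) -> list:
    """Return every bytes value in the archive that looks like a bplist."""
    outer = plistlib.loads(dirty)  # Python 3.8+ handles NSKeyedArchiver UIDs
    found = []
    for obj in outer.get("$objects", []):
        # Case 1: the data object sits directly in the $objects list.
        candidates = [obj]
        # Case 2: the data object is a value inside a dictionary (e.g. NS.data).
        if isinstance(obj, dict):
            candidates = list(obj.values())
        for item in candidates:
            if isinstance(item, bytes) and item.startswith(BPLIST_MAGIC):
                found.append(item)
    return found
```

If this returns exactly one item for a given dirty bplist, comparing it byte for byte against the clean file written by the script is a cheap validation step.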
Originally the purpose of this exercise was to find a way to easily extract the clean bplists in order to import them into forensic tools with minimum effort and no manual extraction. It became clear that a triage report was needed when one of my data sets contained 1565 extracted bplists. Be aware that the script developed will keep both the dirty and clean bplists in separate folders within a timestamped directory. In this way one can backtrack the whole process for validation purposes.
Reporting
With a triage report that shows the content one can decide which set of bplists should be drilled down more or just retained due to work or case relevance. The fields on the html formatted report are the following:
- Filename = Same format as stated before.
- Intent Class = This is a value taken from a field in the table where the dirty bplists were stored in the knowledgeC database. This value is key because it gives you a clue about the purpose of the contents of the bplist.
- Intent Verb = Another value taken from one of the table fields. Further description of bplist purpose and/or type of content.
- NSstartDate = Time stamp.
- NSsendDate = Time stamp.
- NSduration = Float value.
- NSdata = Binary data store of activity.
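A report with these columns can be put together with nothing more than string formatting. The sketch below is an illustration and not the report code from the scripts. The column names are the fields listed above, while the function name and the bare table markup are assumptions. Escaping each value matters here because the extracted content is untrusted and may contain characters that a browser would treat as markup.

```python
from html import escape

# Column names as listed in this post.
COLUMNS = ["Filename", "Intent Class", "Intent Verb",
           "NSstartDate", "NSsendDate", "NSduration", "NSdata"]


def render_report(rows):
    """Build a minimal HTML table; rows is a list of dicts keyed by COLUMNS."""
    head = "".join(f"<th>{escape(c)}</th>" for c in COLUMNS)
    body = []
    for row in rows:
        # Missing fields become empty cells; every value is escaped.
        cells = "".join(
            f"<td>{escape(str(row.get(c, '')))}</td>" for c in COLUMNS
        )
        body.append(f"<tr>{cells}</tr>")
    return f"<table><tr>{head}</tr>{''.join(body)}</table>"
```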
Since the report is a triage report the NSdata values are just a string representation of the binary values in them. Although they contain many non-human-readable characters, it is pretty easy to key in on the ASCII values that one can read. The report is a testament to my ignorance on how to convert these values to something more pleasing to the eyes, but for triage purposes, helping the examiner decide what to process further with a forensic tool, it is perfect. Some of the values can be cleaned up a little with UTF-8 decoding but many, especially those that contain a lot of data, cannot.
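One way to make such values easier to scan is to keep only the runs of printable ASCII, much like the classic strings utility does. This is a sketch of that idea and not what the scripts do. The function name and the minimum run length of 4 are arbitrary choices for the example.

```python
import re


def printable_runs(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII (space through tilde) of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]
```

Anything that is not plain ASCII, such as UTF-16 text or packed numeric values, is dropped by this approach, so it complements the raw view and does not replace it.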
The next picture is an example of the report format. The particular data in the report was shared on the condition that it would not be shared further, hence the redaction.
It is up to the reader to test it out and discover for herself what awesome data resides in these structures. Things that are, things that were in one form and changed to another, and things that are no more.
Future work
I was surprised by the amount of data contained in just one field from one table in one database. I can only imagine what relevant data resides in incepted, SQLite-held bplists in other tables and other databases. The next step is to evolve the script so it can extract any bplist blob from any SQLite table and generate dirty and clean instances as needed, with complementing reports for triage. A key part is to better understand how the NSdata fields work and to see if anyone in the community knows how to parse them. If only the days had more hours and our bodies less need for sleep.
As always I can be reached on twitter @AlexisBrignoni and email 4n6[at]abrignoni[dot]com.