Monday, October 22, 2018

DFIR tool review: Cellebrite Virtual Analyzer


Virtual Analyzer (VA) is an Android emulator that works within Cellebrite Physical Analyzer (PA). It provides a convenient way to look at data logically (no deleted/recovered items) as the user would have seen it through the app itself. This blog post will briefly detail my initial interactions and reactions regarding the use of the tool.

Installation notes will be at the end of the post.

Testing
For this test I used one of my Android physical extractions that contained some sample data used for previous app reviews in this blog. I opened PA and pointed it to the physical extraction and let it parse. When it was done the VA option was available under the tools menu.

After the initial splash screen shown above, the tool lets you choose up to 5 apps to emulate at one time. For my initial run I selected the following 3 apps: Slack, LA Fitness Mobile and Microsoft Translator.

Select your apps.
The interface was clean and consistent with the PA screens I am already used to. After selection the next screen shows how much data the emulator will work with per app.
Notice the 'good to know.'
Processing took a little bit of time but even so it was still way, way, way quicker than doing a similar analysis by 'hand' as I have done here.


As VA starts you will see a progress bar show up on the lower right corner of the screen. When completed a window opens that has the emulated system ready for user interaction.


The selected apps will show up on the home screen and one has only to click them as one would in any Android system. By looking at the settings I noticed that the emulation was done via the Andy Android emulator.


Going back to the home screen I interacted with the apps. All three worked and presented the data I expected to see based on my previous non-emulated analysis. Here is how it looked per app. A link to the non-emulated analysis is provided as a comparison.

Microsoft Translator

Emulated analysis image for Microsoft Translator:
All the options in the app worked.
LA Fitness Mobile

Emulated analysis image for LA Fitness mobile:

In this instance looking at check-in data through the emulated app was harder and slower than looking at the directly extracted data, since the latter requires no screen manipulation. Still the analysis validation provided by this tool is immense. A long Excel-like report with dates and data can be validated by looking at a portion of it through the emulated app itself.

Slack

Emulated analysis image for Slack:


I really liked being able to see the images properly placed within their corresponding Slack messages. I have yet to find a way to do such image placement manually by only looking at the content of the databases themselves. Emulation solves this problem.

To finish my testing I looked at two more apps, Discord and Nike Run. Discord behaved like the previous apps; the emulator showed me the content as expected.

Discord

Emulated analysis image for Discord:

As Discord seems to be showing up more and more in colleagues' case work, being able to access it in this way is invaluable. In some instances emulation might be the only way of presenting the data absent an actual manipulation and screen recording of the device.

Nike Run

Emulated analysis image for Nike Run:

Womp, womp.
The emulation for the Nike app kept crashing and did not work. Using PA I looked at the database and the required data was there.

Time, GPS, speed, etc...
In this instance a database extraction paired with a custom SQL query was the ideal analysis method. It seems the app itself requires a connection to an external server before it initiates. VA, being a forensic tool, does not allow such a thing to happen. Permitting the host system that runs PA/VA and the emulator to connect to the Internet (it shouldn't, as a matter of course) can raise issues regarding the illegal search of a remote system. Always be aware of what you do, how you are doing it and under what authority you are operating. Such things cannot be overlooked.

Conclusion 
There is no ultimate all-in-one forensic tool and most likely there will never be one. As examiners we need to exercise proper judgment on how we go about our work and what the ideal tools will be for a particular case or work scenario. It seems to me that Virtual Analyzer is poised to become one of those essential tools in our forensic toolbox.

If we could only have something similar for iOS....

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Installation Notes
Installation was easy but required a little tinkering. After downloading the executable from the Cellebrite portal I mistakenly tried to install it on a system whose PA version was older than the current, required version 7.10. Instead of getting an 'update to current version' error I got the following:

Fair enough, it was my fault initially...
After noticing my mistake I upgraded to the latest version and the software almost installed. After some emails with support I found out that the Android emulator requires VMware Player to run. The error was caused by my previously installed VMware Workstation 12.5.9 not being compatible. Currently VA works with VMware Player versions 12 to 14, and the developers are working on a solution to have VA work with the latest release, version 15, of VMware Player. Since I couldn't uninstall Workstation 12 for reasons, I moved to another system and did a clean install of all the required software. This was my procedure:

  1. Download and install VMware Player version 14.
  2. Download and install PA.
  3. Download and install VA.

Quick DFIR review - LA Fitness Android app

Short version

The LA Fitness app for Android is an example of how a non-messaging application can have plenty of relevant data if only one cares to look.
  • Gym check-ins: club name, date, time and location. Sample data covered a year's worth of check-ins.
  • Pin code for the app.
  • Billing information to include the last 4 digits of the credit card in use.
  • Registration information: membership start date, name, address, phone, email, home club.

Long version

LA Fitness is a gym club with more than 700 locations across the United States and Canada. The Android app has more than a million downloads.

LA Fitness in the Google Play Store
A lot of case work involves finding GPS coordinates, message content and media of interest. The app does not seem to have anything to do with any of those, and still as a user I've noticed that some of the information it shows me could be of relevance in a case. Is there information that it does not show me? What could it be?

As always I've tried to research this topic using open source tools.

Testing platform

Details of hardware and software used can be found here.

Extraction and processing

Magnet Forensics Acquire - Full physical extraction. 
Autopsy 4.8.0

Analysis

The LA Fitness app data folders can be found here:
/userdata/data/com.lafitness.lafitness/

Interestingly enough the relevant information resided in the files folder, not in databases. A look at the contents of databases folder shows that most SQLite files seem to manage data related to multimedia files and menus used by the app. None seem to track any user generated requests or actions. More testing needed of course.

From the files directory I identified four files containing user related content.

1. Gym check-ins
/userdata/data/com.lafitness.lafitness/files/MembershipCheckins
Data is formatted as:

GYM-NAME - Street address
XX/XX/XXXX 00:00:00 AM/PM

The GYM-NAME is the same as the city where the facility is located.
In my sample data the check-ins go back as far as a year.
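The two-line record layout above lends itself to a small parser. What follows is only a sketch under assumptions: that records strictly alternate between a location line and a timestamp line, that the date is month/day/year, and that the time uses a 12-hour clock. The function name and the sample record are made up for illustration.

```python
from datetime import datetime

def parse_checkins(text):
    """Pair each 'GYM-NAME - Street address' line with the timestamp line after it."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    checkins = []
    for location, stamp in zip(lines[0::2], lines[1::2]):
        # Split only on the first ' - ' so addresses containing dashes survive.
        club, _, address = location.partition(" - ")
        # Assumed layout: MM/DD/YYYY hh:mm:ss AM/PM (the month/day order is a guess).
        when = datetime.strptime(stamp, "%m/%d/%Y %I:%M:%S %p")
        checkins.append({"club": club, "address": address, "timestamp": when})
    return checkins

# Made-up record in the format described above, not real case data.
sample = "MIAMI - 123 Example St\n10/01/2018 06:15:00 PM\n"
for row in parse_checkins(sample):
    print(row["timestamp"].isoformat(), row["club"], row["address"])
```

Validate the month/day order against a known check-in before relying on the parsed dates.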

2. Pin code for app access 
/userdata/data/com.lafitness.lafitness/files/4-digitpin
Used for a screen app lock. Cleartext file.

3. Billing information
/userdata/data/com.lafitness.lafitness/files/AccountProfile

It is of note that the billing name can be different from the name on top. It also keeps the last 4 digits of the credit card number.

4. Customer information
/userdata/data/com.lafitness.lafitness/files/AccountProfile
Same fields as billing information with the addition of a few extra fields regarding the type of access the client has to gym services like personal training, guest passes and premium status.

Conclusion

From the Android third party video player database that contained proof of file access to the gym application that had the pin code needed to access a previously out of reach data store, new data can be found when and where we least expect it. As examiners it is true that our time is short and the cases plenty but to be at our most effective we need to think more about how we can streamline our workflows. Lately I've been getting great results from parsing my mobile extractions with Autopsy for fast index searching within SQLite databases while making it a habit of going through the databases section in Physical Analyzer looking at non-decoded data stores. As part of this streamlining of processes I will review in the next blog post the new Virtual Analyzer by Cellebrite.

 As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

Wednesday, October 17, 2018

Update to Android Discord app "missing" values

On 8/16/17 I posted a brief explanation on how to decode the last value in a line of chat text from the Discord app for Android.

https://abrignoni.blogspot.com/2017/08/discord-app-missing-values-not-missing.html

Today I had an email exchange with Patrick Mooney on why the app might behave the way it does.
I believe what is actually happening is that Discord is not translating the character set but rather is setting the most significant bit on these fields to annotate that it is the end of the field. In the data I've looked at the last characters are 'translated' up a constant value of 0x80, which is actually a simple setting of the most significant bit of that 8-bit value. ASCII doesn't use that most significant bit so it makes it a natural delimiter that also does not require the use of additional characters or storing field lengths & counting characters.
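The explanation above can be sketched in a few lines of Python. This assumes the fields are plain 7-bit ASCII, as described; any field holding non-ASCII text would need different handling. The function names are mine, not Discord's.

```python
def decode_field(raw):
    """Clear the most significant bit of the final byte, then decode as ASCII."""
    if not raw:
        return ""
    last = raw[-1] & 0x7F  # 0x80 flags the end of the field; mask it off
    return (raw[:-1] + bytes([last])).decode("ascii", errors="replace")

def split_fields(blob):
    """Split a run of bytes into fields, ending each at a byte with the high bit set.

    Trailing bytes that never reach a terminator byte are ignored.
    """
    fields, start = [], 0
    for i, b in enumerate(blob):
        if b & 0x80:
            fields.append(decode_field(blob[start:i + 1]))
            start = i + 1
    return fields

# 'o' is 0x6F; with the high bit set it is stored as 0xEF.
print(decode_field(b"hell\xef"))
```

Because the terminator is carried by the last character itself, no length prefix or separator byte is needed, which matches the reasoning in the email.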

Good stuff. As more folks encounter Discord in their case work I am having the pleasure of hearing from them. Most times I give additional background regarding my posts on the topic but in this case I learned something new which I get to share. Thanks Patrick!

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.



Sunday, October 14, 2018

Github repository for SQL queries used in digital forensics

As I started to share some of the queries I use in my analysis of different apps I noticed how much screen space these take in a blog post. If the analysis requires more than one query or one query that joins many tables with many relevant fields the whole thing is pretty much unreadable when written as part of the regular text of a post. To solve this I made the following repo:

https://github.com/abrignoni/DFIR-SQL-Query-Repo

The idea is to house those SQL queries by platform and application. The readme of these queries will have an accompanying explanation on usage or a link to a blog post that does so. This has the benefit of making it easier to update the queries without having to go back and find the original blog post. It is also easier to search for a particular query per application.

When I write a blog post I just have a link to the proper location in the repo where the reader can go and look at the query. GitHub is awesome because it does reserved word highlighting on its own. Super easy to read and no more query clutter in the post.

Only mobile apps for now.

If anyone wants to contribute some of their queries please do so. If anyone knows of a more widely known way of sharing DFIR related SQL queries please let me know. Be it in my repo or somewhere else the idea is to make these queries available and have them be easy to search and maintain.

As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.


Finding Slack app messages in iOS

Short version

The Slack app for iOS keeps message related data in the following database location. Be aware that in the path *UU-ID* should be replaced by the application identifier and *DB-ID* by the identifier of the Slack work-space of interest.

/private/var/mobile/Containers/Data/Application/*UU-ID*/Library/Application Support/Slack/*DB-ID*/Database/main_db
Details about these IDs are found below in the long version section of the blog post.

The following queries at https://github.com/abrignoni can be used as templates to extract relevant information from the database.
The following database is a repository of CFURL links, some in bplist format, in use by the application.
/private/var/mobile/Containers/Data/Application/*UU-ID*/Library/Library/Caches/com.tinyspeck.chatlyio/Cache.db 
Within the cfurl_cache_blob_data table, in one of the request_object fields, the Slack login/username and password for the user of the application was located in clear text. Details in the long version of the blog post below.

Images in use by the application, including pictures sent as attachments as well as avatars, can be found in the following location:
/private/var/mobile/Containers/Data/Application/*UU-ID*/Library/Application Support/Library/Caches/default/com.hackemist.SDWebImageCache.default/
Currently I know of no way to reverse the file names to their original form as found by the Files & attachments query.

Long version

Slack is one of the more successful corporate/work messaging apps in the world with over 8 million daily active users and 4 million paid users.

If you haven't seen it in your cases you will soon.
Slack for iOS is the mobile version for Apple products. In this blog post we will examine where the messages are located, how they can be extracted, and what other items of interest can be found in the app directories.

Extraction

For a detailed explanation on how to extract user data file system level information from a jail-broken device see the following post. The steps, even though centered on another app, will be applicable to most if not all current iOS applications. Here is the abridged version for my Slack app analysis. Not to belabor the point but if the short explanation that follows does not make sense please see the full example in the following post.

0. Looked up the bundle id for Slack here.

Would have never guessed the id.

1. Connected to device using SSH.

2. Made a local copy of the  applicationState.db located at /private/var/mobile/Library/FrontBoard/


3. Located the Slack id number in the application_identifier_tab table in the aforementioned database.

4. Used the application_identifier number to look up the value blobs in the kvs table. Exported the blob which contains a bplist. The contents can be seen by using the hex viewer function in your SQLite viewer tool of choice or after exporting.

5. Opened the extracted bplist with a bplist viewer. The Slack directory UU-ID can be seen.
The long identifier in the picture will be different from the one in your device.

6. Copied the Slack application folder as identified in the bplist.
Notice the long ID number used as the directory name.

The copied folder contains the relevant files for our objective.
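Steps 3 to 5 can also be scripted against the local copy of applicationState.db. The sketch below rests on assumptions that should be checked against your own copy: that application_identifier_tab has id and application_identifier columns, that kvs has application_identifier and value columns, and that the bundle id is com.tinyspeck.chatlyio (taken from the cache path mentioned in this post). The function names are hypothetical.

```python
import plistlib
import sqlite3

def app_state_blobs(con, bundle_id):
    """Return the kvs value blobs stored for one bundle id."""
    # Column names below are assumed; confirm them in a SQLite viewer first.
    rows = con.execute(
        "SELECT kvs.value FROM kvs "
        "JOIN application_identifier_tab AS a "
        "ON kvs.application_identifier = a.id "
        "WHERE a.application_identifier = ?",
        (bundle_id,),
    ).fetchall()
    return [r[0] for r in rows if r[0] is not None]

def try_plist(blob):
    """Parse a blob as a plist when possible; return None otherwise."""
    try:
        return plistlib.loads(blob)  # handles XML and binary plists
    except Exception:
        return None
```

Usage, always against a working copy and never the original evidence: open the copy with `sqlite3.connect("applicationState.db")`, call `app_state_blobs(con, "com.tinyspeck.chatlyio")`, and pass each blob to `try_plist`. A blob may hold a plist nested inside another plist, so a bplist viewer remains useful for the final read.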

Analysis of message related databases

The directory structure of the Slack application is as follows:


Under /Library/Application Support/Slack/ are folders named for the team ID numbers. For my testing team the number is TCJRXQDB1. Every team ID folder holds the relevant database for that team (/Database/main_db). In Android the actual database for the app has the ID as its name.
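When a device has more than one work-space, the team folders can be enumerated from the copied application folder. A minimal sketch, assuming the layout described above (one folder per team ID, each with Database/main_db inside); the function name is mine.

```python
from pathlib import Path

def find_team_databases(app_root):
    """Map each team id folder name to its Database/main_db file, if present."""
    slack_dir = Path(app_root) / "Library" / "Application Support" / "Slack"
    found = {}
    if not slack_dir.is_dir():
        return found
    for team_dir in sorted(slack_dir.iterdir()):
        db = team_dir / "Database" / "main_db"
        if db.is_file():
            found[team_dir.name] = db
    return found
```

Pass it the path of the copied application folder (the one named with the long UU-ID) and it returns one entry per team database found.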

Using a SQLite viewer one can see that the main_db database has 21 tables.



For my analysis I formulated five queries that identify what I believe to be pertinent information in most cases. It is of note that these queries are only templates. Each table has multiple fields and some of these might be relevant to your case and not be selected in the queries below.
Sample of query output.

Some notes on the values encountered for the extracted messages query:

1. User IDs start with the letter U, Channel IDs start with the letter C and Direct Message channel IDs start with the letter D. The linking of user messages to channels and channel metadata took some work since public channel ids, like General or Random, reside in a column of their own in one of the tables. Hence the need for the OR condition in the SELECT clause.

2. The ZSLKMESSAGE table had repeated rows in it. Not sure why. I used a DISTINCT clause at the beginning to clean it up. Be aware that you should look at the data manually first and then try the query both with and without the DISTINCT clause. I believe it is important to have a clear understanding of the data and the state in which it was found.

3. The query sets the time to local time. Adjust as needed for your purposes.

4. The ZFILEIDS field indicates that the message was sent with a file attached to it. If a file was sent the field will have a value in it. The actual sent data might reside in the ZSLKFILE table. If it is a picture it will not. The picture will be located in the following folder:
/private/var/mobile/Containers/Data/Application/*UU-ID*/Library/Application Support/Library/Caches/default/com.hackemist.SDWebImageCache.default/
The filenames bear no relation to the ones found in the ZSLKFILE table. Currently I know of no way to reverse the filenames in the cache to the original filenames.
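One way to follow the advice in note 2 is to count rows with and without DISTINCT before trusting either result. The sketch below is generic: the table and column names are passed in by the examiner, and nothing in it reflects the actual Slack schema. Note that the comparison only makes sense over the columns a query selects, since a primary key column would make every row distinct.

```python
import sqlite3

def _quote(name):
    """Quote an SQL identifier; identifiers cannot be bound as parameters."""
    return '"' + name.replace('"', '""') + '"'

def row_counts(con, table, columns):
    """Return (all rows, distinct rows) over the chosen columns of a table."""
    cols = ", ".join(_quote(c) for c in columns)
    src = _quote(table)
    total = con.execute(
        f"SELECT COUNT(*) FROM (SELECT {cols} FROM {src})"
    ).fetchone()[0]
    distinct = con.execute(
        f"SELECT COUNT(*) FROM (SELECT DISTINCT {cols} FROM {src})"
    ).fetchone()[0]
    return total, distinct
```

If the two numbers differ, document both and look at the repeated rows by hand before deciding which output goes in the report.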
Sample of query output.
Some notes on the values encountered for the user data query:

1. Notice the ZTEAMID value. It is the current working team. In iOS it is used to name the directory for the team database. In Android it is how the database itself is named. Additional relevant data will be discussed in the channels data query portion below.

2. The ZADMIN and ZOWNER fields identify the user or users that hold those designations. Boolean values.
Sample of query output.
Some notes on the values encountered for the files & attachments query:

1. If the content of the message attachment is available it will reside in the ZPREVIEW field. As explained previously images themselves are not found in the table. They reside in another directory named in a way that has no obvious relation to the original filename as stated in the query.

2. If proper consent/legal authorization is obtained the images can be downloaded from the server using the permalink URL. Validation via username and password will be needed. More on login username and password below.
Sample of query output.
Some notes on the values encountered for the work-space data query:

1. The ZDOMAIN value is the one used to name and connect to the work-space. In this instance it would be dfirtesting.slack.com. Internally the work-space is identified by the ID located in the ZTSID field.
Sample of query output.
Some notes on the values encountered for the channels data query:

1. There are tons of interesting timestamps in this table. Among them are the time the channel was created, when a topic and/or purpose for the channel was set, the last message in the channel and the last read of the channel.

2. The purpose and topic strings are found under ZPURPOSETEXT and ZTOPICTEXT respectively.

3. The ZSHAREDTEAMIDS field has the current work-space listed as a list of one item. Haven't tested the shared teams functionality but my educated guess is that the list would be populated with the respective IDs from other work-spaces. This will require further testing.

Analysis of non-message related databases

The following database is a repository of CFURL links, some in bplist format, in use by the application.
/private/var/mobile/Containers/Data/Application/*UU-ID*/Library/Library/Caches/com.tinyspeck.chatlyio/Cache.db 
    Within the cfurl_cache_blob_data table, in one of the request_object fields, the Slack login/username and password for the user of the application was located in clear text. See the image.

    Username, password and work-space ID.
    The content of this blob field starts as a bplist. Exporting the content and using a bplist viewer did not show me the username and password. Using the hex viewer to look at the contents did. One way of finding this data fairly quickly is by indexing your case with Autopsy (or any other forensics suite) and looking for __CFURLRequestNullTokenString__. The username and password will be fairly close to it.
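Outside of a forensic suite, the same search can be approximated on an exported blob. The sketch below looks for the marker string and lists printable ASCII runs around it; the 512-byte window and 4-character minimum are arbitrary choices of mine, and the function name is hypothetical.

```python
import re

MARKER = b"__CFURLRequestNullTokenString__"

def strings_near_marker(blob, window=512, min_len=4):
    """Return (offset, printable ASCII runs) for each marker hit in the blob."""
    hits = []
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    start = blob.find(MARKER)
    while start != -1:
        lo = max(0, start - window)
        hi = min(len(blob), start + len(MARKER) + window)
        runs = [r.decode("ascii") for r in pattern.findall(blob[lo:hi])]
        hits.append((start, runs))
        start = blob.find(MARKER, start + 1)
    return hits
```

Read the exported request_object blob with `open(path, "rb").read()` and pass the bytes in. The output is only a list of candidate strings; the hex view is still what confirms which ones are the username and password.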

    Username, password and work-space ID.

    As always these databases might contain deleted data in their respective WAL files. Use forensic SQLite tools to take that into account.

    As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.

    Monday, October 8, 2018

    Quick DFIR analysis using a sandbox

    The concept of a sandbox is not new in computing. Sandboxes are found everywhere. The browser you are reading this with uses a sandbox in some way or fashion. In malware analysis sandboxing is one of the most useful ways to understand how such software works while preventing the host system from getting infected by it. Using virtual machines (VMs) for digital forensic work can be considered a form of sandboxing. This post will discuss a simple and quick way of looking at Windows applications with the use of a sandbox for digital forensics work.

    The case

    Let's say we just received a system where it is alleged that peer-to-peer (P2P) software was used to download contraband and share intellectual property. You decide to look at what P2P software was in use to determine what forensic artifacts can be recovered. The forensic tool used does not know how to parse some of the potentially evidentiary artifacts you found. Others, like the 'AC_SearchStrings.dat' artifact, have content consistent with how they are named. Do we really know the contents of the file are searches done by the user? How can we verify?  What about the items the tool does not understand and whose contents are not in clear text?

    The substitution process

    One way of looking at data is by using the tool that created it as the viewer of it. If you can determine what software and version was in use, or recover the installation file from the target system, you can install the software in your analysis workstation. It becomes trivial to substitute the files created by the just installed program with artifacts recovered from the target system. The idea is that when the program re-starts it will read the substituted artifacts and show you the state of the program as it was on the target system at acquisition time. Neat, right?

    Be aware that installing software from a case on your work system can cause problems. Here is where a sandbox comes in.

    Sandbox vs VMs

    If you have a VM available you can install the software, do the substitution, and see what you get. After you are done you can revert to a previous snapshot and be ready for the next case. Nice. But ask yourself, how do you know you got everything? Do you know what changes the software made to the system in its entirety? Do you know if you actually substituted everything there is to substitute about the software? There are software tools you can use to keep track of changes made to the system by the targeted software but they come with increased complexity. A sandbox might be a better alternative for this particular scenario.

    The case in toto

    One of the P2P software packages in use for this case was eMule. You were able to extract the eMule executable from the downloads folder as well as the corresponding configuration files in use at the time of extraction. Instead of using a VM we will use a sandbox. One of the oldest and best known sandbox implementations is the Sandboxie software for Windows.


    After installation the program will have created the following directory structure:

    C:\Sandbox\{User Account Name}\DefaultBox

    Now we are ready to start.

    1. Install the software by right-clicking on the executable and selecting the 'Run Sandboxed' option.

    4th option from the top.
    2. Notice how all the files the program writes to disk are contained within the sandbox.

    Registry files!!!!

    3. Since you will only find files associated with the program in the sandbox it is easy to determine if there are any other locations related to it where evidence may be found in the target device. Also you can see what locations might be needed to successfully complete artifact substitution for viewing.

    AppData - Substitution happens here.

    4. We are going to focus on the 'AC_SearchStrings.dat' artifact that we believe contains user generated search terms. After the installation there are no search terms available. This makes sense since it is a clean install of the software in the sandbox.

    You can tell it is sandboxed.

    Sandboxie gives you two visual cues so you know that the software is running from the sandbox. The first one is the yellow border around the whole software window. I cropped it for the sake of space but it is clearly visible above. Also notice the second visual cue which is the [#] symbols surrounding the application name.

    Now let's view the contents of the 'AC_SearchStrings.dat' artifact before substitution.

    Obvious search terms are obvious. Still need to confirm.

    5. Great. Now let's substitute the contents of the AppData folder in our sandbox installation with those extracted from the target device. First shut down the sandboxed application. Here is how the sandboxed installation looks before substitution.

    Before with a few files.

    This is how it looks after substitution.

    After with a bunch of files.

    6. Run the program from the sandbox.

    Right-click. Run Sandboxed.

    7. Look at the search history portion of the application.

    The program is your viewer.

    Few things make as big an impact on stakeholders as seeing the data as the user would have seen it in the application that used or generated it.

    8. Go through the application and see what else you can find. In this particular instance one can find historical statistics of how much data was transferred and how many files completed transfer, among many other potentially important data points.

    9. After preserving the contents of the sandbox as part of your work you can delete it by simply right-clicking and selecting 'Delete Contents' from the Sandboxie icon in the system tray.
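The substitution in step 5 can be scripted so that the clean-install state is preserved and every copied artifact is hashed. This is a sketch under assumptions: the three folders are supplied by the examiner (the exact AppData location inside the sandbox is not spelled out here), the sandboxed program is already shut down, and the artifacts are working copies exported from the image. Function names are hypothetical.

```python
import hashlib
import shutil
from pathlib import Path

def sha256_of(path):
    """Hash a file in chunks so large artifacts do not need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

def substitute_artifacts(evidence_dir, sandbox_dir, backup_dir):
    """Back up the clean sandboxed folder, then copy exported artifacts over it."""
    evidence_dir, sandbox_dir, backup_dir = map(
        Path, (evidence_dir, sandbox_dir, backup_dir)
    )
    # copytree refuses to overwrite an existing backup, which is intentional:
    # the 'before' state should be captured exactly once.
    shutil.copytree(sandbox_dir, backup_dir)
    copied = {}
    for src in sorted(evidence_dir.rglob("*")):
        if src.is_file():
            rel = src.relative_to(evidence_dir)
            dest = sandbox_dir / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            # Hash the copy so it can be matched against the exported original.
            copied[rel.as_posix()] = sha256_of(dest)
    return copied
```

The returned dictionary of relative paths and SHA-256 values can go straight into case notes as a record of what was substituted.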

    Benefits of sandboxing

    This analysis confers the following benefits:

    1. All writes done by the program to storage are kept within the sandbox.

    2. No need for third party software tracking of program interactions with storage.

    3. If the program has additional bloatware it will not affect the host system you use for analysis.

    4. It makes it easy to identify locations where configuration and file substitution will be needed for this type of application as a viewer analysis.

    5. It reduces the need to virtualize a whole target system if only a few limited apps are of interest.

    Conclusion

    The application as a viewer is one of the most effective ways of understanding and presenting digital forensic artifacts. The use of a sandbox is one simple and quick method that may help you achieve those desired results.

    As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.