Short Version:
The Slack app for Android keeps message related data in the TCJRXQD1B database located in the following directory:
/userdata/data/com.Slack/databases/
Notice the database name. Due to its curious name I am not sure if it is the same across all Slack installations or unique per device. I will confirm soon with further testing.
- Edit on 01/27/2019: The database name changes. Each Slack group will have its own unique database name. The name follows the same alphanumeric format and character length.
- Edit on 01/27/2019: Scripts below and test database can be found at:
https://github.com/abrignoni/DFIR-SQL-Query-Repo/tree/master/Android/SLACK
The following query on the TCJRXQD1B database was used to identify shared files and content that might possibly reside in the database itself. Not all shared content is kept there.
SELECT messaging_channel_counts.CHANNEL_TYPE,
messaging_channels.NAME_OR_USER,
users.NAME,
users.PROFILE_REAL_NAME,
messages.USER_ID,
messages.CHANNEL_ID,
messages.TS as TS_Original,
DateTime(messages.TS, 'unixepoch', 'localtime') as TS_Conv,
messages.MESSAGE_BLOB,
json_extract(MESSAGE_BLOB, '$.text') as TEXTinJSON,
json_extract(MESSAGE_BLOB, '$.files[0].filetype') as FILE_TYPE,
json_extract(MESSAGE_BLOB, '$.files[0].name') as FILE_NAME,
json_extract(MESSAGE_BLOB, '$.files[0].created') as FILE_CREATED,
DateTime(json_extract(MESSAGE_BLOB, '$.files[0].created'), 'unixepoch', 'localtime') as FILE_CREATED_CONV,
json_extract(MESSAGE_BLOB, '$.files[0].public_url_shared') as PUBLIC_URL_SHARED,
json_extract(MESSAGE_BLOB, '$.files[0].permalink') as PERMALINK,
json_extract(MESSAGE_BLOB, '$.files[0].id') as FILE_ID
FROM users, messages, messaging_channel_counts, messaging_channels
WHERE messages.USER_ID = users.ID and messaging_channel_counts.id = messages.channel_id
and messaging_channels.msg_channel_id = messages.channel_id
ORDER BY messages.ts
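Note that the query above only reads the first element of the files array ($.files[0]). If a single message could carry more than one attachment, a sketch along the following lines, using SQLite's json_each table-valued function on the same MESSAGE_BLOB field, would enumerate every element of the array (messages with no files array simply produce no rows here):
SELECT messages.CHANNEL_ID,
DateTime(messages.TS, 'unixepoch', 'localtime') as TS_Conv,
json_extract(messages.MESSAGE_BLOB, '$.text') as TEXTinJSON,
json_extract(json_each.value, '$.name') as FILE_NAME,     -- one output row per element of $.files
json_extract(json_each.value, '$.filetype') as FILE_TYPE
FROM messages, json_each(messages.MESSAGE_BLOB, '$.files')
ORDER BY messages.ts
The second query below pulls shared file content, like posts and code, straight from the files table.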
SELECT files.ID,
files.CONTENT,
files.CONTENT_HTML,
files.FILE_BLOB,
json_extract(file_blob, '$.created') as FILE_CREATED,
DateTime(json_extract(file_blob, '$.created'), 'unixepoch', 'localtime') as FILE_CREATED_CONV,
json_extract(file_blob, '$.url') as URL,
json_extract(file_blob, '$.name') as NAME,
json_extract(file_blob, '$.filetype') as FILE_TYPE
FROM FILES
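Since, as described later in the post, posts and shared code were the items whose actual content ended up inside the files table in my testing (in CONTENT_HTML and CONTENT respectively), a narrowed sketch of the query above can surface just those rows. This is an assumption based on my test data, so validate it against your own:
SELECT files.ID,
json_extract(file_blob, '$.name') as NAME,
json_extract(file_blob, '$.filetype') as FILE_TYPE,
files.CONTENT,       -- held the shared code snippet in my testing
files.CONTENT_HTML   -- held the shared post content in my testing
FROM FILES
WHERE files.CONTENT IS NOT NULL
OR files.CONTENT_HTML IS NOT NULL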
Long Version:
Last month Jessica Hyde, while being interviewed by the Rally Security folks, mentioned the need to share information due to the vast amount of topics available for research in the digital forensics field. She asked, as an example, "where are the write-ups to parse Slack?"
This post will be the start of a series looking into where Slack stores message data and how to extract it on all available platforms.
What is Slack?
Slack is a fully featured messaging and collaboration service designed for corporate workspaces. The latest Android Slack client has over 10 million downloads.
In Dr. Evil voice: 10 million downloads |
Testing platform
Details of hardware and software used can be found here.
Extraction and processing
Magnet Forensics Acquire - Full physical extraction.
Autopsy 4.8.0
DB Browser for SQLite
Analysis
The Slack data folders can be found here:
/userdata/data/com.Slack
In the Beatles voice: All the lonely folders. Where do they all come from. |
The Slack messages are in the databases folder:
/userdata/data/com.Slack/databases/
For my testing I created a free (as in no cost to use) Slack workspace. On this workspace I exchanged a few messages both in the general channel (a type of public group) as well as in direct messages (DMs). Within these I took and sent a picture, uploaded a picture, created a post, sent a piece of Python code, and sent some emojis.
In my exemplar phone the main SQLite database, the one that contains the messages, is named TCJRXQD1B. I find this name to be really peculiar and I wonder if it is unique to my Slack installation. This is something I will follow up on soon. The schema of this database is as follows:
So many tables. |
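Before digging into individual tables, a quick way to inventory them from DB Browser for SQLite (or any SQLite client) is the standard sqlite_master catalog; the tables used throughout this post are messages, users, messaging_channels, messaging_channel_counts and files:
SELECT name
FROM sqlite_master      -- built-in catalog of every object in the database
WHERE type = 'table'
ORDER BY name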
The main table is appropriately named messages. The most relevant content in this table is located in the MESSAGE_BLOB field. The content is in JSON format and it contains the user generated text message data as well as metadata for shared files.
In the past I have expressed my belief in the increasing importance of knowing some scripting in order to custom parse JSON and XML formats when these are placed one inside the other. See here. For example, an XML file whose tags have JSON as content, where this JSON has more JSON or XML nested further within it. Thankfully in this case the MESSAGE_BLOB field only contains various levels of JSON and nothing more. This fact made it really easy to access the JSON content and have it shown as another field in a database table.
Json_extract
The moment I saw that the MESSAGE_BLOB field had only JSON in it I remembered a tweet sent to me by Paul Sanderson.
If there is lots of json & the format of each is similar (it often is) then you can import it into SQLite and use the official SQLite JSON extension (if enabled - it is in my Forensic Browser for SQLite) to query it and produce reports. There is some info in my book on page 223.— Paul Sanderson (@sandersonforens) July 30, 2018
The book introduced me to the json_extract functionality in SQLite. It made extracting the JSON data a snap and presented it as another field in my SELECT query. For an in depth explanation on how it works and to see some usage examples go here.
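As a quick illustration of how json_extract walks a path into nested JSON, here is a toy, made-up blob (not an actual Slack MESSAGE_BLOB) run through the same path expressions used in the queries at the top of the post:
WITH sample as (
  SELECT '{"text":"check this out","files":[{"id":"FABC123","name":"photo.jpg","created":1545348000}]}' as MESSAGE_BLOB
)
SELECT json_extract(MESSAGE_BLOB, '$.text') as TEXTinJSON,        -- returns: check this out
json_extract(MESSAGE_BLOB, '$.files[0].name') as FILE_NAME,       -- returns: photo.jpg
DateTime(json_extract(MESSAGE_BLOB, '$.files[0].created'), 'unixepoch', 'localtime') as FILE_CREATED_CONV
FROM sample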
The following images are the result of the first query shown at the beginning of the blog post. I divided them in two since it was impossible to screenshot all the fields with content at once. The first image shows the first fields in the query, the ones that relate to the users and messages. These have been sorted by time; timestamps have been converted to local time as well.
message_blob is JSON!!! |
All the fields to the right of MESSAGE_BLOB are fields extracted from it using json_extract. These are TEXTinJSON and all of the following fields in the next image.
Messages and metadata of attached files. |
These fields are metadata about the images and files shared within a message. This is how a user would have seen them:
Emoji, text, image, code and post. |
A comparison of the last two images tells us that the shared content metadata (file creation time, filename, type, etc.) is contained in the JSON we just extracted. Where do these shared files reside?
It depends
Shared files can be in a few places:
- In the following app directory: /userdata/data/com.Slack/cache/
- Files located in this folder, such as images, were taken with the device itself.
- In the image_manager_disk_cache folder after message sync in the following directory: /userdata/data/com.Slack/cache/image_manager_disk_cache/
- The cache engine used by Slack for images is called Glide. More here.
- I know of no way to reverse the hashed filename of the images back to their original components, one of these being the original filename itself. More here. This means there is no way of ascertaining for sure which images in this folder match up with a particular message and file metadata in the extracted JSON or files table.
- The only way to ascertain which files go with what is to go to the url in the permalink field as stated in the messages query (a sketch for pulling these permalinks in bulk follows this list). To do this, authentication to the Slack workspace is required for access.
- In the files table.
- The files table has a reference for all shared items. It can also contain the actual file data for posts and shared code. The second query at the top of the blog post can be used to locate these types of shared content.
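To support that permalink verification, a sketch like the one below (reusing the fields from the first query) pulls the name, id and permalink of every shared file into a single list that can then be checked against the workspace while authenticated:
SELECT DateTime(messages.TS, 'unixepoch', 'localtime') as TS_Conv,
json_extract(MESSAGE_BLOB, '$.files[0].id') as FILE_ID,
json_extract(MESSAGE_BLOB, '$.files[0].name') as FILE_NAME,
json_extract(MESSAGE_BLOB, '$.files[0].permalink') as PERMALINK
FROM messages
WHERE json_extract(MESSAGE_BLOB, '$.files[0].permalink') IS NOT NULL  -- keep only messages that shared a file
ORDER BY messages.ts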
The following image shows the result of executing the second query. It used json_extract as well.
Files |
Some things of note from the previous image:
- ID = Used to match the file content with the message itself (see the join sketch after this list).
- Content = Contained all of the sample Python code snippet I shared.
- Content_html = Contained all of the post content I shared.
- File_blob = Used json_extract to generate the rest of the fields from it.
- File_Created = Self explanatory.
- URL = When this field contains data it can reference a file created by the device itself. Notice how the contained path references the local file system when that happens.
- Name = Self explanatory.
- File_type = Self explanatory.
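Putting that ID relationship to use, the following sketch joins the two data sets by matching the file id extracted from MESSAGE_BLOB against files.ID, tying each stored post or code snippet back to the user and time it was shared. Treat the join columns as an assumption drawn from my test database and validate them against your own data:
SELECT DateTime(messages.TS, 'unixepoch', 'localtime') as TS_Conv,
users.NAME,
json_extract(messages.MESSAGE_BLOB, '$.files[0].id') as FILE_ID,
json_extract(files.FILE_BLOB, '$.name') as FILE_NAME,
files.CONTENT,
files.CONTENT_HTML
FROM messages
JOIN files ON files.ID = json_extract(messages.MESSAGE_BLOB, '$.files[0].id')
LEFT JOIN users ON users.ID = messages.USER_ID   -- keep the row even if the sender is not in the users table
ORDER BY messages.ts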
Conclusion
There are additional Slack artifacts not covered in this write-up. Be aware that my queries extracted from the JSON only the content I found to be relevant. You might find that in your casework some of these other artifacts are of importance. Change the queries as needed and do validate, validate, validate.
Last but not least:
In order to not miss evidence of possible deleted records, do use a forensic SQLite tool to explore WAL records.
As always I can be reached on twitter @alexisbrignoni and email 4n6[at]abrignoni[dot]com.