OSXCollector: Walkthrough

OSXCollector is an open-source forensic evidence collection and analysis toolkit for OS X, released by Yelp in 2014. Its GitHub repository can be found at https://github.com/Yelp/osxcollector

Written in Python, the OSXCollector script runs on the potentially infected machine and writes its output to a JSON file that describes the target machine. It gathers its information from sources such as SQLite databases, the local file system and plists.

Using OSXCollector, a forensic investigator/analyst can shed light on the following questions:

Is the machine infected?
How did the malware get there?
How can this be prevented, and how can further infection be detected?

To begin, let's start by cloning the project on our local machine:

$ git clone https://github.com/Yelp/osxcollector.git

The script, osxcollector.py, is located inside the osxcollector directory of the cloned repository. It is a single Python file that runs without any dependencies on a standard OS X machine. To run it, simply type:

$ sudo osxcollector/osxcollector.py

The script takes a couple of minutes to run. When it finishes, it reports the name of the output archive, in this case osxcollect-2017_05_03-18_06_35.tar.gz.

Extracting the archive yields a folder that contains various logs along with the JSON file.
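The extraction step can also be scripted on the analyst's workstation. The following is a minimal sketch in Python 3, not part of OSXCollector itself; the function name extract_collection and the destination folder are assumptions made for illustration.

```python
import tarfile
from pathlib import Path


def extract_collection(archive_path, dest_dir):
    """Extract a .tar.gz collection archive and return the JSON files found in it."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        # Only unpack archives you trust: member paths are not checked here.
        archive.extractall(dest)
    # The JSON output sits alongside the collected logs in the extracted folder.
    return sorted(dest.rglob("*.json"))
```

Calling extract_collection("osxcollect-2017_05_03-18_06_35.tar.gz", "evidence") would unpack the archive into a folder named evidence and return the paths of the JSON files inside it.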

Note: Make sure that the python command on the OS X machine resolves to the default interpreter shipped with the system and has not been overridden, for example by a Python version installed via brew. OSXCollector relies on a few native Python bindings for OS X libraries, which might not be available in other Python versions. Alternatively, you can name the interpreter explicitly:

$ /usr/bin/python2.7 osxcollector/osxcollector.py

Additional parameters can be used with osxcollector.py:

-i INCIDENT_PREFIX/--id=INCIDENT_PREFIX: This sets an identifier which is used as the prefix of the output file. The default value is osxcollect:

$ sudo osxcollector/osxcollector.py -i ChiraghDewan

The output folder created would be named: ChiraghDewan-2017_05_03-18_06_35

-p ROOTPATH/--path=ROOTPATH: This sets the path to the root of the filesystem to run collection on. The default value is /. This is helpful when running collection on the image of a disk:

$ sudo osxcollector/osxcollector.py -p '/mnt/pawned'

-s SECTION/--section=SECTION: This parameter is used to run only a portion of the full collection. It can be specified more than once:

$ sudo osxcollector/osxcollector.py -s 'safari' -s 'downloads'

Following is a list of all the sections and sub-sections:

version
system_info
kext
startup
- launch_agents
- scripting_additions
- startup_items
- login_items
applications
- applications
- install_history
quarantines
downloads
- downloads
- email_downloads
- old_email_downloads
chrome
- history
- archived_history
- cookies
- login_data
- top_sites
- web_data
- databases
- local_storage
- preferences
firefox
- cookies
- downloads
- formhistory
- history
- signons
- permissions
- addons
- extension
- content_prefs
- health_report
- webapps_store
- json_files
safari
- downloads
- history
- extensions
- databases
- localstorage
- extension_files
accounts
- system_admins
- system_users
- social_accounts
- recent_items
mail
full_hash

-c/--collect-cookies: This parameter collects the values of cookies. By default, cookie values are not dumped as they may contain sensitive information.
-l/--collect-local-storage: This parameter collects the values stored in web browsers' local storage. By default, these values are not collected as they may contain sensitive information.
-d/--debug: This parameter enables verbose output and Python breakpoints.
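
When collections are launched from a wrapper script, the options above can be assembled programmatically. The sketch below only builds the argument list and does not run anything; the helper name build_collector_command and its keyword arguments are made up for illustration, while the flags themselves are the ones described above.

```python
def build_collector_command(incident_id=None, root_path=None, sections=(),
                            collect_cookies=False, collect_local_storage=False,
                            debug=False, script="osxcollector/osxcollector.py"):
    """Return the argument list for a collection run with the chosen options."""
    command = ["sudo", script]
    if incident_id:
        command += ["-i", incident_id]
    if root_path:
        command += ["-p", root_path]
    # -s may be repeated, once per section to collect.
    for section in sections:
        command += ["-s", section]
    if collect_cookies:
        command.append("-c")
    if collect_local_storage:
        command.append("-l")
    if debug:
        command.append("-d")
    return command


print(" ".join(build_collector_command(incident_id="ChiraghDewan",
                                       sections=["safari", "downloads"])))
# prints: sudo osxcollector/osxcollector.py -i ChiraghDewan -s safari -s downloads
```

The returned list could be handed to subprocess.run on the target machine; keeping it as a list avoids shell quoting problems with paths such as the one passed to -p.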

Understanding the JSON file

Common Keys

Every line of the JSON file records one piece of information. Some common keys are:

osxcollector_incident_id: A unique ID shared by every record
osxcollector_section: The section or type of data the record holds
osxcollector_subsection: The subsection of the type of data the record holds
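
Because each line is a separate JSON record carrying these keys, a quick overview of a collection can be produced by counting records per section and subsection. This is a minimal Python 3 sketch for the analyst's machine; the function name count_records_by_section is an assumption, and records without a subsection are counted under None.

```python
import json
from collections import Counter


def count_records_by_section(lines):
    """Count records per (section, subsection) pair in line-delimited JSON."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        key = (record.get("osxcollector_section"),
               record.get("osxcollector_subsection"))
        counts[key] += 1
    return counts
```

Passing an open file handle for the collection's JSON file as lines works directly, since iterating over a file yields one line at a time.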

File Records

ctime: The file creation time
mtime: The file modified time
file_path: The absolute path to the file
md5: MD5 hash of the file contents
sha1: SHA1 hash of the file contents
sha2: SHA2 hash of the file contents
signature_chain: The common names of the certs in the file's signing chain
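
The hash keys make it possible to check whether a file on disk or in an image still matches what was recorded at collection time. The sketch below is one way to do that in Python 3. It assumes that the sha2 key holds a SHA-256 digest, which should be confirmed against the README; the function names hash_file and matches_record are hypothetical.

```python
import hashlib


def hash_file(path, chunk_size=1 << 20):
    """Return md5, sha1 and sha2 hex digests of a file, read in chunks."""
    digests = {
        "md5": hashlib.md5(),
        "sha1": hashlib.sha1(),
        "sha2": hashlib.sha256(),  # assumption: sha2 means SHA-256
    }
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            for digest in digests.values():
                digest.update(chunk)
    return {name: digest.hexdigest() for name, digest in digests.items()}


def matches_record(path, record):
    """True if every hash present in the record matches the file on disk."""
    current = hash_file(path)
    compared = [name for name in current if name in record]
    # A record without any hash keys cannot confirm a match.
    return bool(compared) and all(current[name] == record[name] for name in compared)
```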

Downloaded File

xattr-wherefrom: A list containing the source and referrer URLs for the downloaded file
xattr-quarantines: A string describing which application downloaded the file
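
Since these two keys answer the question of how a file arrived on the machine, it can be useful to list every record that carries them. A small Python 3 sketch follows; the function name downloaded_files is an assumption, and the key names are the ones listed above.

```python
import json


def downloaded_files(lines):
    """Yield (file_path, sources, downloader) for records with download metadata."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if "xattr-wherefrom" in record:
            yield (record.get("file_path"),
                   record["xattr-wherefrom"],
                   record.get("xattr-quarantines"))
```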

The keys above are only a few of those that OSXCollector uses. Further keys are defined for:

SQLite Records
Timestamps
Version section
System_info section
Kext section
Startup section
Applications section
Quarantines section
Downloads Section
Chrome section
Firefox section
Safari section
Accounts section
Mail section
Full Hash section

The detailed keys for the sections listed above can be found in the project's README.md file, which can be read after cloning the project or on its GitHub repository.

Basic Manual Analysis

Forensic analysis is not an exact science. Some would argue that it falls somewhere between art and science, and because of that, every person who reads the story sees something different.

Going through the entire JSON file can be overwhelming. However, a few commands help narrow down the search:

Timestamps

$ cat osxcollect-2017_05_03-18_06_35.json | grep '2017-05-03'

Browser History

$ cat osxcollect-2017_05_03-18_06_35.json | grep '2017-05-03' | jq 'select(has("url")) | .url'

Note: The above command requires jq to be installed. jq is an open-source JSON processor which is available at: https://github.com/stedolan/jq
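
Where jq is not available, the same narrowing can be approximated in Python 3. The sketch below mirrors the grep and jq pipeline: it keeps lines that contain the date as plain text and then returns the url value of any record that has one. The function name urls_on_date is an assumption.

```python
import json


def urls_on_date(lines, date):
    """Return the url of every record whose raw line mentions the given date."""
    urls = []
    for line in lines:
        if date not in line:
            continue  # same coarse text match that grep performs
        record = json.loads(line)
        if "url" in record:
            urls.append(record["url"])
    return urls
```

Like grep, this matches the date anywhere in the line, so a record is kept even if the date only appears in an unrelated field.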

Single User

$ cat osxcollect-2017_05_03-18_06_35.json | jq 'select(.osxcollector_username=="Chiragh")|.'

Using a combination of sections and subsections, more useful commands can be created and used to simplify the process.
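
One way to combine such criteria without writing a new jq expression each time is a small generic selector. This Python 3 sketch takes arbitrary key and value pairs and yields the records that match all of them; the function name select_records is hypothetical, and the keys in the usage note are the ones used earlier in this walkthrough.

```python
import json


def select_records(lines, **criteria):
    """Yield records whose fields equal every key=value pair in criteria."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        if all(record.get(key) == value for key, value in criteria.items()):
            yield record
```

For example, select_records(handle, osxcollector_section="chrome", osxcollector_subsection="history", osxcollector_username="Chiragh") would yield only the Chrome history records for that user, where handle is the open JSON file.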

Automated Analysis

OSXCollector also helps automate the task of analyzing the output by means of various filters. These are provided by a package called osxcollector.output_filters.

Its repository can be found at https://github.com/Yelp/osxcollector_output_filters

Unlike osxcollector.py, the filters have dependencies that do not come pre-installed on macOS. The solution recommended by Yelp is to use virtualenv.

Installation

Run the following commands from the root of the cloned filters repository to install virtualenv and set up the environment:

$ sudo pip install tox virtualenv

$ make venv

$ source virtualenv_run/bin/activate

Different types of filters

Find Domains Filter
Find Blacklisted Filter
Related Files Filter
Chrome History Filter
Firefox History Filter
Chrome Extensions Filter
Firefox Extensions Filter
OpenDNS Related Domains Filter
OpenDNS Lookup Domains Filter
Virus Total Lookup Domains Filter
Virus Total Lookup Hashes Filter
Virus Total Lookup URLs Filter
Shadow Server Lookup Hashes Filter

More details, along with how to use them, can be found at the repository link.
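
To give a feel for what a filter of this kind does, here is a toy Python 3 sketch that collects the host names found in the url field of each record. It is not the package's implementation and does not use its API; the function name domains_in_records and the choice to look only at the url key are assumptions made for illustration.

```python
import json
from urllib.parse import urlparse


def domains_in_records(lines):
    """Return the sorted, de-duplicated host names found in url fields."""
    domains = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        url = record.get("url")
        if isinstance(url, str):
            host = urlparse(url).hostname  # lower-cased, without the port
            if host:
                domains.add(host)
    return sorted(domains)
```

The resulting list is the kind of input a lookup step would then check against a blacklist or an external reputation service.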

Conclusion

The biggest downside of OSXCollector is that it only works on macOS, and its development has been slow in recent times. However, Yelp encourages developers to contribute. All in all, OSXCollector is a powerful tool that can make any forensic analyst's life easier.

Posted: May 4, 2017

Chiragh Dewan

A creative problem-solving full-stack web developer with expertise in Information Security Audit, Web Application Audit, Vulnerability Assessment and Penetration Testing/Ethical Hacking, as well as previous experience in Artificial Intelligence, Machine Learning and Natural Language Processing. He has been recognised by companies such as Facebook, Google, Microsoft, PayPal, Netflix and Blackberry for reporting security vulnerabilities. He has also given talks on Artificial Intelligence and Cyber Security, including at a TEDx event.
