Sunday, August 26, 2018

Situational Awareness and System Triage Assistant



There are several points in the Incident Management Life Cycle where it is important to build a thorough understanding of the state of a system. In this post I will discuss a tool I use during both attack and defense which automates most of the Situational Awareness and System Triage work one might want across Windows, Linux, Mac, and FreeBSD.

Background

A few years ago, I ran across an abandoned Python project called "Rapid Triage". The point of the project was, as its name implies, to help investigators quickly determine the risk of an operational system. It didn't perform any actual analysis. Rather, it issued a series of commands into a terminal-like subprocess, captured the output, and sent it to a file for human analysis. The author published a white paper in 2014 (https://www.sans.org/reading-room/whitepapers/tools/rapid-triage-automated-system-intrusion-discovery-python-34512), but I couldn't find much else after that. I liked the concept, but I saw a couple of things I thought I might be able to contribute, so I adopted the project. First, I integrated it into my experimental host-based IPS (called Pyzano). I updated several commands, added others, and made some performance tweaks along the way. I renamed this utility the Situational Awareness tool. It worked well and became one of the main features I relied on. I later split my updated version back out into its own tool so I wouldn't have to download the whole Pyzano project just to run the one module. I still integrate it with my IPS, but now I can run it as a stand-alone application.

Redesign

While I was tweaking the tool to fit my own methodology, I came up with a list of things I thought I could improve to make this a go-to tool in my Incident Response toolbox. I had three use cases in mind when redesigning it.
  1. A System Baseline Utility to be run on a clean system in order to gather its normal operating conditions.
  2. A System Triage Tool to be used when investigating a machine for Indicators of Compromise (IoCs).
  3. A Situational Awareness Tool to be deployed during the post-exploitation phase. This use case, perhaps ironically, required no changes.
Points one and two are of particular interest to me, as they directly relate to my current day-to-day responsibilities.

Work Structure

If you have ever read any introductory material on Incident Response best practices (or if you are familiar with rootkit development), you already know that you cannot trust the system you are investigating to report on its state truthfully. Any self-respecting rootkit is going to subvert (or 'hook') the system functions which control the responses to the queries you are issuing. For an excellent rundown of rootkit development, you can refer to the book "Rootkits" by Greg Hoglund and James Butler (https://www.amazon.com/Rootkits-Subverting-Windows-Greg-Hoglund/dp/0321294319).

The original Rapid Triage tool relied solely on tools in the operating system's PATH, which won't work from a defender's point of view (though it is probably fine from an attacker's). To counter this, I added the option to specify a directory to load the necessary binaries from. This is complicated by the fact that the Situational Awareness Tool needs to support Ubuntu Linux, FreeBSD, OS X, and Windows systems. Each system needs its own trusted-tools USB. In the book "Linux Forensics", Dr. Phil Polstra lays out a suggested method for gathering all the tools you need for Linux onto a USB, so that you can avoid touching the suspect workstation's own tools as much as possible. I will go over this more below.

Extending the Commands

One of the things I liked about the original design was the modularization of concerns. In the code, seven areas of information are defined: General System, File System, Network, Process, Task, User, and Event Log.


For each of these concerns, commands are identified which will provide the necessary information for all of the supported OSes. For example, under the General System Information Commands section, the commands for the Linux family of OSes are:
if os_type == "linux":  # string comparison; the original used 'is', which is an identity check and unreliable here
    cmds = [
        'System Name::hostname',
        'Effective User::whoami',
        'Runlevel::runlevel',
        'System Type::uname -a',
    ]
The original format was [Description]::[Command]. The code was laid out in such a manner that each section had little more than a collection of commands for each OS related to a specific concern, some code to pass the commands to the operating system, and a bit more to write the command results into an output file defined at run time. By including the rules directly in the application, the original author was able to keep the application to a single file, which is good for portability's sake. However, I like to update and modify the commands, and scrolling through the code looking for the right section of commands was thoroughly annoying. I ended up moving the sections into a separate JSON file called Commands.json, the structure of which is fairly simple:
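A sketch of what that layout might look like, keyed by concern and then by OS family (the exact key names here are my own reconstruction, not necessarily those of the released tool):

```json
{
    "General System": {
        "linux": [
            "System Name::hostname",
            "Effective User::whoami",
            "Runlevel::runlevel",
            "System Type::uname -a"
        ],
        "windows": [
            "System Name::hostname",
            "System Type::systeminfo"
        ]
    },
    "Network": {
        "linux": [
            "Open Connections::netstat -anp"
        ]
    }
}
```

The [Description]::[Command] string format is preserved inside each list, so the existing command-dispatch code needs only a small change to read from the file instead of inline literals.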


Now, as the need arises to groom my command list, I have a simple place to do it. This format also benefits the reporting process, as it allows me to quickly list the commands issued during the initial triage phase of an analysis. Documentation is crucial in any forensic investigation, to ensure anyone can determine what potential changes may have been introduced into the system while investigating it.

Building the trusted system binaries USB

As I mentioned previously, the primary shortcoming I saw in the original utility was that it relied on the system under test to self-report. This is fine for baseline readings but shouldn't be relied on during actual incident response. In those situations you will want a set of tools gathered from a known-safe copy of the same version of the operating system being investigated. The basic flow is to gather the binaries and any supporting directories required and write them to a USB, making sure to preserve permissions on the executables. The details depend largely on the system type, so make sure to research application portability for each type of system in your environment.

For this example I will show how to make a Trusted System Binaries USB for Linux. First, you will need a trusted Linux system to use as the source. I keep an air-gapped desktop and several gold-plated bootable USBs expressly for this purpose. I will assume, for the purpose of this explanation, that you are using a completely sterile USB (never used in a manner that may have polluted its firmware) to transfer the tools to.
First, format the USB as EXT4. You can do this in a few commands from the trusted system. Plug in the USB to be used for tool storage and identify it using df -h. I will use /dev/sdb1 as my USB device ID. These commands will format the drive and copy the recommended folders. DO NOT RUN THIS WITHOUT CHANGING THE DEVICE ID. It could very well format your primary drive, and we don't want that!
First, log in as the super user (su) and then unmount the tool USB (if it auto-mounted):
sudo su
umount /dev/sdb1
Next begins the formatting process. It is worth mentioning at this point that the EXT4 file system used below will not work equally well for all systems. It does, however, allow Linux to transfer additional information (such as ownership and execution permissions) along with the executables.
mkfs.ext4 /dev/sdb1
mkdir -p /media/ltoolusb
mount /dev/sdb1 /media/ltoolusb
These commands copy the system binaries from the current system onto the USB (assuming a default directory setup):
export USBL=/media/ltoolusb
cp -rp /bin $USBL
cp -rp /sbin $USBL
cp -rp /lib $USBL
cp -rp /lib64 $USBL #for x64 platforms
cp -rp /usr $USBL
Once these commands complete you should unmount the USB. It is now ready for use.

Encrypted Reporting

I liked the original feature of hashing the result file, but I wanted to take it a step further and encrypt the result file before sending it across a network. I think it is important, when applying cryptography to a problem, to clearly state the risks it is expected to mitigate. Far too often, cryptography is applied as a 'security silver bullet'. In this case, there are three goals I think crypto can help achieve:
  • Privacy-in-Transit.
  • Message Authentication.
  • Mutual Identification.

Mutual Identification

Mutual identity authentication is achieved by using asymmetric RSA keys to sign each message and to verify each signature. The client's Private Key and the Server's Public Key are coded into the agent at creation time (more on this later).

Message Authentication

Message Authentication is achieved by including a hash of the message body. The original code included a section to create an MD5 hash of the file, to be saved alongside the result file. MD5 has been broken for long enough that there is no excuse to leave it in place, so I swapped it out for a SHA hash. Once a message is encrypted, the result is hashed and sent/stored along with the final report.
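A minimal sketch of that hash step, assuming SHA-256 (the exact SHA variant is my assumption; the function name is hypothetical):

```python
import hashlib

def tag_message(encrypted_blob: bytes) -> str:
    """Return a SHA-256 digest to send/store alongside the encrypted report."""
    return hashlib.sha256(encrypted_blob).hexdigest()

# The blob here stands in for an encrypted report body.
digest = tag_message(b"example encrypted report body")
```

Note that an unkeyed hash on its own only detects accidental corruption; it is the RSA signature over the hash-and-blob combination, described in the previous section, that provides actual authentication.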

Privacy-in-Transit

Previously, each section was written to a plain-text file stored somewhere on the file system. This poses a few risks depending on where you decide to store the file. If you store it on your tool USB, then that USB cannot be write-blocked, and you risk your tools being corrupted (as described later). If you store it on the system under test, you are making changes to the system, which should be avoided wherever possible. Furthermore, any infection present on the system may be able to influence the file contents (say, by randomly changing any strings which might identify it in the result). For this reason, the last thing I added to this section was the ability to deliver the report to an off-system collection point. That way, malware would have to be inside the running RapidTriage process to influence the results. Unfortunately, this change brings with it the risk of network snooping. Several pieces of sensitive system detail might be included in any given report, and I wanted to be sure I was not inadvertently exposing information while trying to investigate. To accomplish this, I created a CherryPy HTTPS web server component. I tend to host this on my investigation laptop during the collection phase. The server maintains a list of valid client IDs and handles the encrypted communications. New servers and agents (and therefore keys) are generated for each investigation. I will cover this in more detail in the section on Preparation.

Process

When a user (or I) starts the Situational Awareness Tool with a remote server defined (-r <IP>), the client (I will call her Alice) will attempt to establish a session with the reporting server (I will call him Bob). The two systems will exchange the information required to create a Station-to-Station encrypted channel. The benefit is that the STS key exchange is mutually authenticated by the use of asymmetric RSA keys. It is common to use asymmetric cryptography to establish a symmetric key, which is in turn used to do the actual encryption. This is because asymmetric cryptography is computationally expensive (compared to symmetric) and so is not suitable for large amounts of data.

Station to Station protocol

Alice selects a 512-bit random key (x), computes g^x (called gx), then encrypts this result using Bob's public key. Alice then uses her private key to sign the encrypted result. The signature and the encrypted blob are then hashed as described previously. All of these pieces (encrypted blob, signature, message hash, and client ID) are sent to Bob.

Bob then verifies that the message hash matches the one he derives from hashing the combination of the signature and the encrypted blob received. If these match, Bob uses the client ID to retrieve the proper public key for Alice. The signature is then verified. Supposing the signature is valid, Bob uses his private key to decrypt the blob, which contains gx. At this point, Bob can be fairly confident he is communicating with an authorized agent.

Bob then begins forming his response. First he selects his own random 512-bit key (y) and computes g^y (called gy). The message body is the string concatenation of gx (from Alice) and gy (called gxgy). Bob encrypts it using Alice's public key. The encrypted blob is then signed using Bob's private key. The encrypted blob and the signature are once again hashed together. Before responding, Bob computes the shared secret gx^y. He stores this in a session file created for the communication with Alice. All future communications will be encrypted and decrypted using this shared secret as the key.

Once Alice receives the response from Bob, she follows a similar path in which she verifies the hash, validates the signature, and finally decrypts the blob, which contains gxgy. If all of these steps complete successfully, Alice can be fairly confident she is speaking to the report server as expected. She can recover gy from the decrypted message by simply removing gx from the string and casting it back to an integer. Alice then computes the shared secret on her side, gy^x. This works because ((g^x)^y) = ((g^y)^x).

Alice takes the triage results as a string to send to Bob. She encrypts them using the shared secret. Once again, she signs the message and computes the hash over the combination. She forwards all of this to Bob. Bob then goes through the hash and signature verification procedure and, supposing everything checks out, uses the shared secret in the session file to decrypt the result and store it as a Triage Report.
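The core Diffie-Hellman arithmetic behind this exchange can be sketched in a few lines of Python. The modulus p is implied by any real DH exchange but not named in this post, and the toy parameters below are mine; the RSA encryption/signing layers are omitted:

```python
import secrets

# Toy parameters for illustration only; a real exchange would use a vetted
# large prime group and the G value chosen by the preparation tooling.
p = 0xFFFFFFFB  # a small prime (2**32 - 5), far too small for real use
g = 5

# Alice and Bob each pick a random secret exponent (512-bit in the real tool).
x = secrets.randbelow(p - 2) + 1   # Alice's secret
y = secrets.randbelow(p - 2) + 1   # Bob's secret

gx = pow(g, x, p)   # Alice sends this (encrypted and signed) to Bob
gy = pow(g, y, p)   # Bob returns this as part of gxgy

# Both sides arrive at the same shared secret: (g^x)^y == (g^y)^x (mod p)
alice_secret = pow(gy, x, p)
bob_secret = pow(gx, y, p)
assert alice_secret == bob_secret
```

All subsequent report traffic is then symmetrically encrypted under that shared secret, which is the computationally cheap part of the hybrid scheme described above.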

Preparation Phase

The investigation preparation phase begins with the report of some incident which might require an in-depth look at the system state. I will begin by speaking with the user making the report. Often you can rule out malicious activity with a few basic troubleshooting questions. If I conclude I should proceed with live analysis, I will begin to prepare the Triage Area (hence the name). There are several questions I ask which help guide how I deploy within the environment. A few examples are listed below; you should formulate questions which make sense within your own context, so I don't list them all here.
  • How many systems are initially reported?
  • Do I have physical access to the system(s)?
  • What roles do the system(s) play in the network?
  • How technically proficient is the reporter?
This allows me to figure out the tools I need for my Trusted Tools set. If I have physical access, I will use the USB method described previously. Otherwise, I load the tools from a secure location into a RAM-based file system. The latter is far less frequent, so I will use the former as the example here. Once the trusted tools are collected, a new RSA key pair is generated to identify the server.

Next, I begin the customization of the RapidTriage client. I wrote a utility called "prepare.py" which automates a lot of this. First, prepare.py generates a new client ID for the investigation. It then assigns this client a new RSA key pair. The prepare tool inserts a copy of the server's public key and the agent's private key into the source code for the RapidTriage client. It then selects a new value for G and updates both the server and client source code. Finally, if a specific operating system is selected, the prepare tool will update the client source code to include only those commands relevant to it. This brings the RapidTriage tool back down to a single file for the client.

In the case of a specific OS, I often want to compile the client into a native application targeted at that architecture. This is done using Nuitka, py2exe, or py2app. Each of these has very good documentation and user guides, so I won't dive into them here. The result of each is a stand-alone client which can be run on the target OS. I can then add this alongside the other tools on the Trusted USB. At this point I will stand up the report server on my collection laptop and begin investigating.
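The first prepare.py steps might look something like this. The internals of prepare.py are not shown in this post, so this is only a guess at their shape, using the standard library for the per-investigation identifiers:

```python
import uuid
import secrets

def new_client_identity():
    """Generate a fresh client ID and a DH generator for one investigation.

    Hypothetical reconstruction of prepare.py's first steps; the real tool
    also generates RSA key pairs and rewrites the client/server source.
    """
    client_id = uuid.uuid4().hex          # unique per-investigation agent ID
    g = secrets.choice([2, 3, 5])         # small generator values are typical for DH
    return client_id, g

client_id, g = new_client_identity()
```

Because every investigation gets fresh identifiers and keys, a captured client from one engagement cannot authenticate to the report server of another.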

Loading the tools 

When I load the trusted system binaries USB into a potentially infected (and therefore hostile) machine, I like to be sure the underlying operating system can load the tools without being able to corrupt them. For this purpose, a hardware write-blocking tool can be used to physically bar the system from overwriting the tools on the USB. Int80 of Dual Core gives a great talk on offensive anti-forensics in which he describes some potential attacks against an adversary plugging USB devices into a system; hence the importance of hardware protections. Still, be aware that malware may be alerted to your presence as soon as you plug any device in.

Once the USB is mounted, the files can be read and the binaries can be executed, but any malicious infection cannot corrupt the tools. The next step is to open a trusted shell from the USB. From there, I copy the system's PATH to another environment variable named something like OLD_PATH. Finally, I adjust PATH and any other environment variables which may reference local programs to point at the USB tools. At this point I will run the Situational Awareness tool against the system with relative confidence in the result.
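That environment swap might look something like this, using the /media/ltoolusb mount point from the earlier example (the exact directory list is my assumption):

```shell
TOOLS=/media/ltoolusb

# Preserve the original search path so it can be restored after the triage run.
export OLD_PATH="$PATH"

# Point command lookup exclusively at the trusted binaries.
export PATH="$TOOLS/bin:$TOOLS/sbin:$TOOLS/usr/bin:$TOOLS/usr/sbin"

# When finished:  export PATH="$OLD_PATH"
```

Keep in mind that PATH only governs bare command names; scripts that invoke absolute paths like /bin/ls will still hit the suspect system's binaries.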

Conclusion

This system has the potential to quickly identify IoCs across a large number of systems by allowing an investigator to interrogate multiple machines with the same tool. It further speeds this work by collecting all of the reports at an off-site collection point, making cross-correlation possible. With the addition of trusted tools, identity, and authentication, the results have more credibility than before. While I never expect it to serve in any actual forensic capacity, it has already helped me identify compromised systems quickly in the field.

At the moment I am putting the finishing touches on the code after which I will release the various utilities I have described. I will update this area with a link when that time has come.
