Labs

Tools useful for hunting

Log Aggregation

Much of the work you’ll do hunting involves performing data transformation on log data. You must have a tool that allows you to centrally aggregate those logs and interact with them. This is a basic necessity for hunting.

ELK, https://www.elastic.co/
Splunk, https://www.splunk.com/
Graylog, https://www.graylog.org/

Data Manipulation Tools

Many of the best data manipulation tools are free command-line utilities or applications you might traditionally use for other purposes. Scripting languages and specific libraries are also useful for this purpose.

Python
Pandas (Python Library)
R
CyberChef
Regex Buddy
Linux command-line tools
- sed
- awk
- sort
- uniq
- cut
- jq
- grep

Specialized Analysis Tools

A significant portion of the hunter’s tool kit is dedicated to analysis tools that are unique to specific data types. I’ve highlighted some of my favorites for many of my preferred data types here.

Network

Wireshark
tshark
molo.ch
NetworkMiner
nfdump

File

YARA

OS

Osquery
GRR

Memory

Volatility

Threat Intel

Domain Tools
Passive Total
Alienvault OTX
Hybrid Analysis
Virus Total
any.run
Cuckoo
Maltego
urlscan.io
shodan.io
censys.io

Have BITS Jobs been used for malicious purposes on my network?

Your goal is to use the attack-based hunting method to find anomalous BITS-related network communication using Bro/Zeek HTTP communication logs.

References:

Malicious use of BITS jobs (be sure to check out the reference links at the bottom): https://attack.mitre.org/techniques/T1197/.
BITS Examples from MS: https://docs.microsoft.com/en-us/windows/desktop/bits/bitsadmin-examples
Matthew Green's post on malicious BITS detection: https://mgreen27.github.io/posts/2018/02/18/Sharing_my_BITS.html
Bro/Zeek log fields: https://docs.zeek.org/en/stable/scripts/base/protocols/http/main.bro.html#type-HTTP::Info/

You may not be able to 100% confirm that the anomaly you've found is evil. The goal here is simply to find the anomaly that is most likely to lead to an incident.

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab1 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab1. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab1 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years.

Initial Research

First, start out by doing some basic research on how BITS functions and how attackers have used it to transmit files, evade detection, and facilitate command and control. You can use the links provided in the lab introduction. You should focus on answering the following:

What is the designed purpose of BITS?
How do attackers use BITS for malicious purposes?
What does BITS look like in logs and/or network data?

What am I looking for?

You're looking for evidence of the BITS mechanism used to transfer files to support malicious activities.

Where am I likely to find it?

HTTP transaction logs are all you have available in this exercise, so you'll be focusing on BITS-related communication rather than execution logs of the bitsadmin tool.

Apply your knowledge of BITS communication to the available fields in the Zeek HTTP data. What fields are most useful for anomaly hunting?

Here's a short list of considerations:

Method
Mime Type
Domain
URI

How can I manipulate data to see it?

For each of the fields identified, you're looking for anomalies that are likely to occur infrequently. So, perform an aggregation on each field to examine the distribution of values, focusing on outliers (least frequent occurrence).

To create aggregations:

Go to the Visualize Tab
Create a new Data Table
Select the lab1 index
Under Buckets, click Split Rows.
Under Aggregation, select Terms.
Choose the field you wish to aggregate the unique values for.
Change the size to 50000 so it includes all the logs from this lab.
Click the play button. (If you don't have any data, make sure your date range is set to Last 5 Years.)

Review the distribution of results. What looks unique and weird? Research it by looking at the full event (search the string in the Discover tab) or using Google to perform research.

As you find things likely to be benign, exclude them from your aggregations. You can do this by adding an exclusion to the search at the top of that screen. You should keep adding these until you get down to a reasonable set of results you can manually examine:

NOT domain:(*microsoft* OR *windows*)
NOT uri:(*jpg* OR *png* OR *edge*)

The two fields that should become most interesting to you are the domain and URI fields because of their variability.
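
If you'd rather work outside Kibana, the same aggregate-and-exclude loop can be sketched in a few lines of Python. This is a minimal sketch, assuming the HTTP logs have been exported as a list of dictionaries. The sample records, the helper name least_frequent, and the use of domain and uri as dictionary keys are illustrative assumptions, not part of the lab data.

```python
from collections import Counter
from fnmatch import fnmatchcase

# Hypothetical exported HTTP log records (invented for illustration).
records = [
    {"domain": "download.windowsupdate.com", "uri": "/msdownload/update.cab"},
    {"domain": "download.windowsupdate.com", "uri": "/msdownload/update.cab"},
    {"domain": "cdn.example.com", "uri": "/images/logo.png"},
    {"domain": "files.example.net", "uri": "/a/b/payload.bin"},
]

def least_frequent(records, field, exclude=None):
    """Count unique values of `field`, rarest first, after dropping records
    that match any wildcard pattern in `exclude` ({field: patterns})."""
    exclude = exclude or {}

    def is_excluded(record):
        return any(
            fnmatchcase(str(record.get(name, "")).lower(), pattern)
            for name, patterns in exclude.items()
            for pattern in patterns
        )

    counts = Counter(r[field] for r in records if not is_excluded(r))
    # Ascending count; ties are broken alphabetically for stable output.
    return sorted(counts.items(), key=lambda kv: (kv[1], kv[0]))

# Mirrors the two example exclusions shown above.
exclusions = {
    "domain": ("*microsoft*", "*windows*"),
    "uri": ("*jpg*", "*png*", "*edge*"),
}
rare_domains = least_frequent(records, "domain", exclusions)
for value, count in rare_domains:
    print(count, value)
```

Keep adding patterns to the exclusions dictionary as you rule things out, just as you would extend the search bar query.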

Does anything in Windows Program Execution logs look malicious?

References:

Microsoft EID 4688 Reference: https://docs.microsoft.com/en-us/windows/security/threat-protection/auditing/event-4688
Ultimate Windows Security EID 4688 Reference: https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/event.aspx?eventID=4688

I also have some friendly intelligence for you to consider in this exercise.

There are three network segments of traffic included here. You can identify member workstations by their hostname values.

Finance: Accounting users
- Leonard
- Deanna
- Pavel
Dev: Development users
- Geordi
- Scotty
- Sonya
Admin: IT administrators
- Jean-luc
- Jim
- Kathryn

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab2 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab2. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab2 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years.
The analysis or comparison of timestamps is not relevant for this exercise.

Initial Research

First, start out by doing some basic research on what EID 4688 represents. Also consider the role of program execution in a typical attack, either directly (execution of malware) or indirectly (execution of a file that will launch malware). These attacks span multiple categories including:

Any malware
Malicious documents (office/PDF)
Malicious use of legitimate applications
Initial compromise
Lateral movement
Exfiltration
Sensitive file access

This is a very broad realm of possibilities, but there aren't too many relevant fields. You can use the links provided in the introduction for input.

What fields are most likely to contain evidence of attacks?

The primary field we're concerned with is NewProcessName since it provides the most context about what was executed.

In addition, the following fields also provide useful context about the nature of the execution:

Timestamp (not relevant in this exercise)
HostName
User

What would be anomalous in these fields?

Here are some ideas for things to look for in relation to the identified fields.

Mirroring legitimacy
Oddly high frequency of occurrence
Generic non-descriptives
Use of legitimate applications in an abnormal context (odd command line arguments)
Weird content formatting
Unexpected entity relationships (by user/host)
Improper timing
Baseline deviations
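
As one example, "mirroring legitimacy" often shows up as a process name that is almost, but not quite, a well-known Windows binary. The following is a minimal sketch that scores name similarity with Python's difflib. The KNOWN_GOOD list, the 0.8 threshold, and the sample paths are all assumptions chosen for illustration, so tune them against your own baseline.

```python
import ntpath
from difflib import SequenceMatcher

# Illustrative list of legitimate process names (assumption, not exhaustive).
KNOWN_GOOD = ["explorer.exe", "lsass.exe", "services.exe", "svchost.exe"]

# Hypothetical NewProcessName values (invented for illustration).
samples = [
    r"C:\Windows\System32\svchost.exe",
    r"C:\Users\Public\svch0st.exe",
    r"C:\Windows\System32\notepad.exe",
]

def lookalikes(process_paths, known=KNOWN_GOOD, threshold=0.8):
    """Return (path, imitated_name, score) for names that closely
    resemble, but do not equal, a known-good process name."""
    hits = []
    for path in process_paths:
        name = ntpath.basename(path).lower()
        if name in known:
            continue  # exact match: review its path separately
        for good in known:
            score = SequenceMatcher(None, name, good).ratio()
            if score >= threshold:
                hits.append((path, good, round(score, 2)))
    return hits

hits = lookalikes(samples)
for path, good, score in hits:
    print(f"{path} resembles {good} (similarity {score})")
```

An exact name match running from an unusual directory is a separate check that this sketch deliberately leaves out.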

How can I manipulate data to see it?

The technique used to manipulate data will vary based on the field and anomaly you're looking at.

Search/Aggregate by unique NewProcessName + apply reductions to exclude normal

Mirroring legitimacy
Generic non-descriptives
Weird content formatting
Use of legitimate applications in an abnormal context (odd command line arguments)

Search/Aggregate by unique NewProcessName + count occurrence per host and compare observed vs. expected

Oddly high frequency of occurrence

Search/Aggregate by unique NewProcessName by Host(s) or User(s) + examine context vs. expected

Unexpected entity relationships (by user/host)
Baseline deviations

Search/Aggregate by unique NewProcessName + examine timing vs. expected for the user/app

Improper timing
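
To look at entity relationships, it helps to pivot NewProcessName against the network segments from the friendly intelligence. The sketch below is one way to do that in Python. It assumes each workstation's hostname contains the user's name, and the sample events and hostname format are invented for illustration.

```python
from collections import Counter, defaultdict

# Segment membership from the friendly intelligence above.
SEGMENTS = {
    "finance": {"leonard", "deanna", "pavel"},
    "dev": {"geordi", "scotty", "sonya"},
    "admin": {"jean-luc", "jim", "kathryn"},
}

def segment_of(hostname):
    """Map a hostname to a segment by looking for a member name in it
    (assumes hostnames embed the user's name)."""
    host = hostname.lower()
    for segment, members in SEGMENTS.items():
        if any(member in host for member in members):
            return segment
    return "unknown"

# Hypothetical 4688 events (invented for illustration).
events = [
    {"HostName": "GEORDI-WS", "NewProcessName": r"C:\Windows\System32\cmd.exe"},
    {"HostName": "JIM-WS", "NewProcessName": r"C:\Windows\System32\cmd.exe"},
    {"HostName": "PAVEL-WS",
     "NewProcessName": r"C:\Windows\System32\WindowsPowerShell\v1.0\powershell.exe"},
]

def process_counts_by_segment(events):
    """Return {process: {segment: count}} for side-by-side comparison."""
    table = defaultdict(Counter)
    for event in events:
        process = event["NewProcessName"].lower()
        table[process][segment_of(event["HostName"])] += 1
    return {process: dict(counts) for process, counts in table.items()}

table = process_counts_by_segment(events)
for process, counts in sorted(table.items()):
    print(process, counts)
```

A process that appears only in a segment where you wouldn't expect it, such as an administrative tool on an accounting workstation, is worth a closer look.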

Has Emotet executed successfully on my network?

Your goal is to use the attack-based hunting method to find evidence of potential Emotet activity using only Bro/Zeek connection logs.

References:

MalwareBytes basic Emotet Overview: https://www.malwarebytes.com/emotet/
Emotet Traffic Samples from Brad Duncan: https://www.google.com/search?q=site%3Amalware-traffic-analysis.net+emotet&oq=site%3Amalware-traffic-analysis.net+emotet&aqs=chrome..69i57j69i58.7214j0j7&sourceid=chrome&ie=UTF-8
Bro/Zeek log fields: https://docs.zeek.org/en/stable/scripts/base/protocols/http/main.bro.html#type-HTTP::Info/

I also have some friendly intelligence for you to consider in this exercise.

Network IP Range: 192.168.100.0/24

Domain Controllers: 192.168.100.2-5
Internal Web Servers: 192.168.100.10-15
Workstations: 192.168.100.100-245

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab3 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab3. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab3 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years. If you'd like to get more specific, all the data for this lab is timestamped during the month of January 2019.

Initial Research

First, start out by doing some basic research on how Emotet functions and how it has evolved over time. While it started as a simple banking trojan, its functionality has expanded and it is also used in the delivery of other malware. You can use the links provided in the lab introduction to study this information.

You should focus on answering the following:

How is Emotet generally delivered to a host for an initial infection?
Once successfully running on a system, how does Emotet spread to others?
What are Emotet's goals? How does it achieve them?

What am I looking for?

You're looking for evidence of Emotet on your network.

This exercise is unique because it forces you to really parse down the threat intel you're provided and consider which possible anomalies could manifest given the limited data you have to work with.

The only thing you have to work with here are network connections. So, you can only find evidence of Emotet's network communication. That will probably be the initial infection or the lateral movement.

Where am I likely to find it?

Bro/Zeek connection logs are all you have available in this exercise, so you won't be able to look for some of the more obvious indicators. Instead, you'll have to focus on network behaviors.

Apply your knowledge of Emotet network communication to the available fields in the Zeek connection data. What fields are most useful for anomaly hunting?

There aren't many available for us:

Originating (Source) IP
Originating (Source) Port
Responding (Dest) IP
Responding (Dest) Port
Duration
Original Bytes
Respond Bytes

So, we have to consider the anomaly types that are most likely to manifest here. On the initial infection side:

Delivery from a phishing link
Externally facing SMB service exploited via EternalBlue vulnerability

On the lateral movement/spread side:

SMB exploitation via EternalBlue

In many of these cases you are limited to only a few types of anomalies: frequency of occurrence, baseline deviations, unexpected one-to-many relationships, and abnormal relationships. Most of these require some application of the friendly intelligence information.

How can I manipulate data to see it?

Delivery from a phishing link

Search for inbound port 25 traffic. Aggregate port 25 by source host. Sort by LFO.

Externally facing SMB service exploited via EternalBlue vulnerability

Search inbound port 445 traffic. Aggregate port 445 by dest host. Compare to friendly intelligence. Should these hosts (if they exist) be receiving SMB data from the internet? Consider the number of connections and total transferred traffic; you can do this by running more aggregations.
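
The inbound SMB check can be sketched in Python as follows. This is a rough illustration, assuming the connection logs are available as dictionaries keyed by the standard Zeek conn.log field names (id.orig_h, id.resp_h, id.resp_p, orig_bytes, resp_bytes). The field names in your index may differ, and the sample records are invented.

```python
import ipaddress
from collections import defaultdict

# Internal range taken from the friendly intelligence for this lab.
INTERNAL = ipaddress.ip_network("192.168.100.0/24")

def is_internal(ip):
    return ipaddress.ip_address(ip) in INTERNAL

# Hypothetical connection records (invented for illustration).
conns = [
    {"id.orig_h": "203.0.113.7", "id.resp_h": "192.168.100.12", "id.resp_p": 445,
     "orig_bytes": 1200, "resp_bytes": 800},
    {"id.orig_h": "203.0.113.7", "id.resp_h": "192.168.100.12", "id.resp_p": 445,
     "orig_bytes": 300, "resp_bytes": None},
    {"id.orig_h": "192.168.100.101", "id.resp_h": "192.168.100.2", "id.resp_p": 445,
     "orig_bytes": 500, "resp_bytes": 500},
    {"id.orig_h": "198.51.100.9", "id.resp_h": "192.168.100.10", "id.resp_p": 80,
     "orig_bytes": 100, "resp_bytes": 4000},
]

def inbound_by_dest(conns, port=445):
    """Summarize external-to-internal connections on `port` per dest host."""
    summary = defaultdict(lambda: {"connections": 0, "bytes": 0})
    for c in conns:
        external_source = not is_internal(c["id.orig_h"])
        internal_dest = is_internal(c["id.resp_h"])
        if c["id.resp_p"] == port and external_source and internal_dest:
            entry = summary[c["id.resp_h"]]
            entry["connections"] += 1
            # Byte counts can be missing in Zeek logs; treat them as zero.
            entry["bytes"] += (c.get("orig_bytes") or 0) + (c.get("resp_bytes") or 0)
    return dict(summary)

inbound_smb = inbound_by_dest(conns)
for host, stats in inbound_smb.items():
    print(host, stats)
```

Any host that appears in this summary should then be checked against the friendly intelligence to decide whether it has any business accepting SMB from the internet.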

SMB exploitation via EternalBlue

Search internal-to-internal port 445 traffic. Aggregate unique count of dest hosts by source host. Sort by MFO (most frequent occurrence). Compare to friendly intelligence. Should these hosts (if they exist) be communicating with so many other hosts over SMB? Consider the number of connections and total transferred traffic; you can do this by running more aggregations.
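
Here is a minimal Python sketch of the one-to-many check for lateral movement. It assumes Zeek conn.log field names (id.orig_h, id.resp_h, id.resp_p), uses invented sample records, and labels each source with a role derived from the friendly intelligence ranges. The role_of helper is a hypothetical convenience, not part of the lab.

```python
import ipaddress
from collections import defaultdict

INTERNAL = ipaddress.ip_network("192.168.100.0/24")

def is_internal(ip):
    return ipaddress.ip_address(ip) in INTERNAL

def role_of(ip):
    """Label an internal address using the friendly intelligence ranges."""
    last_octet = int(ip.split(".")[-1])
    if 2 <= last_octet <= 5:
        return "domain controller"
    if 10 <= last_octet <= 15:
        return "web server"
    if 100 <= last_octet <= 245:
        return "workstation"
    return "unknown"

# Hypothetical connection records (invented for illustration).
conns = [
    {"id.orig_h": "192.168.100.150", "id.resp_h": "192.168.100.101", "id.resp_p": 445},
    {"id.orig_h": "192.168.100.150", "id.resp_h": "192.168.100.102", "id.resp_p": 445},
    {"id.orig_h": "192.168.100.150", "id.resp_h": "192.168.100.103", "id.resp_p": 445},
    {"id.orig_h": "192.168.100.101", "id.resp_h": "192.168.100.2", "id.resp_p": 445},
    {"id.orig_h": "192.168.100.101", "id.resp_h": "192.168.100.2", "id.resp_p": 445},
    {"id.orig_h": "203.0.113.7", "id.resp_h": "192.168.100.12", "id.resp_p": 445},
]

def smb_fanout(conns, port=445):
    """Unique internal destinations per internal source, most first."""
    targets = defaultdict(set)
    for c in conns:
        src, dst = c["id.orig_h"], c["id.resp_h"]
        if c["id.resp_p"] == port and is_internal(src) and is_internal(dst):
            targets[src].add(dst)
    rows = [(src, len(dsts), role_of(src)) for src, dsts in targets.items()]
    return sorted(rows, key=lambda row: (-row[1], row[0]))

fanout = smb_fanout(conns)
for src, unique_dests, role in fanout:
    print(src, unique_dests, role)
```

A workstation reaching many other workstations over SMB is far less expected than a workstation talking to a domain controller, which is why the role label sits next to the count.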

Has a User Account Control (UAC) bypass been used for malicious purposes on my network?

Your goal is to use the attack-based hunting method to find evidence of likely UAC bypass using the provided Windows logs.

References:

MITRE ATT&CK Page: https://attack.mitre.org/techniques/T1088/
Microsoft's "How UAC Works" reference: https://docs.microsoft.com/en-us/windows/security/identity-protection/user-account-control/how-user-account-control-works
JYM's blog post on UAC bypass analysis: https://medium.com/@jym/uac-bypass-analysis-7a1379d21d36
Windows EID 4688 Fields: https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/event.aspx?eventID=4688
Windows EID 4689 Fields: https://www.ultimatewindowssecurity.com/securitylog/encyclopedia/event.aspx?eventID=4689
Note: This data set includes a calculated EventTimeDelta field in EID 4689 logs that is non-standard for these log types. It contains the time a process ran before it was terminated.

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab4 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab4. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab4 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years. If you'd like to get more specific, all the data for this lab is timestamped during the month of January 2019.

Initial Research

First, start out by doing some basic research on how UAC functions and how attackers have been able to bypass it. You might even want to simulate this yourself in a lab environment. You can use the links provided in the lab introduction to guide your research, but you probably won't want to start there. You should focus on answering the following:

What does the execution of a process that relies on a UAC prompt look like?
What might attacks on, or suppression of, UAC look like in log data?

What am I looking for?

You're looking for evidence of UAC being bypassed to launch another malicious process.

Where am I likely to find it?

You're working with Windows logs here, and if you look closely you only have Windows process execution (4688) and termination (4689) logs.

Apply your knowledge of UAC execution to the available fields in the Windows log data. What fields are most useful for anomaly hunting?

Here's a short list of considerations:

EventTime
NewProcessName / ProcessName
EventTimeDelta
Command Line

In this case you really have to consider the relationship between the UAC consent.exe process (the prompt the user has to click to allow an elevated process to run) and the execution of that elevated process. That order will occur in a specific manner, but more importantly, consider the human factor involved in acknowledging the prompt.

A few ideas for things to look for:

Rarely executed process
Odd process start and end timing of consent.exe
Unexpected user ownership for the elevated process

How can I manipulate data to see it?

Rarely executed process

Aggregate the NewProcessName field and sort by LFO.

Odd process start and end timing of consent.exe

Search for low consent.exe EventTimeDelta values and look for the elevated process that follows.

Unexpected user ownership for the elevated process

Search for the elevated process and review user ownership. Compare to other executions of that same process.
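
The consent.exe timing check can be sketched in Python like this. Several things here are assumptions: that EventTimeDelta is expressed in seconds, that events carry EventID and HostName keys, that a one-second threshold separates a human click from an automated one, and that the elevated process starts within a few seconds of consent.exe running. The sample events are invented.

```python
from datetime import datetime, timedelta

# Hypothetical 4688/4689 events (invented for illustration).
events = [
    {"EventID": 4688, "HostName": "HOST1", "EventTime": "2019-01-10T10:00:00",
     "NewProcessName": r"C:\Windows\System32\consent.exe"},
    {"EventID": 4689, "HostName": "HOST1", "EventTime": "2019-01-10T10:00:08",
     "ProcessName": r"C:\Windows\System32\consent.exe", "EventTimeDelta": 8.0},
    {"EventID": 4688, "HostName": "HOST1", "EventTime": "2019-01-10T10:00:08",
     "NewProcessName": r"C:\Windows\System32\mmc.exe"},
    {"EventID": 4688, "HostName": "HOST2", "EventTime": "2019-01-11T09:30:00",
     "NewProcessName": r"C:\Windows\System32\consent.exe"},
    {"EventID": 4689, "HostName": "HOST2", "EventTime": "2019-01-11T09:30:00",
     "ProcessName": r"C:\Windows\System32\consent.exe", "EventTimeDelta": 0.2},
    {"EventID": 4688, "HostName": "HOST2", "EventTime": "2019-01-11T09:30:01",
     "NewProcessName": r"C:\Users\Public\updater.exe"},
]

def is_consent(path):
    return path.lower().endswith("\\consent.exe")

def fast_consent_followups(events, max_delta=1.0, window=timedelta(seconds=5)):
    """Find consent.exe runs shorter than `max_delta` seconds and list the
    processes started on the same host around that time."""
    findings = []
    for term in events:
        if term["EventID"] != 4689 or not is_consent(term["ProcessName"]):
            continue
        delta = float(term["EventTimeDelta"])  # assumed to be seconds
        if delta > max_delta:
            continue
        ended = datetime.fromisoformat(term["EventTime"])
        started = ended - timedelta(seconds=delta)
        followers = [
            e["NewProcessName"] for e in events
            if e["EventID"] == 4688
            and e["HostName"] == term["HostName"]
            and not is_consent(e["NewProcessName"])
            and started <= datetime.fromisoformat(e["EventTime"]) <= ended + window
        ]
        findings.append((term["HostName"], term["EventTime"], followers))
    return findings

findings = fast_consent_followups(events)
for host, when, followers in findings:
    print(host, when, followers)
```

The eight-second run on HOST1 looks like a person reading and clicking the prompt, so it is ignored. The sub-second run on HOST2 is the kind of timing worth investigating.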

Does anything in HTTP traffic on our guest network look malicious?

Your goal is to use the data-based hunting method to find anomalous HTTP communication using Bro/Zeek HTTP logs.

References:

Mozilla Overview of HTTP. This provides a general overview of how the protocol works and various fields within it: https://developer.mozilla.org/en-US/docs/Web/HTTP/Overview
MITRE ATT&CK Reference for use of an application layer protocol. This provides an overview of some ways attackers might use HTTP, but covers a lot of ground since this is a broad technique: https://attack.mitre.org/techniques/T1071/.
Bro/Zeek log fields: https://docs.zeek.org/en/stable/scripts/base/protocols/http/main.bro.html#type-HTTP::Info/

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab5 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab5. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab5 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years.
Timestamp data for this exercise is irrelevant.

Initial Research

First, start out by doing some basic research on how attackers use HTTP for malicious purposes. These attacks span multiple categories including:

Use as a command and control channel
A mechanism for redirection to websites hosting malicious content
A protocol that can be eavesdropped to intercept credentials and session cookies
A pathway towards exploitation of browsers or their plugins (flash, java)
A mechanism for carrying web application attacks (SQL injection, XSS, etc.)

This is very broad, which makes DBH for HTTP difficult and diverse. You can use the links provided in the introduction for input.

What fields are most likely to contain evidence of attacks?

We can expect that guest Wi-Fi is an isolated network segment, so there are no internal assets that can be directly accessed other than the routing device upstream from the public network. Since we only have HTTP logs, it's unlikely we'll observe any attacks from the public network to the private network. Aside from misconfigurations, we'll mostly be focused on people using the Wi-Fi network to launch attacks against other customers or organizations.

What fields are most useful for anomaly hunting of HTTP data in this context?

Here's a short list of considerations to limit your scope:

Mime Type
Domain
Cookie

What would be anomalous in these fields?

Here are some ideas for things to look for in relation to each field.

Mime Type

Executable files
Archive files
Suspiciously named files
Unrecognized file extensions
Requests for files with no referrer
Multiple requests for the same file

Domain

Mirroring legitimacy
Generic non-descriptives
High degree of randomness

Cookie

Cookie/website language mismatches
The same session cookie used by multiple sources
Unexpected cookie obfuscation
Sensitive information leakage

How can I manipulate data to see it?

The technique used to manipulate data will vary based on the field and anomaly you're looking at.

Mime Type

Search/Aggregate by unique file name + Apply reductions to exclude normal
- Executable files
- Archive files
- Suspiciously named files
- Unrecognized file extensions
- Requests for files with no referrer
Aggregate file name by count
- Multiple requests for the same file

Domain

Search/Aggregate by unique domain + Apply reductions to exclude normal
- Mirroring legitimacy
- Generic non-descriptives
Search + Apply entropy calculation
- High degree of randomness
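
For the entropy calculation, Shannon entropy over the characters of each domain is a common starting point. The following is a minimal sketch with invented sample domains. Keep in mind that entropy is sensitive to string length, so treat the ranking as a way to order your review rather than a verdict.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Shannon entropy, in bits per character, of a string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical domain values (invented for illustration).
domains = ["google.com", "x7f3kq9zt2vbw8.example", "intranet.example"]

# Highest entropy first: the most random-looking names rise to the top.
ranked = sorted(
    ((domain, shannon_entropy(domain)) for domain in domains),
    key=lambda item: item[1],
    reverse=True,
)
for domain, entropy in ranked:
    print(f"{entropy:.2f}  {domain}")
```

The same function can be applied to cookie values when you look for unexpected obfuscation.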

Cookie

Search responses by content language for individual country + aggregate unique cookies + examine individually
- Cookie/website language mismatches
Aggregate by unique src ip + Aggregate by unique cookies
- The same session cookie used by multiple sources
Aggregate by unique cookies + apply entropy calculation
- Unexpected cookie obfuscation
Aggregate by unique cookies + examine individually / Search for unique strings
- Sensitive information leakage
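
Finding a session cookie shared by multiple sources is a simple grouping problem. Here is a minimal Python sketch, assuming each record exposes the client address and cookie under the keys src_ip and cookie. Those key names and the sample records are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical HTTP records (invented for illustration).
records = [
    {"src_ip": "10.0.0.21", "cookie": "session=abc123"},
    {"src_ip": "10.0.0.21", "cookie": "session=abc123"},
    {"src_ip": "10.0.0.57", "cookie": "session=abc123"},
    {"src_ip": "10.0.0.33", "cookie": "session=zzz999"},
    {"src_ip": "10.0.0.40", "cookie": None},
]

def shared_cookies(records, min_sources=2):
    """Return {cookie: [source IPs]} for cookies seen from several sources."""
    sources = defaultdict(set)
    for record in records:
        cookie = record.get("cookie")
        if cookie:  # skip requests that carried no cookie
            sources[cookie].add(record["src_ip"])
    return {
        cookie: sorted(ips)
        for cookie, ips in sources.items()
        if len(ips) >= min_sources
    }

shared = shared_cookies(records)
for cookie, ips in shared.items():
    print(cookie, ips)
```

A shared cookie is not proof of session hijacking on its own, since NAT or shared devices can produce the same pattern, but it narrows the set of sessions to examine individually.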

Does anything in VPN authentication logs look malicious?

Your goal is to use the data-based hunting method to find anomalous VPN authentication logs.

References:

MITRE ATT&CK Reference for external remote services: https://attack.mitre.org/techniques/T1133/
MITRE ATT&CK Reference for redundant access: https://attack.mitre.org/techniques/T1108/
MITRE ATT&CK Reference for valid accounts: https://attack.mitre.org/techniques/T1078/

This lab also includes friendly intelligence in the form of a user list.

EXECUTIVES:

mark.higgins
carmen.hart
nancy.bower
derek.ellis
sheila.scerra
katelyn.perez
dolores.pepin
william.thacker
douglas.hubbs
tony.fann

IT ADMINS:

robert.daugherty
mark.moskovitz
dominic.elias
carlos.snellings
janet.morgan

ACCOUNTING:

leanna.tung
janice.turman
russell.martin
loyd.emily
jenifer.king

SALES:

jasmine.miller
pam.mcfarland
faye.rene
dorothy.gray
peggy.fondren
anita.howard
daniel.elmore
heather.folks
nicole.avella
eva.williamson

DEVELOPERS:

joe.jackson
william.osher
kathy.ayala
patricia.garcia
george.gertsch
sarah.jenkins
james.sedotal
cindy.wilson
erin.cheeks
anthony.gieger
anthony.jones
fannie.lewis
joseph.bell
david.bryan
timothy.sirles
james.sainz
leslie.galbraith
philip.fitzgibbon
rachael.hall
melissa.mcguire

Executive and sales users spend quite a bit of time traveling, but most everyone uses the VPN for remote access.

Accessing the Lab Data

All data for this exercise is already loaded on to the student VM in the lab6 index.
To query the data, open Kibana using the desktop icon. Click the Discover tab and search for _index:lab6. This will return all the lab data.
Any search or data transformation you perform must include the string _index:lab6 to limit your result set exclusively to this lab.
To ensure you see all the data, make sure the time range selection at the top right of the screen is always set to Last 5 Years. If you'd like to get more specific, all the data for this lab is timestamped from August through December 2018.

Initial Research

First, start out by doing some basic research on how attackers leverage VPN access as part of their compromise. These attacks span multiple categories including:

Attacking flaws in the VPN appliance itself
Using the VPN to access the internal network with stolen credentials belonging to a legitimate account
Using the VPN to access the internal network with credentials belonging to an attacker-created account established at some other point in the compromise.

What fields are most likely to contain evidence of attacks?

VPN logs don't provide much context, but they do tell us what user authenticated to the VPN, and where they authenticated from. We can treat this like we would treat most authentication logs, with the caveat that this authentication mechanism can be accessed from the outside world.

Here's a short list of considerations to limit your scope:

EventTime
username
source_ip
source_country
source_state

What would be anomalous in these fields?

Here are some ideas for things to look for in relation to each field.

EventTime

Logins at unexpected times for a specific user
Logins from multiple distant locations within a short time window
Unexpected input formatting

username

Unknown or illegitimate usernames
Improperly formatted usernames
Unexpected input formatting

source_ip

Large or unexpected number of source IPs for a user
Unexpected input formatting

source_country

Large or unexpected number of source countries for a user
Countries that a user doesn't travel to or work from
Unexpected input formatting

source_state

Large or unexpected number of source states
States that a user doesn't travel to or work from
Unexpected input formatting

How can I manipulate data to see it?

The technique used to manipulate data will vary based on the field and anomaly you're looking at.

EventTime

Search logins per user and compare to normal work hours
Search logins per user and identify cases where two logins from distant locations occur within a short time window.
Aggregate by timestamp value (reducing precision) and sort by LFO.
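
The "distant locations within a short time window" check can be approximated without geolocation by flagging consecutive logins for the same user that come from different countries. This is a minimal sketch: the two-hour window, the country code values, and the sample logins are assumptions, and a real distance calculation would need coordinates that these logs may not provide.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical VPN logins (invented for illustration).
logins = [
    {"EventTime": "2018-09-03T08:00:00", "username": "alice.example", "source_country": "US"},
    {"EventTime": "2018-09-03T08:45:00", "username": "alice.example", "source_country": "RO"},
    {"EventTime": "2018-09-03T09:00:00", "username": "bob.example", "source_country": "US"},
    {"EventTime": "2018-09-05T09:00:00", "username": "bob.example", "source_country": "DE"},
]

def event_time(login):
    return datetime.fromisoformat(login["EventTime"])

def rapid_location_changes(logins, window=timedelta(hours=2)):
    """Flag consecutive logins by one user from different countries
    that occur within `window` of each other."""
    by_user = defaultdict(list)
    for login in logins:
        by_user[login["username"]].append(login)

    findings = []
    for user, events in by_user.items():
        events.sort(key=event_time)
        for previous, current in zip(events, events[1:]):
            gap = event_time(current) - event_time(previous)
            changed = previous["source_country"] != current["source_country"]
            if changed and gap <= window:
                findings.append(
                    (user, previous["source_country"], current["source_country"], gap)
                )
    return findings

findings = rapid_location_changes(logins)
for user, origin, destination, gap in findings:
    print(user, origin, "->", destination, "in", gap)
```

The second user also changes country, but two days apart, which is plausible travel and is not flagged.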

username

Search/aggregate by unique username. Compare to friendly intelligence.
Search/aggregate by unique username. Sort by LFO. Look for improper formatting.
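
Both username checks can be sketched together in Python. The first.last pattern is inferred from the user list above, KNOWN_USERS holds only a small subset of that list to keep the example short, and the observed values are invented.

```python
import re

# Subset of the friendly intelligence user list (extend with the full list).
KNOWN_USERS = {"mark.higgins", "janet.morgan", "leanna.tung", "joe.jackson"}

# Assumed convention, inferred from the user list: lowercase first.last
USERNAME_FORMAT = re.compile(r"[a-z]+\.[a-z]+")

def review_usernames(observed):
    """Return {username: reason} for names that need a closer look."""
    findings = {}
    for name in observed:
        if USERNAME_FORMAT.fullmatch(name) is None:
            findings[name] = "improper format"
        elif name not in KNOWN_USERS:
            findings[name] = "not in user list"
    return findings

# Hypothetical usernames seen in the VPN logs (invented for illustration).
observed = ["mark.higgins", "admin", "MARK.HIGGINS", "john.doe"]
flagged = review_usernames(observed)
for name, reason in flagged.items():
    print(f"{name!r}: {reason}")
```

Printing with repr makes stray whitespace or control characters in a username visible, which is easy to miss in a table view.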

source_ip

Search/aggregate unique source IPs by user. Compare to friendly intelligence.

source_country

Search/aggregate unique countries by user. Compare to friendly intelligence.

source_state

Search/aggregate unique states by user. Compare to friendly intelligence.
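
The per-user location aggregations can be sketched as follows. This example counts distinct source countries per user and sets aside the groups that are expected to travel. TRAVELERS contains only two names from the executive and sales lists to keep the example short, and the login records are invented for illustration.

```python
from collections import defaultdict

# Executives and sales travel often (subset of the user list, for brevity).
TRAVELERS = {"mark.higgins", "jasmine.miller"}

# Hypothetical VPN logins (invented for illustration).
logins = [
    {"username": "mark.higgins", "source_country": "US"},
    {"username": "mark.higgins", "source_country": "FR"},
    {"username": "joe.jackson", "source_country": "US"},
    {"username": "joe.jackson", "source_country": "BR"},
    {"username": "leanna.tung", "source_country": "US"},
]

def values_per_user(logins, field="source_country"):
    """Return {username: sorted unique values of `field`}."""
    seen = defaultdict(set)
    for login in logins:
        seen[login["username"]].add(login[field])
    return {user: sorted(values) for user, values in seen.items()}

def unexpected_multi_location(logins, field="source_country", travelers=TRAVELERS):
    """Users outside the traveling groups seen from more than one location."""
    return {
        user: values
        for user, values in values_per_user(logins, field).items()
        if len(values) > 1 and user not in travelers
    }

suspicious = unexpected_multi_location(logins)
for user, countries in suspicious.items():
    print(user, countries)
```

Pass field="source_state" or field="source_ip" to reuse the same function for the other two aggregations.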

