14MAR19
“Wire data is the observed behavior and communication between network elements which is an important source of information used by IT operations staff to troubleshoot performance issues, create activity baselines, detect anomalous activity, investigate security incidents, and discover IT assets and their dependencies.” – Wikipedia
What IS Wire Data?
I was going to come up with my own definition, but I think Wikipedia gave a fine, albeit academic, explanation. For me, wire data falls into two categories: verbose (like packet capture) and metadata. The easier of the two to operationalize (and the one that arguably provides the most bang for your buck) is metadata, which we'll focus on in this blog post!
It doesn’t matter if it's metadata created from NetFlow, Zeek, Meek, Fleek or Stream (note: Meek and Fleek aren't real); it's all just bits of information about network traffic. You can create metadata from live network traffic via network taps, switches/routers or even network interfaces on local hosts. Wire data from these sources then allows security analysts to gain context into security events, detect unusual events and pivot across data sources that are usually locked in proprietary vendor logs.
You can even create wire data or “network metadata” from PCAPs using Zeek or Splunk Stream. The advantage of metadata over packet capture (PCAP) is that it is significantly smaller while still providing much of the relevant data. For example, if we take one of the PCAPs hosted here and run it through Zeek, it looks like this:
As you can see, Zeek reads in the PCAP, then extracts metadata and writes it to TSV (tab-separated value) formatted .log files labeled by type. Below is an example of the http.log in raw text form:
It’s not the easiest thing to read, but that’s where Splunk comes in. ;-)
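If you want to poke at a Zeek log before it ever reaches Splunk, here is a minimal Python sketch of reading the TSV layout described above. It assumes the default Zeek log format, where header lines start with "#" and a "#fields" line names the columns. The sample rows, IP addresses and the parse_zeek_tsv helper are made up for illustration; they are not taken from the PCAP in this post.

```python
import io

# Illustrative, made-up sample in the Zeek TSV layout; not real captured traffic.
SAMPLE_LOG = (
    "#separator \\x09\n"
    "#path\thttp\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tmethod\thost\turi\n"
    "#types\ttime\tstring\taddr\taddr\tstring\tstring\tstring\n"
    "1552521600.000000\tCabc123\t192.0.2.10\t198.51.100.5\tGET\texample.com\t/index.html\n"
    "1552521601.500000\tCdef456\t192.0.2.11\t198.51.100.5\tPOST\texample.com\t/login\n"
    "#close\t2019-03-14-00-00-02\n"
)


def parse_zeek_tsv(lines):
    """Yield one dict per record, keyed by the names on the #fields line."""
    fields = None
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        if line.startswith("#fields"):
            # Column names follow the "#fields" token, tab separated.
            fields = line.split("\t")[1:]
            continue
        if line.startswith("#"):
            # Other header/footer lines (#separator, #types, #close, ...).
            continue
        if fields is None:
            raise ValueError("data row seen before #fields header")
        yield dict(zip(fields, line.split("\t")))


records = list(parse_zeek_tsv(io.StringIO(SAMPLE_LOG)))
for r in records:
    print(r["id.orig_h"], r["method"], r["host"] + r["uri"])
```

To run it against a real file, pass an open file handle (for example open("http.log")) to parse_zeek_tsv instead of the in-memory sample.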
Now, if you recall, I talked about how wire data is the "metadata" of network traffic. In the example above, we have historical network traffic in the form of a PCAP. After Zeek reads the PCAP and writes its logs, the data shrinks from 7.4M to 256K. In other words, the "metadata" is only about 3% of the size of the captured network traffic (PCAP) but has nearly everything that you would want to use for network defense or hunting! Now you might not always have such an impressive reduction in size, but full network traffic (on the wire or saved in PCAP) will still be larger than wire data, whether it be Zeek, NetFlow or Splunk Stream.
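For the curious, the 3% figure is just the ratio of the two sizes. A quick sketch of the arithmetic, assuming the "M" and "K" above are binary megabytes and kilobytes (the post doesn't say which, and the answer is roughly the same either way):

```python
# Sizes from the example above; the binary-unit interpretation is an assumption.
pcap_bytes = 7.4 * 1024 * 1024   # 7.4M PCAP
log_bytes = 256 * 1024           # 256K of Zeek logs

ratio = log_bytes / pcap_bytes
print(f"Zeek logs are {ratio:.1%} of the PCAP size")
```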
What’s the Difference?
In the section above I discussed three different types of network metadata: NetFlow, Zeek logs and Splunk Stream. Let’s break down what these tools offer!
NetFlow/Flow Data
NetFlow is a Cisco-developed feature that “provide[s] network administrators with access to information concerning IP flows within their data networks.”[1] Basically, it's metadata generated about network traffic as it traverses routing devices. The original goal of NetFlow was to provide network administrators information about network traffic between Cisco devices, but security analysts quickly realized that this network information was invaluable for seeing the source and destination tuples.
NetFlow (and other variants of flow data like sFlow, Argus and others) generally creates network metadata from the Transport, Network and Data Link layers of the OSI model. This means that the metadata generated by NetFlow typically consists of IP addresses (source and destination), the number of bytes/packets and the TCP/UDP ports utilized. NetFlow doesn't create information about the content of the network traffic. I like to think of it like the phone bill: you can see what time the call occurred, who it was between and how long it lasted, but not much else.
One final thing to always remember about NetFlow is that it was originally designed for network diagnostics and is quite often only enabled with “sampling.” Sampling means that you reduce your NetFlow telemetry to a certain fraction of packets (often 1 out of every 100). This is great for network diagnostics, but not great for security analysts!
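To make the phone bill analogy concrete, here is a rough Python sketch of what a flow record carries and the kind of question it can answer. The FlowRecord class and its field names are hypothetical and simplified; they are not the actual NetFlow v5/v9 schema, and the sample flows are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical, simplified flow record: who talked to whom, how much, how long."""
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    protocol: str
    packets: int
    byte_count: int
    start: float  # seconds
    end: float    # seconds

    @property
    def duration(self) -> float:
        return self.end - self.start


# Invented sample flows using documentation-reserved IP ranges.
flows = [
    FlowRecord("192.0.2.10", "198.51.100.5", 49152, 443, "tcp", 120, 98_000, 0.0, 12.5),
    FlowRecord("192.0.2.10", "203.0.113.7", 49153, 53, "udp", 2, 180, 1.0, 1.1),
    FlowRecord("192.0.2.11", "198.51.100.5", 49200, 443, "tcp", 40, 31_000, 3.0, 9.0),
]

# "Top talkers": total bytes per (source, destination) pair.
bytes_by_pair = Counter()
for f in flows:
    bytes_by_pair[(f.src_ip, f.dst_ip)] += f.byte_count

top_pair, top_bytes = bytes_by_pair.most_common(1)[0]
print(top_pair, top_bytes)
```

Notice what is missing: there is no URL, no user agent and no payload. Everything here can be answered from the "bill" alone.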
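Why is sampling a problem for security? A back-of-the-envelope sketch helps. It assumes each packet is independently sampled with probability 1/N, which is a simplification (real devices may sample deterministically or per flow), and miss_probability is a helper name made up for this post.

```python
def miss_probability(packets_in_flow: int, sample_one_in: int) -> float:
    """Chance that NONE of a flow's packets are sampled, assuming each packet
    is independently picked with probability 1/sample_one_in."""
    return (1 - 1 / sample_one_in) ** packets_in_flow


# A short 5-packet connection versus a 1,000-packet bulk transfer at 1-in-100 sampling.
small = miss_probability(5, 100)
large = miss_probability(1000, 100)

print(f"5-packet flow missed entirely: {small:.1%}")
print(f"1000-packet flow missed entirely: {large:.4%}")
```

Under that assumption, a 5-packet flow is missed about 95% of the time, while the big transfer is almost always seen. Bulk traffic is what network diagnostics cares about; the short, quiet connections are often what a hunter cares about.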
Zeek (formerly Bro)
Zeek is the wire data generator formerly known as Bro (or even more widely known as Bro IDS). Since Bro was known as Bro IDS for many years, there's a misconception that Zeek is just another Snort. In fact, Zeek is less of an IDS than a network scripting language; at its base level, it can generate metadata about network traffic (either from a live feed on the network OR in PCAP form).
In more complex installations, users can do everything from writing their own parsers for new protocols to performing modifications of the metadata before writing to disk. Zeek creates metadata on all seven layers of the OSI model including information IN the data. For example, Zeek users can actually extract files that are traversing the network along with hashing that file and identifying its type—e.g. whether the file is a PDF or a Windows executable.
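To give a feel for what "hashing that file and identifying its type" means, here is a small Python sketch. It is not how Zeek does it internally (Zeek has its own file analysis framework); it is just the general idea, using a SHA-256 hash and two well-known leading byte signatures ("%PDF-" for PDFs, "MZ" for Windows executables). The describe_file helper and the sample bytes are made up for illustration.

```python
import hashlib

# Well-known leading "magic" bytes; a tiny illustrative subset.
MAGIC_PREFIXES = {
    b"%PDF-": "PDF document",
    b"MZ": "Windows executable (PE/DOS header)",
}


def describe_file(data: bytes) -> dict:
    """Return a hash, a best-guess type and the size for a blob of bytes."""
    file_type = "unknown"
    for prefix, label in MAGIC_PREFIXES.items():
        if data.startswith(prefix):
            file_type = label
            break
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "type": file_type,
        "size": len(data),
    }


# Invented stand-ins for files pulled off the wire; not real documents or binaries.
pdf_like = b"%PDF-1.4\n% illustrative bytes only\n"
exe_like = b"MZ\x90\x00 illustrative bytes only"

for blob in (pdf_like, exe_like):
    info = describe_file(blob)
    print(info["type"], info["size"], info["sha256"][:16])
```

The hash is what lets you pivot: once you have it, you can look the file up in threat intelligence sources without ever handling the file itself.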
Splunk Stream
Splunk Stream is a free tool from Splunk that allows you to create metadata very similar to Zeek's. Although it understands fewer protocols than Zeek, it covers almost every protocol that most organizations see in their network. Not only that, it can actually produce more data than Zeek about the protocols it understands. For example, when Zeek sees HTTP traffic, it can read and deliver metadata about the header information (user agent string, HTTP methods, etc.). Splunk Stream can even, if you choose to enable the capability, record the payload of the HTTP session. This isn't "metadata," but this sort of information is beneficial for security analysts and provides a tactical way of gathering PCAP-like data without the complexities of a full-PCAP deployment. Not only that, but it's a tool created by Splunk and is easily integrated with Splunk Enterprise. On top of all these other features, Splunk Stream also allows for the ingest of flow data (like NetFlow)!
Okay, I’m Sold – I Want to Try Wire Data!
Well obviously, you could set it up yourself! If you try Zeek, Splunk has written the Splunk Add-on for Zeek aka Bro that allows you to ingest the data Zeek creates easily. However, perhaps a more natural route for Splunk customers is Splunk Stream. As previously discussed, this is a free tool from Splunk that will generate network wire data. Check out these blogs on Splunk Stream installation and configuration! Once you have Stream or Zeek up and running, you may want to try out the new Splunk App, Splunk Essentials for Wire Data.
It provides over 49 examples of Wire Data use cases broken into categories, and each category has different use cases broken into examples and stages of maturity.
Each example has extensive documentation, explanations and sample searches to help you perform these use cases in your own network!
The app even includes sample data so that you can see what the searches will return in Splunk Stream.
I hope this post has helped you understand the value of wire data and what it can help you find in your network. Many of the blogs in the “Hunting with Splunk” blog series depend upon wire data for advanced hunting! The more you use wire data, the more you'll understand why it is the favorite data source of Security Splunkers like myself.
With that, Happy Hunting!
[1] https://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html
Ryan Kovar
NY. AZ. Navy. SOCA. KBMG. DARPA. Splunk.