Wire Data, Huh! What Is It Good For? Absolutely Everything, Say It Again Now! | Splunk (2024)

14MAR19

“Wire data is the observed behavior and communication between network elements which is an important source of information used by IT operations staff to troubleshoot performance issues, create activity baselines, detect anomalous activity, investigate security incidents, and discover IT assets and their dependencies.” – Wikipedia

What IS Wire Data?

I was going to come up with my own definition, but I think Wikipedia gave a fine, albeit academic, explanation. For me, wire data falls into two categories: verbose (like packet capture) and metadata. The easier of the two to operationalize (and the one that arguably provides the most bang for your buck) is metadata, which we'll focus on in this blog post!

It doesn’t matter if it's metadata created from NetFlow, Zeek, Meek, Fleek or Stream (note: Meek and Fleek aren't real); it's all just bits of information about network traffic. You can create metadata from live network traffic via network taps, switches/routers or even the network interfaces on local hosts. Wire data from these sources then allows security analysts to gain context into security events, detect unusual events and pivot across data sources that are usually locked in proprietary vendor logs.

You can even create wire data or “network metadata” from PCAPs using Zeek or Splunk Stream. The advantage of metadata over packet capture (PCAP) is that it is significantly smaller while still providing much of the relevant data. For example, if we take one of the PCAPs hosted here and run it through Zeek, it looks like this:

[Screenshot: running Zeek against the PCAP]

As you can see, Zeek reads in the PCAP, then extracts out metadata and puts them into TSV (tab separated value) formatted .log files labeled by type. Below is an example of the http.log in raw text form:

[Screenshot: raw http.log output]

It’s not the easiest thing to read, but that’s where Splunk comes in. ;-)
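If you want to poke at one of those .log files without Splunk, here is a minimal sketch of reading Zeek's TSV format in Python. It relies on the `#fields` header line that Zeek writes at the top of each TSV log to name the columns. The sample record, the trimmed field list and the function name `parse_zeek_tsv` are made up for illustration; a real http.log carries many more columns.

```python
import io


def parse_zeek_tsv(lines):
    """Yield one dict per record from a Zeek TSV log.

    Column names come from the '#fields' header line; every other
    line starting with '#' is treated as metadata and skipped.
    """
    fields = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        if line.startswith("#"):
            if line.startswith("#fields"):
                # "#fields<TAB>ts<TAB>uid..." -> ["ts", "uid", ...]
                fields = line.split("\t")[1:]
            continue
        yield dict(zip(fields, line.split("\t")))


# Hypothetical, heavily trimmed sample (not taken from the PCAP above).
SAMPLE = (
    "#separator \\x09\n"
    "#path\thttp\n"
    "#fields\tts\tuid\tid.orig_h\tid.resp_h\tmethod\thost\turi\n"
    "#types\ttime\tstring\taddr\taddr\tstring\tstring\tstring\n"
    "1552521600.000000\tCabc123\t10.0.0.5\t192.0.2.10\tGET\texample.com\t/index.html\n"
    "#close\t2019-03-14-00-00-00\n"
)

records = list(parse_zeek_tsv(io.StringIO(SAMPLE)))
for rec in records:
    print(rec["id.orig_h"], rec["method"], rec["host"] + rec["uri"])
```

To read a real file, pass an open file handle instead of the `io.StringIO` wrapper. Every value comes back as a string, so convert timestamps and ports yourself if you need them typed.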

Now, if you recall, I talked about how wire data is the "metadata" of network traffic. In the example above, we have historical network traffic in the form of a PCAP. After Zeek reads the PCAP and writes its logs, the data shrinks from 7.4M to 256K. In other words, the "metadata" is only about 3% of the size of the captured network traffic (PCAP) but has nearly everything that you would want to use for network defense or hunting! Now you might not always have such an impressive reduction of size, but full network traffic (on the wire or saved in PCAP) will still be greater in size than wire data, whether it be Zeek, NetFlow or Splunk Stream.
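For anyone who wants to check that percentage, here is the arithmetic as a tiny sketch. It assumes the "M" and "K" suffixes are the binary units that tools like `ls -lh` print, which is a guess on my part; with decimal units the answer lands in roughly the same place.

```python
# Assumption: 7.4M and 256K are binary units (MiB and KiB).
pcap_bytes = 7.4 * 1024 * 1024
logs_bytes = 256 * 1024

ratio = logs_bytes / pcap_bytes
print(f"Zeek logs are {ratio:.1%} of the PCAP size")
```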

What’s the Difference?

In the section above I discussed three different types of network metadata: NetFlow, Zeek logs and Splunk Stream. Let’s break down what these tools offer!

NetFlow/Flow Data

NetFlow is a Cisco-developed feature that “provide[s] network administrators with access to information concerning IP flows within their data networks.”[1] Basically, it's metadata generated on network data as it traverses routing devices. The original goal of NetFlow was to provide network administrators information about network traffic between Cisco devices, but security analysts quickly realized that this network information was invaluable for seeing the source and destination tuples.

NetFlow, and other variants of flow data like sFlow and Argus, generally create network metadata from the Transport, Network and Data Link layers of the OSI model. This means that the metadata generated by NetFlow typically consists of IP addresses (source and destination), the number of bytes/packets and the TCP/UDP ports utilized. NetFlow doesn't create information about the content of the network traffic. I like to think of it like a phone bill: you can see what time the call occurred, which callers it was between, how much talking took place and how long it lasted, but not much else.
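To make the phone-bill idea concrete, here is a simplified sketch of how packets roll up into flow records. It keys each flow on a 5-tuple and keeps only counters and timestamps, never payload. The `Packet` class, the `aggregate_flows` function and the sample traffic are hypothetical; real exporters use additional key fields and expire flows on timeouts, which this sketch ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    ts: float        # capture time in seconds
    src_ip: str
    dst_ip: str
    src_port: int
    dst_port: int
    proto: str
    size: int        # packet length in bytes


def aggregate_flows(packets):
    """Group packets by 5-tuple and keep per-flow counters only."""
    flows = {}
    for p in packets:
        key = (p.src_ip, p.dst_ip, p.src_port, p.dst_port, p.proto)
        flow = flows.get(key)
        if flow is None:
            flows[key] = {
                "first_seen": p.ts,
                "last_seen": p.ts,
                "packets": 1,
                "bytes": p.size,
            }
        else:
            flow["first_seen"] = min(flow["first_seen"], p.ts)
            flow["last_seen"] = max(flow["last_seen"], p.ts)
            flow["packets"] += 1
            flow["bytes"] += p.size
    return flows


# Made-up traffic using documentation address ranges.
packets = [
    Packet(0.00, "10.0.0.5", "192.0.2.10", 49152, 443, "tcp", 120),
    Packet(0.05, "10.0.0.5", "192.0.2.10", 49152, 443, "tcp", 1500),
    Packet(0.10, "10.0.0.7", "198.51.100.53", 53000, 53, "udp", 80),
]

flows = aggregate_flows(packets)
for key, stats in flows.items():
    print(key, stats)
```

Notice what survives: who talked to whom, when, and how much. Nothing about what was actually said.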

One final thing to always remember about NetFlow is that it was originally designed for network diagnostics and is quite often only enabled with “sampling.” Sampling means that you reduce your NetFlow telemetry to a fraction of the packets seen (often 1 out of every 100). This is great for network diagnostics, but not great for security analysts!
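Here is a rough sketch of why sampling hurts security use cases. It assumes each packet is sampled independently at random with probability 1/N, which is a simplification (some devices sample deterministically), and the function name `miss_probability` is my own. Under that assumption, the chance that a flow of k packets never shows up in your telemetry is (1 - 1/N)^k.

```python
def miss_probability(packets_in_flow: int, sample_one_in: int) -> float:
    """Chance that no packet of a flow is sampled.

    Assumes independent random sampling at a rate of 1 in `sample_one_in`.
    """
    return (1 - 1 / sample_one_in) ** packets_in_flow


for k in (1, 3, 10, 100, 1000):
    p = miss_probability(k, 100)
    print(f"{k:>5}-packet flow missed with probability {p:.4f}")
```

Big file transfers almost always get seen, but a three-packet beacon is invisible the vast majority of the time, and short connections are often exactly what a hunter is looking for.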

Zeek (formerly Bro)

Zeek is the wire data generator formerly known as Bro (or even more widely known as Bro IDS). Since Bro was known as Bro IDS for many years, there's a misconception that Zeek is just another Snort. In fact, Zeek is less of an IDS than a network scripting language; at its base level, it can generate metadata about network traffic (either from a live feed on the network OR in PCAP form).

In more complex installations, users can do everything from writing their own parsers for new protocols to performing modifications of the metadata before writing to disk. Zeek creates metadata on all seven layers of the OSI model including information IN the data. For example, Zeek users can actually extract files that are traversing the network along with hashing that file and identifying its type—e.g. whether the file is a PDF or a Windows executable.
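As an illustration of the hash-and-identify idea, here is a small Python sketch that fingerprints a blob of bytes and guesses its type from the leading "magic" bytes. This is not how Zeek implements it; the `MAGIC` table, the `describe_file` function and the choice of SHA-256 are assumptions made for the example. The two signatures used (`%PDF-` for PDFs and `MZ` for Windows executables) are the well-known ones.

```python
import hashlib

# Leading byte signatures for the two file types mentioned above.
MAGIC = {
    b"%PDF-": "PDF document",
    b"MZ": "Windows executable",
}


def describe_file(data: bytes) -> dict:
    """Return a hash, a guessed type and the size for extracted file bytes."""
    file_type = "unknown"
    for signature, name in MAGIC.items():
        if data.startswith(signature):
            file_type = name
            break
    return {
        "sha256": hashlib.sha256(data).hexdigest(),
        "type": file_type,
        "size": len(data),
    }


# Made-up payloads standing in for files carved off the wire.
print(describe_file(b"%PDF-1.7\n% pretend document body"))
print(describe_file(b"MZ\x90\x00 pretend executable body"))
print(describe_file(b"just some text"))
```

The hash is what lets you pivot: once you have it, you can look the file up in threat intelligence sources or search for the same file elsewhere in your environment.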

Splunk Stream

Splunk Stream is a free tool from Splunk that allows you to create metadata very similar to Zeek. Although it understands fewer protocols than Zeek, it covers almost every protocol that most organizations see in their network. Not only that, it can actually produce more data than Zeek about the protocols it understands. For example, when Zeek sees HTTP information, it can read and deliver metadata about its header information (user agent string, HTTP methods, etc.). Splunk Stream can even, if you choose to enable the capability, record the payload of the HTTP session. This isn't "metadata," but this sort of information is beneficial for security analysts and provides a tactical way of gathering PCAP-like data without requiring the complexities of a full-PCAP deployment. Not only that, but it's a tool created by Splunk and is easily integrated with Splunk Enterprise. On top of all these other features, Splunk Stream also allows for the ingest of flow data (like NetFlow) as well!

Okay, I’m Sold – I Want to Try Wire Data!

Well obviously, you could set it up yourself! If you try Zeek, Splunk has written the Splunk Add-on for Zeek aka Bro that allows you to ingest the data Zeek creates easily. However, perhaps a more natural route for Splunk customers is Splunk Stream. As previously discussed, this is a free tool from Splunk that will generate network wire data. Check out these blogs on Splunk Stream installation and configuration! Once you have Stream or Zeek up and running, you may want to try out the new Splunk App, Splunk Essentials for Wire Data.

It provides over 49 examples of Wire Data use cases broken into categories, and each category has different use cases broken into examples and stages of maturity.

Each example has extensive documentation, explanations and sample searches to help you perform these use cases in your own network!

The app even includes sample data so that you can see what the searches will return in Splunk Stream.

I hope this post has helped you understand the value of wire data and what it can help you find in your network. Many of the blogs in the “Hunting with Splunk” blog series depend upon wire data for advanced hunting! The more you use wire data, the more you'll understand why it is the favorite data source of Security Splunkers like myself.

With that, Happy Hunting!

[1] https://www.cisco.com/en/US/technologies/tk648/tk362/technologies_white_paper09186a00800a3db9.html

Ryan Kovar

NY. AZ. Navy. SOCA. KBMG. DARPA. Splunk.
