Publications - Reservoir Labs, Inc.
Full packet capture (FPC) consists of capturing all packets and writing them to permanent storage to enable offline forensic analysis. FPC, however, suffers from a scalability problem: at today's typical traffic rates of 10 Gbps or above, it either becomes intractable or requires very expensive processing and storage hardware, which rapidly erodes the economic viability of the technology. The first piece of good news is that, for many practical cases, full packet capture is not necessary. This rationale stems from the well-known law of heavy-tailed traffic: from an analysis standpoint, most of the interesting features in network traffic (network attacks among them) are found in a very small fraction of it. Further, in some cases full packet capture is not only unnecessary but can also be a liability, since sensitive information is kept in non-ephemeral storage. The second piece of good news is that all the heavy lifting Zeek does in processing network traffic can be leveraged to overcome both the intractability and the liability problems. Indeed, Zeek can be brought into the loop to perform selective packet capture (SPC), a process by which the Zeek workers themselves decide, selectively and at fine granularity, which traffic must be stored to disk. In this talk, Reservoir Labs will present a workflow for performing selective packet capture with the Zeek sensor at very high traffic rates.
The workflow allows Zeek scripts to trigger packet captures directly, based on real-time analysis of the traffic itself. We will describe the key data structures needed to perform this task efficiently and introduce several Zeek scripts and use cases illustrating how SPC can capture just the packets needed for meaningful forensic analysis while minimizing exposure to liability risk.

The increasing size, variety, rate of growth and change, and complexity of network data have warranted advanced network analysis and services. Tools that provide automated analysis through traditional or advanced signature-based systems or machine learning classifiers suffer from practical difficulties. These tools fail to provide comprehensive and contextual insights into the network when put to practical use in operational cyber security. In this paper, we present an effective tool for network security and traffic analysis that uses high-performance data analytics based on a class of unsupervised learning algorithms called tensor decompositions. The tool aims to provide scalable analysis of network traffic data, reduce the cognitive load on network analysts, and be network-expert-friendly by presenting clear and actionable insights into the network. We demonstrate the successful use of the tool in two very different operational cyber security environments, namely (1) the security operations center (SOC) for the SCinet network at SC16, the International Conference for High Performance Computing, Networking, Storage and Analysis, and (2) Reservoir Labs' local area network (LAN). In each of these environments, we produce actionable results for cyber security specialists, including (but not limited to) (1) finding malicious network traffic involving
internal and external attackers using port scans, SSH brute forcing, and NTP amplification attacks, (2) uncovering obfuscated network threats such as data exfiltration using the DNS port and using ICMP traffic, and (3) finding network misconfiguration and performance degradation patterns.

The problem of elephant flow detection is a longstanding research area with the goal of quickly identifying flows in a network that are large enough to affect the quality of service of smaller flows. Past work in this field has largely been domain-specific, been based on thresholds for a specific flow size metric, or required several hyperparameters, which makes it hard to adapt to the great variety of traffic distributions present in real-world networks. In this paper, we present an approach to elephant flow detection that avoids these limitations by using the rigorous framework of Bayesian inference. By observing packets sampled from the network, we use Dirichlet-categorical inference to calculate a posterior distribution that explicitly captures our uncertainty about the size of each flow. We then use this posterior distribution to find the most likely subset of elephant flows under this probabilistic model. Our algorithm rapidly converges to the optimal sampling rate at a speed O(1/n), where n is the number of packet samples received, and the only hyperparameter required is the targeted detection likelihood, defined as the probability of correctly inferring all the elephant flows. Compared to the state of the art based on a static sampling rate, we show a 20-fold reduction in error rate. The proposed method of Dirichlet-categorical inference provides a novel, powerful framework for elephant flow detection that is both highly accurate and probabilistically meaningful.
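
The general idea of Dirichlet-categorical inference over sampled packets can be sketched as follows. This is an illustration only, not the authors' algorithm: the function names, the symmetric prior of 1.0, the fixed number of elephants `k`, and the Monte Carlo estimate of the detection likelihood are all assumptions made for the example. Each sampled packet is treated as a categorical draw over flows, so the posterior over flow shares is a Dirichlet whose parameters are the prior plus the per-flow sample counts.

```python
import random
from collections import Counter


def dirichlet_posterior(sampled_flow_ids, flow_ids, prior=1.0):
    """Posterior Dirichlet parameters: prior pseudo-count plus sampled packets per flow."""
    counts = Counter(sampled_flow_ids)
    return {flow: prior + counts[flow] for flow in flow_ids}


def sample_shares(alphas, rng):
    """Draw one vector of flow shares from Dirichlet(alphas) using normalized Gamma draws."""
    draws = {flow: rng.gammavariate(alpha, 1.0) for flow, alpha in alphas.items()}
    total = sum(draws.values())
    return {flow: value / total for flow, value in draws.items()}


def top_k_confidence(alphas, k, n_draws=2000, seed=0):
    """Estimate the posterior probability that the k flows with the largest
    posterior parameters are exactly the k largest flows (hypothetical helper)."""
    rng = random.Random(seed)
    candidate = set(sorted(alphas, key=alphas.get, reverse=True)[:k])
    hits = 0
    for _ in range(n_draws):
        shares = sample_shares(alphas, rng)
        top = set(sorted(shares, key=shares.get, reverse=True)[:k])
        if top == candidate:
            hits += 1
    return candidate, hits / n_draws


# Made-up sample of 100 packets across four flows.
flows = ["a", "b", "c", "d"]
samples = ["a"] * 60 + ["b"] * 30 + ["c"] * 6 + ["d"] * 4
alphas = dirichlet_posterior(samples, flows)
elephants, confidence = top_k_confidence(alphas, k=2)
print(elephants, confidence)
```

In a setting like the one the abstract describes, a sampler could keep collecting packets until the estimated confidence reaches the targeted detection likelihood; how the actual method chooses the elephant subset and adapts the sampling rate is not specified here.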