Network Security Through Data Analysis, 2nd Edition

Name: Network Security Through Data Analysis, 2nd Edition
Author: Michael Collins
ISBN: 9781491962794

September 2017

Beginner to intermediate

428 pages

11h 40m

English

O'Reilly Media, Inc.

Preface
Audience; Contents of This Book; Changes Between Editions; Conventions Used in This Book; Using Code Examples; O’Reilly Safari; How to Contact Us; Acknowledgments
I. Data
1. Organizing Data: Vantage, Domain, Action, and Validity
Domain; Vantage; Choosing Vantage; Actions: What a Sensor Does with Data; Validity and Action; Internal Validity; External Validity; Construct Validity; Statistical Validity; Attacker and Attack Issues; Further Reading
2. Vantage: Understanding Sensor Placement in Networks
The Basics of Network Layering; Network Layers and Vantage; Network Layers and Addressing; MAC Addresses; IPv4 Format and Addresses; IPv6 Format and Addresses; Validity Challenges from Middlebox Network Data; Further Reading
3. Sensors in the Network Domain
Packet and Frame Formats; Rolling Buffers; Limiting the Data Captured from Each Packet; Filtering Specific Types of Packets; What If It’s Not Ethernet?; NetFlow; NetFlow v5 Formats and Fields; NetFlow Generation and Collection; Data Collection via IDS; Classifying IDSs; IDS as Classifier; Improving IDS Performance; Enhancing IDS Detection; Configuring Snort; Enhancing IDS Response; Prefetching Data; Middlebox Logs and Their Impact; VPN Logs; Proxy Logs; NAT Logs; Further Reading
4. Data in the Service Domain
What and Why; Logfiles as the Basis for Service Data; Accessing and Manipulating Logfiles; The Contents of Logfiles; The Characteristics of a Good Log Message; Existing Logfiles and How to Manipulate Them; Stateful Logfiles; Further Reading
5. Sensors in the Service Domain
Representative Logfile Formats; HTTP: CLF and ELF; Simple Mail Transfer Protocol (SMTP); Sendmail; Microsoft Exchange: Message Tracking Logs; Additional Useful Logfiles; Staged Logging; LDAP and Directory Services; File Transfer, Storage, and Databases; Logfile Transport: Transfers, Syslog, and Message Queues; Transfer and Logfile Rotation; Syslog; Further Reading
6. Data and Sensors in the Host Domain
A Host: From the Network’s View; The Network Interfaces; The Host: Tracking Identity; Processes; Structure; Filesystem; Historical Data: Commands and Logins; Other Data and Sensors: HIPS and AV; Further Reading
7. Data and Sensors in the Active Domain
Discovery, Assessment, and Maintenance; Discovery: ping, traceroute, netcat, and Half of nmap; Checking Connectivity: Using ping to Connect to an Address; Tracerouting; Using nc as a Swiss Army Multitool; nmap Scanning for Discovery; Assessment: nmap, a Bunch of Clients, and a Lot of Repositories; Basic Assessment with nmap; Using Active Vantage Data for Verification; Further Reading
II. Tools

8. Getting Data in One Place
High-Level Architecture; The Sensor Network; The Repository; Query Processing; Real-Time Processing; Source Control; Log Data and the CRUD Paradigm; A Brief Introduction to NoSQL Systems; Further Reading
9. The SiLK Suite
What Is SiLK and How Does It Work?; Acquiring and Installing SiLK; The Datafiles; Choosing and Formatting Output Field Manipulation: rwcut; Basic Field Manipulation: rwfilter; Ports and Protocols; Size; IP Addresses; Time; TCP Options; Helper Options; Miscellaneous Filtering Options and Some Hacks; rwfileinfo and Provenance; Combining Information Flows: rwcount; rwset and IP Sets; rwuniq; rwbag; Advanced SiLK Facilities; PMAPs; Collecting SiLK Data; YAF; rwptoflow; rwtuc; rwrandomizeip; Further Reading
10. Reference and Lookup: Tools for Figuring Out Who Someone Is
MAC and Hardware Addresses; IP Addressing; IPv4 Addresses, Their Structure, and Significant Addresses; IPv6 Addresses, Their Structure, and Significant Addresses; IP Intelligence: Geolocation and Demographics; DNS; DNS Name Structure; Forward DNS Querying Using dig; The DNS Reverse Lookup; Using whois to Find Ownership; DNS Blackhole Lists; Search Engines; General Search Engines; Scanning Repositories, Shodan et al; Further Reading
III. Analytics
An Overview of Attacker Behavior; Further Reading
11. Exploratory Data Analysis and Visualization
The Goal of EDA: Applying Analysis; EDA Workflow; Variables and Visualization; Univariate Visualization; Histograms; Bar Plots (Not Pie Charts); The Five-Number Summary and the Boxplot; Generating a Boxplot; Bivariate Description; Scatterplots; Multivariate Visualization; Other Visualizations and Their Role; Operationalizing Security Visualization; Fitting and Estimation; Is It Normal?; Simply Visualizing: Projected Values and QQ Plots; Fit Tests: K-S and S-W; Further Reading
12. On Analyzing Text
Text Encoding; Unicode, UTF, and ASCII; Encoding for Attackers; Basic Skills; Finding a String; Manipulating Delimiters; Splitting Along Delimiters; Regular Expressions; Techniques for Text Analysis; N-Gram Analysis; Jaccard Distance; Hamming Distance; Levenshtein Distance; Entropy and Compressibility; Homoglyphs; Further Reading
13. On Fumbling
Fumbling: Misconfiguration, Automation, and Scanning; Lookup Failures; Automation; Scanning; Identifying Fumbling; IP Fumbling: Dark Addresses and Spread; TCP Fumbling: Failed Sessions; ICMP Messages and Fumbling; Fumbling at the Service Level; HTTP Fumbling; SMTP Fumbling; DNS Fumbling; Detecting and Analyzing Fumbling; Building Fumbling Alarms; Forensic Analysis of Fumbling; Engineering a Network to Take Advantage of Fumbling
14. On Volume and Time
The Workday and Its Impact on Network Traffic Volume; Beaconing; File Transfers/Raiding; Locality; DDoS, Flash Crowds, and Resource Exhaustion; DDoS and Routing Infrastructure; Applying Volume and Locality Analysis; Data Selection; Using Volume as an Alarm; Using Beaconing as an Alarm; Using Locality as an Alarm; Engineering Solutions; Further Reading
15. On Graphs
Graph Attributes: What Is a Graph?; Labeling, Weight, and Paths; Components and Connectivity; Clustering Coefficient; Analyzing Graphs; Using Component Analysis as an Alarm; Using Centrality Analysis for Forensics; Using Breadth-First Searches Forensically; Using Centrality Analysis for Engineering; Further Reading
16. On Insider Threat
Insider Threat Versus Other Classes of Attacks; Avoiding Toxicity; Modes of Attack; Data Theft and Exfiltration; Credential Theft; Sabotage; Insider Threat Data: Logistics and Collection; Applying Sector-Based Workflow to Insider Threat; Physical Data Sources; Keeping Track of User Identity; Further Reading
17. On Threat Intelligence
Defining Threat Intelligence; Data Types; Creating a Threat Intelligence Program; Identifying Goals; Starting with Free Sources; Determining Data Output; Purchasing Sources; Brief Remarks on Creating Threat Intelligence; Further Reading
18. Application Identification
Mechanisms for Application Identification; Port Number; Application Identification by Banner Grabbing; Application Identification by Behavior; Application Identification by Subsidiary Site; Application Banners: Identifying and Classifying; Non-Web Banners; Web Client Banners: The User-Agent String; Further Reading
19. On Network Mapping
Creating an Initial Network Inventory and Map; Creating an Inventory: Data, Coverage, and Files; Phase I: The First Three Questions; Phase II: Examining the IP Space; Phase III: Identifying Blind and Confusing Traffic; Phase IV: Identifying Clients and Servers; Identifying Sensing and Blocking Infrastructure; Updating the Inventory: Toward Continuous Audit; Further Reading
20. On Working with Ops
Ops Environments: An Overview; Operational Workflows; Escalation Workflow; Sector Workflow; Hunting Workflow; Hardening Workflow; Forensic Workflow; Switching Workflows; Further Readings
21. Conclusions
Index

Content preview from Network Security Through Data Analysis, 2nd Edition

Chapter 8. Getting Data in One Place

Once you collect all your data, you have to have an environment where you can process it and produce results. In this chapter, I provide some notes on an architecture to facilitate the rapid development and operational deployment of security analysis software (analytics¹).

There are a number of ways to implement this; the version you’ll see in Figure 8-1 is a high-level diagram for a basic environment. In general, these environments should have the following attributes:

Robust, universal access to all sensor data. The term “universal” here is used in lieu of “centralized”—it’s not critical that the data be in one place, but it is critical that anyone implementing analytical code have uniform access to all the data.
Access to a Turing-complete language. This differentiates an analysis environment from the classic security console. Complex analytics require access to a general-purpose programming language and the ability to build constructs that rely on in-place memory manipulation—so, Python good, R good, SQL bad.
Performance. Any analytic system will have to deal with resource contention; it is better to overprovision for multiple simultaneous queries early on rather than have your analysts fighting to get results in a crisis.
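
The first two attributes above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: `SensorCatalog`, `distinct_destinations`, the record fields `src` and `dst`, and the sample addresses are hypothetical names invented here, not part of the book's architecture or of any real library. The catalog stands in for uniform access to every sensor feed, wherever the data actually lives, and the analytic keeps mutable per-host state in memory, which is the kind of construct the text says a general-purpose language makes easy.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator

Record = Dict[str, str]


class SensorCatalog:
    """Hypothetical registry: one interface to every sensor feed."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterable[Record]]] = {}

    def register(self, name: str, reader: Callable[[], Iterable[Record]]) -> None:
        # A reader may wrap a file, a database, or a remote store;
        # the data does not have to be centralized.
        self._sources[name] = reader

    def records(self, *names: str) -> Iterator[Record]:
        # With no names given, iterate over every registered sensor.
        for name in names or sorted(self._sources):
            for rec in self._sources[name]():
                yield {**rec, "sensor": name}


def distinct_destinations(records: Iterable[Record]) -> Dict[str, int]:
    """Count distinct destinations per source, updating state in memory."""
    seen = defaultdict(set)
    for rec in records:
        seen[rec["src"]].add(rec["dst"])
    return {src: len(dsts) for src, dsts in seen.items()}


# Made-up sample feeds standing in for two different sensors.
catalog = SensorCatalog()
catalog.register("netflow", lambda: [
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
    {"src": "10.0.0.1", "dst": "10.0.0.3"},
])
catalog.register("proxy", lambda: [
    {"src": "10.0.0.1", "dst": "10.0.0.2"},
    {"src": "10.0.0.4", "dst": "10.0.0.2"},
])

print(distinct_destinations(catalog.records()))
# {'10.0.0.1': 2, '10.0.0.4': 1}
```

The analytic never needs to know which sensor a record came from or where it is stored; it only sees a stream of records, so the same function works across one feed or all of them.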

Publisher Resources

ISBN: 9781491962831
