
Composite Big Data Modeling for Security Analytics
(classification) type-labeled data, and $D_u$ is a set of type-unlabeled data. To model the
initial security expert domain knowledge $K_l$, ontologies are represented in RDF(S) with
nodes as basic features selected by a security expert. Each feature node is a Boolean
variable, $X_{ij}$, and a sink node is also a Boolean variable, $Y_i$, for two possible
classification types: attacked and not-attacked.
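The structure described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The triples are plain string tuples rather than a real RDF(S) store. The `ex:` prefix, the `ex:FeatureNode`, `ex:SinkNode` and `ex:feedsInto` terms, the feature names, and the `Record` type are all hypothetical names chosen for illustration. The sketch only shows Boolean feature nodes linked to one Boolean sink node, and data separated into type-labeled and type-unlabeled sets.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical namespace prefix; not taken from the source text.
EX = "ex:"


def build_ontology(features: List[str], sink: str) -> Set[Triple]:
    """Return RDF-style triples that link each feature node to the sink node."""
    sink_node = EX + sink
    triples: Set[Triple] = {(sink_node, "rdf:type", EX + "SinkNode")}
    for name in features:
        node = EX + name
        # Each feature node is a Boolean variable selected by the expert.
        triples.add((node, "rdf:type", EX + "FeatureNode"))
        # Assumed edge direction: features point to the classification sink.
        triples.add((node, EX + "feedsInto", sink_node))
    return triples


@dataclass
class Record:
    features: Dict[str, bool]        # Boolean values of the feature nodes
    attacked: Optional[bool] = None  # sink value; None means type-unlabeled


def split_labeled(records: List[Record]) -> Tuple[List[Record], List[Record]]:
    """Separate records into type-labeled and type-unlabeled sets."""
    labeled = [r for r in records if r.attacked is not None]
    unlabeled = [r for r in records if r.attacked is None]
    return labeled, unlabeled


# Illustrative usage with made-up Boolean features.
feature_names = ["protocol_is_tcp", "flag_is_error", "packet_is_large"]
ontology = build_ontology(feature_names, "attacked")
records = [
    Record({"protocol_is_tcp": True, "flag_is_error": True, "packet_is_large": False}, attacked=True),
    Record({"protocol_is_tcp": False, "flag_is_error": False, "packet_is_large": True}),
]
labeled, unlabeled = split_labeled(records)
print(len(ontology), len(labeled), len(unlabeled))
```

A record whose sink value is missing goes into the unlabeled set, which mirrors the distinction drawn in the text between type-labeled and type-unlabeled data. How the sink value is inferred from the features is left open here.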
Referring to a framework for constructing features and models for intrusion detection
(Lee and Stolfo 2000), possible network security features for intrusion pattern recognition
are protocol type, flag, packet size,