
264 The State of the Art in Intrusion Prevention and Detection
by the averaged (over the set of observations) square of the differences between the actual values $\hat{y}_j(X, w)$, $j = 1, 2, \ldots, M$, of the outputs calculated for particular observations $X$ with the weights $w$ and their desired values $y_j$. Here $X = (x_1, x_2, \ldots, x_N)$ is the vector of all inputs of the neuron/neural network, and $w$ is the vector of all weights. Thus, the cost $C(\cdot)$ for a single training sample $(X, Y)$ can be expressed as
$$
C(w) = \sum_{j=1}^{M} \left( \hat{y}_j(X, w) - y_j \right)^2 \tag{11.7}
$$
and, in the case of a larger training set $(X^{(i)}, Y^{(i)})$, $i = 1, 2, \ldots, I$, it is averaged over all samples:
$$
C(w) = \frac{1}{I} \sum_{i=1}^{I} \sum_{j=1}^{M} \left( \hat{y}_j(X^{(i)}, w) - y_j^{(i)} \right)^2 .
$$
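The two cost expressions can be illustrated with a minimal Python sketch. It is not taken from the text: the function names `sample_cost`, `average_cost`, and `linear_neuron` are hypothetical, and the single linear neuron is only an assumed stand-in for whatever model produces the outputs ŷ_j(X, w).

```python
from typing import Callable, List, Sequence

Vector = Sequence[float]


def sample_cost(y_hat: Vector, y: Vector) -> float:
    """Sum of squared differences over the M outputs of one sample."""
    if len(y_hat) != len(y):
        raise ValueError("actual and desired outputs must have the same length M")
    return sum((actual - desired) ** 2 for actual, desired in zip(y_hat, y))


def average_cost(
    predict: Callable[[Vector, Vector], Vector],
    w: Vector,
    X_set: Sequence[Vector],
    Y_set: Sequence[Vector],
) -> float:
    """Per-sample cost averaged over the I samples of the training set."""
    if len(X_set) != len(Y_set) or len(X_set) == 0:
        raise ValueError("training set must be non-empty with matching inputs and targets")
    total = sum(sample_cost(predict(X, w), Y) for X, Y in zip(X_set, Y_set))
    return total / len(X_set)


def linear_neuron(X: Vector, w: Vector) -> List[float]:
    """Hypothetical model with one output (M = 1): the weighted sum of the inputs."""
    return [sum(x_n * w_n for x_n, w_n in zip(X, w))]


# Illustrative data only: I = 2 samples, N = 2 inputs, M = 1 output.
w = [0.5, -1.0]
X_set = [[1.0, 2.0], [3.0, 1.0]]
Y_set = [[0.0], [1.0]]

print(average_cost(linear_neuron, w, X_set, Y_set))  # 1.25
```

With these weights the neuron outputs -1.5 and 0.5, so the per-sample costs are 2.25 and 0.25, and their average over the two samples is 1.25.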