
138 Knowledge Discovery from Data Streams
Assuming that $p$ is known, we can estimate $P(x_{ij} \mid 0)$ as:

$$\frac{U_{ij} - P(x_{ij} \mid 1) \times p \times U}{(1 - p)\,U}$$

where $U_{ij}$ is the number of unlabeled examples with $X_i = j$ and $U$ is the cardinality of the set of unlabeled examples.
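
The estimator can be read as inverting a mixture: the unlabeled set is expected to contain about p × U positive examples, so subtracting their expected contribution from the count of unlabeled examples with X_i = j leaves the part attributable to the negative class. The following minimal Python sketch illustrates the arithmetic; the function name, argument names, and numbers are hypothetical and chosen only for illustration.

```python
def raw_estimate(u_ij, p_x_given_pos, p, u):
    """Unclipped estimate of P(x_ij | 0) from unlabeled counts.

    u_ij          -- number of unlabeled examples with X_i = j
    p_x_given_pos -- P(x_ij | 1), estimated from the positive examples
    p             -- prior probability of the positive class (assumed known)
    u             -- total number of unlabeled examples
    """
    return (u_ij - p_x_given_pos * p * u) / ((1 - p) * u)


# Hypothetical data: 100 unlabeled examples, 30 of them with X_i = j, p = 0.5.
print(raw_estimate(u_ij=30, p_x_given_pos=0.2, p=0.5, u=100))  # (30 - 10) / 50
print(raw_estimate(u_ij=30, p_x_given_pos=0.8, p=0.5, u=100))  # (30 - 40) / 50
```

In the second call the expected number of positives with X_i = j (40) exceeds the observed unlabeled count (30), so the numerator, and with it the estimate, drops below zero.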
The problem with this estimator is that it can be negative. Calvo et al.
(2007) solve the problem by replacing the negative estimations by 0, and then
normalizing all the probabilities such that, for each variable $X_i$, they sum to 1: $\sum_{j=1}^{n} P(x_{ij} \mid 0) = 1$.

$$P(x_{ij} \mid 0) = \frac{1 + \max\bigl(0;\ U_{ij} - P(x_{ij} \mid 1) \times p \times U\bigr)}{2 + (1 - p)\,U}. \tag{9.1}$$

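
As a rough illustration of Equation 9.1, the sketch below computes the clipped and smoothed estimates for all values of a single variable and then rescales them to sum to 1. The function name and inputs are hypothetical. Two points are assumptions of this sketch rather than statements from the text: U is taken as the sum of the per-value counts (no missing values), and the renormalization is applied as an explicit final step.

```python
def negative_class_estimates(counts, p_given_pos, p):
    """Estimates of P(x_ij | 0) for every value j of one variable X_i.

    counts[j]      -- U_ij, unlabeled examples with X_i = j
    p_given_pos[j] -- P(x_ij | 1), estimated from the positive examples
    p              -- prior probability of the positive class (assumed known)
    """
    # Assumption: every unlabeled example has exactly one value of X_i,
    # so U is the sum of the per-value counts.
    u = sum(counts)

    # Equation 9.1: clip the numerator at 0, then add 1 above and 2 below.
    smoothed = [
        (1.0 + max(0.0, u_ij - p_pos * p * u)) / (2.0 + (1.0 - p) * u)
        for u_ij, p_pos in zip(counts, p_given_pos)
    ]

    # Assumption: explicit rescaling so the estimates sum to 1 for X_i.
    total = sum(smoothed)
    return [s / total for s in smoothed]


# Hypothetical binary variable: 30 and 70 unlabeled examples per value.
estimates = negative_class_estimates([30, 70], [0.8, 0.2], p=0.5)
print(estimates)
```

With these made-up counts the first value would receive a negative raw estimate; after clipping and smoothing it keeps a small positive probability instead.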
To summarize, the positive naive Bayes estimates $P(x_{ij} \mid 1)$ from the positive examples by means of a maximum ...