have the form $C \sim \mu_C + N\beta_N + W\beta_W$, where $\mu_C$ is the intercept and $\beta_N$, $\beta_W$ are the regression coefficients for N and W. Denote each observation for C, N and W with $x_{i,C}$, $x_{i,N}$ and $x_{i,W}$, respectively, where $i = 1, \ldots, n$ and $n$ is the sample size. penalized implements ridge regression, which estimates $\beta_N$, $\beta_W$ with
$$
\left\{ \hat{\beta}^{\mathrm{RIDGE}}_N, \hat{\beta}^{\mathrm{RIDGE}}_W \right\} = \operatorname*{argmin}_{\beta_N,\, \beta_W} \left\{ \sum_{i=1}^{n} \left( x_{i,C} - \mu_C - x_{i,N}\beta_N - x_{i,W}\beta_W \right)^2 + \lambda_2 \left( \beta_N^2 + \beta_W^2 \right) \right\};
\tag{2.7}
$$
the lasso, with
$$
\left\{ \hat{\beta}^{\mathrm{LASSO}}_N, \hat{\beta}^{\mathrm{LASSO}}_W \right\} = \operatorname*{argmin}_{\beta_N,\, \beta_W} \left\{ \sum_{i=1}^{n} \left( x_{i,C} - \mu_C - x_{i,N}\beta_N - x_{i,W}\beta_W \right)^2 + \lambda_1 \left( |\beta_N| + |\beta_W| \right) \right\};
\tag{2.8}
$$
and the elastic net, with
$$
\left\{ \hat{\beta}^{\mathrm{ENET}}_N, \hat{\beta}^{\mathrm{ENET}}_W \right\} = \operatorname*{argmin}_{\beta_N,\, \beta_W} \left\{ \sum_{i=1}^{n} \left( x_{i,C} - \mu_C - x_{i,N}\beta_N - x_{i,W}\beta_W \right)^2 + \lambda_1 \left( |\beta_N| + |\beta_W| \right) + \lambda_2 \left( \beta_N^2 + \beta_W^2 \right) \right\},
\tag{2.9}
$$
which includes ridge regression and the lasso as special cases when $\lambda_1 = 0$ and ...
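As a minimal sketch of how these three estimators could be obtained with the penalized package: the data frame name d and the penalty values lambda1 = 2 and lambda2 = 2 used below are illustrative assumptions, not values taken from the text; only C, N and W come from the example above.

library(penalized)

# ridge regression (2.7): only the L2 penalty lambda2 is active.
fit.ridge <- penalized(response = d$C, penalized = ~ N + W, data = d,
                       lambda1 = 0, lambda2 = 2, trace = FALSE)

# lasso (2.8): only the L1 penalty lambda1 is active.
fit.lasso <- penalized(response = d$C, penalized = ~ N + W, data = d,
                       lambda1 = 2, lambda2 = 0, trace = FALSE)

# elastic net (2.9): both penalties are active; setting lambda1 = 0 or
# lambda2 = 0 recovers ridge regression or the lasso, respectively.
fit.enet <- penalized(response = d$C, penalized = ~ N + W, data = d,
                      lambda1 = 2, lambda2 = 2, trace = FALSE)

# intercept (mu_C) and penalized coefficients (beta_N, beta_W) for each fit.
coefficients(fit.ridge, "all")
coefficients(fit.lasso, "all")
coefficients(fit.enet, "all")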