162 Chapter 4 Statistics for Data Mining
(Query 11)
Find attribute_1 times attribute_2 divided by attribute_3 from the table.
(SQL 11)
Select attribute_1 × attribute_2 / attribute_3 from the table.
These queries compute the expected frequency from three groups of data,
attribute_1, attribute_2, and attribute_3, which must all be the same size.
Finally, we define the χ²-based query as follows; it involves two subqueries,
one for the observed variable and the other for the expected variable:
(Query 12)
Are attribute_1 and attribute_2 independent in the table?
(SQL 12)
Select chi-square(∗) from the table where variable is observed and variable is
expected.
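The two queries above can be sketched in Python as a minimal illustration. It rests on assumptions that the text does not state: attribute_1, attribute_2, and attribute_3 are taken to hold, for each cell of a contingency table, the row total, the column total, and the grand total, and the chi-square aggregate is taken to be the usual Pearson statistic, the sum of (observed − expected)² / expected. The function names and the 2 × 2 counts are hypothetical.

```python
def expected_frequencies(attribute_1, attribute_2, attribute_3):
    """Element-wise attribute_1 * attribute_2 / attribute_3, as in SQL 11."""
    # The three groups of data must be the same size.
    if not (len(attribute_1) == len(attribute_2) == len(attribute_3)):
        raise ValueError("all three attributes must have the same size")
    return [a * b / c for a, b, c in zip(attribute_1, attribute_2, attribute_3)]


def chi_square(observed, expected):
    """Pearson chi-square statistic over paired observed/expected cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same size")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical 2 x 2 contingency table, flattened row by row.
observed = [30, 10, 20, 40]
row_totals = [40, 40, 60, 60]      # assumed role of attribute_1
column_totals = [50, 50, 50, 50]   # assumed role of attribute_2
grand_totals = [100, 100, 100, 100]  # assumed role of attribute_3

expected = expected_frequencies(row_totals, column_totals, grand_totals)
statistic = chi_square(observed, expected)
print(expected, statistic)
```

To answer Query 12, the resulting statistic would then be compared with a critical value of the χ² distribution for the appropriate degrees of freedom; a statistic above the critical value suggests that the two attributes are not independent.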
From the previous example dealing with price and feature attributes, w