complex than those in traditional domain adaptation, we believe that the adaptation
method described here improves performance and portability in large-scale SDS. An
extensive experiment on large-scale multi-domain SLU, where the domains are close
to each other, remains our future work.
This chapter introduced the use of statistical machine learning methods to understand
spoken language. Statistical SLU was formalized as sequential supervised learning, in
which NE recognition was formulated as sequence labeling and DA recognition as
sequence classification. State-of-the-art techniques for sequential supervised
learning were introduced, most of which can be described by a loss minimization
framework in linear and log-linear models. In particular, CRFs have been widely
adopted for statistical SLU because they rest on well-founded probabilistic theory
and are supported by empirical evidence.
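As a reminder of what the loss minimization view looks like for a log-linear model such as a CRF, the objective can be sketched as follows. The symbols (parameters \theta, feature vector f, regularization weight \lambda, training pairs (x^{(i)}, y^{(i)})) are notation assumed here for illustration and may differ from the notation used earlier in the chapter; the L2 regularizer is likewise one common choice, not necessarily the one used in the experiments.

```latex
% Log-linear (CRF-style) conditional model over output sequences y
p_{\theta}(y \mid x) = \frac{\exp\big(\theta^{\top} f(x, y)\big)}{Z_{\theta}(x)},
\qquad
Z_{\theta}(x) = \sum_{y'} \exp\big(\theta^{\top} f(x, y')\big)

% Training as regularized loss minimization with the negative log-likelihood loss
\hat{\theta} = \arg\min_{\theta}
  \sum_{i=1}^{N} -\log p_{\theta}\big(y^{(i)} \mid x^{(i)}\big)
  + \frac{\lambda}{2} \lVert \theta \rVert^{2}
```

Replacing the negative log-likelihood with another loss (for example, a margin-based loss) gives other linear sequence models within the same framework.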
Despite the clear potential of machine learning approaches in SLU, developing
practical methods for ambient intelligence and smart environments remains a signifi-
cant challenge. To address this problem, this chapter presented two advanced appli-
cations of statistical SLU for ambient intelligence. First, we addressed efficient
learning and inference of large-scale CRFs by applying two methods: (1) partial-space
inference for saving time, and (2) feature selection for reducing memory require-
ments. Second, we addressed transfer learning for statistical SLU, where multiple
tasks and domain knowledge can be incorporated. To this end, a novel probabilistic
model, triangular-chain CRFs, was proposed to concurrently solve the two related
problems of sequence labeling and sequence classification. An attractive feature of
our method is that it represents multiple tasks in a single graphical model, thus natu-
rally embedding their mutual dependence. We applied triangular-chain CRFs to two
novel applications: joint prediction of NEs and DAs and multi-domain SLU.
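The idea of solving sequence labeling and sequence classification in one model can be illustrated with a toy joint decoder. This is only a sketch under stated assumptions, not the triangular-chain CRF implementation used in the chapter: the function names (`viterbi`, `joint_decode`), the tag and dialogue-act inventories, and all scores are hypothetical and hand-set rather than learned. It shows one plausible way to search for the best pair of DA and NE tag sequence: run Viterbi once per candidate DA with DA-dependent scores, then add a DA-level score.

```python
def viterbi(emit, trans):
    """Return (score, path) of the best tag path.

    emit[t][tag] is the score of `tag` at position t;
    trans[prev][tag] is the score of moving from `prev` to `tag`.
    """
    tags = list(emit[0])
    best = {tag: (emit[0][tag], [tag]) for tag in tags}
    for scores in emit[1:]:
        new_best = {}
        for tag in tags:
            prev = max(tags, key=lambda p: best[p][0] + trans[p][tag])
            score = best[prev][0] + trans[prev][tag] + scores[tag]
            new_best[tag] = (score, best[prev][1] + [tag])
        best = new_best
    return max(best.values(), key=lambda v: v[0])


def joint_decode(words, acts, tags, emit_score, trans_score, act_score):
    """Pick the (dialogue act, tag path) pair with the highest joint score."""
    best = None
    for act in acts:
        # Tag scores depend on the candidate act, which couples the two tasks.
        emit = [{t: emit_score(w, t, act) for t in tags} for w in words]
        trans = {p: {q: trans_score(p, q, act) for q in tags} for p in tags}
        path_score, path = viterbi(emit, trans)
        total = path_score + act_score(words, act)
        if best is None or total > best[0]:
            best = (total, act, path)
    return best


# Hypothetical hand-set scores for a three-word utterance (assumptions).
def emit_score(word, tag, act):
    if word == "boston" and tag == "CITY" and act == "request_flight":
        return 2.0
    if tag == "O" and word != "boston":
        return 1.0
    return 0.0


def trans_score(prev, tag, act):
    return 0.0  # no transition preference in this toy example


def act_score(words, act):
    return 1.0 if "fly" in words and act == "request_flight" else 0.0


score, act, path = joint_decode(
    ["fly", "to", "boston"],
    ["request_flight", "greeting"],
    ["O", "CITY"],
    emit_score, trans_score, act_score,
)
print(act, path, score)
```

In this toy setting the CITY tag is only rewarded under the `request_flight` act, so the decoder selects the act and the tag sequence together rather than in a pipeline.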
Researchers have begun to study machine learning methods that improve the performance
and adaptability of practical SLU. The main drawback of supervised learning is that
assembling training data is expensive. To alleviate this problem, active and
semi-supervised learning methods have been presented to reduce annotation
requirements [35]. Some studies incorporate human-crafted knowledge to compensate
for the lack of data when building statistical SLU (e.g., [31]).
Another problem with the supervised learning method is how to use the acoustic
information from a speech recognizer to develop robust SLU. The output of a speech
recognizer contains rich information, such as n-best lists, word lattices, and confi-
dence scores, which many researchers have tried to utilize to improve SLU perfor-
mance as well as to provide useful information for dialogue management [12, 30].
Exploiting structural information to overcome long-distance dependency is
another significant challenge. He and Young [13] described an approach using a
hidden vector state model that extends the basic hidden Markov model for encoding
8.8 Conclusion and Future Direction
