complex than those in traditional domain adaptation, we believe that the adaptation
method described here improves performance and portability in large-scale SDS. An
extensive experiment on large-scale multi-domain SLU, where the domains are close
to each other, remains our future work.
8.8 CONCLUSION AND FUTURE DIRECTION
This chapter introduced the use of statistical machine learning methods to under-
stand spoken language. Statistical SLU was formalized as sequential supervised
learning, in which the DA and NE recognition tasks were formulated as sequence
labeling and sequence classification, respectively. State-of-the-art techniques for
sequential supervised learning were introduced, most of which can be described
within a loss-minimization framework over linear and log-linear models. In particular,
CRFs have been widely adopted for statistical SLU because of their well-founded
probabilistic theory and strong empirical results.
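As a reminder of the form this framework takes, a linear-chain CRF defines the conditional distribution (written here in generic notation, which may differ slightly from the symbols used earlier in this chapter)

p_\Lambda(\mathbf{y} \mid \mathbf{x}) = \frac{1}{Z_\Lambda(\mathbf{x})} \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, \mathbf{x}, t) \Bigr),

where Z_\Lambda(\mathbf{x}) sums the same exponential over all candidate label sequences; training selects the weights \lambda_k by minimizing the negative conditional log-likelihood (the log-loss), typically with a regularization term.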
Despite the clear potential of machine learning approaches in SLU, developing
practical methods for ambient intelligence and smart environments remains a signifi-
cant challenge. To address this problem, this chapter presented two advanced appli-
cations of statistical SLU for ambient intelligence. First, we addressed efficient
learning and inference of large-scale CRFs by applying two methods: (1) partial-space
inference for saving time, and (2) feature selection for reducing memory require-
ments. Second, we addressed transfer learning for statistical SLU, where multiple
tasks and domain knowledge can be incorporated. To this end, a novel probabilistic
model, triangular-chain CRFs, was proposed to concurrently solve the two related
problems of sequence labeling and sequence classification. An attractive feature of
our method is that it represents multiple tasks in a single graphical model, thus natu-
rally embedding their mutual dependence. We applied triangular-chain CRFs to two
novel applications: the joint prediction of NEs and DAs, and multi-domain SLU.
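As a rough sketch of the joint model (again in generic notation; the exact factorization and feature templates follow the definitions given earlier in the chapter), a triangular-chain CRF defines a single conditional distribution over the label sequence \mathbf{y} and the sequence-level variable z (e.g., a DA or domain label):

p(\mathbf{y}, z \mid \mathbf{x}) \propto \exp\Bigl( \sum_{t=1}^{T} \sum_{k} \lambda_k \, f_k(y_{t-1}, y_t, z, \mathbf{x}, t) + \sum_{j} \mu_j \, g_j(z, \mathbf{x}) \Bigr),

so that decoding \mathbf{y} and z jointly allows each task to inform the other.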
Researchers have begun to study machine learning methods to improve perfor-
mance and adaptability for practical SLU. The main drawback of supervised
learning is that assembling annotated training data is expensive. To alleviate this
problem, active and semi-supervised learning methods that reduce annotation
requirements have been proposed [35]. Other studies attempt to incorporate human-
crafted knowledge to compensate for the lack of data when building statistical
SLU systems (e.g., [31]).
Another challenge for supervised methods is how to use the acoustic
information from a speech recognizer to develop robust SLU. The output of a speech
recognizer contains rich information, such as n-best lists, word lattices, and confi-
dence scores, which many researchers have tried to exploit to improve SLU perfor-
mance as well as to provide useful information for dialogue management [12, 30].
Exploiting structural information to capture long-distance dependencies is
another significant challenge. He and Young [13] described an approach using a
hidden vector state model that extends the basic hidden Markov model to encode
hierarchical structure.