Multi-Modal Signal Processing

Chapter | 4 Natural Language and Dialogue Processing

mainly to enlarge the set of available data and to predict the behaviour of the SDS in unseen situations. Among the simulation methods presented in the literature, one can distinguish between state-transition methods, as proposed in [54], and methods based on modular simulation environments, as described in [48, 58–60]. The first type of method is more task-dependent, as is the hybrid method proposed in [85]. One can also distinguish methods according to the level of abstraction at which the simulation takes place. While [59, 86] model the dialogue at the acoustic level, most other methods [48, 54, 58, 60, 85] remain at the intention level, arguing that simulation of the other levels can be inferred from intentions.
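To make the idea of a state-transition simulation at the intention level more concrete, the following is a minimal sketch in Python. It is not the method of [54] or of any other cited work: the dialogue acts, the transition probabilities, the toy system strategy, and all names (`USER_MODEL`, `sample_user_act`, `system_policy`, `simulate_dialogue`) are hypothetical and chosen only for illustration. The simulated user is reduced to a table of probabilities of user acts conditioned on the last system act.

```python
import random

# Hypothetical user model: P(user act | system act), at the intention level.
# Acts and probabilities are illustrative assumptions, not published values.
USER_MODEL = {
    "greet": {"inform": 0.8, "silence": 0.2},
    "ask_slot": {"inform": 0.7, "silence": 0.2, "hangup": 0.1},
    "confirm": {"affirm": 0.6, "negate": 0.3, "hangup": 0.1},
}


def sample_user_act(system_act, rng):
    """Draw one user act from the distribution conditioned on the system act."""
    acts, weights = zip(*USER_MODEL[system_act].items())
    return rng.choices(acts, weights=weights, k=1)[0]


def system_policy(last_user_act):
    """Toy hand-crafted system strategy (also an assumption)."""
    if last_user_act is None:
        return "greet"
    if last_user_act == "inform":
        return "confirm"
    return "ask_slot"


def simulate_dialogue(rng, max_turns=10):
    """Generate one simulated dialogue as a list of (system act, user act) pairs."""
    dialogue, user_act = [], None
    for _ in range(max_turns):
        system_act = system_policy(user_act)
        user_act = sample_user_act(system_act, rng)
        dialogue.append((system_act, user_act))
        # The dialogue ends when the user confirms or hangs up.
        if user_act in ("affirm", "hangup"):
            break
    return dialogue


if __name__ == "__main__":
    for turn in simulate_dialogue(random.Random(0)):
        print(turn)
```

Running `simulate_dialogue` many times with different seeds would yield a synthetic corpus of intention-level dialogues. Because the probability table is tied to one set of acts for one task, the sketch also illustrates why such models tend to be task-dependent.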

4.5 CONCLUSION

In this chapter, we have described processing systems that are usually hidden from the user although essential for building speech- or text-based interfaces. All of these systems are still the topic of intensive research, and there is room for improvement in performance. In particular, data-driven methods for optimising end-to-end systems, from speech recognition to text-to-speech synthesis, are being investigated [87], although data collection and annotation remain a major problem. Moreover, language processing is still limited to domain-dependent applications (such as troubleshooting, database access, etc.), and cross-domain or even cross-language methods are still far from being available. Also, the transfer of academic research into the industrial world is still rare [88]. The search for efficiency often leads to hand-crafted and system-directed management strategies that are easier to understand and control.

REFERENCES

1. J. Allen, Natural Language Understanding, second ed., Benjamin Cummings, 1994.

2. N. Chomsky, Three models for description of languages, Trans. Inf. Theory, 2 (1956) 113–124.

3. A. Aho, J. Ullman, The Theory of Parsing, Translation, and Compiling, Prentice-Hall, 1972.

4. W. Woods, Transition network grammars for natural language analysis, Commun. ACM, 13 (1970) 591–606.

PART|I Signal Processing, Modelling and Related Mathematical Tools

5. D.E. Knuth, Backus Normal Form vs. Backus Naur Form, Commun. ACM 7 (12) (1964) 735–736.

6. G. Gazdar, C. Mellish, Natural Language Programming in PROLOG, Addison-Wesley, Reading, MA, 1989.

7. F. Jelinek, Self-organized language modelling for speech recognition, in: A. Waibel, K.-F. Lee (Eds.), Readings in Speech Recognition, Morgan Kaufmann, 1990, pp. 450–506.

8. S. Seneff, Tina: a natural language system for spoken language applications, Comput. Linguist. 18 (1) (1992) 61–86.

9. J. Austin, How to Do Things with Words, Harvard University Press, Cambridge, MA, 1962.

10. P. Cohen, C. Perrault, Elements of a plan-based theory of speech acts, Cogn. Sci. 3 (1979) 117–212.

11. R. Montague, Formal Philosophy, Yale University Press, New Haven, 1974.

12. R. Pieraccini, E. Levin, Stochastic representation of semantic structure for speech understanding, Speech Commun. 11 (1992) 238–288.

13. Y. He, S. Young, Spoken language understanding using the hidden vector state model, Speech Commun. 48 (3–4) (2006) 262–275.

14. S. Pradhan, W. Ward, K. Hacioglu, J.H. Martin, D. Jurafsky, Shallow semantic parsing using support vector machines, in: Proceedings of HLT-NAACL, 2004.

15. C. Raymond, G. Riccardi, Generative and discriminative algorithms for spoken language understanding, in: Proceedings of Interspeech, Anvers (Belgium), August 2007.

16. F. Mairesse, M. Gašić, F. Jurčíček, S. Keizer, B. Thomson, K. Yu, et al., Spoken language understanding from unaligned data using discriminative classification models, in: Proceedings of ICASSP, 2009.

17. F. Jelinek, J. Lafferty, D. Magerman, R. Mercer, A. Ratnaparkhi, S. Roukos, Decision tree parsing using a hidden derivation model, in: HLT ’94: Proceedings of the Workshop on Human Language Technology, Association for Computational Linguistics, Morristown, NJ, USA, 1994, pp. 272–277.

18. L. Karttunen, Discourse referents, in: J. McCawley (Ed.), Syntax and Semantics 7, Academic Press, 1976, pp. 363–385.

19. C. Sidner, Focusing in the comprehension of definite anaphora, in: M. Brody, R. Berwick (Eds.), Computational Models of Discourse, MIT Press, Cambridge, MA, 1983, pp. 267–330.

20. M. Walker, A. Joshi, E. Prince (Eds.), Centering Theory in Discourse, Oxford University Press, 1998.

21. E. Reiter, R. Dale, Building Natural Language Generation Systems, Cambridge University Press, Cambridge, 2000.


Multi-Modal Signal Processing by Jean-Philippe Thiran, Ferran Marqués, Hervé Bourlard
