Computing Prosody: Computational Models for Processing Spontaneous Speech

By D. R. Ladd (auth.), Yoshinori Sagisaka, Nick Campbell, Norio Higuchi (eds.)

This book presents a collection of papers from the Spring 1995 Workshop on Computational Approaches to Processing the Prosody of Spontaneous Speech, hosted by the ATR Interpreting Telecommunications Research Laboratories in Kyoto, Japan. The workshop brought together leading researchers in the fields of speech and signal processing, electrical engineering, psychology, and linguistics to discuss aspects of spontaneous speech prosody and to suggest approaches to its computational analysis and modelling. The book is divided into four sections. Part I gives an overview and theoretical background to the nature of spontaneous speech, differentiating it from the lab speech that has been the focus of so many earlier analyses. Part II focuses on the prosodic features of discourse and the structure of the spoken message, and Part III on the generation and modelling of prosody for computer speech synthesis. Part IV discusses how prosodic information can be used in the context of automatic speech recognition. Each section of the book begins with an invited overview paper to situate the chapters in the context of current research. We feel that this collection of papers offers interesting insights into the scope and nature of the problems involved in the computational analysis and modelling of real spontaneous speech, and expect that these works will not only form the basis of further developments in each field but also merge to form an integrated computational model of prosody for a better understanding of human processing of the complex interactions of the speech chain.

As far as I know, no one has yet worked out an elicitation paradigm with sufficient relevant control to allow fruitful analysis of these phenomena for computational modelling. I think we are now at the stage where enough experts in the relevant different areas are aware of each other's work that we can begin to seriously hone elicitation paradigms for modelling the prosody of such phenomena as repair, discourse topic organization, and interactive structure. Linguists and computer scientists working on dialogue models know that they need ...

The stage of message planning needs further elaboration. [...] to select the linguistic units and structures for the message. Both these stages require a varying amount of processing time. For example, when a speaker faces a difficult or unexpected question, stage (a) will require time, during which the speaker generally produces a certain kind of filler sound or expresses hesitation. On the other hand, when a speaker has difficulty finding appropriate words or phrases, this is indicated by another type of filler sound or expression, or by interruptions and restarts.

