
[1]Flanagan J L. Speech analysis, synthesis, and perception[M]. 2nd ed. New York: Springer-Verlag, 1972.

[2]Ramakrishnan B R. Reconstruction of incomplete spectrograms for robust speech recognition[D]. Ph. D. dissertation. CMU, 2000.

[3]Kandel E R, Schwartz J H, Jessell T M, et al. Principles of neural science[M]. 3rd ed. Amsterdam: Elsevier Science Publishing, 1991.

[4]Morgan D P, Scofield C L. Neural networks and speech processing[M]. Amsterdam: Kluwer Academic Publishers, 1991.

[5]Painter T, Spanias A. Perceptual Coding of Digital Audio[J]. Proceedings of the IEEE, 2000, 88(4):451-513.



[8]Teager H M, Teager S M. Some observations on oral airflow during phonation[J]. IEEE Trans on ASSP, 1980, 28(5):599-601.

[9]Teager H M, Teager S M. Evidence for nonlinear production mechanisms in vocal tract[J]. In: Speech Production and Speech Modeling, vol 55. Boston: Kluwer Academic Publishers, 1990:241-261.

[10]Thomas T J. A finite element model of fluid flow in the vocal tract[J]. Computer Speech and Language, 1986, 1:131-151.

[11]McGowan R S. An aeroacoustic approach to phonation[J]. The Journal of the Acoustical Society of America, 1988, 83(2):696-704.

[12]Maragos P, Kaiser J F, Quatieri T F. Energy separation in signal modulation with application to speech analysis[J]. IEEE Trans. Signal Processing, 1993, 41(10):3024-3051.

[13]Kaiser J F. Some useful properties of Teager energy operators[J]. In Sullivan B J. ICASSP 93, vol 3. Minnesota, USA: IEEE Press, 1993:149-152.

[14]Kaiser J F. On a simple algorithm to calculate the “energy” of a signal[J]. In Ludeman L. ICASSP, vol 1. Albuquerque, New Mexico: IEEE Press, 1990:381-384.

[15]Hanson H M, Maragos P, Potamianos A. Finding speech formants and modulations via energy separation: with application to a vocoder[J]. In Sullivan B J. ICASSP 93, vol 2. Minnesota, USA: IEEE Press, 1993:716-719.

[16]Potamianos A, Maragos P. Speech formant frequency and bandwidth tracking using multiband energy demodulation[J]. In Drago D. ICASSP 95, vol 1. Michigan, USA: IEEE Press, 1995:784-787.

[17]Potamianos A, Maragos P. Speech analysis and synthesis using an AM-FM modulation model[J]. Speech Communication, 1999, 28(3):195-209.

[18]Foote J T, Mashao D J, Silverman H F. Stop classification using DESA-1 high-resolution formant tracking[J]. In Sullivan B J. ICASSP 93, vol 2. Minnesota, USA: IEEE Press, 1993:720-723.

[19]Zhou G, Hansen J H L, Kaiser J F. Classification of speech under stress based on features derived from the nonlinear Teager energy operator[J]. In Acero A. ICASSP 98, vol 1. Seattle, Washington, USA: IEEE Press, 1998:549-552.

[20]Ying G S, Mitchell C D, Jamieson L H. Endpoint detection of isolated utterances based on a modified Teager energy measurement[J]. In Sullivan B J. ICASSP 93, vol 2. Minnesota, USA: IEEE Press, 1993:732-735.

[21]Cairns D A, Hansen J H L. Nonlinear analysis and classification of speech under stressed conditions[J]. The Journal of the Acoustical Society of America, 1994, 96(6):3392-3400.

[22]Zhou G, Hansen J H L, Kaiser J F. Methods for stress classification: nonlinear TEO and linear speech based features[J]. In Rodriguez J. ICASSP 99, vol 4. Phoenix, Arizona, USA: IEEE Press, 1999:2087-2090.






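[28]Michael Konerner. The latest speech recognition technology[M]. Translated by Li Yibo, Guo Tianjie, Wang Huaju, et al. Beijing: Publishing House of Electronics Industry, 1998. (In Chinese.)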


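(1) In this book, any logarithm written simply as log, without an explicitly stated base, may be taken to any base. In speech signal processing the logarithm serves two main purposes: first, it compresses the dynamic range of the data; second, it turns the product of two variables, such as xy, into a sum, that is, log xy = log x + log y. Changing the base only rescales the result by a constant factor, so neither purpose depends on the choice of base.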
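
The two uses of the logarithm named in this note can be checked numerically. The following is a minimal sketch, not taken from the book: the helper `log_any_base` and the sample values `x` and `y` are hypothetical, chosen only for illustration.

```python
import math


def log_any_base(value, base):
    """Logarithm of a positive value to an arbitrary base (hypothetical helper)."""
    return math.log(value) / math.log(base)


# Illustrative amplitudes only; they are assumptions, not data from the book.
x, y = 0.002, 5000.0

# Use 2: a product becomes a sum, whatever the base.
for base in (2.0, math.e, 10.0):
    lhs = log_any_base(x * y, base)
    rhs = log_any_base(x, base) + log_any_base(y, base)
    assert math.isclose(lhs, rhs, rel_tol=1e-9, abs_tol=1e-12)

# Use 1: the dynamic range is compressed.
span_linear = y / x                                          # ratio of largest to smallest value
span_log10 = log_any_base(y, 10.0) - log_any_base(x, 10.0)   # the same span on a log10 scale

# Changing the base only rescales every logarithm by the same constant.
scale_2_vs_10 = log_any_base(x, 2.0) / log_any_base(x, 10.0)
assert math.isclose(scale_2_vs_10, log_any_base(y, 2.0) / log_any_base(y, 10.0))

print(span_linear, span_log10, scale_2_vs_10)
```

Because the ratio between two bases is the same for every input, a feature computed with log2 differs from the same feature computed with log10 only by a fixed gain, which is why the base can be left unspecified.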