Lin Zhang (Tianjin University), Kiyoshi Honda (Tianjin University), Jianguo Wei (Tianjin University), and Seiji Adachi (Fraunhofer Institute for Building Physics)
Abstract:
This study seeks a plausible causal mechanism by which individual vocal characteristics arise in the higher-frequency region of the speech spectrum. The lower vocal tract has been suggested as the causal region, but it remains unclear how this region modulates the higher spectra of vowels. Based on existing data, this study predicts that resonance of the lower vocal tract shapes the higher vowel spectra into a peak-dip-peak pattern. A preliminary acoustic simulation confirmed that the complexity of the lower vocal-tract cavities generates such a pattern, including the second peak. This spectral modulation pattern was then examined to determine to what extent it contributes to static speaker characteristics. To this end, a statistical analysis of male and female F-ratio curves was conducted on a speech database. The results show that the three frequency regions of the peak-dip-peak pattern correspond to three regions in the gender-specific F-ratio curves. This study therefore suggests that, while the first peak may be the major determinant for human listeners, the whole frequency pattern facilitates speaker recognition by machines.
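As an illustration of the F-ratio curves mentioned above, the following is a minimal sketch that assumes the common speaker-recognition definition: for each frequency band, the variance of the speaker means divided by the average within-speaker variance. The abstract does not state the exact formula, band layout, or features used, so the function name `f_ratio_curve`, the input layout, and the toy values are hypothetical.

```python
from statistics import mean, pvariance


def f_ratio_curve(energies_by_speaker):
    """Return one F-ratio per frequency band.

    energies_by_speaker maps a speaker label to a list of frames, where each
    frame is a list of band energies (hypothetical layout, for illustration).
    Assumed definition: between-speaker variance / mean within-speaker variance.
    """
    speakers = list(energies_by_speaker.values())
    n_bands = len(speakers[0][0])
    curve = []
    for band in range(n_bands):
        # Collect this band's values separately for every speaker.
        per_speaker = [[frame[band] for frame in frames] for frames in speakers]
        speaker_means = [mean(values) for values in per_speaker]
        between = pvariance(speaker_means)                     # spread of speaker means
        within = mean(pvariance(values) for values in per_speaker)  # average spread inside a speaker
        curve.append(between / within if within > 0 else float("inf"))
    return curve


# Toy data: band 0 is identical across speakers, band 1 separates them.
toy = {
    "speaker_a": [[1.0, 5.0], [3.0, 5.5]],
    "speaker_b": [[1.0, 9.0], [3.0, 9.5]],
}
curve = f_ratio_curve(toy)
print(curve)
```

Under this assumed definition, a band with a high F-ratio varies more between speakers than within a speaker, which is the sense in which such a band would carry speaker-specific information. Computing the curve separately for male and female speakers would give gender-specific curves of the kind the abstract refers to.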