AES E-Library

Quantifying the Speaking Voice: Further Investigation Into Speaker Identification by a Simple Code-Matching Technique

This paper reports on refinements to a method of speaker identification based on the automated comparison of spectral, timbral, and temporal features unique to an individual's speech production. The method was first described in Convention Paper 7274, presented by the co-author of this paper, Richard Sanders, at the 123rd Convention of the Audio Engineering Society. Since that first publication, the system (now referred to as SIDNI, or Speaker Identification by Numerical Imprint) has improved from 79% correct identifications in 78 comparisons from the speech of 26 males to 100% correct identifications in 150 comparisons from the speech of 50 males. This paper provides more detail on these results and on the results of several other tests, and it elaborates on the specific speech characteristics exploited by the system and their potential for identification. These characteristics include average fundamental speaking frequency, the ratio of spectral densities below 1 kHz to those above 1 kHz (Alpha ratio), average rate of vowels, jitter, and shimmer.
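As a rough illustration of three of the characteristics named above, the following Python sketch computes an Alpha ratio, jitter, and shimmer. It is not the SIDNI implementation, which this abstract does not describe. The function names, the input format (a precomputed power spectrum, per-cycle pitch periods, and per-cycle peak amplitudes), and the choice of the common "local" definitions of jitter and shimmer are all assumptions made for the example.

```python
from statistics import mean


def alpha_ratio(freqs_hz, power, split_hz=1000.0):
    """Summed spectral power below split_hz divided by the power at or above it.

    Assumes freqs_hz and power are parallel sequences taken from a
    precomputed power spectrum (hypothetical input format).
    """
    low = sum(p for f, p in zip(freqs_hz, power) if f < split_hz)
    high = sum(p for f, p in zip(freqs_hz, power) if f >= split_hz)
    if high == 0:
        raise ValueError("no spectral power at or above the split frequency")
    return low / high


def local_perturbation(values):
    """Mean absolute difference between consecutive values, over the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return mean(diffs) / mean(values)


def jitter(periods_s):
    """Cycle-to-cycle variation of the pitch period (local jitter, assumed definition)."""
    return local_perturbation(periods_s)


def shimmer(peak_amplitudes):
    """Cycle-to-cycle variation of peak amplitude (local shimmer, assumed definition)."""
    return local_perturbation(peak_amplitudes)


if __name__ == "__main__":
    # Made-up illustrative values, not data from the paper.
    print(alpha_ratio([500, 800, 1500, 3000], [4.0, 2.0, 1.0, 2.0]))
    print(jitter([0.010, 0.011, 0.010, 0.011]))
    print(shimmer([0.80, 0.82, 0.79, 0.81]))
```

The Alpha ratio here follows the direction given in the abstract (power below 1 kHz over power above 1 kHz). Other work sometimes inverts the ratio or expresses it in decibels, so the convention should be checked against the full paper.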

 


