AES Section Meeting Reports

Argentina - May 29, 2024

Summary

On Wednesday, May 29, we organized an educational event for the Spanish-speaking community, featuring two of the leading Latin American experts in artificial intelligence applied to audio and the human voice. This three-hour international webinar was hosted by the current Vice Chairman of AES Argentina, Christian Paladino, who moderated the sessions with our guests, José Elizondo and Juan "Cana" San Martín.
In the first part of the webinar, Paladino interviewed the Mexican engineer José Elizondo. During this session, José described his early career and the technical and artistic interests that led him to deepen his knowledge of music and technology. He also gave a detailed account of his technological work related to language, linguistics, computer science, and artificial intelligence. José Elizondo is a musician and engineer who holds degrees in Music and Electrical Engineering from MIT. He also studied musical analysis, orchestration, and conducting at Harvard. As a composer, he has written music for orchestras and chamber ensembles. Additionally, he has published articles on technology and interface design in international journals and has presented workshops in various countries on intercultural design and technology.
In the second part of the webinar, Juan "Cana" San Martín shared his knowledge of voice recognition and artificial intelligence. Cana gave a comprehensive overview of the most widely used acoustic and language models in these disciplines. He also explained the crucial role of APIs in integrating advanced voice recognition capabilities into applications. He then turned to neural networks and stochastic gradient descent (SGD) optimization, highlighting how they make it possible to train efficiently on large volumes of data. He explained how SGD allows these models to learn and improve, enabling the capabilities we see in voice recognition applications such as virtual assistants and automatic transcription systems.
The session also covered prosody and its essential components, and why they matter for improving the accuracy, naturalness, and effectiveness of AI and voice recognition systems.
After three intense hours and the customary acknowledgments, we concluded an event that is well worth revisiting. Those interested in watching it for the first time (or again) can find the recording at this link: https://www.youtube.com/watch?v=aQy7nOhsn-4&t=5588s&pp=ygUNYWVzIGFyZ2VudGluYQ%3D%3D

AES - Audio Engineering Society