For humans, it can take years to develop the skill of picking a single voice out of a crowd. For smart home speakers, making out distinct words in a crowded room can be particularly difficult. Google may have found a fix for that struggle: its researchers have developed a way to isolate a single voice in a crowd.
The Google development team trained a neural network model to recognize individual people's voices. The team then created virtual "parties" that included background noise to train the AI to separate multiple voices into distinct audio tracks.
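The virtual "party" idea can be illustrated with a minimal sketch: clean voice tracks are summed with background noise to form a mixture, and the untouched clean tracks serve as the targets the model should recover. Everything here is an assumption for illustration, not Google's actual pipeline: the names `sine_wave` and `make_party`, the `noise_gain` value, and the use of sine tones in place of real speech recordings are all hypothetical, and the sketch covers only the audio mixing step.

```python
import math
import random

def sine_wave(freq_hz, n_samples, sample_rate=8000):
    """Hypothetical stand-in for a clean speech recording."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate)
            for t in range(n_samples)]

def make_party(voices, noise_gain=0.3, seed=0):
    """Sum clean voice tracks and add random background noise.

    Returns (mixture, targets): the mixture is what a model would hear,
    and the clean tracks are what it should learn to recover.
    """
    rng = random.Random(seed)
    n = min(len(v) for v in voices)          # trim to the shortest track
    targets = [v[:n] for v in voices]
    # Uniform noise scaled by noise_gain (an assumed, arbitrary level).
    noise = [noise_gain * rng.uniform(-1.0, 1.0) for _ in range(n)]
    mixture = [sum(v[i] for v in targets) + noise[i] for i in range(n)]
    return mixture, targets

# Two "speakers" talking at once in a noisy room.
voice_a = sine_wave(220, 1000)
voice_b = sine_wave(330, 1000)
mixture, targets = make_party([voice_a, voice_b])
```

Because the clean tracks are known in advance, each synthetic mixture comes with its own answer key, which is what makes this kind of generated data usable for training.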
In this clip from the Google team, you can watch two comedians compete vocally for attention. Despite the chaos, the AI generated a clean audio track for a single person by focusing on their face. The AI keeps tracking a person and their voice even when their face is obscured by a waving hand or a microphone.
Google hasn't yet put the technology into its Google Home assistant. The company has said it's "exploring opportunities" to use the feature in products. There are a number of places it could apply the new AI feature, such as video chat services like Hangouts or Duo. The feature could also help enhance speech in video recordings, and it could even lead to camera-linked hearing aids that improve sound quality for hearing-impaired users.
Source: Interesting Engineering