Abstract

In a conference call or video conference, the audio transmitted by the far end is typically received at the near end as a single composite signal: the sum of the audio signals of all far-end participants. Far-end participants are therefore not spatially separable at the near end, which prevents near-end participants from using the brain's natural focusing ability (the cocktail party effect) to attend to a particular far-end speaker. This disclosure describes techniques, such as per-microphone audio channels and speech diarization, that distinguish far-end participants so that their audio is spatially separated at the near end.
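
As a rough illustration of what spatial separation at the near end could look like, the sketch below pans each far-end participant's mono stream to a distinct stereo position using constant-power panning. This is an assumption made for illustration only: the abstract does not specify a rendering method, and the function names `pan_gains` and `spatialize` are hypothetical. The sketch also assumes the per-participant streams have already been obtained, for example from per-microphone channels or from diarization.

```python
import math

def pan_gains(index, count):
    """Constant-power (left, right) gains for participant `index` of `count`."""
    # Assumption: participants are spread evenly from hard left to hard right;
    # a single participant is placed in the centre.
    position = 0.5 if count == 1 else index / (count - 1)
    angle = position * math.pi / 2
    return math.cos(angle), math.sin(angle)

def spatialize(streams):
    """Mix per-participant mono streams (equal-length lists of floats) into stereo."""
    count = len(streams)
    length = len(streams[0]) if streams else 0
    left = [0.0] * length
    right = [0.0] * length
    for index, samples in enumerate(streams):
        gain_l, gain_r = pan_gains(index, count)
        for n, sample in enumerate(samples):
            left[n] += gain_l * sample
            right[n] += gain_r * sample
    return left, right
```

With two participants, the first is rendered entirely in the left channel and the second entirely in the right, so a near-end listener can attend to either one by direction. A real system would more likely use binaural or multi-loudspeaker rendering, which this sketch does not attempt.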

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
