A system and method are disclosed that enable digital assistant devices to distinguish human voice input from non-human audio input using subaudible cue tones. The system includes a home assistant device with cue tone recognition that receives cue tones from non-human sources, including other assistant devices. The cue tones differentiate live, in-person audio input from remote or non-human audio. For example, a television commercial may emit a cue tone that signals a nearby device not to respond to the audio that follows. When devices communicate with one another, each device may also emit an inaudible cue tone indicating that behaviors reserved for interactions with a human will not be triggered.
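
The cue tone recognition described above can be illustrated with a minimal sketch. It is not the disclosed implementation: the disclosure names no frequency, sample rate, threshold, or detection method, so `CUE_HZ`, `SAMPLE_RATE`, `THRESHOLD`, and the function names below are all hypothetical, and the Goertzel algorithm is used here only as one common way to measure the energy of a single known tone.

```python
import math

SAMPLE_RATE = 16_000  # Hz; assumed, not taken from the disclosure
CUE_HZ = 25.0         # hypothetical cue-tone frequency; the disclosure names none
THRESHOLD = 0.01      # hypothetical detection threshold on normalized power


def goertzel_power(samples, sample_rate, target_hz):
    """Return the normalized power of `samples` at `target_hz` (Goertzel algorithm)."""
    n = len(samples)
    w = 2.0 * math.pi * target_hz / sample_rate
    coeff = 2.0 * math.cos(w)
    s_prev, s_prev2 = 0.0, 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    power = s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2
    return power / (n * n)


def should_respond(samples, sample_rate=SAMPLE_RATE):
    """True when no cue tone is detected, i.e. the audio is treated as live human input."""
    return goertzel_power(samples, sample_rate, CUE_HZ) < THRESHOLD


def tone(freq_hz, amplitude, n, sample_rate=SAMPLE_RATE):
    """Generate `n` samples of a sine wave (stand-in for captured audio)."""
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]


# 0.2 s window: a 440 Hz tone stands in for a live voice,
# and the same tone plus the cue tone stands in for broadcast audio.
N = 3200
live_audio = tone(440.0, 0.3, N)
broadcast_audio = [a + b for a, b in zip(live_audio, tone(CUE_HZ, 0.5, N))]

print(should_respond(live_audio))
print(should_respond(broadcast_audio))
```

In this sketch the device stays non-interactive whenever the measured power at the cue frequency exceeds the threshold. A practical detector would also need to handle noise, tone duration, and any information encoded in the tone, none of which is modeled here.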

