Systems and methods described herein allow for consolidation of requests from multiple users in a voice-activated interactive computer environment. A data processing system can receive at least two input audio signals from at least two user devices associated with at least two corresponding users and determine, for each received input audio signal, a respective user request. The data processing system can generate, for each received input audio signal, a corresponding action data structure, and compare the action data structures associated with the separate user requests to determine a pooling parameter indicative of overlapping features or themes among the action data structures. The data processing system can then generate a pooled action data structure based on the determined pooling parameter and transmit the pooled action data structure to a computing device associated with a service provider.
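
The comparison and pooling steps can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the disclosure: the `ActionDataStructure` fields (`action`, `destination`, `time_slot`), the choice to represent the pooling parameter as the set of matching field names, and the rule that requests are pooled only when all three fields match. Speech recognition and transmission to the service provider are omitted.

```python
from dataclasses import dataclass

# Hypothetical fields; the disclosure does not specify the contents
# of an action data structure.
COMPARED_FIELDS = ("action", "destination", "time_slot")


@dataclass(frozen=True)
class ActionDataStructure:
    """Assumed parsed form of one user's voice request."""
    user_id: str
    action: str
    destination: str
    time_slot: str


@dataclass
class PooledActionDataStructure:
    """Assumed consolidated request covering several users."""
    action: str
    destination: str
    time_slot: str
    user_ids: list


def pooling_parameter(a, b):
    """Return the set of field names on which two requests overlap."""
    return {f for f in COMPARED_FIELDS if getattr(a, f) == getattr(b, f)}


def pool(requests, required=frozenset(COMPARED_FIELDS)):
    """Pool requests whose overlap with a seed request covers `required`.

    Requests that match no other request are left unpooled.
    """
    pooled = []
    remaining = list(requests)
    while remaining:
        seed = remaining.pop(0)
        matches = [r for r in remaining
                   if required <= pooling_parameter(seed, r)]
        if matches:
            remaining = [r for r in remaining if r not in matches]
            pooled.append(PooledActionDataStructure(
                seed.action, seed.destination, seed.time_slot,
                [seed.user_id] + [m.user_id for m in matches]))
    return pooled
```

In this sketch, two ride requests to the same destination in the same time slot would be merged into one pooled structure listing both users, while a request that shares only the action type would stay separate. A real system would likely use a softer similarity measure than exact field equality.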

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.