Systems and methods described herein allow for monetization of a virtual personal assistant by offering sponsored content to users in response to input audio signals that trigger online actions. A data processing system can identify a user request based on a received input audio signal. The data processing system can execute an online action responsive to the identified user request. The data processing system can generate a response to the input audio signal based on the executed online action, and select a sponsored content item based on a context of the identified user request. The data processing system can then transmit the generated response and/or the selected sponsored content item for presentation to the user. The user may respond with an instruction to select or confirm a service described in the generated response or the sponsored content item, and the data processing system can inform the respective service provider of that confirmation.
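
The flow described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the input audio signal is assumed to be already transcribed to text, keyword matching stands in for request identification, and every name here (UserRequest, SponsoredItem, KEYWORDS, INVENTORY, ExampleEats, ExampleCabs, and all function names) is hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class UserRequest:
    action: str   # online action to execute, e.g. "find_restaurant"
    context: str  # context used to select sponsored content, e.g. "dining"


@dataclass
class SponsoredItem:
    provider: str
    context: str
    message: str


# Hypothetical keyword table standing in for speech recognition and
# natural-language understanding of the input audio signal.
KEYWORDS = {
    "restaurant": UserRequest("find_restaurant", "dining"),
    "taxi": UserRequest("book_ride", "transport"),
}

# Hypothetical sponsored-content inventory.
INVENTORY = [
    SponsoredItem("ExampleEats", "dining", "ExampleEats can deliver dinner tonight."),
    SponsoredItem("ExampleCabs", "transport", "ExampleCabs has a car nearby."),
]


def identify_request(transcript: str) -> Optional[UserRequest]:
    """Identify the user request from an already-transcribed audio input."""
    lowered = transcript.lower()
    for keyword, request in KEYWORDS.items():
        if keyword in lowered:
            return request
    return None


def execute_online_action(request: UserRequest) -> str:
    """Placeholder for the online action; returns the generated response."""
    return f"Completed action '{request.action}'."


def select_sponsored_item(request: UserRequest) -> Optional[SponsoredItem]:
    """Select the first inventory item whose context matches the request."""
    return next((item for item in INVENTORY if item.context == request.context), None)


def handle_input(transcript: str) -> Tuple[Optional[str], Optional[SponsoredItem]]:
    """Run the pipeline: identify, execute, respond, select sponsored content."""
    request = identify_request(transcript)
    if request is None:
        return None, None
    response = execute_online_action(request)
    return response, select_sponsored_item(request)


def confirm_service(item: SponsoredItem, notified: List[str]) -> None:
    """Record that the service provider was informed of the confirmation."""
    notified.append(item.provider)
```

In this sketch, calling handle_input("Find me a restaurant nearby") returns the generated response together with the sponsored item whose context is "dining", and a later confirm_service call appends that item's provider to a list that stands in for notifying the service provider.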

