Aspects of the present disclosure are directed to a projection device that projects images onto a surface and detects user interactions with the projected images. For example, the projection device can include a projector that can project images onto the surface, a camera that can capture imagery of user interactions with the projected images, and a processing system (e.g., one that includes one or more machine-learned models) that can process the captured imagery to understand the user interactions, thereby receiving user input. The images projected onto the surface can include still images, moving images, or a user interface that includes elements with which a user can interact. As one example, the projection device can receive data descriptive of the user interface of a mobile device (e.g., a smartphone) and can project that interface onto the surface of a table, thereby allowing a user to interact with the mobile device’s interface on the table.
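As a non-limiting illustration of how detected interactions could be turned into user input, the following minimal sketch maps a touch location to an element of the projected user interface. It assumes the processing system has already extracted a touch point from the camera imagery (e.g., using a machine-learned model) and expressed it in the coordinate frame of the projected image. The names `UIElement`, `hit_test`, and `handle_touch` are hypothetical and are not part of the disclosure.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class UIElement:
    """A rectangular interactive element of the projected interface (hypothetical)."""
    name: str
    x: float       # left edge, in projected-image pixels
    y: float       # top edge, in projected-image pixels
    width: float
    height: float

    def contains(self, px: float, py: float) -> bool:
        """Return True if the point lies inside this element's rectangle."""
        return (self.x <= px < self.x + self.width
                and self.y <= py < self.y + self.height)


def hit_test(elements: Sequence[UIElement],
             point: Tuple[float, float]) -> Optional[UIElement]:
    """Return the element under the point, or None if there is none.

    Assumption: elements later in the sequence are drawn on top, so they
    are checked first.
    """
    px, py = point
    for element in reversed(elements):
        if element.contains(px, py):
            return element
    return None


def handle_touch(elements: Sequence[UIElement],
                 touch_point: Optional[Tuple[float, float]]) -> Optional[str]:
    """Translate a detected touch point into the name of the touched element.

    touch_point is None when no interaction was detected in the frame.
    """
    if touch_point is None:
        return None
    hit = hit_test(elements, touch_point)
    return hit.name if hit is not None else None


if __name__ == "__main__":
    # Hypothetical layout of a mirrored mobile interface.
    layout = [
        UIElement("back", 0, 0, 100, 50),
        UIElement("send", 200, 300, 120, 60),
    ]
    print(handle_touch(layout, (250, 320)))  # prints "send"
```

In this sketch, the name of the touched element could be reported back to the mobile device as the received user input. How the touch point is detected, and how camera coordinates are related to projected-image coordinates, is left outside the example.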

