On video-sharing platforms, users access some video clips primarily for their audio rather than their visual content. In such clips, the display may be idle or otherwise uninteresting to the viewer. The techniques of this disclosure apply machine learning to detect whether the visual portion of a video clip is likely not of interest to the user. If the visual portion is detected to be not of interest, permission is sought from the user to insert a visual ad into the clip while the audio continues playing unchanged. If the user grants permission, ads are inserted in the portions of the clip identified as not being of interest to the user, thereby monetizing the video clip.
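
The flow described above (score the visuals, ask permission, then place ads only in low-interest portions) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and is not part of the disclosure: the `Segment` type, the per-segment `visual_interest` score (standing in for the output of an unspecified machine-learning model), the `threshold` and `min_duration_s` values, and the `user_permits` callback.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Segment:
    """One scored portion of a clip (hypothetical structure)."""
    start_s: float
    end_s: float
    visual_interest: float  # assumed model score in [0, 1]; low = uninteresting


def find_ad_slots(
    segments: List[Segment],
    threshold: float = 0.2,       # assumed cutoff, not from the disclosure
    min_duration_s: float = 10.0  # assumed minimum slot length
) -> List[Tuple[float, float]]:
    """Merge adjacent low-interest segments and keep runs long enough for an ad."""
    slots: List[Tuple[float, float]] = []
    run_start = None
    run_end = None
    for seg in segments:
        low = seg.visual_interest < threshold
        if low and run_start is not None and seg.start_s <= run_end:
            run_end = seg.end_s  # extend the current low-interest run
            continue
        # Close any open run before starting a new one (or none).
        if run_start is not None and run_end - run_start >= min_duration_s:
            slots.append((run_start, run_end))
        run_start, run_end = (seg.start_s, seg.end_s) if low else (None, None)
    if run_start is not None and run_end - run_start >= min_duration_s:
        slots.append((run_start, run_end))
    return slots


def plan_visual_ads(
    segments: List[Segment],
    user_permits: Callable[[], bool],
) -> List[Tuple[float, float]]:
    """Return time spans for visual ads; audio is left untouched throughout."""
    slots = find_ad_slots(segments)
    if not slots:
        return []  # nothing to ask about
    if not user_permits():
        return []  # no permission, so no ads are inserted
    return slots


segments = [
    Segment(0.0, 5.0, 0.9),
    Segment(5.0, 12.0, 0.1),
    Segment(12.0, 20.0, 0.05),
    Segment(20.0, 25.0, 0.8),
    Segment(25.0, 30.0, 0.1),
]
print(plan_visual_ads(segments, user_permits=lambda: True))
```

In this sketch the user is asked only when at least one suitable slot exists, and a refusal yields no ad placements at all. How the interest score is actually computed is left open here.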

This work is licensed under a Creative Commons Attribution 4.0 License.