A technique is proposed for translating documents, such as slide-based presentations, into videos with or without captions or subtitles. A presentation intake component extracts a script or speaker notes from the content supplied by the user and creates a text file corresponding to each slide or image. A translation component then translates the script into multiple target languages. A synthesizing component uses text-to-speech conversion to create an audio file for each slide or image. A video creator component creates a silent video for each still image, lasting for the duration of that image's audio file, with or without captions or subtitles generated from the translated text. The video creator component then merges the silent video files with the audio files to produce the final translated video. At each step, the technique saves its output files as intermediate artifacts, which the user may download, alter, and/or upload during the production process.
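
The flow of intermediate artifacts described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: every function name, the file naming scheme, and the sample rate are hypothetical. The `translate` argument stands in for whatever translation service is used, `synthesize_placeholder` writes silence instead of calling a real text-to-speech engine, and `ffmpeg_commands` only assembles (and does not run) command lines, on the assumption that the `ffmpeg` tool is used for the video steps.

```python
import wave
from pathlib import Path
from typing import Callable, List

SAMPLE_RATE = 16000  # assumed sample rate for the placeholder audio


def intake(notes: List[str], workdir: Path) -> List[Path]:
    """Write one text file per slide from its speaker notes."""
    workdir.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, text in enumerate(notes, start=1):
        path = workdir / f"slide_{i:03d}.txt"
        path.write_text(text, encoding="utf-8")
        paths.append(path)
    return paths


def translate_files(paths: List[Path], lang: str,
                    translate: Callable[[str, str], str]) -> List[Path]:
    """Save a translated copy of each script file; `translate` is supplied by the caller."""
    out = []
    for path in paths:
        target = path.with_name(f"{path.stem}.{lang}.txt")
        target.write_text(translate(path.read_text(encoding="utf-8"), lang),
                          encoding="utf-8")
        out.append(target)
    return out


def synthesize_placeholder(text_path: Path, seconds_per_word: float = 0.4) -> Path:
    """Stand-in for text-to-speech: writes silence sized to the word count."""
    words = len(text_path.read_text(encoding="utf-8").split())
    n_frames = round(max(words, 1) * seconds_per_word * SAMPLE_RATE)
    wav_path = text_path.with_suffix(".wav")
    with wave.open(str(wav_path), "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)  # 16-bit samples
        w.setframerate(SAMPLE_RATE)
        w.writeframes(b"\x00\x00" * n_frames)
    return wav_path


def audio_duration(wav_path: Path) -> float:
    """Return the length of a WAV file in seconds."""
    with wave.open(str(wav_path), "rb") as w:
        return w.getnframes() / w.getframerate()


def srt_timestamp(t: float) -> str:
    """Format seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    ms = round(t * 1000)
    h, ms = divmod(ms, 3_600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def build_srt(texts: List[str], durations: List[float]) -> str:
    """One caption per slide, timed by the cumulative audio durations."""
    blocks, start = [], 0.0
    for i, (text, d) in enumerate(zip(texts, durations), start=1):
        blocks.append(
            f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(start + d)}\n{text}\n")
        start += d
    return "\n".join(blocks)


def ffmpeg_commands(image: Path, audio: Path, duration: float,
                    out: Path) -> List[List[str]]:
    """Assemble (not run) commands: still image to silent video, then add audio."""
    silent = out.with_name(out.stem + ".silent.mp4")
    make_silent = ["ffmpeg", "-y", "-loop", "1", "-i", str(image),
                   "-t", f"{duration:.3f}", "-pix_fmt", "yuv420p", "-an", str(silent)]
    merge = ["ffmpeg", "-y", "-i", str(silent), "-i", str(audio),
             "-c:v", "copy", "-c:a", "aac", "-shortest", str(out)]
    return [make_silent, merge]
```

Because each stage reads and writes ordinary files, a user could edit a translated `.txt` file or replace a `.wav` file between stages and rerun only the later steps, which is the property the intermediate artifacts are meant to provide.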

Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.