Audiovisual Transcription

Audiovisual transcription is the production of a written document that records everything heard in an audio or video recording. The transcript can be an exact word-for-word record, or the transcriber can clean up parts of the speech by removing sounds such as ‘ehh’, ‘ooh’, ‘aah’ and ‘mmh’ that are heard but are not needed in the document. Transcription provides a solid starting point for subtitling and dubbing. The resulting text can also be reused to make your content more accessible, for example to deaf viewers, who will be able to read an accurate transcription of your video.
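The clean-up step described above can be sketched in a few lines of Python. This is only an illustration: the filler list is taken from the examples in the text, and the function name `clean_verbatim` is a hypothetical choice, not part of any particular transcription tool.

```python
import re

# Hypothetical filler list, based on the examples in the text; extend as needed.
FILLERS = ["ehh", "ooh", "aah", "mmh"]

# Match a filler as a whole word, plus an optional trailing comma and spaces,
# so that ordinary words such as "smooth" are left alone.
_FILLER_RE = re.compile(
    r"\b(?:" + "|".join(map(re.escape, FILLERS)) + r")\b,?\s*",
    flags=re.IGNORECASE,
)


def clean_verbatim(text: str) -> str:
    """Return the text with filler sounds removed and spacing tidied."""
    without_fillers = _FILLER_RE.sub("", text)
    return re.sub(r"\s{2,}", " ", without_fillers).strip()


print(clean_verbatim("So, ehh, the results are mmh quite good."))
```

A real clean-up pass usually needs a human check afterwards, since removing a filler can leave punctuation that no longer reads naturally.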


There is more than one way to transcribe audio or video content, and there are many reasons why an audiovisual transcript may be needed:

  • Transcribing audio content for the deaf or hard of hearing.

Offering accessible alternatives to people with hearing impairments not only strengthens a brand’s corporate social responsibility and image but can also open up its services to a whole new audience.

  • Transcribing audio for subtitles or closed captions.

To create subtitles or closed captions, you need a full transcript of the video. We then add time-code markers to the transcript that record what is being said at specific times in the video. This means that the subtitles or closed captions will appear on screen at exactly the time they are supposed to.
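As a rough illustration of how time-coded transcript segments become subtitles, the sketch below writes them out in the widely used SubRip (SRT) layout. The choice of SRT is an assumption made for the example, and the names `Cue`, `to_timecode` and `to_srt` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One transcript segment with its time codes (hypothetical structure)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


def to_timecode(seconds: float) -> str:
    """Format seconds as an SRT time code, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for number, cue in enumerate(cues, start=1):
        timing = f"{to_timecode(cue.start)} --> {to_timecode(cue.end)}"
        blocks.append(f"{number}\n{timing}\n{cue.text}\n")
    return "\n".join(blocks)


print(to_srt([Cue(0.0, 2.5, "Welcome to our company."),
              Cue(2.5, 5.0, "Let us show you around.")]))
```

Each block carries a sequence number, a start and end time, and the text to display, which is what lets a player show the right line at the right moment.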

  • Transcribing audio for voice over.

We transcribe the speech when the final voice over needs to be time-synced to a video. This means the timing of the voice over will match specific timings in the video and therefore stay in sync with what is happening on screen. This is a common requirement when translating the voice over in corporate videos or explainers.

  • Transcribing for translation purposes.

The first step in translating audiovisual content is always transcription. Once the time codes are in place for subtitle or voice over use, we can use the same transcription document to create as many foreign language versions as required. The time codes stay the same in every language and mark where each section of speech starts and ends. This is important because the translator may have to condense the translation to fit the voice over or subtitle timings so that they still match what’s on screen.
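The idea of reusing one set of time codes for every language, and spotting translations that need condensing, might be sketched as follows. The reading-speed limit of 17 characters per second is an assumed value for illustration only, and `Cue`, `retime_translation` and `needs_condensing` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One segment of speech with its time codes (hypothetical structure)."""
    start: float  # seconds from the start of the video
    end: float    # seconds from the start of the video
    text: str


# Assumed reading-speed limit; real projects set their own guideline.
MAX_CHARS_PER_SECOND = 17.0


def retime_translation(source_cues: list[Cue], translated_texts: list[str]) -> list[Cue]:
    """Pair each translated segment with the time codes of its source segment."""
    if len(source_cues) != len(translated_texts):
        raise ValueError("Each source segment needs exactly one translation.")
    return [Cue(cue.start, cue.end, text)
            for cue, text in zip(source_cues, translated_texts)]


def needs_condensing(cue: Cue, max_cps: float = MAX_CHARS_PER_SECOND) -> bool:
    """Flag a cue whose text is too long to read in the time available."""
    duration = cue.end - cue.start
    if duration <= 0:
        return True
    return len(cue.text) / duration > max_cps


english = [Cue(0.0, 2.0, "Hello"), Cue(2.0, 3.0, "Thanks for watching")]
spanish = retime_translation(english, ["Hola", "Muchas gracias por ver este video"])
for cue in spanish:
    print(cue.start, cue.end, cue.text, needs_condensing(cue))
```

Because the timings come from the source cues, every language version lines up with the same moments on screen, and only the flagged segments need the translator's attention.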