While transcriptionists aim to create an accurate transcript of the English speech heard, captioners aim to recreate the full audio experience for non-hearing viewers. That includes capturing the English speech and any singing, and using atmospherics to describe the music and sounds integral to the context of the story. The ultimate goal of captions is to give the viewer an experience as close as possible to that of a hearing viewer. Captioners also sync caption groups to the audio.
Dashes indicate speaker changes. In general, use only a dash "-" and space when it's obvious through visual clues who's speaking. Add a speaker label, [Name], if the speaker cannot be visually identified as speaking before being interrupted by another speaker. If the name is not known, use the most appropriate role descriptor such as Instructor, Narrator, Announcer, et cetera.
Proper caption breaking
Caption groups should be created for best readability. Start a new caption group after terminal punctuation (a period, question mark, or exclamation mark) or after a double dash marking an abrupt interruption by another speaker or relevant sound. A caption group cannot exceed five seconds in duration or 60 characters in length. Break before pronouns, adverbs, and prepositional phrases such as that, who, in order to, not only, as we, in which, where, with, what, how, for, through, until, to, as, of, yet, so, by, as well as conjunctions such as and, nor, but, or, because. Here's an example of good caption breaking:
It's invaluable as far as what it's going to do
for my job security and my options when I get out
of school and start looking for full-time work.
I don't miss school appointments or school plays.
Those are benefits that you can't get in an office.
I'm not sure how it can get much better than that.
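The breaking rules above can be sketched in code. The following Python is only an illustration, not part of any captioning tool: the function name break_captions is hypothetical, only the single-word break cues from the list are included, and the five-second limit is not checked because it depends on timing data. It fills each group up to 60 characters, breaks before the latest cue word when the limit would be exceeded, and starts a new group after terminal punctuation or a double dash.

```python
MAX_CHARS = 60  # per-group character limit from the guideline

# Single-word break cues from the guideline's list. Multi-word cues such as
# "in order to" are left out to keep this sketch small.
BREAK_BEFORE = {
    "that", "who", "where", "with", "what", "how", "for", "through",
    "until", "to", "as", "of", "yet", "so", "by",
    "and", "nor", "but", "or", "because",
}
# Endings that close a caption group. Abbreviations such as "Mr." would also
# match, so a real tool would need a smarter sentence-end check.
TERMINAL = (".", "?", "!", "--")


def break_captions(text, max_chars=MAX_CHARS):
    """Split text into caption groups (hypothetical helper, greedy approach)."""
    groups = []
    current = []  # words collected for the group being built
    for word in text.split():
        # If adding this word would exceed the limit, close the group first.
        while current and len(" ".join(current + [word])) > max_chars:
            cut = len(current)  # fallback: break right at the limit
            # Prefer to break before the latest cue word in the group.
            for i in range(len(current) - 1, 0, -1):
                if current[i].lower().strip(",;:") in BREAK_BEFORE:
                    cut = i
                    break
            groups.append(" ".join(current[:cut]))
            current = current[cut:]
        current.append(word)
        # Terminal punctuation or a double dash ends the group.
        if word.endswith(TERMINAL):
            groups.append(" ".join(current))
            current = []
    if current:
        groups.append(" ".join(current))
    return groups
```

Run on the first sentence of the example above as one unbroken string, this sketch reproduces the same three groups. A human captioner would still review the result, since a greedy rule cannot judge readability.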
Please use a caret when there's added text that appears in the lower third of the screen that's intended to be readable AND there's zero text in the upper third at the start time of the caption group. Use Shift + 6 for the ^ in the Type stage. Do not use a caret in these instances:
- Text is native to the video recording and not added later, such as a software or game interface
- Video property text such as production running timecodes or logos, or functioning as a logo
- Graphics or images
- When there's also text in the upper third at the start time of the caption group
- Unchanging text that persists for the entire duration of the video
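The caret rule and its exceptions amount to a short decision procedure. The Python below is a minimal sketch of that logic under stated assumptions: the OnScreenText class, its field names, and the needs_caret function are all hypothetical and exist only to make the conditions explicit.

```python
from dataclasses import dataclass


@dataclass
class OnScreenText:
    """Hypothetical description of text visible when a caption group starts."""
    added_in_post: bool        # False for text native to the recording, e.g. a game interface
    in_lower_third: bool       # True if the text sits in the lower third of the screen
    intended_readable: bool    # True if the viewer is meant to read it
    is_logo_or_timecode: bool = False   # video property text, or text functioning as a logo
    is_graphic_or_image: bool = False   # graphics or images rather than text
    persists_whole_video: bool = False  # unchanging text shown for the entire video


def needs_caret(text: OnScreenText, upper_third_has_text: bool) -> bool:
    """Return True when the guideline calls for a caret on this caption group."""
    # The base condition: added, readable text in the lower third.
    if not (text.added_in_post and text.in_lower_third and text.intended_readable):
        return False
    # No caret if the upper third also has text at the group's start time.
    if upper_third_has_text:
        return False
    # Remaining exceptions from the list above.
    return not (
        text.is_logo_or_timecode
        or text.is_graphic_or_image
        or text.persists_whole_video
    )
```

For example, an added lower-third name banner with an empty upper third gives True, while the same banner shown together with upper-third text gives False.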
Syncing caption groups
In the sync stage, use the Up or Down Arrow key to sync each caption group so it appears on screen when the audio begins. The start time needs to align with the beginning of the sound. This applies to both atmospherics and speech. Aim for precision, but it’s okay for the start time to be up to half a second earlier or later than the start of the sound.
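The half-second tolerance can be expressed as a simple check. This is a sketch only: the function names and the use of seconds as floating-point numbers are assumptions, not part of the sync tool.

```python
SYNC_TOLERANCE_S = 0.5  # allowed drift in seconds, early or late


def sync_offset(caption_start: float, sound_start: float) -> float:
    """Seconds the caption leads (negative) or lags (positive) the sound."""
    return caption_start - sound_start


def is_in_sync(caption_start: float, sound_start: float,
               tolerance: float = SYNC_TOLERANCE_S) -> bool:
    """True when the caption start is within the tolerance of the sound start."""
    return abs(sync_offset(caption_start, sound_start)) <= tolerance
```

A caption starting at 12.3 seconds for a sound that begins at 12.0 seconds passes, while one starting at 13.0 seconds does not.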
Caption lyrics when there is no spoken dialogue occurring at the same time. In the absence of spoken words, the lyrics become the dialogue to be captioned. Add a musical eighth note “♪” at the start of every caption group containing the lyrics by typing ## followed by a space in Dash.
Captions need to indicate the sounds heard in the video. These identifiers, which we call atmospherics, provide visual indicators of non-verbal sounds to viewers. Use adjectives to describe mood music, e.g., (bright piano music), and use active verbs to describe relevant sounds heard, e.g., (jet engine roaring) or (audience cheering). Keep these points in mind:
- Use parentheses ( ) and lowercase unless a proper noun is used
- Use a noun + descriptor/verb in the present tense, such as (audience cheering)
- For music, include adjectives describing the type of music
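The punctuation and casing points above can be applied mechanically. The sketch below assumes a hypothetical helper, format_atmospheric, that wraps a description in parentheses and lowercases every word except those passed in as proper nouns. It cannot judge whether the wording uses the present tense or suitable adjectives, which remains the captioner's call.

```python
def format_atmospheric(description: str,
                       proper_nouns: frozenset = frozenset()) -> str:
    """Wrap a sound description in parentheses and lowercase it.

    Words listed in proper_nouns keep their capitalization. The caller
    supplies them, because this sketch cannot detect proper nouns itself.
    """
    # Drop surrounding whitespace and any parentheses already present.
    words = description.strip().strip("()").split()
    kept = [w if w in proper_nouns else w.lower() for w in words]
    return "(" + " ".join(kept) + ")"
```

For instance, format_atmospheric("Jet engine roaring") returns "(jet engine roaring)", and a description that already has parentheses is not wrapped twice.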