Style Guide References


The style guides below contain guidelines our transcribers and captioners follow when working on your files. 

 

Transcription Style Guide

The Transcription Style Guide explains Rev's expectations for transcript quality. In addition to this guide, which covers the fundamental elements of a quality transcript, we have a robust Help Center for additional guidance and examples. 

We trust you to deliver high-quality work. Customers rely on your accurate and timely transcription as a crucial part of their daily work.

 

Browser Compatibility
Rev recommends that you use the most up-to-date version of Google Chrome when working in the transcription editors.

 

Transcription Accuracy

Precision
Always transcribe the audio as spoken. Remember, the transcript should accurately reflect what was actually said in the audio file. Although spoken word is not always grammatically correct, your transcript must preserve the integrity of the original speech. Do not type what you think the speaker meant to say.
Always attribute what is being said to the correct speaker.

  • Do not omit content
    • There are allowable exceptions.
  • Never add content or paraphrase.
  • Never censor or edit expletives/profanity if the word is spoken.
    • If the word is censored with a sound or silenced in the audio, use a notation such as (beep) or (censored) to indicate the censored word.
  • Egregious phonetic and pronunciation errors that inhibit readability or understanding may be corrected to help readability.
    • Example: if a speaker pronounces “refrigerator, washer and dryer” as “refrigurator, warshar and dryear”, please use the correct word and spelling based on the context of the audio.
  • Informal contractions may be corrected in non-verbatim projects to help readability.
    • Informal contractions are short forms of words that people use while speaking casually.
    • You may change these to the formal form when applicable if it would help with readability.
      • ’cause ➜ because
      • ’em ➜ them
      • doin’ ➜ doing
      • gonna ➜ going to
      • gotta ➜ got to
      • kinda ➜ kind of
      • wanna ➜ want to
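
The list above can be sketched as a small lookup table. This is only an illustration: the helper name `expand_informal_contractions` is hypothetical and not part of any Rev tool, it covers only the seven forms listed here, and it is meant for non-verbatim projects where the change actually helps readability.

```python
import re

# Straight, left-curly, and right-curly apostrophes, since transcripts mix them.
APOSTROPHES = "'\u2018\u2019"

# The informal contractions listed above, keyed with a straight apostrophe.
INFORMAL_TO_FORMAL = {
    "'cause": "because",
    "'em": "them",
    "doin'": "doing",
    "gonna": "going to",
    "gotta": "got to",
    "kinda": "kind of",
    "wanna": "want to",
}


def _key_to_regex(key: str) -> str:
    # Keys contain only letters and apostrophes, so no escaping is needed.
    return key.replace("'", "[" + APOSTROPHES + "]")


# A listed form must not be glued to another letter, digit, or apostrophe.
_BOUNDARY = r"[\w" + APOSTROPHES + "]"
_TOKEN = re.compile(
    "(?<!" + _BOUNDARY + ")("
    + "|".join(_key_to_regex(k) for k in INFORMAL_TO_FORMAL)
    + ")(?!" + _BOUNDARY + ")",
    re.IGNORECASE,
)


def expand_informal_contractions(text: str) -> str:
    """Replace the listed informal contractions with their formal forms."""

    def _replace(match: re.Match) -> str:
        token = match.group(1)
        key = token.lower()
        for curly in APOSTROPHES[1:]:
            key = key.replace(curly, "'")
        formal = INFORMAL_TO_FORMAL[key]
        # Keep a capital letter if the spoken form started with one.
        first_letter = next(c for c in token if c.isalpha())
        if first_letter.isupper():
            formal = formal[0].upper() + formal[1:]
        return formal

    return _TOKEN.sub(_replace, text)
```

For instance, `expand_informal_contractions("Gonna be late.")` returns `"Going to be late."`. Other contractions such as "I'm" or "don't" are left untouched because they are not in the table.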

 

Wrong Words
Always use context clues in the audio to type the correct word or phrase. If you are unsure of a word or phrase, try researching, using Lend An Ear, or asking for a second opinion on the forum.

 

Examples

  • “aerospace” vs “arrow space”; “Botox” vs “boat ox”
    • Always use context clues to write down the appropriate word. This is especially important for proper nouns or industry terminology.
  • “looked” vs “loved”; “kissed” vs “killed”
    • Take your time while transcribing; a changed word could result in a drastic change in the meaning of a sentence.
  • “than” vs “then”
    • Be mindful to use the correct form of a homophone.
  • “desert” vs “dessert”
    • Sometimes even a single letter can completely change the meaning.

 

Spelling & Grammar

  • Use standard U.S. spelling.
  • Always research words, phrases and proper nouns (names, companies, titles, etc.) you are unfamiliar with.
    • If you cannot confirm the spelling of a proper noun through research, use your best guess and keep it consistent throughout the project.
  • Always reference glossary terms when provided. If a customer has provided glossary terms, they will display in the left-hand menu of the editor.
  • Make sure to spell check for spelling and typographical errors.*
  • Use English grammar conventions while maintaining the integrity of what was spoken.
    • We are unable to cover and address specific guidelines regarding grammar.
    • We expect you to have prior knowledge of, or be able to research, American English grammar, capitalization, and punctuation guidelines.


* The spellcheck in the Editor is a very helpful tool to help catch errors, but it is still ultimately up to you to proof your document for spelling errors/incorrect word swaps.

 

Verbatim vs. Non-Verbatim

Verbatim
In verbatim projects, transcribe exactly what you hear, including filler words, stutters, interjections (active listening) and repetitions.

You will be able to tell if a project is verbatim in Find Work (indicated in the TYPE column) and in the editor (listed next to TYPE in the upper right corner, above the playback controls).

If the project was requested Verbatim but was not completed as such, the project will be graded 1/1 for accuracy/formatting.

 

Non-Verbatim (default style)
In non-verbatim projects, you should lightly edit for readability. You should not change the structure or meaning of the speech. Non-verbatim projects will not have an indicator in the TYPE column in Find Work and are listed as NON-VERBATIM in the editor.

 

Element | Verbatim | Non-Verbatim
Non-speech sounds: (laughs) and (laughing) are the only non-speech sounds we capture, and only in verbatim projects. | Yes | No
All OTHER non-speech sounds (e.g. coughs, sneezes, clapping, paper rustling, dog barks, car honks) should not be transcribed. | No | No
Interjections or signs of active listening that interrupt a speaker (e.g. Okay, Yeah, Mm-hmm). | Yes | No
Filler words (um, uh): also known as “verbal pauses”; other words such as like or you know may also be used like this. | Yes | No
False starts / self-corrections that are quickly reworded, unless they provide additional context. A complete sentence is not a false start. | Yes | No
Stutters (e.g. I think we should go to the, the m- m- movies.) | Yes | No
Explicit content or profanity should be captured as spoken (or as censored) in both default and verbatim projects. | Yes | Yes
Singing should be noted only as (singing) in both default and verbatim projects; do not transcribe the lyrics to a song in a transcript. | Yes | Yes

 

Examples: Verbatim vs Non-Verbatim

Example | Verbatim | Non-Verbatim
Example 1 | And so, um, I guess… I think we should go to the, the m- m- movies tonight ’cause of the discount (laughs). | I think we should go to the movies tonight because of the discount.
Example 2 | I like, you know, called her, like, yesterday and, um, like, she was, like, sleeping. Probably, she was just like, really tired. | I called her yesterday and she was sleeping. Probably, she was just really tired.
Example 3 (Leave the false start in default style because it provides context as to who called.) | My mom was (laughs)… I forgot to tell you, she called me yesterday. | My mom was… I forgot to tell you, she called me yesterday.
Example 4 (Remove the false start in default style because “My mom” is introduced later.) | My mom… I forgot to tell you, my mom called me yesterday. | I forgot to tell you, my mom called me yesterday.

 

Inaudibles
An inaudible tag should be used when unintelligible or inaudible words are spoken. This may happen due to difficult audio quality, a sound (such as a car horn) obscuring the main speaker, or recording issues. This tag should never be used in place of research when you are unfamiliar with a term.


 

  • Excessive Inaudibles: If you are using an excessive number of inaudibles in a transcript (to the point where the transcript would be unusable to the customer), unclaim and report the file as difficult audio.
  • Incorrect use of the inaudible tag is an error
    • Using the tag when the word can be identified is an accuracy error.
    • Incorrectly formatting the tag is a formatting error, as explained under Notation Tags.

 

Transcription Formatting

Provided Speaker Labels
If a customer has provided speaker labels, they will appear in the information pane in the left-hand panel of the editor. You must use them if:

  • The speaker is self-identified in the audio or video.
    • “My name is Arnold”
  • You can reasonably infer who is speaking if another speaker introduces the name.
    • “What do you think, Gustav?”
  • There is only one speaker and one name is provided.
  • You can use the process of elimination to assign the correct speaker names (e.g. one male name and one female name match up with one male speaker and one female speaker).


If you cannot assign the provided speaker labels, follow the guidelines below for Inferred Speaker Labels.

 

Inferred Speaker Labels
A reasonable effort must be made to distinguish speakers using the rules below: 

  • Never create your own descriptive speaker labels (e.g. “Old man” or “Blue shirt guy”).
    • This is extremely unprofessional and will result in a 1 in Formatting.
  • Please make every effort to not use gender in any format for speaker labels.
    • This can be considered offensive in some scenarios, and other options must be explored.
    • While this would not qualify as an automatic 1 in formatting, this will result in a reduction in score.
    • Speaker + number, roles/titles, or group-type labels can be used, depending on the scenario in the project.

 

Speaker Label Type | Examples | When to Use
Speaker + Number | Speaker 1, Speaker 2 | Default and most common way of labeling speakers when the speaker’s name cannot be reasonably inferred from the audio or video.
Speaker’s Name | John Smith, Sara, Professor Lee | If the speaker’s name can be reasonably inferred from the audio or video. If labels were not provided by the customer, Speaker + Number is also acceptable in this scenario.
Professional Role or Title | Interviewer, Doctor, Translator | (Optional) If the speaker’s name cannot be reasonably inferred from the transcript. Using Speaker + Number is also acceptable.
Group Label | Students, Audience, Camera Crew, Speaker X | Only when there are too many speakers to consistently track who says what (e.g. classroom discussion, focus group). Do not use as a substitute for reasonable speaker identification.
Customer-provided speaker labels must be used whenever possible according to the guidelines in the previous section.

 

Notation Tags
If you encounter difficult or non-English audio, use one of the bracketed notation tags below, including a timestamp of the audio location. Also take note of the parenthetical tags used for singing, laughter, and censored content. Do not create a notation tag not listed below. 

 

Notation Tag | When to Use
[inaudible hh:mm:ss] | Use when unintelligible or inaudible words are stated. Equivalent to a “blank” in medical transcription.
[foreign language hh:mm:ss] | For any non-English portions of audio, indicate where they begin with a timestamp and either the name of the language (if known) or simply “foreign language”. DO NOT transcribe non-English audio. If a translator is speaking on a respondent's behalf, there is no need to denote [foreign language hh:mm:ss] every time that the respondent speaks.
(singing) | Used only if the lyrics cannot be clearly discerned because of challenging audio or unclear singing.
(laughs) or (laughing) | Used to indicate laughter in verbatim files only.
(beep), (censored) | Used to indicate words that have been intentionally censored in the audio (usually profanity or redacted content). DO NOT censor content if it is spoken in the audio.
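
As a rough illustration of the bracketed hh:mm:ss format described above, the sketch below builds a tag from a position in seconds. The function name `format_notation_tag` is hypothetical, and the editor may insert these tags for you; the point is only to show the zero-padded timestamp layout. Parenthetical tags such as (laughs) or (beep) carry no timestamp and are not handled here.

```python
def format_notation_tag(label: str, seconds: float) -> str:
    """Build a bracketed tag such as [inaudible 00:12:34].

    `label` is expected to be "inaudible" or "foreign language";
    `seconds` is the position in the audio where the tag applies.
    """
    if seconds < 0:
        raise ValueError("audio position cannot be negative")
    total = int(seconds)  # drop fractions of a second
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"[{label} {hours:02d}:{minutes:02d}:{secs:02d}]"
```

A call such as `format_notation_tag("inaudible", 3725)` gives `[inaudible 01:02:05]`.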

 

Dictation

Occasionally, customers dictate instructions to format the transcription while they are speaking. These instructions should be followed when possible but never transcribed.

  • Follow customer requests for spoken directions such as “new paragraph”, “comma”, “period” or “bullet point” (use a dash). Do not type out the instruction.
  • If a customer has clearly missed an instruction (e.g. “period” after a sentence has obviously concluded), it’s acceptable to add it in to aid readability.
  • As Rev does not support text formatting in the editor, ignore requests such as “bold”, “italics”, “underline” or “strikethrough”.


 

Lyrics and Singing

Transcribe lyrics when there is no spoken dialogue occurring at the same time, even if there are pre-existing captions or subtitles on the video. When there are no spoken words, the lyrics become the dialogue to be transcribed.

  • Omitting clearly heard lyrics will result in a reduction in score and can be scored as low as 1/1.
  • Tip: Googling portions of the lyrics can be helpful.

 

How to notate lyrics

  • Use the speaker label MUSIC for lyrics
    • If there are multiple singers and/or background vocalists, all content should remain under a single MUSIC label
  • Each lyrics line should be on a new line, or in a new paragraph in the editor
  • Capitalize the beginning of a line of lyrics
  • Each lyrics line should end with a period
  • Background vocals, if present, should be transcribed and included on a new line
    • Do not use parentheses for background vocals
  • All lyrics need to be transcribed as sung, including repeated lines
    • Do not use elements like (repeat x2)
  • When words repeat at the end of a song and fade out, ellipses may be used to represent the fadeout
  • Filler words in lyrics, like oh, ah, etc., should be transcribed sparingly if they add to the song's content stylistically
    • When these are background vocals, they can be omitted

 

Instrumental Music Only

Music should only be noted if the project only contains instrumental music. Music should not be noted in a project with dialogue or singing. 

A project with only instrumental music will contain music and no dialogue, speaking, singing, or lyrics. If a project has only instrumental music, it can be submitted with a single (music) notation tag with MUSIC as the speaker label.


Please note that we strongly recommend listening to the full length of the project to ensure that there is no dialogue or singing later in the file. Omitted dialogue or singing can result in a score as low as 1/1.

 

Additional Guidelines

Unworkable Projects
Certain types of projects are considered “unworkable” and should not be completed. You can unclaim a project by selecting Unclaim in the Project dropdown in Line. Unclaim projects if they meet the criteria below.

 

If... | Then...
The only audible content is in a non-English language | Unclaim the project as “No English audio present”
The content violates our Terms of Service (pornography, excessively violent, hate speech, etc.) | Unclaim the project as “Contains explicit or disturbing content”

 

If you submit a project that is 100% foreign language, you may receive a grade of 1/1 and have the project pay removed.

 

Project-Specific Instructions
Occasionally a project may have approved special instructions that deviate from our normal guidelines. These instructions will be clearly marked as Special Instructions in the editor with either a yellow banner or in a designated Special Instructions section in the left-hand menu. They will also appear on the Find Work page. 

Customers will sometimes include separate instructions that go against our Style Guide in the glossary or speaker name section. Any customer-provided requests that do not appear in the designated Special Instructions section or banner and that go against our style guidelines should be ignored.

 

Not following official special instructions is considered an error and may result in a score of 1/1.

 

Transcription Rubrics

In transcription, a grade consists of scores in two categories: Accuracy and Formatting. This scoring rubric is used by graders when assessing overall project quality.

5 - Customer-ready | Transcript may contain occasional errors. Overall, the transcript is accurate and high quality.
3 - Not customer-ready | Transcript contains frequent errors, and multiple edits would be needed before this is considered customer-ready.
1 - Unusable | Transcript appears incomplete, partially unedited, or of such poor quality to be unusable.*

 

Accuracy Rubric
The overall accuracy quality of the graded sections can be described as:

5 - Customer-ready | Spoken audio is accurately represented. May contain occasional accuracy and/or punctuation errors that moderately impact meaning or readability. Speech is almost always attributed to the correct speaker, though there may be very rare misattributions.
3 - Not customer-ready | Spoken audio is sometimes misrepresented. Contains frequent accuracy and/or punctuation errors that regularly impact meaning or readability. Speech is sometimes attributed to the correct speaker, though there may be frequent misattributions.
1 - Unusable | Appears to be incomplete, unedited, or of such poor quality that the final deliverable is unusable. This includes verbatim projects captured in the default style.

 

 

Formatting Rubric
The overall formatting quality of the graded sections can be described as:

5 - Customer-ready | May contain occasional notation tag or labelling errors that minimally impact readability.
3 - Not customer-ready | Contains regular notation tag or labelling errors that impact readability.
1 - Unusable | Appears to be incomplete, partially unedited, or with poor adherence to the formatting guidelines. This includes unformatted dictation projects and verbatim projects captured in the default style.

Captions Style Guide

This guide explains Rev’s expectations for captions quality. In addition to this guide, which covers the fundamental elements of a quality captions project, we have a robust Help Center for additional guidance.

Captions are used as part of accessibility services. As such, we need to ensure that a deaf or hard-of-hearing viewer receives a similar experience to that of a hearing viewer.

 

Browser Compatibility
Rev recommends that you use the most up-to-date version of Google Chrome when working in the Dash editor.

 

Capturing Captions Content

Spoken Content
RULE: Caption all spoken English words, even if there are pre-existing captions or subtitles on the video. Lightly edit when necessary for readability. Use US spelling.

WHY: To provide a deaf or hard-of-hearing viewer the same experience of watching a video as anyone else.

 

Rules of thumb for caption accuracy:

  • Maintain the integrity of the spoken words.
    • Do not paraphrase, rearrange, or change the speaker's words.
    • Caption contractions, formal and informal, as spoken.
  • Most projects will have an auto-generated speech recognition draft. Do not overly rely on the draft.
  • Lightly edit unscripted productions, but do not omit intentionally spoken words.
  • You are expected to research proper nouns and terminology for representation and proper spelling.
    • Watching for terms on screen can be helpful.
    • Googling with a bit of context from your video/audio is also helpful.
    • URLs, hashtags, social media tags should be captioned using common convention: www.rev.com / #revcaptions / @rev
  • Include proper punctuation and capitalization per common English grammar rules.
  • Never type out a censored, silenced, obscured, or beeped/bleeped out word.
    • Use an appropriate atmospheric for the sound heard when the word is censored, e.g. (beep)

 

Speaker Identification
RULE: Indicate speaker changes by using a dash and a space at the beginning of each speaker’s dialogue. This includes the first speaker.

RULE: When the speaker cannot be obviously identified, include a bracketed identifier after the dash and space, also called a speaker ID. Always use appropriate language for speaker IDs.

WHY: So that a deaf or hard-of-hearing viewer will know someone different has started speaking.

 

When the speaker CAN be visually identified:

Use a dash and space, or speaker label, at the beginning of the speaker’s dialogue.

WHY? So a deaf or hard-of-hearing viewer will know someone different has started speaking.


 

When the speaker CANNOT be visually identified:

Use a dash, space, and bracketed identifier, or speaker ID, at the beginning of the speaker’s dialogue.

WHY? So a deaf or hard-of-hearing viewer will know who is speaking.


 

EXCEPTION: For audio-only projects, only a dash and space is needed for speaker changes.

 

Atmospherics
RULE: Captions need to indicate sounds heard on screen. We call these identifiers atmospherics.

WHY: Atmospherics provide visual indicators of non-verbal sounds to the viewer. This allows the deaf or hard-of-hearing audience to pick up on sounds that are important to the content of the video.

 

How to create atmospherics:   

Do

  • Use parentheses ( ) and lowercase unless a proper noun is used
  • Describe the sound or sounds heard on screen by following this convention:
    • noun + descriptor/verb in present tense form
      • (water boiling), (door slams)
    • The noun lets viewers know who or what is making the sound, while the descriptor/verb lets them know what the sound is
  • Always use present tense, e.g. (Erin coughs)

Don’t

  • Don’t use a dash or speaker label in a caption group containing only atmospherics
  • Don’t use onomatopoeia, e.g. (ribbit ribbit); instead, describe what’s creating the sound, e.g. (frog croaking)

 

Lyrics and Music
RULE: Caption music and lyrics when there is no spoken dialogue occurring at the same time, even if there are pre-existing captions or subtitles on the video.

WHY: When there are no spoken words, the lyrics become the dialogue to be captioned. When lyrics are heard, it is important that they be captioned on-screen for the deaf or hard-of-hearing audience to experience. 

 

Lyrics

  • When there is no spoken dialogue occurring at the same time, lyrics should be captioned, even if there are pre-existing captions or subtitles on the video.
  • Tip: Googling portions of the lyrics can be helpful.
  • Omitting clearly heard lyrics will result in a reduction in score and can be scored as low as 1/1/1. 

How to notate lyrics:

Include a musical note at the beginning of the caption group. In Dash, use ## followed by a space to create the musical note. Dash will add a second music note at the end of the caption group after the project is submitted.


 

Music Atmospherics

  • When a project also contains spoken words, only include a background music atmospheric if there’s a significant time gap and it would benefit the viewer to include.
  • A common format is a descriptor followed by the word “music.” You can indicate the progression of music with words like begins and continues. E.g. (orchestral music begins)
  • Introductory music is a common use case. E.g. (bells chiming)

 

Atmospheric-Only Projects
RULE: For atmospheric-only projects, accurately caption all sounds that are heard using atmospherics.

WHY: Even when there is no spoken audio in the project, atmospherics tell the story for the deaf or hard-of-hearing viewer. 

 

What is an atmospherics-only project?

  • Contains no dialogue or lyrics but does contain sounds
  • Contains no dialogue or lyrics but does contain music
  • Contains only cartoon gibberish
  • Contains no sound
    • If the entire file is blank or silent, the project can be submitted with a single (silence) atmospheric
      • Please note that we strongly recommend listening to the full length of the project to ensure that there is no dialogue, sound, or singing later in the file. Omitted dialogue, atmospherics, or singing can result in a score as low as 1/1/1

 

What is NOT an atmospherics-only project?

  • Project with dialogue where none of the dialogue is in English
  • Project with singing where none of the singing is in English

Projects that are 100% in a non-English language are considered unworkable. These projects should be unclaimed and reported as having no English content. 

 

How to caption an atmospherics-only project:

  • Include atmospherics more frequently than you would in a normal project.
  • Caption all environmental sounds, action sounds, character noises or gibberish using an atmospheric.
  • Use detailed atmospherics to capture music.
    • Does the instrument convey a tone?
    • Does the volume or tempo increase or drop off? 

Atmospherics are inherently subjective, and we understand that this isn't always a well-defined area. Please do your best to caption these projects in a way that creates valuable content for hard-of-hearing viewers.

 

Foreign Language
RULE: For foreign language (non-English) in a project, follow the guidelines below.

WHY: Some customers place orders with foreign languages spoken within the content. We want to be sure we are delivering a product the customer expects, so we’ve put together guidelines on how to handle foreign language. Noting when non-English dialogue is being spoken lets a deaf or hard-of-hearing viewer know that a conversation is taking place.

 

[Figure: foreign language decision tree]

 

Difficult or Challenging Content
RULE: You should do your best to caption all spoken words and/or lyrics. For extremely challenging content, follow the guidelines below.

WHY: When dealing with challenging content, consider the best interest of the customer and the final product that will be seen by the deaf or hard-of-hearing viewer. 

 

If your project is entirely too challenging to accurately caption the spoken words: 

  • Unclaim the project and select “difficult audio” as the reason for unclaiming.

 

If your project contains both challenging content and clear content:

  • Accurately caption the spoken words that can be clearly heard.
  • For certain challenging sections of audio where it’s not crucial to understand every word, use an atmospheric to provide context to the viewer.
    • e.g. (friends conversing quietly), (group chattering)
  • If an occasional word cannot be understood, use (indistinct) in place of the word.
    • (indistinct) should only be used if you absolutely cannot determine the word; excess or inappropriate tags could result in a lower grade.

 

Glossary Terms and Resource Files
RULE: Customers may provide additional information in the form of glossary terms. Customers may choose to upload a resource file with glossary information.

WHY: This additional information is provided to help create accurate captions. 

 

What is a glossary?
The glossary is a section where customers can provide clarification on the spelling of words and terms that may be presented in the video. These can include but are not limited to niche terminology, company names, and speaker names. 

 

Important things to note about glossary terms and resource files:

  • Always check projects for included glossary terms. Misspelling a term found in the glossary will result in a lower score.
  • If a customer provides instructions in a glossary or resource file that go against the guidelines found in this guide, they should be ignored. Caption the project as usual.
    • These types of instructions are not approved by Rev.
    • Following instructions that go against our guidelines can result in scores as low as 1/1/1.

 

Remember - Glossaries and resource files should only clarify spelling. Any additional instructions are not approved and should not be followed.

 

Scripts
RULE: Scripts should be used as resources to help create accurate captions. Regular Rev guidelines still apply to projects submitted with scripts.

WHY: When a customer provides a script, their expectation is that we will use their document to create the captions for their video. 

 

General guidelines when following scripts:

  • Follow scripts as closely as possible to ensure that all dialogue is captioned correctly. This will ensure that names and important terms are spelled correctly.
    • For some projects, it may be possible to use the scripts exactly as provided by copying and pasting the content into Dash.
    • If copying and pasting isn’t an option, the script should be used as a resource to ensure accuracy.
  • Some scripts may not be complete and we may need to add in captions for spoken dialogue and/or atmospherics that have not been included.
  • Scripts can provide valuable information about characters and scenes. This may be included as notes, separate from the dialogue. 

 

Remember - A script should only be provided to assist with captioning dialogue, names, terms, and sounds. Any additional instructions that do not follow Rev guidelines are not approved and should not be followed.

 

Formatting Captions

Caption Length
RULE: Individual caption groups must always contain 60 or fewer characters.

WHY: Captions that fall within certain character limits are easier to read quickly. This ensures that viewers do not miss spoken content.

The typing area turns green and then yellow as you add more characters and near the 60 character limit. 

It’s perfectly acceptable to submit captions with a white, green or yellow caption group. 

The typing box turns red when you are over the 60 character limit. Split the text across multiple caption groups until the red color disappears.


 

Caption Grouping
RULE: Captions should be split into individual caption groups so that whole phrases, nouns, sentences, and the flow of dialogue are interrupted as little as possible, while abiding by the 60-character limit.

WHY: This allows the captions to be easily read by the viewer, with logical breaks and enough time on screen.

 

Split captions...

  • after terminal punctuation
  • before conjunctions
  • before prepositions
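
To make the interaction between the 60-character limit and the preferred split points concrete, here is a minimal sketch. It is not how Dash works internally; the function names, the short word lists, and the choice to look for a break only in the back half of what fits are all assumptions made for illustration. A captioner's judgment about phrases still takes priority.

```python
MAX_CHARS = 60  # caption group limit stated in this guide

# Small illustrative word lists (assumptions, far from complete).
CONJUNCTIONS = {"and", "but", "or", "so", "because"}
PREPOSITIONS = {"in", "on", "at", "to", "of", "for", "with", "from", "by"}


def _best_break(words, count):
    """Pick how many leading words to keep, given that `count` words fit."""
    lowest = max(1, count // 2)  # avoid very short groups (assumption)
    candidates = range(count, lowest - 1, -1)
    # First choice: break after terminal punctuation.
    for i in candidates:
        if words[i - 1].endswith((".", "?", "!")):
            return i
    # Next: break before a conjunction, then before a preposition.
    for vocabulary in (CONJUNCTIONS, PREPOSITIONS):
        for i in candidates:
            if words[i].lower().strip(".,!?") in vocabulary:
                return i
    return count  # fall back to the last word that fits


def split_caption(text, max_chars=MAX_CHARS):
    """Split text into caption groups of at most `max_chars` characters."""
    words = text.split()
    groups = []
    while words:
        count, length = 0, 0
        for word in words:
            extra = len(word) if count == 0 else len(word) + 1
            if length + extra > max_chars:
                break
            length += extra
            count += 1
        if count == 0:
            count = 1  # a single over-long word is emitted as-is
        if count < len(words):
            count = _best_break(words, count)
        groups.append(" ".join(words[:count]))
        words = words[count:]
    return groups
```

For the sentence "We went to the store yesterday afternoon and bought some apples because they were on sale.", this sketch breaks before "and", giving two groups of 40 and 49 characters.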


 

Syncing Captions

Caption Timing (Alignment)

RULE: Sync each caption group (atmospherics and speech) so it appears on-screen when the audio begins. 

WHY: The deaf or hard-of-hearing viewer should see the text on screen at the same time it would have been heard by anyone else.

 

The start time needs to align with the beginning of the sound. 

  • Aim for precision, but it’s OK for the start time to be up to ½ second earlier or later than the true beginning of the sound.
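
The half-second tolerance can be expressed as a simple check. This is a hedged sketch only: the names `start_time_is_aligned` and `misaligned_groups` are hypothetical, times are assumed to be in seconds, and the "true" start of each sound is assumed to be known.

```python
TOLERANCE_SECONDS = 0.5  # allowed drift, early or late, per this guide


def start_time_is_aligned(caption_start, sound_start, tolerance=TOLERANCE_SECONDS):
    """Return True if the caption starts within the tolerance of the sound."""
    return abs(caption_start - sound_start) <= tolerance


def misaligned_groups(pairs, tolerance=TOLERANCE_SECONDS):
    """Return indexes of (caption_start, sound_start) pairs outside the tolerance."""
    return [
        index
        for index, (caption_start, sound_start) in enumerate(pairs)
        if not start_time_is_aligned(caption_start, sound_start, tolerance)
    ]
```

For example, `misaligned_groups([(10.5, 10.0), (13.0, 12.0)])` flags only the second pair, which is a full second late.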

Do not worry about the end time. 

  • Rev automatically calculates the end time for a caption group after you submit the project (post-processing).
  • NEVER add extra spaces to a caption group OR double up the captions in an attempt to adjust the amount of time the caption group is on-screen. This causes errors in the file format for customers.

Keep in mind the readability of the caption groups.

  • Split the caption groups if a speaker is talking very slowly (more than 5 seconds to say a sentence) or there is a long pause. This maintains proper timing with the speech and ensures the caption group does not end too early.
  • Use advanced caption format if multiple speakers are talking very quickly. This combines two quick phrases into the same caption group for readability.

 

Captions Grading Scale

A grade consists of scores in three categories: Accuracy, Formatting, and Alignment. You can view the rubric below to see how graders will assess your work.

5 - Customer-ready | Captions may contain occasional errors. Overall, the captions are accurate and high quality.
3 - Not customer-ready | Captions contain frequent errors, and multiple edits would be needed before this is considered customer-ready.
1 - Unusable | Captions appear incomplete, partially unedited, or of such poor quality to be unusable.*

 

Accuracy Rubric
Does the text accurately reflect the spoken audio?

Score | Quality Description
5 - Customer-ready | Spoken audio is accurately represented in the majority of the project. There may be some errors that impact readability or change the meaning of what was said, but they are infrequent. AND/OR Punctuation errors may be present but are infrequent. Captions are readable.
3 - Not customer-ready | Spoken audio is sometimes misrepresented. Multiple errors are present that moderately impact readability or change the meaning of what was said, though the overall capture shows intentional effort. AND/OR Punctuation errors that impact readability may be present and are noticeable.
1 - Unusable | Project appears to be incomplete, unedited, or capture is so poor that the final deliverable is unusable.

 

Formatting Rubric
Are the captions styled according to Rev's guidelines?

Score Quality Description
5 - Customer-ready

Captions are well-formatted, with no egregious style guide violations, though a few errors that don't impact readability may be present.

AND/OR

Speaker changes have been correctly indicated, though there may be an occasional error.

3 - Not customer-ready

Captions are sometimes poorly formatted. Egregious style guide violations are present that moderately impact readability, though the overall capture shows intentional effort.

AND/OR

Speaker changes are mostly correctly indicated, though there may be more frequent errors.

1 - Unusable: Project appears to be incomplete, unedited, or no attempt was made to follow formatting guidelines.

 

Alignment Rubric
Do the caption groups align with the audio?

Score Quality Description
5 - Customer-ready: Caption groups are synced within 0.5 seconds of speech throughout the project, though there may be occasional timing errors. Overall readability of the captions is not impacted.
3 - Not customer-ready: Several caption groups are timed incorrectly, impacting the readability of the captions.
1 - Unusable: Project appears to be incomplete or unedited, or no reasonable attempt was made to sync the project.
Premium Captions Supplemental Style Guide

This supplemental guide functions in conjunction with our standard Captions Style Guide and incorporates additional guidelines that should be applied when working on Pro projects.

Captions are used as part of accessibility services. As such, we need to ensure that a deaf or hard-of-hearing viewer receives a similar experience to that of a hearing viewer.

 

Browser Compatibility
Rev recommends that you use the most up-to-date version of Google Chrome when working in the Dash editor.

 

Capturing Premium Captions Content

Special Instructions
RULE: Instructions that appear in the Special Instructions section in Dash should always be followed, even if they fall outside of the standard Rev Style Guide. 

WHY: Rev has come to an agreement with some customers to honor special instructions to complete these files to best suit the customers’ and their audience's needs. 

 

Special Instructions should only be followed if they are in the appropriate section in Dash.

special_instructions.jpg

 

Please note:  Sometimes, customers may add instructions that go against Rev's Style Guide and have not been approved. In these instances, we would not follow the instructions.

Here are some areas where unapproved instructions may appear and should be ignored:

  • Customer-provided resource file
  • Customer-provided script
  • Glossary
  • Speaker names

 

REMEMBER - Special Instructions should only be followed if they are in the Special Instructions section in Dash and the project is a Pro project

 

Scripts 
RULE: Scripts should be used as resources to help create an accurate captions file. Regular Rev guidelines still apply to projects submitted with scripts.

WHY: When a customer provides a script, their expectation is that we will use their document to create the captions for their video.

 

General guidelines when following scripts:

  • Follow scripts as closely as possible to ensure that all dialogue is captioned correctly. This will ensure that names and important terms are spelled correctly.
  • For some projects, it may be possible to use the scripts exactly as provided by copying and pasting the content into Dash.
    • If copying and pasting isn’t an option, the script should be used as a resource to ensure accuracy.
    • Some scripts may not be complete and we may need to add in captions for spoken dialogue and/or atmospherics that have not been included.
  • Scripts can provide valuable information about characters and scenes. This may be included as notes, separate from the dialogue.

 

REMEMBER - A script should only be provided to assist with captioning dialogue, names, terms, and sounds. Any additional instructions that do not follow Rev guidelines are not approved and should not be followed.

 

Glossaries and Resource Files 
RULE: Customers may provide additional information in the form of glossary terms. Customers may choose to upload a resource file with glossary information. 

WHY: This additional information is provided to help create an accurate captions file. 

 

The glossary is a section where customers can provide clarification on the spelling of words and terms that may be presented in the video. These can include but are not limited to niche terminology, company names, and speaker names.

 

Important things to note about glossary terms and resource files:

  • Always check projects for included glossary terms and resource files. Misspelling a term found in the glossary or resource file will result in a poor customer experience.
  • If a customer provides instructions in a glossary or resource file that go against the guidelines found in the standard Captions Style Guide, they should be ignored. Caption the project as usual.
    • These types of instructions are not approved by Rev

 

REMEMBER - Glossaries and resource files should only clarify spelling. Any additional instructions are not approved and should not be followed.

 

Silence and General Atmospherics
RULE: Atmospherics need to be added at regular intervals during extended periods of silence or when a sound continues for an extended period. We should not have gaps of 10 seconds or more between captions.

WHY: Many Video on Demand (streaming) platforms require atmospherics in these scenarios.

 

Silence

  • A (no audio) atmospheric is needed for periods of silence. Use (no audio) to indicate the beginning of a period of silence.
  • For extended periods of silence, the (no audio) atmospheric should be repeated every 8 to 10 seconds.
    • Do not use continues for this atmospheric
  • If a project is completely silent, a (no audio) atmospheric should be repeated every 8 to 10 seconds.

 

General Atmospherics

  • Appropriate atmospherics should be used for sounds.
  • For extended periods of unchanging sounds, an atmospheric should be used every 8 to 10 seconds.
    • Use continues/continue in the atmospheric to indicate that the music or sound is continuing
      • (birds continue chirping)
      • (bell ringing continues)
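
As a rough illustration of the repetition rules above, the sketch below lays out atmospheric start times across an extended span. The function name, its parameters, and the default 8-second spacing are assumptions made for this example; they are not part of Dash or any Rev tool.

```python
def repeat_atmospheric(span_start, span_end, first, continuing=None, interval=8.0):
    """Return (start_time, text) pairs covering a span of silence or sound.

    Hypothetical helper: `first` is the opening atmospheric, `continuing`
    is the optional "continues" wording used for the repeats.
    """
    # The guide asks for a repeat every 8 to 10 seconds.
    if not 8.0 <= interval <= 10.0:
        raise ValueError("interval should be between 8 and 10 seconds")
    cues = []
    t = float(span_start)
    while t < span_end:
        # Silence repeats the same text; sounds switch to the "continues" form.
        text = first if not cues or continuing is None else continuing
        cues.append((t, text))
        t += interval
    return cues
```

For silence, pass only the first text, since (no audio) is repeated without "continues": `repeat_atmospheric(0, 20, "(no audio)")`. For an unchanging sound, pass both forms: `repeat_atmospheric(5, 25, "(birds chirping)", "(birds continue chirping)")`.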

 

Music Atmospherics
RULE: Atmospherics need to be added at regular intervals during extended periods of silence or when a sound continues for an extended period. We should not have gaps of 10 seconds or more between captions.

WHY: Many Video on Demand (streaming) platforms require atmospherics in these scenarios.

 

Music Atmospherics

  • Appropriate atmospherics should be used for music and for lyrical vocalization.
  • For extended periods of unchanging music or lyrical vocalization, an atmospheric should be used every 8 to 10 seconds.
  • If a project is a music video with indistinguishable lyrics for the duration of the project, an appropriate atmospheric should be used every 8 to 10 seconds.
    • Use continues in the atmospheric to indicate that the music or sound is continuing
      • (upbeat music continues)
      • (singer continues vocalizing)

 

Formatting Premium Captions

Carets
RULE: When added text appears anywhere in the lower ⅓ of the video, use an up caret ^ to move captions to the top of the screen, with a few exceptions.

WHY: Captions should not cover important text that exists within a video.

 

DO USE an up caret when there is both:

  • Added text that appears in the lower third of the screen that is intended to be readable

AND

  • No text in the upper third at the start time of the caption group

 

Examples of added text include:

  • Names / Titles
  • Websites / URLs / hashtags
  • Opening credits
  • Scoreboard
  • News tickers
  • Existing subtitles / captions

 

DON'T USE an up caret for:

  • Text that is native to the video recording and was not added in later, such as:
    • Game interface
    • Software interface
    • Presentation slides
  • Video property text, such as:
    • Production timecodes
    • Logos (or functioning as a logo)
  • Graphics / images
  • Text that is on-screen for the full duration of the video

 

*If there is also text in the upper third at the start time of the caption group, do not use an up caret.  

 

NOTE: Anything that falls under the Don’t Use list would also not count as text in the upper third. That means that if there is qualifying text in the lower third, the items in this list would not count as “upper third text” when deciding if a caret is needed for the lower third text.
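
The caret rules above can be read as a small decision procedure. The sketch below is one possible reading, not an official tool: the `OnScreenText` type, the region and kind labels, and the function name are all hypothetical, and judging whether text is "intended to be readable" still requires a human.

```python
from dataclasses import dataclass

# Kinds taken from the "DON'T USE" list. They never trigger a caret and
# never count as upper-third text. The label names are hypothetical.
EXEMPT_KINDS = {"native", "video_property", "graphic", "full_duration"}


@dataclass
class OnScreenText:
    region: str  # "upper", "middle", or "lower" third of the frame
    kind: str    # "added" (names, URLs, credits, ...) or an exempt kind


def needs_up_caret(texts):
    """Decide whether a caption group should use an up caret (^).

    `texts` lists what is visible at the caption group's start time.
    """
    qualifying = [t for t in texts if t.kind not in EXEMPT_KINDS]
    in_lower = any(t.region == "lower" for t in qualifying)
    in_upper = any(t.region == "upper" for t in qualifying)
    # Caret only when qualifying lower-third text exists and the top is clear.
    return in_lower and not in_upper
```

For example, a speaker name in the lower third with a logo in the upper third still calls for a caret, because the logo is on the exempt list: `needs_up_caret([OnScreenText("lower", "added"), OnScreenText("upper", "video_property")])` returns `True`.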

 

Syncing Premium Captions 

Caption Timing 

RULE: Sync each caption group (atmospherics and speech) so it appears on-screen when the audio begins.

WHY: The deaf or hard-of-hearing viewer should see the text on screen at the same time it would have been heard by anyone else.

 

1. The start time needs to align with the beginning of the sound. 

  • Aim for precision, but it’s ok for the start time to be up to a ½ second early or late from the true beginning of the sound.

2. Do not worry about the end time. 

  • Rev automatically calculates the end time for a caption group after you submit the project (post-processing).
  • NEVER add extra spaces to a caption group OR double up the captions in an attempt to adjust the amount of time the caption group is on-screen. This causes errors in the file format for customers.

3. Keep in mind the readability of the caption groups.

  • Split the caption groups if a speaker is talking very slowly (more than 5 seconds to say a sentence) or there is a long pause. This maintains proper timing with the speech and ensures the caption group does not end too early.
  • Use advanced caption format if multiple speakers are talking very quickly. This combines two quick phrases into the same caption group for readability.
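
The two numeric thresholds in the timing rules above (a ½-second start tolerance and the 5-second slow-sentence guideline) can be expressed as simple checks. This is a minimal sketch with hypothetical names; it does not cover long pauses, which still need judgment while listening.

```python
START_TOLERANCE = 0.5  # seconds the start time may be early or late
SLOW_SENTENCE = 5.0    # seconds; a sentence longer than this is "very slow"


def start_is_aligned(caption_start, sound_start, tolerance=START_TOLERANCE):
    """True if the caption start is within the tolerance of the sound start."""
    return abs(caption_start - sound_start) <= tolerance


def should_split(sentence_start, sentence_end, threshold=SLOW_SENTENCE):
    """True if a sentence takes longer than the threshold to say."""
    return (sentence_end - sentence_start) > threshold
```

For instance, `start_is_aligned(12.3, 12.0)` passes, while a caption that starts 0.75 seconds early does not.
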
Subtitles Style Guide

This guide explains Rev’s expectations for subtitling quality. We trust you to deliver high-quality work. Customers rely on your accurate and timely translations as a crucial part of their daily work. The goal of subtitles is to help non-English-speaking viewers understand the content of a video.

 

Browser Compatibility
Rev recommends that you use the most up-to-date version of Google Chrome when working in the Atlas editor.

 

Capturing Subtitles Content

Introduction and Expectations
As a subtitler, you have 5 main responsibilities:

  • Listen to the audio/video and translate all required video content, so that the meaning of the video is understood in the new target language.
    • Subtitling a video is more than just translating what is said; it also includes translating on-screen text and ensuring the timing of the subtitles matches the spoken audio. Another way to think of this is, if the video were in a language that you don’t speak, do the subtitles provided allow you to easily follow along and understand what is happening?
  • Submitted translation should be based on the language spoken in the audio/video.
    • Provided English captions or other resources should be used as a tool to assist with accuracy. Do not overly rely on the accuracy of these outside resources.
  • Format the subtitles so they meet national subtitle formatting standards.
  • Adjust subtitle group timing for best possible accuracy.
  • Complete all correction requests in a timely manner for a project to be completed.

 

Note: Incomplete submissions, unedited/poorly edited machine translation, failure to review the audio in full, submissions not in the target language, and/or submissions that do not match the audio are subject to pay removal and may result in immediate account closure.

 

Unworkable Projects
Certain types of projects are considered “unworkable” and should not be completed. If you submit a project that is classed as unworkable, you will not be paid for the job and the project will be scored as not customer-ready.

 

IF... THEN...

  • IF: There is no spoken or sung English speech
    • But there is foreign language (non-English) speech
      • THEN: Project is unworkable. Unclaim the project as “No English audio present”.
    • But there are unspoken English captions shown on-screen
      • THEN: Project is unworkable. Unclaim the project as “No English audio present”.
    • But there are valuable sounds present along with foreign language
      • THEN: Project is unworkable. Unclaim the project as “No English audio present”.
  • IF: Project is entirely silent
    • THEN: Project is unworkable. Unclaim the project as “No English audio present”.
  • IF: There is no spoken dialogue (English or foreign language) but sounds are present
    • THEN: Project is workable if the sounds/atmospherics qualify as an atmospherics-only project.

 

 

Content
Translating into subtitles is different from other translation work because most of the time it is not a word-for-word translation.

  • Always keep in mind that the viewer cannot understand the English video, and the subtitles should capture the video meaning.
  • This means that idioms and other figurative speech should NOT be translated literally.
  • You are expected to research proper nouns and industry-specific terminology for representation and proper spelling. Watch for on-screen cues, and use publicly available web searches to research.

Subtitles should capture ALL of the following content:

  1. Spoken English Content
  2. Existing On-screen Text (OST)
  3. Atmospherics

 

You are expected to produce high quality and accurate customer-ready subtitles. 

Ensure you do your own proofreading and editing before you submit a finished project.

NOTE: Incomplete submissions, unedited/poorly edited machine translation, failure to review the audio in full, submissions not in the target language, and/or submissions that do not match the audio are subject to pay removal and may result in immediate account closure.

 

Atmospherics
Subtitles need to indicate sounds heard on screen, including non-English content. We call these identifiers atmospherics.

  • In some instances, there may already be English atmospherics present which then require the appropriate translation. In instances where there is no English atmospheric present for a sound, you are responsible for creating appropriate atmospherics for the sounds heard.

How to create atmospherics:   

Do:

  • Use parentheses ( ) and lowercase unless a proper noun is used
  • Describe the sound or sounds heard on screen by following this convention:
    • noun + descriptor/verb in present tense form
      • (water boiling), (door slams)
    • The noun lets viewers know who or what is making the sound, while the descriptor/verb lets them know what the sound is
  • Always use present tense, e.g. (Erin coughs)

Don’t:

  • Don’t use a dash or speaker label in a caption group containing only atmospherics
  • Don’t use onomatopoeia, e.g. (ribbit ribbit). Instead, describe what’s creating the sound, e.g. (frog croaking)

Tip: Parentheses are only permitted for atmospherics, and cannot be used for anything other than atmospherics when translating in Atlas.

 

Existing On-Screen Text
If text appears on screen that has not been spoken/translated in the subtitles and it carries significant meaning, it must be translated. 

 

If both speech and an atmospheric are present, the on-screen text (OST) takes priority over the atmospheric and should replace it. If the on-screen text is too lengthy, some paraphrasing may be required.

 

Some exemptions to this requirement apply. Some examples include:

  • The spoken content already matches the information in the OST
  • There is no room to add a new subtitle group, such as at the end of the video
  • A software presentation, such as a slideshow, where the content is lengthy and doesn’t require translation

 

To add existing on-screen text:

  1. Add a new line within the existing subtitle group
  2. Translate the on-screen text
  3. Adjust the timing of the subtitle group to appear at the same time as the on-screen text

 

Subtitles Formatting

Speaker Identification

Indicate speaker changes by using a dash and a space at the beginning of each speaker’s dialogue. This includes the first speaker.

When the speaker cannot be obviously identified, include a bracketed identifier after the dash and space, also called a speaker ID. Always use appropriate language for speaker IDs.

 

When the speaker CAN be visually identified:

Use a dash and space, or speaker label, at the beginning of the speaker’s dialogue.

caption_speaker_label.jpg

 

When the speaker CANNOT be visually identified:

Use a dash, space, and bracketed identifier, or speaker ID, at the beginning of the speaker’s dialogue.

caption_speaker_id.jpg

 

EXCEPTION: For audio-only projects, only a dash and space are needed for speaker changes.

 

Tip: Proper nouns are never translated, but IDs such as roles (Narrator, Instructor) should be translated.

 

Lyrics and Music

Lyrics

Lyrics should be translated when present and when they do not have overlapping dialogue.

  • To signify musical lyrics or singing, use a music note at the beginning of each subtitle group.
  • When the lyrics end, add a music note to the end of the last lyric subtitle group. 

 

Add a music note by typing ## and a space 

su_lyrics.png


Music Atmospherics

When a project also contains spoken words, include a background music atmospheric if there’s a significant time gap and including it would benefit the viewer.

 

  • A common format is a descriptor followed by the word “music” in the target language. You can indicate the progression of music with words like begins and continues in the target language. E.g. (orchestral music begins)
  • Introductory music is a common use case. E.g. (bells chiming)

 

Subtitle Group Length
A subtitle group is the unit of text that is shown on-screen at any one time. Individual subtitle groups must always contain 60 or fewer characters, as groups that fall within these character limits are easier to read quickly. This ensures that viewers do not miss the translated content.

If the subtitle group is too long, it will be highlighted in red. In these instances, the subtitle grouping must be adjusted/rearranged with the other surrounding groups prior to submitting a project.
 

Subtitle groups should be split so that whole phrases, nouns, sentences, and the flow of dialogue are interrupted as little as possible, while abiding by the 60-character limit.

  • Begin a new subtitle group whenever there is a speaker change. Two speakers are not permitted in one line.
  • If a speaker has a long pause, it’s likely you’ll need a new subtitle group to maintain proper timing with speech.
  • You cannot exceed 60 characters in a subtitle group. If the subtitle group is too long, it will be highlighted in red.
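
As an illustration of the 60-character rule above, the sketch below flags over-long subtitle groups before submission. It is a hypothetical helper, not how Atlas works: it assumes every character in the group's text counts toward the limit (Atlas may count differently), and the word-boundary split is only a first pass that a subtitler would still rearrange around whole phrases and speaker changes.

```python
import textwrap

MAX_CHARS = 60  # limit stated in this guide


def groups_over_limit(groups, limit=MAX_CHARS):
    """Return (index, length) for each subtitle group longer than the limit."""
    return [(i, len(text)) for i, text in enumerate(groups) if len(text) > limit]


def rough_split(text, limit=MAX_CHARS):
    """Split text at word boundaries into pieces no longer than the limit.

    A mechanical first pass only; it does not know where phrases begin or end.
    """
    return textwrap.wrap(text, width=limit)
```

Calling `groups_over_limit(groups)` on a list of subtitle texts returns only the groups that would be highlighted for length, with their character counts.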

 

Subtitle Timing

Subtitle groups that appear on-screen need to be timed so that the beginning and the end of each group match where the speaking begins and ends.

  • Often, you will be able to use the timing already included from the caption groups for the start times. End times will still need to be adjusted
  • Any time subtitle content is rearranged for length and grouping, or when adding content such as OST, start and ending times will require adjustment. 

Tip: Use Advanced Caption Formatting (ACF) for simultaneous content to enable proper subtitle timing.

 

Moving Subtitles Up
Videos sometimes have pre-existing text in the lower ⅓ of the screen. If the subtitles could overlap with that text at all, even for only a split second, we need to move the subtitles up. We do this by inserting an up-arrow caret (^). There are some exceptions when there is also qualifying text in other sections of the screen.

 

Existing caption groups will only have carets present if they were for a premium-service caption project. Other caption projects will not have carets added. You are expected to ensure all carets are added as required in subtitle groups.

 

The example below demonstrates the following:

  • The caret symbol in the subtitle group (^)
  • How the subtitle appears at the top of the video screen after inserting the caret for the qualifying text in the bottom third.

su_caret_example.png

 

Special Instructions
Some projects may have Special Instructions attached. These instructions are located in Atlas in the yellow box, per the example shown here.

SU_SI.png


There may also be a link to a Help Center Article. These links provide clarifying information along with additional information/instructions, or an added glossary.  

NOTE: Customers are expecting these approved instructions to be followed. Failing to do so will result in a project being assessed as not client-ready.

 

Subtitles Review & Project Resources

Watch the video
After you complete the translation, watch the video to double-check the timing and translations. This will help to minimize errors.

 

Project Resources
Always check the Project Details page to see if the customer has added a resource such as a script or glossary terms/speaker labels. A file may have to be redone if you ignore these pieces of information.
