Oh Caption, My Caption!!

I have developed a lot more grace for caption writers in the last year. Imagine combining Tetris with a Rubik’s Cube. There’s more to captions than just typing words to put on a screen. The “Oh Captain, My Captain” in the title references the film Dead Poets Society, which in large part is about looking at things from a different perspective and seeing what others may gloss over in the routines of life. Welcome to a look behind the curtain of captioning…

Captions need to start with the sound, usually a voice, and should match word for word. Missing either of these tends to irritate those of us who can both hear and read.

But they also need to move at a readable pace, especially without audio support (since limited audio is the whole point of having captions). For children’s TV, this means 120-130 words per minute (wpm) for educational programming and 120-150 wpm for general material. Best practice for adult programming is captions around 150-200 wpm. Industry experts say captions should never exceed 225 wpm.
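For anyone who likes to see the arithmetic, the pacing rule boils down to words divided by time. Here is a minimal Python sketch of that check. The function names are made up for illustration, "words" are simply whitespace-separated chunks (an assumption), and the 225 wpm default is the ceiling quoted above.

```python
def words_per_minute(text: str, duration_seconds: float) -> float:
    """Reading rate a viewer needs to finish `text` in `duration_seconds`."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    # Assumption: a "word" is any run of non-space characters.
    return len(text.split()) * 60.0 / duration_seconds


def min_duration_seconds(text: str, max_wpm: float = 225.0) -> float:
    """Shortest time `text` can stay on screen without exceeding `max_wpm`."""
    return len(text.split()) * 60.0 / max_wpm


# Four words squeezed into one second is already past the 225 wpm ceiling.
rate = words_per_minute("Ready? Yes, let's go.", 1.0)
print(rate, rate > 225)
```

Run against a real caption file, a check like this would flag the captions that need more screen time or fewer words.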

Do you realize we generally talk at 250 wpm? So, normal monologue, or worse, normal dialogue, is flying by faster than we could read it if it were written out in text. If everything we say on a daily basis had to be captioned faithfully, we’d need to adopt a southern drawl. This would drive New Yorkers and auction aficionados completely nuts, and quickly too.

Now, how many of us have had a lot to say and struggled to fit it into Twitter’s character limit? Tough to do, especially if you choose to use full and familiar words, and the words that best express the idea being conveyed. In captions, spaces and punctuation count as characters, and the limit is two lines of 32 characters each. Any word, even a short one, that crosses that 32-character boundary is moved down to the next line, taking up valuable space there.
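The two-lines-of-32 rule is easy to play with in code. The sketch below uses Python's standard `textwrap` module, which bumps a word to the next line when it would cross the width limit, then groups the lines in pairs. The constant and function names are my own, and real captioning tools surely make smarter choices about where to break; treat this as an illustration of the constraint only.

```python
import textwrap

MAX_CHARS_PER_LINE = 32      # line width described above
MAX_LINES_PER_CAPTION = 2    # lines per caption described above


def split_into_captions(text: str) -> list[list[str]]:
    """Wrap `text` at 32 characters and group the lines two per caption."""
    # textwrap counts spaces and punctuation as characters, like captions do.
    lines = textwrap.wrap(text, width=MAX_CHARS_PER_LINE)
    return [
        lines[i:i + MAX_LINES_PER_CAPTION]
        for i in range(0, len(lines), MAX_LINES_PER_CAPTION)
    ]


sentence = ("Captions need to start with the sound, usually a voice, "
            "and should match word for word.")
for caption in split_into_captions(sentence):
    print(caption)
```

Notice how "sound," gets pushed to the second line even though the first line still has a free character: the whole word has to fit, or none of it does.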

And then there are the two aspects of screen time. Each caption should remain on the screen for at least a full second. Any less, and a blink could mean the loss of useful information. But what if the action on screen is two or three people in a discussion with short answers? Identifying the speaker will take up valuable characters; this can be managed by moving the text to the left or right based on where the speaker is on screen. But only if you’re not trying to put the words of two people in one caption…which has to happen if they’re saying “Ready?” “Yes, let’s go.” within the space of one second (because a third person says “Me too!” before the camera cuts to the next scene).

Even Tetris masters are starting to squirm, but we’re not done yet.

There’s a hard boundary of 60 data bytes per second (bps). According to current standards, the computers that process captions can’t handle more than 60 bps. So, when the caption writer is content and sends the whole timed and carefully fitted package to the final process, the software will start adjusting start times because that 60 bps limit has been exceeded. Sometimes, that means adjusting the time by a full second, which could shift the three-way exchange described above to start as the camera shows actors walking out the door. Irritating whether you can hear the audio or not.
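To get a feel for why start times slip, here is a toy Python model of that 60 bytes-per-second budget. It is a guess at the general idea, not how any actual captioning software works: it assumes one byte per character, ignores control codes, and assumes a caption cannot appear until all of its bytes have gone down a single shared channel. The names are hypothetical.

```python
BYTES_PER_SECOND = 60  # the limit quoted above


def adjusted_start_times(captions: list[tuple[float, str]]) -> list[float]:
    """Return the earliest start time each caption can actually get.

    `captions` is a list of (requested_start_seconds, text), sorted by time.
    """
    channel_free_at = 0.0  # when the previous caption finished sending
    actual_starts = []
    for requested_start, text in captions:
        # Assumption: one byte per character, no overhead.
        send_time = len(text) / BYTES_PER_SECOND
        # A caption appears once its data has arrived, never before requested.
        actual = max(requested_start, channel_free_at + send_time)
        actual_starts.append(actual)
        channel_free_at = actual
    return actual_starts


# Two full 60-character captions requested half a second apart:
# the second one cannot make its slot and slides later.
print(adjusted_start_times([(10.0, "x" * 60), (10.5, "y" * 60)]))
```

Even in this simplified model, a run of dense captions pushes each later one further behind its intended moment.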

You don’t even want to read a paragraph about how special characters, like the accented characters used in Spanish text, count as 2 data bytes, or the joys of Google Translate, or the fact that Spanish words get incredibly high character counts. (I tried to find a long way to say the word long – nosuchluck.) Just take all I said above and double it. 🙂
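If you do want a taste of the byte math, a rough Python sketch follows. It assumes plain ASCII characters cost 1 byte and everything else costs 2, which is my simplification of the rule described above rather than a faithful copy of any caption encoding. The function name is hypothetical.

```python
import unicodedata


def caption_byte_cost(text: str) -> int:
    """Estimate the data bytes for `text`: 1 per ASCII char, 2 otherwise."""
    # Normalize so an accented letter is one character, not letter + accent mark.
    normalized = unicodedata.normalize("NFC", text)
    return sum(1 if ch.isascii() else 2 for ch in normalized)


for word in ["manana", "ma\u00f1ana"]:
    print(word, len(word), caption_byte_cost(word))
```

Same six letters on screen, but the accented version eats more of the per-second budget.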

I’m calling it done on the DWW-5 captions; they’re firmly in the hands of our partners in Iowa. My inner perfectionist is having a fit because it’s not perfect yet, but the other side of me is painfully aware of my status as the bottleneck. The desire and enthusiasm are out there for the release of the final season of Dr. Wonder’s Workshop. The workload of new initiatives and revival of projects needing a few small adjustments is piling up. It’s past time to be done. Someone will ask, “So, what took so long?” This article is my answer.

Hats off to the professional caption writers who make all these decisions at the pace of current video offerings. I’ll forever be a lot less critical, although I’ll likely still insist on accuracy in spelling.