PEEP How To’s

An image of Peep and Chirp sitting down facing each other with their feet touching.

Video Descriptions

An image of a cartoon rodent with the numbers 1, 2, and 3 above its head.

Creating descriptions for any video typically breaks down into three main activities:

  1. Authoring the descriptions
  2. Recording the descriptions
  3. Adding the description audio to the video

Resources are available online to guide authors through the process of writing descriptions. These include the Described and Captioned Media Program’s Description Key.

Recording can be an expensive, professional-level process or it can be an inexpensive do-it-yourself project. In the end, you need to have an audio track of sufficient quality and in a format that is compatible with the target media player.  Steps for recording your own descriptions are outlined below.

  • Record audio using a USB microphone plugged directly into a computer. There are many low-cost, good-quality USB microphones available, such as those from Blue Designs.  Avoid using the built-in microphone of a laptop or smartphone since the audio quality will be inferior.
  • Audio can be recorded with any number of free or low-cost software applications, such as Audacity, that will save your description track as a .wav or .mp3 file (the most common formats for recording and playing audio).
    • For the highest quality, record the descriptions in a quiet room that is echo-free.  We recorded our descriptions in a typical office which did not have special sound baffling or other accommodations.
  • Read through the script several times to become comfortable with the language, and to warm up your voice so that you can speak clearly.
    • We recorded two versions of each description and selected the best versions during the editing process.

The technical production is not especially difficult, but for the newcomer, it may take some time to become comfortable with audio and video editing tools. As always, YouTube is a helpful resource for how-to videos.

  • Both the original soundtrack and the description track need to be opened in an audio editor.
    • We used Adobe’s Audition to export the original video’s soundtrack as a .wav file.
  • The volume level of the soundtrack should be dipped at the points when descriptions are spoken. Also, the volume levels should be adjusted so that the description audio is clear to the listener but also not so loud that it is jarring.
    • We opened both the soundtrack and the description audio in Audacity, then adjusted the audio level of each track and made any edits to the description that were needed. We prefer Audacity because it is a free, straightforward audio-processing tool. Audio editing could also be done in more sophisticated and powerful audio editors such as Adobe’s Audition, Pro Tools, or GarageBand.
  • After the editing is complete, the dipped soundtrack and the description track need to be combined into a single audio track, known as the mixed track.
    • Again, we used Audacity to mix the two tracks together, then we exported a .wav file.
  • Now, the new mixed track must be married to the video. This must be done with video-editing software. First, we deleted the original (un-dipped) soundtrack, and then we replaced it with the mixed track.  We used Adobe Premiere Pro as our video editor, but many other video-editing tools could be used, including iMovie.
    • For newcomers to video-editing tools, this process can be a little tricky. For example, it often requires several steps to remove an audio track permanently (i.e., simply muting the track is not enough).
  • Finally, the video and its new mixed audio track must be exported into a format that is compatible with the target media player. Again, we used Adobe Premiere Pro to export the final described video as an .mp4 file.
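
To make the dipping and mixing steps above concrete, here is a minimal sketch of what an audio editor does to the samples. It is an illustration only, not how Audacity or Audition is implemented. The function name, the region structure, and the gain value are all hypothetical, and the sketch assumes both tracks use the same sample rate with sample values between -1 and 1. Real editors also fade the volume in and out rather than switching it abruptly.

```typescript
// Hypothetical helper: none of these names come from the tools mentioned above.
interface DescriptionRegion {
  startSample: number; // first sample where a description is spoken
  endSample: number;   // one past the last spoken sample
}

function duckAndMix(
  soundtrack: number[],
  description: number[],
  regions: DescriptionRegion[],
  duckGain: number // e.g. 0.5 halves the soundtrack level during a description
): number[] {
  const mixed: number[] = [];
  for (let i = 0; i < soundtrack.length; i++) {
    // Dip the soundtrack only while a description is being spoken.
    let gain = 1;
    for (let r = 0; r < regions.length; r++) {
      if (i >= regions[r].startSample && i < regions[r].endSample) {
        gain = duckGain;
        break;
      }
    }
    // Treat the description track as silent once it runs out of samples.
    const descriptionSample = i < description.length ? description[i] : 0;
    // Sum the two tracks and clamp so the result never exceeds full scale.
    const sum = soundtrack[i] * gain + descriptionSample;
    mixed.push(Math.max(-1, Math.min(1, sum)));
  }
  return mixed;
}
```

The returned array corresponds to the mixed track that is later married to the video.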

An image of Chirp having water dumped on him.

NOTE: The finished video will have a single, mixed audio track: the combination of the dipped soundtrack and the description audio. Users will not be able to turn off these open descriptions. Currently, descriptions that can be turned on and off (called closed descriptions) are not supported by any video-hosting platform (YouTube, Vimeo, etc.) or by any browser-based media player. The only way to accommodate closed descriptions is to build a video player that provides a button or a menu option for choosing between the described and undescribed audio tracks. More commonly, video providers offer two separate links to the same video: one link to the described video and one link to the non-described video.
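
As a rough sketch of the custom-player idea in the note above, the function below switches a player between a described and an undescribed version of the same video while keeping the viewer's place. Everything here is an assumption made for illustration: the function and interface names are hypothetical, the URLs are expected to be absolute, and the two versions are assumed to have identical timing. A production player may also need to wait for the new file's metadata to load before seeking.

```typescript
// Minimal shape shared by an HTML video element and a simple test stub.
interface PlayerLike {
  src: string;
  currentTime: number; // playback position in seconds
}

interface VideoVersions {
  described: string;   // URL of the video with open descriptions mixed in
  undescribed: string; // URL of the original video
}

// Swap the player to the requested version without losing the viewer's place.
function setDescriptions(
  player: PlayerLike,
  versions: VideoVersions,
  wantDescriptions: boolean
): void {
  const target = wantDescriptions ? versions.described : versions.undescribed;
  if (player.src === target) {
    return; // already showing the requested version
  }
  const position = player.currentTime;
  player.src = target;
  // Changing the source restarts playback, so restore the saved position.
  player.currentTime = position;
}
```

A button or menu option in the player would call `setDescriptions` with `true` or `false` depending on the user's choice.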

Video Captions

An image of Chirp and a cartoon beaver dancing with their eyes closed.

Creating captions for a digital video typically breaks down into two main activities:

  1. Authoring the captions
  2. Linking the captions to the video

Over the past few years, creating and displaying captions has become far easier and less technically challenging. Commonly used video hosting platforms including YouTube, Vimeo, and Facebook provide native support for closed captions, as do popular Web browsers (such as IE, Firefox, Safari and Chrome).

The first step in the captioning process is to create the caption file. A caption file is, essentially, a transcript of the video that includes timing and styling information that enables the captions to appear when and where you want them to.

  1. Authoring the captions

Using a service provider:

Numerous agencies will caption your videos for a fee. These include WGBH’s Media Access Group, 3Play Media, and many others.

Do-it-yourself captions:

There are several free caption-editing tools available online, including NCAM’s CADET and YouTube’s caption-editing tool. Caption editors are also available for purchase; these vary in price from approximately $100 to several thousand dollars for professional-level caption-authoring tools.

Regardless of the caption tool, the process of authoring the caption file is similar. The author either imports an existing transcript of the video or types the transcript into the caption software from scratch. YouTube offers automated speech-to-text transcription, which can be fairly accurate for videos that feature a single speaker who speaks slowly and has no discernible accent, but the transcript will lack proper capitalization and punctuation. With more complicated audio tracks, such as those with low audio levels or speakers who mumble or have accents, the automated transcription usually contains so many errors that one is better off typing the transcript from scratch rather than cleaning up the transcript created by YouTube.

An image of two cartoon rabbits standing next to each other looking to the left.

The caption author must also break up the transcript into readable chunks (the actual blocks of captions that will be displayed) and identify the timing for the captions to be displayed. Most caption-authoring tools make this process straightforward and easy. Additionally, some caption authors may choose to position their captions on the screen (left or right, higher or lower on the screen, etc.) or present captions in various fonts and adjust the color and transparency of captions and caption backgrounds. While many such options exist, it’s important to note that in most viewing situations, users will have the final decision on where captions will appear and how they will look, often overriding the choices of the caption author. The content and timing of the captions cannot be adjusted by the user, however.
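
The chunking step described above can be sketched in a few lines. This is only an illustration of the idea, with a hypothetical function name: it splits a transcript between words so that no block exceeds a chosen character limit. The limit itself is an assumption, and caption authors normally also break at natural phrase boundaries, which this sketch does not attempt.

```typescript
// Break a transcript into caption blocks of at most maxChars characters,
// splitting only between words. A single word longer than maxChars is
// kept whole in its own block.
function chunkTranscript(transcript: string, maxChars: number): string[] {
  const words = transcript.split(/\s+/);
  const blocks: string[] = [];
  let current = "";
  for (let i = 0; i < words.length; i++) {
    const word = words[i];
    if (word === "") {
      continue; // skip empty strings produced by leading or trailing spaces
    }
    if (current === "") {
      current = word;
    } else if (current.length + 1 + word.length <= maxChars) {
      current += " " + word;
    } else {
      // The next word would overflow the block, so start a new one.
      blocks.push(current);
      current = word;
    }
  }
  if (current !== "") {
    blocks.push(current);
  }
  return blocks;
}
```

Each returned block still needs start and end times, which the caption author assigns in the caption tool.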

Once the caption file is created, it will need to be saved in the format appropriate for the target player. For example, Facebook requires SRT caption files, while Web browsers natively support WebVTT through the track element, and many browser-based media players also accept SRT and TTML. NCAM’s CADET caption-authoring software exports these and other caption formats.
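
To show what a caption file contains, here is a minimal sketch that turns timed caption blocks into WebVTT text. A WebVTT file begins with the line WEBVTT, and each caption lists its start and end times in hh:mm:ss.ttt form separated by an arrow, followed by the caption text. The interface and function names are hypothetical, and the sketch omits positioning and styling settings that caption tools such as CADET can add.

```typescript
interface Cue {
  start: number; // seconds from the beginning of the video
  end: number;   // seconds from the beginning of the video
  text: string;
}

// Left-pad a number with zeros to a fixed width.
function pad(value: number, width: number): string {
  let s = String(value);
  while (s.length < width) {
    s = "0" + s;
  }
  return s;
}

// Format seconds as a WebVTT timestamp: hh:mm:ss.ttt
function formatTimestamp(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const totalSeconds = Math.floor(totalMs / 1000);
  const s = totalSeconds % 60;
  const m = Math.floor(totalSeconds / 60) % 60;
  const h = Math.floor(totalSeconds / 3600);
  return pad(h, 2) + ":" + pad(m, 2) + ":" + pad(s, 2) + "." + pad(ms, 3);
}

// Build the text of a WebVTT file: a header, then one block per caption,
// with a blank line between blocks.
function toWebVTT(cues: Cue[]): string {
  const blocks: string[] = ["WEBVTT"];
  for (let i = 0; i < cues.length; i++) {
    const cue = cues[i];
    blocks.push(
      formatTimestamp(cue.start) + " --> " + formatTimestamp(cue.end) + "\n" + cue.text
    );
  }
  return blocks.join("\n\n") + "\n";
}
```

Saving the returned string with a .vtt extension produces a file that a browser can load.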

  2. Linking the captions to the video

If the video is going to be played from a video-hosting service, such as YouTube or Vimeo, instructions for uploading caption files will be provided by the host, typically during the video upload process. Captions can be added at the same time that the video file is uploaded, or they can be added later.

If the video will be played using a Web browser, the caption file must be linked to the video in the HTML created by the author of the Web site.  This is accomplished through the use of the track element.  The track element was created specifically for carrying and delivering text tracks, including but not limited to captions.  The track element is simply used as a child element of the video element, and it points to the caption file that corresponds to the video being delivered to the user.   For example, in the following code sample the media player will begin playing the WebVTT caption file (myvideo_captions.vtt) at the same time that it plays the video file (myvideo.mp4):

<video controls>
  <source src="myvideo.mp4" type="video/mp4" />
  <track kind="captions" src="myvideo_captions.vtt" srclang="en" label="Captions" default />
</video>

Add Accessibility to Games

An image of Quack singing.

Case Study: House Hunt HTML5 Game

The following accessibility features were built into the House Hunt game:

  • Captions
  • Keyboard Access



Captions

While a number of HTML-based video players come with the means to display captions synchronized to a video’s audio playback, synchronizing captions for HTML-based games typically requires developers to custom-build their own solutions. This is because the audio clips that are played during the course of a game are not linear. When an audio clip starts to play, something needs to trigger the corresponding caption to be displayed. To do this, the developer needs to do the following:

  • determine a format for storing the caption data (typically JSON)
  • write the captions
  • provide a text element for displaying captions
  • load the caption data
  • retrieve and display the corresponding caption when an audio clip is played
  • clear the caption display at the appropriate time
  • provide the means of hiding and showing the caption display
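
The steps above can be sketched as a small caption manager. This is a generic illustration under stated assumptions, not SpringRoll's captioning component: the class name, the JSON layout (captions keyed by audio clip ID, each with a text and a duration), and the method names are all hypothetical. Time is passed in explicitly so the game's own update loop can drive it.

```typescript
interface CaptionEntry {
  text: string;
  durationMs: number; // how long the caption should stay on screen
}

// Minimal shape of the text element; an HTML element satisfies it.
interface CaptionDisplay {
  textContent: string | null;
  hidden: boolean;
}

class GameCaptions {
  private captions: { [clipId: string]: CaptionEntry } = {};
  private display: CaptionDisplay;
  private clearAtMs = 0;

  constructor(display: CaptionDisplay) {
    this.display = display;
  }

  // Load caption data stored as JSON, keyed by audio clip ID.
  load(json: string): void {
    this.captions = JSON.parse(json);
  }

  // Call when an audio clip starts playing.
  onAudioStart(clipId: string, nowMs: number): void {
    const entry = this.captions[clipId];
    if (!entry) {
      return; // no caption was written for this clip
    }
    this.display.textContent = entry.text;
    this.clearAtMs = nowMs + entry.durationMs;
  }

  // Call from the game loop; clears the caption once its time is up.
  update(nowMs: number): void {
    if (this.display.textContent !== "" && nowMs >= this.clearAtMs) {
      this.display.textContent = "";
    }
  }

  // Let the player hide or show the caption display.
  setVisible(visible: boolean): void {
    this.display.hidden = !visible;
  }
}
```

In a game, `onAudioStart` would be called from whatever code plays an audio clip, and `update` from the per-frame update function.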

We were able to build the HTML version of the House Hunt game using SpringRoll, a game-building framework developed by PBS and CloudKid, which comes with a captioning component built into it. It also has a simple editing tool for writing captions, which are stored in a JSON file that accompanies the game. Unfortunately, SpringRoll currently clears the caption as soon as the audio finishes playing instead of waiting until the end of the caption’s duration, which is set in the caption JSON file. As a result, captions for short, single-word clips, such as ‘ant’, are not displayed long enough to be read.
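
One way to express the intended behavior is to keep a caption on screen until both the audio has finished and the caption's own duration has elapsed. The function below is a hypothetical sketch of that rule. It is not part of SpringRoll's API and does not describe how SpringRoll could be patched.

```typescript
// Hypothetical rule: the caption stays visible until the later of
// the end of the audio clip and the end of the caption's duration.
function captionClearTime(
  startMs: number,
  audioDurationMs: number,
  captionDurationMs: number
): number {
  return startMs + Math.max(audioDurationMs, captionDurationMs);
}
```

With this rule, a 400-millisecond clip such as "ant" with a 1500-millisecond caption duration would keep its caption visible for the full 1500 milliseconds.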

Keyboard Access

For keyboard users, games should allow the player to navigate using both tab and arrow keys. For games built using the HTML5 Canvas element, this also requires developers to build customized solutions. House Hunt was built using SpringRoll and CreateJS, frameworks that allow players to interact with the Canvas content using various input devices, such as the keyboard and mouse. However, when implementing keyboard access, it is up to the game’s designers and developers to manage the tab and keyboard interactions so that the tab order and navigation flow are logical to the player. This also means that the Canvas game captures the tab keyboard input, so a means of tabbing into and out of the game needs to be included in the programming.

Another option is to take the interactive elements out of the Canvas element and layer them on top as HTML buttons. This provides two benefits:

  • the interactive buttons appear in the tab flow of the full page (allowing navigation into and out of the game); and
  • the interactive buttons are accessible to screen-reader users (see Screen-Reader Access).

An image of a cartoon raccoon poking its head out from a hole in a tree.

Using SpringRoll, the House Hunt game itself is displayed in the Canvas element, but the buttons were built as HTML elements positioned on top. The normal tab order was maintained, but the navigation by arrow was customized to allow for navigation around the game element grid.
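
Arrow-key navigation around a grid of buttons can be sketched as a function that maps the current button and the key pressed to the button that should receive focus. This is an illustration only and not the House Hunt code: the function name is hypothetical, and the four-column layout used in the note after the code is an assumption. The key names are the standard values reported by keyboard events in browsers. Any other key, including Tab, is left alone so the normal tab order keeps working.

```typescript
// Move focus within a grid of buttons laid out row by row.
// Returns the index of the button that should receive focus.
function nextGridIndex(
  current: number,
  key: string,
  columns: number,
  total: number
): number {
  const column = current % columns;
  let next = current;
  if (key === "ArrowRight" && column < columns - 1) {
    next = current + 1;
  } else if (key === "ArrowLeft" && column > 0) {
    next = current - 1;
  } else if (key === "ArrowDown") {
    next = current + columns;
  } else if (key === "ArrowUp") {
    next = current - columns;
  }
  // Stay on the current button when the move would leave the grid.
  return next >= 0 && next < total ? next : current;
}
```

For 12 leaf buttons arranged in 4 columns, a keydown handler would call `nextGridIndex(focusedIndex, event.key, 4, 12)` and move focus to the returned button.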

Issues that were encountered that have not been resolved include the following:

  • Arrow key access is captured and trapped by the game.
  • When the buttons in the game are disabled – such as when the audio is being played – the tab navigation sometimes jumps around elements in an unpredictable manner.

Screen-Reader Access

Screen-reader access was not emphasized in the coding of the House Hunt game, so only partial screen-reader access is available.

Currently, screen-reader programs cannot identify content inside the HTML Canvas element. This means it is up to the developers of the game to provide other means of identifying elements of the game that are visual in nature. For example, in the House Hunt game, when two leaves are selected that do not match, Peep shakes his head and the leaves return to their original state. Unfortunately, without an audio reference, a screen-reader user does not realize this sequence of events has occurred.

In addition to providing audio cues for the action that is occurring in the game, the changes in states of interactive objects need to be defined in their labels. In the House Hunt game, all of the leaf buttons have labels which include the order they appear: “leaf 1 of 12”, “leaf 2 of 12”, etc. When a leaf is selected, its state is changed to display an animal or a house. The label on the button is also changed to reflect the new state (in this case, the item being displayed, e.g., “ant” or “hollow tree”).
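
The labeling rule described above can be sketched as a small function. The function name and the use of null for a face-down leaf are assumptions made for illustration. How the label is attached to the button (for example, through an aria-label attribute) is also an assumption and is not specified by the game.

```typescript
// Build the accessible label for a leaf button. `revealedItem` is the
// animal or house currently shown, or null while the leaf is face down.
function leafLabel(
  position: number,
  total: number,
  revealedItem: string | null
): string {
  if (revealedItem !== null) {
    return revealedItem; // e.g. "ant" or "hollow tree"
  }
  return "leaf " + position + " of " + total;
}
```

Whenever a leaf changes state, the game would recompute the label and apply it to the button so that a screen reader announces the new state.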


Issues that were encountered that have not been resolved include the following:

  • When the second leaf is revealed and a match is not made, the leaves go back to their original state too quickly for screen-reader users to identify the second object.
  • When leaves are selected, the game audio that is played is difficult to hear because the screen reader is speaking.
  • At this time, screen-reader audio descriptions of other animations occurring during the course of the game are not included.

Accessible Documents

An image of a cartoon snail.

An accessible document is one that can be read by everyone, including people using assistive technology. Digital documents such as Web pages, Word documents and PDFs can all be made to be accessible. This doesn’t require special accessibility tools or software but it’s not as easy as pressing a button labeled “Make This Accessible.” There is a lot of information about accessibility available on the Web, including from the software makers themselves. We’ve provided links to some of this information below.

While there is a wide-ranging list of criteria for making a digital document accessible, most of what makes a document accessible is structure. A well-structured document can, for example, enable a person who is blind and using special text-to-speech software (called a screen reader) to efficiently navigate a document’s headings, paragraphs, columns and tables, in addition to reading the text.

Well-structured documents are also beneficial to people who cannot use a mouse or who employ other assistive technology, such as eye-gaze equipment or a mouth-operated joystick.

In addition to using appropriate structural markup, content in images must be made accessible. This includes non-text elements that provide information such as pictures, diagrams, charts, graphs, images of text, and math equations. These can be made accessible by providing a text description of each image which a screen reader will read aloud in place of the image.
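
As a small illustration of checking for text descriptions, the sketch below lists the images in a document that have no description. The record structure and function name are hypothetical, and the sketch assumes that every image in the list conveys information. Purely decorative images are a separate case that this check does not handle.

```typescript
interface ImageRecord {
  source: string;         // file name or URL of the image
  altText: string | null; // text description, or null if none was provided
}

// Return the sources of images whose description is missing or blank.
function imagesMissingDescriptions(images: ImageRecord[]): string[] {
  const missing: string[] = [];
  for (let i = 0; i < images.length; i++) {
    const alt = images[i].altText;
    if (alt === null || alt.trim() === "") {
      missing.push(images[i].source);
    }
  }
  return missing;
}
```

A check like this only confirms that a description exists. Whether the description is accurate and useful still requires a human reviewer.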

Below are brief explanations of the ways in which three common types of digital documents (Web pages, Word and PDF) can be made accessible, and links for more specific instructions.

Note: Printed documents can be unreadable for people who are blind or have low vision. Although some people make use of magnifiers and other assistive technologies, these don’t work for everyone. Printed documents can also be translated into braille. However, not everyone who is blind or has low vision reads braille. Therefore, it’s important to always provide accessible versions of digital documents.

Accessible Websites

The Web Content Accessibility Guidelines 2.0 (WCAG 2.0) is an international standard for Web accessibility. WCAG 2.0 is a highly detailed recommendation, but the WCAG quick-reference guide provides a useful shortcut for understanding compliance levels and success criteria.

In addition, a small industry has developed to assist individuals and organizations in creating accessible Web sites. This includes everything from free how-to guides and courses, to accessibility-checking software, to Web-development companies and consulting firms dedicated to accessible Web design. A brief search will bring you to a variety of useful resources all along this spectrum.

Accessible Word Documents

An image of two cartoon beavers standing together.

The best way to ensure accessibility of Word documents is to use the formatting tools available within Word itself. For example, using the table-editing tool or the outline tool will create tables and outlines that are automatically formatted to be accessible. Use heading styles, list tools, and other formatting markup to take advantage of semantic structure rather than using visual styles to simulate structure (e.g., applying bold styles to make plain text look like heading text).

Accessible PDF

PDF is a popular format from Adobe that preserves the fonts, images, graphics and layout of nearly any source document. This ensures that users will see the finished document exactly as the author intended. In order for PDFs to be accessible, they need properly structured tags that reflect the correct reading order, along with appropriate alternative information where necessary (e.g., images must be marked with text alternatives). If a source document is created with this in mind, it can be successfully accessed using a screen reader or other assistive technology.

Screen-reader users are especially affected by the way in which the PDF is created, so it is crucial that PDFs be authored to be accessible by everyone. Providing proper structure in the source document (Word, InDesign, or other format) is a crucial first step toward having an accessible PDF. A PDF that hasn’t been made accessible can be difficult or even impossible for a blind or visually impaired user to navigate. A variety of tools and techniques exist to create accessible PDFs, but authors must be careful to structure the source document properly, export the PDF correctly, and review the final PDF using applications (such as Adobe Acrobat) that permit additional markup or structure to ensure that the document is as accessible as possible.

Using Word source documents:

Using InDesign source documents:

We want to hear from you!

Accessible Peep’s videos and games are universally designed with their accommodations available to everyone. Please use them with students of all abilities and learning styles and use the how-to guides to create your own accessible learning materials. Please let us know how it goes by sending us an email.

An Image of Peep, Chirp and Quack standing together and looking up.

Ideas that work.

The DIAGRAM Center is a Benetech initiative supported by the U.S. Department of Education, Office of Special Education Programs (Cooperative Agreement #H327B100001). Opinions expressed herein are those of the authors and do not necessarily represent the position of the U.S. Department of Education.


  Copyright 2019 | Benetech
