A young student wearing pink is working on a touchscreen making changes to a 3d printed object in a school tech class.

Multimodal STEM Documents

Volker Sorge
University of Birmingham & Progressive Accessibility Solutions

What are they?

Documents in STEM (science, technology, engineering, and mathematics) subjects are a specialized type of subject matter not only in terms of their topics and intended audience, but also in terms of their content and how it is presented. They generally use highly topical vernacular, mathematical formulas, diagrams, data visualizations, and so on. While any one of these features on its own poses a considerable accessibility problem, their combination makes the accessibility of science texts particularly challenging.

Access to STEM material therefore presents a major challenge for learners with visual impairments — and to some degree for readers with learning disabilities like dyslexia or dyscalculia — that goes far beyond the problems encountered in an ordinary document. Traditionally, documents were made accessible in a manual process, generally on an as-needed basis and often restricted to monographs important for teaching a particular subject. Texts would be translated into Braille, with formulas being set in specialized formats, and experts preparing tactile versions of diagrams. Alternatively, subject matter experts would make audio recordings of the material, pronouncing formulas unambiguously, and giving detailed explanations of diagrams and illustrations.

For electronic documents a plethora of different systems, techniques and workflows have been created to make different parts accessible separately, e.g., formulas, graphs, and charts, and to put them into audio and tactile formats. This process often relies on proprietary software or requires specialist hardware, each with their own learning curve both for those that prepare and those that consume the content.

In an age where most content moves to the web or EPUBs, and everyone can become their own content creator, these techniques rapidly become outdated. Multimodal STEM documents therefore refer to the idea of integrating all possible output options into single web-based documents that can be easily created by teachers and just as easily accessed by readers regardless of their special needs, ideally in a single environment. Moreover, the content has to be flexible enough to cater to most, if not all, disabilities and personal needs. Learners are thus not forced to become proficient in multiple software systems or to resort to disjointed teaching material.

Why are they important?

Since the beginning of this millennium we have seen massive changes in how we learn, teach and study the sciences. Teaching materials are moving more to online resources and learning management systems. Students no longer research a subject by going to the library and finding a book or reading a paper; instead, they search the internet for information and read a relevant online article.

Not only has the way we consume STEM content changed, but also the way it is produced. Teaching material can now be assembled and customized quickly and easily by anyone. Teachers, lecturers and professors prefer to use their own notes for teaching, multiplying the amount of material that needs to be made accessible on a daily basis. And even if material can be made accessible quickly, the nature of web documents means they can be easily changed and updated, rendering any previous effort obsolete.

And while legislation in many countries mandates the accessibility of teaching material, online or otherwise, at least in primary and secondary education, it rarely specifies what the quality standard should be. For example, it is accepted practice to make images accessible by providing descriptions as alternative text. While this can be adequate for a casual image, it is generally not sufficient for a scientific diagram.

Who is doing it already?

Many different products exist that cater to various aspects of STEM. (For example, here is a good overview of resources for tactile graphics.) However, there is very little support for a holistic approach to create accessible science texts.

A number of learning management systems aim to provide alternative formats like HTML-only or EPUB documents. For example, Blackboard Ally for LMS can generate alternative formats, with augmented accessibility features, from fairly inaccessible formats like PDF. However, for STEM-specific material LMSs still fall short of what is desired. Mathematical formulas are only rarely correct, and diagrams or graphics are still presented as images only.

There are a number of systems that can produce content in an information-rich format that can be embedded into a web document to allow improved accessibility. For example, the HighCharts library by HighSoft can generate information-rich formats that allow for simple interaction by tabbing with screen reading software and sonification. BrailleR is a program that allows the generation of statistical diagrams and data visualizations from the R software environment for statistical computing. Mathematics can be rendered multimodal using the MathJax library.

How can they be used in the classroom?

As STEM education is an important part of the curriculum, provision of good, accessible content is paramount. In special education, it is possible to cater for specific needs by making individual parts of content accessible, but to train children on specialized tools, in an inclusive classroom setting, is generally not feasible.

First, the support that is required to provide students with the necessary material and training is often not available. And second, forcing children with visual or learning impairments to use alternative content that is often slow to be created, or asking them to learn a number of specialized systems, puts them at a disadvantage compared to their peers. Consequently, providing STEM documents where students can access all content in the assistive technology environment they are used to without additional effort is an absolute necessity.

Obstacles to STEM Accessibility

STEM documents have a number of particularities that raise the barrier for their accessibility. Here is an overview of the particularly challenging components.

  • Highly Specialized Vernacular: Most scientific subjects come with their own particular language and use highly specialized terms that cannot be found in ordinary dictionaries or are taken from other languages, such as Latin. They can contain a mixture of scripts and punctuation symbols. Consider, for example, molecules from chemistry using standard nomenclature (IUPAC). Their names are often composed of combinations of letters, parentheses, comma-separated lists of numbers, sub- and superscripts, etc. More complex ones can even include Greek letters. Consequently, using ordinary assistive technologies like screen readers can often lead to mispronunciations or errors (e.g., notations like Greek letters are omitted) that are intolerable for scientific subjects, where precision of expression is often the key. Setting screen readers to read specific words more slowly or letter by letter can help to work around these problems. However, the obvious drawback is that readers have to spend considerably more time on the text and lose the reading flow, which is far from ideal.

  • Tables: Many sciences rely on presenting data in tabular form to convey information or back up experimental results. But unlike ordinary tables, where standard row-by-row or column-by-column reading is sufficient to comprehend the content, scientific tables often need to be viewed comparatively or holistically. For example, the distribution of zeros in a table can convey more meaning than the actual numerical values of all the other entries.

For readers with learning impairments, techniques such as highlighting can be helpful, but for readers with visual impairments who rely on extreme magnification or on screen readers, it is nearly impossible to get a picture of a table as a whole, and linear exploration will generally not reveal the information as intended by the author. One solution is to employ advanced screen-reading techniques such as cursor virtualization, which can help a reader to jump between different cells in a table. In addition, tables can be authored with appropriate Accessible Rich Internet Applications (ARIA) annotations to guide screen readers to a non-linear navigation.

  • Formulas: Mathematical, statistical and chemical formulas can be found across the majority of scientific texts. As math accessibility is a long-standing issue, there exists assistive technology specializing in mathematics (Soiffer, 2005; Cervone, Krautzberger, Sorge, 2016; Sorge, 2020) as well as some support for mathematics in general screen-reading technology (Freedom Scientific, 2018; Apple, 2018; Sorge, Chen, Raman, Tseng, 2014; Texthelp, 2018). However, the reading of complex formulas and the pronunciation of mathematical expressions can vary considerably across different subject areas or STEM disciplines. As a very simple example, consider the imaginary unit: it is normally represented by i, but in many engineering disciplines j is used, as i denotes current. Simply put, the more advanced or specialized a scientific text, the less likely it is that current screen-reading technology can handle its formulas correctly.

  • Diagrams: Graphical illustrations are an important means of conveying information in STEM subjects, and they are ubiquitous in teaching material. While good visualizations are commonly used to great effect for sighted readers, they are practically useless to a visually impaired and, in particular, a blind audience. Indeed, diagrams often not only complement the exposition in the text, but are used in lieu of an exposition, with the consequence that if one cannot read the diagram, one cannot understand the document.
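Returning to the tables item above, the idea of authoring a table with annotations that support non-linear reading can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_accessible_table`, the element id `table-summary`, and the sample data are hypothetical, and the markup is generated with Python's standard `xml.etree.ElementTree` module. The sketch marks header cells with `scope`, and links the table through `aria-describedby` to a short summary that states where the zero entries are, so that the overall pattern is available before any cell-by-cell reading.

```python
import xml.etree.ElementTree as ET


def build_accessible_table(caption, col_headers, rows):
    """Return an HTML fragment: a summary paragraph plus an annotated table.

    `rows` maps a row label to a list of numeric values, one per column.
    """
    wrapper = ET.Element("div")

    # Summarize the positions of zero entries so the overall pattern can be
    # announced without reading every cell in linear order.
    zeros = [
        f"{label}/{col_headers[i]}"
        for label, values in rows.items()
        for i, value in enumerate(values)
        if value == 0
    ]
    summary = ET.SubElement(wrapper, "p", {"id": "table-summary"})
    summary.text = (
        "Zero entries: " + ", ".join(zeros) if zeros else "No zero entries."
    )

    # aria-describedby points assistive technology at the summary paragraph.
    table = ET.SubElement(wrapper, "table", {"aria-describedby": "table-summary"})
    ET.SubElement(table, "caption").text = caption

    header_row = ET.SubElement(table, "tr")
    ET.SubElement(header_row, "td")  # empty corner cell
    for name in col_headers:
        ET.SubElement(header_row, "th", {"scope": "col"}).text = name

    for label, values in rows.items():
        tr = ET.SubElement(table, "tr")
        ET.SubElement(tr, "th", {"scope": "row"}).text = label
        for value in values:
            ET.SubElement(tr, "td").text = str(value)

    return ET.tostring(wrapper, encoding="unicode", method="html")


if __name__ == "__main__":
    print(build_accessible_table(
        "Reaction counts", ["Trial 1", "Trial 2"],
        {"Sample A": [0, 2], "Sample B": [3, 0]},
    ))
```

How well a given screen reader exploits such annotations varies, so a summary of this kind complements, rather than replaces, testing with the assistive technology readers actually use.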

Multimodal STEM Content

The goal of multimodal documents is to adapt their behavior depending on the reader’s needs and personal preferences. This is fairly easy to achieve in web documents as their two main markup languages, HTML5 for regular text and SVG (Scalable Vector Graphics) for graphics, allow embedding hidden information that can be exploited for accessibility purposes.
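As a rough illustration of what embedding hidden information in SVG can look like, the sketch below builds a small graphic whose `title` and `desc` elements carry a textual alternative, referenced from the root element through `role="img"` and `aria-labelledby`. The function name `labelled_svg_circle`, the ids, and the sample text are hypothetical; this is a minimal sketch using Python's standard library, not the markup used by the examples in this report.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def labelled_svg_circle(title, description):
    """Return SVG markup for a circle with an embedded title and description."""
    # Serialize SVG elements without a namespace prefix.
    ET.register_namespace("", SVG_NS)

    svg = ET.Element(f"{{{SVG_NS}}}svg", {
        "viewBox": "0 0 100 100",
        "role": "img",
        # Point assistive technology at the hidden title and description.
        "aria-labelledby": "fig-title fig-desc",
    })
    ET.SubElement(svg, f"{{{SVG_NS}}}title", {"id": "fig-title"}).text = title
    ET.SubElement(svg, f"{{{SVG_NS}}}desc", {"id": "fig-desc"}).text = description

    # The visible drawing: sighted readers see only this element.
    ET.SubElement(svg, f"{{{SVG_NS}}}circle", {
        "cx": "50", "cy": "50", "r": "40",
        "fill": "none", "stroke": "black",
    })
    return ET.tostring(svg, encoding="unicode")


if __name__ == "__main__":
    print(labelled_svg_circle(
        "Unit circle",
        "A circle of radius 1 centered at the origin.",
    ))
```

Because the description lives inside the graphic itself, it travels with the SVG wherever the file is embedded, unlike alternative text attached to a bitmap by the hosting page.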

The following examples show how complex content can be made accessible using auditory, visual and tactile modalities. We concentrate on a mixture of formulas and diagrams, and all examples are in SVG format with embedded content that provides the accessibility features.

Tactile Output

Converting content into tactile formats (e.g., braille notation) is a traditional way to provide access to documents and graphics for the visually impaired. In electronic documents, this format is commonly achieved via sending Braille to a connected tactile display. Many modern screen readers convert text into Braille output using either their own implementation or the Liblouis library. Some also enable the translation of mathematical formulas. Below is a formula rendered with MathJax, which can expose Nemeth Braille code to a connected tactile device.

Math Example with Braille
Screen Reader Instructions

Users might have to disable screen reader reading modes (e.g., “browse mode” in NVDA, [NVDA Key + Space], or “virtual cursor” in JAWS, [Insert + Z]) before being able to launch the MathJax explorer application and follow the instructions below to interact with the equation.

VoiceOver users should ignore the VoiceOver instructions on how to enter the group and instead just follow the instructions below to navigate the equation.

For activation and navigation press the following keys:

  • Enter or Return to activate the formula exploration when it is in focus,

  • Escape to leave exploration mode,

  • Down, Up to explore the next lower or higher level of the formula, respectively,

  • Right, Left to navigate horizontally by moving to the next or previous sub-expression on the current level, respectively.

For convenience, the Braille code that is sent to the tactile output device is displayed as a subtitle of the formula.

\[ x=\frac{-b\pm\sqrt{b^2-4ac}}{2a} \]

While a refreshable Braille display works well for text and formulas, it cannot handle diagrams. Thus, in the absence of reliable and affordable refreshable tactile displays for graphics, a complete solution for online, fully tactile STEM documents is not yet available.

Not every reader is comfortable with a refreshable Braille display, and therefore a natural next step is to turn documents automatically into tactile copies that contain text, formulas, and ideally graphics. Efforts of this nature are currently under way in the PreTeXt project. PreTeXt is another information-rich text-representation format specializing in mathematics that allows the easy generation of HTML, EPUB, and other formats. It allows embedding of formulas such as the one above as well as complex mathematical graphics.

Another effort to turn source texts into embossable Braille output is Duxbury Systems, which still requires considerable manual editing when including mathematical content, and it has no provisions for graphics.

However, all of these approaches include the presentation of diagrams, for which a number of mixed-modality solutions have been developed.

Screen Reading

Although it is generally assumed that all blind people read braille and understand graphics by feeling tactile replicas, this is not the case. Reliable statistics are not available, but estimates are that 10-15% of blind people read braille and only 2-3% are comfortable with tactile graphics. This number decreases for people who develop a visual impairment later in life. In STEM, the percentage is almost certainly larger, but it is still small.

For graphics like diagrams or charts, accessibility is further complicated by the fact that it is often difficult to convey tactilely all the information that is readily available visually: tactile resolution is considerably lower than image resolution, making it difficult to clearly separate features in crowded diagrams; colors can only be modeled to a limited extent by different textures before they become indistinguishable; and text in graphics cannot always be fitted as Braille and needs to be abbreviated or supplied in an extra key. All this often makes purely tactile graphics difficult and cumbersome to read.

Consequently, there are attempts to enable screen reading and interaction with graphics directly in browsers, similar to working with ordinary text and to some extent mathematics. We have already commented on the disadvantages bitmap graphics have, due to the limitations of alternative text. However, although SVG effectively offers all the technical specification that can enable effective presentation of graphical material to visually impaired readers, support for working with SVG in mainstream screen readers is still relatively poor. One reason is the late adoption of SVG as an official HTML5 standard and, in particular, its implementation in all major browsers; in Internet Explorer, SVG was unsupported until version 9. Moreover, screen readers often have problems with highly nested structures that require non-linear progression through Document Object Model (DOM) elements. Nevertheless, there are some successful approaches to making complex STEM diagrams web accessible with general screen readers, either by using ARIA constructs to guide screen readers or by turning SVG images into rich web applications using JavaScript functionality (Moseng and others, 2018; Sorge, Lee, Wilkinson, 2015; Godfrey, Murray, Sorge, 2018).

Math Example with Speech

SVG graphics can be turned into rich web applications. For a formula this is quite straightforward and works similarly to what we have previously demonstrated for tactile output. We consider the same example as before:

This time, when you explore it, it returns speech instead of Braille. If a screen reader is switched on, the text in the subtitles is actually spoken. Navigation works similarly, with the addition that some annotational language is used to indicate, for example, that an element is a denominator or numerator.

Instructions for SVG Navigation

Users might have to disable screen reader reading modes (e.g., “browse mode” in NVDA, [NVDA Key + Space], or “virtual cursor” in JAWS, [Insert + Z]) before being able to launch the MathJax explorer application and follow the instructions below to interact with the equation.

VoiceOver users should ignore the VoiceOver instructions on how to enter the group and instead just follow the instructions below to navigate the equation.

For activation and navigation press the following keys:

  • Enter or Return to activate the formula exploration when it is in focus,

  • Escape to leave exploration mode,

  • Down, Up to explore the next lower or higher level of the formula, respectively,

  • Right, Left to navigate horizontally by moving to the next or previous sub-expression on the current level, respectively.
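The key bindings listed above amount to a walk over the expression tree of a formula. The following sketch illustrates that idea only; it is not MathJax's implementation. The classes `Node` and `Explorer`, the sample expression, and its speech strings (including the "numerator" and "denominator" annotations) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity so equal-looking sub-expressions stay distinct
class Node:
    speech: str
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def __post_init__(self):
        # Link each child back to its parent so "Up" can be resolved.
        for child in self.children:
            child.parent = self


class Explorer:
    """Tracks the focused sub-expression and moves it in response to keys."""

    def __init__(self, root):
        self.current = root

    def press(self, key):
        node = self.current
        if key == "Down" and node.children:
            self.current = node.children[0]          # next lower level
        elif key == "Up" and node.parent is not None:
            self.current = node.parent               # next higher level
        elif key in ("Left", "Right") and node.parent is not None:
            siblings = node.parent.children
            target = siblings.index(node) + (1 if key == "Right" else -1)
            if 0 <= target < len(siblings):          # stay put at either end
                self.current = siblings[target]
        return self.current.speech                   # text to speak or braille


# Hypothetical tree for the expression y = (a + 1) / 2.
formula = Node("y equals the fraction a plus 1 over 2", [
    Node("y"),
    Node("equals"),
    Node("the fraction a plus 1 over 2", [
        Node("numerator: a plus 1"),
        Node("denominator: 2"),
    ]),
])

if __name__ == "__main__":
    explorer = Explorer(formula)
    for key in ["Down", "Right", "Right", "Down", "Right", "Up"]:
        print(key, "->", explorer.press(key))
```

Keeping the focus on one node at a time is what lets the same navigation serve both modalities: the string returned for the focused node can be sent to a speech engine or translated for a Braille display.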

\[ x=\frac{-b\pm\sqrt{b^2-4ac}}{2a} \]

Interactive Aspirin Molecule

A similar approach can be taken for general diagrams. Below is the depiction of the structural formula for the Aspirin molecule. It can be explored in a similar fashion as the mathematical formulas:

[Interactive SVG diagram: structural formula of the Aspirin molecule]

Again, left-clicking on the molecule or pressing Enter when the graphic is focused will bring up the exploration mode. Hierarchical navigation is achieved using the arrow keys. The X key provides additional information on some of the components of the molecule, while Z toggles the subtitles. Low vision support can not only be provided via magnification, which can be switched off with N, but also by changing color contrasts. In the example, C allows you to cycle through a number of high contrast color combinations, while T switches to a monochromatic display.

User studies have indicated that screen reading and keyboard exploration help readers to identify neighboring elements and get an overview of the hierarchical nature of the information, but readers often find it difficult to imagine the spatial layout.


It is often difficult to understand what is being felt in a tactile drawing or model without any additional information such as a key, tour or audio description. Audio-tactile diagrams try to solve this problem by complementing the tactile experience with audio feedback that can give information about components or explain elements that are difficult to represent tactually. The concept was first introduced by Parkes in 1988 as an audio/touch technique (Parkes, 1988; Technology for People with Disabilities, 1991) that even non-braille readers and people with other print disabilities can use to access graphical information. The user obtains a two-dimensional overview of a tactile graphic and hears information spoken when they encounter a text label or some graphical object while exploring the tactile graphic.

Audio-Tactile Aspirin Molecule

The following diagram is an audio-tactile copy that can be made fully accessible using the ViewPlus IVEO platform.

Aspirin. Molecule consisting of ring with 6 elements and functional group C O C O C H3 and functional group C O O H. Benzene ring Substitutions at positions 1 and 2. Ring with 6 elements double bond between positions 4 and 5 double bond between positions 2 and 3 double bond between positions 1 and 6 Substitutions at positions 1 and 2. Carbon 1. Single bond. Carbon 2. Double bond. Carbon 3 bonded to 1 hydrogen. Single bond. Carbon 4 bonded to 1 hydrogen. Double bond. Carbon 5 bonded to 1 hydrogen. Single bond. Carbon 6 bonded to 1 hydrogen. Double bond. Functional group Ester. Functional group C O C O C H3. Single bond. Oxygen 2. Single bond. Carbon 3. Double bond. Oxygen 4. Single bond. Carbon 5 bonded to 3 hydrogens. Functional group Carboxylic acid. Functional group C O O H. Single bond. Carbon 1. Single bond. Oxygen 3 bonded to 1 hydrogen. Double bond. Oxygen 2.

In lieu of audio-tactile exploration you can simply hover with the mouse pointer over regions of the diagram and see what would be spoken on touch. Alternatively, the TAB key allows you to move through the elements one by one and see which elements can be touched separately. A screen reader will also speak the text for the corresponding touch action.

While audio-tactile graphics present an ideal means for readers to engage with diagrams, they have two major drawbacks. First, reading them requires running additional, often proprietary software outside a web browser that might not be available on all platforms. Second, they are relatively costly, due to the price of embossers and to the time it takes to emboss a tactile graphic, which is a consideration if one only wants to glance briefly at a diagram when reading an article.


So far, the techniques to make STEM diagrams accessible rely on verbal explanations of their content. An alternative for non-visual exploration and editing of graphical representations is sonification, the transformation of any data relation into non-speech sound (Hermann, 2008; Sarkar, Bakshi, and Sa, 2012). Data sonification is possible thanks to the ability of the human auditory system to identify even slight changes in a sound pattern, so that the amount of information conveyed through an auditory representation can be, in some cases, very close to the visual equivalent. Based on this perceptual capability, many different solutions have been developed; in particular, sonification systems to explore any kind of visual scenario (including images in STEM subjects) and sonification techniques to explore exclusively images in STEM subjects.
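One simple way to turn a data relation into non-speech sound is to map each value to a pitch, so that rising data is heard as a rising tone. The sketch below illustrates this under stated assumptions: the function names, the 220 to 880 Hz range, the tone length, and the sample rate are arbitrary choices made for illustration and are not taken from any of the systems cited above. It uses only Python's standard library.

```python
import math
import struct
import wave


def values_to_frequencies(values, low_hz=220.0, high_hz=880.0):
    """Map each data value linearly onto a pitch between low_hz and high_hz."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant data: use the middle of the pitch range for every point.
        return [(low_hz + high_hz) / 2] * len(values)
    scale = (high_hz - low_hz) / (hi - lo)
    return [low_hz + (v - lo) * scale for v in values]


def tone_samples(freq_hz, duration_s=0.25, rate=8000, amplitude=0.5):
    """Return 16-bit integer samples of a sine tone at the given frequency."""
    count = int(duration_s * rate)
    peak = int(amplitude * 32767)
    return [
        int(peak * math.sin(2 * math.pi * freq_hz * n / rate))
        for n in range(count)
    ]


def write_sonification(path, values, rate=8000):
    """Write one short tone per data value to a mono 16-bit WAV file."""
    samples = []
    for freq in values_to_frequencies(values):
        samples.extend(tone_samples(freq, rate=rate))
    with wave.open(path, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(2)  # 2 bytes per sample = 16 bit
        out.setframerate(rate)
        out.writeframes(struct.pack(f"<{len(samples)}h", *samples))


if __name__ == "__main__":
    # A rising-then-falling data series becomes a rising-then-falling melody.
    write_sonification("series.wav", [1, 3, 7, 12, 7, 3, 1])
```

A linear mapping is the easiest to reason about, but because pitch perception is closer to logarithmic, a mapping over musical intervals may be easier to interpret for data spanning a wide range.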

As sonification has been dealt with before as a DIAGRAM topic, we refer the reader to the corresponding Sonification Chapter from 2019.


Accessibility of scientific material, and in particular diagrams, is one of the most challenging tasks in accessibility research. It is not merely a niche concern, as access to education is a human right, and restricting disabled students from learning scientific subjects due to lack of accessible material is clear discrimination. In fact, the societal importance of this work cannot be overestimated: participating in the information society, where data visualizations commonly appear in news, sports, and social media, requires that complex, data-intensive content can be accessed by all, everywhere, and on any device.

While in an ideal world all STEM content would be in a format that is accessible outright, the reality is far from this ideal. But we believe an important first step in this direction is to educate the educators who produce learning material to generate content in formats that lend themselves to being made accessible. In particular, graphics are still far too often exported in bitmap formats (e.g., JPG, GIF, PNG, BMP) even though many authoring tools allow for the generation of far more flexible and “intelligent” formats. For example, statistical software packages, like R or SPSS, often come with many options to generate data visualizations; but as many users only pay attention to what the output looks like and not to what additional information it can convey, many visualizations are still produced as inaccessible bitmaps. Some software even allows the embedding of data that makes the a posteriori generation of accessible formats much easier. For example, chemical drawing programs not only use chemical knowledge to guide authors while producing diagrams, they also allow this knowledge to be exported by way of standard chemical information formats like CML or MOL. This knowledge, when provided together with the corresponding graphic, can be used to make diagrams automatically accessible without any further intervention (Sorge, 2016). Unfortunately, the knowledge that is put into the creation of STEM diagrams is still too often not preserved when publishing them. Consequently, we offer some practical advice:

For Teachers

When authoring content, make sure that you save or export it in a format that can be made accessible. While it is often tempting to simply export a formula or drawing as a simple image because it has a small size and a format you are familiar with from your camera or phone, this is generally not helpful for accessibility. Try to get familiar with some of the basic output formats your software can produce and choose LaTeX over JPG, SVG over GIF, a MOL file over a PNG, etc.

Here is a rule of thumb: If the format is human readable and can give you some understanding of what its content means, it is likely to be more useful than a binary format.

Whenever possible, use a specialized program for your subject area to author your content. For example, use R for statistical diagrams, or ChemDoodle for molecules. These systems are aware of the meaning of the input and can therefore generate semantically rich content, unlike generic drawing programs where lines, points, and characters are just meaningless geometric primitives.

For Students and Parents

If you receive teaching material as documents that contain inaccessible STEM content, a first attempt should be to examine if the document contains hidden information that could help to make it accessible. Often content is dragged-and-dropped from authoring software directly into Word documents or PowerPoint slides. While on the surface this content might appear as images, often additional information is embedded in the background. For example, chemical diagrams can be dragged directly from chemical drawing software into a Word document; while superficially displayed as an image, the actual MOL file information is usually contained as well. This can often be extracted directly (e.g., right-clicking on an expression and exporting it to a separate file), by specialized software (e.g., by importing it into other chemical drawing software), or by unzipping the Word file manually and browsing through its component files.
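The manual route of unzipping a Word file can also be scripted. A `.docx` file is a ZIP archive, and Word conventionally stores embedded objects under `word/embeddings/` and images under `word/media/`. The sketch below lists those parts so they can be inspected or handed to other software. The function name `list_embedded_parts` and the file name in the usage example are hypothetical, and whether usable chemical data is actually present depends on the drawing software that produced the content.

```python
import zipfile


def list_embedded_parts(docx_path):
    """Return the names of embedded objects and media files inside a .docx.

    `docx_path` may be a file path or a file-like object opened for reading.
    """
    with zipfile.ZipFile(docx_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.startswith(("word/embeddings/", "word/media/"))
        )


def extract_embedded_parts(docx_path, target_dir):
    """Extract only the embedded objects and media files into target_dir."""
    names = list_embedded_parts(docx_path)
    with zipfile.ZipFile(docx_path) as archive:
        archive.extractall(target_dir, members=names)
    return names


if __name__ == "__main__":
    # "lesson.docx" is a placeholder for a document you received.
    for name in list_embedded_parts("lesson.docx"):
        print(name)
```

The extracted files can then be opened in the relevant subject-specific software to check whether the underlying data survived.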

If all this fails, ask your teachers how they authored the content and if it is possible to export more meaningful output. Educate them about what different output formats would be more helpful.


References

  • Soiffer, Neil. 2005. “MathPlayer: Web-Based Math Accessibility.” In Proceedings of the 7th Int. ACM SIGACCESS Conference on Computers and Accessibility, 204–5. ACM.

  • Cervone, Davide, Peter Krautzberger, and Volker Sorge. 2016. “Towards Universal Rendering in MathJax.” In Proceedings of the 13th Web for All Conference, 4. ACM.

  • Sorge, Volker. 2020. “Speech Rule Engine Version 3.0.”

  • Freedom Scientific. 2018. “JAWS.”

  • Apple. 2018. “VoiceOver.”

  • Sorge, Volker, Charles Chen, T.V. Raman, and David Tseng. 2014. “Towards Making Mathematics a First Class Citizen in General Screen Readers.” In 11th Web for All Conference, 40:1–40:10. Seoul, Korea: ACM.

  • Texthelp. 2018. “EquatIO.”

  • Moseng, and others. 2018. “HighCharts — Make Your Data Come Alive.”

  • Sorge, Volker, Mark Lee, and Sandy Wilkinson. 2015. “End-to-End Solution for Accessible Chemical Diagrams.” In Proceedings of the 12th Web for All Conference, 6:1–6:10. ACM.

  • Godfrey, A. Jonathan R., Paul Murray, and Volker Sorge. 2018. “An Accessible Interaction Model for Data Visualisation in Statistics.” In 16th International Conference on Computers Helping People with Special Needs, 590–97. Springer.

  • Parkes, Don. 1988. “Nomad, an Audio-Tactile Tool for the Acquisition, Use and Management of Spatially Distributed Information by Visually Impaired People.” In Proc. of 2nd Int. Symposium on Maps and Graphics for Visually Handicapped People.

  • Technology for People with Disabilities. 1991. “Nomad: Enabling Access to Graphics and Text Based Information for Blind, Visually Impaired and Other Disability Groups.” In Technology for People with Disabilities, 5:690–714.

  • Hermann, Thomas. 2008. “Taxonomy and Definitions for Sonification and Auditory Display.” In Proceedings of the 14th International Conference on Auditory Display (ICAD 2008). Paris, France: IRCAM.

  • Sarkar, Rajib, Sambit Bakshi, and Pankaj K Sa. 2012. “Review on Image Sonification: A Non-Visual Scene Representation.” In Recent Advances in Information Technology (RAIT), 2012 1st International Conference on, 86–90. IEEE.

  • Sorge, Volker. 2016. “Polyfilling Accessible Chemistry Diagrams.” In Computers Helping People with Special Needs: 15th International Conference, ICCHP 2016, Linz, Austria, July 13-15, 2016, Proceedings, Part I, edited by Klaus Miesenberger, Christian Bühler, and Petr Penaz, 9758:43–50. LNCS. Springer.

Published: 2020-08-31

The DIAGRAM Center is a Benetech initiative supported by the U.S. Department of Education, Office of Special Education Programs (Cooperative Agreement #H327B100001). Opinions expressed herein are those of the authors and do not necessarily represent the position of the U.S. Department of Education.


Copyright 2019 Benetech
