Computer beside a 3D printer with a number of 3D models of gears, a mountain range, bones, teeth, and dioramas.

Multimodal User Interfaces

Emily B. Moore

Director of Research & Accessibility

PhET Interactive Simulations,

University of Colorado Boulder

and

Jenna L. Gorlewicz

Assistant Professor of Mechanical Engineering

Saint Louis University

What are Multimodal User Interfaces?

When you read an article, listen to a podcast, or touch the screen of your smartphone, you are using some of your five senses (seeing, hearing, or touching) to experience information. The use of more than one of our senses at a time can help provide additional information about a problem, object or concept that one sense alone could not provide. Multimodal user interfaces provide multiple modes of communication at the interface between you and the technology, allowing you to use more of your senses at once.

Broadening the modalities used for communicating with technology can increase access to technology while also increasing the amount and efficiency of the information communicated. For example, e-readers can display the text of your favorite book (using a visual mode), or also read it aloud to you (using an auditory mode). While some may just prefer to listen to the text, others might choose to see and hear the text simultaneously. A student who is blind and has a refreshable braille display could read the text in braille while listening to the audio. You may turn the pages of your book by touching the screen (a touch mode), or turn the page using a voice command (an auditory mode). Multimodal interactions let you choose to access the information in the way that works for you. In the above e-reader example, a person with a visual impairment could use the auditory modality to access the text, while a person with a hearing impairment might use the same e-reader to access the text visually. This is an example of a single technology being flexible and able to adapt to meet the needs of the user.

As technologies become more sophisticated, the capability to include a broad palette of modalities in multimodal interfaces becomes possible and may ultimately support technologies matching the full spectrum of human perception and communication. For more detail and examples of multimodal user interfaces, check out the 2017 and 2018 DIAGRAM Center Reports. In this year’s section on multimodal user interfaces, we build on these prior reports, detailing the importance of multimodal user interfaces particularly as new technologies evolve. We close with specific callouts to educators, parents, students, and designers for promoting and innovating multimodal user interfaces.

Why are Multimodal User Interfaces Important?

Multimodal user interfaces enable communication through multiple senses so that users can send and receive information in a preferred or more engaging way. This benefits educational technologies and technology in general. The advancement of multimodal interfaces has had broad implications in many industries, including entertainment, medicine, defense, and education. For example:

Inventor of the telephone, Alexander Graham Bell, was a teacher of deaf and deaf-blind students. Bell’s interest in sound stemmed from an effort to make sound visible to his students, and his work ultimately broadened speech modalities from auditory to visual, and in the process transformed how humans with and without hearing impairments communicate.
More recently, Apple’s 2007 introduction of the iPhone brought touchscreen devices to the mass market. Initially, this was considered a huge setback to accessibility for blind consumers, as this new technology was based around a flat, featureless glass surface that displayed information visually. However, in 2009, Apple released the iPhone 3GS, which included the VoiceOver screen reader and a collection of new input gestures, bringing touch and auditory communication to this visual interface. Almost immediately, the iPhone became one of the most accessible pieces of assistive technology in existence – even though it was not originally designed to be an assistive technology – contributing to it becoming one of the most widely used consumer technologies today.
Video games have also embraced multimodal user interfaces. Video games were once primarily solo visual experiences. Now games are rich auditory experiences, with many commercial games supporting chat and voice conversations or coming with game controllers that rumble to simulate interactions. Video games also have increasingly realistic graphics and animation, enriching the visual experience. Together, these multimodal interface features support immersive, engaging, and collaborative experiences for players.

As technologies process and convey larger amounts of information and accomplish more complex tasks, the need for multimodal user interfaces increases. The evolution of user interfaces has significant implications for workforce development. Not only are the ways learners are prepared for the workforce changing as new technologies enable new forms of interaction, but entirely new careers are also being created to support research, design, and development of multimodal user interfaces (see Leading Edge section).

Who is Doing This Already?

Multimodal interfaces span solutions that are high and low technology and incorporate different sensory modalities. The 2017 and 2018 DIAGRAM Center reports have detailed numerous examples of alternative means of representing information. These examples include 2D tactiles (for representing graphics through touch), 3D printing (for creating 3D manipulatives), alternative text (for providing textual descriptions of images), and audio/video description (for narration of videos, tours, or exhibits). Indeed, many multimodal interfaces and methods already exist. One common example is the touchscreen. Touchscreens use the visual display on the screen, coupled with auditory feedback (e.g., ringtones) and haptics (e.g., vibration cues) to convey information to the user. Such feedback helps users type on the keyboard, receive alerts, and consume and navigate information displayed on screen through these varying modalities.

Other existing multimodal interfaces include video game systems (which use a combination of visual, auditory, and vibration feedback) and a variety of web-based tools, including simulations such as the PhET Interactive Simulations and the FLOE chart authoring tool by the Inclusive Design Research Centre. Several tools that promote multimodal information transfer have been developed by the DIAGRAM community, such as The Quick Start Guide to Accessible Publishing and Imageshare.

While multimodal interfaces are on the rise, even existing technologies that are commercially available (e.g., touchscreens) are not using all modalities to their full capacity. Many additional multimodal interfaces are under development in research. In the sections that follow, we focus on these new, emerging technologies that are using combinations of modalities to convey information and foreshadow the future of multimodal user interfaces more broadly.

Opportunities – How are Multimodal User Interfaces Impacting Education?

The vast majority of students in the United States are educated in integrated classrooms that include students with and without disabilities learning in the same room (U.S. Dept. of Education, 2016). Inclusive classrooms often require collaborative teaching arrangements, with general education teachers working with special education teachers on the planning, preparation, and implementation of classroom activities.

Technologies with multimodal user interfaces can provide opportunities for learners with and without disabilities to utilize the same learning resources, supporting teachers and students in effective, efficient, and inclusive classrooms. Many technologies can also be adaptive, flexing to meet the communication as well as the content needs of an individual or group of learners. Consider what happens without such flexibility: textbooks contain words and images and may have online resources such as videos to complement the text materials. If a textbook provides images with no description of the image, or the online video provides information verbally only, students with vision or hearing impairments will be left out – thus losing opportunities for independent investigation.

Further, students who cannot access the visual display of the book, or the auditory display of the video, will also be disadvantaged when trying to join group and classroom discussions of the materials. If the textbook material were provided in an accessible format, such as an EPUB or with an accessible e-reader, and the images had associated descriptions (i.e., alternative text), visually impaired learners could use auditory options to access the visual information. If the online video had captions, providing the verbal information in a visual text format, learners with hearing impairments could access the visual representations of the auditory information. In both cases, by broadening the modalities used to convey information through a multimodal user interface, opportunities for all students to engage in independent investigations and to participate fully in group and class discussions increase tremendously. Some currently available educational technologies with multimodal user interfaces include innovative digital textbooks, such as the Reach for the Stars: Born Accessible Astronomy Textbook and the PhET Interactive Simulations collection of accessible science simulations.

To highlight the potentially transformative capabilities of multimodal user interfaces for impacting education, here is a sample interaction one of the accessible PhET Simulations could support:

Dion, Iris, and Aki are using the accessible PhET sim John Travoltage to explore static electricity as part of an in-class activity.

In this simulation, the character John is standing on a rug next to a door. Rubbing his foot on the rug results in charges transferring onto his body, and moving his hand close to the doorknob can result in a transfer of charges from his body to the door – and a shock!

Their teacher has asked them to explore the simulation and discuss what happens as more charge is collected on the body of the character, John. This simulation has descriptions, sound effects, and sonification available. See the chapter on sonification for more information. The three students share Aki’s computer, which has screen reader software that allows Aki, who is blind, to control the simulation with her keyboard and hear descriptions of what is being interacted with and what is occurring in the simulation.

The students each take turns controlling the computer and adding charges to John’s body. Dion and Iris use a mouse, and Aki uses the keyboard. They all hear the sound of the foot rubbing, the ‘pop’ of charges transferring onto John’s body, and a low hum that increases in pitch and volume as more charges are added. The action of foot rubbing and the total number of charges is described as more charges are added. Once they have enough charge added, the charges transfer to the doorknob with a ZAP! Together, they go on to explore the relationship between the amount of charge on John’s body and the location of John’s hand, determining that the more charge there is on John’s body, the farther his hand can be away from the doorknob while still resulting in a shock. Along the way, Dion, Iris, and Aki are each relying on different modalities for their primary source of information (Aki is relying primarily on descriptions while Dion and Iris are relying primarily on the visual display). All learners in this group have some type of learning difference, but all are using the sound effects and sonification.

Notice how, in this example, Aki, the student who is blind, can seamlessly participate in the learning activity with her sighted peers. She could utilize her screen reader software to follow along with auditory descriptions as her peers experimented, and as she experimented her peers could follow along visually. The multimodal user interface, along with the appropriate assistive technologies, provided an opportunity for all of the students in the group to engage in the class activity together.
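The hum described in this example, which rises in pitch and volume as charges accumulate, is a simple form of sonification: a number is mapped to properties of a sound. The Python sketch below illustrates one way such a mapping could work. It is not PhET's implementation; the function name, the frequency and gain ranges, and the maximum of 100 charges are all assumptions chosen for illustration.

```python
def hum_parameters(charge_count, max_charges=100,
                   min_freq_hz=110.0, max_freq_hz=440.0,
                   min_gain=0.1, max_gain=0.8):
    """Map a charge count to (frequency in Hz, gain from 0 to 1) for a hum.

    All ranges here are illustrative assumptions, not values from PhET.
    """
    # Clamp so that out-of-range counts cannot produce extreme sounds.
    clamped = max(0, min(charge_count, max_charges))
    fraction = clamped / max_charges

    # Interpolate pitch exponentially so that equal steps in charge
    # are heard as roughly equal musical steps.
    freq = min_freq_hz * (max_freq_hz / min_freq_hz) ** fraction

    # Interpolate volume linearly between the quietest and loudest levels.
    gain = min_gain + (max_gain - min_gain) * fraction
    return freq, gain


if __name__ == "__main__":
    for charges in (0, 25, 50, 100):
        freq, gain = hum_parameters(charges)
        print(f"{charges:3d} charges: {freq:6.1f} Hz at gain {gain:.2f}")
```

The resulting frequency and gain would then be handed to whatever audio engine the application uses. Because the same charge count also drives the visual display and the text descriptions, each student receives the same information through the modality they rely on most.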

An additional example of a multimodal user interface in education is Vital’s touchscreen-based tools, which focus on making graphics accessible electronically. Here is a sample interaction in a sixth grade math class:

Jake, Olivia, and Akash are using Vital’s Android application to learn about bar graphs in mathematics. The teacher has used Vital’s web-based content creator to mark up pictures of bar graphs from their math textbook so that the students can feel (through vibrations), hear (speech and sonification), and see (visual display) the bar graphs in real time during class. The teacher has asked the students to explore the bar graphs and discuss the different trends in each of them. Using their touchscreen, which they all share, the students begin to explore the graphics. As they run their fingers over the titles and labels, the text is read aloud to them. A general description and overview of the graphic can be accessed at any time by double tapping on the screen. As they run their fingers over the different bars on the chart, they hear them. As they move their fingers up and down, the pitch of the tone varies, providing an easy way to compare the heights of the bars with respect to one another. Values and categories of the bars can be read aloud via text-to-speech or can be determined by tracing back along the grid lines to the x- and y-axes using vibration feedback.

The students can take turns exploring the graph multimodally and can swipe to the right to explore the next image. They all hear the text-to-speech being read aloud for the headers and labels, they hear the pitch change as each of them takes a turn comparing bar heights, and they feel the low, buzzing vibration as they trace along the different grid lines. Together, they determine the maximum and minimum values of the bars, as well as the trend over time. They answer questions about the bar graph, and then move to the next image and set of questions. Along the way, Jake, Olivia, and Akash are each relying on different modalities for their primary source of information (Olivia is relying primarily on speech, sonification, and vibration, while Akash and Jake are relying primarily on the visual display).

Similar to the accessible PhET simulation, in this example, the student who is blind, Olivia, can seamlessly participate in the learning activity with her sighted peers. She uses the vibrations built into commercial touchscreens to navigate the graphic. Using the sonification and auditory descriptions coupled with the visual output, she and her peers can explore the graph individually and then collaboratively to extract and discuss the information in it. In this example, the commercially available touchscreen is also serving as an assistive technology (and it could be paired with other assistive technologies), providing an opportunity for all of the students to collaborate and contribute in math class.
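The bar graph interaction above can be thought of as a lookup from a touch position to a kind of feedback: labels are spoken, grid lines vibrate, and bars play a tone whose pitch follows the finger's height. The Python sketch below shows that idea in miniature. It is not Vital's code; the `Region` type, the region kinds, the pitch range, and the returned action names are hypothetical and exist only to illustrate the mapping.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A rectangular area of the marked-up graphic (hypothetical type)."""
    kind: str      # "label", "bar", or "gridline"
    left: float
    top: float
    right: float
    bottom: float
    text: str = ""

    def contains(self, x, y):
        return self.left <= x <= self.right and self.top <= y <= self.bottom


def pitch_for_height(y, chart_top, chart_bottom, low_hz=220.0, high_hz=880.0):
    """Higher finger position gives a higher pitch (ranges are assumptions)."""
    # Screen y grows downward, so a finger higher on the chart has a smaller y.
    fraction = (chart_bottom - y) / (chart_bottom - chart_top)
    fraction = max(0.0, min(1.0, fraction))
    return low_hz + (high_hz - low_hz) * fraction


def feedback_for_touch(x, y, regions, chart_top, chart_bottom):
    """Return an (action, value) pair for the first region under the finger."""
    for region in regions:
        if not region.contains(x, y):
            continue
        if region.kind == "label":
            return ("speak", region.text)       # read aloud by text-to-speech
        if region.kind == "gridline":
            return ("vibrate", "low_buzz")      # haptic cue along grid lines
        if region.kind == "bar":
            return ("tone", pitch_for_height(y, chart_top, chart_bottom))
    return ("none", None)


if __name__ == "__main__":
    regions = [
        Region("label", 0, 0, 100, 10, "Rainfall by month"),
        Region("bar", 10, 50, 20, 100),
        Region("gridline", 0, 74, 100, 76),
    ]
    for point in [(50, 5), (15, 60), (50, 75), (90, 30)]:
        print(point, feedback_for_touch(*point, regions, 20, 100))
```

Regions are checked in list order, so a design like this would need to decide which feedback wins where a grid line crosses a bar. The visual display stays on screen throughout, which is what lets sighted and blind students explore the same graphic together.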

Leading Edge

Many current multimodal user interfaces were originally conceived to emphasize one or two modalities, with additional modalities added over time as the need or technological capacity developed. Examples include the addition of alternative text to images in textbooks and websites and the addition of gesture input and haptic feedback to mobile phones. The next generation of multimodal user interfaces is being developed with a broader set of modalities in mind from the start and is resulting in outcomes with greater parity between modalities.

One current research thread stems from an interest in understanding how to expand “touch” in the electronic space, supporting intuitive tangible interactions that, when coupled with auditory display modalities, can augment or, in some cases, reduce reliance on the current visual display paradigm. A growing number of new technologies explore how to create these tangible interactions on new types of interfaces, from touchscreens to wearable devices that provide the user with information through vibrations, pressure, temperature, or force.

As highlighted above, one example of this initiative is in the touchscreen space. Tools like Vital’s convey graphical information using the built-in vibration capabilities coupled with sonification, auditory description, and visual display on commercially available touchscreens. A growing number of new technologies are coming to market specifically focused on enabling new “touch dimensions.” These include the Graphiti, American Printing House’s dynamic touch-sensitive pin array; the Blitab tablet, which is capable of a full page of braille; shapeshift, a refreshable multi-height pin display that can render 3D objects and dynamic movement [39]; and microfluidic-based tablets that are capable of refreshable, raised dots (e.g., The Holy Braille Project). While some of these are closer to becoming more widely available than others, they are all pushing the boundaries of multimodal interfaces.

Efforts are also advancing in multimodal virtual reality, augmented reality, mixed reality, and 360 video (collectively, XR), as well as in wearables and innovative ubiquitous computing environments, such as Dynamicland.

Final Thoughts

As new and more complex technology develops, the need for multimodal user interfaces will only increase. In fact, entire fields are focused on designing aspects of these interfaces. For example, the field of Human-Computer Interaction specifically focuses on understanding how the design of new interfaces can better promote communication between users and technologies. In hardware and software design, more emphasis is being placed on designing with all stakeholders involved, providing an opportunity for individuals with diverse perspectives to contribute early in the design process. As these fields continue to expand, more people with disabilities are being included in the design process directly, providing a new pathway for workforce and career development while also producing new technologies that better serve all people.

There are many opportunities for you or a child you know to get involved as a user, designer, developer, or research participant, including:

The Interaction Design and Children conference holds an annual design challenge, providing children the opportunity to submit ideas of new technologies that could impact the world.
Local colleges and universities are often looking for user testers to try out new prototype technologies. Search the website of your local higher education institutions for calls for participants, or reach out to their computer science departments directly.
Many projects in accessible media and technology (such as DIAGRAM, the Inclusive Design Research Centre, XRaccess.org, and many others) are home to active virtual communities that welcome contributions from software development and design professionals, researchers, users, and advocates. Consider participating in one of these communities and contributing to problem identification as well as the design and development of solutions.
Consider expanding your involvement in the technology world by learning more about design (d.school resources) and development (e.g., Scratch website, see the 2017 chapter on Accessible Coding).

Callouts

Callout for Educators

Investigate your students’ individual needs for accessing information before deciding on how you will augment the curriculum. Don’t be afraid to experiment with different modalities and combinations of modalities.
Before choosing to use a specific program or device in an educational setting, consider the Universal Design for Learning guidelines. These guidelines can assist in lesson planning, units of study and/or development of curricula that can reduce barriers to learning.
Familiarize yourself with emerging technologies and their implications in supporting diverse learners; they may influence what modalities you choose to use.
If you use the multimodal features of a technology in your teaching and have an experience, lesson, or activity that students respond especially well to, share this with others. Email a description of the experience, lesson you learned, and any associated files to info@diagramcenter.org and we will include them in our Imageshare resource.

Callout for Parents

Discuss with your children how they learn best so that you can advocate for their needs.
Work with your child’s teacher to find tools that can help your child learn.
Familiarize yourself with new technologies so you can help your child at home.
Make sure all options are considered when developing Individualized Education Program (IEP) goals and resources.

Callout for Students

Talk to your teachers about the ways that you learn best. Tell them if it is easier to understand information by hearing it, seeing it, interacting with it, or through a combination of these.
Ask your teacher if you can complete assignments in non-traditional ways that will help demonstrate your ability.
Explore new career pathways like those in the Human-Computer Interaction field.

Callout for Designers

Support and implement participatory design in the product design process, and consult resources such as the Inclusive Design Guide.
Consider an inclusive design approach from the outset, rather than as an afterthought.

Published: 2019-08-31
