An analysis of existing research supports a notion that already has begun to transform instruction in schools from coast to coast: that multimodal learning–using many modes and strategies that cater to individual learners’ needs and capacities–is more effective than traditional, unimodal learning, which uses a single mode or strategy.

According to a new report commissioned by Cisco Systems, adding visuals to verbal (textual and/or auditory) instruction can result in significant gains in basic or higher-order learning, if applied appropriately. Students using a well-designed combination of visuals and text learn more than students who use only text, the report says.

The report also provides insights into when interactivity strengthens the multimodal learning of moderate to complex topics, and when it’s advantageous for students to work individually.

“There is a lot of misinformation circulating about the effectiveness of multimodal learning,” said Charles Fadel, Cisco’s global education leader. “As curriculum designers embrace multimedia and technology wholeheartedly, we considered it important to set the record straight, in the interest of the most effective teaching and learning.”

The report, titled Multimodal Learning through Media: What the Research Says, was prepared by the Metiri Group, which serves the education community through a broad range of consulting services. It is the third in a series of meta-studies that address “what the research says” about various topics in education; prior reports tackled technology in schools and the link between education and economic growth.

Information was gathered for the report using meta-analysis, which combines the results of several studies that address a set of related research hypotheses. Only studies published after 1997 and addressing the use of multimedia in education were considered.

“The real challenge before educators today is to establish learning environments, teaching practices, curricula, and resources that leverage what we know about the limitations of human physiology and the capacity explained by the cognitive sciences to augment deep learning in students,” says the study.

How students learn

New information about how we acquire knowledge is now available through functional magnetic resonance imaging (fMRI) of the human brain at work and rapid sampling techniques that reveal the pattern of brain activity over time as people read, listen, talk, observe, think, multitask, and perform other mental tasks.

In its introduction, the Metiri Group report indicates that the brain has three types of memory: sensory memory, working memory, and long-term memory.

Working memory, where thinking gets done, is dual-coded: one buffer stores verbal or text elements, and a second buffer stores visual or spatial elements. Short-term memory is thought to be limited to approximately four objects that can be simultaneously stored in visual or spatial memory and about seven objects that can be simultaneously stored in verbal memory.

Within working memory, verbal/text memory and visual/spatial memory work together, without interference, to strengthen understanding. However, overfilling either buffer can result in cognitive overload and weaken learning.

Recent studies also suggest that convergence, or sensory input combined with new information at the same time, has positive effects on memory retrieval. Convergence creates linked memories, so that triggering any aspect of the experience brings the entire memory to consciousness.

Sensory memory is caused by experiencing anything through the five senses (sight, sound, taste, smell, and touch) and is involuntarily stored in long-term memory as episodic knowledge. However, these sensory memories degrade very quickly, and it’s only when the person pays attention to elements of sensory memory that these experiences get introduced into working memory. Once an experience is in a student’s working memory, the learner then can consciously hold that experience in his or her memory and can think about it in context.

Long-term memory is nearly unlimited, and it’s estimated that a person can store up to 10 to the 20th power bits of information over a lifetime–equivalent to 50,000 times the text in the U.S. Library of Congress (30 million cataloged books; 58 million manuscripts).
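
To put that comparison in more familiar units, the two figures quoted above can be divided out. The short Python sketch below is only a back-of-the-envelope check on the article's own numbers; the function name and the decimal definition of a terabyte are assumptions made for illustration, not something taken from the report.

```python
# Figures quoted in the article; these are the report's estimates, not measurements.
LIFETIME_CAPACITY_BITS = 10 ** 20        # "10 to the 20th power bits"
LIBRARY_OF_CONGRESS_MULTIPLE = 50_000    # "50,000 times the text in the Library of Congress"


def implied_library_text_bits(capacity_bits=LIFETIME_CAPACITY_BITS,
                              multiple=LIBRARY_OF_CONGRESS_MULTIPLE):
    """Return the size of the library's text that the two quoted figures imply, in bits."""
    return capacity_bits // multiple


bits = implied_library_text_bits()
# Assumption: 8 bits per byte and a decimal terabyte (10**12 bytes).
terabytes = bits / 8 / 10 ** 12

print(f"Implied Library of Congress text: {bits:.1e} bits, about {terabytes:.0f} TB")
```

Taken at face value, the two estimates imply roughly 250 terabytes of text for the library, which gives a sense of the scale being claimed for long-term memory.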