Key Takeaways
1. Multimedia Learning Works: Words and Pictures Are Better Than Words Alone
People learn better from words and pictures than from words alone.
The core thesis. For centuries, verbal instruction like lectures and books dominated education. However, research shows that combining words with relevant pictures significantly improves learning, particularly understanding. This isn't just about adding visuals; it's about leveraging the brain's capacity to process information through both verbal and visual channels.
Beyond verbal. Verbal instruction is powerful, but multimedia learning asks what is gained by going beyond purely verbal methods. Advances in graphics technology make it practical to combine words and pictures to promote deeper understanding. The approach is grounded in how people learn from different formats.
Evidence is strong. Across eleven experimental tests, learners who received explanations with both words and pictures consistently performed better on problem-solving transfer tests than those who received words alone. The median effect size was a large 1.39, demonstrating that multimedia presentations are more effective at fostering understanding.
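The effect sizes quoted throughout this summary are standardized mean differences. As a rough sketch of how such a figure could be computed, the Python below assumes the pooled-standard-deviation form of Cohen's d and Cohen's conventional benchmarks (0.2 small, 0.5 medium, 0.8 large). The summary does not state the book's exact computation, and the function names and scores are hypothetical, for illustration only.

```python
from math import sqrt
from statistics import mean, median, stdev


def cohens_d(treatment, control):
    """Standardized mean difference, assuming a pooled standard deviation."""
    n1, n2 = len(treatment), len(control)
    s1, s2 = stdev(treatment), stdev(control)
    pooled = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(treatment) - mean(control)) / pooled


def label(d):
    """Map an effect size to Cohen's conventional benchmark labels."""
    d = abs(d)
    if d >= 0.8:
        return "large"
    if d >= 0.5:
        return "medium"
    if d >= 0.2:
        return "small"
    return "negligible"


def median_effect_size(experiments):
    """Median of per-experiment effect sizes, one (treatment, control) pair each."""
    return median(cohens_d(t, c) for t, c in experiments)


# Hypothetical transfer-test scores: (words + pictures, words alone).
experiments = [
    ([2, 4, 6], [1, 3, 5]),
    ([3, 5, 7], [1, 3, 5]),
    ([5, 7, 9], [1, 3, 5]),
]
summary = median_effect_size(experiments)
print(f"median d = {summary:.2f} ({label(summary)})")
```

Reporting the median rather than the mean keeps a single unusually strong or weak experiment from dominating the summary figure.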
2. Learning Follows Cognitive Rules: Dual Channels, Limited Capacity, Active Processing
Multimedia messages that are designed in light of how the human mind works are more likely to lead to meaningful learning than those that are not.
Design for the mind. Effective multimedia design isn't about showcasing technology; it's about aligning instruction with human cognitive architecture. A cognitive theory of multimedia learning posits three core assumptions about how people process information from words and pictures.
Three key assumptions:
- Dual Channels: Humans process visual/pictorial information and auditory/verbal information through separate channels (like eyes/ears or visual/verbal working memory).
- Limited Capacity: Each channel has a limited capacity for processing information at any one time.
- Active Processing: Learners actively engage in selecting relevant information, organizing it into coherent mental structures, and integrating these structures with prior knowledge.
Building understanding. Meaningful learning involves a five-step process: selecting relevant words, selecting relevant images, organizing words into a verbal model, organizing images into a visual model, and integrating these models with each other and with prior knowledge. Effective multimedia design facilitates these active cognitive processes within the constraints of limited-capacity dual channels.
3. Reduce Mental Overload: Exclude Irrelevant Information (Coherence Principle)
People learn better when extraneous material is excluded rather than included.
Less is more. Adding interesting but irrelevant words, pictures, sounds, or music to a multimedia lesson can actually hurt learning. This extraneous material competes for limited cognitive resources, diverting attention and mental effort away from the essential content needed for understanding.
Three types of extraneous material:
- Irrelevant words/pictures (e.g., interesting facts or photos unrelated to the core explanation).
- Irrelevant sounds/music (e.g., background music or sound effects that don't convey instructional content).
- Unneeded words/symbols (e.g., overly detailed text when a concise caption suffices).
Evidence supports conciseness. Across fourteen tests, learners who received concise multimedia presentations without extraneous material consistently outperformed those who received expanded versions. The median effect size was a large 0.97, indicating that removing non-essential elements significantly improves understanding.
4. Reduce Mental Overload: Place Related Words and Pictures Together (Contiguity Principles)
Students learn better when corresponding words and pictures are presented near rather than far from each other on the page or screen.
Proximity aids integration. When words and pictures that refer to the same concept or event are separated in space (on a page/screen) or time (in a presentation sequence), learners expend cognitive effort searching and holding information in memory to make connections. This extraneous processing reduces capacity for deeper understanding.
Spatial Contiguity: Placing corresponding text and graphics close together (e.g., captions next to the relevant part of an illustration or animation) reduces visual search and facilitates integration. In five tests, integrated spatial presentations yielded a large median effect size of 1.19 compared to separated presentations.
Temporal Contiguity: Presenting corresponding narration and animation simultaneously (rather than one after the other) ensures both are active in working memory at the same time, making integration easier. In eight tests, simultaneous presentations resulted in a large median effect size of 1.31 compared to successive presentations.
5. Reduce Mental Overload: Avoid Redundant On-Screen Text with Graphics and Narration (Redundancy Principle)
People learn better from graphics and narration than from graphics, narration, and printed text.
Don't overload channels. Presenting the same words simultaneously in both auditory (narration) and visual (on-screen text) formats, along with graphics, can overload the visual channel. Learners try to process both the graphics and the text visually, and may also expend effort trying to reconcile the two verbal streams.
Narration offloads visual processing. When words are presented as narration, they are processed in the auditory channel, freeing up the visual channel to focus on the graphics. This balanced load allows for more efficient processing and integration of words and pictures.
Evidence against redundancy. In five tests, learners who received graphics with narration performed better on transfer tests than those who received graphics, narration, and redundant on-screen text. The median effect size was a medium-to-large 0.72, supporting the principle that adding redundant on-screen text can hinder learning.
6. Reduce Mental Overload: Highlight Key Information (Signaling Principle)
People learn better when cues that highlight the organization of the essential material are added.
Guide attention. In lessons containing a lot of information, learners may struggle to identify and focus on the most important parts. Signaling involves adding cues that highlight the structure and key elements of the essential material, guiding the learner's attention and organization processes.
Types of signaling:
- Verbal cues: Outline sentences, headings, vocal emphasis on key words, pointer words ("first," "second").
- Visual cues: Arrows, distinctive colors, flashing, pointing gestures, graying out non-essential areas.
Signaling helps focus. By directing the learner's attention to what is most important and how it is organized, signaling reduces extraneous processing and supports the construction of a coherent mental model. Preliminary evidence from six tests shows a medium median effect size of 0.52 favoring signaled multimedia lessons.
7. Manage Complexity: Break Lessons into Learner-Paced Parts (Segmenting Principle)
People learn better when a multimedia message is presented in user-paced segments rather than as a continuous unit.
Digestible chunks. When complex material is presented in a fast-paced, continuous format (like a long narrated animation), learners may not have enough time to fully process and understand one part before the next is presented. This essential processing overload prevents deeper learning.
Pacing matters. Segmenting breaks the lesson into smaller, meaningful parts (e.g., steps in a process) and allows the learner to control the pace of moving from one segment to the next (e.g., by clicking a "continue" button). This gives learners time to process each chunk before proceeding.
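The learner-paced control described above can be sketched in a few lines of Python. This is a minimal illustration, not an implementation from the book: the class name, method names, and segment texts are all hypothetical. The point is that the lesson never advances on its own; the next segment appears only when the learner asks for it.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SegmentedLesson:
    """A lesson split into segments that advance only on learner request."""

    segments: List[str]
    position: int = 0

    def __post_init__(self):
        if not self.segments:
            raise ValueError("a lesson needs at least one segment")

    def current(self) -> str:
        """Return the segment the learner is currently studying."""
        return self.segments[self.position]

    def has_next(self) -> bool:
        """True while there is another segment after the current one."""
        return self.position < len(self.segments) - 1

    def continue_(self) -> str:
        """Stand-in for a 'continue' button: move on only when called."""
        if self.has_next():
            self.position += 1
        return self.current()


# Hypothetical segments; a real lesson would pair each with its graphic.
lesson = SegmentedLesson([
    "Step 1: first part of the process",
    "Step 2: second part of the process",
    "Step 3: final part of the process",
])
print(lesson.current())
while lesson.has_next():
    # In a real interface this call would wait for the learner's click.
    print(lesson.continue_())
```

A continuous presentation would instead advance on a timer, which is the condition the segmented versions were compared against.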
Evidence for segmenting. In three tests, learners who received complex narrated animations in user-paced segments performed better on transfer tests than those who received the same lessons as continuous units. The median effect size was a large 0.98, indicating that allowing learners to control the pace of complex material significantly improves understanding.
8. Manage Complexity: Introduce Key Concepts Before the Main Lesson (Pre-training Principle)
People learn more deeply from a multimedia message when they know the names and characteristics of the main concepts.
Build foundational knowledge. When a multimedia lesson introduces many new terms or concepts while simultaneously explaining a complex process, learners may experience essential processing overload: they must spend cognitive resources learning the new terms and understanding the overall system at the same time.
Offload complexity. Pre-training involves introducing the names, locations, and characteristics of key components or concepts before presenting the main multimedia explanation. This allows learners to build component models beforehand, freeing up cognitive capacity during the main lesson to focus on building a causal model of the system.
Pre-training improves transfer. In five tests, learners who received pre-training on key concepts before a multimedia lesson performed better on problem-solving transfer tests than those who did not. The median effect size was a large 0.85, supporting the idea that equipping learners with foundational knowledge helps them manage the complexity of the main lesson.
9. Manage Complexity: Use Narration, Not Text, with Graphics (Modality Principle)
People learn more deeply from pictures and spoken words than from pictures and printed words.
Balance the load across channels. When graphics (like animations or illustrations) are presented with printed words (like on-screen text or captions), both visual information and verbal information must be processed by the visual channel. This can overload the visual system, especially with complex or fast-paced material.
Auditory advantage. Presenting words as narration allows the verbal information to be processed by the auditory channel, while the graphics are processed by the visual channel. This distributes the cognitive load across both channels, preventing overload in the visual system.
Strong evidence for modality. In seventeen tests, learners who received graphics with narration consistently performed better on transfer tests than those who received graphics with printed text. The median effect size was a large 1.02, making the modality principle one of the most strongly supported findings in multimedia learning research.
10. Foster Engagement: Make the Language Conversational (Personalization Principle)
People learn better from multimedia presentations when words are in conversational style rather than formal style.
Connect with the learner. Multimedia learning can be viewed as a social interaction between the instructor (author, narrator, agent) and the learner. Using a conversational style, including "you" and "I" and direct comments, can increase the learner's feeling of social presence and encourage them to engage more deeply with the material.
Beyond information delivery. This principle suggests that effective instruction is not just about delivering information efficiently; it's also about motivating the learner to actively process that information. A conversational tone can prime a social response, making the learner more willing to cooperate and make sense of the message.
Personalization boosts understanding. In eleven tests, learners who received multimedia lessons with words in a conversational style consistently performed better on transfer tests than those who received the same lessons in a formal style. The median effect size was a large 1.11, indicating that personalizing the language fosters generative processing.
11. Foster Engagement: Use a Human Voice (Voice Principle)
People learn better when narration is spoken in a human voice rather than in a machine voice.
Voice as a social cue. The quality of the narrator's voice can also serve as a social cue, influencing the learner's perception of the instructor and their willingness to engage. A friendly human voice may convey a stronger sense of social partnership than a synthesized machine voice.
Human voice encourages processing. While machine voices can be perfectly understandable, they may lack the subtle social cues that encourage learners to view the interaction as a conversation. This lack of social presence might reduce the learner's motivation to engage in deeper cognitive processing.
Preliminary evidence. In three tests, learners who heard narration spoken by a friendly human voice performed better on transfer tests than those who heard the same narration spoken by a machine voice. The median effect size was a medium-to-large 0.78, providing preliminary support for the voice principle.
12. Effective Design Depends on Learner Expertise and Material Characteristics (Boundary Conditions)
Design principles – such as the modality principle – are not immutable laws that must apply in all situations.
Context matters. Multimedia design principles are not universal commandments but rather guidelines that should be applied based on the specific learning context. Two key boundary conditions influence when certain principles are most effective: the learner's prior knowledge and the complexity and pacing of the material.
Learner expertise: Some principles, particularly those aimed at reducing extraneous processing or managing essential processing (like Modality, Spatial Contiguity, and Pre-training), tend to have stronger effects for learners with low prior knowledge. High-knowledge learners may already possess the skills or schemas to overcome poor design or manage complexity on their own.
Material complexity and pacing: Principles that help manage essential processing (like Segmenting and Modality) are often more effective when the material is complex and presented at a fast pace. These conditions are more likely to cause cognitive overload, making design interventions that reduce load particularly beneficial.
Review Summary
Multimedia Learning by Richard Mayer receives mostly positive reviews, with readers praising its informative content and research-backed principles. Many find it useful for instructional design and multimedia education. Some readers note its academic writing style can be dry, but the insights are valuable. The book is recommended for those in educational technology and multimedia design fields. Reviewers appreciate the comprehensive coverage of cognitive theories and practical applications. Several mention its relevance to both education and business presentations. Some suggest reading newer works for updated information.