Cognitive load theory in the classroom

Cognitive load theory is rapidly becoming one of the most talked-about theories of how we learn. But what are the implications for how we teach? Teacher and blogger Tom Needham outlines the basics, and what they could mean for you, in the first of this three-part series.

Six years ago, I read Why Don’t Students Like School? by Daniel Willingham, a text that not only made me reconsider almost all aspects of how I was teaching but also acted as a springboard into the depths of educational research. His explanation of the importance of memory and the conceptual distinction between working and long-term memory revolutionised how I thought about instruction and made it abundantly clear that I had not been focusing upon the vital notion of retention. Cognitive load theory is also based on the conceptual difference between working and long-term memory and provides a number of strategies to optimise instruction within that framework.

An overview of some of the theory

What is it that makes experts proficient? In 1973, a study(1) was conducted to investigate what made grandmaster chess players superior to other players. While an intuitive answer may have attributed their dominance to more proficient problem-solving abilities, the application of a generic ‘means-ends’ analytical approach or the fact that they weighed up and considered a wider range of alternative strategies, the reality was a difference in their memories. Players, both expert and novice, were shown a chessboard with pieces arranged in plausible and typical game situations for five seconds. When asked to recall the positions of the chess pieces, expert players were significantly and consistently better than novices.

However, if the pieces were arranged randomly, then this gap in performance disappeared: experts and novices performed the same. With the random configurations, experts could not rely upon recalling thousands of game configurations, as the pieces did not conform to the game patterns that they had stored in long-term memory. Similar results have also been found in other domains, including recall of text and algebra. The conclusion of these studies was that when solving problems or engaging in cognitive work, experts within a field rely upon their larger and more developed long-term memory deposits, patterns of information that are also called schemata. While short-term memory has a limited capacity, long-term memory capacity is vast and seemingly endless.

Recognising the fact that novices have less relevant knowledge stored in their long-term memory, Sweller et al. explain: ‘Novices need to use thinking skills. Experts use knowledge.’(2) Because ‘thinking skills’ rely upon working memory, an aspect of cognition that has a small and fixed capacity for holding and manipulating items, novices soon reach its limits and, due to excessive cognitive load, find tasks difficult or impossible. The implications of these findings are striking for teachers. In a general sense, we should be spending much – if not most – of our time as teachers trying to increase our students’ domain-specific background knowledge so that we can help them overcome the seemingly unalterable limits of their short-term memory and recall, apply and use relevant knowledge from their long-term memories. Sweller et al. posit that ‘we should provide learners with as much relevant information as we are able’(3) and that ‘assisting learners to obtain needed information during problem solving should be beneficial’(4). They also posit that ‘providing [learners] with that information directly and explicitly should be even more beneficial’(5). Explicit teaching, at least for novices, is almost certainly preferable to asking students to discover things for themselves. If we are not explicit, there is a chance that students will not retain and understand what we are teaching, resulting in a missed opportunity for them to increase their knowledge.

In order to develop in expertise, students need to increase their knowledge; and in order for them to increase their knowledge efficiently, they need direct and explicit teaching.

The worked example effect

In short, the worked example effect refers to the idea that if you want novices to succeed in a particular domain, they would be better off studying the solutions to problems rather than attempting to solve them. Asking students to repeatedly write extended answers to questions ‘unnecessarily adds problem-solving search to the interacting elements, thus imposing an extraneous cognitive load’.(6) In the absence of well-developed background knowledge, students flounder because they have little stored in their long-term memories to help them. Comments in class such as ‘I don’t know how to start’ and ‘What do I write?’ are sometimes indicative of this scenario.

I teach English, and responding analytically to texts is a complex activity containing multiple components, many of which are abstruse for novice learners. If you try to describe these elements, you are forced to use abstract phrases such as ‘sophisticated analysis’ and ‘judicious use of quotations’; and, in the absence of examples, these terms merely serve to mystify the process further. This is the language of mark schemes, terminology that may make sense to experts but leaves novices confused. Creating worked examples – in English this may mean sentences, paragraphs or essays – exemplifies these opaque terms, converting the abstract into the concrete.

Sweller et al. argue that ‘worked examples can efficiently provide us with the problem-solving schemas that need to be stored in long-term memory’.(7) Studying worked examples is beneficial because it helps to build and develop students’ background knowledge within their long-term memories, information that can then be recalled and applied when attempting problems. The grandmasters in the chess study were successful because of the breadth and depth of their background knowledge. Similarly, English teachers find writing (one of the problems in our domain) easy because we have long-term memories that contain myriad ‘problem-solving schemas’ and mental representations of analytical responses to texts.

If we accept the notion that short-term memory capacity is pretty much fixed – as well as the idea that we cannot really teach generic higher-order thinking skills – then building domain-specific background knowledge may be our most important job as teachers. Studying worked examples is more effective and efficient than merely attempting problems. Deconstructing and studying model sentences, paragraphs and essays should, in the long run, be superior to merely writing them.

Research into the worked-example effect in English

In Cognitive Load Theory, Sweller et al. refer to English, the humanities and the arts as ‘ill-structured learning domains’(8) to distinguish them from mathematics and science. They make the point that in maths and science problems, we can ‘clearly specify the various problem states and the problem-solving operators’(9) – essentially rules that dictate process and approach. ‘Ill-structured domains’ do not have such rigid constraints. There are subjective elements within English and often innumerable ways of approaching a task, and different approaches may be considered of equal worth and demonstrate a comparable level of proficiency. The variables within analytical writing can, like the colours within a painter’s palette, be arranged in numerous and diverse patterns, and these different configurations can be judged to contain equivalent skill and quality. Despite this, the researchers make the important point that ‘the cognitive architecture … does not distinguish between well-structured and ill-structured problems’,(10) meaning that the findings of Cognitive Load Theory apply to all domains. The researchers also explain that ‘the solution variations available for ill-structured problems are larger than for well-structured problems but they are not infinite and experts have learned more of the possible variations than novices’.(11) Over the years, teachers have read, thought about and produced innumerable pieces of analysis and, as a result, have developed rich schemata of this kind of knowledge which they can recall, choose from and apply when dealing with problems.

Sweller et al. point out that ‘even though some exposure to worked examples is used in most traditional instructional procedures, worked examples, to be most effective, need to be used much more systematically and consistently to reduce the influence of extraneous problem-solving demands’(12). A five-year curriculum that systematically and consistently uses worked examples should help students build rich schemata of ‘possible variations’,(13) moving them more quickly and efficiently along the continuum from novice to expert than if they had just completed lots of writing tasks. The constant studying of concrete worked examples is far superior to describing proficiency using abstract and often vague descriptors and success criteria. When describing complex performance in the absence of concrete examples (which is the purpose of a mark scheme), the sheer breadth and possible variation of what is being described necessitates a wide lens of representation. While this is advantageous to the expert, allowing complexity to be summarised and condensed, it is obfuscatory and perhaps even meaningless for students. Experts have abundant and detailed schemata that exemplify abstract terms like ‘critical analysis’, ‘judicious references’ and ‘contextual factors’; novices do not.

In Cognitive Load Theory, two studies directly relevant to English are referenced. In the first,(14) students were given extracts from Shakespearean plays, half receiving texts with accompanying explanatory notes, the other half receiving no additional notes. Perhaps unsurprisingly, the group who were given the notes performed better on a comprehension task. In the other study,(15) students were given an essay question to answer. One group received model answers to study; the other did not. The study found that ‘the worked example group performed significantly better than the conventional problem-solving group’(16).

What does this look like in English?

If we want students to perform well in complex tasks like writing, we should be giving them the necessary information ‘directly and explicitly’. Echoing Engelmann’s sentiment that we should ‘teach everything students will need’,(17) the work of Sweller et al. also points to the superiority of explicit, direct instruction, approaches that seem more efficient and effective for novice learners. With regard to English, we should be explicitly teaching sentence structures and vocabulary. We should provide this information to students when they are completing extended writing, and one way of doing this is through vocabulary tables that contain definitions and examples: not just examples of how the vocabulary words are used, but also examples of the sentence styles that students should include. Each of these example sentences is a worked example in itself and, with effective teacher questioning and annotation, can be a powerful way of turning abstract and amorphous success criteria (‘use sophisticated sentences’/‘use a range of complex sentences’ etc.) into concrete examples that the learner can ‘study and emulate’.(18)

To minimise cognitive load, students have these tables when they are annotating the poem, allowing them to make the link between text and interpretation.

Although Cognitive Load Theory contains a number of different effects, the worked example effect is described by the researchers as being ‘the most important’;(19) and, because of this importance, we have incorporated it into all stages and aspects of our curriculum. Almost always, when students are asked to write, they will have studied a related and relevant worked example.

If you would like to know more about cognitive load theory, here are some useful resources:

1) Greg Ashman’s blog has many detailed posts about CLT.(20)

2) This succinct and practical summary.(21)

3) Oliver Caviglioli’s fantastic graphic overview of Cognitive Load Theory by Sweller, Ayres and Kalyuga.(22)

Parts of this article first appeared on Tom Needham’s blog. Reproduced with permission.


1. Chase, W. G. and Simon, H. A. (1973) ‘Perception in chess’, Cognitive Psychology 4 (1) pp. 55–81.

2. Sweller, J., Ayres, P. and Kalyuga, S. (2011) Cognitive load theory. Berlin: Springer, p. 21.

3. Ibid. p. 31.

4. Ibid.

5. Ibid.

6. Ibid. p. 99.


8. Ibid. p. 102.

9. Ibid.

10. Ibid.

11. Ibid.

12. Ibid. p. 100.

13. Ibid. p. 102.

14. Oksa, A., Kalyuga, S. and Chandler, P. (2010) ‘Expertise reversal effect in using explanatory notes for readers of Shakespearean text’, Instructional Science 38 (3) pp. 217–236.

15. Kyun, S. A., Kalyuga, S. and Sweller, J. (in preparation) The effect of worked examples when learning English literature.

16. Sweller, J., Ayres, P. and Kalyuga, S. (2011) p. 101.

17. Engelmann, S. (2014) Successful and confident students with direct instruction. Eugene, OR: NIFDI Press, p. 35.

18. Sweller, J., Ayres, P. and Kalyuga, S. (2011) p. 100.

19. Ibid. p. 108.