The psychology of behaviour management (part 1)

Behaviour management in classrooms has become a hotly contested topic in recent years, both in the UK and abroad. But although it is commonly understood as a priority for effective teaching, many teachers complain that training in this area is often insufficient or under-evidenced. In the first of a series of features, Nick Rose explores some of the science behind the major schools of thought in this area.

The topic of behaviour management and the problems teachers face in dealing with disruption to lessons continues to provoke strong argument within the profession. The extent of the problem was explored in a 2014 paper by Terry Haydn (1), which argued that, whilst ‘official’ reports such as Ofsted inspections appeared to rate behaviour as at least ‘satisfactory’ in the majority of schools, there was evidence that deficits in classroom climate continued to be a serious and widespread problem. Blogs detailing the sorts of issues found in school approaches to behaviour are plentiful (2).

Systems of rewards and punishments have long been the norm in schools, but perhaps because of a growing feeling that behaviour has become increasingly difficult to manage, behaviour management has become the focus of experimentation. Some schools have started looking for novel solutions to the problem of disruption in lessons (for example, Kilgarth School in Birkenhead, UK was recently reported to have ‘banned’ punishment altogether); others believe that proportionate sanctions need to be available to teachers as a deterrent. In 2015, the UK government set up a working party, led by Tom Bennett, to develop better training for new teachers and showcase effective practices in schools.

One controversial approach has been to move schools away from systems of reward and punishment towards a restorative justice approach. Originally developed within the context of police work, the idea of restorative practice involves conversations between ‘offender’ and ‘victim’ (or the student and teacher) to give them an opportunity to discuss how they have been affected by events and to decide what should be done to move forward. There are claims that this approach can improve behaviour and results (3), but critics argue that such policies are making schools less safe (4). Whilst the link is not always made explicit, many of the processes appear to draw upon techniques used in cognitive behavioural therapy (CBT). For example, Restorative Thinking is an organisation that works with schools to implement restorative practices and makes the link to CBT and other forms of therapy explicit.

Another approach has come from Doug Lemov’s Teach Like a Champion (5). Lemov’s approach involves using standardised routines to create a positive classroom climate. The system has sparked considerable interest in the UK, but also has many critics.

Most teachers likely already use some combination of these various approaches, but teachers may not be aware of the psychological theories and practices which they are (implicitly or explicitly) based upon. Over three articles, I want to briefly explore these psychological underpinnings in the hope they help explain some of the advantages and limitations of each system.

Part 1: Behaviourism

‘Behaviourist’ is sometimes used in a pejorative way when describing behaviour management systems, but schools using some sort of system for rewarding or sanctioning behaviour are implicitly using a behaviourist approach.

Behaviourism was a term coined by John Watson in an article published in 1913, but its roots go back to the famous studies by Ivan Pavlov (who discovered classical conditioning as an accidental sideline to his Nobel Prize-winning research on digestion). However, the behaviourist most associated with education is B. F. Skinner. Much misunderstood, and often unfairly maligned, his theory of operant conditioning continues to influence schools to this day.

Drawing on the earlier work of Edward Thorndike, Skinner developed his theory of operant conditioning by exposing animals like rats and pigeons to carefully controlled stimuli and recording their responses (a setup often referred to as a ‘Skinner box’). Skinner identified a variety of techniques which could be used to shape animal behaviour and wrote about how these might be applied to human behaviour (and education specifically).

The core ideas within operant conditioning are reinforcement and punishment. Very simply, when an animal receives reinforcement after performing a behaviour, it is more likely to repeat that behaviour. Conversely, receiving a punishment after performing a behaviour leads the animal to be less likely to repeat that behaviour in future. Skinner further described reinforcements and punishments as being ‘positive’ or ‘negative’ in character: ‘positive’ means that something is added following the behaviour, and ‘negative’ means that something is taken away.


Skinner’s rather harsh reputation means that many teachers are surprised to discover that he was very much against the use of punishment in schools. Skinner believed that one of the major disadvantages of punishment is that, even where it is consistently applied, it merely temporarily suppresses an undesirable behaviour.

Severe punishment unquestionably has an immediate effect in reducing a tendency to act in a given way. This result is no doubt responsible for its widespread use. We ‘instinctively’ attack anyone whose behavior displeases us – perhaps not in physical assault, but with criticism, disapproval, blame, or ridicule. Whether or not there is an inherited tendency to do this, the immediate effect of the practice is reinforcing enough to explain its currency. In the long run, however, punishment does not actually eliminate behavior from a repertoire, and its temporary achievement is obtained at tremendous cost in reducing the over-all efficiency and happiness of the group (6).

Contrary to his rather cold, clinical popular reputation, Skinner was a compassionate humanitarian (he won the American Humanist Association’s ‘Humanist of the Year’ award in 1972) who wanted science to help shape a better society by utilising rewards rather than punishment in order to promote pro-social behaviour. (I suspect he’d have approved of Kilgarth School’s decision to ‘ban’ punishment, for instance.)

However, the issue around the effectiveness of punishment is rather more complex than Skinner believed. For example, a fascinating meta-analysis by Balliet and Van Lange (7) examined whether punishment was more effective at promoting cooperation in high- or low-trust societies. They reviewed 83 studies involving 7361 participants across 18 societies and reached a rather surprising conclusion: punishment appears to promote cooperation more effectively in societies with high trust. In essence, they argue that where there is a great deal of trust, members of a society adhere to norms that encourage both cooperation and the punishment of those who defy cooperative social norms. Where there is a lack of trust, the authors argue, social norms may be less strongly shared and enforced, and so punishment may be less effective:

A willingness to pay a cost to punish others, especially noncooperative others, is likely to be viewed as a strong concern with collective outcomes. At the same time, such benevolent views of costly punishment may be more likely to occur in societies that contain higher amounts of trust in others, which we conceptualized earlier in terms of beliefs about benevolence toward the self and others.

An important question for future research is whether ‘benevolent punishment’ is as effective at an organisational level (e.g. a school) as it appears to be at a societal level. If it is, the implication would be that in benevolent, high-trust environments, the proportionate use of punishment to support cooperative social norms can be effective.

Another reason why punishment may be effective is a phenomenon called ‘loss aversion’. The work of Tversky and Kahneman suggests that there is an asymmetry between the effects of positive reinforcement and negative punishment: where people weigh up similar gains and losses, they tend to prefer avoiding losses to making gains. For example, Hackenberg (8) reports an experiment where the value of a loss was worth approximately three times more than a gain. It seems highly likely that this effect might also apply to the sorts of token reward systems employed in schools, suggesting that negative punishment (e.g. loss of merits) may be more motivating than opportunities to gain merits.
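The asymmetry described above can be made concrete with a small worked sketch. It is only an illustration under stated assumptions: the function name `subjective_value`, the simple linear weighting, and the default factor of 3.0 (taken from the ‘approximately three times’ figure mentioned above) are all hypothetical, and this is not a model taken from Hackenberg or from Tversky and Kahneman.

```python
def subjective_value(merit_change, loss_weight=3.0):
    """Return how much a change in merits might 'feel' like to a student.

    Assumption: losses are simply multiplied by loss_weight, while gains
    count at face value. The 3.0 default is illustrative only.
    """
    if merit_change < 0:
        return loss_weight * merit_change
    return float(merit_change)


print(subjective_value(1))    # 1.0  (gaining one merit)
print(subjective_value(-1))   # -3.0 (losing one merit)

# Under this assumption, three merits gained are needed to offset one lost.
print(subjective_value(3) + subjective_value(-1))  # 0.0
```

On this reading, a student who gains three merits and then loses one would feel no better off than at the start, which is one way of picturing why the threat of losing merits might motivate more than the chance of gaining them.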


Skinner believed that rewards were the most effective way of shaping behaviour and focused a great deal of his research on attempting to find out the most effective patterns of reinforcement. In his ‘Skinner box’ experiments, he was able to carefully control the ‘schedule of reinforcement’ and measure the concomitant changes in the desired behaviour.

Schedule of reinforcement | Example
Fixed ratio | A student receives a reward after a fixed number of times they perform a desired behaviour (e.g. a merit every time they attempt an extension question)
Variable ratio | A student receives a reward after a variable number of times they perform a desired behaviour
Fixed interval | A student receives a reward after a fixed period of time in which they perform the desired behaviour (e.g. a merit for working hard for 5 minutes)
Variable interval | A student receives a reward after a variable period of time in which they perform the desired behaviour
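
For readers who find it easier to see the four schedules as rules, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names `RatioSchedule` and `IntervalSchedule`, the helper `rewarded_times`, and the choice to draw the variable requirements from a uniform range around the average. It is not a reconstruction of Skinner’s apparatus, only a way of showing when a reward would be handed out under each rule.

```python
import random


class RatioSchedule:
    """Reward a response once enough responses have been made."""

    def __init__(self, mean_ratio, variable=False, rng=None):
        self.mean_ratio = mean_ratio
        self.variable = variable
        self.rng = rng or random.Random()
        self.count = 0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean_ratio
        # Assumption: uniform over 1..(2 * mean - 1), which averages mean_ratio.
        return self.rng.randint(1, 2 * self.mean_ratio - 1)

    def respond(self, t):
        """Register a response at time t; return True if it earns a reward."""
        self.count += 1
        if self.count >= self.required:
            self.count = 0
            self.required = self._next_requirement()
            return True
        return False


class IntervalSchedule:
    """Reward the first response made once enough time has passed."""

    def __init__(self, mean_interval, variable=False, rng=None):
        self.mean_interval = mean_interval
        self.variable = variable
        self.rng = rng or random.Random()
        self.last_reward = 0.0
        self.required = self._next_requirement()

    def _next_requirement(self):
        if not self.variable:
            return self.mean_interval
        # Assumption: uniform over 0..(2 * mean), which averages mean_interval.
        return self.rng.uniform(0, 2 * self.mean_interval)

    def respond(self, t):
        """Register a response at time t; return True if it earns a reward."""
        if t - self.last_reward >= self.required:
            self.last_reward = t
            self.required = self._next_requirement()
            return True
        return False


def rewarded_times(schedule, response_times):
    """Return the times at which a response was rewarded."""
    return [t for t in response_times if schedule.respond(t)]


# A student shows the desired behaviour once a minute for ten minutes.
minutes = range(1, 11)
print(rewarded_times(RatioSchedule(4), minutes))     # [4, 8]
print(rewarded_times(IntervalSchedule(3), minutes))  # [3, 6, 9]

# The variable versions reward at unpredictable points (output depends on the seed).
print(rewarded_times(RatioSchedule(4, variable=True, rng=random.Random(1)), minutes))
print(rewarded_times(IntervalSchedule(3, variable=True, rng=random.Random(1)), minutes))
```

The fixed schedules produce a regular, predictable pattern of rewards, whereas the variable schedules reward at irregular points that the student cannot anticipate.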

Intuitively, teachers see the need for consistency where punishments are applied and I’ve sometimes heard teachers argue that rewards should be given with equal consistency. However, Skinner’s work on ‘schedules of reinforcement’ appears to show that such systems tend to be relatively ineffective. The problem with systems seeking high consistency in rewarding students is that whilst the student’s behaviour may be swiftly modified, the desirable behaviour may become highly contingent upon the presence of the reward. The odd thing about rewards is that they appear to work better when they are slightly unpredictable. A simple summary of these differences:

Schedule of reinforcement | Advantages and disadvantages
Fixed ratio | Behaviour changes quickly; extinction occurs quite rapidly when rewards cease
Variable ratio | Behaviour changes quickly; extinction occurs slowly when rewards cease
Fixed interval | Behaviour changes more slowly; extinction occurs quite rapidly when rewards cease
Variable interval | Behaviour changes more slowly; extinction occurs quite slowly when rewards cease

In Skinner’s experiments, the extinction rates (the rates at which the desired behaviour stopped being performed) were quickest where there was continuous reinforcement (i.e. a reward given every time the behaviour was performed). Where there was variability in the time interval or ratio, the behaviour persisted for longer in the absence of reinforcement. Skinner believed this explains the ‘power’ of the slot machine: the fact that playing it is unpredictably rewarded by a pay-out encourages the person to continue playing, even when they hit a long losing streak.

In schools, sometimes these reward systems take on the structure of ‘token economies’ (systems also used in prisons and psychiatric units – where individuals earn tokens for ‘good behaviour’ which can be used to purchase privileges). However, whilst explicit reward schedules have been used with children (e.g. children with ADD or autism), reward systems have a number of problems which often undermine their use in schools.

One issue is ‘satiation’: older children in particular rapidly lose interest in the tokens (e.g. merit stickers) or even primary reinforcers (e.g. sweets) that teachers hand out for desirable behaviour. I recall a student teacher handing out sweets to reward Year 10 students for answering questions in class. Many of the students took part, but I noticed one lad sat there scowling with his arms crossed. Chatting to him, it was clear he knew many of the answers so I asked why he wasn’t putting his hand up – he said, ‘What’s the point? I can just buy my own sweets if I want them.’ This problem often leads into what I call ‘reward inflation’, as teachers either have to constantly find novel rewards or end up handing out more and more tokens to elicit the same desirable behaviour.

Another issue is that reinforcement can have negative effects. It’s devilishly hard in a class of 30 students to accurately assess how much effort students have genuinely put into their class or homework. Giving praise or a merit for work which actually required little effort may inadvertently imply that you have low expectations of that student.

Lastly, children aren’t stupid. They rapidly learn when they are being manipulated by a reward system and sometimes manage to turn the tables on the teacher by learning to manipulate the criteria used to elicit a reward. I knew one teacher who, in an attempt to tame a particularly difficult class, had managed to trap themselves into handing out four or five merits to a number of the naughtiest children every lesson.

Two great articles by Daniel Willingham further explore some of these problems: ‘Should learning be its own reward?’ (9) and ‘How praise can motivate – or stifle’ (10). At the end of this second article, Willingham summarises the way a teacher’s most common form of positive reinforcement – praise – might best be utilised:
Praise should be sincere, meaning that the child has done something praiseworthy. The content of the praise should express congratulations (rather than express a wish of something else the child should do). The target of the praise should be not an attribute of the child, but rather an attribute of the child’s behavior.

In summary

Whilst the term ‘behaviourist’ is used in a pejorative way by some teachers, Skinner desired that his research be used to create societies where reinforcement (rather than punishment) would encourage people to do the right thing. There’s an enormous amount that schools could potentially learn from the classic works on operant conditioning and ways to run token economies (which most school reward systems tend to resemble).

However, there are some interesting reasons why some of Skinner’s ideas may need updating. Benevolent punishment and negative punishment (which may tap into our innate loss-aversion bias) may in some cases be equally or more effective than rewards, which themselves seem to work best when they are deserved but a little unpredictable. Both punishment and reward can potentially be used to effectively support behaviour in schools.

In the next article in this series, I’m going to take a similar look at the topic of ‘restorative practices’ and some of the ideas from cognitive-behavioural therapy which underlie many of the systems used in schools.

This article originally appeared on the Evidence into Practice blog and has been modified for researchED Magazine.


1. Haydn, T. (2014) ‘To what extent is behaviour a problem in English schools? Exploring the scale and prevalence of deficits in classroom climate’, Review of Education 2 (1) pp. 31–64.
2. Smith, A. (2014) ‘Seven signs of a “good enough” discipline system’, Scenes From The Battleground [Blog].
3. Wall Street Journal (2015) ‘Suspension, restorative justice and productive schools’, April 8.
4. Sperry, P. (2015) ‘How liberal discipline policies are making schools less safe’, New York Post [Online], March 14.
5. Lemov, D. (2015) Teach like a champion 2.0. San Francisco, CA: Jossey-Bass.
6. Skinner, B. F. (1953) Science and human behavior. New York, NY: Macmillan, p. 190.
7. Balliet, D. and Van Lange, P. A. M. (2013) ‘Trust, punishment, and cooperation across 18 societies: a meta-analysis’, Perspectives on Psychological Science 8 (4) pp. 363–379.
8. Hackenberg, T. D. (2009) ‘Token reinforcement: a review and analysis’, Journal of the Experimental Analysis of Behavior 91 (2) pp. 257–286.
9. Willingham, D. T. (2007) ‘Should learning be its own reward?’, American Educator 31 (4) pp. 29–35.
10. Willingham, D. T. (2005) ‘How praise can motivate – or stifle’, American Educator 29 (4) pp. 23–27, 48.