Table of Contents
Recap
Establishing a Common Vocabulary
--Mathematics
--Learning
--Attention
--Expert
--Fluency
--Understanding
--Attainment
--Ability
Types of Professional Knowledge
The Importance of Effort
Blending Approaches
Phasing Teaching
--Moving from Current Practice to Mastery Approach
--Types of Questioning
Proportioning Content
The Importance of Knowledge
--Inflexible Knowledge
--Continuing our Shared Language
Human Cognitive Architecture
--Is mathematical problem solving biologically primary or secondary?
Key Principles in Cognitive Science for Learning
--Working memory
--Cognitive Load
--Anxiety and Cognitive Load
--The Difference Between Novice and Expert
--Generic Cognitive Skills and Domain Specific Skills
--Information Store Principle
--Relationships
--Teach Everything Correctly First Time
--Narrow Limits of Change Principle
--Worked Example Effect
--Split Attention Effect
--The Redundancy Effect
Variation Theory
--Mathematical Confidence
Storage and Retrieval
--Performance is not the same as Learning
--The Importance of Forgetting
--Desirable Difficulties
--The Testing Effect
--Testing Potentiates Learning
--Marking and Feedback
--The Hypercorrection Effect
--Better Multiple Choice Questions
--Massed vs Spaced Practice
--Implications for Overlearning
--Blocked vs Interleaved Practice
--The Generation Effect
--Performance is not a Good Proxy for Learning
--Teachers and pupils can be fooled
--The Teacher Parable
Moving from Propositional to Strategic Knowledge
Recap
Part 1 and Part 2 of this blog concerned
the background efficacy of the approach, its history, evidence base and
evolution over the past 100 years in particular, and how schools could go about
adopting the approach in practice.
In this part, I will consider the key
logistical and pedagogical considerations for ensuring a mastery approach has
as great an impact as possible. The blog
is broken into chapters, which have been written as a narrative but can also
stand alone should you wish to dip in and out.
By taking Aristotle’s 1-to-1 model of
educating, where a tutor works with his pupil, Carleton Washburne was able to
create the mastery model of learning.
Central to these ideas, as demonstrated by Burk and Ward in the 1910s,
was that all pupils can learn all content if they are starting at the right
point and given the right amount of time.
Morrison further emphasized the importance of considering all pupils
individually with his introduction of correctives to Carleton’s model in the
1920s.
It is critical that teaching should start
from what the pupil already knows. The
new learning will build on this and be just beyond already embedded knowledge
and understanding. This makes the
mastery model of schooling incompatible with non-homogenised groupings of
pupils. That is, mixed ability and mixed
attainment classes, where the gap between highest and lowest is large, are
anathema to the mastery approach.
The model stood the test of time and was
generally accepted as being impactful, but it was not until Bloom picked up the
story that the rigour of research and evidence was able to confirm what Carleton
had asserted.
Bloom was introduced to the idea of
mastery and the potential impact by his friend, John B Carroll. Carroll argued that all pupils can learn well
given the right conditions. In 1963, he
embarked on a lifetime’s work to prove this was true. Carroll’s Model of Schooling showed that
ability is an index of learning rate – all pupils can learn, but they require
different amounts of time. He also
emphasized the importance of instruction design and resource design, which
needed careful thought and planning if a pupil’s attention was to be drawn to
the relevant information and ideas.
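Carroll's 1963 model is commonly summarised as a ratio of time spent to time needed. A minimal sketch in symbols follows; the wording of the terms is a paraphrase rather than a quotation from Carroll or from this blog:

```latex
\[
\text{degree of learning} \;=\; f\!\left(\frac{\text{time actually spent}}{\text{time needed}}\right)
\]
```

Read this way, ability changes the denominator (how much time a pupil needs), not whether the pupil is able to learn the content at all.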
In addition to the quality and type of the
instruction and materials, Carroll highlighted the same key ingredient that
both Aristotle and Washburne had insisted on as being essential for learning to
take place: effort.
The results of successfully implementing a
mastery approach are profound, with much greater numbers of pupils learning
well than previously. Carroll, Bloom,
Block, Guskey and others find similar results: a significant shifting of the distribution
of those who reach particular levels of attainment.
The distribution of attainment in a
traditional setting follows broadly a normal curve, but in mastery settings,
the distribution is significantly skewed towards greater levels of attainment.
As discussed previously, we are interested
in long term memory. The challenge is to
ensure as many pupils as possible learn well.
The mastery cycle (below) describes the
logistical issues in running the approach, but inherent in the approach is that
all teaching results in learning. That
is to say, all teaching results in a change to the long term memory. This must be the case in all learning
episodes.
The fabulous Oliver Caviglioli recently created a poster version of my mastery cycle, which can be downloaded from his website.
In this part of the blog, I explore the
crucial approaches and pedagogies necessary to make this happen.
Establishing a Common Vocabulary
Many of the words we use in education are
also commonly used in day to day language.
This often results in confusion and mixed meanings. But the words we use in the science of education
are well defined and have specific meaning.
Before continuing, to avoid ambiguity, I will set out some key words and
their definitions.
Mathematics
I take ‘mathematics’ to mean a way of
existing in the universe. Mathematicians
are curious in all aspects of their lives.
Mathematicians, when faced with a problem, enjoy the state of not yet
knowing the resolution (indeed, knowing there may not even be a
resolution). Because they are curious,
mathematicians, when faced with a problem, ask themselves questions of it. They can specialise, pattern spot,
conjecture, generalise, try to disprove, argue with themselves, monitor their
own thinking, reflect and notice how these new encounters have changed them as
a human being. That is to say,
mathematics is an epistemological model: a way of considering the very nature
of knowledge.
Sadly, in many of the North-Western
cultures, children have been conditioned to believe that mathematics is about
wading through questions, getting ‘right’ or ‘wrong’ answers. This is a confusion to mathematicians, since
it does not represent our domain at all.
Mathematicians are not in the business of answering lists of
questions. Rather, they meet scenarios
and, driven by their curiosity, create their own questions and follow their own
lines of enquiry. Many of these lines of
enquiry result in unexpected results, but we do not consider these to be
‘wrong’, simply not what we thought would happen. Often, great discoveries in mathematics have
resulted from lines of enquiry that lead to unexpected results.
Mathematicians enjoy being stuck. They revel in the initial apparent
impenetrability of a scenario and understand that by attacking it in a
structured way, enlightenment can arise.
Learning
Learning is the bringing about of some change
in the long term memory. When faced with
enabling a pupil to learn a novel idea, skill or information, as teachers we
are concerned with changing their long term memory.
First of all in terms of embedding the
novel idea in the long term memory in the form of some mental representation
that can be thought about and, secondly, in assimilating this new mental
representation into the schema of knowledge and ideas that already exists.
Attention
Dan Willingham’s phrase ‘memory is the
residue of thought’ is a handy reminder to us that we are seeking to change the
long term memory. But ‘thought’ is too
loose a definition. When faced with a
mathematical problem, scenario or task, a pupil may well be ‘thinking’ about
it, but they may just be thinking, ‘this is crap’. Instead, it is a very specific aspect of
thinking that results in a change in the long term memory: attention. Attention is focused and deliberate. We are interested in what pupils are
attending to, not just what they are thinking about. When presented with a novel mathematical
idea, we want pupils to be attending purely to the mathematics (or as pure as
is achievable in reality) and the mathematical structure. We also want to draw attention to how the
mathematical idea relates to knowledge they already have in their current schema.
Giving attention is difficult; it requires
focus and a belief that what is being attended to is important. This hard, deliberate process is how the long
term memory is changed.
Expert
Expertise relates to layers of
attention. As one becomes more expert,
one can attend to higher layers of attention.
For example, as a child I learned how to play the piano. When learning to play the piano, one needs to
give huge amounts of attention to the position of one’s hands, their movement,
the pressure each finger is exerting, the meaning of musical symbols, and so
on. As one becomes more expert at
playing the piano, one can attend to higher level aspects. Nowadays, when I play the piano, I have
absolutely no idea where my hands are or what they are doing. I can attend to higher levels such as melody,
composition or beauty. The process of
learning is the process of becoming more expert. It is never ending, there is always more one
can attend to. What a sad state of affairs
it would be if one day, one simply closed the lid of the piano and said, ‘well,
that’s the piano finished!’
Mastery is about becoming more expert, not
about ‘mastering’ things. Crucial to the
mastery approach is the recognition that there is always more to learn.
Fluency
We consider someone to be fluent in a
skill, idea, concept or facts at the point at which they no longer need to give attention. It is important to note that fluency is
simply the state of attention not being necessary in order to perform, but this
does not mean that one couldn’t, if one wanted to, choose to give
attention. Considering the piano example
again, although I don’t know where my hands are or what they are doing because
I no longer need to give attention to that aspect of performing, I can choose to give attention to it. I might, for example, see another pianist do
something with their hands and think, ‘gee, that’s interesting, how did she do
that?’ I can then give deliberate
attention to that lower level aspect.
Quite often, when learning mathematics, great new insight comes from
choosing to give deliberate attention to an area of mathematics one is already
fluent in. So, fluency is when attention
is no longer necessary. Attention is
hard and effortful; fluency is effortless.
When learning to manipulate algebraic expressions, for example, pupils
need to give a lot of attention to the rules and conventions in order to carry
out even simple rearrangements, but as they become fluent, this becomes
effortless and they can attend to other, higher level aspects such as what the
underlying relationships between variables are.
Understanding
Let’s imagine mathematics as a complex web
of interconnected ideas (it is, of course, not a web, but the analogy holds
well and is useful).
Often, understanding is described in quite
a woolly way. People will say, for
example, it is the number of connections or the ability to use the idea in
another area of mathematics. But
understanding has a much more precise meaning.
The mathematics is understood if
its mental representation is part of a network of representations. The degree
of understanding is determined by the number and strength of its connections. A
mathematical idea, procedure, or fact is understood thoroughly if it is linked
to existing networks with stronger or more numerous connections. – Hiebert and
Carpenter, 1992.
Mathematical ideas are connected and, as
pupils mature, they assimilate new ideas into their schema in the form of mental
representations. These representations
form a map of mathematics that can continue to grow – there is no limit to the
number of connections we can make.
Understanding is about the reasons why the connections are true. Again, there is no limit to the depth of
reasoning one can make, so understanding can be thought of as infinite.
The depth and strength of the reasoning
why connections are true is what we define as understanding. This has a beautiful corollary: understanding
never ends, there is always more that we can understand about ideas.
Understanding is not a dichotomous
state, but a continuum . . . Everyone understands to some degree anything that
they know about. It also follows that understanding is never complete; for we
can always add more knowledge, another episode, say, or refine an image, or see
new links between things we know already. – White and Gunstone, 1992.
Attainment
Attainment is the point that a pupil has
reached in learning a discipline. It can change; pupils can unlearn as
well as learn. It is not precise. But it is very useful in determining
appropriate points on a curriculum from which to springboard pupils to new
learning. We, as educators, continually assess these attainment points so as to
best ensure the curriculum we are following can adapt and flex to what has been
understood or forgotten. Knowing the prior attainment of pupils (rather than
what has been previously presented at them) is crucial if we are to ensure
pupils are learning appropriate new ideas and concepts.
Ability
Ability is an index of learning rate. It
is the readiness and speed at which a pupil can grip a new idea. It can change;
as with all human beings, pupils will make meaning from some metaphors, models
or examples, more readily than they will of others. In mathematics, for
example, we often see pupils quickly understanding some numerical pattern, say,
who then take a long time to grip a geometrical relationship. An individual can
have a high index of learning rate during some periods of their life and a low
one at others. Again, as educators, we are continually assessing ability so
that we are best able to judge the amount of time, additional practice, new
explanations or support that a pupil needs in order to really grip an idea.
Knowing the ability of a pupil (rather than woolly ideas of engagement or
enjoyment) is crucial if we are to ensure that pupils are learning new ideas
and concepts for the appropriate amount of time (rather than some arbitrary
amount of time presented on a scheme of work).
A common misconception is that pupils who
are low attaining are also low ability.
This misconception arises when ‘conveyor belt’ approaches to curriculum
are deployed rather than a mastery approach.
In a conveyor belt approach, coverage rather than learning is the focus. Teachers race through objectives and teach
all pupils the same content as mandated by a scheme of work on any given week
or day. This results in low attaining
pupils being asked to learn material they are simply not ready for. The gap from their true starting point to
what they are being asked to grip is a severe handicap, so they appear to be
slow learners. But, in a mastery
approach, where we ensure that all pupils are learning the right level of
content for the right amount of time, low attaining pupils are being taught
content just beyond their current understanding and so can assimilate and
connect the new learning much more readily, leading to fluency and then
understanding. When taught at the right
level, all pupils can learn at pace.
Types of Professional Knowledge
Many teachers of mathematics enter the profession with high levels of mathematics content knowledge. This knowledge is connected to, but not the same as, mathematics pedagogical knowledge. Knowing how to bring about learning is complex and requires many years of professional learning to acquire. Some of this knowledge can be studied, reading the best evidence (propositional knowledge), some of it can be acquired through hearing about practice, perhaps a teacher giving a presentation at a CPD event (case knowledge), and, most importantly, some of this knowledge only comes about by teachers experiencing events themselves (strategic knowledge). This strategic knowledge involves teachers thinking about and considering propositional and case knowledge, which they then develop further based on actual practice in real classrooms.
What is set out in this blog has had to
pass the test of the three types of knowledge.
Propositional knowledge is incredibly useful and stimulates professional
enquiry, but many aspects of education research cannot be replicated beyond
laboratory conditions, so, although theoretically interesting, those ideas do
not form part of the mastery approach.
Only testable ideas that are able to be applied to real classrooms are
considered here.
The Importance of Effort
Both conceptual understanding and
procedural fluency are necessary in learning mathematics, but they are not
sufficient. As Kilpatrick, Swafford and
Findell (Adding It Up: Helping Children Learn Mathematics, 2001) remind us,
pupils must also have strategic competence (the ability to solve problems),
adaptive reasoning (the capacity for reflecting and reasoning, which leads to
understanding) and, critically, productive disposition (a belief that one’s own
effort matters).
These combined give the conditions for
learning. Often the most important of
these, effort, is shied away from to great detriment. Schools avoid honest conversations with
pupils and parents, yet it is this honesty that can bring about huge gains in
learning. Families and pupils need to
understand that their success is a result of their effort and their failure is
a result of their laziness. A pupil can
have the worst teacher, be at the worst school, have shoddy books, yet still learn
well because they put in great effort.
Conversely, a pupil may attend the best school in the world and be under
the instruction of an amazing teacher who uses the very best materials, yet
completely fail to learn because they expend no effort. Effort matters. A lot.
Washburne’s mastery model was based on the
teachings of Aristotle. Central to the
model is the recognition that effort matters and, further, that pupils
understand that it is their own effort that determines their success.
Where pupils recognise this, the impact is
profound. Generally, in the North
Western cultures today, pupils and families have surrendered their agency. Pupils routinely blame their failure in a
lesson or on a task or test on their perceived quality of the teacher, rather
than realising they are the key driver.
Attitudes and beliefs around which factors
influence success vary wildly around the world.
Where pupils understand that the main
factor in success is their own effort, the impact on attainment is
significant. A recent
McKinsey analysis of attainment against self-efficacy showed that pupils
in the most disadvantaged circumstances who believe their own effort is key
outperform pupils in the most advantaged settings who believe success is a
result of external factors.
By examining sub-sets of pupils who have
undertaken PISA tests, John Jerrim was able to identify that East Asian children
attending Australian schools far outperformed native Australian children. In his paper, “Why
do East Asian children perform so well in PISA? An investigation of
Western-born children of East Asian descent” (John Jerrim, 2014),
Jerrim concludes that the hard work ethic of these children is a key factor in
them outperforming native pupils by two and a half years of learning. This conclusion is similar to that of Feniger
and Lefstein (2014).
The Nuffield Foundation research paper, Values
and Variables – Mathematics Education in High Performing Jurisdictions
(2010), again points to the importance and impact of a culture of
self-efficacy.
This belief in hard work and the
transformative impact that effort makes is central to the mastery approach. It is therefore incumbent upon educators to
be direct and honest with pupils and their families that they play an active,
not passive, role in their learning.
Blending Approaches
Mastery is an entire and complete model of
schooling. There are many models that
exist, having varying degrees of impact both in terms of the currency they give
to pupils (school grades) and long term engagement in a subject or discipline
(e.g. whether or not pupils pursue mathematics at higher education or enter
mathematical careers later in life).
Much debate occurs around which model to adopt.
Two models that might be seen as being at
the extremes are Inquiry Learning and Teacher Directed Instruction. At the extremes of these lie Discovery
Learning and Direct Instruction (here I take Direct Instruction to mean the scripted
intervention programme arising from Project Follow Through). Advocates of both often take the view that
the approaches are mutually exclusive.
Washburne, rightly, understood that education is nuanced and rarely are
such fanatical positions helpful.
It has long been an element of the mastery
cycle that instruction is varied in order to allow as many opportunities for
‘meaning making’ as possible. The
approach very much embraces teacher instruction, but also includes time for inquiry. It can be shown that, at the extremes, Direct
Instruction does indeed lead to good outcomes in terms of pupil performance on
tests, but not optimal performance. As
models move towards teacher direction in all lessons, performance passes a
plateau and begins to reduce.
Equally, by increasing the opportunity for
pupils to undertake suitable inquiry, performance initially increases, but
quickly begins to worsen.
By blending both direct teacher
instruction and appropriate opportunities for inquiry, pupil performance
increases.
We are seeking to strike a balance between
teacher-directed methods and inquiry methods.
Getting the recipe right consists of several key considerations, namely
the type of instruction in each, the order in which the instruction happens, and
the ratio of the methods used.
The pedagogic choices made when phasing
teaching have a significant impact on pupil outcomes.
Phasing Teaching
I will suggest that effective teaching of
a novel idea in mathematics passes through four phases as the pupil moves from
novice to fluency to understanding.
Those phases are
· Teach
· Do
· Practise
· Behave
During the ‘Teach’ phase, the idea is
entirely novel to pupils, though just beyond their current knowledge and
understanding. The teacher will instruct
the pupils, tell them key facts, pass on knowledge, show and describe, use
metaphor and model, all in order to bring about connections in the pupil’s
current schema so that they can ‘meaning make’.
This phase is often described as explicit teaching. It is a crucial phase – after all, the
teacher knows things and the pupil does not; so tell them!
The end of the ‘Teach’ phase does not
result in learning. It is merely the
first step. At this stage the new
knowledge is ‘inflexible’, and it is our job as teachers to bring meaning and
understanding to the knowledge so that it becomes ‘flexible’ (more on
inflexible and flexible knowledge later).
We now ask pupils to ‘Do’. At this stage, they do not yet know or
understand the new idea, they are replicating what the teacher has told or
shown them. The ‘Do’ phase has two
important purposes. Firstly, the teacher
is able to observe whether or not the pupils have made meaning of the model,
example, metaphor or information they have been given or shown. The teacher can see and act; are the pupils
able to replicate what I have demonstrated?
If not, the teacher can change their model, example or explanation, perhaps making stronger and
more explicit connections to previous knowledge and understanding. The second reason for the ‘Do’ phase is to
give pupils a sense that the idea or task is surmountable – that they, quite
literally, can do what they are being asked.
Well structured ‘Teach’ and ‘Do’ builds pupils’ confidence and shows
them there is nothing to be afraid of, the new idea is within their reach.
Once both teacher and pupil are clear that
the pupil is able to ‘Do’ – that is to say, they can perform – the teacher now
segues the pupil to the ‘Practise’ phase.
During ‘Practise’, we wish to move beyond
simply performing. We want the pupil to
gain a confidence in working with the new idea, to see its underlying
relationships and to assimilate the new idea into their schema of
knowledge. In order to achieve these
more meaningful goals, the pupil needs to be able to attend to a higher
level. In other words, as described
earlier, the pupil needs to have achieved fluency at the performing level
first, so that they may attend to connections, relationships and a deeper
conceptual appreciation.
So, we shall define the point at which the
pupil moves from ‘Do’ to ‘Practise’ as the point at which they achieve fluency
(as defined earlier in this blog).
The final phase, ‘Behave’, is the most
important phase. This is the phase that
brings about understanding.
At this stage, teachers create opportunities
for pupils to behave mathematically.
I know of no better description of
mathematical behaviour than the rubric included in John Mason’s 1982 book,
Thinking Mathematically.
This simple flowchart perfectly captures
how mathematicians actually behave.
Our assumption at this stage is the pupil
has become fluent in the new idea or skill, is able to work confidently with
the mathematics and has assimilated the idea into their schema of
knowledge. It is tempting, then, to plan
‘Behave’ tasks that are based on the new mathematical idea, which pupils have
just gripped, but in learning mathematics and, in particular, in thinking
mathematically, maturation matters. The
type of thinking and behaving we want pupils to do at this stage requires an
embedded sense and understanding of the mathematical ideas that will arise.
When planning for the ‘Behave’ phase,
therefore, we will not be asking the pupils to use the novel idea, but instead
to be drawing on well embedded and matured mathematical ideas that connect to
the new learning. The new learning that
has occurred in this learning episode will mature over time as more
connections are made and more opportunities are given to see the idea from
different perspectives. Later in the
journey of learning mathematics, the new idea will be used (many times) in the
‘Behave’ phase. It is incredibly
difficult to determine how mature an idea needs to be before pupils can
‘Behave’ mathematically with that idea, but a good rule of thumb would be
around 2 years.
As an example, suppose the new idea
encountered in this learning episode had been working with fairly interesting
3D trigonometry. At the ‘Behave’ phase, we might be asking pupils to work with
ideas of angle facts or simple Pythagoras, which they will have met much
earlier on. They can see the connection
to the new idea, but it won’t demand that they use it (though there is nothing
wrong in scenarios that make it possible to use the new idea and ideas
beyond!). Not only do pupils get an
appreciation for how their ability to use earlier ideas, which seemed at the
time to be complex and now appear simple and fluent, has become more embedded
and eloquent, pupils are also benefitting from meeting previous ideas again,
bringing benefits of ‘spacing’, which I discuss later in this blog.
Many teachers find it an uncomfortable –
perhaps even illogical – process to plan the ‘Behave’ phase as one that relates
to much earlier learning rather than the new idea, but it is crucial to do so
if we want to bring about optimal gains in learning, understanding and long
term recall.
Moving from Current Practice to Mastery Approach
It has been some time since mastery was
the dominant model of schooling in the UK.
Since the introduction of the National Curriculum in 1988, schools have
almost unanimously adopted a conveyor belt approach (see Part 1 of this
blog). This approach has resulted in an
obsession with coverage rather than learning.
Lessons ‘cover’ content and objectives, but tend not to be concerned
with understanding and long term recall.
Another result of the conveyor belt is the
wholly obtuse belief that learning happens in perfectly apportioned pockets of
time. It is a common feature of schemes
of work to assume that each mathematical idea will be learnt in precisely 1
hour. How serendipitous this would be!
Worse, we even hear apparently responsible
educators, managers and inspectors talking about pupils ‘making progress in 20
minutes’. This is, of course, utter
nonsense. Learning is not linear; it is
highly complex and involves regressing as well as progressing.
I suggest here, as Washburne, Bloom,
Carroll and many others have done before me, that a ‘learning episode’ (the
amount of time required to grip a novel idea) has no fixed time period. Yes, some things can be learnt in an hour, but
some may take weeks or years.
I take ‘learning episode’ to be my measure
when talking about the four phases outlined above. The teacher will flow through the four phases
during the ‘learning episode’, taking the right amount of time necessary
(informed by their observations, discussions, questions and experience).
When one travels around the UK today, the
typical phasing of a learning episode looks something like
Generally, at the moment in the conveyor belt
approach, the teacher will spend a short amount of time demonstrating and
instructing, then ask pupils to work on similar examples. They have to undertake a lot of ‘doing’
before the ideas start to become clear to them.
Eventually, they find they no longer have to give great attention to the
surface level and can begin to discern relationships and concepts. At this stage, the pupils are now practising,
which they are given a large amount of time to do.
In current UK classrooms, most pupils only
ever proceed to this ‘Practise’ phase and the ‘Behave’ phase is entirely
absent. This makes coverage easier –
teachers can ‘get through the curriculum’ – but misses the most important
phase, which means pupils do not get the opportunity to reason, understand,
embed and improve long term recall.
It is a common feature of the current UK
education landscape to hear teachers lamenting the fact that pupils have
forgotten what they have been taught previously. But without the ‘Behave’ phase, they have not
been taught, they have just had presentation and practice. Yes, they can perform, but performance is not
the same thing as learning at all. If
learning did not occur, nor did teaching.
Perhaps the lament should more accurately be the rather unsurprising
statement: my pupils can’t recall something they were never taught!
I suggest that a more impactful phasing
could look like this
Notice the increase in time spent on
explicitly teaching the novel idea, through modelling, examples, metaphors,
information, etc. With an increased
amount of teaching time, pupils are able to move more quickly from the ‘Do’ to
the ‘Practise’ phase. Now, a good
amount of time is reserved for the ‘Behave’ phase. As discussed earlier, and demonstrated in the
McKinsey data, increasing the amount of direct teaching results in greater
gains, but only to an optimal proportion.
In order to achieve the sweet spot between teacher-directed and pupil-inquiry,
the ‘Behave’ phase gives opportunities for meaningful inquiry.
This model is more effective than the
conveyor belt model, since it takes the pupils into the ‘Behave’ phase, which
requires them to make deeper connections and reason and reflect. This time spent considering the ideas at a
deep structure level, rather than just at surface level brings about gains in
terms of long term memory.
However, the above suggested phasing can
be improved further. I suggest that the
following distribution of a ‘learning episode’ is even more powerful.
Here the ‘Teach’ and ‘Do’ phases are
broken up and intertwined, which helps the teacher to hold their own teaching
to account before progressing too far with an idea – a checking activity to
ensure the intended meaning is being received by the pupils before attempting
to build on it. It also helps to space
out the learning of an idea and gives opportunity to disrupt the time spent
thinking about one thing. This important
aspect of learning is further explored later in this blog.
Our goal is to get the benefits of the
‘sweet spot’, the optimal balance between teacher-directed and
pupil-inquiry. There is no hard and fast
rule for the proportion of time spent on each, but a good rule of thumb would be
an approximate 80:20 split between the combined ‘Teach’, ‘Do’ and ‘Practise’
phases and the ‘Behave’ phase.
With this phasing, teachers are carefully
building up an appreciation of the novel idea, ensuring pupils become fluent in
its use, then providing a reflective period in which pupils use earlier, but
connected, ideas with which to undertake mathematical thinking.
Combined, this phasing pulls together
several key benefits for learning that the field of cognitive science has been
confirming over the last 50 years.
Later, I outline the impact on memory that this approach can have.
Types of Questioning
Each phase uses carefully planned and
deliberate types of questioning.
During the first phase, the teacher is
teaching. This teaching is carefully
considered, planned, well executed and explicit. During this phase, the teacher uses questions
as stories, models and examples. These
questions are ‘demonstrated’ – literally, the teacher is demonstrating what
success looks like. They are to the point, accurate and efficient
demonstrations of what pupils might encounter when working with the novel idea
and how to resolve such problems. At
this stage, the novel idea is not known or understood. The most efficient way to get a child to know
a new piece of information or idea is simply to tell them. As teachers, we hold in ourselves a body of
knowledge unknown to the pupils. We
carefully reveal this knowledge at the right time, taking into account the
maturity of their schema, gradually building up their appreciation of our
domain.
The implication is clear: curriculum is
the single most important tool we have at our disposal. A carefully planned route through our subject
– which is not linear, but complex and takes into account forgetting and unlearning
as well as learning – is vital if we are to know when and how to reveal the
canon of our discipline. This journey
through learning a subject spirals upwards as we mature. Ideas are met and then re-met as we grow
older. Earlier ideas suddenly have new
meaning as we can view them from the perspective of maturity, integrating them
with latterly learnt material, shining a new light on them and revealing
underlying relationships that did not seem apparent at an earlier stage. All mastery approaches adopt a spiral or
staircase curriculum approach – it is vital in bringing about the gains of
maturation and schema assimilation. At
the end of this blog, I discuss curriculum design and optimal phasing in more
detail.
Having demonstrated what we know pupils
will be able to do, we then ask them to do so.
During the next phase, pupils are doing.
The questions at this stage still involve the teacher, since pupils have
not yet gripped the novel idea. Pupils
are replicating, being successful, performing, gaining confidence. The teacher is a crucial part of this stage,
ensuring confidence is being built by continuing to guide pupils. At this stage, therefore, we call the
question types ‘guided’.
This transition and mixing of the
‘demonstrated’ and ‘guided’ can be instant, for example, the teacher
demonstrates a solution and then immediately asks the pupils to do a similar
one (show – do), or can take place with greater explanation, for example, the
teacher demonstrates a question, takes some questions from the pupils, addresses these in discussion, points out features, then demonstrates a few more examples
before asking the pupils to have a go at a few.
These pedagogic choices happen in real time – the teacher can judge the
impact of their example (perhaps by surveying the class or asking pupils to
show the response to a guided question on mini-whiteboards) and then decide the
best course of action (more examples, different models or allowing the pupils to
do some more of their own).
The teacher is continually monitoring the
level of confidence, deftness, accuracy and insight their pupils are showing
during the teach-do interchanges. As the
pupils move from significant concentration on surface level issues such as
process, the teacher is watching for the transition to procedural fluency. As this is attained, the pupils are slowly,
purposefully segueing into practising.
As the pupils move to practising, the
teacher delivers questions designed to reveal underlying relationships and
deeper structure. These questions are well
ordered, carefully planned, with deliberate and purposeful variation such that
the novel idea is connected to previous learning and assimilated into the
pupils’ schema because they are able to appreciate connections, logic and
relationship. We call these questions
‘structured’.
In the final phase of the ‘learning
episode’, our aim is to elicit mathematical thinking. We will call these questions ‘intelligent’.
Questions that elicit mathematical
thinking can include scenarios where pupils must evaluate mathematical
statements, classify mathematical objects, interpret multiple representations,
create and solve problems, and analyse reasoning and solutions.
Crucially, we are seeking to take pupils
from a point of specialising, through conjecturing, generalising and,
critically, reasoning and reflecting.
It is these ‘intelligent’ questions that
bring about understanding and make our knowledge much more flexible and
memorable.
Proportioning Content
Assimilating new ideas and information
into an established complex schema is difficult. Before the moment of the new idea, the pupil
has a perception of the universe – a series of held views, beliefs and
truths. Asking the pupil to disrupt that
view of the world is a significant burden on them. As discussed, connecting already established
and understood knowledge and ideas to the new learning enables a pupil to
‘meaning make’ more readily – after all, if one can see a new idea from the
perspective of already believed ideas and how it fits with their wider view of
the universe, it is much easier to believe the new truth.
It is such a big ask of pupils to believe
and grip novel ideas or knowledge that we should take steps to make this
process as gentle and effective as possible.
An important step is to not overwhelm the pupil with novel
information. In a conveyor belt
curriculum, the content of each lesson is almost entirely novel – this
objective-led approach sees teachers racing through new mathematical ideas like
a tick list. All of the questions,
discussion and exploration in the lesson are concentrated on the new idea. In a mastery approach, a very different
structure is used. In each learning
episode (rather than lesson), only a small proportion of the content of the
lesson is novel. The majority of the
content is drawn from previously encountered material, with links to the new
idea at hand. A good rule of thumb for
old:new content is approximately 80:20.
So, in each learning episode, only around 20% of the content is focused
on the new idea. This greatly improves
assimilation and also brings about gains of both spacing and interleaving
content (more on that later).
The Importance of Knowledge
As discussed, inquiry is a critical
element in learning mathematics. It is
the stage where reasoning, conjecturing, generalising and reflecting
occur. These are all important in
revealing underlying relationships and bringing about understanding, which
greatly increases likelihood of long term recall.
Unfortunately, inquiry is often conflated
with ‘discovery learning’, where pupils are expected to discover and create
their own new knowledge. Inquiry is not
this. Inquiry is an intellectually
demanding process, forcing pupils to give deliberate and sustained attention to
ideas, concepts and connections. When
carrying out inquiry, pupils draw on embedded knowledge and understanding. It is true that from this existing knowledge,
with carefully constructed inquiry, pupils can and do construct new meaning and
even new knowledge, but this is an incredibly inefficient process (see
later). Rather, we improve the gains
from inquiry when we first ensure that the required knowledge is already in
place. After all, it is incredibly hard
to think about something when one doesn’t know anything!
Phasing the teaching process such that prerequisite
knowledge comes before inquiry is key.
Furthermore, the required knowledge should be embedded through maturity,
meaning the inquiry process may take a couple of years before it really draws
on some information or idea being learnt today.
The disastrous meta-study effect sizes
often quoted to diminish the importance of inquiry arise from meta-analyses
conflating inquiry with discovery learning or from including studies that do
not take account of the importance of correct phasing and maturation. It is easy to paint inquiry as having no
impact if we ask pupils to carry out inquiry without having the prerequisite
knowledge or to carry out inquiry with novel ideas that they have not yet
assimilated into their schema. But,
carried out well, inquiry does not only greatly advance understanding, it is
also the key to improving long term memory of mathematical ideas.
It cannot be said too often: knowledge
comes first. But inquiry must come too,
if we are to move from knowing to understanding.
Inflexible Knowledge
Because of the polarised nature of
education debate, radical advocates of knowledge often sneer at inquiry and
radical advocates of inquiry sneer equally at knowledge. These two camps have established themselves
as though never the twain shall meet.
They position inquiry and knowledge as mutually exclusive. This is clearly moronic. Education is complex. The debate is never so black and white, there
is always nuance. Both knowledge and
inquiry matter. It is the phasing and
proportion of each that needs to be got right.
The more extreme inquiry promoters paint a
picture of knowledge as being about rote learning. In fact, very little knowledge is rote
knowledge. Usually, when talking about
rote knowledge, people are really describing inflexible knowledge.
Inflexible knowledge is a perfectly normal
step in learning. Most of us, when
learning something new, will acquire inflexible knowledge.
The oft quoted example of rote knowledge
from Anguished English by Richard Lederer is the pupil who gives the response:
“A menagerie lion running about the earth
through Africa.”
What question is the pupil responding
to? The pupil has been asked to describe
the equator!
Clearly, this knowledge is not
useful. They have misheard the sentence
“an imaginary line…” and have absolutely no concept of what the equator
is. Furthermore, they have not tried to
assimilate the sentence with known information and ideas – they would surely
spot the ridiculousness of the sentence.
The pupil has simply remembered the line being said. This is rote knowledge. That is to say, this is memorising in the
absence of meaning.
But rote knowledge is rare. Most things that are remembered do have
meaning (even if that meaning is not yet understood). Consider, for example, the pupil who tells
their teacher “Eight take away five is three.
But you can’t do five take away eight.”
Clearly, this has meaning. The
model of subtraction they are using is one of removing, literally taking
away. This knowledge is not rote. It is true that they don’t yet fully
understand subtraction, but the knowledge they have is useful and is a
perfectly natural step in learning about subtraction. This knowledge does fit with other learnt
ideas (removing objects, say). It is connected,
but it is not complete.
We want our pupils to become creative
problem solvers, but we should not despair at inflexible knowledge. Our job as teachers is to schedule the
learning of mathematics, such that the discipline is carefully revealed to pupils
over time at the right stage of maturity.
Inflexible knowledge is very different to
rote knowledge. It is meaningful. Inflexible knowledge is inflexible because
the knowledge is tied to the surface structure – pupils can use it only in
examples that are the same – but does not transfer to the deep structure of the
idea. In other words, inflexible
knowledge cannot transcend specific examples.
In the above, the pupil is not able to say how the concept of
subtraction could be applied to the case where the removing model breaks down.
Continuing our Shared Language
Surface structure: particular examples, designed to
illustrate the deep structure
Deep structure: a principle that transcends specific
examples
Rote knowledge: memorisation in the absence of meaning
Inflexible knowledge: has meaning, but limited to specific
examples. A natural step to deep,
flexible knowledge
From a teacher’s point of view, it is
important to remember that knowledge tends to be inflexible when it is first
learnt. This is a natural step. Don’t despair!
Continuing to work with this knowledge,
assimilating into schema of established truths, leads to fluency and
expertise. The knowledge gradually
shifts from being organised around surface structure (examples) to deep structure
(principles).
In order to help pupils in this shift, we
must use carefully considered examples, showing not only when the learnt idea
will apply, but also when it breaks down, alongside non-examples. Teachers should be explicit in telling pupils
when they have acquired rote knowledge and also open about inflexible knowledge
– learning a discipline is a leap of faith for the pupil, be honest with them
when they have inflexible knowledge: tell them that it is not complete, but
will be built upon later. This honesty
avoids one of the most significant problems in a conveyor belt, objective-led
curriculum, where teachers are racing through objectives and are dishonest
about inflexible knowledge. For example,
it is not unusual to hear a teacher telling a pupil that ‘multiplication makes
things bigger’ or that ‘to multiply by 10, just add a zero’. These shortcuts enable a teacher to ‘get
through’ an objective more quickly, but they embed serious misconceptions,
which are very tricky to undo later.
Instead, an honest approach is much more helpful. For example, when the pupil above says, “you
can’t do five take away eight”, we tell them that we understand why the examples
they are using at the moment would make that seem true, but in fact it is
possible and we will teach them how later in the curriculum.
At any given time, all human beings,
including our pupils, know only what they know.
Our schema of knowledge and understanding is continually growing. Educators must appreciate that and celebrate
that it is a natural step to deeper and deeper understanding of the universe.
Human Cognitive Architecture
The evolutionary psychologist, David
Geary, has proposed a distinction between types of knowledge. He splits knowledge into Biologically Primary
and Biologically Secondary knowledge.
Biologically Primary knowledge is
knowledge that we have evolved to be able to acquire easily, without the need
for thought or attention. For example,
speaking. Although we need to think
about words and vocabulary, the act of speaking itself is untaught.
Biologically Secondary knowledge is
knowledge we, as a culture, have generated.
This requires attention and is difficult to learn as described earlier.
A simple example of the distinction is the
fact that it is easy to ‘learn’ how to speak, but difficult to learn how to
read.
There is no need to teach Biologically
Primary knowledge, so schools are concerned with the business of Biologically
Secondary knowledge.
This secondary knowledge is the knowledge
we, as a species and social collective, have created. It is our art and our music, our science and
our literature, our pursuit of sport, our love of dance, our interest in
history, our rich languages.
Biologically secondary knowledge is our combined culture. I like to think of biologically primary
knowledge as the knowledge that keeps us alive, but biologically secondary
knowledge is the knowledge that makes it worth living.
A perhaps unexpected result is that
problem solving is Biologically Primary.
We have evolved to solve complex problems, particularly those that
increase chances of survival. But as
mathematics educators, we are interested not in generic problem solving, but
specifically in mathematical problem solving.
Is mathematical problem solving biologically primary or secondary?
When human beings are unable to obtain
knowledge from others, they use randomness as an action for generating new
responses, which can then be tested and lead to hypotheses or conclusions. This is known as the ‘Randomness as Genesis
Principle’. This way of creating new
knowledge is incredibly inefficient and prone to significant misinterpretation.
When faced with a mathematical problem,
the randomness as genesis principle could
apply. That is to say, it is
possible to consider mathematical problem solving as biologically primary. Pupils can
learn mathematical ideas and mathematical truths without being taught. They can use randomness as an approach –
brute force is the method most pupils will resort to when faced with a mathematical
problem that requires prerequisite knowledge they have not been taught. Through trial, testing, errors, re-trial,
drawing conclusions and iterating, it is possible
for pupils to construct new mathematical meaning. But this approach is inefficient and pupils
only have a finite time at school.
Instead, it is far more efficient and impactful to simply teach the
pupil the knowledge they require. The
process of problem solving is also something that can be taught.
Pupils become significantly enhanced
problem solvers if they are explicitly taught how to tackle problems. To achieve this, the teacher can:
- Prepare problems and use them in whole-class
instruction
- Assist students in monitoring and reflecting on
the problem-solving process
- Teach students how to use visual representations
- Expose students to multiple problem-solving
strategies
- Help students recognise and articulate
mathematical concepts and notation
(Woodward, J., Beckmann, S., Driscoll, M.,
Franke, M., Herzig, P., Jitendra, A., Koedinger, K. R., & Ogbuehi, P.
(2012). Improving mathematical problem solving in grades 4 through 8: A
practice guide (NCEE 2012-4055). Washington, DC: National Center for Education
Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department
of Education. Retrieved from http://ies.ed.gov/ncee/wwc/publications_reviews.aspx#pubsearch)
So, I suggest that mathematical problem
solving is both Biologically Primary and
Biologically Secondary.
Key Principles in Cognitive Science for
Learning
Cognitive science is a cross-disciplinary study
of the mind with
contributions from fields such as linguistics, computer
science, psychology, artificial intelligence, philosophy, neuroscience, and anthropology.
Earlier in this blog, I discussed the
three types of professional knowledge: propositional, case and strategic
knowledge. The field of cognitive
science offers much of use to educators.
John B Carroll set out in 1963 on a lifetime of work to uncover how an
understanding of human cognitive architecture can help educators plan, design
and deliver more effective learning episodes.
Many have followed and added to the canon, with some remarkable and
surprising results. Much of what is
hypothesised in cognitive science remains at the propositional knowledge phase
and has not been replicated beyond laboratory conditions. In this section of the blog, I seek to
highlight just a few areas of cognitive science that we can draw on for
improving the single aim of mastery: learning.
Working memory
Carleton
Washburne suggested in the 1920s that the human mind can only cope with thinking
about so much at once. This ‘conscious
thought’ was to be defined as what one is immediately concerned with. Nowadays, this aspect of short term memory is
referred to as 'working memory'. The
working memory is responsible for temporarily holding information, so that it
is available for processing. Most of us
can cope with only a small number of pieces of information at any given time,
typically 2 or 3, perhaps as many as 4 or 5.
Suppose you were asked to perform the following calculation in your head:
287 x 34
This
is a trivial problem to solve with pencil and paper, yet asked to perform the
same task mentally, most people struggle.
This is because one is being asked to process too many pieces of
information at once. The working memory
can’t cope.
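As a rough illustration of why (this is just one common way of breaking the calculation down, chosen here for the sake of example; other routes are possible), the working can be written out as:

```latex
\begin{align*}
287 \times 34 &= (287 \times 30) + (287 \times 4) \\
              &= 8610 + 1148 \\
              &= 9758
\end{align*}
```

Even on this short route, the first partial product (8610) has to be held in mind while 287 x 4 is worked out, itself a calculation of several steps, before the two results can be added. On paper each intermediate result is simply written down; in the head, they all compete for the same few slots.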
Working memory is a limited cognitive tool both in terms of capacity and duration. It can be thought of as
interconnected cognitive mechanisms that maintain newly
acquired information and retrieve stored information to an active state for
processing and manipulation.
This
limited tool plays a crucial role. It is
where thinking takes place. It is in the
working memory where complex cognitive tasks such as reasoning and problem
solving occur.
The
working memory acts as a bottleneck between the learning of a task and the
long-term memory. In order to get into
the long term memory, the idea or information must first be processed in the short
term, working memory.
This
makes placing things in the long term memory difficult, which is very important from an evolutionary point of view. Imagine if one remembered
every single thing one ever encountered!
The bottleneck ensures that only important information, that is
information that one has given attention to, is able to pass into the long term
memory. The working memory is playing an
important role as a buffer between all of the nonsense we encounter and what we
remember as truth.
Cognitive Load
Cognitive
load is defined as the “total amount of mental energy imposed on working memory
at an instance in time” (Cooper, 1998, p. 10).
Sweller et al. (1998) suggest that this overall cognitive load can be broken into
three subcomponents:
Intrinsic
Cognitive Load (ICL): the load imposed on the learner by the nature of the
instructional material that must be processed and learned
Extraneous Cognitive Load (ECL): the load imposed by factors such as instructional strategies,
message design, interface design, and the quality of instructional materials
and learning environments
Germane
Cognitive Load (GCL): the load imposed by cognitive processes directly relevant to
learning
Clearly, from an educator’s point of view,
we should seek to maximise the latter.
GCL is the energy being used when attending. Since attention is the only known way of
making information and ideas pass into the long term memory, this energy being
exerted is desirable. Learning is hard.
Given the limited nature of short term,
working memory, we should also seek to minimise both ICL and ECL. Using the following colour coding to consider
learning episodes
we can describe much current classroom practice
as often placing great demands in terms of ICL and ECL. Learning episodes often look like
where the very nature of the learning
materials being used by the teacher creates unnecessary ICL burden. These resources may be muddled, verbose,
contain unnecessary information or use confusing language or diagrams. As teachers, we can lessen the demand on the
brain by presenting materials that are concise, accurate, clear and relevant.
Similarly, in this typical scenario, ECL
is taking up lots of mental energy. ECL
is demanding when the way in which information is being communicated is long
winded or irrelevant, or when the learning environment is competing for
attention by containing other stimuli or distracting features. Again, it is a simple problem to solve. Teachers can communicate precisely, use
appropriate media, ensure that learning environments do not distract.
When ICL and ECL are taking up so much
mental energy, there is less energy available for the desirable GCL. When we minimise both ICL and ECL, learning
episodes can be more fruitful by giving greater energy to attending. A more appropriate load phasing could look
like
It is worth noting that significant controversy surrounds the claim that ICL can be reduced. Mayer and Moreno (2010) outline the segmenting principle, which aims to reduce ICL by presenting information step by step. They claim that this helps pupils to better organise new information. Mayer (2005) suggests that ICL can be reduced by using the pretraining principle, where pupils are given information about the new content before starting the new learning unit. The intention is to increase the impact of a pupil’s prior learning on the new material.
I believe the controversy is warranted. Both of these approaches, which do appear to reduce ICL, might better be considered as simply changing the task that pupils are meeting and so not actually reducing the intrinsic load of learning the idea at all. For this reason, most instructional designers concentrate on reducing ECL only.
Anxiety and Cognitive Load
It is worth considering the impact of anxiety, since there is good
evidence to suggest anxiety takes up working memory and has a detrimental impact
on cognitive load (Gerardo Ramirez, Elizabeth A. Gunderson, Susan C. Levine
& Sian L. Beilock (2013): Math Anxiety, Working Memory, and Math
Achievement in Early Elementary School, Journal of Cognition and Development,
14:2, 187-202).
Anxiety can arise in pupils when learning new mathematics if they have a
poor grasp of earlier, pre-requisite mathematics. A mastery approach mitigates this,
since, unlike conveyor belt approaches, in a mastery approach teachers
homogenise pupil groups and choose appropriate starting points on the journey
through mathematics such that all pupils are building on firm foundations. However, even with this approach, as
mentioned earlier, pupils will forget or unlearn as a natural part of the
non-linear journey through learning a
discipline and, given teachers are human beings too(!), we are all fallible and
will make mistakes in judging the correct starting points. It is, therefore, important to continually
consider this aspect of anxiety and to minimise it by always testing for
prerequisite knowledge as shown in the mastery cycle diagram earlier.
Social cues also play a role in bringing about a feeling of anxiety in
pupils who are learning mathematics. All
mathematics teachers are familiar with the experience of hearing other teachers,
parents or the media condemn mathematics as intractable and to be feared.
These fears take up working memory – literally, the pupil is thinking
about their fears rather than thinking about the mathematics – so they must be
addressed.
When dealing with deep structure rather than surface structure, pupils
must attend to higher order aspects such as underlying relationships and
general principles. This requires more
of the working memory. A result of this
is that anxiety is disproportionately damaging to high performing pupils. Their working memory is more disrupted
because they tend to work on mathematics using deeper problem solving
approaches, rather than the simplistic, single step approaches that lower performing
pupils tend to use.
Another aspect of anxiety that is crucial to understand is that of
teacher anxiety. Many teachers who teach
mathematics do, themselves, have underlying fears about the subject. In the UK, only around 24% of mathematics
teachers have a post-school qualification in mathematics, so the vast majority
of the workforce is non-specialist.
Teacher anxiety is communicated to pupils and can lead them to embed those same anxieties. Studies show that teacher
anxiety impacts on pupil performance, with a stronger impact on girls’ performance.
Reducing pupil anxiety is therefore a goal of the effective mathematics
teacher. This requires sticking to the
mastery cycle, which ensures that fundamental skills are secure and assimilated
before moving on and that continual formative assessment monitors for when pre-requisite
knowledge is forgotten or not fully secure.
Teacher anxiety can be reduced significantly through effective CPD. This CPD should focus on how to teach a
concept, rather than the mathematical concept itself. When the focus is on how to teach, teacher
anxiety lessens far more rapidly than when the CPD is really about teaching the
teacher the mathematics.
Assessment types can be changed too.
In mastery, as described in earlier parts of this blog, assessment is
not about labelling pupils, it is about working out whether or not one’s
teaching has been impactful yet. There
is no need to time tests or to assign grades in a mastery approach. Removing both timing and grading
significantly reduces pupil anxiety and has no detrimental impact on learning
(quite the opposite, in fact!)
Finally, teachers should avoid consoling pupils. This may sound counterintuitive when talking
about reducing anxiety, but consoling a pupil who has answered a question
incorrectly is disingenuous and gives them no help to become secure in
mathematics. Rather than saying, “Well
done, you tried your best, that’s all that matters”, teachers should use
responses such as “yes, the work is challenging, but I know, with hard work,
you can do it!”
The Difference Between Novice and Expert
As discussed, the more expert a pupil becomes, the greater the impact of
anxiety, because experts work at a deep structure level, whereas novices tend
to work at the surface structure level.
So, when designing instructional materials and modalities, it is
important that the teacher takes account not just of the ICL – ECL – GCL
relative proportions, but also the type of audience they are instructing. Novices and experts learn differently and
attack problems at different structural levels.
The format of instructional materials suitable for an expert may not be
appropriate for a novice and vice-versa.
We have seen that as a pupil becomes more expert, they tend to consider
mathematical ideas as general principles, which they can work with across
various problems and formats. But pupils
do not begin with expertise, they begin with inflexible knowledge, which they
can use in only restricted examples.
Their knowledge is superficial at this stage.
This is true of all learning. We
all move from the surface level, superficial knowledge to expertise as we
continue to learn. Take, for example,
the trainee teacher. We were all once in
that position. When observing a trainee
teacher, we can see that their attention is focused on the superficial: what am
I saying? How long do I spend on
this? Where should I be standing? What resources should be on the table? But the expert teacher is attending to much
higher level principles, such as pedagogic choice. This expertise comes about by studying
(propositional knowledge), networking and learning with others as well as
articulating our own experiences for critique and development (case knowledge),
and most importantly through actually teaching (strategic knowledge). This latter part is critical if one is to
become an expert teacher. It takes a
long time – perhaps around 10 years – to experience enough real classroom
encounters for this strategic knowledge to develop.
A key weakness of education systems in many of the north-western
cultures is the lack of honesty and clarity about how long it takes to become
an expert teacher. In the UK, a single
year of teacher training, followed by a probationary year, results in Qualified
Teacher Status. The assumption of many
is that this is the end of the training period and that the teacher is now a
skilled educator. This is clearly
idiotic. Teaching is an incredibly
complex profession and is a skill that continues to develop throughout one’s
career. The learning never ends, one
never completes the journey. There is
always more to learn, always more expertise to develop.
Strategic knowledge is the most important. Experiential learning is necessary for us all
to notice our own practice. Take again,
for example, the trainee teacher. We
all, as maths teachers, have to go through the experience of finding out that it
is a really bad idea to place compasses and glue sticks on the table before the
start of a lesson on constructing 3D shapes.
Because the kids bloody stab each other and stick the glue to their
foreheads! These are things we have to
experience, not simply read.
As expertise develops, the way in which knowledge is organised in the
mind moves from disconnected, inflexible knowledge to a problem based
schema. Experts encounter problems and
are able to connect both the content knowledge and the principles and
procedures necessary for attacking the problem.
In the novice mind, content information and problem solving knowledge
are separate.
This is why, as discussed earlier, novices attack problems with brute
force trial and error. The expert, on
the other hand, recognises features in the new problem that they can connect to
problems they have solved in the past.
They work from the known to the unknown.
Because knowledge is organised in different ways, the expert has
efficient ways of addressing new problems.
Their knowledge is connected, making it easier to search their memory
for similar situations and the resolutions that followed. The novice mind, with its disconnected storage,
is inefficient.
Recognising the differences between novice and expert is extremely
important if teaching is to be successful.
Teachers, who are often expert in the mathematical ideas they wish their
pupils to learn, will often forget the experience of being a novice and, in
good – but misguided – faith, design instructional materials and learning modes suitable for
experts (suitable for themselves!), leaving the novice pupil unable to access
the meaning.
Generic Cognitive Skills and Domain Specific Skills
Human beings acquire generic skills without the need to give specific
attention to the skill; they come automatically. Domain specific skills are not acquired
automatically, so teachers must instruct pupils in domain specific skills if
they are to be gained.
A common debate in education is whether or not skills should be the
purpose of schooling. I suggest that, by
making pupils bright – that is by building their schema of knowledge across
multiple disciplines – they are able to think critically and creatively. There is no need to teach creative thinking –
it is a byproduct of being learn’d!
Information Store Principle
Human long term memory is indescribably large – despite many efforts to
determine the storage capacity (often in the language of computer science) no
one has yet been able to find any limit to the long term memory. In practical terms, it appears to be
insatiable. Who we are, as human beings, in every sense can be thought of as the record of our experiences, emotions,
encounters, and living histories. In a
real sense human beings are their
long term memories.
Our long term memory is our
aptitudes. The chess grandmaster is able
to triumph not because of some generic problem solving skill, but because they
recognise configurations and the possible futures of those configurations. They remember them. They have encountered them in the past and
can call upon them. This is the only
reason they are a grandmaster.
To build exceptional competence in any discipline means to build up an
enormous knowledge base in the long term memory.
Using information from the long term memory takes up no mental
energy. Unlike using the working memory
to think about novel information, which is extremely limited, drawing on the
long term memory appears to have no bounds on the number of pieces of
information that can be utilised at once.
Building this knowledge base is generally achieved by obtaining that
knowledge from other people through borrowing, imitating, reading and story
telling.
For these processes to occur, the pupil must have a relationship with
the teacher.
Relationships
Learning is a social endeavour.
Too often, this aspect of education is ignored, yet, without good
relationships learning is unlikely to occur.
Human beings have evolved over huge periods of time to borrow knowledge
from those around them. For millennia,
story telling has been the key mode of knowledge transfer, with one generation
handing down a body of knowledge to the next.
As described earlier, the working memory acts as a protective buffer to
prevent unimportant information getting into the long term memory. So knowledge needs to be considered important
by the pupil. Human relationships play
an enormous part in bringing about this feeling of importance. The pupil will consider the information
important when they have faith in the person telling them the new
knowledge. The teacher must establish a
relationship with the pupil such that the pupil trusts them and has belief in
their assertions. In order to accept new
knowledge as truth, the pupil first must believe that the teacher is a carrier
of truth and is sincere in their desire for the pupil to become learn’d.
Too little emphasis is placed on the crucial role of human relationships
between teacher and pupil (or, indeed, teacher and trainee teacher, mentor and
mentee, head teacher and staff).
Teach Everything Correctly First Time
A reason pupils lose faith in a teacher stems from the common practice
of teachers lying to pupils. It is a
feature of conveyor belt approaches - where teachers are racing through
objectives and are more concerned with coverage than learning - that teachers
will conceal truth about a mathematical idea.
This truth is later revealed, thus exposing the teacher as a liar. Faith falls apart.
For example, our pupil from earlier in this blog who says, “Eight
takeaway five is three. But you can’t do
five takeaway eight.” It can be tempting
for the teacher, who simply wished to ‘get through the lesson objective’ to
agree with the pupil, “that’s right, you can’t do five takeaway eight.” The pupil trusts the teacher and remembers
this fact. Later, the same teacher will
need to break this apparent truth. This
happens continually throughout the pupil’s life at school. They are told lies such as:
“to multiply by 10, add a zero to the right hand side of the number”
“multiplication makes things bigger”
“it is not possible to find the root of a negative number”
The experience of the pupil is one of continual disappointment in the
teacher.
Rather than adopting these approaches (in fact, scrap any aspects of
conveyor belt in your practice!), be truthful at all times. Teach everything correctly first time. Do not use examples that are not
generalisable or metaphors that break as the concept develops. Rather than responding, “that’s right, you
can’t do five takeaway eight”, tell the pupil, “I can see why you think that at
the moment, because we are looking at one type of subtraction, but, actually, it
is possible! Isn’t that exciting! And later, as you learn more about
subtraction, I will show you how.”
Narrow Limits of Change Principle
When dealing with novel information, the human mind can only process
very limited amounts of information at any given time. For most of us, the working memory limit is
around 3 or 4 items of new information. As described earlier, the working
memory is not only limited in terms of number of pieces of information, but
also in duration. Most of us can hold
something in working memory for a maximum of around 20 seconds before it is lost
or replaced. These two protective devices
ensure the long term memory is not inundated with meaningless information. So, from an evolutionary point of view, the
dramatic limits of working memory are necessary and helpful. However, from a learning point of view, these
limits are inconvenient.
Working memory can also process information that is held in the long
term memory. When carrying out
processing of information already stored in long term memory, the operation of
working memory is dramatically different; there are now no capacity or duration
limits. The working memory can cope
quite simply and without encumbrance with vast and varied pieces of
information.
This can be utilised when learning new information. Take for example, the following list. Read the list of 20 letters and try to
remember them:
This is quite a tricky thing to do.
This list is new, so the working memory struggles to cope with 20 pieces
of new information at once.
However, knowing that, if something is already embedded in the long term
memory, we are able to work with any number of pieces of information and that
the problem of duration goes away, as a teacher I can rearrange the information
such that it draws upon already learnt knowledge.
Suppose we think of the domain of mathematics as a complex web of
interconnected ideas.
When learning a new idea, as teachers we know what previously learnt and
understood ideas connect to the novel idea, so we can shine a light on the new
idea from the perspective of established knowledge. This means the pupil can have far less demand
on their working memory, since they are using information from their long term
memory.
Here is the same list again, read it and remember it:
This list is much easier to learn.
The information is the same, but the teacher has presented the
information in such a way that it draws on already learnt knowledge. Because the entity ‘BBC’ is a known idea, we
can think of this as one piece of information instead of three. This ‘chunking’ is a useful way of partially
overcoming the limits of working memory.
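The chunking idea can be sketched in a few lines of code. This is only an illustration, not a model of memory: the letter string and the acronyms other than ‘BBC’ are hypothetical, chosen by me for the sketch, and the function name `count_chunks` is my own.

```python
def count_chunks(letters, known_chunks):
    """Greedily split letters into known chunks, else single letters."""
    chunks = []
    i = 0
    # Try longer known chunks first so the longest familiar idea wins.
    ordered = sorted(known_chunks, key=len, reverse=True)
    while i < len(letters):
        for chunk in ordered:
            if letters.startswith(chunk, i):
                chunks.append(chunk)
                i += len(chunk)
                break
        else:
            # Nothing familiar starts here: the letter is its own item.
            chunks.append(letters[i])
            i += 1
    return chunks


# Hypothetical letters; only 'BBC' comes from the text above.
letters = "BBCUSBDVD"

# With no prior knowledge, every letter is a separate item to hold.
print(len(count_chunks(letters, set())))

# With the acronyms already in long term memory, far fewer items remain.
print(count_chunks(letters, {"BBC", "USB", "DVD"}))
```

The same nine letters become nine items for the novice and three for someone who already knows the acronyms, which is the point of presenting new material from the viewpoint of established knowledge.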
As teachers, we must therefore ensure the scheduling of our curriculum
is such that we can allow pupils to encounter new knowledge and concepts from
the view point of well-connected ideas that they have a good understanding of
already.
Worked Example Effect
During their time at school, pupils in the UK have approximately 1600
hours of mathematics lessons. In this
time, they are to learn around 320 novel mathematical ideas, an average of roughly five hours per idea. Of course, we will expect pupils to undertake
a great many more hours study and work outside of school, but the time they get
to spend with an expert is limited by design.
It is important, therefore, that the time pupils actually spend in the
company of their teacher is used as effectively and efficiently as possible.
When asked to work on a problem, assuming the underlying knowledge is in
place, pupils can go about addressing it.
But, if the teacher first shows a worked example of such a problem, the
pupil will then be able to address their problem far more readily. The time teachers invest in showing worked
examples pays dividends.
Split Attention Effect
This view of lessons having to be efficient is often railed against by
teachers – they argue that learning is not a factory process and not about
efficiency. Well, duh, of course. But the reality is what it is – they only get
so much time with you; you have a moral obligation to make that time as
impactful as possible.
Continuing then with the theme of efficiency, we come to the Split
Attention Effect. When teachers are
demonstrating worked examples or preparing tasks or questions for pupils to
work on, it is worth considering the limits of working memory and ensuring that
– at the point of learning new material – the information is presented as
clearly and with as little burden on the working memory as possible. One very simple example of this is to remove
the need for pupils to split their attention between diagrams and
information. So, for example, when
working on a problem involving angle facts, say, rather than having a diagram
on one part of the page and then a few sentences explaining the angles, we can
make the information much more integrated by labelling the angles in the
diagram. This physical integration of
the information reduces the demand on working memory by removing the need to
consider two separate sources of information.
For example
becomes
The Redundancy Effect
It should be noted, however, that it is not always necessary or desirable
to integrate information into diagrams.
Where the information is simply repeating what is on the diagram, there
is no need to add it. That is, where the
nature of the diagram itself already informs the reader, then adding information
becomes redundant.
Mayer (2001) uses the term ‘coherence effect’ in
reference to this situation.
Another aspect of the redundancy effect to consider is the gains that
can occur in learning when, rather than using two modes to communicate
information, one is eliminated. For
example, if showing a PowerPoint slide with text, it is beneficial to avoid
reading the text aloud to the audience – let the audience read it.
Variation Theory
The role
of variation in learning mathematics has long been established. Zoltan Dienes wrote on the impact that
variance and invariance can have when encountering new mathematical ideas in
his 1971 journal article ‘An Example of the Passage from the Concrete to the
Manipulation of Formal Systems’, (Educational Studies in Mathematics Vol. 3,
No. 3/4, Lectures of the Comprehensive School Mathematics Project (CSMP).
Conference on the Teaching of Geometry (Jun., 1971), pp. 337-352).
In the
Perceptual Variability Principle, Dienes prescribes the utilisation of a
variety of contexts to maximize conceptual learning.
The Mathematical
Variability Principle states that children need to experience many variations
of “irrelevant attributes”. For example,
there are irrelevant attributes inherent to the concept of like and unlike
terms in algebra. Concepts of like terms do not depend, for instance, on the
nature of the coefficients or signs. By varying the signs and the coefficients
using whole numbers, decimals or fractions, and keeping constant the relevant
attributes pupils will become conscious of what happens to different numbers in
the similar situations while ensuring an understanding of like terms and unlike
terms.
Dienes
considers the learning of a mathematical concept to be difficult because it is a
process involving abstraction and generalisation. He suggests that the two
variability principles promote the complementary processes of abstraction and
generalisation, both of which are crucial aspects of conceptual development.
The role
of variation is therefore to reveal underlying relationships and principles,
such that the journey to abstraction is both easier for a pupil to attain and
one that they have faith in believing as truth.
Dienes
continued to work on his theories of variation, with many others picking up the
importance of variance and invariance for learning mathematics over the years
and contributing to the evidence base.
Notably,
Ference Marton in Sweden worked with colleagues in China and Hong Kong to
further promote the importance of variation. Their work was translated
for Western audiences (Gu, Huang & Marton 2004) and had a great
influence on reigniting the discussion around variation theories.
Unfortunately,
the translation of their work (or, more accurately, mistranslation) has led
to a false distinction being made between procedural and so-called ‘conceptual
variation’. Prima facie, this makes no
sense. Concepts in mathematics do not
vary!
This
distinction has resulted in many muddled and damaging assertions being made in
the UK about variation. In recent years, even national organisations have promoted the idea of conceptual variation in
mathematics – arguing that the teaching of mathematics should include taking a
concept and somehow varying it. This is
clearly moronic. Mathematical concepts
are not malleable.
However,
this confusion should not distract from the important role that variation
theory can play in learning mathematics: drawing attention to underlying
relationships.
Other
notable work on variation includes Mason and Watson, 2006. This important article highlights the issue
in Marton’s theory. Marton suggests we
learn what varies against an invariant background. But often what we hope pupils will learn in
mathematics is a constant underlying dependency relationship.
Labels of
‘procedural’ and ‘conceptual’ variation do not get at the full range of the
importance of variation in learning and doing mathematics, that is to
draw attention to the underlying relationships.
Mathematical Confidence
A helpful
result of careful and intelligent use of variance and invariance can be the
building of mathematical confidence, which in turn lowers anxiety and decreases
cognitive load.
In order
for pupils to become creative mathematical problem solvers, it is necessary
that they gain the motivation to want to pursue mathematics and persevere when
faced with apparently intractable problems.
Motivation – that is, the very desire
to continue and go further – is greatly enhanced when pupils are successful and
confident.
As
described earlier, through examples we can demonstrate to pupils how to attack
a type of question, scenario or problem.
As teachers, we plan for the problems they will encounter and manipulate
the way in which the problem will unfold before them, such that, when they are
beginning to solve a problem, pattern emerges from the mist. When pupils notice pattern and relationships,
they can begin to conjecture, “Ah!
Look! The pattern is X, so when I
do Y, what should happen is Z. Let me
try!”
This
builds an expectation in the pupil’s mind – they believe they have discerned
relationship and can now continue to work on the problem, but now with an
anticipation of what will happen and why.
When these expectations are confirmed through experiment and result,
pupils gain a sense of mathematical confidence.
Note: we
will, of course, also manipulate problems such that the expectation a pupil has
and the conjectures they make will not be confirmed. These unexpected results also play a key role
in building a pupil’s ability to reflect and extend their reasoning.
There are
many problems and tasks that mathematics teachers have in their canon that are
designed to build such mathematical confidence.
Suddenly, an apparently intractable problem becomes addressable and
pupils can plot a path through.
Variance
and invariance can play a powerful role in building mathematical confidence.
Consider
the identity
(x – 2)(x + 1) ≡ x² – x – 2
We could
demonstrate the truth of this identity in many ways to our pupils and then ask
them to follow our examples to find other such identities. Often, text books will contain exercises with
random questions for pupils to work through.
But what if we used invariance to help build mathematical confidence?
Suppose as
the next example, we looked at
(x – 3)(x + 1) ≡ x² – 2x – 3
Here, the x
+ 1 term has remained invariant. What do
you notice?
And
perhaps as the next,
(x – 4)(x + 1) ≡ x² – 3x – 4
At this
point, pupils may spot pattern emerging and be able to conjecture what the next
example would be.
(x – 2)(x + 1) ≡ x² – x – 2
(x – 3)(x + 1) ≡ x² – 2x – 3
(x – 4)(x + 1) ≡ x² – 3x – 4
Most pupils will look at the x – 5 example next and rightly conjecture
that the coefficient of x will be –4 and that the constant term will be
–5. This confirmation of their
expectation builds confidence. As
teachers, we would direct them to try ‘going backwards’ and find the result
(x – 1)(x + 1) ≡ x² – 0x – 1
and so
on. The pattern is useful in bringing
about confidence but also in revealing the nature of the relationships between
the terms in the expressions.
We will
see pupils confidently deal with the case where the varying term is x – 0
(x – 0)(x + 1) ≡ x² + x – 0
which then
leads, by pattern, to the natural conclusion that the next example will be
(x – -1)(x + 1) ≡ x² + 2x + 1
So, by keeping one aspect invariant, we are able to build
mathematical confidence at the point where the task is novel and also begin to
reveal underlying relationships.
(x – -1)(x + 1) ≡ x² + 2x + 1
(x – 0)(x + 1) ≡ x² + x – 0
(x – 1)(x + 1) ≡ x² – 0x – 1
(x – 2)(x + 1) ≡ x² – x – 2
(x – 3)(x + 1) ≡ x² – 2x – 3
(x – 4)(x + 1) ≡ x² – 3x – 4
This
systematic way of working – of specialising – is what allows the pupils to
conjecture. We can then change features
and build towards generalisation.
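For teachers who like to check a sequence of examples before putting it in front of a class, the pattern above can be verified with a short script. This is a minimal sketch under my own assumptions: the helper names `expand_linear_pair` and `format_quadratic` are hypothetical, and the conjecture being tested, (x – k)(x + 1) ≡ x² – (k – 1)x – k, is simply the generalisation the listed identities point towards.

```python
def expand_linear_pair(a, b):
    """Return coefficients (x^2, x, constant) of (x + a)(x + b)."""
    return (1, a + b, a * b)


def format_quadratic(coeffs):
    """Render coefficients as plain text such as 'x^2 - 2x - 3'."""
    _, linear, constant = coeffs
    parts = ["x^2"]
    for value, suffix in ((linear, "x"), (constant, "")):
        sign = "-" if value < 0 else "+"
        parts.append(f"{sign} {abs(value)}{suffix}")
    return " ".join(parts)


# Keep (x + 1) invariant and vary k in (x - k), from k = -1 up to k = 5.
rows = []
for k in range(-1, 6):
    coeffs = expand_linear_pair(-k, 1)
    # Conjecture from the pattern: (x - k)(x + 1) = x^2 - (k - 1)x - k
    assert coeffs == (1, -(k - 1), -k)
    rows.append(f"(x - {k})(x + 1) = {format_quadratic(coeffs)}")

print("\n".join(rows))
```

The formatting is deliberately unsimplified (it will print a coefficient of 0 or 1 explicitly), since keeping terms such as 0x visible is part of what lets the pattern show through.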
Note, the
power of variation here is in revealing underlying relationships and building
mathematical confidence at the point of first learning. Later, when the pupil passes through the
‘Doing’ phase to the ‘Practising’ phase (when they are fluent), it is no longer
desirable to give such structure. We
want the questions to become random so that the pupil has to decide when to use
a principle or not.
As another
example, take the following sets of subtraction questions, adapted from
Transforming Primary Mathematics (Mike Askew)
Which set
is the most helpful in building mathematical confidence and revealing
underlying relationships?
Clearly,
the sets are identical sets of questions, but arranged differently. Set A is more typical of what pupils will
encounter in text books, the questions are arranged randomly, with no obvious
pattern emerging. Set B, however, has
been arranged in such an order that pupils will spot pattern and
connections. They will notice that
performing 122 – 92 is the same problem as performing 120 – 90 and begin to
reason why this is the case. Working
with Set B, at the point at which this idea is novel, gives pupils the chance
for expectation, confirmation and confidence.
The teacher may suggest, “show me more questions that are the same as
120 – 90. Tell me how you know you are
correct.”
As a
teacher, what question would you choose to come next in Set B?
Perhaps
500 – 395 or 505 – 400, thus connecting this particular subtraction with other
methods of subtraction and giving opportunities to explore relationships and
connections.
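The relationship pupils are noticing here, that moving both numbers by the same amount leaves the difference unchanged, can be sketched as follows. The function names are hypothetical and the shifts are my own illustrative choices; only the pairs 122 – 92, 120 – 90, 500 – 395 and 505 – 400 come from the discussion above.

```python
def shift_subtraction(minuend, subtrahend, shift):
    """Move both numbers by the same amount; the difference is unchanged."""
    return minuend + shift, subtrahend + shift


def same_difference_family(minuend, subtrahend, shifts):
    """List subtractions equivalent to minuend - subtrahend."""
    family = []
    for shift in shifts:
        m, s = shift_subtraction(minuend, subtrahend, shift)
        # Each shifted pair must give the original difference.
        assert m - s == minuend - subtrahend
        family.append((m, s))
    return family


# 122 - 92 shifted down by 2 becomes 120 - 90.
print(shift_subtraction(122, 92, -2))
# 500 - 395 shifted up by 5 becomes 505 - 400.
print(shift_subtraction(500, 395, 5))
# Possible answers to "show me more questions that are the same as 120 - 90".
print(same_difference_family(120, 90, [-20, -1, 2, 10]))
```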
Variation
Theory is useful because it gives these opportunities. Variation Theory is not about extensive lists
of questions where pupils stop expecting, testing and conjecturing and simply
become passive in stating obvious next answers.
Working with variance and invariance requires the teacher to carefully
balance the benefits of confidence and relationships with the danger of long
sets of questions that result in pupils no longer thinking about what they are
doing.
Rohrer and
Taylor (2007) found some interesting results when looking at how many questions
pupils need to work on, which we shall come to later in this blog.
Again, I
would suggest that Set B is a useful and powerful approach when the
mathematical idea is novel, but, actually, Set A becomes the useful arrangement
later once pupils have gripped the idea – the random nature of the questions
forces pupils to attend to the principle as a whole and make decisions about
how to work on the problems.
Using
variance and invariance to reveal underlying relationships is the key purpose
of Variation Theory. Another useful
outcome of varying can be for pupils to discern commonalities and differences
when working with examples and non-examples.
Ask a
pupil to draw a triangle on a piece of paper.
Almost all pupils will draw something like
or
This is
because these are the triangles that pupils repeatedly encounter. Travelling around the UK over the past couple
of decades, observing and inspecting mathematics, I have time and again seen
teachers refer to triangles but only ever use these types. Pupils come to believe that ‘triangleness’ is
like a ladder against a wall or the roof of a house. They believe triangles have one horizontal
side. Pupils rarely draw
or
And for
many pupils, the following is not a triangle at all
Instead,
they will call this an ‘upside down triangle’.
Perhaps
even more concerning is that many pupils believe that the following shape is a triangle
It is easy
to see why; after all, this does look like the roof of a house.
I use this
simple example of triangles to highlight the need, when introducing new ideas,
to ensure that pupils encounter many examples and non-examples of the idea. We will return to examples and non-examples
later.
Variation
in procedure can usefully help pupils discern the key principles of a
mathematical idea. We can ask pupils
what is the same and what is different about procedures. For example, choose two three-digit numbers
and subtract the smaller from the larger.
Use the
same numbers and now, instead of how you might have gone about the original
subtraction, perform the subtraction using the following rules
Each of
the above procedures for performing subtraction is a formal, generalisable,
recognised procedure. Each does, of
course, give the same result. But why? What is the same? What is different? How are the procedures connected? What do these connections reveal about the
nature of subtraction, place value, base systems and digits?
By working
on this task (and please do, it’s wonderful!), we begin to see pattern in the
process of subtraction, begin to discern, begin to generalise.
[Note: the
above activity was handed to me on a piece of paper some years back by a maths
teacher at a conference. I don’t know
who that person was or who authored this task – if you do, please let me know
so I can include a credit here]
I would
suggest that this type of activity exemplifies effective use of Variation
Theory in mathematics and is considerably more powerful in building confidence,
reasoning and understanding than simply asking pupils to work through long
lists of questions.
Here is
another example of using variation in the mathematics classroom, which I would
suggest you pause and try. I created
this task a while back and have tried it with many pupils and teachers.
The
initial problems are clearly trivial.
But, when the base is changed, the task requires a completely different
type of thinking. Working in these
different bases, but in a systematic way, pattern emerges! This allows for conjecture and, finally,
generalising. Importantly, working on
these different bases allows pupils to have greater clarity about working in
base 10.
A
significant weakness in UK mathematics over the last 30 years is the absence of
multi-base arithmetic. I would suggest all
pupils learn multi-base arithmetic.
After all, how can we, as teachers, be sure that pupils understand
arithmetic if they only ever work in one base – all we have shown is that they
can perform in one specialised case.
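To give a flavour of what working across bases can reveal, here is a small sketch. It is not the task referred to above; the example numerals and the helper names `to_base` and `subtract_in_base` are my own assumptions, used only to show the same written subtraction behaving consistently in different bases.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def to_base(n, base):
    """Write the integer n as a numeral string in the given base (2 to 36)."""
    if not 2 <= base <= 36:
        raise ValueError("base must be between 2 and 36")
    if n == 0:
        return "0"
    sign = "-" if n < 0 else ""
    n = abs(n)
    out = []
    while n:
        n, remainder = divmod(n, base)
        out.append(DIGITS[remainder])
    return sign + "".join(reversed(out))


def subtract_in_base(a, b, base):
    """Subtract two numerals written in the given base; answer in that base."""
    return to_base(int(a, base) - int(b, base), base)


# The same 'shape' of question in three bases: 122 minus (base - 1) followed by 2.
print(subtract_in_base("122", "92", 10))
print(subtract_in_base("122", "72", 8))
print(subtract_in_base("122", "42", 5))
```

In each of these cases the written answer is the same numeral, which is exactly the kind of pattern that invites pupils to ask why, and to reason about place value rather than about base 10 in particular.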
As a final
example of variation, I include this question set as an exemplar of the need to
show pupils the same idea in varied contexts.
Storage and Retrieval
By design,
the mastery cycle seeks to optimise learning by
· ensuring all pupils are taught the right level of mathematics (just beyond what they already know), building on secure understanding of prerequisites
· giving all pupils varied experiences with mathematical ideas, that transition from doing, to practising to behaving mathematically
· ensuring that novel ideas are met carefully in such a way that they are seen as important and draw the pupil’s attention in order to pass the gatekeeper of working memory and enter the long term memory
· continually checking that ideas are being gripped by an ever present cycle of formative assessment and correctives
· never moving on to mathematical ideas that require a current idea if it has not been fully understood and embedded
All of
this is with the intention of changing the pupil’s schema of knowledge and
understanding, assimilating new truth in a logical way. With all this effort to ensure that a pupil’s
long term memory is changed, we now face the next challenge: optimising the
storage of that knowledge and making it readily retrievable.
It has
long been known that the memory is the key concern of the educator; Washburne,
Ward, Burk and others were discussing the role of memory in the 1910s. In 1943, Hull wrote about memory from two
points of view. Firstly, what he
referred to as the pupil’s ‘momentary reaction potential’ – that is, the
potential they have to use their prior knowledge in the moment through
recalling learning. He noted that this
varied from person to person. The second
aspect of memory Hull identifies is what he calls ‘habit strength’. Hull knew that some actions required no
thought, they could just be performed.
Habit is a sensible view of this since it reflects a common view of
what it means to be able to do something habitually. Later, Estes (1955) refined Hull’s work and
talked about ‘response strength’ versus ‘habit strength’. This takes the idea of
what one can think about in the moment further and starts to apply the notion
of there being a strength to this ability, which can explain the differences
various people display. The idea of
response strength also takes the debate towards the idea that this aspect of
memory is not a fixed potential, but can be improved (strengthened). The research into these two aspects of memory
continued for several decades, with a large number of experiments being carried
out in laboratory conditions.
Enter
Robert and Elizabeth Bjork. The Bjorks
have dedicated much of their professional working life to furthering the
understanding of human memory. In 1992,
they went further and redefined the two aspects as ‘retrieval strength’ versus
‘storage strength’.
We now
have a view of the long term memory as being able to be improved both in terms
of how readily one can recall knowledge and how well that knowledge is embedded
in the long term memory.
Performance is not the same as Learning
In its
current incarnation, the formal examination system in the UK measures whether
or not a pupil can perform on a certain bank of questions, of certain types, at
a certain point in time. Performance is
easy to measure, which is why national systems often resort to simplistic,
mechanistic approaches for benchmarking the success or otherwise of the system.
Unfortunately,
performance is not the same as learning and, more critically, is not even a
good predictor of learning. Being able
to perform at any given time is heavily influenced by local conditions—cues, predictability, recency—which can
serve as crutches that prop up performance, but will not be there later when
the knowledge might actually be useful!
We have
known for a long time that current performance is a very poor predictor of long
term learning, yet schools are forced to operate in ways that reward pupil
short term performance over meaningful, long term learning. This, of course, leads to poor design of
learning episodes, which can be praised by an inspector or observer in the
moment (all the kids had smiley faces, they all put their thumbs up at the end
of the lesson, everyone could do the target question at the end, the pupils all
made progress!), but are in fact not learning episodes at all – they are
presentation and regurgitation.
The key
driver of systems adopting poor assessment practices is that they are easier
and cheaper to implement. But there is
another, serious reason why assessment that actually measures learning is not
routinely used by national systems: cheating.
Rather
than terminal performance examinations, we could instead choose an approach of
continual assessment, where pupils are working very closely with their teacher,
who builds strong relationships with them and gets to know them inside out. Pupils can build portfolios of evidence
throughout their time at school, demonstrating mathematical understanding on
deep structure problems and over sustained periods of time, as we spiral
through the curriculum and pupils grow their schema. This teacher assessment led approach could
discern what pupils truly do know and understand. So, why did we abandon such approaches (note, for example, the ATM GCSE, which was abolished in the 1990s, had no terminal
exam) and opt for systems that measure point in time performance only? Well, continual assessment is very hard to
carry out and takes a great deal of time. It also requires teachers to have
very high levels of professional knowledge around assessment and to make accurate judgements over time that are free of bias. It comes with an enormous
moderation burden and, finally, it relies on teachers maintaining their
professional integrity and ethics whilst simultaneously working in a high
stakes profession. Alas, alas, no system
has ever been able to achieve all of this!
At a local
level, however, there have been many excellent examples of continuous
assessment working, including – notably for this blog – Carleton Washburne’s
own schools and pupils.
We work in
a system that measures performance only and we need to be alert to the flaws of
such a system and alive to the extremely weak practice it can drive. It can feel rather scary for the teacher in a
high stakes system to change their lesson design to focus on long term learning
rather than performance, but it is morally reprehensible not to do so.
Suppose a
class has just had a one hour lesson on Pythagoras’ Theorem. During the lesson, the teacher has repeatedly
emphasised that the lesson is about Pythagoras’ Theorem and shown multiple
examples. The teacher then gives the
pupils similar questions to work on. The
teacher is then pleased that the pupils can perform.
Well of
course they can perform! They have just
been given all the cues to do just that.
They are replicating.
But what
we, as teachers, want to achieve is for pupils to be able to encounter problems
in the future that may or may not require the use of Pythagoras’ Theorem and
for them to be able to recognise appropriate scenarios and put their learning
to good use.
In other
words, as teachers we should focus more on getting pupils able to know when to
use an approach, rather than simply how to use the approach that day. Again, as discussed earlier, a learning
episode phasing that includes only 20% new content and 80% previously learnt
content helps with this, since the lesson is not then just populated with
questions like the examples just shown.
The Importance of Forgetting
“In the practical use of our intellect,
forgetting is as important as remembering.
If we remembered everything, we would on most occasions be as ill off as if
we remembered nothing.” - William James, 1890
We
encounter huge amounts of information in our everyday lives. It is important (for one’s own sanity!) that
not all of this is remembered. Imagine if
you could take a pill so that you never forgot anything. It would be awful! If every single thing you had ever been told
was continually brought to mind, the impact would literally be maddening. So, forgetting is a really important
evolutionary mechanism that protects the mind.
Teachers should be alert to forgetting and phase their learning episodes
such that important ideas and information are brought to mind again for the
pupil at the point just before being forgotten.
Desirable Difficulties
Learning
is difficult, but we want our children to become learn’d. So removing as many of these difficulties as
possible is clearly a useful thing to do (e.g. lessening the load on the
working memory by removing distractions or giving clear instruction). But not all difficulties are unhelpful to the
process of learning.
Many
cognitive scientists, and in particular the Bjorks, have explored the impact of
introducing difficulties during learning.
This has included work on asking participants to practise away from the
criterion (e.g. throwing a ball five metres and three metres, when the test will
be to throw it four metres), interrupting the learning through distraction (e.g.
when learning about one idea, periodically diverting the learner to think about
an entirely separate idea) and interrupting the learning episode (e.g. instead of asking a novice tennis player to learn everything about serving a
ball first, the novice is asked to learn a myriad of skills, intertwined in the
same learning episode).
Much of
this work for a long time focused on physical activity such as sport and much
of this work has not been replicable beyond laboratory conditions. However, some work in the last 30 years in
particular has shown encouraging results, which bring interesting implications
for the mathematics teacher.
These
difficulties that increase long term learning are referred to by Robert Bjork
as ‘desirable difficulties’. Bjork
outlines four key desirable difficulties:
· Varying the conditions of learning
This could
include varying the learning environment.
Bjork looked at moving pupils between bright, clean, inspiring classrooms
to dark, cramped basement like ones.
There is propositional knowledge and case knowledge regarding this
desirable difficulty. However, in this
blog, I shall not be considering this area since I have never been able to find
strategic knowledge of any impact (that is to say, I do not know of any real
classroom examples).
· Distributing or spacing study or practice
Typically,
pupils practise a topic in one period of time and then are tested on the
topic. Spacing the topic over a longer
period, with gaps in the practice, has a significant impact on long term
learning.
· Using tests (rather than presentations) as learning events
Rather than
only presenting new ideas, asking the pupils to answer a question about that
idea first has a significant impact on long term learning (even if they know
nothing about it).
· Providing contextual interference during learning (interleaving rather than blocking)
Interrupting the learning of an idea with
different ideas has a significant impact on long term learning.
I will
expand on each of the three desirable difficulties – that have all three levels of professional
knowledge to support them – throughout the rest of this section of the blog.
The Testing Effect
Exercise in repeatedly recalling a
thing, strengthens the memory
- Aristotle
Regular
low stakes or no stakes quizzing is a key element of mastery approaches. Washburne (though really it was Ward and
Burk’s work) outlined entire curriculum journeys through each subject,
punctuating the journeys with quizzes and tests.
In
conveyor belt approaches, testing is used to label pupils as those who can
learn well and those who can’t. In a
mastery approach, testing is used to enhance learning.
When faced
with learning a novel idea, even when the learning episode is highly effective,
pupils very quickly forget much of what was learnt. This is a protective mechanism for the human
mind and evolutionarily important. The
amount of content retained after a learning episode decays quickly. However, if that learning episode is brought
to mind again, the rate of decay lessens and lessens. This is yet another reason why all mastery
approaches embrace a spiral curriculum model.
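The idea that each recall slows the rate of forgetting can be sketched with a toy model. This is only an illustration: the exponential shape, the starting `stability` and the `boost` applied at each recall are all assumptions I have chosen for the sketch, not values taken from any study.

```python
import math


def retention(days_since_last_recall: float, stability: float) -> float:
    """Toy exponential forgetting curve: fraction retained after a gap."""
    return math.exp(-days_since_last_recall / stability)


def simulate(recall_days, horizon, stability=1.0, boost=2.0):
    """Return the toy retention at day `horizon`.

    `recall_days` lists the days (all before `horizon`) on which the idea
    was brought to mind again. Each recall resets the clock and multiplies
    `stability` by `boost`, so the curve decays more slowly afterwards.
    Both parameters are assumed values, used only for illustration.
    """
    last = 0.0
    for day in sorted(recall_days):
        last = day
        stability *= boost
    return retention(horizon - last, stability)


never_revisited = simulate([], horizon=14)
revisited_three_times = simulate([1, 3, 7], horizon=14)
print(never_revisited, revisited_three_times)
```

In this toy model the idea that is revisited three times is retained far better at day 14 than the idea that is never revisited, which is the qualitative pattern described above.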
On the
whole, the way in which teachers bring learning to mind again is to review it –
perhaps through a re-teaching process or asking pupils to read their
notes. This is a useful activity and
does indeed improve retention by lessening the rate of decay.
A perhaps
surprising result, however, is that reviewing material in this way is less
impactful than simply asking pupils to answer questions on the previously
learnt content. Rather than studying an
idea several times throughout the spiral, it is more beneficial to replace the
repeated study with testing.
Here are
some typical results from Roediger and Karpicke (2006), which is one of several
studies to show this ‘testing effect’
As you can
see, those students who did two periods of study immediately before a test
performed well. Those who did just one
period of study followed by a testing exercise, did not perform as well when
the test was immediately afterwards (5 minutes gap). This is what we would expect. The first group was engaged in cramming.
But, when
a longer period of time passes – 1 week in this case – the results are
reversed. The crammers perform
significantly worse than those who studied and were then tested.
On the
right hand side, another experiment shows the impact of three models. The first group had four periods of study,
the second had three periods of study followed by a test and the final group
had just one period of study followed by three tests. The results are striking. The crammers perform well if the test is
immediately afterwards, but their long term recall is much worse. Now the group that had just 25% of the study
time of the crammers, followed by three tests, far outperforms all others.
The
testing effect can feel counterintuitive – one would imagine that those who
study for longer will have the greater long term recall, but this is not the
case. Testing instead of reviewing
brings much greater long term benefits.
As discussed earlier, performance is not the same as learning. This is a clear example of that statement.
It is the
act of asking a pupil to recall their learning (testing) that leads to greater
retention.
Testing Potentiates Learning
Another
powerful use of testing that the teaching for mastery teacher must be aware of
is that testing potentiates
learning. That is to say, testing a
pupil before the teaching of an idea
by asking them questions on what has not yet been learnt, alerts them to the
fact that learning must happen. By
considering the questions, even if they can’t do any of them, pupils become more
ready to learn the new idea. They are
getting a glimpse of what will be expected of them and are able to recall
previously learnt material that may connect to the new problem they are seeing. This makes the pupil more alive to learning
the new idea and increases their potential to learn.
Marking and Feedback
In my 2004
book, Chapter 18 is titled, ‘Marking Books’.
The chapter in its entirety reads: ‘I wouldn’t bother’.
Few
practices in teaching take up such enormous amounts of time and energy as
marking. If we are going to dedicate
such huge resource to an activity, we must be sure that there will be a
significant impact on learning and that this impact is greater than if the
time and energy had been invested in undertaking a different activity. Marking books, grading papers, writing
comments and other common marking and feedback policies that schools deploy
simply do not meet this test.
Marking
and feedback can have an impact if done extremely well and if, and only if,
that marking and feedback is genuinely
used to change the learning experience.
In all practicality this is nigh on impossible for a teacher with 200
pupils and what we see instead is marking and feedback to tick a policy box
rather than any meaningful attempt to change learning. The time wasted on such ineffective practice
is vast. This time could be spent on
planning learning, creating questions, developing subject knowledge and making
pedagogic choices. All of these have a
greater impact on learning than marking and feedback (even if done well).
The TALIS
report gives us a view of the scale of the issue. Teachers in the UK spend around 10 hours per
week on marking and administration related to assessment.
Furthermore,
this wasted time is also a key factor in lowering professional satisfaction in
teachers. Teachers regularly report
marking, feedback and the recording of grades as a significant waste of their
time.
Marking
and feedback are a very poor use of a teacher’s time. Instead, use that time to think carefully
about learning episodes and the materials and approaches you will use to
communicate mathematical ideas.
If one
must mark books, then finding time efficient ways to enhance learning is
key. I rather like a suggestion I heard
from Dylan Wiliam: instead of ticking and crossing questions, a statement on
the page along the lines of “there are five wrong answers here, find them and
correct them” can be a quick way of making the pupil undertake a useful
activity. This creates a situation where
the pupil, not the teacher, must locate and identify the incorrect responses
they have given. When a pupil finds
their own errors and corrects them, the gains are much greater than when they
must correct an error their teacher has identified.
The Hypercorrection Effect
An area
where feedback might be worth the
time invested is to bring about a hypercorrection effect. Hypercorrection occurs when pupils have given
a response to a question, which they feel highly confident is correct, but then
receive feedback revealing their response was in fact wrong.
The
feeling of surprise a pupil has when discovering something they firmly thought
to be true was a misconception, leads them to better correct the original
problem and to be far more likely to remember the correction in future,
improving long term learning of the idea even though, following the original
study of it, they had misunderstood.
Designing
activities to bring about hypercorrection requires them to be such that
feedback is given and takes account of the level of confidence the pupil had in
their assertion. This could again lead
to significant workload for the teacher and not give the gains in learning
needed to justify such investment of time and energy.
Robert
Bjork proposes a simple, yet powerful, alternative: better multiple choice
questions.
Better Multiple Choice Questions
Traditional
multiple choice questions are a quick and easy way for a teacher to glean a
sense of the level of understanding in a class of pupils. These models are also useful, when used at
scale in technology products, for discerning trends in strengths and weaknesses
in the population. But to bring about
the hypercorrection effect, we must know something about the level of
confidence associated with responses.
Erin
Sparck, Elizabeth Bjork and Robert Bjork designed an approach to confidence
weighted multiple choice questions that achieves this (see https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5256426/
for further details)
Rather
than only asking the pupil for their response, they must also indicate their
confidence in their response. A scoring
system is then used to heavily penalise confident wrong answers.
Here are
some examples (note the pupils do not see the associated scoring)
As you can
see in the two examples above, the pupil must choose between three possible
answers to the question, but they can choose to place their response on the
answer itself (confidently asserting), between answers (equally or skewed
towards one they feel more confident about), or to simply state they ‘don’t
know’. Confidently giving the correct response earns
the best score. Confidently asserting a
wrong response is heavily penalised.
This helps to bring about the emotional response we are looking for in
order for the hypercorrection effect to occur.
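To make the scoring idea concrete, here is a minimal sketch of how such a quiz might be scored. The point values (`REWARD`, `PENALTY`), the squared deduction and the `score` function are my own hypothetical choices for illustration; they are not the scores used by Sparck, Bjork and Bjork, only one way of making confident wrong answers costly.

```python
import math

# Hypothetical point values, chosen only for illustration.
# They are NOT the scores used by Sparck, Bjork and Bjork.
REWARD = 10.0    # points for placing full confidence on the correct answer
PENALTY = 20.0   # maximum deduction, applied to a fully confident wrong answer


def score(response: dict[str, float], correct: str) -> float:
    """Score one confidence weighted response.

    `response` maps each chosen option to the share of confidence placed on
    it (shares must sum to 1). An empty dict means the pupil chose
    'don't know'.
    """
    if not response:
        return 0.0  # 'don't know' neither gains nor loses points
    if not math.isclose(sum(response.values()), 1.0):
        raise ValueError("confidence shares must sum to 1")
    on_correct = response.get(correct, 0.0)
    # The deduction grows with the square of the misplaced confidence,
    # so a confident wrong answer costs far more than a hedged one.
    misplaced = 1.0 - on_correct
    return REWARD * on_correct - PENALTY * misplaced ** 2
```

Under these assumed values, a confident correct answer scores 10, a response skewed 75/25 towards the correct answer scores 6.25, ‘don’t know’ scores 0 and a confident wrong answer scores -20.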
The impact
of this approach is significant.
The
confidence weighted multiple choice approach gives gains in long term learning
over a standard multiple choice quiz.
Sparck, Bjork and Bjork also explored whether standard multiple choice
quizzes could be improved by asking pupils to state their confidence in their
response.
There
appear to be no additional gains in learning from asking pupils to state the
confidence of their response on a standard multiple choice quiz. The impact would appear to be an outcome of
the confidence weighting and scoring system.
Creating
confidence weighted multiple choice questions is a straightforward and quick
task for the teacher. So, in this
particular case, it does seem a good use of time to create these simple
feedback mechanisms.
Massed vs Spaced Practice
There have
been a great many studies into the impact of massed vs spaced practice. Here, I will use Rohrer and Taylor (2007) as
the main example, since their study specifically focused on mathematics.
Briefly,
massed practice refers to carrying out all of the practice on an idea, skill or
concept in one period, whereas with spaced practice the pupil practises over a
longer period with gaps between practice.
Rohrer and
Taylor also look at the impact of ‘light massing’, where pupils still carry out
all of their practice in one period, but undertake much less practice.
In one
experiment, three groups were asked to carry out different types of practice,
as below
The
‘spacers’ worked on four problems spread out over two weeks, the ‘massers’
worked on the same four problems in one week and the ‘light massers’ worked on just
two problems in one week. Each problem
was given the same amount of practice time.
The gap
between completing practice and taking the test was the same for all groups,
one week. The results on the test are
shown below
The
spacers significantly outperform the massers.
This result has been replicated many times across many disciplines.
Implications for Overlearning
The
results only consider participants who answer at least one practice problem
correctly.
Note that,
despite the ‘massers’ undertaking double the amount of practice of the ‘light
massers’, there was no significant difference in their test scores.
Because the
‘light massers’ answered at least one practice problem correctly, this finding suggests
no gain resulting from overlearning.
This has an important implication for teachers who set pupils practice
worksheets with dozens (or hundreds!) of minimally different questions or
variation theory worksheets that focus on quantity of content over the need to
discern underlying relationships. It appears
that, as long as pupils get at least one question correct, there is no need for
a vast number of practice questions.
This is an unsettling finding for many educators who have long been
wedded to overlearning as an important element in the learning episode. There is not enough evidence in Rohrer and
Taylor’s study to assert that overlearning is not effective, but it should at
least raise the question when one is designing practice problems.
It is
possible that overlearning might have significantly boosted test scores if
there had been, say, a tenfold increase in the amount of practice rather than
twofold, but given the constraints on time that teachers face, the gains from
such overlearning might not be worth the amount of time needed to undertake the
activity.
A null
effect of mathematics overlearning was also observed previously (Rohrer &
Taylor, 2006).
Blocked vs Interleaved Practice
Blocked
practice refers to the practice of learning about and practising one distinct
aspect of a domain at any given time.
Robert Bjork often tells the story of learning to play tennis under the
guidance of a professional tennis coach.
The trainee will be instructed on how to serve a ball and then practise
this one micro-skill for weeks. Once
the trainee is deemed to have gripped this one aspect, the coach then instructs on the next
micro-skill, say, the backhand, and so on.
Interleaved practice refers to the practice of skills or ideas in a
phasing that is disrupted by the practising of other skills or ideas. These can be related or not. In the tennis anecdote, the novice player now
has practice sessions that include all of the micro-skills. The initial experience of this is confusion
and difficulty for the new player, since they are being asked to get to grips
with lots of unfamiliar and unconnected movements all at the same time. But, over time, the interleaved practice
leads to some interesting results.
Taking
another example from Bjork, he looked at participants trying to learn the style
of some unfamiliar artists. Some
participants were asked to study an individual artist’s work all at once before
moving on to the next artist (blocked practice), whilst others had to learn all
of the artists’ styles in a randomly presented sequence. For example,
Intuitively,
most people think that learning the style of one artist at a time would result
in being able to firmly grip the similarities in that artist’s work and,
therefore, be able to spot a painting by the same artist in future because it
would contain those same similarities.
However,
as discussed earlier when considering variation, it would appear that
discerning differences in styles, which is what the interleaved approach
achieves, was more beneficial in terms of long term learning.
To measure
the learning, Bjork showed participants new paintings that they had not yet
encountered and asked them to choose the correct artist.
The
results of the experiment show a significant increase in performance by those
who were asked to study the styles through interleaved (spaced) practice.
It is also
interesting to note that the participants themselves expressed strongly that
they would perform better using blocked (massed) practice over interleaved
practice. This remained their belief even
after they had been shown the actual results!
This
strong bias for practising in a blocked way is likely a result of experience –
after all, it is how almost all educators and trainers ask their pupils to
carry out practice.
Given that
interleaved practice leads to better retention, the implication for the
teaching for mastery teacher is to be able to design practice sequences that
highlight not just what is the same but also what is different. There is a strong link to variation theory
here, which is about discerning underlying relationships and principles in and
across ideas.
For
example, the teacher who is trying to get their pupils to grip a sense of
‘triangleness’, should not only use examples of triangles, but should
interleave these with examples of non-triangles.
Much of
the research around interleaving is centred on physical skills, such as the
tennis example, but we, of course, are interested in the evidence directly
related to mathematics.
Let us, again, turn to Rohrer and Taylor. In their paper, “The
shuffling of mathematics problems improves learning” (2007), they
considered these hypotheses using mathematical content.
Participants were taught and then asked to
practise finding the volume of four different solids.
They were later tested, with questions
looking typically like
Groups of participants followed different
practice procedures, with one group undertaking interleaved practice and the
other blocked practice.
The results reflected earlier studies of
blocked vs interleaved practice, with a significant increase in performance
from the interleaved practice group.
Just like the results of cramming shown
earlier, immediate performance is better when the participants blocked their
practice. But when tested later, the tables
are turned, with the interleavers far outperforming the blockers.
Once again, the implication for teachers
is to carefully consider the difference between immediate performance and long
term learning. Clearly a one-hour lesson
containing blocked practice will look more ‘effective’ to the inspector or
observer, since the pupils will perform well in that immediate time, but this
common practice would appear to have poor results when it comes to long term
learning. The somewhat messy looking
interleaved lesson – certain to upset the inspector! – is actually the
desirable practice procedure to engage pupils with.
Rohrer and Taylor postulate that the
superior test performance after interleaved practice is a result of requiring
the students to know not only how to solve each kind of problem but also which
procedure was appropriate for each kind of problem. This supports the point I made earlier that
it is more important for a pupil to know when to use an approach rather than
simply how to use the approach.
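The difference between the two practice orders can be sketched in a few lines. This is an illustrative sketch only: the labels `solid_A` to `solid_D` are placeholders, and the `interleaved` function is just one simple way to shuffle problems so that every round of practice contains one problem of each type; it is not the exact procedure from Rohrer and Taylor’s paper.

```python
import random


def blocked(problems_by_type):
    """All problems of one type, then all problems of the next type."""
    return [p for problems in problems_by_type.values() for p in problems]


def interleaved(problems_by_type, seed=0):
    """One problem of every type per round, in a shuffled order.

    Assumes each type has the same number of problems; zip() would
    silently drop the extras otherwise.
    """
    rng = random.Random(seed)
    sequence = []
    for round_ in zip(*problems_by_type.values()):
        round_ = list(round_)
        rng.shuffle(round_)  # the pupil cannot predict which type comes next
        sequence.extend(round_)
    return sequence


# Placeholder problem sets: four types of solid, three problems each.
practice = {
    name: [(name, n) for n in range(1, 4)]
    for name in ("solid_A", "solid_B", "solid_C", "solid_D")
}
print(blocked(practice))
print(interleaved(practice))
```

In the blocked order the pupil always knows which procedure the next question needs. In the interleaved order they must first decide which kind of problem they are facing, which is precisely the ‘when to use it’ decision discussed above.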
The Generation Effect
Malcolm Swan asked 779 key stage 4 pupils
to recall how often particular scenarios occurred in the classroom. Pupils, quite rightly, report that the most
common scenario is they listen while the teacher explains. This is, of course, a very good activity.
What is interesting about the results is
what pupils report as being less common.
Routinely, pupils report that they do not
have many opportunities to create their own questions. In other words, pupils report that they are
not being asked to conjecture.
In the 1980s, teaching for mastery was
generally referred to as 'diagnostic teaching'.
Here are some excerpts from the teacher standards at the time:
· Explore existing ideas through tests and interviews, before teaching.
· Expose existing concepts and methods.
· Provoke ‘tension’ or ‘cognitive conflict’.
· Resolve conflict through discussion and formulate new concepts and methods.
· Consolidate learning by using the new concepts and methods on further problems.
It was an expectation that teachers should
provoke tension and cognitive conflict.
That is to say, teachers would design problems and activities that led
to pupils questioning something they had held as truth (much like the
hypercorrection effect discussed earlier).
An important part of this process is for pupils to conjecture, test,
confirm, generalise and reason. In doing
so, pupils follow their own lines of inquiry and ask their own questions.
The ‘generation effect’ tells us that if
we give pupils minimal information and then ask them to generate a problem,
they will retain the learning far longer than if we simply give them the
problem to solve.
It is important that pupils believe they
are generating their own problems, but of course, the teacher has designed the scenario
such that the pupil will ask the questions we want them to ask. This is not discovery learning!
It is incumbent upon the teacher to ensure
that the pupil will be able to succeed at generating appropriate questions by
making sure the required knowledge and understanding is in place and by having
a good view of what the pupil already knows and believes.
The implication for teachers is
clear: the teacher should ask themselves
how often they create opportunities for pupils to generate their own questions
to solve and how to go about designing such opportunities. Some powerful examples include the use of ‘Always,
sometimes or never?’ prompts, asking ‘what is the same and what is different’
and using the prompt ‘and another… and another… and another…’, to make pupils
continue to generate new examples or counter-examples.
Performance is not a Good Proxy for Learning
The trap of high stakes systems is that
teachers are judged on what can be observed.
Unfortunately, what we can observe
is performance, which is an unreliable indicator of learning. In the moment, during an inspection, for
instance, we can only infer learning.
We have seen that conditions of
instruction that make performance improve rapidly often fail to support
long-term retention and transfer, whereas conditions of instruction that appear
to create difficulties for the learner, slowing the rate of apparent learning, often optimise
long-term retention and transfer. This
issue presents a real challenge to those who wish to judge the effectiveness of
teaching through observation – that is, it’s pretty much impossible to reliably
do so! This type of inspection is both
laughable and damaging, since it drives counterproductive teaching practices.
The reality is teachers do exist in a
landscape of inspection and this is not going to go away, so it is incumbent
upon the profession to at least ensure inspection is as meaningful and formative
as possible. This means training inspectors
to make long term inference rather than immediate performance observations. This is clearly a more intellectually demanding
task to carry out, but there is surely no excuse for not trying to make
inspection better reflect what we know about long term, sustained and
meaningful learning.
Teachers and pupils can be fooled
The lure of performance means that teachers
become susceptible to choosing poorer conditions of instruction over better
conditions, and pupils become susceptible to preferring those poorer conditions.
If teachers and observers applaud rapidity
and apparent ease of learning during lessons over conditions that more readily
lead to long term retention, a system wide preference and bias for poorer
conditions of learning becomes the accepted norm.
Also, pupils do not appear to develop a nose for identifying impactful ways of learning.
Rather, they are misled by indices, such as how fluently they process
information during a re-reading of material, into believing in poorer
conditions of learning.
This appears to be the case across several
aspects discussed above, as the graphs below demonstrate.
This unshakable misconception that we, as
learners, carry is an important consideration for the teacher. Pupils are repeatedly biased towards modes of
learning that actual results show to be less effective than the modes they
determine to be unhelpful.
The Teacher Parable
Another finding – one I believe most
teachers actually know in their hearts – that teachers should be aware of is
that teachers themselves almost always overestimate the impact of their
teaching.
A nice example of this can be found in
Newton’s experiment looking at the perception an instructor had about the impact
of their teaching against the actual impact.
Newton created two groups of participants;
tappers and listeners. The tappers were
handed a card on which was the name of a popular melody (e.g. Happy Birthday to
You). The tapper then tapped out the melody
on the table with their finger. The
listeners then recorded the name of the melody they believed they had just
heard.
The tappers were asked to predict how many
of their melodies had been correctly identified by the listeners. Here are the results
As you can see, the tappers wildly
overestimated their musical performance!
In the tapper's mind, the melody they are
tapping out is part of the overall song they can ‘hear’ in their head. They hear the instruments and lyrics, the
familiar tempo and all the richness of the music. So, to the tapper, it is obvious what melody is being performed.
The listener has none of this context,
none of this background information. All
they have is a novel set of tapping noises and rhythm.
This is often the case when mathematics is
being taught too. The teacher has
forgotten what it is like – intellectually and emotionally – to be in the
position of novice. They embark on the
teaching of, say, introducing trigonometric ratios, with all the richness of background
information and connections to other mathematical ideas (including ideas
conceptually beyond this stage), and have a sense of ease about the new
idea. This sometimes leads to ideas
being communicated to pupils as though they are also expert. The teacher, believing their explanation to
be clear, sensible and obvious, often gains a false sense of security in the
impact of their teaching, just like the tappers did.
This is why in a teaching for mastery
approach, continual assessment through questioning, discussion, listening,
observing and quizzing is so important.
The teacher must always be checking they have not fallen into the trap of
the Teacher Parable.
Moving from Propositional to Strategic Knowledge
For decades, I have been implementing the strategies
described in this blog in my own classrooms and with schools I work with. I have taken John Carroll’s seminal work on
cognitive science from the 1960s onwards, built on that understanding with findings
from many others over the years and tried to untangle those hypotheses
that are not replicable beyond controlled laboratory conditions from those
theories that have been shown to work in the classroom. I have been fascinated by, and am obsessed with,
finding answers to questions such as
· How long passes before someone starts to forget
something?
· What is the most effective period of time to allow to
pass before using the testing effect to force a pupil to recall?
· When should old material arise again in the spiral?
· How much maturation must occur before a pupil can effectively
use that prior learning and understanding in their own inquiry?
These questions have been intractable for
many decades now. Experiments have been
limited in scale and scope, meaning the data available to address these
fundamental questions is not yet sufficient to give educators truly useful
guidance.
Around 15 years ago, along with a group of
colleagues, I started to propose a large scale data collection that might help
to give new insight. We designed and built
an online system, over many iterations in different countries, capable of
capturing data on not just pupil performance, but also on teaching decisions,
curriculum planning, learning episode phasing, pupil retention, forgetfulness
and spiral intervals. Now with millions
of data points collected, Complete
Mathematics (our online platform) is starting to reveal interesting
patterns.
Of course, these are only patterns at the
moment and we are growing the community all the time and waiting for the data
bank to build up into the hundreds of millions of data points rather than just
tens of millions. At that stage, I will
publish the trends and inferences that the data suggest.
To date, we are seeing interesting
commonalities around high test results and long retention related to the nature
and phasing of the spiral in use (schools can personalise the model from the
default one).
I would like to end this part of the blog
by sharing these very tentative results.
Firstly, we are seeing high retention rates and test results
correlate with a novel idea being encountered in study
mode and then met again over three testing moments in the spiral. Beyond three times, there appears to be no discernible
difference; below three, there is poorer retention and test performance over
time.
At the moment, most of the models indicate
four sequential study and test encounters with the novel idea over four learning episodes.
The experience for the pupil is a learning
episode studying a novel idea (takes as long as it takes) as described earlier
in this blog. The next learning episode
is concerned with a new idea and the pupils are studying that idea, but content
from the previous idea is also contained in the lesson (though no teaching of
this previous idea occurs), meaning pupils need to recall and answer questions
on the previous idea (the testing effect).
This continues to build up so that, by the fourth learning episode, the
content of the episode is 20% study of a novel idea and 80% testing of three
previous ideas (actually, older ideas are often included too in the form of the
weekly, no-stakes quizzes that Complete Maths pupils undertake outside of class
time, but this content does not appear in class time).
With this scheduling of
study-test-test-test for each novel idea, future test performance is greatly
enhanced.
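The study-test-test-test pattern described above can be sketched in a few lines of Python. This is only an illustration, not how Complete Mathematics is implemented: the function name `episode_plan` and the numbering of ideas by the episode in which they are first studied are assumptions made for the example.

```python
def episode_plan(episode, tests_per_idea=3):
    """Return (idea studied, ideas tested) for a 1-indexed learning episode.

    Hypothetical helper: ideas are numbered by the episode in which they
    are first studied, so episode n studies idea n.
    """
    # Each idea is tested in the three episodes that follow its study,
    # so episode n retests the ideas studied in up to three prior episodes.
    first_tested = max(1, episode - tests_per_idea)
    tested = list(range(first_tested, episode))
    return episode, tested


for n in range(1, 6):
    studied, tested = episode_plan(n)
    print(f"Episode {n}: study idea {studied}, test ideas {tested}")
```

From the fourth episode onwards every episode contains one idea in study mode and three ideas in test mode, and each idea ends up being tested exactly three times after it is first studied.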
The second finding I would like to share
here relates to the timing of study episodes.
As discussed previously, every idea occurs again in the spiral so that
pupils can consider the idea from the vantage point of a more mature schema, see
further connections and reason further.
These results are very nascent and should be read as simply an
interesting early finding and not used to change the scheduling of your
curriculum.
We are seeing a correlation between high rates
of retention of test performance over time and the following spacing of study
periods.
There do not, at this stage, appear to
be any additional gains in studying the idea again after the 90-day study.
So, on early indications, each
mathematical idea will have five study periods and 15 testing periods on the
entire journey through mathematics.
There are approximately 320 mathematical
ideas that pupils are required to grip in their entire time at school in the Complete
Maths model. This means, for a pupil to
best grip all of those ideas, we need to provide approximately 1600 learning
episodes between year 1 and year 13.
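The arithmetic behind these figures appears to be as follows. This is a sketch under two assumptions: that each study period is followed by the three testing moments described earlier, and that each study period takes up one learning episode. The constant names are invented for the example.

```python
# Figures taken from the text above.
IDEAS = 320                  # approximate number of ideas in the model
STUDY_PERIODS_PER_IDEA = 5   # early indication from the data
TESTS_PER_STUDY = 3          # assumption: study-test-test-test each time

# Testing periods each idea receives across the whole journey.
tests_per_idea = STUDY_PERIODS_PER_IDEA * TESTS_PER_STUDY

# Learning episodes needed, assuming one episode per study period.
learning_episodes = IDEAS * STUDY_PERIODS_PER_IDEA

print(tests_per_idea, learning_episodes)
```

This gives 15 testing periods per idea and 1600 learning episodes in total, matching the figures quoted.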
In the coming years, we will continue to
monitor, refine and expand our model to take account of effective trends. We will move more deeply from correlation to
causation and, since our data set is live and vast, hope to be able to confirm
some of the assertions that Carleton Washburne made a century ago and cognitive
scientists have been able to replicate in real classroom conditions at the
small scale.
If you have enjoyed reading the first three parts of this discussion of a mastery approach, you might also like to continue reading the rest of the story in my book Teaching for Mastery.