It’s only five minutes late!


This is an indicator of the passage of time, not an educational mechanism.

I’ve been talking about why late penalties are not only not useful but they don’t work, yet I keep talking about getting work in on time and tying it to realistic resource allocation. Does this mean I’m really using late penalties?

No, but let me explain why, starting from the underlying principle of fairness that is an aesthetic pillar of good education. One part of this is that the actions of one student should not unduly affect the learning journey of another student. That includes evaluation (and associated marks).

This is the same principle that makes me reject curve grading. It makes no sense to me that someone else’s work is judged in the context of another, when we have so little real information with which we could establish any form of equivalence of human experience and available capacity.

I don’t want to create a market economy for knowledge, where we devaluate successful demonstrations of knowledge and skill for reasons that have nothing to do with learning. Curve grading devalues knowledge. Time penalties devalue knowledge.

I do have to deal with resource constraints, in that I often have (some) deadlines that are administrative necessities, such as degree awards and things like this. I have limited human resources, both personally and professionally.

Given that I do not have unconstrained resources, the fairness principle naturally extends to say that individual students should not consume resources to the detriment of others. I know that I have a limited amount of human evaluation time, therefore I have to treat this as a constrained resource. My E1 and E2 evaluations resources must be, to a degree at least, protected to ensure the best outcome for the most students. (We can factor equity into this, and should, but this stops this from being a simple linear equivalence and makes the terms more complex than they need to be for explanation, so I’ll continue this discussion as if we’re discussing equality.)

You’ve noticed that the E3 and E4 evaluation systems are pretty much always available to students. That’s deliberate. If we can automate something, we can scale it. No student is depriving another of timely evaluation and so there’s no limitation of access to E3 and E4, unless it’s too late for it to be of use.

If we ask students to get their work in at time X, it should be on the expectation that we are ready to leap into action at second X+(prep time), or that the students should be engaged in some other worthwhile activity from X+1, because otherwise we have made up a nonsense figure. In order to be fair, we should release all of our evaluations back at the same time, to avoid accidental advantages because of the order in which things were marked. (We may wish to vary this for time banking but we’ll come back to this later.) As many things are marked in surname or student number order, the only way to ensure that we don’t accidentally keep granting an advantage is to release everything at the same time.

Remember, our whole scheme is predicated on the assumption that we have designed and planned for how long it will take to go through the work and provide feedback in time for modification before another submission. When X+(prep time) comes, we should know, roughly to the hour or day, at worst, when this will be done.

If a student hands up fifteen minutes late, they have most likely missed the preparation phase. If we delay our process to include this student, then we will delay feedback to everyone. Here is a genuine motivation for students to submit on time: they will receive rich and detailed feedback as soon as it is ready. Students who hand up late will be assessed in the next round.

That’s how the real world actually works. No-one gives you half marks for something that you do a day late. It’s either accepted or not and, often, you go to the back of the queue. When you miss the bus, you don’t get 50% of the bus. You just have to wait for the next opportunity and, most of the time, there is another bus. Being late once rarely leaves you stranded without apparent hope – unlucky Martian visitors aside.

But there’s more to this. When we have finished with the first group, we can immediately release detailed feedback on what we were expecting to see, providing the best results to students and, from that point on, anyone who submits would have the benefit of information that the first group didn’t have before their initial submission. Rather than make the first group think that they should have waited (and we know students do), we give them the best possible outcome for organising their time.

The next submission deadline is done by everyone with the knowledge gained from the first pass but people who didn’t contribute to it can’t immediately use it for their own benefit. So there’s no free-riding.

There is, of course, a tricky period between the submission deadline and the release, where we could say “Well, they didn’t see the feedback” and accept the work but that’s when we think about the message we want to send. We would prefer students to improve their time management and one part of this is to have genuine outcomes from necessary deadlines.

If we let students keep handing in later and later, we will eventually end up having these late submissions running into our requirement to give feedback. But, more importantly, we will say “You shouldn’t have bothered” to those students who did hand up on time. When you say something like this, students will learn and they will change their behaviour. We should never reinforce behaviour that is the opposite of what we consider to be valuable.

Fairness is a core aesthetic of education. Authentic time management needs to reflect the reality of lost opportunity, rather than diminished recognition of good work in some numerical reduction. Our beauty argument is clear: we can be firm on certain deadlines and remove certain tasks from consideration and it will be a better approach and be more likely to have positive outcomes than an arbitrary reduction scheme already in use.

What are we assessing? How?

How we can create a better assessment system, without penalties, that works in a grade-free environment? Let’s provide a foundation for this discussion by looking at assessment today.


Bloom’s Revised Taxonomy

We have many different ways of understanding exactly how we are assessing knowledge. Bloom’s taxonomy allows us to classify the objectives that we set for students, in that we can determine if we’re just asking them to remember something, explain it, apply it, analyse it, evaluate it or, having mastered all of those other aspects, create a new example of it. We’ve also got Bigg’s SOLO taxonomy to classify levels of increasing complexity in a student’s understanding of subjects. Now let’s add in threshold concepts, learning edge momentum, neo-Piagetian theory and …

Let’s summarise and just say that we know that students take a while to learn things, can demonstrate some convincing illusions of progress that quickly fall apart, and that we can design our activities and assessment in a way that acknowledges this.

I attended a talk by Eric Mazur, of Peer Instruction fame, and he said a lot of what I’ve already said about assessment not working with how we know we should be teaching. His belief is that we rarely rise above remembering and understanding, when it comes to testing, and he’s at Harvard, where everyone would easily accept their practices as, in theory, being top notch. Eric proposed a number of approaches but his focus on outcomes was one that I really liked. He wanted to keep the coaching role he could provide separate from his evaluator role: another thing I think we should be doing more.

Eric is in Physics but all of these ideas have been extensively explored in my own field, especially where we start to look at which of the levels we teach students to and then what we assess. We do a lot of work on this in Australia and here is some work by our groups and others I have learned from:

  • Szabo, C., Falkner, K. & Falkner, N. 2014, ‘Experiences in Course Design using Neo-Piagetian Theory’
  • Falkner, K., Vivian, R., Falkner, N., 2013, ‘Neo-piagetian Forms of Reasoning in Software Development Process Construction’
  • Whalley, J., Lister, R.F., Thompson, E., Clear, T., Robbins, P., Kumar, P. & Prasad, C. 2006, ‘An Australasian study of reading and comprehension skills in novice programmers, using Bloom and SOLO taxonomies’
  • Gluga, R., Kay, J., Lister, R.F. & Teague, D. 2012, ‘On the reliability of classifying programming tasks using a neo-piagetian theory of cognitive development’

I would be remiss to not mention Anna Eckerdal’s work, and collaborations, in the area of threshold concepts. You can find her many papers on determining which concepts are going to challenge students the most, and how we could deal with this, here.

Let me summarise all of this:

  • There are different levels at which students will perform as they learn.
  • It needs careful evaluation to separate students who appear to have learned something from students who have actually learned something.
  • We often focus too much on memorisation and simple explanation, without going to more advanced levels.
  • If we want to assess advanced levels, we may have to give up the idea of trying to grade these additional steps as objectivity is almost impossible as is task equivalence.
  • We should teach in a way that supports the assessment we wish to carry out. The assessment we wish to carry out is the right choice to demonstrate true mastery of knowledge and skills.

If we are not designing for our learning outcomes, we’re unlikely to create courses to achieve those outcomes. If we don’t take into account the realities of student behaviour, we will also fail.

We can break our assessment tasks down by one of the taxonomies or learning theories and, from my own work and that of others, we know that we will get better results if we provide a learning environment that supports assessment at the desired taxonomic level.

But, there is a problem. The most descriptive, authentic and open-ended assessments incur the most load in terms of expert human marking. We don’t have a lot of expert human markers. Overloading them is not good. Pretending that we can mark an infinite number of assignments is not true. Our evaluation aesthetics are objectivity, fairness, effectiveness, timeliness and depth of feedback. Assignment evaluation should be useful to the students, to show progress, and useful to us, to show the health of the learning environment. Overloading the marker will compromise the aesthetics.

Our beauty lens tells us very clearly that we need to be careful about how we deal with our finite resources. As Eric notes, and we all know, if we were to test simpler aspects of student learning, we can throw machines at it and we have a near infinite supply of machines. I cannot produce more experts like me, easily. (Snickers from the audience) I can recruit human evaluators from my casual pool and train them to mark to something like my standard, using a rubric or using an approximation of my approach.

Thus I have a framework of assignments, divide by level, and I appear to have assignment evaluation resources. And the more expert and human the marker, the more … for want of a better word … valuable the resource. The better feedback it can produce. Yet the more valuable the resource, the less of it I have because it takes time to develop evaluation skills in humans.

Tune in tomorrow for the penalty free evaluation and feedback that ties all of this together.

Can we do this? We already have.

How does one actually turn everything I’ve been saying into a course that can be taught? We already have examples of this working, whether in the performance/competency based models found in medical schools around the world or whether in mastery learning based approaches where do not measure anything except whether a student has demonstrated sufficient knowledge or skill to show an appropriate level of mastery.

An absence of grades, or student control over their grades, is not as uncommon as many people think. MIT in the United States give students their entire first semester with no grades more specific than pass or fail. This is a deliberate decision to ease the transition of students who have gone from being leaders at their own schools to the compressed scale of MIT. Why compressed? If we were to assess all school students then we would need a scale that could measure all levels of ability, from ‘not making any progress at school’ to ‘transcendent’. The tertiary entry band is somewhere between ‘passing school studies’ to ‘transcendent’ and, depending upon the college that you enter, can shift higher and higher as your target institution becomes more exclusive. If you look at the MIT entry requirements, they are a little coy for ‘per student’ adjustments, but when the 75th percentile for the SAT components is 800, 790, 790, and 800,800,800 would be perfect, we can see that any arguments on how demotivating simple pass/fail grades must be for excellent students have not just withered, they have caught fire and the ash has blown away. When the target is MIT, it appears the freshmen get their head around a system that is even simpler than Rapaport’s.


Pictured: A highly prestigious University with some of the most stringent entry requirements in the world, which uses no grades in first semester.

Other universities, such as Brown, deliberately allow students to choose how their marks are presented, as they wish to deemphasise the numbers in order to focus on education. It is not a cakewalk to get into Brown, as these figures attest, and yet Brown have made a clear statement that they have changed their grading system in order to change student behaviour – and the world is just going to have to deal with that. It doesn’t seem to be hurting their graduates, from quotes on the website such as “Our 85% admission rate to medical school and 89% admission rate to law school are both far above the national average.

And, returning to medical schools themselves, my own University runs a medical program where the usual guidelines for grading do not hold. The medical school is running on a performance/competency scheme, where students who wish to practise medicine must demonstrate that they are knowledgable, skilful and safe to practice. Medical schools have identified the core problem in my thought experiment where two students could have the opposite set of knowledge or skills and they have come to the same logical conclusion: decide what is important and set up a scheme that works for it.

When I was a solider, I was responsible for much of the Officer Training in my home state for the Reserve. We had any number of things to report on for our candidates, across knowledge and skills, but one of them was “Demonstrate the qualities of an officer” and this single item could fail an otherwise suitable candidate. If a candidate could not be trusted to one day be in command of troops on the battlefield, based on problems we saw in peacetime, then they would be counselled to see if it could be addressed and, if not, let go. (I can assure you that this was not used often and it required a large number of observations and discussion before we would pull that handle. The power of such a thing forced us to be responsible.)

We know that limited scale, mastery-based approaches are not just working in the vocational sector but in allied sectors (such as the military), in the Ivy league (Brown) and in highly prestigious non-Ivy league institutions such as MIT. But we also know of examples such as Harvey Mudd, who proudly state that only seven students since 1955 have earned a 4.0 GPA and have a post on the career blog devoted to “explaining why your GPA is so low” And, be in no doubt, Harvey Mudd is an excellent school, especially for my discipline. I’m not criticising their program, I’ve only heard great things about them, but when you have to put up a page like that? You’re admitting that there’s a problem but you are pushing it on to the student to fix it. But contrast that with Brown, who say to employers “look at our students, not their grades” (at least on the website).

Feedback to the students on their progress is essential. Being able to see what your students are up to is essential for the teacher. Being able to see what your staff and schools are doing is important for the University. Employers want to know who to hire. Which of these is the most important?

The students. It has to be the students. Doesn’t it? (Arguments for the existence of Universities as a self-sustaining bureaucracy system in the comments, if you think that’s a thing you want to do.)

This is not an easy problem but, as we can see, we have pieces of the solution all over the place. Tomorrow, I’m going to put in a place a cornerstone of beautiful assessment that I haven’t seen provided elsewhere or explained in this way. (Then all of you can tell me which papers I should have read to get it from, I can publish the citation, and we can all go forward.)


Not just videos!


Just a quick note that on-line learning is not just videos! I am a very strong advocate of active learning in my face-to-face practice and am working to compose on-line systems that will be as close to this as possible: learning and doing and building and thinking are all essential parts of the process.

Please, once again, check out Mark’s CACM blog on the 10 myths of teaching computer science. There’s great stuff here that extends everything I’m talking about with short video sequences and attention spans. I wrote something ages ago about not turning ‘chalk and talk’ into ‘watch and scratch (your head)’. It’s a little dated but I include it for completeness.

Teaching for (current) Humans

da Vinci's Vitrvuian Man. Human figure with arms and legs outstretched showing the ratios of the perfect form.

Leonardo’s experiments in human-octopus engineering never received appropriate recognition.

I was recently at a conference-like event where someone stood up and talked about video lectures. And these lectures were about 40 minutes long.

Over several million viewing sessions, EdX have clearly shown that watchable video length tops out at just over 6 minutes. And that’s the same for certificate-earning students and the people who have enrolled for fun. At 9 minutes, students are watching for fewer than 6 minutes. At the 40 minute mark, it’s 3-4 minutes.

I raised this point to the speaker because I like the idea that, if we do on-line it should be good on-line, and I got a response that was basically “Yes, I know that but I think the students should be watching these anyway.” Um. Six minutes is the limit but, hey, students, sit there for this time anyway.

We have never been able to unobtrusively measure certain student activities as well as we can today. I admit that it’s hard to measure actual attention by looking at video activity time but it’s also hard to measure activity by watching students in a lecture theatre. When we add clickers to measure lecture activity, we change the activity and, unsurprisingly, clicker-based assessment of lecture attentiveness gives us different numbers to observation of note-taking. We can monitor video activity by watching what the student actually does and pausing/stopping a video is a very clear signal of “I’m done”. The fact that students are less likely to watch as far on longer videos is a pretty interesting one because it implies that students will hold on for a while if the end is in sight.

In a lecture, we think students fade after about 15-20 minutes but, because of physical implications, peer pressure, politeness and inertia, we don’t know how many students have silently switched off before that because very few will just get up and leave. That 6 minute figure may be the true measure of how long a human will remain engaged in this kind of task when there is no active component and we are asking them to process or retain complex cognitive content. (Speculation, here, as I’m still reading into one of these areas but you see where I’m going.) We know that cognitive load is a complicated thing and that identifying subgoals of learning makes a difference in cognitive load (Morrison, Margulieux, Guzdial)  but, in so many cases, this isn’t what is happening in those long videos, they’re just someone talking with loose scaffolding. Having designed courses with short videos I can tell you that it forces you, as the designer and teacher, to focus on exactly what you want to say and it really helps in making your points, clearly. Implicit sub-goal labelling, anyone? (I can hear Briana and Mark warming up their keyboards!)

If you want to make your videos 40 minutes long, I can’t stop you. But I can tell you that everything I know tells me that you have set your materials up for another hominid species because you’re not providing something that’s likely to be effective for current humans.


Assessment is (often) neither good nor true.

If you’ve been reading my blog over the past years, you’ll know that I have a lot of time for thinking about assessment systems that encourage and develop students, with an emphasis on intrinsic motivation. I’m strongly influenced by the work of Alfie Kohn, unsurprisingly given I’ve already shown my hand on Focault! But there are many other writers who are… reassessing assessment: why we do it, why we think we are doing it, how we do it, what actually happens and what we achieve.

Screen Shot 2016-01-09 at 6.50.12 PM

In my framing, I want assessment to be as all other aspects of education: aesthetically satisfying, leading to good outcomes and being clear and what it is and what it is not. Beautiful. Good. True. There are some better and worse assessment approaches out there and there are many papers discussing this.  One of these that I have found really useful is Rapaport’s paper on a simplified assessment process for consistent, fair and efficient grading. Although I disagree with some aspects, I consider it to be both good, as it is designed to clearly address a certain problem to achieve good outcomes, and it is true, because it is very honest about providing guidance to the student as to how well they have met the challenge. It is also highly illustrative and honest in representing the struggle of the author in dealing with the collision of novel and traditional assessment systems. However, further discussion of Rapaport is for the near future. Let me start by demonstrating how broken things often are in assessment, by taking you through a hypothetical situation.

Thought Experiment 1

Two students, A and B, are taking the same course. There are a number of assignments in the course and two exams. A and B, by sheer luck, end up doing no overlapping work. They complete different assignments to each other, half each and achieve the same (cumulative bare pass overall) marks. They then manage to score bare pass marks in both exams, but one answers only the even questions and only answers the odd. (And, yes, there are an even number of questions.) Because of the way the assessment was constructed, they have managed to avoid any common answers in the same area of course knowledge. Yet, both end up scoring 50%, a passing grade in the Australian system.

Which of these students has the correct half of the knowledge?

I had planned to build up to Rapaport but, if you’re reading the blog comments, he’s already been mentioned so I’ll summarise his 2011 paper before I get to my main point. In 2011, William J. Rapaport, SUNY Buffalo, published a paper entitled “A Triage Theory of Grading: The Good, The Bad and the Middling.” in Teaching Philosophy. This paper summarised a number of thoughtful and important authors, among them Perry, Wolff, and Kohn. Rapaport starts by asking why we grade, moving through Wolff’s taxonomic classification of assessment into criticism, evaluation, and ranking. Students are trained, by our world and our education systems to treat grades as a measure of progress and, in many ways, a proxy for knowledge. But this brings us into conflict with Perry’s developmental stages, where students start with a deep need for authority and the safety of a single right answer. It is only when students are capable of understanding that there are, in many cases, multiple right answers that we can expect them to understand that grades can have multiple meanings. As Rapaport notes, grades are inherently dual: a representative symbol attached to a quality measure and then, in his words, “ethical and aesthetic values are attached” (emphasis mine.) In other words, a B is a measure of progress (not quite there) that also has a value of being … second-tier if an A is our measure of excellence. A is not A, as it must be contextualised. Sorry, Ayn.

When we start to examine why we are grading, Kohn tells us that the carrot and stick is never as effective as the motivation that someone has intrinsically. So we look to Wolff: are we critiquing for feedback, are we evaluating learning, or are we providing handy value measures for sorting our product for some consumer or market? Returning to my thought experiment above, we cannot provide feedback on assignments that students don’t do, our evaluation of learning says that both students are acceptable for complementary knowledge, and our students cannot be discerned from their graded rank, despite the fact that they have nothing in common!

Yes, it’s an artificial example but, without attention to the design of our courses and in particular the design of our assessment, it is entirely possible to achieve this result to some degree. This is where I wish to refer to Rapaport as an example of thoughtful design, with a clear assessment goal in mind. To step away from measures that provide an (effectively) arbitrary distinction, Rapaport proposes a tiered system for grading that simplifies the overall system with an emphasis on identifying whether a piece of assessment work is demonstrating clear knowledge, a partial solution, an incorrect solution or no work at all.

This, for me, is an example of assessment that is pretty close to true. The difference between a 74 and a 75 is, in most cases, not very defensible (after Haladyna) unless you are applying some kind of ‘quality gate’ that really reduces a percentile scale to, at most, 13 different outcomes. Rapaport’s argument is that we can reduce this further and this will reduce grade clawing, identify clear levels of achieve and reduce marking load on the assessor. That last point is important. A system that buries the marker under load is not sustainable. It cannot be beautiful.

There are issues in taking this approach and turning it back into the grades that our institutions generally require. Rapaport is very open about the difficulties that he has turning his triage system into an acceptable letter grade and it’s worth reading the paper to see that discussion alone, because it quite clearly shows what

Rapaport’s scheme clearly defines which of Wolff’s criteria he wishes his assessment to achieve. The scheme, for individual assessments, is no good for ranking (although we can fashion a ranking from it) but it is good to identify weak areas of knowledge (as transmitted or received) for evaluation of progress and also for providing elementary critique. It says what it is and it pretty much does it. It sets out to achieve a clear goal.

The paper ends with a summary of the key points of Haladyna’s 1999 book “A Complete Guide to Student Grading”, which brings all of this together.

Haladyna says that “Before we assign a grade to any students, we need:

  1. an idea about what a grade means,
  2. an understanding of the purposes of grading,
  3. a set of personal beliefs and proven principles that we will use in teaching

    and grading,

  4. a set of criteria on which the grade is based, and, finally,
  5. a grading method,which is a set of procedures that we consistently follow

    in arriving at each student’s grade. (Haladyna 1999: ix)

There is no doubt that Rapaport’s scheme meets all of these criteria and, yet, for me, we have not yet gone far enough in search of the most beautiful, most good and most true extent that we can take this idea. Is point 3, which could be summarised as aesthetics not enough for me? Apparently not.

Tomorrow I will return to Rapaport to discuss those aspects I disagree with and, later on, discuss both an even more trimmed-down model and some more controversial aspects.

Beauty Attack I: Assessment

For the next week, I’m going to be applying an aesthetic lens to assessment and, because I’m in Computer Science, I’ll be focusing on the assessment of Computer Science knowledge and practice.

How do we know if our students know something? In reality, the best way is to turn them loose, come back in 25 years and ask the people in their lives, their clients, their beneficiaries and (of course) their victims, the same question: “Did the student demonstrate knowledge of area X?”

This is not available to us as an option because my Dean, if not my Head of School, would probably peer at me curiously if I were to suggest that all measurement of my efficacy be moved a generation from now. Thus, I am forced to retreat to the conventions and traditions of assessment: it is now up to the student to demonstrate to me, within a fixed timeframe, that he or she has taken a firm grip of the knowledge.

We know that students who are prepared to learn and who are motivated to learn will probably learn, often regardless of what we do. We don’t have to read Vallerand et al to be convinced that self-motivated students will perform, as we can see it every day. (But it is an enjoyable paper to read!) Yet we measure these students in the same assessment frames as students who do not have the same advantages and, thus, may not yet have the luxury or capacity of self-motivation: students from disadvantaged backgrounds, students who are first-in-family and students who wouldn’t know auto-didacticism if it were to dance in front of them.

How, then, do we fairly determine what it means to pass, what it means to fail and, even more subtly, what it means to pass or fail well? I hesitate to invoke Foucault, especially when I speak of “Discipline and Punish” in an educational setting, but he is unavoidable when we gaze upon a system that is dedicated to awarding ranks, graduated in terms of punishment and reward. It is strange, really, that were many patients to die under the hand of a surgeon for a simple surgery, we would ask for an inquest, but many students failing under the same professor in a first-year course is merely an indicator of “bad students”. So many of our mechanisms tell us that students are failing but often too late to be helpful and not in a way that encourages improvement. This is punishment. And it is not good enough.

A picture of Michel Foucault, mostly head shot, with his hand on his forehead, wearing glasses, apparently deep in contemplation.

Foucault: thinking about something very complicated, apparently.

Our assessment mechanisms are not beautiful. They are barely functional. They exist to provide a rough measure to separate pass from fail, with a variety of other distinctions that owe more to previous experience and privilege in many cases than any higher pedagogical approach.

Over the next week, I shall conduct an attack upon the assessment mechanisms that are currently used in my field, including my own, in the hope of arriving at a mechanism of design, practice and validation that is pedagogically pleasing (the aesthetic argument again) and will lead to outcomes that are both good and true.

Getting it wrong

It’s fine to write all sorts of wonderful statements about theory and design and we can achieve a lot in thinking about such things. But, let’s be honest, we face massive challenges in the 21st Century and improved thinking and practice in education is one of the most important contributions we can make to future generations. Thus, if we want to change the world based upon our thinking, then all of our discussions have no use if we can’t develop something that’s going to achieve our goals. Dewey’s work provide an experimental, even instrumental, approach to the American philosophical school of pragmatism. To briefly explain this term in the specific meaning, I turn to William James, American psychologist and philosopher.

Pragmatism asks its usual question. “Grant an idea or belief to be true,” it says, “what concrete difference will its being true make in anyone’s actual life? How will the truth be realized? What experiences will be different from those which would obtain if the belief were false? What, in short, is the truth’s cash-value in experiential terms?”
William James, Pragmatism (1907)

(James is far too complex to summarise with one paragraph and I am using only one of his ideas to illustrate my point. Even James’ scholars disagree on how to interpret many of his writings. It’s worth reading him and Hegel at the same time as they square off across the ring quite well.)

Portrait, side view, of a man before middle-age, with dark receding hair, wearing a waistcoat and red tie over a white shirt. He appears to be painting.

Portrait of William James by John La Farge, circa 1859

What will be different? How will we recognise or measure it? What do we gain by knowing if we are right or wrong? This is why all good education researchers depend so heavily on testing their hypotheses in the space where they will make an impact and there is usually an obligation to look at how things are working before and after any intervention. This places further obligation upon us to evaluate what has occurred and then, if our goals haven’t been achieved, change our approach further. It’s a simple breakdown of roles but I often think as educational work in three heavily overlapping areas: practice, scholarship and research. Practice should be applying techniques that achieve our goals, scholarship involves the investigation, dissemination and comparison of these techniques, and research builds on scholarship to evaluate practice in ways that will validate and develop new techniques – or invalidate formerly accepted ones as knowledge improves. This leads me to my point: evaluating your own efforts to work out how to do better next time.

There are designers, architects, makers and engineers who are committed to the practice of impact design, where (and this is one definition):

“Impact design is rooted in the core belief that design can be used to create positive social, environmental and economic change, and focuses on actively measuring impact to inform and direct the design process.”  Impact Design Hub, About.

Thus, evaluation of what works is essential for these practitioners. The same website recently shared some designers talking about things that went wrong and what they learned from the process.

If you read that link, you’ll see all sorts of lessons: don’t hand innovative control to someone who’s scared of risk, don’t ignore your community, don’t apply your cultural values to others unless you really know what you’re doing, and don’t forget the importance of communication.

Writing some pretty words every day is not going to achieve my goal and I need to be reminded of the risks that I face in trying to achieve something large – one of which is not actually working towards my own goals in a useful manner! One of the biggest risks is confusing writing a blog with actual work, unless I use this medium to do something. Over the coming weeks, I hope to show you what I am doing as I move towards my very ambitious goal of “beautiful education”. I hope you find the linked article as useful as I did.


Maximise beauty or minimise …?

There is a Peanuts comic from April 16, 1972 where Peppermint Patty asks Charlie Brown what he thinks the secret of living is. Charlie Brown’s answer is “A convertible and a lake.” His reasoning is simple. When it’s sunny you can drive around in the convertible and be happy. When it’s raining you can think “oh well, the rain will fill up my lake.” Peppermint Patty asks Snoopy the same question and, committed sensualist that he is, he kisses her on the nose.


This is the Amphicar. In the 21st century, no philosophical construct will avoid being reified.

Charlie Brown, a character written to be constantly ground down by the world around him, is not seeking to maximise his happiness, he is seeking to minimise his unhappiness. Given his life, this is an understandable philosophy.

But what of beauty and, in this context, beauty in education? I’ve already introduced the term ‘ugly’ as the opposite of beauty but it’s hard for me to wrap my head around the notion of ‘minimising ugliness’; ugly is such a strong term. It’s also hard to argue that any education, when it covers any aspects to which we would apply that label, is ever totally ugly. Perhaps, in the educational framing, the absence of beauty is plainness. We end up with things that are ordinary, rather than extraordinary. I think that there is more than enough range between beauty and plainness for us to have a discussion on the movement between those states.

Is it enough for us to accept educational thinking that is acceptably plain? Is that a successful strategy? Many valid concerns about learning at scale focus on the innate homogeneity and lack of personalisation inherent in such an approach: plainness is the enemy. Yet there are many traditional and face-to-face approaches where plainness stares us in the face. Banality in education is, when identified, always rejected, yet it so often slips by without identification. We know that there is a hole in our slippers, yet we only seem to notice when that hole directly affects us or someone else points it out.

My thesis here is that a framing of beauty should lead us to a strategy of maximising beauty, rather than minimising plainness, as it is only in that pursuit that we model that key stage of falling in love with knowledge that we wish our students to emulate. If we say “meh is ok”, then that is what we will receive in return. We model, they follow as part of their learning. That’s what we’re trying to make happen, isn’t it?

What would Charlie Brown’s self-protective philosophy look like in a positive framing, maximising his joy rather than managing his grief? I’m not sure but I think it would look a lot like a dancing beagle who kisses people on the nose. We may need more than this for a sound foundation to reframe education!

A Year of Beauty

Plato: Unifying key cosmic values of Greek culture to a useful conceptual trinity.

Plato: Unifying key cosmic values of Greek culture to a useful conceptual trinity.

Ever since education became something we discussed, teachers and learners alike have had strong opinions regarding the quality of education and how it can be improved. What is surprising, as you look at these discussions over time, is how often we seem to come back to the same ideas. We read Dewey and we hear echoes of Rousseau. So many echoes and so much careful thought, found as we built new modern frames with Vygotsky, Piaget, Montessori, Papert and so many more. But little of this should really be a surprise because we can go back to the writings of Marcus Fabius Quintilianus (Quinitilian) and his twelve books of The Orator’s Education and we find discussion of small class sizes, constructive student-focused discussions, and that more people were capable of thought and far-reaching intellectual pursuits than was popularly believed.

… as birds are born for flying, horses for speed, beasts of prey for ferocity, so are [humans] for mental activity and resourcefulness.” Quintilian, Book I, page 65.

I used to say that it was stunning how contemporary education seems to be slow in moving in directions first suggested by Dewey a hundred years ago, then I discovered that Rousseau had said it 150 years before that. Now I find that Quntilian wrote things such as this nearly 2,000 years ago. And Marcus Aurelius, among other stoics, made much of approaches to thinking that, somehow, were put to one side as we industrialised education much as we had industrialised everything else.

This year I have accepted that we have had 2,000 years of thinking (and as much evidence when we are bold enough to experiment) and yet we just have not seen enough change. Dewey’s critique of the University is still valid. Rousseau’s lament on attaining true mastery of knowledge stands. Quintilian’s distrust of mere imitation would not be quieted when looking at much of repetitive modern examination practice.

What stops us from changing? We have more than enough evidence of discussion and thought, from some of the greatest philosophers we have seen. When we start looking at education, in varying forms, we wander across Plato, Hypatia, Hegel, Kant, Nietzsche, in addition to all of those I have already mentioned. But evidence, as it stands, does not appear to be enough, especially in the face of personal perception of achievement, contribution and outcomes, whether supported by facts or not.

Evidence of uncertainty is not enough. Evidence of the lack of efficacy of techniques, now that we can and do measure them, is not enough. Evidence that students fail who then, under other tutors or approaches, mysteriously flourish elsewhere, is not enough.

Authority, by itself, is not enough. We can be told to do more or to do things differently but the research we have suggests that an externally applied control mechanism just doesn’t work very well for areas where thinking is required. And thinking is, most definitely, required for education.

I have already commented elsewhere on Mark Guzdial’s post that attracted so much attention and, yet, all he was saying was what we have seen repeated throughout history and is now supported in this ‘gilt age’ of measurement of efficacy. It still took local authority to stop people piling onto him (even under the rather shabby cloak of ‘scientific enquiry’ that masks so much negative activity). Mark is repeating the words of educators throughout the ages who have stepped back and asked “Is what we are doing the best thing we could be doing?” It is human to say “But, if I know that this is the evidence, why am I acting as if it were not true?” But it is quite clear that this is still challenging and, amazingly, heretical to an extent, despite these (apparently controversial) ideas pre-dating most of what we know as the trappings and establishments of education. Here is our evidence that evidence is not enough. This experience is the authority that, while authority can halt a debate, authority cannot force people to alter such a deeply complex and cognitive practice in a useful manner. Nobody is necessarily agreeing with Mark, they’re just no longer arguing. That’s not helpful.

So, where to from here?

We should not throw out everything old simply because it is old, as that is meaningless without evidence to do so and it is wrong as autocratically rejecting everything new because it is new.

The challenge is to find a way of explaining how things could change without forcing conflict between evidence and personal experience and without having to resort to an argument by authority, whether moral or experiential. And this is a massive challenge.

This year, I looked back to find other ways forward. I looked back to the three values of Ancient Greece, brought together as a trinity through Socrates and Plato.

These three values are: beauty, goodness and truth. Here, truth means seeing things as they are (non-concealment). Goodness denotes the excellence of something and often refers to a purpose of meaning for existence, in the sense of a good life. Beauty? Beauty is an aesthetic delight; pleasing to those senses that value certain criteria. It does not merely mean pretty, as we can have many ways that something is aesthetically pleasing. For Dewey, equality of access was an essential criterion of education; education could only be beautiful to Dewey if it was free and easily available. For Plato, the revelation of knowledge was good and beauty could arose a love for this knowledge that would lead to such a good. By revealing good, reality, to our selves and our world, we are ultimately seeking truth: seeing the world as it really is.

In the Platonic ideal, a beautiful education leads us to fall in love with learning and gives us momentum to strive for good, which will lead us to truth. Is there any better expression of what we all would really want to see in our classrooms?

I can speak of efficiencies of education, of retention rates and average grades. Or I can ask you if something is beautiful. We may not all agree on details of constructivist theory but if we can discuss those characteristics that we can maximise to lead towards a beautiful outcome, aesthetics, perhaps we can understand where we differ and, even more optimistically, move towards agreement. Towards beautiful educational practice. Towards a system and methodology that makes our students as excited about learning as we are about teaching. Let me illustrate.

A teacher stands in front of a class, delivering the same lecture that has been delivered for the last ten years. From the same book. The classroom is half-empty. There’s an assignment due tomorrow morning. Same assignment as the last three years. The teacher knows roughly how many people will ask for an extension an hour beforehand, how many will hand up and how many will cheat.

I can talk about evidence, about pedagogy, about political and class theory, about all forms of authority, or I can ask you, in the privacy of your head, to think about these questions.

  • Is this beautiful? Which of the aesthetics of education are really being satisfied here?
  • Is it good? Is this going to lead to the outcomes that you want for all of the students in the class?
  • Is it true? Is this really the way that your students will be applying this knowledge, developing it, exploring it and taking it further, to hand on to other people?
  • And now, having thought about yourself, what do you think your students would say? Would they think this was beautiful, once you explained what you meant?

Over the coming year, I will be writing a lot more on this. I know that this idea is not unique (Dewey wrote on this, to an extent, and, more recently, several books in the dramatic arts have taken up the case of beauty and education) but it is one that we do not often address in science and engineering.

My challenge, for 2016, is to try to provide a year of beautiful education. Succeed or fail, I will document it here.