Indulgences, LLMs, and the crisis of the university

Just as indulgences once acted as a proxy for salvation, so exams now serve as proxies for learning.

A woodcut from 1510 critiquing the practice of indulgence trading in medieval Europe. Credit: The History Collection

Students have always tried to cheat.

I used to be a secondary English teacher, and remember one student submitting an essay that was copied and pasted from SparkNotes. What gave him away? He hadn’t bothered to delete the advert for teeth whitening that was on the webpage.

Another time, a student submitted an essay with a series of paragraphs in the middle that had been copied and pasted from Wikipedia, with just a couple of words tweaked.

In 2014 a university lecturer complained about students who overused the thesaurus function on their word processors and ended up with absurdities like ‘sinister buttocks’ replacing ‘left behind’.

SparkNotes, Wikipedia and the word-processor thesaurus are technologies of the past. In that world, the most effective and untraceable form of cheating on essays was only available to the very wealthy, who could afford to pay someone smart to write a bespoke essay for them. There used to be a lot of demand for untraceable plagiarism, but very limited supply.

Large Language Models have changed all that. One of the most popular use cases for LLMs is cheating on essays – or, in Anthropic’s slightly more euphemistic term, ‘completing academic assignments’. For all the talk of curing cancer, solving email or establishing colonies in space, it is the democratisation of cheating and the disruption of assessment where LLMs have really excelled.

Institutions can cope with elite rule-breaking. When everyone breaks the rules, it causes a crisis. It is theoretically possible that there will be students graduating with good degrees this summer who have not written a word of their own during their entire degree course. What should universities do in response?

Assessment is a measurement of learning. Unfortunately, learning is invisible, and we cannot measure it directly. We cannot scan someone’s brain to see if they understood that morning’s lesson on multiplication. Instead, we have to measure learning indirectly. We set students tasks, and based on how they perform those tasks, we make inferences about what they have learned. If a student can correctly fill in a multiplication grid, then maybe we are justified in inferring that they can do multiplication. But if we know that they copied the multiplication grid from a friend, that will obviously change our inference.

The most basic principle of assessment is that the score on the test itself does not matter: what matters is the inference you can make. The multiplication grid itself does not matter, the essay does not matter, the portfolio does not matter. They are proxies for learning, not learning itself. What matters is what that particular proxy allows you to infer about what the student has learned. This is the fundamental explanation for why allowing LLMs to do the assessment is a problem: they break the link between the proxy and most of the inferences that anyone would want to make from an assessment.

The process of proving that a test score does support the inference you want is known as validation, and it is not straightforward. Universities and employers want to make inferences about current performance based on exams taken years ago. Some of the best validation processes therefore involve following up with students years after the exam.

Even if you successfully validate an assessment once, that may not be enough. Over time, a test can stay the same while the inferences it supports weaken as students find ways of gaming it. A hallmark of a badly-designed test is that it rewards intensive study of the mark scheme more than study of the construct it is supposed to measure.

Many national and international school-assessment systems have fairly rigorous validation processes, but university systems tend to be less thorough. Universities could turn the crisis of LLMs into an opportunity and use it to rethink their assessment procedures from first principles. It is unlikely they will do this, and it is a high-risk strategy, because a well-designed exam system might reveal some uncomfortable truths about just how much – or how little – students learn at university.

There is a long-running debate about whether the value of a university degree derives from the skills and knowledge it teaches the student, or whether the degree functions as a signal to employers that the graduate is of above average intelligence and diligence. The absence of good assessment evidence makes the debate hard to settle one way or the other, but there are a number of facts that suggest signalling is a more important part of the mix than many would assume.

University fees have increased dramatically over the past few decades, even as the cost of access to knowledge has collapsed. It is not always clear that the hefty student fees are being spent on improving learning: at many universities, the numbers of administrative and managerial staff have increased rapidly, while teaching is often carried out by teaching-only staff on part-time and insecure contracts. There’s also been a building boom, and the phenomenon of ‘campus expansion’, where universities open a secondary campus in a different city.

Signalling theory explains these facts. Students want to put the name of a high-status institution on their CV, and high fees and shiny new buildings signal prestige.

Technology has democratised cheating, forced a destabilising focus on the first principles of assessment, and revealed the credentialism at the heart of modern academia.

This set of circumstances is capable of destroying institutions. In the 16th century, the sale of papal indulgences was democratised by the printing press. This led to a focus on the theological first principles of sin and forgiveness, which upended religious institutions across Europe. Theologians ended up with their own validation problem: did indulgence certificates or purple vestments or beautiful buildings or polyphonic motets say something meaningful about the state of your immortal soul? Or were they just proxies whose link to reality had been broken – worse, proxies that not only failed to measure good behaviour, but actively rewarded bad behaviour?

Martin Luther’s use of the printing press to create viral pamphlets is often seen as the emblematic example of how technology sparked the Reformation. But Luther’s pamphlets were a reaction to an earlier and equally innovative use of the printing press: the indulgence certificate template. This was a printable pro forma that dramatically expanded access to indulgences and enabled Luther’s antagonist, Johann Tetzel, to raise money for the grand building project of St Peter’s Basilica.

Luther’s response to Tetzel was the Ninety-Five Theses, which are all about the validation problem.

Thesis 28: ‘What is certain is that, when the money clinks in the tin, profit increases, and avarice can, too. The church’s power of intercession, however, is entirely in God’s hands.’ Tetzel is charging people for certificates which he says will remit the punishment due for sin. But the certificate is just a proxy. How can we be certain that there is a link between the proxy and the will of God?

Thesis 36: ‘Any truly remorseful Christian has a right to full remission of punishment and guilt even without letters of indulgence.’ You can just repent of your sins. The indulgence letter isn’t necessary, just as you can learn things without going to university or getting a degree certificate.

Thesis 43: ‘Christians should be taught that giving to the poor or lending to the needy is better than buying indulgences.’ What if, like a badly-designed test, the sale of indulgences makes people less likely to do the things that will actually win them forgiveness, such as donating money to the poor and needy?

Proto-Protestants were making similar arguments decades before Luther. What made Luther’s attack so incendiary was not just the easy circulation enabled by the printing press. It was that, thanks to Tetzel’s own innovative use of the printing press, many ordinary people would have purchased indulgences themselves, or known someone who had. Indulgence purchasing was a reality to the ordinary person in a way it had not been a generation earlier – just as plagiarism is now a reality to the median student in a way it was not five years ago.

The final historical lesson of indulgences is that problems which start out within an institution get noticed by people outside the institution. Luther and Tetzel were two Catholics arguing about the meaning of sin and forgiveness. Before long, their argument spread beyond the church to secular leaders, who destroyed its privileges and property. Universities might not be interested in a reformation, but a reformation might be interested in them.

Author

Daisy Christodoulou