The Mediocrity Machine

Without refinement and iteration, LLMs seem to want to be distinctly average…

During a recent creative writing lesson with a Key Stage 3 class I noticed several students using AI to assist in their work. I had previously discussed with the class the ways in which AI can and cannot be useful, but to facilitate a more controlled use of AI in this work I supplied them with a GDPR-compliant chatbot I had created using the Mizou platform, to ensure standardised guardrails and to offer limited help within clearly defined parameters. I allowed the students to get assistance with synonyms and examples of imagery, and even to ask the AI for feedback on short samples of their own writing. Ultimately, the assessment was to be handwritten in their books; students were allowed to use AI in the preparation stage, as long as they adhered to the clear guidelines I provided.

However, some students still attempted to generate large pieces of work using AI, overestimating its effectiveness and underestimating the importance of the learning process.

How to showcase AI ‘creativity’?

As a result, I decided to demonstrate why I believe the students are more clever and creative than AI could ever be. I produced a prompt, similar to the kind used to scaffold a piece of work for a class: to describe a desert scene – a task I had actually used earlier in the year. I then fed my prompt into four different AI models, which produced four seemingly different descriptions. However, upon closer examination, all the descriptions were structured in almost the same way. My prompt included fairly detailed success criteria, such as a student might be given, and the AIs followed these very closely.

Moreover, each of the AI-generated descriptions used a similar style and reached for approximately the same kinds of clichés. I then took all four descriptions and ran them through Claude, an AI assistant, asking it to identify the similarities. It reached the same conclusions I did, noting the use of the same images, such as cacti “standing sentinel with their thorny arms pointing towards the skies” (all four versions used the word ‘sentinel’, and three of them in relation to cacti), and the presence of similar sound imagery. The descriptions utilised the same metaphors and relied heavily on poetic language, resulting in a fairly generic, paint-by-numbers approach.

Yes, the vocabulary was impressive. Yes, it ticked all of the boxes for the assessment criteria – not a surprise, as this was literally what I included in the prompt. Yes, it would score highly on all of the structural and content aspects of the assessment. However, what was lacking was any sense of originality, which really should come as no surprise.

What did the students make of it?

I created a resource with my instructions and the four examples together on a single worksheet. The students were initially impressed, but then I got them to look for similarities between the texts, and the truth of the matter began to appear. The students were able to see the striking structural similarities – once they ‘pulled back’ and moved away from the micro-focus on seemingly impressive elements such as the vocabulary and use of imagery.

Showing these examples to the students highlighted the strengths and limitations of AI. It is useful for providing alternatives, serving as a thesaurus, giving examples of language use and even offering feedback on the effectiveness of one’s own work. AI also excels at grammar and structural elements, where rule-based and statistical patterns can be applied reliably. However, when it comes to creativity and originality, AI falls short.

Taking it further

As an extension to this thought exercise, I ran all four of the texts back through Claude and asked it to identify the similarities – it came up with this:

  • Personification of the desert landscape
  • Emphasis on the harshness and unforgiving nature of the desert
  • Vivid sensory descriptions
  • Use of poetic and metaphorical language
  • Introduction of desert flora and fauna
  • Reflection on the passage of time and the desert’s ancient history

The students had spotted most of these – that is how clear the patterning was.
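For anyone who wants to surface that kind of overlap without asking an AI to judge itself, a few lines of Python will do a crude version of the job. The snippet below is only a sketch – the four strings are invented stand-ins, not the actual descriptions – and the idea is simply to count how many of the texts each word appears in:

```python
import re
from collections import Counter

# The four AI-generated descriptions would go here; these strings are
# invented stand-ins purely for illustration.
texts = [
    "Cacti stood sentinel, their thorny arms pointing towards the skies.",
    "Ancient dunes whispered while the cacti stood sentinel over the sands.",
    "The desert lay silent; a lone cactus stood sentinel beneath a merciless sun.",
    "Sentinel rocks watched the shimmering horizon as the hot wind hissed past.",
]

def words(text):
    """Lower-case word tokens, punctuation stripped."""
    return set(re.findall(r"[a-z']+", text.lower()))

# Count how many of the texts each word appears in.
doc_counts = Counter()
for t in texts:
    doc_counts.update(words(t))

# Words shared by three or more of the four texts point at the clichés.
shared = sorted(w for w, n in doc_counts.items() if n >= 3 and len(w) > 3)
print(shared)
```

With these placeholder strings it flags ‘sentinel’ immediately – exactly the kind of repeated choice the class picked up on by eye.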

The prompt I used was this:

Please create a description of a [desert scene] that could be the setting for the start of a narrative. Include the following techniques; sensory language, simile, metaphor, personification, zoomorphism, anaphora, descriptive verb and noun choices, adverbial sentence starters. Vary some of the sentence openers, introduce some more varied punctuation, including semi-colons, parenthetical dashes and some different sentence ranges for effect.

The prompt certainly explains why personification and sensory language are present in each case – I asked for them – but it does not explain the other shared features, and it doesn’t explain the structural similarities.

Why so average?

This is more easily explained when we consider the mechanism behind the ‘Artificial Intelligence’ (oh how I wish this term had never come into use…).

Given that what the AI is doing is, in effect, drawing on all of the descriptions of deserts it has seen in training and then predicting the most probable phrasing – blending them into a sort of ‘average’ – it is hardly surprising that the results shared an underlying level of similarity.
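To make that concrete, here is a toy sketch in Python of next-word sampling. The phrase and the probabilities are entirely invented for illustration – no real model works from a little table like this – but it shows why the most commonly seen continuation keeps winning:

```python
import random

# Toy next-word table: invented probabilities standing in for what a model
# has learned from thousands of desert descriptions. Purely illustrative.
next_word_probs = {
    "the cacti stood": {"sentinel": 0.55, "silent": 0.25, "crooked": 0.15, "bewildered": 0.05},
}

def sample_next(context, temperature=1.0):
    """Pick the next word; lower temperature leans harder on the most likely option."""
    options = next_word_probs[context]
    weights = [p ** (1 / temperature) for p in options.values()]
    return random.choices(list(options), weights=weights, k=1)[0]

# At typical settings the most probable (i.e. most commonly seen) word dominates.
print(sample_next("the cacti stood", temperature=0.7))
```

Run it a few times and ‘sentinel’ comes up far more often than anything else; scale that tendency up to whole paragraphs and you get one way of seeing the paint-by-numbers effect the class spotted.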

What was stark was that the students could see it and so could I. 

The classroom takeaway – hopefully my students will now see how much more creative and interesting they are than an AI.

However…

That isn’t where this experiment ended. I had one more thing I wanted to do. I asked Claude to try to suggest why this was the case – sure enough it gave a fairly good answer. This is what it had to say on the matter…

The patterns and weaknesses observed in the AI-generated desert scenes can be attributed to the fundamental way in which AI language models, such as GPT-3, are trained and function.

1. Large-scale pattern recognition: AI language models are trained on vast amounts of text data, allowing them to identify and learn patterns in language use, such as common phrases, descriptive techniques, and literary devices associated with specific themes or genres. When generating text, the models rely on these learned patterns, which can result in the use of clichés, tropes, and generic language.

2. Lack of real-world understanding: While AI models can recognize and reproduce patterns in language, they do not possess a deep, real-world understanding of the concepts they describe. They do not have firsthand experiences or the ability to grasp the nuances and complexities of a particular environment, such as a desert. This limitation can lead to descriptions that feel superficial or lacking in specificity.

3. Absence of true creativity: AI language models are essentially complex statistical tools that predict the most likely next word or phrase based on the input they receive. They do not possess genuine creativity or the ability to generate truly original ideas. As a result, the texts they produce often feel formulaic and lack the unique perspective that a human writer can bring to a piece.

4. Limited emotional intelligence: AI models struggle to understand and convey complex human emotions and experiences. They can recognize and reproduce patterns associated with emotional language, but they do not have the capacity to truly grasp the psychological impact of a situation on a character or reader. This limitation can lead to texts that feel emotionally flat or disconnected.

5. Dependence on training data: The quality and diversity of the text data used to train AI language models directly influence the outputs they generate. If the training data is biased towards certain themes, styles, or perspectives, the AI-generated text will likely reflect those biases. This can result in the repetition of stereotypes, clichés, and overused literary devices.

6. Lack of context awareness: AI models process text on a local level, focusing on the immediate context of the words and phrases they generate. They do not have a broader understanding of the overall narrative, theme, or purpose of the text. This can lead to inconsistencies, logical gaps, or a lack of coherence in the generated passages.

Showing students the limitations as well as the uses of AI will need to be part of how we shift our practice, and I found that using these kinds of examples is an effective way of demonstrating it.

Going forwards

Integrating LLMs to enhance the writing process will be a valuable skill and something I’ll definitely work on, but first I want my students to become confident in their own creative abilities and to know that they can be interesting and sophisticated without the crutch of an AI-powered tool. I’m a firm believer in developing the expertise first and then enhancing it if and when it’s appropriate and helpful.

I hope this is useful to someone somewhere – as ever, happy AI-enhanced teaching!

And, as ever, if this was in any way helpful, do sign up for alerts and don’t forget to follow me on LinkedIn, Facebook, Instagram and X.

#EdTech #AIinEDU #AIintheclassroom #AIinschools #AIforteachers #AIineducation #AI #Education

Here is a link to the resources I used in the lesson in case you’re interested!


I’m The AI English Teacher

As a practitioner with over 25 years’ experience, I aim to help teachers find useful resources and create a space for a constructive dialogue about AI, EdTech and the future of education.
