‘Artificial Intelligence’ is a term that lumps together a diverse array of different things. The vagueness of the term enables overly bold claims to be made about genAI which, whilst the technology is impressive, misrepresent its strengths and limitations.
LLMs like ChatGPT are statistical approximations of their training data. They generate text by predicting the next token in a given context based on the complex patterns and associations encoded within their model weights.
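To make this concrete, here is a minimal sketch of next-token prediction using GPT-2 via the Hugging Face `transformers` library (the model choice is purely illustrative; production LLMs are far larger but work on the same principle):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

prompt = "The capital of France is"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(input_ids).logits  # shape: (batch, seq_len, vocab_size)

# The logits at the final position score every vocabulary item
# as a candidate next token; softmax turns them into probabilities.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(token_id):>10}  p={prob:.3f}")
```

Generation is simply this step repeated: pick a token from the distribution, append it to the context, and score the vocabulary again.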
‘Hallucination’ is a misnomer for inaccurate information generated by genAI. LLMs are always calculating the next most probable token; they do not distinguish between accurate and inaccurate output.
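Continuing the sketch above, the same sampling loop runs whether or not the prompt has a true answer. The author below is invented for illustration; the model will still produce a confident-looking completion, because it only scores token likelihood, never truth:

```python
# 'Dr. Elara Voss' and her paper are fictional: there is no fact to retrieve,
# yet the model will still emit a plausible-sounding title token by token.
prompt = "The 2019 paper by Dr. Elara Voss on quantum gravity is titled"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids

output_ids = model.generate(
    input_ids,
    max_new_tokens=20,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```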
So-called ‘thinking’ or ‘reasoning’ models similarly do not ‘think’: they are trained to generate ‘thinking tokens’ before generating the tokens of the main response, but the process still remains a calculation of the next most probable token.
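A toy sketch makes the point (this is not any real model’s interface, and `END_THINK_ID` and `EOS_ID` are hypothetical placeholders for the delimiter tokens a given model learns during fine-tuning): the ‘thinking’ phase and the ‘answering’ phase are the same decode loop, with no separate reasoning engine.

```python
import torch

def greedy_decode(model, input_ids, stop_id, max_new_tokens=256):
    """One next-token loop; 'thinking' and 'answering' differ only in
    which learned stop token ends the phase, not in the computation."""
    for _ in range(max_new_tokens):
        with torch.no_grad():
            logits = model(input_ids).logits
        next_id = logits[0, -1].argmax().view(1, 1)  # most probable token
        input_ids = torch.cat([input_ids, next_id], dim=1)
        if next_id.item() == stop_id:
            break
    return input_ids

# Phase 1: generate until the learned end-of-thinking delimiter.
# ids = greedy_decode(model, prompt_ids, stop_id=END_THINK_ID)
# Phase 2: continue the very same loop to produce the visible answer.
# ids = greedy_decode(model, ids, stop_id=EOS_ID)
```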
Evidence suggests there are limits to the extent to which genAI generalises: it does well on problems well represented within its ‘training distribution’, but fails on problems outwith it.