How to enhance the accuracy of GenAI models

By Eliza Compton, 2 April 2025
Generative AI seems to offer ways to provide individualised feedback to large cohorts, but can these models be relied on to give students accurate, relevant responses? Here are three approaches for educators to try

Much as certain products require a health disclaimer, generative AI tools should come with their own warning: “May provide inaccurate information.” This caveat raises a question (among the many practical and ethical concerns that AI use is provoking in higher education): what mechanisms can be used to ensure GenAI provides students with accurate, relevant and contextual information? This is particularly relevant for educators who may not have technical expertise or access to specialist resources.

In short, can anything be done to improve GenAI accuracy? 

Before delving in, let’s look at why GenAI tools may sometimes produce inaccurate responses. Many GenAI tools, particularly large language models (LLMs), use probabilistic text generation rather than factual retrieval, meaning they predict the most likely sequence of words based on patterns learned from an extensive training dataset. Most LLMs rely on existing data and do not access live databases, which limits their knowledge to the information available at the time of training. Because they are designed to always provide an answer, they may generate misleading or entirely fabricated responses when faced with gaps in their knowledge. 
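This prediction principle can be seen in miniature with a toy bigram model. The tiny “corpus” below is invented for illustration; real LLMs use neural networks trained on vastly more text, but the key point is the same: the model picks the likeliest continuation, not a verified fact.

```python
# A toy illustration of probabilistic next-word prediction: count which
# word follows each word in a tiny "training" corpus, then predict the
# most frequent continuation.
from collections import Counter, defaultdict

corpus = "the cell divides . the cell grows . the nucleus divides .".split()

# Build a bigram table: for each word, how often is it followed by each other word?
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict(word: str) -> str:
    """Return the most likely next word seen in training."""
    return following[word].most_common(1)[0][0]

print(predict("the"))  # prints "cell" -- chosen because it was most
                       # frequent in training, not because it was checked
```

Asked about something outside its training counts, such a model has no basis for a correct answer yet will still produce one, which is exactly the fabrication risk described above.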

A notable application of GenAI in higher education is using LLMs as chatbots to support student learning. Acting as virtual teaching assistants, these chatbots can answer queries and offer explanations on complex topics. They can also be instructed to assess student work and provide feedback.

However, given the risk that chatbots will give incorrect responses to queries beyond their training, how can we ensure they generate accurate and reliable information, particularly on specific subject areas? 

Here, we suggest three ways to mitigate this risk, in descending difficulty of application:

1. Retrieval-augmented generation: the best approach but with challenges

The retrieval-augmented generation (Rag) model is one of the most effective ways to enhance chatbot accuracy. Rag upgrades an LLM by integrating it with a designated knowledge base, allowing chatbots to retrieve relevant, up-to-date information before generating responses. For example, by linking to a database that contains course materials or institution-specific guidelines, a chatbot can provide responses that are both accurate and aligned with the curriculum.
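The retrieval step at the heart of Rag can be sketched very simply. The snippet below is a minimal illustration, not a production implementation: the “knowledge base” entries are invented, and plain word-overlap scoring stands in for the vector embeddings and dedicated stores that real Rag systems use.

```python
# A minimal sketch of Rag's retrieval step: find the most relevant
# snippet in a small in-memory knowledge base, then ground the model's
# prompt in that retrieved context.
from collections import Counter
import math

knowledge_base = [
    "Assignments must be submitted via the university portal by Friday 5pm.",
    "The Krebs cycle takes place in the mitochondrial matrix.",
    "Referencing follows the Harvard style set out in the module handbook.",
]

def score(query: str, doc: str) -> float:
    """Cosine similarity over simple bag-of-words counts."""
    q, d = Counter(query.lower().split()), Counter(doc.lower().split())
    dot = sum(q[w] * d[w] for w in set(q) & set(d))
    norm = math.sqrt(sum(v * v for v in q.values())) * \
           math.sqrt(sum(v * v for v in d.values()))
    return dot / norm if norm else 0.0

def retrieve(query: str, k: int = 1) -> list[str]:
    """Return the k snippets most relevant to the query."""
    return sorted(knowledge_base, key=lambda doc: score(query, doc), reverse=True)[:k]

def build_prompt(query: str) -> str:
    """Ground the answer in retrieved context rather than trained patterns alone."""
    context = "\n".join(retrieve(query))
    return (
        "Answer using ONLY the context below. If the context does not "
        f"contain the answer, say so.\n\nContext:\n{context}\n\nQuestion: {query}"
    )

print(build_prompt("Where does the Krebs cycle take place?"))
```

The final instruction in the prompt is what reduces fabrication: the chatbot is told to answer from the retrieved course material, and to admit when that material does not cover the question.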

Degree of difficulty: Implementing Rag requires technical expertise in setting up databases, configuring retrieval systems and optimising the model to prioritise retrieved information. This complexity makes Rag less accessible to educators who may lack the resources or technical support needed for its set-up and maintenance. 

2. Creating a custom GPT

An alternative to Rag is to use GenAI models such as ChatGPT or Copilot to make custom chatbots. For instance, custom versions of ChatGPT can be developed to act as virtual tutors, providing students with personalised feedback aligned to a specific curriculum or marking criteria. This customisation of ChatGPT is enabled through the Create GPT feature, where additional knowledge can be uploaded as documents, links or structured notes. Specific instructions can then be applied to define the chatbot’s behaviour, tone and role, such as instructing it to act like a subject-specific tutor. The feature also allows you to set instructions to help guide the chatbot’s focus, ensuring its responses are consistent and aligned with the intended learning outcomes. 

Degree of difficulty: This option is more accessible than Rag but is not without its challenges. The process of creating a custom chatbot does not require coding, but it does involve using tailored prompts and uploading relevant data to guide the chatbot’s responses within a particular domain. And while many are still becoming familiar with using GenAI in teaching, the process of fine-tuning models can appear overwhelming. Also, creating a custom GPT requires a subscription, which may not be financially viable for all institutions or individual educators. 

3. Co-prompting with students: a simple and practical approach

For a straightforward, no-cost solution, using a standard GenAI model with carefully structured prompts is an effective alternative. Consider a session where students are working through formative questions but, because of the size of the cohort, it’s not feasible to provide individual feedback. So, how can we use GenAI to overcome this challenge? 

A practical way to involve students actively in their learning is through an “assess and obtain feedback” loop, which supports student learning through personalised feedback even in large group settings. Here’s how to do it.

  1. Design the assignment and prompt
    Create the assignment in a Word document. On a separate document, write out the prompts that will be used by the model to assess each question. Each prompt should include: clear instructions, the question, and marking guidelines or rubric. Give students guidance or a demonstration on how to use the GenAI model effectively.
  2. Deliver the prompt
    Once students have completed the first question, share the relevant prompt with them in a format they can easily copy and paste (such as a shared document or group chat).
  3. Generate feedback
    Instruct students to paste the prompt along with their answer into the GenAI model (for example, ChatGPT or Copilot). The model will then assess the student answer based on your marking guidelines rather rather than relying solely on its pre-trained data, improving the accuracy of the evaluation. This enables the model to generate feedback that is tailored to each student’s strengths and weaknesses.

So, whether through Rag, custom GPTs or simple prompt-based methods, GenAI models can be used in bespoke ways to provide responses that are more accurate and relevant to your teaching and learning requirements. Moreover, these approaches show that we do not necessarily have to redesign our teaching to support student learning with GenAI.

Nazim Ali is lecturer (E&S) in bioscience, senior fellow (Advance HE) and deputy director of education; and Sarah Aynsley is reader (E&S) in bioscience, a national teaching fellow and deputy director of scholarship, both in the School of Medicine at Keele University, UK. Linda Becker is research associate at Bochum University of Applied Sciences, Germany.

If you would like advice and insight from academics and university staff delivered direct to your inbox each week, sign up for the Campus newsletter.
