And then there is the problem of interpreting machine behaviour: how did the system arrive at its classification, and is its reasoning defensible? Several approaches have emerged in this area. Some require an additional layer that checks the interpretation after the fact, while others rely on human supervision. These approaches may work some of the time, but Professor He says that in practice they can quickly run into problems. ‘Wouldn’t it be better to design the model in such a way that it produces an interpretation?’, she asks. ‘This is what we propose’.
In response, Professor He and her team have developed Project HINT, an acronym for Hierarchical Interpretable Neural Text classification. The idea, she continues, is to identify the hierarchical relationships within a text, between its words, phrases and the topics they build up, and how these interact, in order to gain a better understanding of what is truly going on in the text. HINT builds such a hierarchy and uses it to generate automatic explanations grounded in meaningful topics rather than in individual words and phrases alone.
In a typical movie review, for instance, cinema enthusiasts may comment positively on some aspects of the film’s production while remaining critical, or even dismissive, of the movie as a whole. HINT aims to identify how these different levels of the text contribute to the overall judgement: interpretation is built in, not added on afterwards. ‘The model is explainable by design’, Professor He confirms.
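To make the idea concrete, the sketch below shows a toy hierarchical classifier written in the spirit of HINT; it is not Professor He’s implementation, and every name, dimension and design choice in it is an illustrative assumption. Sentences are encoded from word embeddings, softly assigned to a small set of latent topics, and the review’s label is predicted from the resulting topic mixture. Because the topic-assignment weights are part of the forward pass, they double as the model’s explanation rather than being bolted on later.

```python
import torch
import torch.nn as nn


class ToyHierarchicalClassifier(nn.Module):
    """A minimal, hypothetical sketch of a hierarchical, topic-based classifier.

    Words -> sentence vectors -> soft topic assignments -> document label.
    The per-sentence topic weights and the document-level topic mixture are
    returned alongside the prediction and serve as the built-in explanation.
    """

    def __init__(self, vocab_size, embed_dim=64, num_topics=8, num_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.sent_encoder = nn.GRU(embed_dim, embed_dim, batch_first=True)
        self.topic_vectors = nn.Parameter(torch.randn(num_topics, embed_dim))
        self.classifier = nn.Linear(num_topics, num_classes)

    def forward(self, sentences):
        # sentences: (num_sentences, max_words) tensor of word ids for one review
        word_vecs = self.embed(sentences)                 # (S, W, D)
        _, sent_vecs = self.sent_encoder(word_vecs)       # final state: (1, S, D)
        sent_vecs = sent_vecs.squeeze(0)                  # (S, D)
        # Soft assignment of each sentence to the latent topics
        sent_topics = torch.softmax(sent_vecs @ self.topic_vectors.T, dim=-1)  # (S, T)
        # Document-level topic mixture: average of the sentence-level assignments
        doc_topics = sent_topics.mean(dim=0)              # (T,)
        logits = self.classifier(doc_topics)              # (num_classes,)
        return logits, doc_topics, sent_topics


# Toy usage: one review of three sentences over a 100-word vocabulary
model = ToyHierarchicalClassifier(vocab_size=100)
review = torch.randint(0, 100, (3, 12))
logits, doc_topics, sent_topics = model(review)
print(logits.shape, doc_topics.shape, sent_topics.shape)  # (2,) (8,) (3, 8)
```

In this toy version a sentence praising the cinematography and a sentence dismissing the plot would load on different topics, and inspecting `sent_topics` and `doc_topics` shows which of those topics drove the final label. The published HINT model is considerably more sophisticated, but the principle it illustrates is the same: the explanation falls out of the way the model is designed, rather than being reconstructed afterwards.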