Overview of Explainable AI and Layer-wise Relevance Propagation (LRP)
It is crucial to understand why Neural Networks (NNs) predict what they predict, especially when the application is safety-critical. Explainable AI (XAI) refers to approaches that establish a relationship between inputs and outputs that can be easily understood by humans. It helps ensure that a NN is relying on the right features when making its predictions.
One of the best aspects of a NN is that it learns the important features from the input data by itself. This lets NNs solve very complex problems, but it leaves programmers and designers without a clue about which input features are given more importance when making predictions.
Let us understand this with an example. Say we have an image of a person and we want to guess the person's age. From a human's standpoint, features like hair colour and wrinkles on the face could be two possible cues: more grey hair and a higher density of wrinkles can indicate a higher age (these assumptions are only for the sake of understanding; no scientific backing is claimed). When a neural network is asked to perform the same task, with an image of a face as input, it extracts such features from the image by itself during training and decides how much weightage each feature should get (depending on the distribution of these features in the dataset). Let's assume the NN gives a higher weightage to hair colour, meaning that if a person has grey hair, the NN is more likely to predict a higher age. After training, if this NN is fed an image of a young person whose hair is dyed grey, it will likely predict a higher age. And since we do not know which features the NN put more weight on, it will be hard to pinpoint the problem.
That's where Explainable AI comes in. It helps create an easy-to-understand relationship between predictions and input features, letting us see which features had a higher weightage and which had a lower weightage in a particular prediction. Once we know this, we can process our input data to filter out misleading features before feeding it to the NN for training.
Layer-wise Relevance Propagation (LRP) is an Explainable AI technique applicable to neural network models whose inputs can be images, videos, or text. LRP computes a quantity called relevance iteratively, from the output class neurons back to the input neurons. We start with the neuron representing a class and, using a set of formulas shown in Figure 1, calculate the relevance values of the previous layer's neurons. These relevance values are then used to calculate the relevance of the layer before that, and so on, until we reach the input layer. At that point, the calculated relevance shows which input features were most relevant to the prediction.
Below is the basic rule used in calculating relevance, propagating it from one layer back to the previous one:

$$R_j = \sum_k \frac{a_j \, w_{jk}}{\sum_{j'} a_{j'} \, w_{j'k}} \, R_k$$

where $R_k$ is the relevance of neuron $k$ in the upper layer, $a_j$ is the activation of neuron $j$ in the lower layer, and $w_{jk}$ is the weight connecting them. Variants such as LRP-ε add a small term ε to the denominator to stabilise the division when the sum is close to zero.
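To make the iteration concrete, here is a minimal NumPy sketch of the LRP-ε rule applied to a toy two-layer network. All names, shapes, and weights here are made up for illustration; a full implementation would also use the special rules defined for convolutional and input layers rather than applying the ε-rule everywhere.

```python
import numpy as np

def lrp_dense(a, W, b, R_upper, eps=1e-6):
    """Propagate relevance through one dense layer with the LRP-epsilon rule.

    a       : activations entering the layer, shape (n_in,)
    W, b    : layer weights, shapes (n_in, n_out) and (n_out,)
    R_upper : relevance of the layer's outputs, shape (n_out,)
    """
    z = a @ W + b                                # pre-activations of the upper layer
    z = z + eps * np.where(z >= 0, 1.0, -1.0)    # epsilon stabiliser, keeps the sign
    s = R_upper / z                              # relevance per unit of pre-activation
    return a * (W @ s)                           # redistribute to the lower layer

# A tiny two-layer ReLU network with random weights, purely for illustration.
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)
W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)

x = rng.normal(size=4)                 # four input features
a1 = np.maximum(0.0, x @ W1 + b1)      # hidden ReLU activations
scores = a1 @ W2 + b2                  # class scores

# Start at the predicted class: its relevance is its score, all others zero.
R_out = np.zeros_like(scores)
k = int(np.argmax(scores))
R_out[k] = scores[k]

R_hidden = lrp_dense(a1, W2, b2, R_out)      # output layer -> hidden layer
R_input = lrp_dense(x, W1, b1, R_hidden)     # hidden layer -> input features
print(R_input)                               # one relevance value per input feature
```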
More details about the calculation and an approach to implementation can be found here: Link
Once we have the relevance values of the input layer neurons, these can be visualized to understand the weightage of different input features on the final outcome.
Figure 3 shows an example where a neural network was trained to classify an image as a locomotive or a passenger car. After training, an image was fed in and the prediction was "passenger car". LRP was applied from the neuron representing "passenger car" back to the input layer neurons, and the relevance values for these input neurons were plotted as a heatmap, as shown in the figure. The red pixels are the ones that contributed positively to that prediction, and the blue pixels are the ones that contributed negatively.
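As a rough sketch of how such a heatmap can be produced: assuming the input-layer relevance has already been reshaped to the image's height and width (the random 28×28 values below are only a stand-in for real LRP output), a diverging colormap centred at zero gives exactly this red/blue reading.

```python
import numpy as np
import matplotlib.pyplot as plt

# Stand-in relevance map for a hypothetical 28x28 input image;
# in practice this would be R_input reshaped to the image's dimensions.
relevance = np.random.default_rng(1).normal(size=(28, 28))

# Diverging colormap centred at zero: red for positive relevance, blue for negative.
limit = np.abs(relevance).max()
plt.imshow(relevance, cmap="bwr", vmin=-limit, vmax=limit)
plt.colorbar(label="relevance")
plt.axis("off")
plt.show()
```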
To ask questions or to simply connect, here’s my LinkedIn profile: Praveen Kumar