If you are looking for an introduction to AI, read our article ‘An introduction to artificial intelligence’. This article is about how machine learning actually works. If you are familiar with the basics, you will be better placed to understand the latest AI developments or debates.
How do you train a machine learning algorithm?
The most important part of training an algorithm or ‘bot’ is the data. The first step is to collect and prepare the data (generally, the more, the better). This may include adding ‘labels’, which are the correct answers, so the bot’s answers can be checked later. Alternatively, a growing number of free, open-source datasets are available too.
Secondly, show the bot your data. There are different approaches you can take at this point, and they are often separated into two categories: supervised and unsupervised learning, which sit at either end of a spectrum.
- Supervised learning involves repeatedly testing the bot with data you have labelled with answers, until it can get the answers right with new data.
- Unsupervised learning involves asking the bot to come up with its own answers or patterns based on the data you provide.
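As a rough sketch of the two approaches above, the Python below uses a handful of made-up vehicle heights. The labels ‘low’ and ‘tall’, the numbers and both function names are assumptions for illustration only. The supervised function uses the labels to place a cut-off; the unsupervised one never sees a label and simply splits the data at the widest gap.

```python
# Made-up vehicle heights in millimetres, each with a hypothetical label.
labelled = [(1440, "low"), (1450, "low"), (1620, "tall"), (1680, "tall")]

def learn_threshold(examples):
    """Supervised: use the labels to put a cut-off midway between the classes."""
    tallest_low = max(h for h, label in examples if label == "low")
    shortest_tall = min(h for h, label in examples if label == "tall")
    return (tallest_low + shortest_tall) / 2

def split_at_largest_gap(values):
    """Unsupervised: no labels, so split the sorted values at the widest gap."""
    ordered = sorted(values)
    gaps = [(ordered[i + 1] - ordered[i], i) for i in range(len(ordered) - 1)]
    _, cut = max(gaps)
    return ordered[:cut + 1], ordered[cut + 1:]

threshold = learn_threshold(labelled)                      # 1535.0
groups = split_at_largest_gap([h for h, _ in labelled])    # two unlabelled groups
```

Both end up separating the same cars here, but only the supervised version knows what the groups are called.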
Both of these are used in AI today, and below is an example of each.
Supervised learning: decision trees and decision forests
Suppose that you want a bot that can tell the difference between various Tesla models. To help the bot understand what to look for, you can point out certain ‘features’ which will lead to the right classification. For Teslas, the features might include height, length, weight etc.
The bot will plot all the data on a graph and come up with a sequence of rules to identify the different models, e.g. if height is over 1500mm, it’s a Model X.
The bot applies the various rules in sequence until it identifies all the models. That ‘if this, then that’ structure is called a decision tree.
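Written out by hand, a tiny decision tree might look something like the sketch below. The 1500mm height rule comes from the example above; the length rule and its 4800mm cut-off are invented for illustration and are not real Tesla specifications. In practice the bot works out the rules and thresholds itself from the data.

```python
def classify(height_mm, length_mm):
    """A hand-written 'if this, then that' tree with illustrative thresholds."""
    if height_mm > 1500:        # rule taken from the example in the text
        return "Model X"
    if length_mm > 4800:        # hypothetical second rule, made-up cut-off
        return "Model S"
    return "Model 3"

print(classify(height_mm=1680, length_mm=5000))
print(classify(height_mm=1445, length_mm=4970))
print(classify(height_mm=1443, length_mm=4690))
```

Each `if` is one branch of the tree, which is why you can read off exactly why the bot gave a particular answer.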
Even with a simple example of identifying Tesla models, this is a complicated process. You can imagine the number of features you’d need before the bot can make accurate predictions. As things become more complicated, you can add more trees – creating a ‘decision forest’.
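One common way a forest combines its trees is to let each tree vote and take the most popular answer. The sketch below assumes that voting approach; the three one-rule ‘trees’, their thresholds and the example car are all made up. Real forests train each tree automatically, usually on a different random slice of the data.

```python
from collections import Counter

# Three hypothetical one-rule trees, each looking at a different feature.
def tree_by_height(car):
    return "Model X" if car["height_mm"] > 1500 else "Model S"

def tree_by_length(car):
    return "Model X" if car["length_mm"] > 5000 else "Model S"

def tree_by_weight(car):
    return "Model X" if car["weight_kg"] > 2300 else "Model S"

def forest_predict(car, trees):
    """Ask every tree for its answer and return the one with the most votes."""
    votes = Counter(tree(car) for tree in trees)
    return votes.most_common(1)[0][0]

trees = [tree_by_height, tree_by_length, tree_by_weight]
car = {"height_mm": 1480, "length_mm": 5030, "weight_kg": 2400}  # made-up car
prediction = forest_predict(car, trees)   # height tree is outvoted two to one
```

Here the height tree gets it ‘wrong’ on its own, but the other two outvote it, which is the basic appeal of using many trees.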
Decision trees are quite common, and one of their great advantages is that you can explain their decisions (unlike the neural network below). On the other hand, the more complicated a single tree becomes, the more susceptible it is to bias. There’s also a common problem called ‘overfitting’ – where the tree is tailored so well to the training data that it isn’t much good with new data.
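To get a feel for overfitting, here is a deliberately extreme caricature with made-up data. The ‘overfitted’ bot has simply memorised the training heights, while the simple bot uses one general rule. All the names, labels and numbers are assumptions for illustration.

```python
# Made-up (height in mm, label) pairs; values are illustrative only.
training = [(1441, "saloon"), (1445, "saloon"), (1624, "SUV"), (1684, "SUV")]
new_cars = [(1450, "saloon"), (1650, "SUV")]

memorised = dict(training)

def overfitted_bot(height_mm):
    """Knows the exact training heights by heart, and nothing else."""
    return memorised.get(height_mm, "no idea")

def simple_bot(height_mm):
    """One general rule: tall cars are SUVs."""
    return "SUV" if height_mm > 1500 else "saloon"

def accuracy(bot, cars):
    """Fraction of cars the bot labels correctly."""
    return sum(bot(height) == label for height, label in cars) / len(cars)

print(accuracy(overfitted_bot, training), accuracy(overfitted_bot, new_cars))
print(accuracy(simple_bot, training), accuracy(simple_bot, new_cars))
```

The memorising bot is perfect on the data it has seen and useless on anything else, whereas the simpler rule carries over to the new cars.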
Unsupervised learning: neural networks
Neural networks are named after the networks of neurons in our own brains. With us, things that happen together often become strongly associated.
In the brain, connections between neurons which regularly fire together get ‘hard wired’, growing from spindly tracks into massive motorways. With us, the signals are electrical and chemical. With AI, the idea is similar but the signals are mathematical.
In a bot, these neurons are organised into layers. The first layer (the ‘input layer’) will have neurons that are based on ‘features’ – or categories – such as length, colour, weight etc. The last layer (the ‘output layer’) will have the bot’s ‘answers’. For unsupervised learning, these ‘answers’ will be unlabelled groups that share similar features. Any layers in the middle are called hidden layers. If there are a lot of them, it is called ‘deep learning’.
How does it get to the answer?
Each neuron in the input layer gets a number, which has to travel down a pathway to get to a neuron in the next layer. While travelling, the number is multiplied by a ‘weight’, and a ‘bias’ is then added to it. These are just numbers which control the extent to which the number is travelling down a track or a motorway. The more the path is like a motorway, the more important the association is to the bot.
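In code, the arithmetic for a single neuron in the next layer is short. This is a minimal sketch: the function name and the example numbers are made up, and it leaves out the ‘activation function’ that real networks usually apply to the result.

```python
def neuron_output(inputs, weights, bias):
    """Multiply each incoming number by its weight, add them up, add the bias."""
    return sum(x * w for x, w in zip(inputs, weights)) + bias

# Two incoming numbers: the first path is a 'track' (small weight),
# the second has a negative weight, so it pushes the result down.
signal = neuron_output(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=0.25)
print(signal)  # 1.0*0.5 + 2.0*(-1.0) + 0.25 = -1.25
```

A weight close to zero means the path barely matters; a large weight, positive or negative, is the motorway.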
Once you have the final numbers in the second layer, repeat the whole process for each following layer until you reach the output layer and get an answer.
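Repeating that step layer by layer is just a loop. The sketch below assumes a tiny hypothetical network (two inputs, a hidden layer of three neurons, one output) with hand-picked weights and biases so the arithmetic is easy to follow. As before, activation functions are left out to keep it short.

```python
def layer_forward(inputs, weights, biases):
    """Compute every neuron in one layer: weighted sum of the inputs plus a bias."""
    return [
        sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]

def network_forward(inputs, layers):
    """Feed the numbers through each layer in turn until the output layer."""
    values = inputs
    for weights, biases in layers:
        values = layer_forward(values, weights, biases)
    return values

# Each layer is (weights, biases); one row of weights per neuron in that layer.
layers = [
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 1.0]),  # hidden: 2 -> 3
    ([[1.0, 1.0, 1.0]], [0.5]),                               # output: 3 -> 1
]
output = network_forward([2.0, 3.0], layers)
print(output)  # hidden layer gives [2.0, 3.0, 6.0], so the output is [11.5]
```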
When you first set up the network, the ‘weight’ and ‘bias’ numbers are completely random. The ‘training’ involves adjusting the weights and biases to increase your chances of getting the answer right at the end. It’s like how you used to adjust the radio dial to find the clearest frequency for the station you wanted to listen to. The difference is that there are a lot more dials – tens of thousands, and often millions in larger networks. That’s why you need powerful computers, and why neural networks have only recently become popular.
Note that the programmer is not usually the one adjusting the dials. The bot is. This is partly why there is such a big ethics debate in AI.
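The article does not say how the dials get adjusted, so the sketch below assumes one widely used method, gradient descent, on the smallest possible ‘network’: one weight and one bias. It also assumes you know the right answers in advance (made-up data following y = 2x + 1). The function name, learning rate and step count are illustrative choices.

```python
import random

# Made-up training data that follows y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

def train(steps=2000, learning_rate=0.05, seed=0):
    """Start from random dials, then nudge them downhill on the squared error."""
    rng = random.Random(seed)
    weight = rng.uniform(-1.0, 1.0)   # random starting point, as in the text
    bias = rng.uniform(-1.0, 1.0)
    n = len(xs)
    for _ in range(steps):
        errors = [(weight * x + bias) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to each dial.
        grad_weight = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_bias = 2 * sum(errors) / n
        weight -= learning_rate * grad_weight
        bias -= learning_rate * grad_bias
    return weight, bias

weight, bias = train()
print(round(weight, 3), round(bias, 3))  # close to 2.0 and 1.0
```

Each pass turns both dials a little in whichever direction reduces the error. With only two dials this takes a fraction of a second; the computing power is needed because real networks have so many more.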
Does it work?
Absolutely. Machine learning now offers remarkable accuracy on many kinds of complex and abstract problems, including image recognition and natural language processing. There are also bots working to suggest relevant content so you spend more time on places like YouTube or Spotify.
However, machine learning algorithms take a lot of effort to tune properly, require masses of data, and are a nightmare to debug. And, although they do work, we don't always know precisely how they come to the conclusions they do – which is another part of the ethics debate.