As we explained in our first post, machine learning enables computers to recognise patterns in data without being told specifically how to recognise those patterns. For a computer to work out on its own which features to use to identify, for example, a cat or a “good” screw, you need a number of elements.
Firstly, you need Algorithms or models which take data and make predictions about it (How likely is it that this video includes a cat? Is this screw good or bad?) and then adapt the way they work to make those predictions more accurate. The algorithm essentially figures out which features are most likely to predict the right answer and then gives more weight to those features when making future predictions.
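To make the idea of "giving more weight" to useful features concrete, here is a minimal sketch using a perceptron, one of the simplest learning algorithms. It is an illustration only, not the method used in any particular Google product: the screw measurements, the feature names (`length_error_mm`, `thread_defects`) and the function names are all invented for this example.

```python
# Toy data, invented for illustration: each screw is described by
# (length_error_mm, thread_defects) and labelled 1 ("good") or 0 ("bad").
SCREWS = [
    ((0.1, 0), 1),
    ((1.5, 3), 0),
    ((0.2, 1), 1),
    ((2.0, 4), 0),
    ((0.0, 0), 1),
    ((1.8, 2), 0),
    ((0.3, 0), 1),
    ((1.2, 5), 0),
]


def predict(weights, bias, features):
    """Return 1 ("good") if the weighted sum of the features is positive, else 0."""
    score = bias + sum(w * x for w, x in zip(weights, features))
    return 1 if score > 0 else 0


def train(examples, epochs=10, learning_rate=0.1):
    """Perceptron training: adjust the weights only when a prediction is wrong."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = label - predict(weights, bias, features)  # -1, 0 or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias += learning_rate * error
    return weights, bias


def accuracy(weights, bias, examples):
    """Fraction of examples whose predicted label matches the assigned label."""
    correct = sum(predict(weights, bias, f) == label for f, label in examples)
    return correct / len(examples)


if __name__ == "__main__":
    weights, bias = train(SCREWS)
    print("weights:", weights, "bias:", bias)
    print("accuracy on the toy data:", accuracy(weights, bias, SCREWS))
```

On this toy data both weights end up negative, which is the model's way of saying that a larger length error or more thread defects makes a screw more likely to be bad. Real systems use far more features and more sophisticated models, but the loop of predicting, checking and adjusting is the same.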
The second requirement is for Data to train, validate and test your algorithms. Training data helps an algorithm to learn how to identify cats or good screws by allowing it to correlate features in data with pre-assigned labels. Items labelled as “cat” or “good screw” will have features which are missing from items labelled as “not a cat” or “bad screw”. Validation data is then used to check how accurate the algorithm is, by asking it to label data it was not trained on and comparing its answers with the known labels. Finally, test data is an entirely fresh data set used to see how the model performs under production conditions. In general, the more training data you can provide, the more accurate your algorithm is likely to become, so a key area of work for Google has been to support development of training and validation data sets such as ImageNet.
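The three-way division of data described above can be sketched in a few lines. This is an illustrative helper, not part of any Google tool: the function name `split_dataset`, the 70/15/15 proportions and the example file names are assumptions chosen for the example, and real projects often use different ratios.

```python
import random


def split_dataset(items, train_frac=0.7, val_frac=0.15, seed=0):
    """Shuffle items and split them into training, validation and test lists."""
    if not 0 < train_frac + val_frac < 1:
        raise ValueError("train_frac + val_frac must leave room for a test set")
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split repeatable
    n_train = round(len(shuffled) * train_frac)
    n_val = round(len(shuffled) * val_frac)
    train_set = shuffled[:n_train]
    validation_set = shuffled[n_train:n_train + n_val]
    test_set = shuffled[n_train + n_val:]  # the remainder is held back for testing
    return train_set, validation_set, test_set


if __name__ == "__main__":
    # Hypothetical labelled examples: (file name, label) pairs.
    labelled = [(f"image_{i}.jpg", "cat" if i % 2 == 0 else "not a cat")
                for i in range(100)]
    train_set, validation_set, test_set = split_dataset(labelled)
    print(len(train_set), len(validation_set), len(test_set))
```

Shuffling before splitting matters: if the data arrives sorted (all the cats first, say), an unshuffled split would leave the test set looking nothing like the training set.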
The third element you need is an Infrastructure that delivers the necessary computing power in an affordable way. Some of the technologies that have helped move machine learning from research lab to business tool include: distributed and cloud computing; open-source software tools, like Google’s TensorFlow, that let developers build machine learning systems quickly; and specialised processors that are more efficient than standard CPUs at handling tasks like image processing. Google Cloud Machine Learning brings these elements together in a platform that provides an end-to-end service for machine learning projects.
In our final post, we’ll look at how machine learning can be used to solve real business challenges. If you’re impatient to learn more, check out the presentation made at Google Next earlier this year by Dr Fei-Fei Li, Google’s chief scientist of AI and machine learning.