How do machines learn?

Arthur Samuel, one of the pioneers of artificial intelligence, defined ‘machine learning’ in 1959 as ‘[a] field of study that gives computers the ability to learn without being explicitly programmed.’

The definition has not lost any of its validity since then. However, the general conditions, possibilities and areas of application for machine learning have changed. Today, machine learning systems can process enormous amounts of data, make very precise predictions, and be applied much more widely.

The ability of a machine to recognize patterns, interpret them correctly, and respond to them appropriately does not come pre-installed on a computer. Instead, it is acquired through a training process, which can take place in two main ways.

In ‘supervised learning’, the machine is given the correct output along with each input. A good example of supervised machine learning is the recognition of handwritten letters. Images of handwritten letters of the alphabet are fed into the machine learning system. For each of these images, the metadata contains a label, supplied by a human ‘teacher’, stating which letter is shown. In the training phase, the system learns the different forms in which humans write each letter. Once the training is complete, the system should be able to generalise from what it has learned and reliably recognize handwriting it has not seen before.
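The training and recognition phases described above can be sketched in a few lines of Python. This is a minimal illustration, not how a production handwriting recogniser works: it assumes tiny 3x3 black-and-white images and uses a nearest-centroid classifier, and the names `train`, `predict` and `TRAINING_DATA` are made up for this example.

```python
from collections import defaultdict


def train(examples):
    """Training phase: average the pixel vectors of each label into one centroid."""
    sums = {}
    counts = defaultdict(int)
    for pixels, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(pixels)
        for i, value in enumerate(pixels):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [total / counts[label] for total in vector]
        for label, vector in sums.items()
    }


def predict(centroids, pixels):
    """Recognition phase: return the label whose centroid is closest to the image."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda label: squared_distance(centroids[label], pixels))


# Toy 3x3 images, flattened row by row (1 = ink, 0 = blank).
# The second element of each pair is the label supplied by the human 'teacher'.
TRAINING_DATA = [
    ([1, 0, 0, 1, 0, 0, 1, 1, 1], "L"),
    ([1, 0, 0, 1, 0, 0, 1, 1, 0], "L"),
    ([1, 1, 1, 0, 1, 0, 0, 1, 0], "T"),
    ([1, 1, 1, 0, 1, 0, 0, 0, 0], "T"),
]

centroids = train(TRAINING_DATA)

# An image that does not appear in the training data.
unseen = [0, 0, 0, 1, 0, 0, 1, 1, 1]
print(predict(centroids, unseen))
```

The important point is that the labels are part of the input to `train`. Without them, the system would have no notion of a ‘correct’ answer to learn from.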

There is also an approach called ‘unsupervised learning’. While supervised learning discovers patterns with the help of a set of ‘correct answers’, in unsupervised learning the system has to find patterns without them. This may be because the correct answers cannot be observed or obtained, or because there is no single right answer for a particular problem. The algorithms recognize structures and similarities in the data and build a model from them. Because no human-provided answers guide this process, such systems can appear as a black box in which even programmers are not sure exactly how the system arrives at its results.

In practice, unsupervised machine learning is used, for example, where providers of marketing data scan huge data sets and group users into clusters.
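Grouping users into clusters can be illustrated with a small sketch of k-means, one common clustering algorithm. The user data below is invented, the two features (visits per week, average basket value) are assumptions made for this example, and the naive initialisation is chosen only to keep the code short and deterministic.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=10):
    """Group points into k clusters. No labels are involved at any step."""
    # Naive initialisation: use the first k points as starting centroids.
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            clusters[nearest].append(p)
        # Step 2: move each centroid to the mean of the points assigned to it.
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    assignments = [
        min(range(k), key=lambda c: squared_distance(p, centroids[c]))
        for p in points
    ]
    return assignments, centroids


# Hypothetical users: (visits per week, average basket value).
USERS = [(1, 20), (9, 95), (2, 25), (10, 100), (1, 30), (8, 90)]

assignments, centroids = kmeans(USERS, k=2)
print(assignments)
```

The algorithm returns cluster numbers rather than meaningful names. Deciding that one group represents, say, occasional shoppers and the other frequent buyers is an interpretation a human adds afterwards.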

Applications and uses of machine learning

The foundations for machine learning were laid in the middle of the last century. However, it is only in recent years that the potential of this approach has come to the fore. Two trends have favoured the rapid development of the field and its range of applications. On the one hand, computers have become increasingly powerful and can now process large amounts of data. In particular, machine learning has experienced a surge since graphics processors with many cores have made massively parallel calculations possible.

On the other hand, the larger the data volume, the greater the need to interpret, organise, and process the data. As a result, scientific developments quickly found their way into commercial applications. Typical examples include:

  • Spam detection in e-mails
  • Image search, e.g. in search engines like Google
  • Text translation in applications such as DeepL
  • Text classification, for example in online price comparisons or news portals
  • Speech recognition and voice assistants such as Siri, Cortana, and Alexa
  • Fraud detection, e.g. for payment providers and in eCommerce

Further terms around machine learning

The more pervasive artificial intelligence and machine learning become in people’s everyday lives, the more frequently the related terms appear in the media.

The concepts of artificial intelligence, machine learning and deep learning are often blurred and confused. The connection is basically simple: the concepts fit inside one another like a set of Russian dolls. The largest, outer doll is AI. Machine learning is a branch of AI, while deep learning is considered a subdiscipline of machine learning.

The ‘deep’ in deep learning refers to the number of layers through which the data is transformed. Deep learning uses a hierarchy of layers, in which each layer builds more abstract concepts from the output of the layer before it. The artificial neural networks used are loosely modelled on the human brain, with nodes connected to one another in a network.
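To make the idea of layers more concrete, here is a minimal sketch of data passing through a small network. The weights are hand-picked for illustration so that the network computes the logical XOR of two inputs. In a real system they would be learned from data, and a network would usually only be called ‘deep’ with many more layers. The names `dense`, `relu`, `forward` and `XOR_LAYERS` are assumptions of this example.

```python
def dense(inputs, weights, biases):
    """One layer: each node computes a weighted sum of all inputs plus a bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Non-linearity applied between layers: negative values become zero."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Pass the data through every layer in turn; depth is len(layers)."""
    activation = inputs
    for weights, biases in layers[:-1]:
        activation = relu(dense(activation, weights, biases))
    weights, biases = layers[-1]  # the final layer has no non-linearity here
    return dense(activation, weights, biases)


# Hand-picked (not learned) weights for a network with one hidden layer.
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer with two nodes
    ([[1.0, -2.0]], [0.0]),                   # output layer with one node
]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, forward([a, b], XOR_LAYERS))
```

A single layer of weighted sums cannot represent XOR. The hidden layer first transforms the inputs into an intermediate representation from which the output layer can compute the answer, which is the layering principle in miniature.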

Automated machine learning (AutoML) simplifies the setup of a system through extensive automation. Either the entire process or selected steps are automated so that an expert is not required for every individual step. For example, data preparation, feature selection, or model selection can be automated. The Google Cloud AutoML service, for example, promises to enable even users with little programming knowledge to create machine learning systems such as translation models or models for natural language.
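One of the steps mentioned above, model selection, can be sketched as a simple loop that tries several candidate models and keeps the one that scores best on held-out data. This is only a toy illustration of the principle and says nothing about how any particular AutoML product works. The data set is invented, and `knn_predict`, `accuracy` and `select_model` are hypothetical helper names.

```python
from collections import Counter


def knn_predict(train, x, k):
    """Predict a label by majority vote among the k nearest training points."""
    neighbours = sorted(train, key=lambda item: abs(item[0] - x))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


def accuracy(train, validation, k):
    """Fraction of validation points that are predicted correctly."""
    hits = sum(knn_predict(train, x, k) == label for x, label in validation)
    return hits / len(validation)


def select_model(train, validation, candidate_ks):
    """Automated model selection: score every candidate and keep the best."""
    scores = {k: accuracy(train, validation, k) for k in candidate_ks}
    best_k = max(scores, key=scores.get)  # on a tie, the earliest candidate wins
    return best_k, scores


# Invented one-dimensional data; the point at 2.3 is deliberately noisy.
TRAIN = [
    (1.0, "low"), (1.5, "low"), (2.0, "low"), (2.3, "high"),
    (4.0, "high"), (4.5, "high"), (5.0, "high"),
]
VALIDATION = [(2.2, "low"), (1.2, "low"), (4.2, "high"), (4.8, "high")]

best_k, scores = select_model(TRAIN, VALIDATION, candidate_ks=[1, 3, 5])
print(best_k, scores)
```

Here the candidates differ only in the number of neighbours `k`. An automated system would typically search over many model types and settings in the same way, without a human comparing the results by hand.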

Another term often used in connection with machine learning is ‘data mining’. Machine learning and data mining have many similarities, but also significant differences. Machine learning builds on known properties learned from training data and focuses on making predictions. In contrast, data mining focuses on discovering previously unknown properties in the data.