Automated text generation on the rise

Automated text generation is becoming increasingly common. From robot journalism to product descriptions, its use is more and more widespread. As demand for individualised online content continues to grow, automated text generation is set to become increasingly important.

Automated text generation (ATG) is becoming ever more enmeshed in our lives. News portals, for example, increasingly rely on robot journalism for their content, just as online retailers lean more and more on automation for their product descriptions. Natural language generation (NLG) software is already producing individualised documents such as annual reports, real estate documents, and fund reports.

Automated text generation builds on the milestones reached in AI in recent years. Today, AI-based programs are able to generate natural-sounding content and deliver it at the same time.

How does automatic text generation work?

The workings of ATG can be explained succinctly and require no great prior knowledge.

In order to function, ATG first needs managed, structured data. This usually comes in tabular form. For weather reports, for example, the data will include values such as temperature, air pressure, precipitation, and location. Stock market reports can include price or index fluctuations. Product descriptions are also automated and draw on values such as colour, size, and weight. For human readers, plain columns of numbers are not as readable as text.
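As a minimal sketch of what such tabular input might look like, the following Python snippet parses a small, made-up weather table into one record per location. The column names, the sample values, and the helper `load_records` are assumptions chosen for illustration, not part of any particular NLG product.

```python
import csv
import io

# Hypothetical sample table; column names and values are invented for illustration.
WEATHER_CSV = """location,max_temp_c,min_temp_c,rain_probability_pct
Berlin,24,13,40
Hamburg,21,12,
"""

NUMERIC_COLUMNS = ("max_temp_c", "min_temp_c", "rain_probability_pct")


def load_records(text):
    """Parse CSV text into a list of dicts, turning empty cells into None."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        record = {"location": row["location"]}
        for key in NUMERIC_COLUMNS:
            value = row[key]
            record[key] = float(value) if value else None
        records.append(record)
    return records


records = load_records(WEATHER_CSV)
print(records[0])
```

Keeping missing cells as `None` rather than guessing a value lets a later text-generation step decide which sentences it can actually produce.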

NLG systems transform the data into text to improve the user experience. A program takes the data and slots it into the relevant parts of an intelligent gap text, that is, a set of sentence templates with placeholders. The templates for such a text could be:

  • The maximum temperature in Berlin today will be [XX]° C.
  • During the night, temperatures will fall to [XX]° C.
  • Chances of rain are [XX] per cent.

If, in the examples above, the system has data on minimum and maximum temperature values, sentences one and two will be generated. The third sentence is only generated if there is data on precipitation probability.
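The gap-text approach described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implementation of any specific NLG tool: the field names (`city`, `max_temp`, `min_temp`, `rain_probability`) and the function `generate_report` are hypothetical, and each sentence is emitted independently whenever its own data field is present.

```python
# Hypothetical templates modelled on the three example sentences above.
# Each entry pairs a template with the data field it depends on.
TEMPLATES = [
    ("The maximum temperature in {city} today will be {max_temp}° C.", "max_temp"),
    ("During the night, temperatures will fall to {min_temp}° C.", "min_temp"),
    ("Chances of rain are {rain_probability} per cent.", "rain_probability"),
]


def generate_report(data):
    """Fill every template whose required field is present in the data."""
    sentences = []
    for template, required_field in TEMPLATES:
        if data.get(required_field) is not None:
            sentences.append(template.format(**data))
    return " ".join(sentences)


# Precipitation data is missing here, so the third sentence is skipped.
report = generate_report({"city": "Berlin", "max_temp": 24, "min_temp": 13})
print(report)
```

Adding a `rain_probability` value to the input would cause the third sentence to appear as well. The sketch assumes that `city` is always supplied; a real system would also need to handle missing location data.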

More-powerful NLG software goes beyond merely describing events.

Depending on the structure of the data and the stored sentence templates, complex tools are able to provide the user with interpretations of data in text form. These use cases are particularly interesting for analyses in the area of business intelligence.
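To illustrate how a template can interpret data rather than merely restate it, here is a small Python sketch that compares two values of a business metric and chooses its wording accordingly. The function `describe_change`, the percentage thresholds, and the phrasing are all assumptions made up for this example; real tools are likely to use richer rule sets.

```python
def describe_change(metric, previous, current):
    """Turn two values of a metric into a sentence that interprets the change."""
    if previous == 0:
        return f"{metric} stands at {current:g}; no comparison with the previous period is possible."
    change_pct = (current - previous) / previous * 100
    # The thresholds below are arbitrary and chosen only for illustration.
    if abs(change_pct) < 1:
        trend = "remained stable"
    elif change_pct > 0:
        trend = "rose sharply" if change_pct >= 10 else "rose slightly"
    else:
        trend = "fell sharply" if change_pct <= -10 else "fell slightly"
    return f"{metric} {trend} ({change_pct:+.1f} per cent compared with the previous period)."


print(describe_change("Revenue", 200.0, 230.0))
```

The design point is that the choice of wording ("rose sharply" versus "fell slightly") is itself driven by the data, which is what turns a description into an interpretation.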

Regardless of the project they are used for, the templates for text generation are not created by computers. This input has to come from a human writer. This means that specialist linguistic data architects are needed to set up, maintain, and continue to develop NLG projects.

How automatic text generation can be applied

Almost every Internet user has already encountered automatically generated texts. The size of a portal does not matter for NLG applications.

Football reports, earthquake notifications, stock exchange news, weather, and traffic jam reports are established areas of application for automatic text generation. Especially interesting are computer-generated articles for regional media. In Great Britain, the first trials are taking place in which texts with local relevance are created from public data sources.

For certain types of robot journalism, there is hardly any difference in quality between automatically generated articles and those written by human authors. Tests with football match reports, for example, showed that readers rated the output of the ‘robot journalist’ as more natural than that of a ‘real’ editor.

ATG is also widely used in e-commerce. Because online shops continuously require large quantities of product descriptions, portal operators have discovered the efficiency of computer-aided text generation.