# Novel Data Representations

Does a two-dimensional picture really contain as much information as a thousand words? What about a four-dimensional hyperimage from a sophisticated sensor? Asking such questions led ThinkTank Maths to re-examine what we mean by information.

The word *information* brings to mind bytes or megabytes — counting the bits that
computers use to represent data. In fact, such ideas go back to the work of Claude
Shannon in the 1940s, whose key mathematical idea, *Shannon entropy*, underpins nearly
all of the communication technologies we use every day.
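As a rough illustration of what Shannon entropy measures, the sketch below estimates it for a sequence of bytes using the standard formula $H = -\sum_i p_i \log_2 p_i$, where $p_i$ is the observed frequency of each byte value. The function name `shannon_entropy` and the two sample inputs are illustrative assumptions, not part of TTM's work.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Return the empirical Shannon entropy of `data`, in bits per byte."""
    if not data:
        return 0.0
    counts = Counter(data)  # how often each byte value occurs
    total = len(data)
    # -sum(p * log2(p)) written as sum(p * log2(1/p)), with p = c / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical sample inputs, chosen only to show the two extremes.
constant = bytes(1024)            # 1024 identical zero bytes
uniform = bytes(range(256)) * 4   # every byte value equally frequent

print(shannon_entropy(constant))  # lowest possible: 0 bits per byte
print(shannon_entropy(uniform))   # highest possible for bytes: 8 bits per byte
```

The measure depends only on how often each symbol occurs, not on what the symbols mean or how they are arranged: shuffling the bytes of a file leaves this value unchanged.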

However, Shannon’s approach is far from perfect when applied to the challenges faced by
modern science. The volume of data created by sensors and instruments like the Large
Hadron Collider may be so enormous that most of it actually has to be thrown away — so
how should one decide what to keep without analysing everything? Or what should a Mars
rover transmit back to Earth during its narrow communication window? To address such
questions, it is crucial to measure the *interestingness* of data rather than just the
raw information content.

ThinkTank Maths (TTM) was asked to investigate whether *mathematical data representations*
could provide a different perspective on the nature of information. Transforming data
into alternative representations could highlight specific features, for example the
presence of certain interesting objects or patterns, or the emergence of new dynamical interactions.

Although this work was exploratory, strictly fundamental research rather than an attempt
to create immediate applications, in its early stages TTM discovered a new class of curious
mathematical objects, which we named *twistlets*. They have already proved useful for
capturing the fine detail of *curved shapes*, and were successfully tested on complex
data provided by a large UK technology company.

The duck image below shows a twistlet representation on the right: it requires only 13 sample points, compared to 26 points on the left.

ThinkTank Maths also noticed that a modern branch of pure mathematics, which had no previously known applications, provides a powerful way to characterise image features that remain constant under certain transformations. This observation created a completely new unifying mathematical framework for understanding several previously known image processing algorithms, and pointed the way towards entirely original methods for image analysis. One interesting application was *edge detection*, as shown in the figure below.

The nature of information is a hotbed of research for TTM, in particular in the context of data compression and autonomy. Frequently, research that challenges our most fundamental concepts leads to the most surprising and fruitful mathematical innovation.