Novel Data Representations
Does a two-dimensional picture really contain as much information as a thousand words? What about a four-dimensional hyperimage from a sophisticated sensor? Asking such questions led ThinkTank Maths to re-examine what we mean by information.
The word information brings to mind bytes or megabytes — counting the bits that computers use to represent data. In fact, such ideas go back to the work of Claude Shannon in the 1940s, whose key mathematical idea, Shannon entropy, underpins nearly all of the communication technologies we use every day.
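As a rough illustration of what Shannon entropy measures, the sketch below estimates the entropy of a sequence of symbols from their observed frequencies, in bits per symbol. The function name `shannon_entropy` is chosen here for illustration and is not taken from TTM's work. It assumes the symbols are independent draws, which real data rarely satisfies.

```python
import math
from collections import Counter


def shannon_entropy(symbols):
    """Empirical Shannon entropy of a sequence, in bits per symbol.

    Illustrative helper (hypothetical name): treats each symbol as an
    independent draw and uses observed frequencies as probabilities.
    """
    total = len(symbols)
    if total == 0:
        return 0.0
    counts = Counter(symbols)
    # H = sum over symbols of p * log2(1 / p)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


if __name__ == "__main__":
    print(shannon_entropy("aaaa"))  # a single repeated symbol carries no surprise
    print(shannon_entropy("abab"))  # two equally likely symbols
    print(shannon_entropy("abcd"))  # four equally likely symbols
```

This estimate depends only on how often each symbol occurs: shuffling the characters of a message leaves the value unchanged, even though the shuffled text no longer means anything.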
However, Shannon’s approach is far from perfect when applied to the challenges faced by modern science. The volume of data created by sensors and instruments like the Large Hadron Collider may be so enormous that most of it actually has to be thrown away — so how should one decide what to keep without analysing everything? Or what should a Mars rover transmit back to Earth during its narrow communication window? To address such questions, it is crucial to measure the interestingness of data rather than just the raw information content.
ThinkTank Maths (TTM) was asked to investigate whether mathematical data representations could provide a different perspective on the nature of information. Transforming data into alternative representations could highlight specific features, for example the presence of certain interesting objects or patterns, or the emergence of new dynamical interactions.
The aim of this work was exploratory, strictly fundamental research rather than immediate applications. Even so, in the early stages TTM discovered a new class of curious mathematical objects, which we named twistlets. They have already proved useful for capturing the fine detail of curved shapes, and were successfully tested on complex data provided by a large UK technology company.
The duck image below shows a twistlet representation on the right: it requires only 13 sample points, compared to 26 points on the left.
ThinkTank Maths also noticed that a modern branch of pure mathematics with no previously known applications provided a powerful way to characterise image features that remain constant under certain transformations. This observation led to a completely new unifying mathematical framework for understanding several previously known image processing algorithms, and pointed the way towards entirely original methods for image analysis. One interesting application was edge detection, as shown in the figure below.
The nature of information is a hotbed of research for TTM, in particular in the context of data compression and autonomy. Frequently, research that challenges our most fundamental concepts leads to the most surprising and fruitful mathematical innovation.