Thinking without feet

How spatial metaphors and cognitive processes rooted in our embodied experience of the physical world help us comprehend and navigate digital worlds.
I first encountered the concept of mapping in Don Norman’s The Design of Everyday Things. In Norman’s words, mapping refers to the “spatial correspondence between the layout of controls and the devices being controlled”. Like so many seminal ideas, it resembles common sense in hindsight: mapping is a simple yet impactful concept that many of us intuit, and yet it is staggering how often we encounter interfaces that map poorly to the functions they control.

Where I grew up, on the fringes of the Peak District National Park, it was practically a rite of passage to obtain a driver’s license as soon as legally permitted, lest you become a leech, relying on the altruism of friends to taxi you around. Learning to pilot a car is a crash course (sorry) in how mapping is implemented in the design of human-machine interfaces. When mapping is done well, little to no instruction is needed to understand in which direction the indicator arm needs to be moved, where to find the window switch, or how to turn the radio volume down.

But it is only in recent years, no doubt owing to the work of Norman and his peers and their influence on the disciplines of human-computer interaction and usability, that it has become increasingly common to find interfaces rendered invisible by the success of their design. Things that ‘just work’. Whilst mapping has become a concrete and fundamental principle of usability in the context of real-world interactions with machines and technological devices, things begin to blur in the exclusively virtual sphere. What to map to when there’s, well, nothing to map to? Nothing ‘physical’, at least.

The coordinates that lie beyond the frame of the phone or monitor screen speak to a ‘there’ which is in reality ‘nowhere’. Any depth is illusory. Spatial relations make sense only between the elements visible on screen, or those that can be introduced by interaction. In two-dimensional virtual worlds, back and forward refer to time rather than direction.

Ontological relationships can be conveyed visually by showing how elements relate to one another: gestalt principles like proximity, similarity, continuity, and common region imbue individual elements with meaning in relation to other elements. Things are commonly constituted not by what they are but by what they are not. In this regard, space is not the absence of meaning, but the clarification of meaning, allowing objects to come into being by indicating how parts make wholes. A door is defined less by its frame than by its ability to create space, rendering an entrance or exit where there was previously only obstruction.

After introducing the term ‘affordance’ from perceptual psychology into design, Don Norman returned in later writing to revise his advice, promoting the use of the term ‘signifier’ over affordance, in particular when referring to the virtual space. This revision signalled a recognition that certain physical characteristics, despite skeuomorphic design’s best attempts to render them, are simply lost in the screened world—the absence of tactility and dimensionality undermines comprehension. Nonetheless, spatial metaphors abound.

As Maggie Appleton explains, “There is no complex, abstract thought that does not require and involve our embodied understanding of being creatures with ups, downs, lefts, rights, ins, outs, ons, offs, and betweens.” Guy Deutscher, in his book The Unfolding of Language, hypothesises that our kinaesthetic relation to the exterior world likely formed the primal underpinnings of human language itself:

“The parts of the body are the closest and most immediate things in our physical environment, and are thus most deeply imprinted in our cognition, so it is no wonder that body-parts are the sources of terms for all kinds of more abstract concepts in so many languages.”

Perhaps Maurice Merleau-Ponty put it best. “I am my body.”

It is not surprising, then, that mapping plays an essential role not only in comprehension, navigation, and expectation, but also in recall and retrieval. This seems sensible enough in the physical world when giving directions to someone or finding your parked car. For example, picture a place you enjoy visiting, like a bar, theatre, or friend’s house. Now imagine how you would get there, and you may well sense an invisible tug on your body, your arm rising to gesture in the direction you would take.

But how does this translate to virtual environments?

I am, therefore I think

In their thorough corpus study on file management (FM), Jesse David Dinneen and Charles-Antoine Julien conclude that “A preference for navigating to files is much more common than a preference for searching”, as “navigating tasks require less cognitive effort than searching tasks”. Their hypothesis for this is that “large portions of the brain dedicated to spatial cognition and used in real world navigation are activated during FM navigation, whereas smaller areas dedicated to linguistic processing are activated during search tasks”. Barbara Tversky’s research backs up this claim, explaining that “the representations that support memory and inference appear to be spatial/motor, rather than visual”.

Movement undergirds thought. Less ‘I think, therefore I am’, and more ‘I am, therefore I think’. It would appear that human physicality is inextricably linked to the formation of language and gestures that aid the articulation of complex abstract ideas. Maps are memory aids, even in the flat world of the devices we use daily. We feel our way around website architecture, fall down wikiholes, and wade into folders in search of files, with little more than the flick of a finger.

Knowing this, interaction and interface designers have established paradigms and best practices over the years, developments layering upon developments, that bring essences of space and dimensionality into the two-dimensional worlds we inhabit. Metaphors of increasing redundancy continue to litter digital experiences, from icons and pictograms to the cards and tiles that drop subtle renderings of shadow onto the surfaces ‘beneath’ them. Skeuomorphism never really went away; it was simply tempered.

There have been ebbs and flows of literalness in interface and interaction design. Fidelity to reality probably peaked at the zenith of Flash’s popularity in the mid to late two-thousands. The iPhone precipitated the birth of the responsive web, and increasing awareness of web standards, accessibility, and search engine optimisation paved the way for a period of confused and reactive ‘flattening’ of visual digital experiences. Several years later, the pendulum swung back towards the centre, and the spatial lexicon of the web was rewritten by, somewhat surprisingly, Google, whose gradually evolving Material Design system appears to have honed how surface, space, and gesture coexist.

How long will this last with the advent of everyday augmented reality (AR) and extended reality (XR)? What barriers to search and navigation will innovations in AI and machine learning break down, challenging Dinneen and Julien’s thesis that navigating to files is preferable to searching? How much has already changed for a generation raised on iPhones?

Like fashion trends, the approaches to mapping and signification within human-virtual interfaces will no doubt evolve whilst occasionally revisiting the past. The way technologies so often reshape humans suggests that new embodied cognitive paradigms will very likely infiltrate the two-dimensional worlds we’ve become so well-acquainted with.

But as the pendulum shudders from its equilibrium, we cannot yet be sure in which direction it will travel.

