The Beat-Boxer and the Steam Machine: a paradigm of primitive speech

Let us suppose a beat-boxer, one of those gifted with the skill to reproduce a whole orchestra of percussive instruments using only his voice, and let us have him put in charge of a large and complicated steam-driven machine of the sort which has something fed in at one end, processes it, and puts the finished article out at the other, all the while whistling and hissing and clanking with various rods and cranks and cogs and arms moving up and down and round and round. It is not entirely automatic, so the beat-boxer has to dance about it, furbishing it here, oiling it there, pulling this lever, turning that valve, as well as loading it at the start and unloading it at the end of each cycle.

Engines of PS Savoie, Lac Leman, built by Sulzer of Winterthur

As he moves, he accompanies himself with a soundtrack of his own improvising, which imitates the rhythm and the sounds of the machine, with some interjections of his own, and punctuated with movements and gestures. If he has a purpose at all in doing this, it is primarily aesthetic: he does it for the joy of it. But that is not to deny that it is useful to him: it keeps him attuned to the rhythm of the machine and serves as a kind of mnemonic (maybe the noises of his own that he inserts correspond to different actions that he performs at various points in the cycle). You might even imagine a circumstance where he uses his beat-box track as a tool to instruct someone else in operating the machine, because there is a correspondence between it and the various stages of the process of operating it.

If you recorded the beat-boxer going about his business, you could analyse the soundtrack to bring out that correspondence, identifying elements of the track that corresponded to this movement of the machine, or that part of the process, or this action on the part of the operator. In this world of viral videos on TikTok and YouTube you could imagine him expanding his repertoire to include other kinds of machinery, different sorts of operation, each with its own soundtrack. The soundtracks on their own would, in a sense, embody the operations – an instance of the true sense of synecdoche, where a whole is conjured by a part. Yet at the same time each soundtrack would be an improvisation, not consciously devised, only incidentally having a structure that corresponded to something else.

This, I would suggest, serves as a paradigm for how human speech could incidentally evolve a structure corresponding to the world in which it was created, a structure that (once discovered) could be parsed and analysed into elements that correspond to things in that world, standing in relations that correspond to the relations in that world – and yet at no time is there a conscious ‘naming of parts’, no ostensive definition where we say a word and point to what it means.

This is the solution, for me, of a problem that has troubled me in a theory I have been evolving for some time. My thesis is that Language, as we know it today, is an artefact of writing, specifically of writing used to transcribe speech (something that does not happen till about a thousand years after writing is invented). My reasoning is that it is only when speech is made visible and we have a chance to study it that we can discover the structure that underlies it, a structure we can then analyse into words and grammatical relations.

The question that requires to be answered is where that structure came from, and how it comes to be made up of elements that correspond to things in the world, if it was not expressly devised to do so. And that, I believe, is the question that my example of the beat-boxer and the steam machine answers. Speech in its initial form, I suggest, is no more than the soundtrack of specific human activities, bound up with a larger pattern of gesture, movement and expression that comes naturally to humans engaged in any activity. Its key element is probably rhythm and its character is largely mimetic (or interpretive, if you like): we bind ourselves to the task in hand by improvising sounds and gestures to accompany it. It is, I would suggest, a pleasurable activity, akin to music-making, and its primary motivation is aesthetic: it expresses how it feels to be doing whatever it is – or if you like a larger canvas, how it feels to be human, in this world, doing this thing.

Tens, or probably hundreds, of thousands of years of human activity (which is probably of a fairly consistent character, given that it’s the same sort of creatures living in the same world doing the same sorts of things) will render the improvisation of such soundtracks a matter of instinct and intuition, much like birdsong, with the young attuned to learn it from their elders. And of course I say ‘soundtracks’ only to emphasise the role played by speech – in reality, it is an expressive performance, led by facial expression, gesture and movement, in which speech plays only a contributory part, one very much bound up with the rest and only separable from it when, much later on, the invention of writing (eventually) provides the means of making speech visible – and so capable of study.

Why Writing is like a Playtex Bra

‘It lifts and separates’ is a slogan that will be familiar to those of my generation – it was advertised as the chief virtue of the Playtex ‘cross-your-heart’ Bra. However, it also serves as a memorable illustration of my theory concerning the origin of what we think of as Language.

The conventional account presents Language, as we have it now, as evolved speech, i.e. its origins go back to our first utterance, with the acquisition of a written form for transcribing it a logical development that occurs in due course – around five thousand years ago – but only after speech has held sway as the primary means of human communication for a couple of hundred thousand years.

However, I think that is not what happened; in particular, the notion that speech was the original and primary means of human communication, occupying a position analogous to what we call Language (both spoken and written) today, is an erroneous backwards projection, based on the status that speech only now enjoys.

The conventional account could be summed up briefly thus: ‘First, we learned to speak, and that is what made us human and marked us out as special; then we learned to write down what we said in order to give it permanent form, and that enabled us to store the knowledge and wisdom which has enabled us to achieve our present pre-eminence.’

However, there are good reasons to suppose that the eminence currently enjoyed by speech actually results from the invention of writing and its impact on human expression – what I would call the Playtex Moment, because the effect of that impact was to lift speech above the rest of human expression, and separate it.

Prior to the invention of writing, and indeed for a good time after it, since its impact was far from immediate, speech was, I would say, simply one aspect of human expression, and by no means the most important. By ‘human expression’ I mean the broad range of integrated activity – facial expression, gesture, posture and bodily movement, and a range of sounds, including speech – which human beings use to express their thoughts and feelings. Bear in mind that up to the invention of writing (and for a good time after it) speech was always part of some larger activity, to which it contributed, but which (I would assert) it did not dominate.

My ground for supposing this is that it is only through the effort to give speech a written form (which probably did not start to happen till writing had been around for a thousand years) that we come to study it closely, and to analyse it. I suggest there are two reasons for this – the first is that it was not possible to study speech till it was given a permanent, objective form; the second is that the need to analyse speech is part and parcel of the process of giving it written form. Crucially, it is only in writing that the notion of the word as a component of speech arises; speech naturally presents as an uninterrupted flow – rhythm and emphasis are of significance, but word separation is not. Word separation – which not every writing system uses – is a feature of writing, not speech.

In the same way, the whole analysis of speech in terms of the relations between words – grammar – arises from writing (for the good reason that it is only through writing that we can become aware of it). It is the understanding of Language that arises through the development of writing as a tool to transcribe speech that elevates and separates speech from the other forms with which it has hitherto been inseparably bound up.

The notion that we invented writing expressly to transcribe speech does not bear examination*: it was invented for the lesser task of making lists and inventories, as a way of storing information. It was only very gradually that we began to realise its wider potential (the earliest instance of anything we might call literature occurs a good thousand years after the first appearance of writing). Rather than writing being a by-product of speech, speech – as we now know it, our primary mode of communication and expression – is a by-product of writing.

And that is why writing is like a Playtex bra: it lifts and separates speech from all the other forms of human expression – but also (to push the analogy to its limits, perhaps) offers a degree of support that is only bought at the expense of containment and subjugation.

The interesting corollary is that if our present mode of thinking is Language-based – in the sense of ‘Language’ that is used here, a fusion of writing and speech – then that, too, is a relatively recent development**; however much it might seem second nature to us, it is just that – second nature: our natural mode of thought – instinctive and intuitive, developed by our ancestors over several hundred thousand years – must be something quite other, with a different foundation (which is, I would suggest, metaphor).

*if you doubt this, examine it: ask how such an idea would first have occurred, that things should be written down and given a permanent form – to remember them? people had been remembering without the aid of writing for thousands of years – why should they suddenly feel the need to devise an elaborate system to do something they could do perfectly well already? Ask also why, of all things, they would select speech as the one thing to make a record of – only if it were the sort of speech we have now – very much formed and influenced by writing – would it seem the obvious thing to record. Finally, ask how they would go about it – devising a script for the purpose of recording speech requires the sort of analysis of speech we can only acquire through already having devised such a script.

**do not underestimate what I mean by this: it is more than using words to think with. It is the complete model of the world as an objective reality existing independently and governed by logic and reason and all that stems from that; all that can be shown to derive from Language, which in turn arises from the impact of writing on human expression, a process that is initiated about two and a half thousand years ago, in classical Greece, by Plato and Aristotle.

One way of thinking about it


For some time now I have been trying to pin down a thing that troubles me about language – to be exact, the relation between its literary form and speech, and my sense that our perception of how they stand to one another is out of kilter.

Here’s a way of thinking about it that occurred to me on my autumn walk today, in the pleasant environs of Kinnoull Hill, where you can see this fine (if rather alarming) giant squirrel:


Consider a stream or river, flowing vigorously down from the hills. By exercise of our considerable ingenuity, we can dam the stream and create a vast reservoir which has enormous potential – we might use it to power industry, directly by water wheels or indirectly by generating electricity; we might irrigate the land; we might supply many households with water – indeed all of these things together. Naturally doing so would entail considerable specialist skill and knowledge and people who had such knowledge would be rightly respected.

And yet there are two things that we must not overlook: the first is that the reservoir remains ultimately dependent on the stream that feeds it – if the source dries up, then the reservoir will eventually be exhausted; the second is that, for all our ingenuity, we have merely harnessed the water, not added to it: its power and properties are exactly those of the river, and indeed in order to be of use it needs to resume its form as a flowing stream.

There is a parallel here with language: however much we order it and standardise it, by giving it a written form, a fixed spelling, a system of punctuation, a systematised grammar, we are still only harnessing the properties of speech. True, a whole set of skills must now be acquired to master the language in its literary form, yet ultimately these are all secondary and derivative: you could have speech, full and flowing in all its power, without its literary form, but without speech, the literary form would not exist in the first place, and (as is the case with Latin, say, or Ancient Greek) once speech dies out and there are no native users who learn it at their mother’s knee, the language dies, though its literary form may continue for a time artificially sustained by some conventional use, as when Latin became the language of both church and university.

This is something you should call to mind the next time you hear someone pontificating about spelling or punctuation, and making a fetish of grammar. These things have their place, to be sure, but in the right order of things it is always a subordinate one: speech has primacy, and the language learned at our mother’s knee and spoken in the home and street is the vital source and origin, not to be disparaged but rather revered and respected.

So there.

Three Misleading Oppositions, Three Useful Axioms

There is an interesting comparison to be made between people and language: we can – especially when we are young and earnest – come to see both as standing in need of improvement, though essentially perfectible (with ourselves as the agents of perfection, naturally); only when we are older do we come to think that it might be better to accept both as they are and accommodate ourselves to their quirks and foibles, rather than seek to correct them.

Language allows words to have a range of meanings, some of which are contradictory – but where once I would have deplored that and sought to correct it – in pursuit of some Holy Grail of clarity – I now think it better to accept it but be aware of it, and consider what effect it has on our thinking. 

Of particular interest to me as a writer are a number of oppositions that we make and often take for granted, which I think can mislead us. Three I would like to single out are:

truth and fiction

real and imaginary

invention and discovery

‘Telling tales’ can be a matter for praise or opprobrium, depending on whether we are talking about Homer or the class sneak, but it is interesting that we use the same words for both – ‘just a story’, ‘a mere tale’ and ‘pure fiction’ can all be synonyms for ‘lies’, yet we can also speak of fiction telling us profound truths. Although we can (usually) distinguish specific instances without much difficulty, this use of the same word for both creates a kind of infection, so that all fiction is tainted with the suspicion of falsehood and – more importantly, perhaps – it is assumed that the truth must lie elsewhere and have a different form.

So, I would say: always remember that fiction can be true.

In the same way, we use ‘imaginary’ and ‘made-up’ to mean ‘false’ and ‘not real’ yet if we take ‘imaginary’ to mean ‘the product of imagination’ then surely everything that we think of as characteristically human – that is, anything that is not the unassisted product of nature – is imaginary, in the sense that it is something we have ‘thought up’ or ‘made up’ – trains and boats and planes, canals and agriculture, cities – all these things are ‘real’ yet equally none of them has come about by accident – they are the results of design and forethought, of deliberation – they originate in the human imagination, in our ability to envisage what is not present to us, to manipulate things mentally (and isn’t it interesting that ‘seeing things which aren’t there’ serves as a synonym for insanity as well as an exact description of imagination? – this is a division that runs deep).

So, likewise, do not forget that something can be both imaginary and real.

And are these products of our imagination inventions or discoveries? We tend to use the former to mean things that we have brought about by our own efforts, things that did not exist before we dreamed them up – bicycles and steam engines, say – while we reserve the latter for things that were ‘there all along’ but which we have at some point come upon or uncovered – like penicillin, maybe, or the source of the Nile, or gravity – yet is the distinction as clear-cut as it might seem at first glance?

To begin with, both words mean much the same, etymologically – ‘invention’ is from the Latin for ‘to come upon’ and can still occasionally be found in that sense in English (‘The Invention of the True Cross’ (3 May) was a Catholic feast-day commemorating the discovery of the supposed cross of Jesus by St Helena, Constantine’s mother, though it has afforded wags like Rabelais the opportunity for witticisms – ‘The Invention of the Holy Cross Personated by Six Wily Priests’ is one of the many fantastically-titled books found by Pantagruel in the library of St Victor). And any invention could equally be described as the discovery and application of existing principles.

So: inventions generally involve discovery, too – to say that something is an invention does not preclude the possibility that it existed beforehand and independently, in some form.

Is music a discovery or an invention? Is mathematics? Is God?

Are they real or imaginary?

Truth or fiction?

– all questions rewarding to dwell upon on a rainy afternoon.