Humans, according to the language theoretician Noam Chomsky, are born with a universal, “internal grammar,” which enables verbal communication. Although this idea is still controversial, it has received some support from genetic research: Certain mutations in a gene called FOXP2 significantly impair the ability to form sentences. If language is indeed a natural phenomenon stemming from the genes and the structure of the brain, then the tools of the natural sciences might be applied to discover where the roots of language lie.
The dictionary, for example, is a collection of words defined by other words, and it can be probed methodically to reveal universal laws governing human language. Prof. Elisha Moses and Dr. Tsvi Tlusty of the Institute’s Physics of Complex Systems Department undertook this challenge.
Together with Jean-Pierre Eckmann of the University of Geneva and summer student David Levary from Harvard University, they looked at the connections between the words – the network of relationships that arises when the words used in a definition are linked to the defined word. So for each word in their study, they drew lines connecting it to the words used to define it, as well as to those in the quotations demonstrating usage. This creates a complex graph with as many nodes as there are words in the dictionary, and if a word has multiple meanings then each one of them gets a node of its own.
For example, the Even-Shushan Hebrew dictionary defines the word love (ahava) thus: "strong affection, feeling of great or desirous attraction for someone or something.” To begin mapping a network with “love” at its center, one would draw a straight line from the word “love” to the words “strong,” “affection,” “feeling,” “desire,” “attraction,” and so on. Then, new sets of lines are drawn from the connected words to all the words in their definitions. Soon, a dense network of connections is generated between the words in the dictionary, with related words more likely to be found in proximity to each other.
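The mapping described above can be sketched as a directed graph: each defined word becomes a node, with edges pointing to the words used in its definition. The toy definitions below are illustrative stand-ins, not entries from any real dictionary.

```python
# A minimal sketch of the dictionary network. Each defined word is a node;
# directed edges point to the words appearing in its definition.
# The toy definitions below are hypothetical, for illustration only.

toy_dictionary = {
    "love": ["strong", "affection", "feeling", "attraction"],
    "affection": ["feeling", "fondness"],
    "feeling": ["emotion", "sensation"],
    "strong": ["powerful"],
}

def build_network(dictionary):
    """Return adjacency lists: word -> list of words in its definition."""
    graph = {}
    for word, definition in dictionary.items():
        graph[word] = list(definition)
    return graph

network = build_network(toy_dictionary)
print(network["love"])  # the lines drawn from "love" to its defining words
```

In a real study, each sense of a polysemous word would get its own node, as the article notes, and words from usage quotations would be linked as well.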
Because even a large dictionary is finite, a full network of all the words contained within it can, in principle, be mapped. This large network typically resolves into smaller, partial networks, composed of words that relate to the same subject area.
Closing a loop
Following the interconnecting lines of the dictionary network will often bring one back to one’s starting point. In other words (no pun intended), the network closes in on itself, and a word, by extension, becomes a part of its own definition. Though it appears to be a tautology, such cyclical connections may be deeply rooted in the fundamentals of language. The researchers found that in a dictionary containing around 100,000 words, some 6,000 will circle, through the network, back on themselves. Moses and Tlusty investigated further, discovering that many of the words that close in on themselves belong to the relatively small subset of the dictionary that is considered “basic vocabulary.” (The size of the basic vocabulary varies with the definition: Ogden chose 850 words for his Basic English, while the Jōyō kanji list in Japanese covers 2,136 characters.)
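Finding the words that circle back on themselves amounts to asking, for each node in the definition graph, whether a directed path of definition links leads back to it. A simple depth-first search over a toy graph (hypothetical definitions, chosen so that “love” and “affection” define each other) illustrates the idea:

```python
# Sketch: a word "closes in on itself" if, following definition links,
# one can return to the starting word. Toy graph for illustration only.

graph = {
    "love": ["affection", "strong"],
    "affection": ["feeling", "love"],   # "affection" points back to "love"
    "feeling": ["emotion"],
    "strong": ["powerful"],
    "emotion": [],
    "powerful": [],
}

def reaches_itself(graph, start):
    """Depth-first search: can we return to `start` via definition links?"""
    stack = list(graph.get(start, []))
    seen = set()
    while stack:
        word = stack.pop()
        if word == start:
            return True
        if word in seen:
            continue
        seen.add(word)
        stack.extend(graph.get(word, []))
    return False

looping = sorted(w for w in graph if reaches_itself(graph, w))
print(looping)  # ['affection', 'love'] — the words on a definitional cycle
```

On a real 100,000-word dictionary one would use an efficient strongly-connected-components algorithm rather than a per-word search, but the question being answered is the same.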
Kurt Gödel famously dealt with the paradox of circular logic in his incompleteness theorem, which states that in any consistent formal system rich enough to describe arithmetic, there will always be true statements that cannot be proved within the system. A dictionary is also a sort of closed system, and upon consideration one realizes that it is impossible to create a set of definitions that never loops back on itself. Circularity appeared historically with the Ouroboros – the image of a serpent biting its own tail, which probably first appeared in Egypt and has played a role ever since in philosophy, religion, alchemy and psychology.
Nature has no problem with circularity: DNA, for instance, encodes the information needed to make proteins, but those very proteins activate DNA and regulate its activities. So if language is a natural phenomenon, arising from the basic patterns of living structures, it might not be so surprising to discover closed cycles that loop back on themselves, concepts that are explained by referring back to the concept itself.
The scientists say that this basic structure is so fundamental to the dictionary network that every time a new concept is added, a loop will form to bring its definition back around. They found that words connected to one another on the same loop were significantly more likely to have been coined, or to have had their meanings updated, in the same era. So the dictionary network turns out to reveal “peer relationships” among words.
In June 1857, three gentlemen named Richard Trench, Herbert Coleridge and Frederick Furnivall met in London to establish the Unregistered Words Committee. The idea was to produce a comprehensive English dictionary, a project they estimated would take around 10 years. It would eventually take 72 years to compile the first edition of the Oxford English Dictionary, which encompassed 10 volumes and included some 400,000 words and phrases, and 1,800,000 quotations. Hundreds of volunteers participated in the project by sending in quotations to demonstrate usage.
One of the best-known contributors was William Minor, who was later discovered to be a murderer. Minor was an army doctor who had suffered shell shock in the American Civil War. He later moved to England, where one night he killed a man in a manic fit and was confined to the Broadmoor Asylum. He spent the latter part of his life there, during which he began sending quotations to the OED’s editor, Sir James Murray, who even visited Minor at Broadmoor. The unusual relationship between the editor and his prolific contributor has been described by Simon Winchester in his book The Professor and the Madman.
Prof. Elisha Moses's research is supported by the Murray H. & Meyer Grodetsky Center for Research of Higher Brain Functions; and the J & R Center for Scientific Research.