How does our brain cope with the enormous flux of information that bombards our senses? One important neural strategy is the ability to "cluster," or categorize, data and thus make sense of the world around us.
Prof. Eytan Domany, head of the Weizmann Institute's Physics of Complex Systems Department, has developed a new method, or algorithm, for performing "clustering" on a computer. A patent application for the algorithm, whose physical aspects are described in the April issue of Physical Review E, has been filed through Yeda Research and Development Co., the Institute's technology transfer arm. The approach has great potential for use in data-heavy scientific and industrial applications.
For example, the algorithm may be used to analyze the vast stream of information collected by satellites orbiting the earth. It may also be of great help in "data mining," the process by which specific information, such as details on a particular product, is culled from the world's huge and constantly growing commercial data banks.
One of the most interesting aspects of the new algorithm is that it mimics unassisted learning, known in computer science as unsupervised learning. Unlike most automated "sorting" processes, in which a computer must be informed of the relevant categories in advance, Domany's algorithm is analogous to human intuition: it doesn't need to be told how the data is structured or how it should be broken down into groups. When confronted with a new clustering task, the algorithm analyzes the data, computes the degree of similarity between its components and picks its own criteria for breaking the data into clusters.
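The idea of letting the data supply its own grouping criterion can be illustrated with a small Python sketch. It is not Domany's algorithm; it is a deliberately simple stand-in that assumes a user-supplied distance function, links items through their shortest connections and then cuts at the widest gap between link lengths. The name auto_clusters and the "widest gap" rule are assumptions made for this illustration only.

```python
from itertools import combinations


def auto_clusters(items, distance):
    """Group items without being told the categories or their number.

    Illustrative stand-in, not the published method: builds a minimum
    spanning tree over the items and cuts it at the widest gap between
    consecutive edge lengths, so the threshold comes from the data.
    """
    n = len(items)
    # All pairwise dissimilarities, shortest first.
    edges = sorted((distance(items[i], items[j]), i, j)
                   for i, j in combinations(range(n), 2))
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Kruskal's algorithm: keep each edge that joins two separate groups.
    tree = []
    for d, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            tree.append((d, i, j))

    # Choose the cut from the data: the widest jump in tree edge lengths.
    lengths = [d for d, _, _ in tree]
    gaps = [b - a for a, b in zip(lengths, lengths[1:])]
    cut = lengths[gaps.index(max(gaps))] if gaps else float("inf")

    # Rebuild the groups using only the edges at or below the cut.
    parent = list(range(n))
    for d, i, j in tree:
        if d <= cut:
            parent[find(i)] = find(j)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(items[i])
    return sorted(groups.values())


# Six measurements, no categories supplied in advance.
groups = auto_clusters([1.0, 1.2, 0.9, 5.0, 5.1, 5.3],
                       lambda a, b: abs(a - b))
print(groups)
```

For these six numbers the widest gap lies between the low and the high values, so the sketch returns two groups. A crude rule of this kind always splits three or more items into at least two groups, which is one reason a more principled criterion is needed in practice.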
This is similar to the way in which a young child categorizes unfamiliar objects. For example, a child who has never seen a kangaroo or a bicycle, and is exposed to hundreds of different pictures of each, will eventually figure out that the pictures represent two types of objects; in other words, that the pictures form two "clusters," one of kangaroos and the other of bicycles. As in Domany's algorithm, this process takes place independently of any instruction about the nature of the categories involved. Adults, too, intuitively group together things that are alike, even when not explicitly taught to do this.
The algorithm has already proved itself in solving a variety of clustering problems. In a recent study conducted in collaboration with Yoram Gdalyahu and Dr. Daphna Weinshall of the Hebrew University of Jerusalem, it successfully sorted out 90 images of six different toy objects: three animals, two cars and a boy. Analyzing the lines making up the images, the algorithm correctly determined that the models fall into three different groups. It then further separated the three different animals and the two different cars.
In another task, the algorithm was asked to analyze the sounds of the alphabet as pronounced by 300 people, with the sound of every person's voice represented as a combination of more than 600 acoustic parameters. Without being given any instructions other than the command to look for clusters, the algorithm correctly organized this huge mass of data into clusters corresponding to letters of the English alphabet.
Domany got the idea for his algorithm from a well-known physical phenomenon that serves as a basis of magnetic recording. When a granular magnet, such as a magnetic tape, is warm, its grains form a disorganized mess. But when the magnet is cooled down, the grains progressively organize themselves into well-ordered clusters. Using the statistical mechanics of granular magnets, Domany was able to create an algorithm that can look for clusters in any type of data.
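The cooling analogy can be made concrete with a toy simulation. The Python sketch below is not the published algorithm. It is a minimal illustration that assumes a Potts-style model in which each data point carries a "spin," nearby points are coupled more strongly than distant ones, and a Swendsen-Wang-style update is run at a fixed temperature. The function name magnet_clusters, the Gaussian coupling, the number of spin states q and the rule of grouping points that share a cluster in more than half of the sweeps are all assumptions chosen for illustration.

```python
import math
import random


def _find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def magnet_clusters(points, temperature, q=10, sweeps=200, seed=0):
    """Toy 'granular magnet' clustering (illustrative assumptions throughout).

    Expects at least two distinct points given as coordinate tuples.
    """
    rng = random.Random(seed)
    n = len(points)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    dist = {(i, j): math.dist(points[i], points[j]) for i, j in pairs}

    # Length scale taken from the data: mean nearest-neighbour distance.
    nearest = [min(dist[min(i, j), max(i, j)] for j in range(n) if j != i)
               for i in range(n)]
    scale = sum(nearest) / n

    # Assumed coupling: strong for close points, vanishing for distant ones.
    coupling = {p: math.exp(-dist[p] ** 2 / (2 * scale ** 2)) for p in pairs}

    spins = [0] * n  # start fully ordered, so no burn-in is needed here
    together = dict.fromkeys(pairs, 0)
    for _ in range(sweeps):
        parent = list(range(n))
        # Freeze a bond between equal spins with a temperature-dependent
        # probability; warm systems freeze few bonds, cold ones many.
        for (i, j), strength in coupling.items():
            p_bond = 1.0 - math.exp(-strength / temperature)
            if spins[i] == spins[j] and rng.random() < p_bond:
                parent[_find(parent, i)] = _find(parent, j)
        # Give every bonded cluster a fresh random spin value.
        new_spin = {}
        for i in range(n):
            root = _find(parent, i)
            if root not in new_spin:
                new_spin[root] = rng.randrange(q)
            spins[i] = new_spin[root]
        # Record which pairs ended up in the same bonded cluster.
        for i, j in pairs:
            if _find(parent, i) == _find(parent, j):
                together[i, j] += 1

    # Group points that shared a cluster in more than half of the sweeps.
    parent = list(range(n))
    for (i, j), count in together.items():
        if count > sweeps / 2:
            parent[_find(parent, i)] = _find(parent, j)
    groups = {}
    for i in range(n):
        groups.setdefault(_find(parent, i), []).append(i)
    return sorted(groups.values())


# Two well-separated 3x3 grids of points, 18 points in all.
blob_a = [(x, y) for x in range(3) for y in range(3)]
blob_b = [(x + 20, y) for x in range(3) for y in range(3)]
points = blob_a + blob_b

cold = magnet_clusters(points, temperature=0.1)
hot = magnet_clusters(points, temperature=50.0)
print(len(cold), "groups when cold;", len(hot), "groups when hot")
```

With these settings the cold run should recover the two grids as two groups, while the hot run leaves all 18 points on their own, mirroring the ordered and disordered states of the magnet. In a fuller treatment one would scan a range of temperatures and look for the interval over which the grouping stays stable; the two fixed temperatures here were picked by hand for this example.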
Domany's algorithm is currently being applied to the analysis of the complex neural activity of the brain itself. The goal is to develop an automated process for sorting out brain images produced in response to different stimuli.
Domany's research team included Dr. Marcelo Blatt, Dr. Shai Wiseman and Gaddy Getz. Funding was provided in part by the German-Israeli Foundation for Scientific Research and Development (GIF).