
Iconoclast: A Neuroscientist Reveals How to Think Differently


  The Anatomy of Vision

  Vision begins in the eye.4 The human eye is divided into two components: a lens system and a detector system. The outermost part of the eye, the cornea, collects incoming light rays and passes them through the lens. The lens takes the incoming light rays and focuses them onto the retina, which covers the inner surface of the rear of the eyeball. The lens functions in the same way as a camera lens, but unlike a camera lens, which is made from glass or plastic, the human lens is living tissue made up of cells that have elongated into very thin fibers. These fibers attach to muscles that surround the periphery of the lens. When the muscles contract, the lens becomes rounder, changing the focus point.

  Light entering the eye is projected onto the retina, and it is here that the first transformation from physical image to mental image occurs. The light strikes a specialized type of nerve cell called a photoreceptor. These cells contain special pigments that absorb energy from incoming photons and convert this energy into an electrical impulse. There are two types of photoreceptors: rods and cones, which are named for their shapes under a microscope. The rods have a larger surface area and can detect a few photons at a time, which makes them ideally suited for night vision. The tip of a cone is much smaller and is less sensitive, but the cones are packed close together in the center of the retina. This tight packing makes them ideal for picking up fine details. The cones also contain three different pigments, and the relative concentration of these pigments in a particular cone determines the range of colors it responds to.

  Until this point, the eye functions much like a digital camera. But unlike a camera’s detector, the photoreceptors in the retina are not spaced uniformly on a grid. Because the cones are packed densely near the center of the retina, and the rod spacing is sparse near the periphery, our ability to make out details of objects declines with distance from the center of vision. So even before the photoreceptors transmit electrical signals to the brain, the image has been fractionated in a way that gives premium bandwidth to things that are in the center of the visual field. By constantly moving your eyes, however, you’re able to construct a mental image of your surroundings. Your brain can keep track of this mental image and fill in the gaps in vision by making guesses that are generally pretty good. There are circumstances, however, in which these guesses fail, and it is under these conditions that the brain makes incorrect assumptions about what it is seeing. It turns out that the ways in which the brain makes these assumptions are the same ways that make it difficult to think like an iconoclast.

  For example, there is the blind spot. Cats and dogs don’t have blind spots. The phenomenon is unique to humans and other primates. In humans, the photoreceptors are covered by a thin sheet of neurons that make connections (synapses) with the photoreceptors. This sheet of neurons performs basic image processing and then passes the signals on to the brain through the optic nerve. The optic nerve is a cordlike structure that contains the fibers from all the retinal neurons. Because they are collected into one place and pass through a hole in the retina, no photoreceptors can occupy that space, and a blind spot results. But even though you have a hole in the retina, you don’t see a black hole in your visual field. The brain mentally fills it in with its best guess of what should be there.

  The photoreceptors transmit electrical signals to a thin sheet of neurons that immediately begins to transform the pristine image that has fallen on the retina. Even before these signals leave the eye, they have already been changed in a way that is no longer an exact representation of the world. These neurons, which are called retinal ganglion cells, serve two primary purposes: to collect the visual information from the photoreceptors and transmit it to the brain; and to perform “gain” control. The retinal ganglion cells sense the intensity of light that hits the retina and adjust their output to stay within a constant range. These cells turn up the gain for night vision and turn it down in daylight. The retinal ganglion cells also place a limit on how fast visual information can reach the brain. By measuring how frequently the ganglion cells fire and by estimating how many ganglion cells cover the retina, we can estimate the bandwidth of the human eye. One estimate put the bandwidth at about 1 MB per second, which is about the speed of a cable modem.5
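
  The estimate described above is simple arithmetic: multiply the number of ganglion cells by the amount of information each one carries per second. The sketch below assumes a round figure of one million ganglion cells per eye and a placeholder rate of 8 bits per second per cell. Both inputs are illustrative assumptions chosen to land near the 1 MB per second quoted above, not measured values, and the function name is hypothetical.

```python
def estimate_eye_bandwidth_bytes_per_s(ganglion_cell_count, bits_per_cell_per_s):
    """Rough bandwidth of one eye: cells times bits per cell, converted to bytes."""
    total_bits_per_s = ganglion_cell_count * bits_per_cell_per_s
    return total_bits_per_s / 8  # 8 bits per byte


# Assumed, illustrative inputs (not measurements).
ASSUMED_GANGLION_CELLS = 1_000_000
ASSUMED_BITS_PER_CELL_PER_S = 8.0

bandwidth = estimate_eye_bandwidth_bytes_per_s(
    ASSUMED_GANGLION_CELLS, ASSUMED_BITS_PER_CELL_PER_S
)
print(f"{bandwidth / 1_000_000:.1f} MB per second")
```

  Because the result scales linearly with both inputs, any uncertainty in the cell count or the firing rate carries straight through to the final figure.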

  After traveling down the optic nerve, the electrical impulses make a single synapse in a lime-sized structure called the thalamus and are then transmitted to the cortex, where the real transformations take place and where the mental image is constructed. Interestingly enough, the cortical visual system is laid out in a well-defined topography. The initial site of visual processing is in the part of the brain called the occipital cortex, which lies at the very back of the head. This rather large swath of territory is devoted exclusively to the initial processing of visual signals and is called, not surprisingly, area V1. If you electrically stimulate the neurons in V1, the person will “see” visual phosphenes. For the same reason, if you get hit in the back of the head, the stars you see result from the electrical discharge of these neurons. There is an almost exact representation of the retina here, with each location in the retina being mapped onto a grid in V1. The neurons here process the incoming information and extract basic features from the image such as the location of edges, their orientation relative to horizontal, and the image disparity between the two eyes, which is one element of depth perception.

  The Anatomy of Perception

  Until V1, the visual system behaves like a video camera. Although the neurons perform low-level processing of the visual signals, this type of processing does not yet constitute perception. In fact, we are not even aware of what our brains are doing at this point. After V1, however, the information flows from the back of the head in a generally forward direction toward the frontal lobes. The information takes two paths: the high road and the low road (see figure 1-1). The high road, which is a route of information flowing over the top of the brain, extracts information about where objects are located in space relative to the body. The low road, which is a pathway flowing through the temporal lobes above the ears, processes the visual information in a way that categorizes what a person sees. These two routes, the “where” and the “what” pathways, coordinate with each other so that the end result is a seamless perception of what the eyes transmit. For example, although you move your head and eyes constantly, your brain does not lose track of the objects surrounding you. This is a complicated process, and the only way in which it can be done efficiently is through a process called predictive coding. So that it is not overwhelmed with information processing, the brain makes predictions about what it is seeing and changes these predictions only when it makes an error. As we delve deeper into how this occurs, we shall see the points at which it becomes difficult to see things in ways different than you expect. And yet, the ability to do this is absolutely essential for iconoclastic thinking.
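
  The idea of holding a prediction and revising it only after an error can be illustrated with a toy sketch. It is not a model of how cortical circuits implement predictive coding; the function name, the threshold, and the brightness readings are all hypothetical.

```python
def track_with_predictions(observations, error_threshold):
    """Keep one running prediction and revise it only when it is clearly wrong.

    Returns the prediction held after each observation and the number of revisions.
    """
    prediction = observations[0]  # start by believing the first input
    held_predictions = []
    revisions = 0
    for observed in observations:
        error = observed - prediction
        if abs(error) > error_threshold:
            prediction = observed  # surprise: update the internal model
            revisions += 1
        held_predictions.append(prediction)
    return held_predictions, revisions


# Hypothetical brightness readings: small jitter, then a genuine change.
readings = [10.0, 10.2, 9.9, 10.1, 15.0, 15.1, 14.9]
predictions, revisions = track_with_predictions(readings, error_threshold=1.0)
print(predictions)
print(revisions)
```

  Small fluctuations never reach the threshold, so the prediction stays put and no work is done. Only the jump in the fifth reading forces a revision, which is the sense in which surprise is needed to dislodge an expectation.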

  FIGURE 1-1

  The “what” and the “where” pathways flowing forward from the primary visual area of the brain

  In the early stages of visual processing, such as V1, the brain performs its functions on a local scale. Neurons here do not have information about what other neurons are doing. As the information flows forward through the “what” and the “where” pathways, the information from other parts of the retina becomes increasingly integrated, to the point where the information ceases to be based on retinal location at all. By the time we become aware of what we are seeing, we perceive the visual stream not as a rectangular grid of light and dark spots, but as a landscape of stationary and moving objects, each with its own identity.

  To appreciate the transformation from local to global information processing, examine figure 1-2. The figure consists of only three Pac-Man shapes and three pairs of lines, but you perceive a white triangle floating above the background.6 There is no triangle, but your brain, using its global processing mode, perceives one anyway. You can force yourself to drop down from global processing by staring at one of the individual elements, but this is generally a temporary state as your brain wants to make sense of what it is seeing. A flick of the eyes, and you are back to seeing the floating triangle.

  FIGURE 1-2

  The Kanizsa triangle

  Which is the true perception: Pac-Man or triangle? Regardless of which you see, the information coming from the eyes remains constant. Perception, then, is a product of the mind and brain, not the eyes. Unless you grew up in the ’70s, glued to an Atari console, the relative dominance of the triangle perception illustrates the brain’s tendency to perceive things as it expects them to be. Triangles are more common than Pac-Men. The perception of a floating triangle also provides a unified interpretation of the entire figure. At a global level, this makes more sense to the brain than the alternative perception of three Pac-Men clustering around an empty space.

  The triangle illusion demonstrates a key rule of perception: the most likely way that you perceive something will be in a manner consistent with your past experience. Commonplace perceptions feel comfortable and cost little energy to process. Conversely, uncommon perceptions force the brain into a different mode of processing in which it must figure out what exactly it is seeing, and this costs energy.

  The issue of how the brain creates perceptions from raw visual inputs is of critical importance to being an iconoclast. The iconoclast doesn’t literally see things differently than other people. More precisely, he perceives things differently. There are several different routes to forcing the brain out of its lazy mode of perception, but the theme linking these methods depends on the element of surprise. The brain must be provided with something that it has never before processed to force it out of predictable perceptions. When Chihuly lost an eye, his brain was forced to reinterpret visual stimuli in a new way.

  The Iconoclast Who Discovered MRI

  It is easy to take for granted the remarkable advances that medical technology has showered upon us. In an age of CAT scans and MRIs, the image of a human brain doesn’t inspire quite the same awe that it used to. But this is a recent phenomenon. MRI is only thirty years old. The brain is the central player in iconoclasm, and much of what we know about the human brain comes from MRI.

  The story of MRI—magnetic resonance imaging—is itself a story of iconoclasm. The basic physical principle behind MRI was actually discovered in the 1940s. When an atom is placed in a magnetic field, the atom will start vibrating. This is called nuclear magnetic resonance (NMR). The rate at which the atom vibrates is determined by what kind of atom it is and the strength of the magnetic field. If you put enough atoms in a magnetic field, they will all vibrate in synchrony, and you can actually listen to this vibration with a radio antenna. Until the 1970s, this was all standard technology for chemists, who used NMR as part of their toolkit to analyze chemicals, at least until Paul Lauterbur’s revolutionary insight. Lauterbur was a chemist by training who had specialized in the study of NMR spectra of naturally occurring proteins. Because of his expertise, he had maintained informal consulting roles with some of the companies that manufactured NMR equipment. While Lauterbur was consulting with one of these companies, based in a Pittsburgh warehouse, a visiting researcher from Johns Hopkins University was experimenting with the NMR spectra of cancer tissue. He wanted to see whether NMR could distinguish normal tissues from cancerous ones.

  Indeed, NMR could tell the difference between healthy and cancerous tissue, but there was a big problem. It couldn’t tell you where the differences were. Because NMR came out of chemistry, which had traditionally focused on the analysis of test-tube samples, no one had really thought about using NMR to locate differences inside the samples themselves. Conventional wisdom said it shouldn’t matter, reasoning that you could always put a tissue sample into the NMR spectrometer.

  Lauterbur thought differently and believed that NMR could be used to find the locations of differences in a tissue sample. One of the big limitations of the technology was the difficulty of constructing a magnet that had a uniform magnetic field. Nonuniformities in the field resulted in “blurry” chemical signals. Most chemists dismissed this as noise. But Lauterbur started to wonder whether the noise could actually be turned to an advantage. His insight came at a Big Boy restaurant and was scribbled on the back of a napkin. Lauterbur later recalled that “on the second bite of a Big Boy hamburger,” he was struck by an idea. Maybe that “blurring” contained embedded information that he could decipher. “Heck,” he said, “you could make pictures with this thing!”7

  Lauterbur’s epiphany led him to the idea of purposely making the magnetic field nonuniform. In NMR, this was heretical. But Lauterbur realized that if this inhomogeneity was applied in a predictable way, such as left to right, then the atoms in those different locations would vibrate at slightly different frequencies. These frequency differences could then be assembled into a crude image. Lauterbur tested his idea in the simplest way possible. He embedded a test tube filled with one kind of water inside a test tube filled with another kind of water. Applying an altered magnetic field, he produced the first cross-sectional magnetic resonance image.
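
  The principle that position sets frequency can be sketched in one dimension. The code below is a simplified illustration, not a reconstruction of Lauterbur’s own procedure: it assumes a left-to-right field variation that shifts the frequency by a fixed amount per metre (the constant HZ_PER_METRE is a made-up value), sums the signals from two hypothetical samples, and uses a discrete Fourier transform to separate the frequencies and map them back to positions.

```python
import cmath
import math

# Assumed frequency shift per metre of position (gradient strength folded in).
HZ_PER_METRE = 1000.0


def record_signal(sources, n_samples, sample_rate_hz):
    """Sum the oscillations of every source; each source is (position_m, amplitude)."""
    signal = []
    for k in range(n_samples):
        t = k / sample_rate_hz
        total = 0j
        for position_m, amplitude in sources:
            frequency_hz = HZ_PER_METRE * position_m  # position sets frequency
            total += amplitude * cmath.exp(2j * math.pi * frequency_hz * t)
        signal.append(total)
    return signal


def reconstruct_profile(signal, sample_rate_hz):
    """Discrete Fourier transform, with each frequency bin mapped back to a position."""
    n = len(signal)
    profile = []
    for m in range(n):
        coefficient = sum(
            s * cmath.exp(-2j * math.pi * m * k / n) for k, s in enumerate(signal)
        ) / n
        signed_bin = m if m < n // 2 else m - n  # upper bins are negative frequencies
        frequency_hz = signed_bin * sample_rate_hz / n
        profile.append((frequency_hz / HZ_PER_METRE, abs(coefficient)))
    return sorted(profile)


# Two hypothetical samples: one 2 cm left of centre, one 3 cm right of centre.
sources = [(-0.02, 1.0), (0.03, 0.6)]
signal = record_signal(sources, n_samples=128, sample_rate_hz=128.0)
peaks = [(x, round(a, 3)) for x, a in reconstruct_profile(signal, 128.0) if a > 0.1]
print(peaks)
```

  The recorded signal is a single jumble of oscillations, yet the transform recovers two peaks whose positions and heights match the two samples. With a uniform field both samples would share one frequency and could not be told apart.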

  He wrote up the results and submitted them to the top scientific journal, which promptly rejected them. As Lauterbur recalled, “Many said it couldn’t be done, even when I was doing it!” Of course, the scientific establishment eventually came around, and Lauterbur’s insight changed medicine forever. He received the Nobel Prize in Medicine in 2003, thirty years after his discovery.

  What is interesting about Lauterbur’s discovery is that we can trace the moment at which he broke out of conventional thinking. There are striking similarities to Chihuly’s story, and the visual nature of their insights is remarkable. What others had written off as noise in the NMR signal, Lauterbur saw as something else. He saw the potential of hidden information.

  Over and over again, iconoclasts like Lauterbur and Chihuly point to the visual nature of their insights. And so visual perception is where the hunt for the iconoclastic brain begins.

  Persons, Places, and Things

  After V1, the visual information splits into the high road and the low road, to meet up eventually in the frontal cortex. Along these two roads, the brain transitions from local processing mode to global processing and makes judgments of object identities and their locations in space. As you might imagine, it is an incredibly complex feat to perform. Only the most powerful computers can perform the task of identifying objects and cross-referencing them with a catalog of labels and images from memory. Although it is a trivial task for you to distinguish an automobile from a bicycle, no matter from which direction you see them, a computer would have a great deal of difficulty doing this. Both objects have wheels, yet the wheels may not be visible when the objects are viewed from behind. Imagine the even more complex task of how we distinguish different people from one another. Everyone has the same basic anatomy, and yet we are able to identify people, sometimes from extreme angles in which we don’t even see their face full on.

  The ability to perform such complicated perceptual functions comes with a price. Evolution has resulted in a human brain that can accomplish amazing perceptual tasks, all the while saving energy. The need to distinguish friend from foe, or predator from prey, and to do it quickly enough to decide whether to run or fight, meant that the brain had to take shortcuts and make assumptions about what it was seeing. From the earliest levels of processing in the visual system, the brain extracts useful pieces of information and discards others. Depending on which road the information takes, the bits retained or discarded may be different. The high road is concerned with extracting where objects are located and throws away the elements related to their identity. The low road, on the other hand, is concerned with identification and categorization, and less so with objects’ spatial locations.

  Although the spatial location of what we see may be important, most of what iconoclasts do differently from other people lies in how they categorize what they see. Whether one person sees ugliness or beauty in asymmetry is entirely a result of categorization. In the same way, whether an NMR spectrum is viewed as noisy or full of extra information doesn’t come from the image itself, but from the way the viewer categorizes the image. For this reason, understanding how the low road pigeonholes objects into categories suggests ways out of predictable perception.

  As in playing the game 20 Questions, the first, and most salient, decision the brain makes is whether it is viewing a person or something else. People constitute a special category of objects. The high degree of social interaction, both at the level of facial and body expression and in the use of language, dictates that the brain treats people differently than anything else. So specialized is this function, neuroscientists have identified the precise location in the brain that responds to human faces. If we were to examine the brain from its underside, the temporal lobes would fan out like butterfly wings. The innermost portion of the lower wings contains neurons that respond only to faces and is called the fusiform face area, or FFA. Some of these neurons perform highly specialized functions and seem to be active only when viewing a face from a particular angle. Many years ago neuroscientists hypothesized that the level of specialization might go so deep that neurons might exist that responded to one thing, and one thing only. These hypothetical neurons were dubbed grandmother cells, because you might have neurons that fired only when you saw your grandmother. A great deal of specialization does exist in the FFA, although not to this degree (which is probably a good thing, because if your hypothetical grandmother cell became damaged, then you wouldn’t be able to recognize your grandmother anymore). Most aspects of facial processing appear to be carried out by a network of neurons in the FFA.8 This type of architecture is called distributed processing and is yet another example of how the brain efficiently organizes information. Because distributed processing employs a network of neurons that process different aspects of faces, no neuron is critical to the overall function, and the network gains a level of flexibility that lets it deploy resources in different ways under different circumstances. Distributed processing also means that the brain can reprogram its networks to perceive things differently.