Demis Hassabis, Father of Google's Artificial Intelligence
Part 3. The Brain, the Blueprint of the Mind
8 Postdoctoral Research and Interdisciplinary Work
Kim Kyung-ran, Kim Kyung-jin
One late afternoon in 2009, as Demis Hassabis walked along the Charles River in Cambridge, Massachusetts, two different worlds were colliding inside his mind. Behind him stood MIT, one of the world's top engineering universities, and across the river he could see the red brick buildings of Harvard. The geography was an apt metaphor for the intellectual position Hassabis occupied at the time.
He had just completed his neuroscience doctorate on memory and imagination at University College London (UCL). An ordinary scientist would have had to make a choice here. Become a biologist studying the brain, or become a computer scientist writing code. But Hassabis chose both.
He resolved to become a bridge connecting these two worlds and crossed the Atlantic to MIT's McGovern Institute for Brain Research. The lab he sought out belonged to Professor Tomaso Poggio. Poggio was a giant in the field of Computer Vision, a man who had devoted his life to mathematically modeling the way the human brain perceives objects.
Hassabis's reason for choosing this place was clear. At the time, artificial intelligence research was going through a slump known as an 'AI winter,' but Poggio's lab was different. Active attempts were underway there to translate the principles of the biological brain into computer algorithms.
It was here that Hassabis found the perfect environment to test his hypothesis: 'If we understand the brain, we can build intelligence.' During this period, Hassabis was receiving a very special grant from Britain's Wellcome Trust called the 'Henry Wellcome Fellowship.' This fellowship gave young scientists extraordinary freedom and funding to choose their own research topics and locations.
Thanks to this, Hassabis was not tied to any particular professor's project and could lay the foundational design for his own grand goal: Artificial General Intelligence (AGI).
He moved freely between MIT and Harvard, exchanging ideas with the finest minds of the era. Hassabis would visit Harvard labs to analyze the latest brain scan data, then return to MIT to design machine learning models to process that data.
In this process, he took note of the remarkable efficiency of the biological brain. The human brain perceives a complex world, walks, converses, and learns new things using only about 20 watts of energy. Meanwhile, computers of that era consumed enormous power yet struggled to tell a cat from a dog. Hassabis was convinced that the key to closing this gap lay in Computational Neuroscience.
This is the discipline of translating the way the brain's nerve cells exchange signals into mathematical formulas. He explored the possibility of incorporating the way the brain's visual cortex processes information in stages into Deep Learning technology. The atmosphere at MIT was passionate yet practical.
Hassabis refined his ideas through late-night discussions with researchers there. He felt acutely the limitations of the game AI he had built earlier. Characters in games only appeared smart within rules a programmer had defined in advance.
But what Hassabis wanted was true intelligence that could figure out rules on its own even when dropped into an unfamiliar environment. Through his research with Poggio, he realized that the 'hierarchical structure' the human brain exhibits when processing visual information was one answer. When we look at an object, the brain first recognizes simple lines and colors, combines them into shapes, and finally forms the concept: 'This is a car.'
This biological hierarchy later became the technical foundation that allowed DeepMind's AI to master the brick-breaking game Breakout by looking at nothing but pixels on a screen. His time in America left Hassabis with another asset: the scale of his ambition. Amid the academic atmosphere of Silicon Valley and Boston, he learned that research must not end with merely writing papers.
Great research had to be realized as technology that changes the world. He would say to colleagues he met there, without hesitation, 'We are going to solve intelligence.' Some may have thought him a braggart, but inside Hassabis's mind, his experience as a game developer, his knowledge as a neuroscientist, and the tool of machine learning were already fusing into one. He was now ready to take all these ingredients back to his hometown of London and launch the most audacious startup in human history.
The Henry Wellcome Fellowship and UCL's Gatsby Computational Neuroscience Unit
In 2010, Demis Hassabis wrapped up his life in America and returned to London.
His next destination was the Gatsby Computational Neuroscience Unit at University College London (UCL). The name may be unfamiliar to the general public, but to AI researchers, this place is something like a legendary holy site. It was a unit whose founding was led by Professor Geoffrey Hinton, who would later win the Nobel Prize in Physics, a place where the world's brightest and most mathematically gifted minds gathered to uncover the secrets of human intelligence.
Hassabis's choice of this place was no coincidence. He sensed it was the only soil where the seed called 'DeepMind' that he had been conceiving could take root. Hassabis was able to join the Gatsby Unit thanks to the Henry Wellcome Fellowship mentioned earlier.
The fellowship guaranteed him not only financial freedom but intellectual independence to determine the direction of his own research. At the time, the Gatsby Unit was led by Professor Peter Dayan. Dayan was a master of theoretical neuroscience, a figure who had mathematically defined how the brain learns and processes reward.
The Gatsby Unit was a furnace where neuroscience and machine learning melted together. The atmosphere there was intense. Researchers moved ceaselessly between biological brain experiment data and complex probability statistics.
It was here that Hassabis met Shane Legg, who would later become a co-founder of DeepMind. The two would sit in sandwich shops or pubs near campus during lunch and talk for hours. The topic was always one thing: 'How can we create human-level artificial intelligence (AGI)?' At the time, even uttering the phrase AGI in academic circles was treated as taboo.
AI research was trapped in narrow AI that solved only specific problems, and discussing general-purpose intelligence like a human's was treated as science fiction. But within the Gatsby Unit's atmosphere of free and fundamental inquiry, Hassabis and Legg found the courage to break that taboo. During his time at the Gatsby Unit, Hassabis dug deep into Systems Neuroscience.
This is the field that studies how the entire brain, not individual cells, integrates information as a unified system, stores memories, and simulates the future. He focused on the role played by the brain's hippocampus. When we sleep, the hippocampus rapidly replays the experiences of the day, like rewinding a video, strengthening memories.
Hassabis believed this biological mechanism could be the key to dramatically improving how efficiently AI learns. His experience here taught Hassabis the power of being a hybrid. Among pure neuroscientists, he would talk about the clarity of computer algorithms; among machine learning engineers, he would talk about the flexibility of the brain.
The two groups used different languages but were solving the same problem, he realized. The question: 'What is intelligence?' His time at the Gatsby Unit was Hassabis's final incubation period just before founding DeepMind.
He did not stop at absorbing the latest theories there; he began gathering colleagues who could implement those theories as working code. The Gatsby Unit gave Hassabis the gift of academic rigor. He learned to build models that were mathematically provable and biologically sound, rather than relying on intuition that something 'seemed likely to work.'
The theoretical foundation laid here later became the groundwork that enabled AlphaGo to break through the complexity of Go and win. When Hassabis founded DeepMind in 2010, it was no coincidence that most of the company's early members came from the Gatsby Unit or were connected to it. The time and freedom provided by the Henry Wellcome Fellowship, and the intellectual community of the Gatsby Unit, served as the decisive crucible that transformed the young Hassabis into a world-class AI leader.
Reinforcement Learning and the Dopamine System: Neuroscience + Computation = A Clue to General Learning
Even under London's overcast skies, the UCL lab was heated by intellectual fervor. Professor Peter Dayan, whom he met there, was like a sage who could answer the question Hassabis had long carried. Peter Dayan was one of the pioneers of Computational Neuroscience, a figure who had mathematically modeled how our brains learn through reward. The meeting of Hassabis and Dayan was like the meeting of Steve Jobs and Steve Wozniak in the 1970s, before they founded Apple, a moment when different talents locked together perfectly.
The core subject they explored together was Reinforcement Learning. To explain reinforcement learning simply, it is similar to training a dog. When a dog sits in response to the command 'sit,' you give it a treat. The dog then connects 'the act of sitting' with 'the treat (reward)' and gradually gets better at that behavior.
But the interest of Hassabis and Dayan lay in explaining this simultaneously at the biological level and the computer algorithm level. In the mid-1990s, Peter Dayan redefined the role of dopamine, a neurotransmitter found in the brain. People commonly think of dopamine as the 'pleasure hormone,' but Dayan and his colleagues discovered that dopamine signals 'reward prediction error.'
What does this mean? Suppose you put coins into a vending machine expecting coffee, but nothing comes out. At that moment, your brain encounters the result 'nothing' when it was in a state of 'expecting coffee.' The gap between expectation and actual result: that is 'prediction error.' The brain's dopamine cells then stop firing or sharply reduce activity, sending a signal of 'disappointment.' Conversely, if the vending machine dispensed coffee along with a ten-thousand-won bill out of nowhere? Since the reward far exceeded expectations, the dopamine cells fire explosively. This is 'positive prediction error.'
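The vending machine example can be written out as simple arithmetic. The sketch below is only an illustration: the function name and the reward units (1.0 for one cup of coffee, 10.0 for the surprise bill) are assumptions chosen for this example, not figures from the research.

```python
def prediction_error(expected_reward: float, actual_reward: float) -> float:
    """Reward prediction error: what actually happened minus what was expected."""
    return actual_reward - expected_reward


# Hypothetical reward units: 1.0 stands for one cup of coffee.
expected = 1.0

nothing_comes_out = prediction_error(expected, 0.0)    # negative: disappointment
coffee_as_expected = prediction_error(expected, 1.0)   # zero: no surprise
coffee_plus_bill = prediction_error(expected, 11.0)    # positive: pleasant surprise

print(nothing_comes_out, coffee_as_expected, coffee_plus_bill)
```

The sign of the result mirrors the three cases in the story: a dip in dopamine activity, no change, and a burst of firing.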
Hassabis was electrified by the fact that this biological discovery matched exactly with the computer science algorithm known as TD Learning (Temporal Difference Learning). The mathematical formula that computer scientists had devised to train machines had, it turned out, already been operating inside our brains for hundreds of millions of years. This was a tremendous discovery.
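For readers who want to see what TD Learning looks like in practice, here is a minimal sketch of its simplest form, usually called TD(0). The three-step episode, the state names, and the values chosen for the learning rate `alpha` and the discount `gamma` are all assumptions made for illustration.

```python
def td_update(values, state, reward, next_state, alpha=0.1, gamma=1.0):
    """One TD(0) step: nudge V(state) toward reward + gamma * V(next_state)."""
    next_value = 0.0 if next_state is None else values[next_state]
    td_error = reward + gamma * next_value - values[state]  # the prediction error
    values[state] += alpha * td_error
    return td_error


# Hypothetical episode: insert a coin -> wait -> coffee arrives (reward 1.0).
episode = [("coin", 0.0, "wait"), ("wait", 0.0, "coffee"), ("coffee", 1.0, None)]
values = {"coin": 0.0, "wait": 0.0, "coffee": 0.0}

for _ in range(500):
    for state, reward, next_state in episode:
        td_update(values, state, reward, next_state)

print({state: round(value, 2) for state, value in values.items()})
```

As the episode repeats, the expectation of reward spreads backward from the moment the coffee arrives to the moment the coin goes in, and the error at each step shrinks toward zero once the outcome is fully predicted.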
It gave him the conviction: 'We don't need to invent alien technology to build AI. We just need to copy the blueprint already inside our heads.'
Through his collaboration with Peter Dayan, Hassabis saw the possibility of AI that was not merely classifying data, but 'acting on its own and learning from the results.' At the time, most AI research was focused on 'supervised learning,' such as identifying whether a photo showed a cat or a dog: solving problems where the answer key already exists. But reinforcement learning has no answer key.
An agent (AI) interacts with an environment and, after tens of thousands of trials and errors, discovers strategies on its own. This is exactly how a child learns to walk. Falling down, feeling pain, getting back up, and figuring out balance on its own.
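Learning by trial and error without an answer key can be shown with a very small experiment. In the sketch below, an agent chooses among three levers whose payout probabilities it never sees. The function name, the probabilities, and the exploration rate `epsilon` are hypothetical, and this "bandit" setting is a deliberately simplified stand-in for the richer environments discussed in the text.

```python
import random


def run_bandit(win_probs, steps=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking lever, sometimes explore."""
    rng = random.Random(seed)
    estimates = [0.0] * len(win_probs)  # the agent's learned value of each lever
    counts = [0] * len(win_probs)       # how many times each lever was pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(len(win_probs))  # explore at random
        else:
            action = max(range(len(win_probs)), key=lambda a: estimates[a])
        # The environment answers with a reward only, never with the "right" action.
        reward = 1.0 if rng.random() < win_probs[action] else 0.0
        counts[action] += 1
        # Incremental average: move the estimate toward the observed reward.
        estimates[action] += (reward - estimates[action]) / counts[action]
    return estimates, counts


estimates, counts = run_bandit([0.2, 0.5, 0.8])
best = max(range(len(estimates)), key=lambda a: estimates[a])
print("lever the agent prefers:", best)
print("pulls per lever:", counts)
```

Nobody tells the agent which lever is best. It can only form that judgment from the rewards its own choices produce.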
This research gave Hassabis a crucial clue about 'generality.' The dopamine system operates the same way whether we are learning to ride a bicycle, play an instrument, or play Go. In other words, a hypothesis forms: there may exist a single unified algorithm in the brain that governs all kinds of learning.
Hassabis believed that if he could implement this 'one algorithm' in a computer, that AI would not only play Go well but could also solve protein structures and tackle climate problems. The time spent in Peter Dayan's lab later became the decisive theoretical foundation for DeepMind's founding mission: 'to build a general-purpose learning algorithm.'
Clues for AI Algorithms Found in the Human Brain: The Theoretical Foundation of DQN
The insights Hassabis gained while working with Peter Dayan crystallized in 2013 into DeepMind's first major achievement: the DQN (Deep Q-Network).
DQN was the breakthrough that put DeepMind's name on the map and the decisive technology that led Google to acquire this small London startup for over 500 billion won. The core idea behind this technology came, remarkably, from the brain's hippocampus, which Hassabis had studied during his doctoral research. DQN was designed to play classic 1980s Atari video games better than humans.
But this AI was not told the rules of the games. It was given only the pixel information on screen (vision) and the score (reward). At first, the AI pressed buttons randomly and behaved erratically, but over time it taught itself how to earn points.
But there was one critical problem. When the computer learned from continuous game frames, the data points were so similar to each other that learning became unstable or the system forgot what it had learned. This is called 'Catastrophic Forgetting.' Facing this obstacle, Hassabis drew on his knowledge as a neuroscientist. 'How do humans learn new things without forgetting old ones?'
He focused on the hippocampus's 'Experience Replay' function. When we sleep or rest, the brain replays important experiences from the day in random order and at high speed. It is like studying for an exam by shuffling and reviewing key material.
Through this process, short-term memories are solidly stored as long-term memories. Hassabis applied this biological principle directly to the AI's code. He designed the AI to not immediately discard the many scenes and experiences it accumulated while playing games, but instead store them in a reservoir called an 'Experience Buffer.'
Then, during training, the system drew not only on current experiences but also randomly sampled past experiences from the buffer to learn from simultaneously. This is the core technique of DQN: Experience Replay. It worked.
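The idea of storing experiences and replaying them at random can be sketched in a few lines. This is an illustration rather than DeepMind's actual code: the class name follows the text's 'Experience Buffer,' while the capacity, the batch size, the tuple layout, and the choice of uniform random sampling are assumptions made for this example.

```python
import random
from collections import deque


class ExperienceBuffer:
    """Fixed-size store of past experiences, sampled at random for training."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently drops the oldest entry once it is full.
        self._memory = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def store(self, state, action, reward, next_state, done):
        self._memory.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Random sampling breaks up the run of near-identical consecutive frames.
        return self._rng.sample(list(self._memory), batch_size)

    def __len__(self):
        return len(self._memory)


buffer = ExperienceBuffer(capacity=1000, seed=0)
for frame in range(1500):  # hypothetical stream of game frames
    buffer.store(frame, 0, 0.0, frame + 1, False)

batch = buffer.sample(32)
print(len(buffer), "experiences stored;", len(batch), "sampled for one training step")
```

Because each training batch mixes moments from many different points in the game, the learner no longer sees only a run of nearly identical frames, which is the instability described above.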
The AI's learning stabilized remarkably, and performance improved explosively. In the brick-breaking game Breakout, DQN's progress looked like this: an AI that initially could not even hit the ball was perfectly returning it after 300 rounds of training, and past 500 rounds, it discovered on its own a 'tunneling' strategy that nobody had ever taught it.
It concentrated on one end of the brick wall, drilled a hole through, then sent the ball behind the wall so the bricks broke by themselves, an advanced technique. David Silver and other DeepMind researchers witnessed this on their monitors at three in the morning and could not close their mouths. It was a moment when a machine displayed creativity.
DQN, born from the theory of 'learning by reward' established through research with Peter Dayan and the 'memory replay' mechanism Hassabis brought from neuroscience, was not just a game AI. It was a historic event proving that the principles of biological intelligence could work on silicon chips. A hint found in the brain had become the decisive puzzle piece for AI algorithms.
This success gave Hassabis conviction. 'We are on the right path.' That conviction was the opening act of a grand journey that would later astonish humanity on a Go board with the birth of AlphaGo.
For Hassabis, DQN was the prototype of the first 'thinking machine' built by imitating the human brain.
A Structural Comparison of Biological and Artificial Neurons
Kim Kyung-jin
Attorney · Former Member of the National Assembly · AI Policy Researcher
© 2026 Kim Kyung-jin. All rights reserved.
