Tom Austin spent part of the afternoon of 2 March in Menlo Park, California, talking with Jeff Hawkins about his latest startup, Numenta. Housed in an unassuming space over a bookstore, Numenta is the fourth enterprise Jeff has founded, the others being Palm Computing, Handspring, and the non-profit Redwood Neuroscience Institute (a scientific research institute focused on understanding how the human neocortex works). Beyond his current role as co-founder and intellectual leader of Numenta, Jeff is also CTO at Palm Inc. At Numenta, Jeff is developing technology derived from the brain model described in his book "On Intelligence". Numenta's technology is a new type of memory-centered computing architecture modeled after the mammalian cortex that can solve problems in pattern recognition and machine learning. Our interview with Jeff Hawkins took place on 2 March 2006.
Tom Austin: I read your book, "On Intelligence," and was fascinated by it. Some of it actually has roots that go back to my grad school days in the early '70s, when I worked on a Ph.D. in biopsychology.
Jeff Hawkins: The whole field certainly has a long and storied history - I've lived through some of it, and a lot of people have touched on pieces of it.
Austin: Where I'd like to start is at the simple level, by establishing some contextual value for IT people.
Hawkins: What we're doing at Numenta is a fundamental technology. Here's the best analogy I can give: It's like it was about 50 years ago, when people started building the first computers. We are building the first type of a new computing paradigm. And you could have asked the question 50 years ago, "What's the impact of computers on IT departments?" And 50 years ago, it would have been pretty much zero. But then very quickly, IBM and a few others figured out, "Hey, you can do some business applications with these things." And of course, now it's everything. It's interspersed throughout all the business operations in a company. And that's not where all computing is, of course, but it's a very fundamental technology. We have a sense of what our technology is going to be used for, partly because we thought about it and partly because we've actually been contacted by a lot of researchers who have read my book, seen me speak or read a paper that one of the co-founders wrote. And they were getting excited about this. I can't give names here, but one very well-known and large business consulting firm, which basically does IT-type projects for companies, was very excited about my book. It actually has an internal program involved, because it can see the technology. Basically, it has figured out that there are business applications for this technology. I'm not going to pretend to be an expert in it, but I can give you some ideas.
It's not a technology where people are immediately going to say, "Oh, this is going to help me do remote e-mail or this and that." It's a technology that will help people analyze businesses and business issues. That's not what it's designed for, again - just like a computer's not designed for that - but it can be used for that. So, for example, just like an analyst will look at a business and try to figure out the underlying causes for success or failure, why revenues are going up or the margins are doing this, what the issues in manufacturing are, and so on, our technology has the ability to do that. It can look at varied data, and extract high-level causes behind it. In the beginning of the paper that Dileep George and I wrote (available at http://numenta.com/), I talk about discovering causes, and that's basically what we do as intelligent beings. We look at situations, and we try to figure out the underlying causes for problems and issues, and the way things work. So, this technology can be applied to understanding manufacturing issues, understanding marketing issues, or understanding network issues. We have a couple customers already - I call them customers, because we've actually exchanged money on this. One company that signed up for an early partner program is interested in analyzing networks. And there are all kinds of networks - there are computer networks, all kinds of information networks and all kinds of power networks. Basically, a network is a system where you have a complex set of nodes that are interacting and sending information between them. Our technology, which is called Hierarchical Temporal Memory (HTM), is really good at looking at this. It has all the right attributes. We haven't proven this yet, but we believe it's going to be the case. 
The idea is that you could apply HTM to look at information flowing through a network and figure out the underlying causes for whatever issues you're concerned about - whether it's bottlenecks, failures or so on.
Austin: Well, there's some research going on now - Intel is sponsoring it at Carnegie Mellon and other locations - to use advanced heuristics to detect, on the network, inappropriate behaviors that weren't based on a pattern file. They're just looking at it and saying, "Is there something different going on?" There's other research going on at the academic level, using machine learning techniques to identify inappropriate executables.
Hawkins: That's right. So far, people are interested in using our technology for, by and large, existing problems where they have no solutions or solutions that aren't good enough. People have been working on these problems, but they're still struggling with how to solve them. As a starting point, one way of looking at HTMs is that if a human can do this pretty easily but a computer has trouble doing it, it's a good candidate for us. We're not restricted to that, because HTMs can do things humans can't do. If someone comes to me with a problem, I'll say, "If a human were looking at this network information flow, would they be able to figure out what's going on?" And, often they say, "The humans are sitting there looking at it, but we don't know how to codify it." Maybe it doesn't follow some simple rules - it's not like an expert system. But a human could look at it and say, "Oh my gosh, look what's going on over here." As a starting point, that's a good candidate for our technology. Now, we can do better than humans. We can be faster, and we can understand causes that humans can't, but that's a good starting point. And so today, when people first approach us, as with any new technology, they say, "Can I use this new technology to help with an existing problem I don't have a good solution for?"
And that's the place they start. As you know, there are a lot of business problems that people have been struggling with for years. Take manufacturing issues, which are not purely "an IT thing." If you look at a semiconductor manufacturing line, humans spend all kinds of time trying to figure out why the yield has gone down. Billions of dollars are at stake, and these are complex networks with interrelated problems. And there's a similar type of problem as to information flow through a computer network, and so on. In some cases, humans can actually see this, but in other places, they can't figure this out. There's a class of business problems that we believe HTMs can be applied to and that current researchers are spending a lot of time trying to figure out and improve on, because a lot of money is on the line and a lot of important aspects of business hinge on solving some of these problems. Again, it's a very fundamental technology. I think it will take a while for most people to come to grips with it and understand exactly how it can be used, and that's one of the reasons I wrote the white paper.
Austin: Give me a good quick sense - just a nice nutshell - of what Hierarchical Temporal Memory is about?
Hawkins: I'll give it in two parts. First, it's important to at least recognize that this technology came from a very clear biological understanding. And it's a very close mapping to what's going on in the human neocortex. If you believe that, it gives it some credibility and you can put some context around it so you can say, "Anything the human neocortex can do, this technology can do." And the human neocortex - half of our human brains - is what makes us smart. And humans have this tremendous flexible ability to solve all kinds of problems and understand all kinds of environments. That gives you a little context regarding what this is about. Second, what does it do? It's a memory system - but it's not like any other memory system you know.
It's not like hard drives, flash memory or something like that. It's an algorithmic memory system, it's hierarchical and it inherently deals with time - thus the term "Hierarchical Temporal Memory." You don't program this memory, and you don't tell it what it needs to know; you expose it to patterns. These patterns can be coming from anything. They can be coming from a visual sensor, a network, manufacturing networks, markets or whatever. There are a lot of possibilities here. What HTMs do is learn to model the environment that they're looking at, and they build an internal representation of it. There are a couple really important things that come out of this. The first is that it discovers what we call "causes" - the things that are causing these patterns. And that's part of building the model. The discovery of causes is itself a valuable thing. If I'm trying to understand why the manufacturing yield is declining or why the network is going down, I'm looking for the highest-level causes - and some low-level causes. It may not be obvious at first, but we as humans extract these things, and so does HTM. The second part is that it can recognize things. Once it's learned something and discovered how to model its world, then when it sees a pattern that is novel, it says, "I understand this, and I have expectations about what it's going to do." Same with HTM. It gets a new pattern, and it says, "In my model of the world, I understand what's going on, and I can make predictions about what's going to happen next." HTM is a biological model that can be applied to many different types of sensory data or different types of inputs. It builds a model of that world - whatever it's being exposed to - and then it can recognize things in there, it can tell you what's going on, and it can make predictions about the future in that world.
Austin: I asked one of my associates who I passed your book to, "What kind of questions should I ask?"
And his question was, "If Google or Yahoo applied your technology to what it does, from a search engine point of view, would your technology potentially improve the quality of its search results?"
Hawkins: I don't know a lot about how they do search results, but I can give you some examples. Imagine you go to Google Images and you type in some words, things you want to look for. How do we know what those pictures are? The only way is because some human looked at those pictures and tagged them. There is no software on the planet today that you can show a picture of something and have it correctly answer the way that you and I would: "I know what that is. That's an orange, that's a backpack, that's a dog, that's a boat." We can do that. So, for every image you would ever want to search on by keyword, some human has sat there, looked at it and typed in what they think the keywords are. In fact, we're building a particular application to develop that platform - an Image Engine. Here's a picture, what is it? Seems easy, but I can show a human images every 300 milliseconds, about 3 times a second. I can flash images in front of your face, and you'll know what every one is. And we do it so easily and so fast. If I said, "Look for a gorilla" - I've seen this done - and I just show you a stream of images, three times a second, when the gorilla shows up, bam! You know immediately, and you push a button instantly.
Austin: But you don't have to be a human to do that. There are a whole bunch of mammals ...
Hawkins: Okay, so you have to be biological. Today, there is no computer that can do it in 10 minutes, or an hour, or any time span - it's an unsolved problem. So, there's an example of a thing that could be improved.
Austin: So, you're working right now on building an Image Engine?
Hawkins: Let me just take a step back. We think we're on to this very fundamental algorithm. It's like programmable computers, right? It's that kind of fundamental thing.
And we're building a platform. That's our product: a platform, which is the first operating system for this. You allow people to apply this technology to lots of different problems. However, we needed a problem to work on, to prove that it works, and to develop our tools. We have chosen to do visual inference, which is picture recognition. We can talk about exactly what that is, but we've chosen to do that as our exemplary problem, and we're in the process of building it. In fact, we did a little version of it to prove that it works, and now we're building a medium-scale version that's going to, hopefully, be done this summer, and then we'll eventually probably do a fully trained system. We don't know if we're going to make it commercial, make it free, or give it to the public - we don't know what we're going to do with it yet. But we're using that as our development tool and our test case, if you will.
Austin: You're doing that without people having to pre-populate a list of tags?
Hawkins: Well no, there's this issue where you want to supervise it. There are two ways you can have a system work. One is you can just show it patterns, and it figures out on its own that these things mean different things. Humans do this. We have a brain, but nobody comes in, plugs into our brain and says, "Here's the pattern for dog, and here's the pattern for cat." We figure this out on our own. However, you can supervise, and supervision is essentially like your mother or your father sitting there saying, "That's a cat, that's a dog, that's an orange." That can speed up the process a lot.
Austin: And don't put your fingers in the fan.
Hawkins: Exactly. So, HTMs work the same way. They will, on their own, discover the causes in the world, meaning what objects are in the world, and there's a hierarchy of these, but they will do that on their own. However, you can expedite the process if you know that at the top level, you can say, "This is a camera, this is a dog, this is a cat."
It's generally not necessary - though in some cases, it is, and I can give some examples - but you can improve the speed of this.
Austin: So, you could improve the speed of your Image Engine experiment by going to Flickr and putting all the social tags, and all the pictures?
Hawkins: I could, yes. The HTM algorithm does not require this - just like a human doesn't require this. If a human were raised without any language in the woods, by wolves, for example, that human would learn to speak, right? They would learn to recognize objects, and they would learn to know what things are different. It doesn't take supervision. However, you just won't do as good a job as if you'd spent some hours in a classroom or at home with mom and dad. And certain classifications, you just wouldn't figure out. That's why we have books, and that's why we have schools and so on. The example I used in my paper was the distinction between fruits and vegetables. There's a classification that is very difficult to determine on your own.
Austin: Very difficult for humans in general.
Hawkins: Yes, but once you've learned it, it's not so hard, right? However, if you were raised in the woods by animals, and no one ever told you about fruits and vegetables, you might not figure that out. And yet, that's a simple classification that most humans know, because someone told us.
Austin: When you announced the formation of Numenta, I read one article that said, "You're off to license software technologies based on a novel theory of how the mind works." I read that and I winced, reacting to the mind-body dualism and other cliches. How did you react to the press coverage in general that you got? Does the press understand what you're doing?
Hawkins: For the most part, very, very few people know what we're doing. It just hasn't been published very much. My book covers half of what we're doing in some sense, but it leaves out the mathematical, mechanistic business part.
My book was about the biology of brains, and so there's a lot that's missing there. So first of all, most people just don't know what we're doing. And that's just a matter of education, and we haven't actually been out there trying to convince them yet, or even educate them yet, because we're in the development stages of this company, and we're not actively seeking that opportunity. I thought the press generally did a pretty good job. Basically, what they could report was who I am, who the other founders are, what credentials we have, the fact that my book proposes this new theory about how the cortex works, how it's gotten good reviews, that people have been pretty excited about it, and that we're now claiming that we're taking it to the next stage. I think one of the interesting things that came out of the launch of Numenta is that I published this book a year and a half ago now, or something like that. And a few very senior scientists said, "This is a landmark book, it's historic," though I've got quite a nice pile of letters people wrote me. It's selling and it did well, but for many, many people, it was a nonevent. They just didn't even know about it. Now, when we launched Numenta, it was just an announcement. We said we're starting this company. We didn't go out on a road show, and we didn't do any kind of PR of significance, but we got an awful lot of press on that.
Austin: Because you have a background and credentials as a high-tech entrepreneur, who has been phenomenally successful with that.
Hawkins: But I still find it funny, because the book itself, in my mind, is a very significant thing. And a few people picked up on that - several dozen senior scientists. And to me, that was far more significant than just starting a company. That's easy to do, right? You just rent a space and you get some people together.
But it just shows you the different level of interest between academic subjects, like academic theories about biology, and someone saying, "I'm going to build a business." And so, the press was much more interested in people talking about business, because that's our culture here in Silicon Valley. And yes, I have a background here, but even if I were a very well-known scientist and I published this book, I wouldn't have gotten as much press as being a well-known entrepreneur starting a business. So, I was pleased with it. I was just surprised by how many people picked it up. We weren't looking for anything, and yet there were major stories in The Wall Street Journal, The New York Times, LA Times ...
Austin: Business Week, Forbes.
Hawkins: Yes - just tons of press for the company. So anyway, that's great and I think what we have to do at Numenta is deliver on what we say we're going to do. We have to produce this toolset, which we're in the process of building. We have to show that this technology really works on major problems. We're working with some customers closely now, who are going to try to apply it to different problems. And we just have to execute on that. And if we do all those things, there will be no shortage of interest in this business. If I'm right about this, this is one of the biggest things to come along in decades.
Austin: You're working right now on a set of software tools that, to me, are more an attempt to validate a model that, at some point real soon, if it works, will be cast into silicon.
Hawkins: Yes. We should also talk about how this technology evolved. Again, it's useful to look at how the computer industry evolved. Well, how did they build it? They started off with relays and then went to vacuum tubes. Of course, remember, they didn't have disk drives, silicon, memory chips, operating systems or compilers. They had switches. And so, they worked on this for years and years, with people figuring out how to build this.
Over time, custom hardware and custom devices were created, and now there are worldwide industries built around building things for building computers. We're going to go through a similar evolution with HTM, except it's going to happen much faster. The reason it's going to happen much faster is that we're starting from a much more advanced technology base. Already, we can see how some of this will play out over time. And we've had some companies approach us who are interested in "building silicon" for HTMs. A couple companies have come to us to say, "We're silicon companies and we want to build things," and I can already speculate on several ways you could do that. However, it's a little bit early, because we really don't know enough about HTMs to know where the bottlenecks will be, what issues we're going to run into, what things we're going to want to accelerate or make small, and so on. So what we've decided to do for our first toolset is to build a software platform that runs on parallel computing clusters. I'd like to say standard, but there are no standard parallel computing clusters. But we're using pretty much off-the-shelf hardware. And then we'll sit back and say, "Okay, where do we go? Do we want to make them smaller? Do we need to make them faster? Do we need to make them cheaper? Do we have to make them lower power?" And if we can do all this, then when it happens, there will be so many different opportunities that no one company can really manage all of it. We're trying to play the role of a central technology-licensing platform vendor, providing as much opportunity for other people who are creative and know these other fields to do it. It's just like the computer industry - no single company can do this. And if we're onto something that big, Numenta can't run this alone. And we're not even going to try. We're trying to be a catalyst to get the whole thing going.
Austin: If you succeed with the model you have for HTM, you're essentially creating an entity, whether it's software or hardware, that can be fed information, which either represents sensor information, market information, transaction information, voice, images or even intelligence data from satellites.
Hawkins: Yes, yes, yes.
Austin: And it eventually can figure out, using the model that you've developed, what are normal patterns and what are abnormal patterns.
Hawkins: There are some characteristics about the data you have to feed it, which are pretty important. You can't just throw it anything. You have to design the system so the data is presented in a certain way. Just like for humans, the image from your eyes going to your brain is not scrambled. It has a topography to it. Things are laid out in a certain way, and that's pretty important, and we can show why.
Austin: Actually, it's preprocessed in the case of the eye.
Hawkins: But the point is, when the optic nerve goes to the cortex, it maintains its topography. Things that are close together tend to stay close together in the cortical representation. And that's an important part of the theory. Anyway, there are certain characteristics of the parts of problems HTMs can work on. I don't want to just say you can throw it anything.
Austin: The temporal aspect is pretty easy, conceptually, for me to understand. The spatial aspect seems to take a little more thinking.
Hawkins: Some people have it the other way around. Some people have trouble seeing the temporal aspect, but we can come back to that. But anyway, say it is the right type of data, and you present it in the correct way. This is generally very low-level data. You're sending little things; you're not trying to feed in high-level concepts. What HTMs do is learn to represent the structure of the world they're looking at - and literally build a little model of the thing out there. And so this is the model we're in as humans - how we perceive the world.
The only way I can understand things like rooms, chairs and tables is I have a model for them. And HTMs build models like that. And so then you can say, after you've given it enough patterns over time, "How is this world structured?" What are the low-level causes and the high-level causes? Then, it can recognize new patterns and say, "This is normal or abnormal." It can recognize new patterns and say, "Hey, I know what this thing is, even though I've never seen it before" - because it turns out that almost everything is novel in the world, if you look at the patterns. So, even though you've never been in this room before, you know that this is a white board and these are chairs, even though you've never seen these patterns before. These exact patterns have never happened before. Yet, you know immediately what the stuff is.
Austin: But all the primitives have happened before.
Hawkins: Yes, you've seen chairs in the past, but my point is that the actual data - if you looked at the data entering your head - is novel. It seems so simple: "Oh, that's a chair." But it actually turns out that it's a completely novel pattern, so how do you know this new pattern coming in is a chair? Because obviously, it's a conceptual one, at the conscious level, but it turns out that it's a very, very difficult problem. It's like, how do you know that pattern of ones and zeros and bits is a chip? HTMs take these patterns in, form a model of the world and - if you design them correctly - build a hierarchy of causes. And the highest-level causes are the most interesting to us. This is what the most intelligence requires. And I use an example in my book - and in the white paper you read - about weather. I think weather's a good example of this. You can look at the causes of weather, and there are low-level causes. You can say, "Well, if it's cloudy, it's likely to rain. If it's cold, it's likely to snow." But then you start saying, "There's a thing called a weather front."
And you might not have noticed that, if you're just living in the woods someplace. Then there's the idea of a storm, like a hurricane. And you can learn the higher-level cause. Once you know that higher-level cause, you can make better predictions. There's also something called El Niño, which we only discovered 20 years ago, which is even a higher-level cause. This is a cause that you, living in Nashua, New Hampshire, would never figure out on your own, or it would be very difficult to. But if there's a higher-level cause, and if you have the right data, and you present it enough over time, you might have figured it out - and someone did. Here's the goal. For example, you can say, "Why did the network get slow?" And you say, "Well, maybe it's because this router wasn't fast enough," or something like that. But the reality is, perhaps you didn't have the right network topology, which is a higher-level cause you might not have discovered right away.
Austin: These are insights at a higher level.
Hawkins: Yes, and you can call it an insight. An insight is essentially discovering a higher-level pattern or a higher-level cause - one that persists over long periods of time and over large spatial areas. And that's what HTMs do. They build this hierarchy of insights, or hierarchy of knowledge, and if you train them properly, they discover higher- and higher-level causes. This is what humans are capable of - this is what makes a human expert. You have to be exposed to enough patterns, and you start building a better model of the world. Then you can go and help someone. A person who hasn't seen all those different patterns is not going to have that insight. An expert comes in and immediately says, "I understand what's going on here. You don't realize the bigger causes or the underlying causes of this, but let me tell you" - and then helps you out. It's just like there are soft surfaces, hard surfaces and so on. And then there are things that are combined of higher structures.
Then you end up with tables and chairs. And then you end up with a concept of a room, right? Or you see a bunch of tables and chairs, and then you say, "Oh, this might be a conference room." But I'm not sure if that helps people. I'm still learning how to educate people about this. It's a complex thing. Imagine trying to tell someone what a computer was 50 years ago. It's a pretty difficult task. Even if you know it, it's pretty hard to relate it to somebody. "Well, it's got these bits, and there are ones and zeros, and we have this thing called a program, and this thing called memory." It's like, "What are you talking about?" I'd like to go back to something I said earlier. Start by identifying a problem that humans find trivial but computers can't solve easily. That's a good way of starting to think about the kind of problems we can solve. Visual inference is one. You can train to be an expert businessperson. Given enough time and exposure, you'll learn how to do it. And so you get to start at that level. And looking at video - for security systems, for example - is not that hard to do, but it's impossible to write software to do it. And then you can dive in to saying, "Okay. I get a sense that if a human can do it - language, thinking, vision, analyzing problems, looking at anything - any kind of an expert - it's the kind of thing we can work on." Then you can say, "Well, how does it work?" And that's where it takes a bit of time. And the answer is that it's a memory system, and you expose it to these patterns, and it builds a model of the world.
Austin: There were some brilliant insights, I think, that you picked up on. And the beauty of what you were pointing out there was the way the human central nervous system just adapts.
Hawkins: I'll tell you what I think is beautiful about this.
It's a really fundamental insight about psychology and mental states, and I can't claim to be the first one to have thought of this or anything else - but I have a very clear understanding right now, and most people don't. You have a perception of the world, as you look around and see things. It's very clear. But what you don't realize is that it's built up from these little bits. The actual data coming into your head isn't what you perceive. Your perception is really the model, your internal model. Some people are blind and some are deaf, but they both learn, pretty much, the exact same model of the world, right? They both learn language. Helen Keller was both blind and deaf - she had a pretty similar, almost identical model of language that you and I do. She could perceive words, just like you and I would, but she did it through touch, not through hearing or sight. You also can see and hear, but it's the same thing. You can build this model through many different types of senses; it doesn't have to be a specific set of senses. Blind people build the same world model through hearing and touch that sighted people who may be deaf do. They can't hear it, but they can see it. And you build the same model. And so, it's not irrelevant, but the sensory data is not the important part. It can be all kinds of weird stuff, but as long as it's sampling the world, you build this internal model, which is what the world really is. Vision isn't really about vision. It's about building a model of the world through a set of senses, and you can do it through different senses. You can take a blind person - specifically, it has been done with a blind person who had seen in the past and knows what vision is, but is completely blind - and put a little camera that basically puts pixel representations on the tongue. At first, it feels like a tingling on the tongue. 
But the patterns on the tongue correlate to the objects in the world, the same cortical algorithm, and this HTM algorithm essentially says, "Oh, it's just like seeing," and you start to see through your tongue. It's just incredible. I'd love to build some sensors like this. And it's not full vision, of course - it's a very primitive type of vision - but the point is that you start forming perceptions of things and objects in the world, just like you could see them, touch them or hear them.