Emmanuel Macron and his government are launching a major artificial intelligence initiative today. They want to turn France into one of the leading countries in AI.
“[Artificial intelligence] is a technological, economic, social and obviously ethical revolution,” Macron said in a speech. “This revolution won’t happen in 50 or 60 years, it’s happening right now. There are new opportunities and we can choose to follow some innovations or not.”
Some of the best mathematics and engineering schools are in France, and some of the best data scientists and AI researchers come from France. Many of them now work in California or London for companies like Facebook and DeepMind. The French government wants to capitalize on that soft power to make an AI push.
And yet, how do you attract engineers and scientists? France’s answer is quite complicated because the government doesn’t want to inject a ton of public money and call it a day. It’s all about creating an AI ecosystem with strong pillars.
France’s AI strategy
First, many private companies have opened or plan to open AI research centers in France. Facebook and Google already work with hundreds of researchers in Paris. Today, Samsung, Fujitsu, DeepMind, IBM and Microsoft all announced plans to open offices in France to focus on AI research.
This represents tens of millions of dollars in investments and hundreds of employees. “Everybody is saying that Silicon Valley is overflowing right now,” a source close to the French President told me. That’s why big tech companies need to find talent outside of the U.S.
Great progress on artificial intelligence is happening in France. Samsung chooses France to locate its new research center on artificial intelligence, creating more than 100 jobs. Thank you Young Sohn for taking the leap. #ChooseFrance #AIforHumanity
Foreign companies creating hundreds of jobs isn’t going to foster public research or build European tech giants, though — these companies are just tapping the smartest brains they can find. That’s why the French government wants to make it easier to work on fundamental research papers while employed by a private company.
The INRIA is going to create a national AI research program with four or five partners. The goal is quite simple — Macron said that there should be twice as many people studying and researching AI in France. Researchers will also be able to access cloud computing capacity for their work.
It’s also going to get easier to create a startup based on your research work, or to work for a private company during your PhD.
Second, France is going to set new rules when it comes to data. French administrations are going to share new data sets so that anyone can build AI services on top of them.
When it comes to health data, it looks like France wants to avoid another NHS/DeepMind scandal. While multiple French governments have worked on some kind of health data hub, Macron announced that this time it’s going to happen for real. The INDS is going to make sure that services and public institutions respect your privacy and grant authorizations on a case-by-case basis.
Third, when it comes to regulation, companies will be able to experiment in multiple industries. And it starts with autonomous vehicles. Companies will be able to test level 4 self-driving cars in 2019.
Overall, France is going to invest $1.85 billion (€1.5 billion) in AI projects, from public research to startup investments. Last year, Macron announced that there would be a new $11.2 billion (€10 billion) public fund managed by Bpifrance. Macron said today that AI startups should be the first priority of this new fund.
Making AI as neutral as possible
Arguably, the most interesting part of Macron’s speech was about the moral impact of artificial intelligence. As algorithms become more prominent, there’s a risk that they become black boxes that decide for you.
Under Axelle Lemaire’s law, the French administration already has to share the algorithms and data it uses. But that’s still not the case for some sensitive subjects. Macron said it’s still a work in progress.
Research projects or companies financed with public money will also have to share everything — this could impact public infrastructure companies for instance.
But it’s more complicated when you’re talking about private companies. Macron said GDPR was a step in the right direction. And now, he wants to go further.
He doesn’t have any practical suggestion for now, but he said that there should be an international certification authority. For instance, this authority could make sure that there’s no bias in training data sets. I don’t think Facebook or Google would enjoy this new regulation.
Finally, you introduce a bias if your staff is not diverse enough. That’s why schools and universities should make sure that they train a diverse group of people.
Not the first AI push
As Next INpact pointed out, there have been multiple reports on artificial intelligence over the past few years — FranceIA, the CNIL, the OPECST and the European Economic and Social Committee all wrote their own recommendations when it comes to AI policies.
Today, Fields Medal winner and member of parliament Cédric Villani shared a new report on artificial intelligence. It’s an interesting read, and it was the inspiration for Macron’s speech today.
According to a source close to the French President, multiple ministers now have to focus on artificial intelligence for their own industries.
Today’s report feels more like a starting point than a conclusion. The French President thinks that AI is something important but not urgent. Now, it’s all about convincing the rest of the government to put aside all the urgent tasks for a minute and look at what’s important.
Let’s say we had a way to distribute beacons around our solar system (or beyond) that could survive for billions of years, recording what our civilization has achieved. What should they be like?
It’s easy to come up with what I consider to be sophomoric answers. But in reality I think this is a deep—and in some ways unsolvable—philosophical problem that’s connected to fundamental issues about knowledge, communication and meaning.
Still, a friend of mine recently started a serious effort to build little quartz disks and have them hitch rides on spacecraft, to be deposited around the solar system. At first I argued that it was all a bit futile, but eventually I agreed to be an advisor to the project, and at least try to figure out what to do to the extent we can.
But, OK, so what’s the problem? Basically it’s about communicating meaning or knowledge outside of our current cultural and intellectual context. We just have to think about archaeology to know this is hard. What exactly was some arrangement of stones from a few thousand years ago for? Sometimes we can pretty much tell, because it’s close to something in our current culture. But a lot of the time it’s really hard to tell.
OK, but what are the potential use cases for our beacons? One might be to back up human knowledge so things could be restarted even if something goes awfully wrong with our current terrestrial civilization. And of course historically it was very fortunate that we had all those texts from antiquity when things in Europe restarted during the Renaissance. But part of what made this possible was that there had been a continuous tradition of languages like Latin and Greek—not to mention that it was humans that were both the creators and consumers of the material.
But what if the consumers of the beacons we plan to spread around the solar system are aliens, with no historical connection to us? Well, then it’s a much harder problem.
In the past, when people have thought about this, there’s been a tendency to say, “just show them math: it’s universal, and it’ll impress them!” But actually, I think neither claim about math is really true.
To understand this, we have to dive a little into some basic science that I happen to have spent many years working on. The reason people think math is a candidate for universal communication is that its constructs seem precise, and that at least here on Earth there’s only one (extant) version of it, so it seems definable without cultural references. But if one actually starts trying to work out how to communicate about current math without any assumptions (as, for example, I did as part of consulting on the Arrival movie), one quickly discovers that one really has to go “below math” to get to computational processes with simpler rules.
And (as seems to happen with great regularity, at least to me) one obvious place one lands is with cellular automata. It’s easy to show an elaborate pattern that’s created according to simple well-defined rules:
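The original text shows such a pattern as an image. As a rough stand-in, here is a minimal Python sketch of an elementary cellular automaton (rule 30, chosen here purely for illustration), which grows an elaborate pattern from a single cell under a simple, fully specified rule:

```python
# Minimal elementary cellular automaton: each cell's next state depends
# only on itself and its two neighbors, yet the pattern emerging from a
# single live cell is highly elaborate.

def step(cells, rule=30):
    """Apply one step of an elementary CA rule to a row of 0/1 cells."""
    n = len(cells)
    return [
        # Neighborhood (left, center, right) encodes a bit index 0..7;
        # the rule number's bits give the next state for each index.
        (rule >> (cells[(i - 1) % n] * 4 + cells[i] * 2 + cells[(i + 1) % n])) & 1
        for i in range(n)
    ]

def run(width=31, steps=15, rule=30):
    row = [0] * width
    row[width // 2] = 1  # single "on" cell in the middle
    history = [row]
    for _ in range(steps):
        row = step(row, rule)
        history.append(row)
    return history

if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

Running this prints the familiar nested-yet-irregular triangle of rule 30; the entire "message" is the rule number, the initial condition, and the choice to display it this way.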
But here’s the problem: there are plenty of physical systems that basically operate according to rules like these, and produce similarly elaborate patterns. So if this is supposed to show the impressive achievement of our civilization, it fails.
OK, but surely there must be something we can show that makes it clear that we’ve got some special spark of intelligence. I certainly always assumed there was. But one of the things that’s come out of the basic science I’ve done was what I called the Principle of Computational Equivalence, that basically says that once one’s gotten beyond a very basic level, every system will show behavior that’s equivalent in the sophistication of the computation it exhibits.
So although we’re very proud of our brains, and our computers, and our mathematics, they’re ultimately not going to be able to produce anything that’s beyond what simple programs like cellular automata—or, for that matter, “naturally occurring” physical systems—can produce. So when we make an offhand comment like “the weather has a mind of its own,” it’s not so silly: the fluid dynamic processes that lead to the weather are computationally equivalent to the processes that, for example, go on in our brains.
It’s a natural human tendency at this point to protest that surely there must be something special about us, and everything we’ve achieved with our civilization. People may say, for example, that there’s no meaning and no purpose to what the weather does. Of course, we can certainly attribute such things to it (“it’s trying to equalize temperatures between here and there,” etc.), and without some larger cultural story there’s no meaningful way to say if they’re “really there” or not.
OK, so if showing a sophisticated computation isn’t going to communicate what’s special about us and our civilization, what is? The answer, in the end, is details. Sophisticated computation is ubiquitous in our universe. But what’s inevitably special about us is the details of our history and what we care about.
We’re learning the same thing as we watch the progress of artificial intelligence. Increasingly, we can automate the things we humans can do—even ones that involve reasoning, or judgement, or creativity. But what we (essentially by definition) can’t automate is defining what we want to do, and what our goals are. For these are intimately connected to the details of our biological existence, and the history of our civilization—which is exactly what’s special about us.
But, OK, how can we communicate these things? Well, it’s hard. Because—needless to say—they’re tied into aspects of us that are special, and that won’t necessarily be shared with whatever we’re trying to communicate with.
At the end of the day, though, we’ve got a project that’s going to launch beacons on spacecraft. So what’s the best thing to put on them? I’ve spent a significant part of my life building what’s now the Wolfram Language, whose core purpose is to provide a precise language for communicating knowledge that our civilization has accumulated in a way that both us humans, and computers, can understand. So perhaps this—and my experience with it—can help. But first, we should talk about history to get an idea of what has and hasn’t worked in the past.
Lessons from the Past
A few years ago I was visiting a museum and looking at little wooden models of life in ancient Egypt that had been buried with some king several millennia ago. “How sad,” I thought. “They imagined this would help them in the afterlife. But it didn’t work; instead it just ended up in a museum.” But then it struck me: “No, it did work! This is their ‘afterlife’!” They successfully transmitted some essence of their life to a world far beyond their own.
Of course, when we look at these models, it helps that a lot of what’s in them is familiar from modern times. Cows. A boat with oars. Scrolls. But some isn’t that familiar. What are those weird things at the ends of the boat, for example? What are they for? And here begins the challenge—of trying to understand without shared context.
Last summer I happened to visit an archaeological site in Peru called Caral, which has all sorts of stone structures built more than 4000 years ago. It was pretty obvious what some of the structures were for. But others I couldn’t figure out. So I kept on asking our guide. And almost always the answer was the same: “it was for ceremonial purposes.”
Immediately I started thinking about modern structures. Yes, there are monuments and public artworks. But there are also skyscrapers, stadiums, cathedrals, canals, freeway interchanges and much more. And people have certain almost-ritual practices in interacting with these structures. But in the context of modern society, we would hardly call them “ceremonial”: we think of each type of structure as having a definite purpose which we can describe. But that description inevitably involves a considerable depth of cultural context.
When I was growing up in England, I went wandering around in woods near where I lived—and came across all sorts of pits and berms and other earthworks. I asked people what they were. Some said they were ancient fortifications; some said at least the pits were from bombs dropped in World War II. And who knows: maybe instead they were created by some process of erosion having nothing to do with people.
Almost exactly 50 years ago, as a young child vacationing in Sicily, I picked up this object on a beach:
Being very curious what it was, I took it to my local archaeology museum. “You’ve come to the wrong place, young man,” they said, “it’s obviously a natural object.” So off I went to a natural history museum, only to be greeted with “Sorry, it’s not for us; it’s an artifact.” And from then until now the mystery has remained (though with modern materials analysis techniques it could perhaps be resolved—and I obviously should do it!).
There are so many cases where it’s hard to tell if something is an artifact or not. Consider all the structures we’ve built on Earth. Back when I was writing A New Kind of Science, I asked some astronauts what the most obvious manmade structure they noticed from space was. It wasn’t anything like the Great Wall of China (which is actually hard to see). Instead, they said it was a line across the Great Salt Lake in Utah (actually a 30-mile-long railroad causeway built in 1959, with algae that happen to have varying colors on the two sides):
Then there was the 12-mile-diameter circle in New Zealand, the 30-mile one in Mauritania, and the 40-mile one in Quebec (with a certain Arrival heptapod calligraphy look):
Which were artifacts? This was before the web, so we had to contact people to find out. A New Zealand government researcher told us not to make the mistake of thinking their circle followed the shape of the cone volcano at its center. “The truth is, alas, much more prosaic,” he said: it’s the border of a national park, with trees cut outside only, i.e. an artifact. The other circles, however, had nothing to do with humans.
(It’s fun to look for evidence of humans visible from space. Like the grids of lights at night in Kansas, or lines of lights across Kazakhstan. And in recent years, there’s the 7-mile-long palm tree rendering in Dubai. And, on the flip side, people have also tried to look for what might be “archeological structures” in high-resolution satellite images of the moon.)
But, OK, let’s come back to the question of what things mean. In a cave painting from 7000 years ago, we can recognize shapes of animals, and hand stencils that we can see were made with hands. But what do the configurations of these things mean? Realistically at this point we have no serious idea.
Maybe it’s easier if we look at things that are more “mathematical”-like. In the 1990s I did a worldwide hunt for early examples of complex but structured patterns. I found all sorts of interesting things (such as mosaics supposedly made by Gilgamesh, from 3000 BC—and the earliest fractals, from 1210 AD). Most of the time I could tell what rules were used to make the patterns—though I could not tell what “meaning” the patterns were supposed to convey, or whether, instead, they were “merely ornamental.”
The last pattern above, though, had me very puzzled for a while. Is it a cellular automaton being constructed back in the 1300s? Or something from number theory? Well, no, in the end it turns out it’s a rendering of a list of 62 attributes of Allah from the Koran, in a special square form of Arabic calligraphy constructed like this:
About a decade ago, I learned about a pattern from 11,000 years ago, on a wall in Aleppo, Syria (one hopes it’s still intact there). What is this? Math? Music? Map? Decoration? Digitally encoded data? We pretty much have no idea.
I could go on giving examples. Lots of times people have said “if one sees such-and-such, then it must have been made for a purpose.” The philosopher Immanuel Kant offered the opinion that if one saw a regular hexagon drawn in the sand, one could only imagine a “rational cause” for it. I used to think of this whenever I saw hexagonal patterns formed in rocks. And a few years ago I heard about hexagons in sand, produced purely by the action of wind. But the biggest hexagon I know is the storm pattern around the north pole of Saturn—that presumably wasn’t in any usual sense “put there for a purpose:”
In 1899 Nikola Tesla picked up all sorts of elaborate and strange-sounding radio emissions, often a little reminiscent of Morse code. He knew they weren’t of human origin, so his immediate conclusion was that they must be radio messages from the inhabitants of Mars. Needless to say, they’re not. Instead, they’re just the result of physical processes in the Earth’s ionosphere and magnetosphere.
But here’s the ironic thing: they often sound bizarrely similar to whale songs! And, yes, whale songs have all sorts of elaborate rhyme-like and other features that remind us of languages. But we still don’t really know if they’re actually for “communication”, or just for “decoration” or “play.”
One might imagine that with modern machine learning and with enough data one should be able to train a translator for “talking to animals.” And no doubt that’d be easy enough for “are you happy?” or “are you hungry?”. But what about more sophisticated things? Say the kind of things we want to communicate to aliens?
I think it’d be very challenging. Because even if animals live in the same environment as us, it’s very unclear how they think about things. And it doesn’t help that even their experience of the world may be quite different—emphasizing for example smell rather than sight, and so on.
Animals can of course make “artifacts” too. Like this arrangement of sand produced over the course of a week or so by a little puffer fish:
But what is this? What does it mean? Should we think of this “piscifact” as some great achievement of puffer fish civilization, that should be celebrated throughout the solar system?
Surely not, one might say. Because even though it looks complex—and even “artistic” (a bit like bird songs have features of music)—we can imagine that one day we’d be able to decode the neural pathways in the brain of the puffer fish that lead it to make this. But so what? We’ll also one day be able to know the neural pathways in humans that lead them to build cathedrals—or try to plant beacons around the solar system.
Aliens and the Philosophy of Purpose
There’s a thought experiment I’ve long found useful. Imagine a very advanced civilization that’s able to move things like stars and planets around at will. What arrangement would they put them in?
Maybe they’d want to make a “beacon of purpose.” And maybe—like Kant—one could think that would be achievable by setting up some “recognizable” geometric pattern. Like how about an equilateral triangle? But no, that won’t do. Because for example the Trojan asteroids actually form an equilateral triangle with Jupiter and the Sun already, just as a result of physics.
And pretty soon one realizes that there’s actually nothing the aliens could do to “prove their purpose.” The configuration of stars in the sky may look kind of random to us (except, of course, that we still see constellations in it). But there’s nothing to say that looked at in the right way it doesn’t actually represent some grand purpose.
And here’s the confusing part: there’s a sense in which it does! Because, after all, just as a matter of physics, the configuration that occurs can be characterized as achieving the purpose of extremizing some quantity defined by the equations for matter and gravity and so on. Of course, one might say “that doesn’t count; it’s just physics.” But our whole universe (including ourselves) operates according to physics. And so now we’re back to discussing whether the extremization is “meaningful” or not.
We humans have definite ways to judge what’s meaningful or not to us. And what it comes down to is whether we can “tell a story” that explains, in culturally meaningful terms, why we’re doing something. Of course, the notion of purpose has evolved over the course of human history. Imagine trying to explain walking on a treadmill, or buying goods in a virtual world, or, for that matter, sending beacons out into the solar system—to the people thousands of years ago who created the structures from Peru that I showed above.
We’re not familiar (except in mythology) with telling “culturally meaningful stories” about the world of stars and planets. And in the past we might have imagined that somehow whatever stories we could tell would inevitably be far less rich than the ones we can tell about our civilization. But this is where basic science I’ve done comes in. The Principle of Computational Equivalence says that this isn’t true—and that in the end what goes on with stars and planets is just as rich as what goes on in our brains or our civilization.
In an effort to “show something interesting” to the universe, we might have thought that the best thing to do would be to present sophisticated abstract computational things. But that won’t be useful. Because those abstract computational things are ubiquitous throughout the universe.
And instead, the “most interesting” thing we have is actually the specific and arbitrary details of our particular history. Of course, one might imagine that there could be some sophisticated thing out there in the universe that could look at how our history starts, and immediately be able to deduce everything about how it will play out. But a consequence of the Principle of Computational Equivalence is what I call computational irreducibility, which implies that there can be no general shortcut to history; to find how it plays out, one effectively just has to live through it—which certainly helps one feel better about the meaningfulness of life.
The Role of Language
OK, so let’s say we want to explain our history. How can we do it? We can’t show every detail of everything that’s happened. Instead, we need to give a higher-level symbolic description, where we capture what’s important while idealizing everything else away. Of course, “what’s important” depends on who’s looking at it.
We might say “let’s show a picture.” But then we have to start talking about how to make the picture out of pixels at a certain resolution, how to represent colors, say with RGB—not to mention discussing how things might be imaged in 2D, compressed, etc. Across human history, we’ve had a decent record of pictures remaining at least somewhat comprehensible. But that’s probably in no small part because our biologically determined visual systems have stayed the same.
(It’s worth mentioning, though, that pictures can have features that are noticed only when they become “culturally absorbed.” For example, the nested patterns from the 1200s that I showed above were reproduced but ignored in art history books for hundreds of years—until fractals became “a thing” and people had a way to talk about them.)
When it comes to communicating knowledge on a large scale, the only scheme we know (and maybe the only one that’s possible) is to use language—in which essentially there’s a set of symbolic constructs that can be arranged in an almost infinite number of ways to communicate different meanings.
It was presumably the introduction of language that allowed our species to begin accumulating knowledge from one generation to the next, and eventually to develop civilization as we know it. So it makes sense that language should be at the center of how we might communicate the story of what we’ve achieved.
And indeed if we look at human history, the cultures we know the most about are precisely those with records in written language that we’ve been able to read. If the structures in Caral had inscriptions, then (assuming we could read them) we’d have a much better chance of knowing what the structures were for.
There’ve been languages like Latin, Greek, Hebrew, Sanskrit and Chinese that have been continuously used (or at least known) for thousands of years—and that we’re readily able to translate. But in cases like Egyptian hieroglyphs, Babylonian cuneiform, Linear B, or Mayan, the thread of usage was broken, and it took heroic efforts to decipher them (and often the luck of finding something like the Rosetta Stone). And in fact today there are still plenty of languages—like Linear A, Etruscan, Rongorongo, Zapotec and the Indus script—that have simply never been deciphered.
Then there are cases where it’s not even clear whether something represents a language. An example is the quipus of Peru—that presumably recorded “data” of some kind, but that might or might not have recorded something we’d usually call a language:
Math to the Rescue?
OK, but with all our abstract knowledge about mathematics, and computation, and so on, surely we can now invent a “universal language” that can be universally understood. Well, we can certainly create a formal system—like a cellular automaton—that just consistently operates according to its own formal rules. But does this communicate anything?
In its actual operation, the system just does what it does. But where there’s a choice is what the actual system is, what rules it uses, and what its initial conditions were. So if we were using cellular automata, we could for example decide that these particular ones are the ones we want to show:
What are we communicating here? Each rule has all sorts of detailed properties and behavior. But as a human you might say: “Aha, I see that all these rules double the length of their input; that’s the point.” But to be able to make that summary again requires a certain cultural context. Yes, with our human intellectual history, we have an easy way to talk about “doubling the length of their input.” But with a different intellectual history, that might not be a feature we have a way to talk about, just as human art historians for centuries didn’t have a way to talk about nested patterns.
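To make the point concrete, here is a small Python sketch. The specific rules (30, 90, 254) and the feature measured are my own illustrative choices, not the rules from the original text. It evolves a few elementary cellular automaton rules from a single live cell and summarizes each by one chosen property: the width of the region of live cells. The summary "the width grows by two cells per step" is only available to an observer who has already decided that width is the feature worth measuring:

```python
# Summarizing a rule's behavior presupposes choosing a feature to measure.
# The rules below (30, 90, 254) are arbitrary illustrative picks.

def evolve(rule, width=41, steps=10):
    """Evolve an elementary CA from a single live center cell."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = [
            (rule >> (row[i - 1] * 4 + row[i] * 2 + row[(i + 1) % width])) & 1
            for i in range(width)
        ]
        rows.append(row)
    return rows

def live_width(row):
    """One possible 'summary feature': extent of the live region."""
    on = [i for i, c in enumerate(row) if c]
    return (on[-1] - on[0] + 1) if on else 0

for rule in (30, 90, 254):
    # Each rule prints widths 1, 3, 5, ... even though their detailed
    # behavior differs wildly; "width" is just one summary we chose.
    print(rule, [live_width(r) for r in evolve(rule)])
```

A different observer might instead measure the density of live cells, or the period of some column, and would "summarize" the very same rules completely differently.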
Let’s say we choose to concentrate on traditional math. We have the same situation there. Maybe we could present theorems in some abstract system. But for each theorem it’s just “OK, fine, with those rules, that follows”—much as, given certain shapes of molecules, there’s a way they can arrange in a crystal. The only way one’s really “communicating something” is in the decision of which theorems to show, or which axiom systems to use. But again, to interpret those choices inevitably requires cultural context.
One place where the formal meets the actual is in the construction of theoretical models for things. We’ve got some actual physical process, and then we’ve got a formal, symbolic model for it—using mathematical equations, programs like cellular automata, or whatever. We might think that that connection would immediately define an interpretation for our formal system. But once again it does not, because our model is just a model, that captures some features of the system, and idealizes others away. And seeing how that works again requires cultural context.
There is one slight exception to this: what if there is a fundamental theory of all of physics, that can perhaps be stated as a simple program? That program is then not just an idealized model, but a full representation of physics. And the point is that that “ground truth” about our universe describes the physics that govern absolutely any entity that exists in our universe.
If there is indeed a simple model for the universe, it’s essentially inevitable that the things it directly describes are not ones familiar from our everyday sensory experience; for example they’re presumably “below” constructs like space and time as we know them. But still, we might imagine that we could show off our achievements by presenting a version of the ultimate theory for our universe (if we’d found it!). But even with this, there’s a problem. Because, well, it’s not difficult to show a correct model for the universe: you just have to look at the actual universe! So the main information in an abstract representation is in what the primitives of the abstract representation end up being (do you set up your universe in terms of networks, or algebraic structures, or what?).
Let’s back off from this level of philosophy for a moment. Let’s say we’re delivering a physical object—like a spacecraft, or a car—to our aliens. You might think the problem would be simpler. But the problem again is that it requires cultural context to decide what’s important, and what’s not. Is the placement of those rivets a message? Or an engineering optimization? Or an engineering tradition? Or just arbitrary?
Pretty much everything on, say, a spacecraft was presumably put there as part of building the spacecraft. Some was decided upon “on purpose” by its human designers. Some was probably a consequence of the physics of its manufacturing. But in the end the spacecraft just is what it is. You could imagine reconstructing the neural processes of its human designers, as you could imagine reconstructing the heat flows in the annealing of some part of it. But what is just the mechanism by which the spacecraft was built, and what is its “purpose”—or what is it trying to “communicate”?
The Molecular Version
It’s one thing to talk about sending messages based on the achievements of our civilization. But what about just sending our DNA? Yes, it doesn’t capture (at least in any direct way) all our intellectual achievements. But it does capture a couple of billion years of biological evolution, and represent a kind of memorial of the 10^40 or so organisms that have ever lived on our planet.
Of course, we might again ask “what does it mean?”. And indeed one of the points of Darwinism is that the forms of organisms (and the DNA that defines them) arise purely as a consequence of the process of biological evolution, without any “intentional design”. Needless to say, when we actually start talking about biological organisms there’s a tremendous tendency to say things like “that mollusc has a pointy shell because it’s useful in wedging itself in rocks”—in other words, to attribute a purpose to what has arisen from evolution.
So what would we be communicating by sending DNA (or, for that matter, complete instances of organisms)? In a sense we’d be providing a frozen representation of history, though now biological history. There’s an issue of context again too. How does one interpret a disembodied piece of DNA? (Or, what environment is needed to get this spore to actually do something?)
Long ago it used to be said that if there were “organic molecules” out in space, it’d be a sign of life. But in fact plenty of even quite complex molecules have now been found, even in interstellar space. And while these molecules no doubt reflect all sorts of complex physical processes, nobody takes them as a sign of anything like life.
So what would happen if aliens found a DNA molecule? Is that elaborate sequence a “meaningful message,” or just something created through random processes? Yes, in the end the sequences that have survived in modern DNA reflect in some way what leads to successful organisms in our specific terrestrial environment, though—just as with technology and language—there is a certain feedback in the way that organisms create the environment for others.
But, so, what does a DNA sequence show? Well, like a library of human knowledge, it’s a representation of a lot of elaborate historical processes—and of a lot of irreducible computation. But the difference is that it doesn’t have any “spark of human intention” in it.
Needless to say, as we’ve been discussing, it’s hard to identify a signature for that. If we look at things we’ve created so far in our civilization, they’re typically recognizable by the presence of things like (what we at least currently consider) simple geometrical forms, such as lines and circles and so on. And in a sense it’s ironic that after all our development as a civilization, what we produce as artifacts looks so much simpler than what nature routinely produces.
And we don’t have to look at biology, with all its effort of biological evolution. We can just as well think of physics, and things like the forms of snowflakes or splashes or turbulent fluids.
As I’ve argued at length, the real point is that out in the computational universe of possible programs, it’s actually easy to find examples where even simple underlying rules lead to highly complex behavior. And that’s what’s happening in nature. And the only reason we don’t see that usually in the things we construct is that we constrain ourselves to use engineering practices that avoid complexity, so that we can foresee their outcome. And the result of this is that we tend to always end up with things that are simple and familiar.
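This claim is easy to make concrete. Here is a minimal Python sketch (mine, not from the original text) of the Rule 30 cellular automaton, the example most often used for this point: a one-line update rule, started from a single black cell, that produces an intricate, seemingly random pattern.

```python
# Rule 30 cellular automaton: each new cell is
#   left XOR (center OR right)
# Starting from a single "on" cell, this trivially simple rule
# generates highly complex, seemingly random behavior.
def rule30_rows(n):
    cells = [0] * n + [1] + [0] * n     # single "on" cell in the middle
    rows = [cells]
    for _ in range(n):
        cells = [0] + [
            cells[i - 1] ^ (cells[i] | cells[i + 1])
            for i in range(1, len(cells) - 1)
        ] + [0]
        rows.append(cells)
    return rows

for row in rule30_rows(8):
    print("".join("#" if c else "." for c in row))
```

Even at eight steps the triangular, irregular texture is visible; run it for a few hundred rows and the center column looks random enough that it has historically been used as a pseudorandom generator.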
Now that we understand more about the computational universe, we can see, however, that it doesn’t always have to be this way. And in fact I have had great success just “mining the computational universe” for programs (and structures) that turn out to be useful, independent of whether one can “understand” how they operate. And something like the same thing happens when one trains a modern machine learning system. One ends up with a technological system that we can identify as achieving some overall purpose, but where the individual parts we can’t particularly recognize as doing meaningful things.
And indeed my expectation is that in the future, a smaller and smaller fraction of human-created technology will be “recognizable” and “understandable”. Optimized circuitry doesn’t have nice repetitive structure; nor do optimized algorithms. Needless to say, it’s sometimes hard to tell what’s going on. Is that pattern of holes on a speakerphone arranged to optimize some acoustic feature, or is it just “decorative”?
Yet again we’re thrust back into the same philosophical quandary: we can see the mechanism by which things operate, and we can come up with a story that describes why they might work that way. But there is no absolute way to decide whether that story is “correct”—except by referring back to the details of humans and human culture.
Talking about the World
Let’s go back to language. What really is a language? Structurally (at least in all the examples we know so far) it’s a collection of primitives (words, grammatical constructs, etc.) that can be assembled according to certain rules. And yes, we can look at a language formally at this level, just like we can look, say, at how to make tilings according to some set of rules. But what makes a language useful for communication is that its primitives somehow relate to the world—and that they’re tied into knowledge.
In a first approximation, the words or other primitives in a language end up being things that are useful in describing aspects of the world that we want to communicate. We have different words for “table” and “chair” because those are buckets of meaning that we find it useful to distinguish. Yes, we could start describing the details of how the legs of the table are arranged, but for many purposes it’s sufficient to just have that one word, or one symbolic primitive, “table”, that describes what we think of as a table.
Of course, for the word “table” to be useful for communication, the sender and recipient of the word have to have shared understanding of its meaning. As a practical matter, for natural languages, this is usually achieved in an essentially societal way—with people seeing other people describing things as “tables.”
How do we determine what words should exist? It’s a societally driven process, but at some level it’s about having ways to define concepts that are repeatedly useful to us. There’s a certain circularity to the whole thing. The concepts that are useful to us depend on the environment in which we live. If there weren’t any tables around (e.g. during the Stone Age), it wouldn’t be terribly useful to have the word “table.”
But then once we introduce a word for something (like “blog”), it starts to be easier for us to think about the thing—and then there tends to be more of it in the environment that we construct for ourselves, or choose to live in.
Imagine an intelligence that exists as a fluid (say the weather, for example). Or even imagine an aquatic organism, used to a fluid environment. Lots of the words we might take for granted about solid objects or locations won’t be terribly useful. And instead there might be words for aspects of fluid flow (say, lumps of vorticity that change in some particular way) that we’ve never identified as concepts that we need words for.
It might seem as if different entities that exist within our physical universe must necessarily have some commonality in the way they describe the world. But I don’t think this is the case—essentially as a consequence of the phenomenon of computational irreducibility.
The issue is that computational irreducibility implies that there are in effect an infinite number of irreducibly different environments that can be constructed on the basis of our physical universe—just like there are an infinite number of irreducibly different universal computers that can be built up using any given universal computer. In more practical terms, a way to say this is that different entities—or different intelligences—could operate using irreducibly different “technology stacks,” based on different elements of the physical world (e.g. atomic vs. electronic vs. fluidic vs. gravitational, etc.) and different chains of inventions. And the result would be that their way of describing the world would be irreducibly different.
Forming a Language
But OK, given a certain experience of the world, how can one figure out what words or concepts are useful in describing it? In human natural languages, this seems to be something that basically just evolves through a process roughly analogous to natural selection in the course of societal use of the language. And in designing the Wolfram Language as a computational communication language I’ve basically piggybacked on what has evolved in human natural language.
So how can we see the emergence of words and concepts in a context further away from human language? Well, in modern times, there’s an answer, which is basically to use our emerging example of alien intelligence: artificial intelligence.
Just take a neural network and start feeding it, say, images of lots of things in the world. (By picking the medium of 2D images, with a particular encoding of data, we’re essentially defining ourselves to be “experiencing the world” in a specific way.) Now see what kinds of distinctions the neural net makes in clustering or classifying these images.
In practice, different runs will give different answers. But any pattern of answers is in effect providing an example of the primitives for a language.
An easy place to see this is in training an image identification network. We started doing this several years ago with tens of millions of example images, in about 10,000 categories. And what’s notable is that if you look inside the network, what it’s effectively doing is to home in on features of images that let it efficiently distinguish between different categories.
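As an illustration, at the smallest possible scale, of what it means for a network to pick out distinguishing features, here is a toy sketch (my own, assuming nothing beyond NumPy, and in no way the actual network described): a linear classifier trained on synthetic “images” whose two categories differ only in one pixel ends up weighting exactly that pixel.

```python
import numpy as np

# Toy stand-in for an image classifier: two "categories" of 8-pixel
# inputs that differ only in pixel 0. A real network does the same
# thing at vastly larger scale: it learns to weight the features
# that discriminate between categories.
rng = np.random.default_rng(0)
n = 200
X = rng.normal(0.0, 1.0, size=(n, 8))
y = (rng.random(n) < 0.5).astype(float)
X[:, 0] += 3.0 * y            # the category signal lives in pixel 0 only

w = np.zeros(8)
for _ in range(500):          # plain gradient descent on logistic loss
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    w -= 0.1 * X.T @ (p - y) / n

# The "emergent feature" is whichever input the classifier
# leans on hardest -- here, pixel 0.
print(int(np.argmax(np.abs(w))))
```

The learned weights are the (one-layer) analog of the internal features a deep network develops; in a real network they are distributed and far harder to read off.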
These features then in effect define the emergent symbolic language of the neural net. And, yes, this language is quite alien to us. It doesn’t directly reflect human language or human thinking. It’s in effect an alternate path for “understanding the world”, different from the one that humans and human language have taken.
Can we decipher the language? Doing so would allow us to “explain the story” of what the neural net is “thinking.” But it won’t typically be easy to do. Because the “concepts” that are being identified in the neural network typically won’t have easy translations to things we know about—and we’ll be stuck in effect doing something like natural science to try to identify phenomena from which we can build up a description of what’s going on.
OK, but in the problem of communicating with aliens, perhaps this suggests a way. Don’t try (and it’ll be hard) to specify a formal definition of “chair.” Just show lots of examples of chairs—and use this to define the symbolic “chair” construct. Needless to say, as soon as one’s showing pictures of chairs, not providing actual chairs, there are issues of how one’s describing or encoding things. And while this approach might work decently for common nouns, it’s more challenging for things like verbs, or more complex linguistic constructs.
But if we don’t want our spacecraft full of sample objects (a kind of ontological Noah’s Ark), maybe we could get away with just sending a device that looks at objects, and outputs what they’re called. After all, a human version of this is basically how people learn languages, either as children, or when they’re out doing linguistic fieldwork. And today we could certainly have a little computer with a very respectable, human-grade image identifier on it.
But here’s the problem. The aliens will start showing the computer all sorts of things that they’re familiar with. But there’s no guarantee whatsoever that these will be aligned with the things we (or the image identifier) have words for. One can already see the problem if one feeds an image identifier human abstract art; it’s likely to be even worse with the products of an alien civilization.
What the Wolfram Language Does
So can the Wolfram Language help? My goal in building it has been to create a bridge between the things humans want to do, and the things computation abstractly makes possible. And if I were building the language not for humans but for aliens—or even dolphins—I’d expect it to be different.
In the end, it’s all about computation, and representing things computationally. But what one chooses to represent—and how one does it—depends on the whole context one’s dealing with. And in fact, even for us humans, this has steadily changed over time. Over the 30+ years I’ve been working on the Wolfram Language, for example, both technology and the world have measurably evolved—with the result that there are all sorts of new things that make sense to have in the language. (The advance of our whole cultural understanding of computation—with things like hyperlinks and functional programming now becoming commonplace—also changes the concepts that can be used in the language.)
Right now most people think of the Wolfram Language mainly as a way for humans to communicate with computers. But I’ve always seen it as a general computational communication language for humans and computers—that’s relevant among other things in giving us humans a way to think and communicate in computational terms. (And, yes, the kind of computational thinking this makes possible is going to be increasingly critical—even more so than mathematical thinking has been in the past.)
But the key point is that the Wolfram Language is capturing computation in human-compatible terms. And in fact we can view it as in effect giving a definition of which parts of the universe of possible computations we humans—at the current stage in the evolution of our civilization—actually care about.
Another way to put this is that we can think of the Wolfram Language as providing a compressed representation (or, in effect, a model) of the core content of our civilization. Some of that content is algorithmic and structural; some of it is data and knowledge about the details of our world and its history.
There’s more to do to make the Wolfram Language into a full symbolic discourse language that can express a full range of human intentions (for example, what’s needed for encoding complete legal contracts, or ethical principles for AIs). But with the Wolfram Language as it exists today, we’re already capturing a very broad swath of the concerns and achievements of our civilization.
But how would we feed it to aliens? At some level its gigabytes of code and terabytes of data just define rules—like the rules for a cellular automaton or any other computational system. But the point is that these rules are chosen to be ones that do computations that we humans care about.
It’s a bit like those Egyptian tomb models, which show things Egyptians cared about doing. If we give the aliens the Wolfram Language we’re essentially giving them a computational model of things we care about doing. Except, of course, that by providing a whole language—rather than just individual pictures or dioramas—we’re communicating in a vastly broader and deeper way.
The Reality of Time Capsules
What we’re trying to create in a sense amounts to a time capsule. So what can we learn from time capsules of the past? Sadly, the history is not too inspiring.
Particularly following the discovery of King Tutankhamun’s tomb in 1922, there was a burst of enthusiasm for time capsules that lasted a little over 50 years, and led to the creation—and typically burial—of perhaps 10,000 capsules. Realistically, though, the majority of these time capsules are even by now long forgotten—most often because the organizations that created them have changed or disappeared. (The Westinghouse Time Capsule for the 1939 World’s Fair was at one time a proud example; but last year the remains of Westinghouse filed for bankruptcy.)
My own email archive records a variety of requests in earlier years for materials for time capsules, and looking at it today I’m reminded that we seem to have created a time capsule for Mathematica’s 10th anniversary in 1998. But where is it now? I don’t know. And this is a typical problem. Because whereas an ongoing archive (or library, etc.) can keep organized track of things, time capsules tend to be singular, and have a habit of ending up sequestered away in places that quickly get obscured and forgotten. (The reverse can also happen: People think there’s a time capsule somewhere—like one supposedly left by John von Neumann to be opened 50 years after his death—but it turns out just to be a confusion.)
The one area where at least informal versions of time capsules seem to work out with some frequency is in building construction. In England, for example, when thatched roofs are redone after 50 years or so, it’s common for messages from the previous workers to be found. But a particularly old tradition—dating even back to the Babylonians—is to put things in the foundations, and particularly at the cornerstones, of buildings.
Often in Babylonian times, there would just be an inscription cursing whoever had demolished the building to the point of seeing its foundations. But later, there was for example a longstanding tradition among Freemasons to embed small boxes of memorabilia in public buildings they built.
More successful, however, than cleverly hidden time capsules have been stone inscriptions out in plain sight. And indeed much of our knowledge of ancient human history and culture comes from just such objects. Sometimes they are part of large surviving architectural structures. But one famous example (key to the deciphering of cuneiform) is simply carved into the side of a cliff in what’s now Iran.
Such inscriptions were common in the ancient world (as their tamer successors are common today). But somehow their irony was well captured by what is probably my single all-time favorite poem, Shelley’s “Ozymandias” (named after Ramses II of Egypt):
“I met a traveller from an antique land, Who said—Two vast and trunkless legs of stone Stand in the desert. … And on the pedestal, these words appear: ‘My name is Ozymandias, King of Kings; Look on my Works, ye Mighty, and despair!’ Nothing beside remains. Round the decay Of that colossal Wreck, boundless and bare The lone and level sands stretch far away.”
If there was a “Risks” section to a prospectus for the beacon project, this might be a good exhibit for it.
Of course, in addition to intentional “showoff” inscriptions, ancient civilizations left plenty of “documentary exhaust” that’s still around in one form or another today. A decade ago, for example, I bought off the web (and, yes, I’m pretty sure it’s genuine) a little cuneiform tablet from about 2100 BC.
It turns out to be a contract saying that a certain Mr. Lu-Nanna is receiving 1.5 gur (about 16 cubic feet) of barley in the month of Dumuzi (Tammuz/June-July), and that in return he should pay out certain goods in September-November.
Most surviving cuneiform tablets are about things like this. One in a thousand or so are about things like math and astronomy, though. And when we look at these tablets today, it’s certainly interesting to see how far the Babylonians had gotten in math and astronomy. But (with the possible exception of some astronomical parameters) after a while we don’t really learn anything more from such tablets.
And that’s a lesson for our efforts now. If we put math or science facts in our beacons, then, yes, it shows how far we’ve gotten (and of course to make the best impression we should try to illustrate the furthest reaches of, for example, today’s math, which will be quite hard to do). But it feels a bit like job applicants writing letters that start by explaining basic facts. Yes, we already know those; now tell us something about yourselves!
But what’s the best way to do that? In the past the channel with the highest bandwidth was the written word. In today’s world, maybe video—or AI simulation—goes further. But there’s more—and we’re starting to see this in modern archaeology. The fact is that pretty much any solid object carries microscopic traces of its history. Maybe it’s a few stray molecules—say from the DNA of something that got onto an eating utensil. Maybe it’s microscopic scratches or cracks in the material itself, indicating some pattern of wear.
Atomic force microscopy gives us the beginning of one way to systematically read such things out. But as molecular-scale computing comes online, such capabilities will grow rapidly. And this will give us access to a huge repository of “historical exhaust.”
Facebook announced this morning that it had completed its move to neural machine translation, a complicated way of saying that Facebook is now using convolutional neural networks (CNNs) and recurrent neural networks (RNNs) to automatically translate content across Facebook.
Google, Microsoft and Facebook have been making the move to neural machine translation for some time now, rapidly leaving old-school phrase-based statistical machine translation behind. There are a lot of reasons why neural approaches show more promise than phrase-based approaches, but the bottom line is that they produce more accurate translations.
Traditional machine translation is a fairly explicit process. Relying on key phrases, phrase-based systems translate sentences, then probabilistically determine a final translation. You can think of this in a similar light as using the Rosetta Stone (identical phrases in multiple languages) to translate text.
In contrast, neural models deal at a higher level of abstraction. The interpretation of a sentence becomes part of a multi-dimensional vector representation, which really just means we’re trying to translate based on some semblance of context rather than phrases.
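As a deliberately tiny illustration of that idea (my own toy sketch; the embeddings and update rule are invented for illustration and have nothing to do with Facebook’s actual models): each word maps to an embedding vector, a simple recurrent update folds the sentence into one fixed-length context vector, and sentences with similar content end up with nearby vectors.

```python
import math

# Schematic sentence encoder. Each word has a fixed embedding; a
# simple recurrent update folds the sentence into one 4-dimensional
# "context vector". The embeddings below are hand-made toys.
EMB = {
    "the":     [0.1, 0.0, 0.0, 0.0],
    "cat":     [0.0, 1.0, 0.0, 0.1],
    "dog":     [0.0, 0.9, 0.1, 0.1],
    "sat":     [0.0, 0.0, 1.0, 0.0],
    "economy": [1.0, 0.0, 0.0, 1.0],
    "crashed": [0.9, 0.1, 0.0, 0.9],
}

def encode(sentence):
    h = [0.0] * 4
    for word in sentence.split():
        e = EMB[word]
        h = [math.tanh(0.5 * h[i] + e[i]) for i in range(4)]
    return h

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Sentences with similar content end up with nearby context vectors.
close = cosine(encode("the cat sat"), encode("the dog sat"))
far = cosine(encode("the cat sat"), encode("the economy crashed"))
print(close > far)   # True
```

A real neural translation system learns the embeddings and the recurrence from data, and decodes the context vector into the target language, but the "sentence becomes a vector" picture is the same.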
Facebook Status update translation
It’s not a perfect process, and researchers are still tinkering with how to deal with long-term dependencies (i.e. retaining understanding and accuracy throughout a long text), but the approach is incredibly promising and has produced great results thus far for those implementing it.
“Our problem is different than that of most of the standard places, mostly because of the type of language we see at Facebook,” Necip Fazil Ayan, engineering manager in Facebook’s language technologies group, explained to me in an interview. “We see a lot of informal language and slang acronyms. The style of language is very different.”
Google’s Brain team is open-sourcing Tensor2Tensor, a new deep learning library designed to help researchers replicate results from recent papers in the field and push the boundaries of what’s possible by trying new combinations of models, datasets and other parameters. The sheer number of variables in AI research, combined with the fast pace of new developments, makes it difficult for experiments run in two distinct settings to match. This is a pain for researchers and a drag on research progress.
The Tensor2Tensor library makes it easier to maintain best practices while conducting AI research. It comes equipped with key ingredients including hyperparameters, datasets, model architectures and learning rate decay schemes.
The best part is that any of these components can be swapped in and out in a modular fashion without completely destroying everything. From a training perspective, this means that with Tensor2Tensor you can bring in new models and datasets at any time, a much simpler process than would ordinarily be possible.
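The design being described is essentially a registry pattern: components are registered by name and looked up at run time, so any one of them can be swapped without touching the rest. Here is a minimal Python sketch of that pattern; all names are hypothetical and none of this is Tensor2Tensor’s actual API.

```python
# Hypothetical sketch of the "swap components by name" design.
# None of these names come from Tensor2Tensor itself.
REGISTRY = {"model": {}, "dataset": {}, "hparams": {}}

def register(kind, name):
    def wrap(obj):
        REGISTRY[kind][name] = obj
        return obj
    return wrap

@register("model", "tiny_net")
def tiny_net(x, scale):
    return [scale * v for v in x]

@register("dataset", "toy_numbers")
def toy_numbers():
    return [1.0, 2.0, 3.0]

@register("hparams", "base")
def base():
    return {"scale": 2.0}

def run(model="tiny_net", dataset="toy_numbers", hparams="base"):
    # Any component can be replaced by name without touching the others.
    hp = REGISTRY["hparams"][hparams]()
    data = REGISTRY["dataset"][dataset]()
    return REGISTRY["model"][model](data, hp["scale"])

print(run())   # [2.0, 4.0, 6.0]
```

Registering a new model or dataset is one decorator; every existing experiment configuration can then refer to it by name, which is what makes controlled comparisons between settings practical.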
Google isn’t alone in its pursuit of making research more reproducible outside the lab. Facebook recently open-sourced ParlAI, its tool to facilitate dialog research that comes prepackaged with commonly used datasets.
Here’s how science works: You have a question about some infinitesimal sliver of the universe. You form a hypothesis, test it, and eventually gather enough data to support or disprove what you thought was going on. That’s the fun part. The next bit is less glamorous: You write a manuscript, submit it to an academic journal, and endure the gauntlet of peer review, where a small group of anonymous experts in your field scrutinize the quality of your work.
Peer review has its flaws. Human beings (even scientists) are biased, lazy, and self-interested. Sometimes they suck at math (even scientists). So, perhaps inevitably, some people want to remove humans from the process—and replace them with artificial intelligence. Computers are, after all, unbiased, sedulous, and lack a sense of identity. They are also, by definition, good at math. And scientists aren’t just waiting around for some binary brain to manifest a set of protocols for identifying experimental excellence. Journal publishers are already building this stuff, piecemeal.
Recently, a competition called ScienceIE challenged teams to create programs that could extract the basic facts out of sentences in scientific papers, and compare those to the basic facts from sentences in other papers. “The broad goal of my project is to help scientists and practitioners gain more knowledge about a research area more quickly,” says Isabelle Augenstein, a post-doctoral AI researcher at University College London, who devised the challenge.
That’s a tiny part of artificial intelligence’s biggest challenge: processing natural human language. Competitors designed programs to tackle three subtasks: reading each paper and identifying its key concepts, organizing key words by type, and identifying relationships between different key phrases. And it’s not just an academic exercise: Augenstein is on a two-year contract with Elsevier, one of the world’s largest publishers of scientific research, to develop computational tools for their massive library of manuscripts.
She has her work cut out for her. Elsevier publishes over 2,500¹ different journals. Each has an editor, who has to find the right reviewer for each manuscript. (In 2015, 700,000 peer reviewers reviewed over 1.8 million manuscripts across Elsevier’s journals; 400,000 were eventually published.) “The number of humans capable of reviewing a proposal is generally limited to the specialists in that field,” says Mike Warren, AI veteran and CTO/co-founder of Descartes Labs, a digital mapping company that uses AI to parse satellite images. “So, you’ve got this small set of people with PhDs, and you keep dividing them into disciplines and sub-disciplines, and when you’re done there might only be 100 people on the planet qualified to review a certain manuscript.” Augenstein’s work is part of Elsevier’s effort to automatically suggest the right reviewers for each manuscript.
Elsevier has developed a suite of automated tools, called Evise, to aid in peer review. The program checks for plagiarism (although that’s not really AI, just a search and match function), clears potential reviewers for things like conflicts of interest, and handles workflow between authors, editors, and reviewers. Several other major publishers have automated software to aid peer review—Springer-Nature, for instance, is currently trialing an independently-developed software package called StatReviewer that ensures that each submitted paper has complete and accurate statistical data.
But none seem as open about their capabilities or aspirations as Elsevier. “We are investigating more ambitious tasks,” says Augenstein. “Say you have a question about a paper: A machine learning model reads the paper and answers your question.”
Thank you very much, Dr. Roboto, but no thanks
Not everyone is charmed by the prospect of Dr. Roboto, PhD. Last month, Janne Hukkinen, professor of environmental policy at the University of Helsinki, Finland, and editor of the Elsevier journal Ecological Economics, wrote a cautionary op-ed for WIRED, premised on a future where AI peer review became fully autonomous:
I don’t see why learning algorithms couldn’t manage the entire review from submission to decision by drawing on publishers’ databases of reviewer profiles, analyzing past streams of comments by reviewers and editors, and recognizing the patterns of change in a manuscript from submission to final editorial decision. What’s more, disconnecting humans from peer review would ease the tension between the academics who want open access and the commercial publishers who are resisting it.
By Hukkinen’s logic, an AI that could do peer review could also write manuscripts. Eventually, people become a legacy system within the scientific method—redundant, inefficient, obsolete. His final argument: “New knowledge which humans no longer experience as something they themselves have produced would shake the foundations of human culture.”
But Hukkinen’s dark vision of machines capable of outthinking human scientists is, at the very least, decades away. “AI, despite its big successes in games like chess, Go, and poker, still can’t understand most normal English sentences, let alone scientific text,” says Oren Etzioni, CEO of the Allen Institute for Artificial Intelligence. Consider this: the winning team from Augenstein’s ScienceIE competition scored 43 percent across the three subtasks.
And even non-computer brains have a hard time comprehending the passive-voiced mumbo jumbo common in scientific manuscripts; it is not uncommon for inscriptions within the literature to be structured such that the phenomenon being discussed is often described, after layers of prepositional preamble, and in vernacular that is vague, esoteric, and exorbitant, as being acted upon by causative factors. Linguists call anything written by humans, for humans, natural language. Computer scientists call natural language a hot mess.
“One large category of problems in natural language for AI is ambiguity,” says Ernest Davis, a computer scientist at NYU who studies common sense processing. Let’s take a classic example of ambiguity, illustrated in this sentence by Stanford University emeritus computer scientist Terry Winograd:
The city councilmen refused the demonstrators a permit because they [feared/advocated] violence.
To you and me, the verbs give away who “they” refers to: the city council fears; the demonstrators advocate. But a computer brain would have a hell of a time figuring out which verb indicates which pronoun. And that type of ambiguity is just one thread in the tangled knot of natural language—from simple things like understanding homographs to unraveling the logic of narratives.
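To see why this defeats surface-level methods, here is a toy experiment (the heuristic is mine, chosen purely for illustration): a naive “nearest preceding noun” rule resolves the pronoun identically in both variants of Winograd’s sentence, even though the correct antecedents differ depending on the verb.

```python
# Toy illustration: a naive "nearest preceding noun" rule for
# pronoun resolution. It gives the same answer for both Winograd
# variants, though the true antecedent changes with the verb.
def nearest_noun_resolver(sentence, pronoun="they"):
    candidates = ["councilmen", "demonstrators"]
    before = sentence.split(pronoun)[0]
    # pick whichever candidate noun appears last before the pronoun
    return max(candidates, key=lambda n: before.rfind(n))

feared = ("The city councilmen refused the demonstrators a permit "
          "because they feared violence.")
advocated = ("The city councilmen refused the demonstrators a permit "
             "because they advocated violence.")

print(nearest_noun_resolver(feared))     # demonstrators
print(nearest_noun_resolver(advocated))  # demonstrators
# Correct answers: councilmen (feared) vs. demonstrators (advocated).
```

Getting the "feared" case right requires knowing that councils fear violence from demonstrators, not surface word order, which is exactly the world knowledge that resists automation.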
That’s not even touching on the specific issues in scientific papers, like connecting a written argument to some pattern in the data. This is even the case in pure mathematics papers. “Going from English to the formal logic of mathematics is not something we can automate,” says Davis. “And that would be one of the easiest things to work on because it’s highly restrictive and we understand the targets.” Disciplines that aren’t rooted in mathematics, like psychology, will be even more difficult. “In psychology papers, we’re nowhere near being able to check the reasonableness of arguments,” says Davis. “We don’t know how to express the experiment in a way that a computer could use it.”
And of course, a fully autonomous AI peer reviewer doesn’t just have to outread humans, it has to outthink them. “When you think about AI problems, peer review is probably among the very hardest you can come up with, since the most important part of peer review is determining that research is novel, that it’s something that has not been done before by someone else,” says Warren. A computer program might be able to survey the literature and figure out which questions remain, but would it be able to pick out research of Einsteinian proportions—some new theory that completely upends previous assumptions about how the world works?
Then again, what if everyone—AI advocates and critics alike—is looking at the problem backwards? “Maybe we just need to change the way we do scientific publishing,” says Tom Dietterich, AI researcher at Oregon State University. So, rather than writing our research as a story in English, we link our claims and evidence into a formalized structure, like a database, containing all the things that are known about a problem people are working on. Computerize the process of peer review, in other words, rather than its solution. But at that point it’s not computers you’re reprogramming: you’re reprogramming human behavior.
¹ UPDATE: 2/22/2017 — Originally, this said Elsevier published 7,500 journals. This was due to either a typo or just poorly-transcribed information. Either way, it’s fixed now.
At least, that’s what Oscar Boykin says. He majored in physics at the Georgia Institute of Technology and in 2002 he finished a physics PhD at UCLA. But four years ago, physicists at the Large Hadron Collider in Switzerland discovered the Higgs boson, a subatomic particle first predicted in the 1960s. As Boykin points out, everyone expected it. The Higgs didn’t mess with the theoretical models of the universe. It didn’t change anything or give physicists anything new to strive for. “Physicists are excited when there’s something wrong with physics, and we’re in a situation now where there’s not a lot that’s wrong,” he says. “It’s a disheartening place for a physicist to be in.” Plus, the pay isn’t too good.
Boykin is no longer a physicist. He’s a Silicon Valley software engineer. And it’s a very good time to be one of those.
Boykin works at Stripe, a $9-billion startup that helps businesses accept payments online. He helps build and operate software systems that collect data from across the company’s services, and he works to predict the future of these services, including when, where, and how fraudulent transactions will arrive. As a physicist, he’s ideally suited to the job, which requires both extreme math and abstract thought. And yet, unlike a physicist, he’s working in a field that now offers endless challenges and possibilities. Plus, the pay is great.
If physics and software engineering were subatomic particles, Silicon Valley would be the place where the fields collide. Boykin works with three other physicists at Stripe. In December, when General Electric acquired the machine learning startup Wise.io, CEO Jeff Immelt boasted that he had just grabbed a company packed with physicists, most notably UC Berkeley astrophysicist Joshua Bloom. The open source machine learning software H2O, used by 70,000 data scientists across the globe, was built with help from Swiss physicist Arno Candel, who once worked at the SLAC National Accelerator Laboratory. Vijay Narayanan, Microsoft’s head of data science, is an astrophysicist, and several other physicists work under him.
It's not on purpose, exactly. “We didn’t go into the physics kindergarten and steal a basket of children,” says Stripe president and co-founder John Collison. “It just happened.” And it’s happening across Silicon Valley. Because structurally and technologically, the things that just about every internet company needs to do are more and more suited to the skill set of a physicist.
Of course, physicists have played a role in computer technology since its earliest days, just as they’ve played a role in so many other fields. John Mauchly, who helped design the ENIAC, one of the earliest computers, was a physicist. Dennis Ritchie, the father of the C programming language, was too.
But this is a particularly ripe moment for physicists in computer tech, thanks to the rise of machine learning, where machines learn tasks by analyzing vast amounts of data. This new wave of data science and AI is something that suits physicists right down to their socks.
Among other things, the industry has embraced neural networks, software that aims to mimic the structure of the human brain. But these neural networks are really just math on an enormous scale, mostly linear algebra and probability theory. Computer scientists aren’t necessarily trained in these areas, but physicists are. “The only thing that is really new to physicists is learning how to optimize these neural networks, training them, but that’s relatively straightforward,” Boykin says. “One technique is called Newton’s method. Newton the physicist, not some other Newton.”
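To make the point concrete, here is a minimal sketch of the one-dimensional case of Newton's method applied to minimization: use the second derivative (curvature) as well as the gradient to step straight toward a minimum. The function names and the toy quadratic are illustrative only, not anything from Stripe's or anyone else's codebase (production training code typically uses approximations, since computing full second derivatives of a large network is too expensive).

```python
def newton_minimize(grad, hess, x, steps=20):
    """Newton's method for minimization in one dimension:
    iterate x <- x - f'(x) / f''(x), using curvature to take
    larger, better-scaled steps than plain gradient descent."""
    for _ in range(steps):
        x = x - grad(x) / hess(x)
    return x

# Toy example: minimize f(x) = (x - 3)^2 + 1.
# f'(x) = 2(x - 3) and f''(x) = 2, so for this quadratic
# Newton's method lands on the minimum x = 3 in a single step.
grad = lambda x: 2 * (x - 3)
hess = lambda x: 2.0
x_min = newton_minimize(grad, hess, x=10.0)
```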
Chris Bishop, who heads Microsoft’s Cambridge research lab, felt the same way thirty years ago, when deep neural networks first started to show promise in the academic world. That’s what led him from physics into machine learning. “There is something very natural about a physicist going into machine learning,” he says, “more natural than a computer scientist.”
The Challenge Space
Ten years ago, Boykin says, many of his old physics pals were moving into the financial world. That same flavor of mathematics was also enormously useful on Wall Street as a way of predicting where the markets would go. One key method was the Black-Scholes equation, a means of determining the value of a financial derivative. But Black-Scholes helped foment the great crash of 2008, and now, Boykin and other physicists say that far more of their colleagues are moving into data science and other kinds of computer tech.
Earlier this decade, physicists arrived at the top tech companies to help build so-called Big Data software, systems that juggle data across hundreds or even thousands of machines. At Twitter, Boykin helped build one called Summingbird, and three guys who met in the physics department at MIT built similar software at a startup called Cloudant. Physicists know how to handle data—at MIT, Cloudant’s founders handled massive datasets from the Large Hadron Collider—and building these enormously complex systems requires its own breed of abstract thought. Then, once these systems were built, many of those same physicists helped put the data they harnessed to use.
In the early days of Google, one of the key people building the massively distributed systems in the company's engine room was Yonatan Zunger, who has a PhD in string theory from Stanford. And when Kevin Scott joined Google's ads team, charged with grabbing data from across Google and using it to predict which ads were most likely to get clicks, he hired countless physicists. Unlike many computer scientists, they were suited to the very experimental nature of machine learning. “It was almost like lab science,” says Scott, now chief technology officer at LinkedIn.
Now that Big Data software is commonplace—Stripe uses an open source version of what Boykin helped build at Twitter—it's helping machine learning models drive predictions inside so many other companies. That provides physicists with an even wider avenue into Silicon Valley. At Stripe, Boykin’s team also includes Roban Kramer (physics PhD, Columbia), Christian Anderson (physics master’s, Harvard), and Kelley Rivoire (physics bachelor’s, MIT). They come because they’re suited to the work. And they come because of the money. As Boykin says: “The salaries in tech are arguably absurd.” But they also come because there are so many hard problems to solve.
Anderson left Harvard before getting his PhD because he came to view the field much as Boykin does—as an intellectual pursuit of diminishing returns. But that’s not the case on the internet. “Implicit in ‘the internet’ is the scope, the coverage of it,” Anderson says. “It makes opportunities much greater, but it also enriches the challenge space, the problem space. There is intellectual upside.”
Today, physicists are moving into Silicon Valley companies. But in the years to come, a similar phenomenon will spread much further. Machine learning will change not only how the world analyzes data but how it builds software. Neural networks are already reinventing image recognition, speech recognition, machine translation, and the very nature of software interfaces. As Microsoft's Chris Bishop says, software engineering is moving from handcrafted code based on logic to machine learning models based on probability and uncertainty. Companies like Google and Facebook are beginning to retrain their engineers in this new way of thinking. Eventually, the rest of the computing world will follow suit.
In other words, all these physicists pushing into the realm of the Silicon Valley engineer are a sign of a much bigger change to come. Soon, all the Silicon Valley engineers will push into the realm of the physicist.