How does capitalism change in the age of big data and the data superstars like Amazon? Viktor's concept of data rich markets explains how.
Ben: Welcome to the Masters of Data Podcast, the podcast that brings the human to data. And I'm your host, Ben Newton. What better guest to have on the Masters of Data Podcast than the author of a book called Big Data and who just recently published a book called Reinventing Capitalism In The Age of Big Data. Viktor Mayer-Schoenberger is the Professor of Internet Governance and Regulation at Oxford. His research focuses on the role of information in the networked economy, and he earlier spent 10 years on the faculty of Harvard's Kennedy School of Government. We talked about his new book and the concept of data rich markets. I think you'll like it. So without any further ado, let's dig in.
Welcome everybody to the Masters of Data Podcast. And I am very excited about our guest today. He is an author. He is a professor. He is a speaker all around the world. Viktor Mayer-Schoenberger, welcome to the show.
Viktor: Thank you Ben. Thank you for having me.
Ben: We're especially excited because you're the Professor of Internet Governance and Regulation at the Oxford Internet Institute. And among other things you've written a couple books that are of very much interest to me and I think particularly to this audience. You wrote the book called Big Data a few years ago and just recently wrote your newer book about how big data is changing our economy and actually reinventing capitalism which I think as soon as I saw that I knew I wanted to talk to you so I'm really excited we were able to make this work.
Viktor: Sure, I'd be excited to engage in a conversation with you. These are very important topics and we must talk about that.
Ben: Yeah absolutely. And I think you've got some really big ideas and a book which I'm excited to talk through. But as we always do when we get started on the podcast I really like to understand why people are where they are, like what's your story. And in particular the research I was doing before we talked I think you have a really interesting background. So I'd love to hear how did you end up where you are now talking about big data and government at Oxford. What brought you to that point?
Viktor: Well the short answer is good fortune. But the longer answer is that I had two careers in the past. One was I had a software firm, sold it, got another software firm so I'm a bit of a serial entrepreneur on the software side. And then at the same time got involved in law and public policy, spent 10 years at the Faculty at Harvard doing public policy in the hi-tech sector. And since 2010 at the Faculty of Oxford University doing the same thing. And so I've always been really interested in data and information and how we use data and how it changes our society, changes the economy, changes the institutions that are so central to what we are as a society. And that's where the book Big Data came from and the more recent book about reinventing capitalism. So in a way in hindsight it looks like a straight path but there were a lot of dead ends and a lot of twists and turns and a lot of good fortune on the way.
Ben: That makes the more interesting story. When I was looking back at your background Viktor because when I originally saw the book title Reinventing Capitalism, Big Data I'm like OK this is a very ambitious book and I thought that was great. And then looking back at your background I mean you really do have a fascinating background to arrive at that because I mean even when you [inaudible 00:03:51] when you were younger participating in the International Physics Olympiad, young programmers contest, you got that background understanding the technology side plus law and economics. So that does seem like that puts you in a good place where you're combining a lot of these disciplines to come at this problem from a very different perspective which is great.
Viktor: Yeah I hope so. I guess I was a nerd before we were called nerds. I still remember the time when I started at around the age of 14 or 15 to hook up to the Internet. I did it with an acoustic coupler, an old fashioned plain old telephone where I put the receiver into the acoustic coupler and we had 300 bits per second of transmission speed. That was very different but fun.
Ben: No that's cool. I don't think I even got a chance to do that. I think my first one was an actual modem. That's pretty cool. So I mean a big part of what I think is so interesting about what you're talking about now is I hear people throw around the term big data a lot, you hear it get misused, I mean I even say there's been kind of overuse in a lot of ways over the last few years. But then what attracted me to a lot of stuff you've been doing particularly when you first wrote the book about big data and what you're talking about now is you seem to have taken a more comprehensive, more in-depth idea of what big data is. So I'd love for you to talk about what you actually mean by big data and actually how you define that when someone asks you.
Viktor: Sure. For a lot of people big data is maybe a tool to do what we were already doing but just faster or better or more efficiently. And to me big data is far more than that. Big data is an opportunity for us to get a new perspective on reality, a new look at the world. It's a new lens on reality that we didn't have before. And it's not the absolute number of data points that really matters but our ability to capture comprehensively in data a particular phenomenon that we're interested in, so that we are not just looking at a small sample of data of a particular phenomenon but really something that is approaching, or at least close to, all of the data that can be captured about a specific phenomenon. And that enables us to see detail and to see connections that with a small sample of data, as in the small data world, we just couldn't see, whether it's our ability to identify illnesses such as skin cancer or whether it's our ability to better see what pedagogical methods work for children in education or whether it's how cars drive themselves. But it's really the kind of insight that we gain out of comprehensive amounts of data. And that can be millions of data points or billions of data points but it really needs to capture the essence of the phenomenon.
Ben: That makes a lot of sense. And one idea that was connected to that which definitely really captured me as I was looking at what you have been talking about and what you've been researching and writing in these books, you talked about the concept of a data rich market and I would admit that I hadn't really thought about big data that way and the concept of exactly how it's penetrating the markets. Maybe talk a little bit about that. When you say data rich markets what do you actually mean?
Viktor: Let me start out by saying there's a lot of debate right now about data being the new oil that is a new very powerful valuable resource. But I think that's way too short of a view because that suggests that our economy stays the same, the only thing that changes is that we are not trading in oil anymore, we are trading and transacting in data. And so that the entire economic institutions and processes they all stay the same, it's just a different resource that we are transacting namely data. And that I think is fundamentally far too short sighted. The recent book really argues that thanks to data the way we transact on the markets, the way markets work changes. And when we think about a market and how our market works it's really about helping people to coordinate with each other so that two people who have complementary preferences and needs can find each other, transact and exchange goods or services. But for that to work we need a marketplace where a lot of transaction partners can be found and we need some way where people can find each other. We need a matching process.
The market does that pretty well but for the market to work a lot of information needs to be available on the market about other people's offerings and preferences. And we need to then take all of that information, each one of us in the marketplace, and make a decision about what we buy or with whom we transact. And that was way too much for us humans to deal with in the past. So we kind of condensed it down. We used a shortcut and that shortcut, that condensation process, was that we would rather than communicating everything about our preferences, we just condense everything into price and exchange price information.
And there is even Nobel prizes being awarded for emphasizing the importance of price and that price is sort of the grease that makes the traditional market work. But price is just a crutch. It's not particularly good to condense everything into a single figure and then exchange a single figure because a lot of detail gets lost. And so traditional markets work but they're not extremely efficient. And that's what we see. We buy stuff because we think it's cheap but we don't really need it because we just over emphasize the importance of price compared to other qualities. We don't look at our preferences very well because it just overwhelms our cognitive capabilities. Comparing price is just so easy. We're socialized in doing that so we're focusing very much on price traditionally.
But we see now a new kind of market coming up and marketplaces where we have much more information rather than just price available. And that means we can find exactly what we're looking for whether it's booking platforms, from Kayak to Booking.com, whether it's car sharing platforms like BlaBlaCar or whether it's classic e-commerce platforms like Amazon. These are all marketplaces where we thanks to data have a much better chance of finding exactly what we're looking for and that makes these kinds of markets far better than traditional markets, price based markets.
And so a lot of people talk about platform economy. I can't hear that word anymore because platform doesn't say much about what's going on, what's really going on is that these are marketplaces where we exchange so much rich data that we really find what we need. And that's why we like the market so much. There's a reason why people go to Amazon, it's not because we are forced to or because Jeff bribes us or because it's the cheapest place. It's because we easily find exactly what we're looking for. And that's why data rich markets have such a leg up compared to traditional price based markets and that's why we see them come up whether it's Amazon or whether it's Uber or Airbnb or BlaBlaCar or Kayak. Think also of Apple, Apple's App Store is nothing but a data rich market that makes recommendations about what kind of apps we might want to look at and that helps us find in this huge app universe some of the apps that we really like and cherish. That's the kind of marketplace that really helps humans find each other and coordinate and is so much better than what we had so far.
Ben: That does make a lot of sense and a couple of things stuck out to me on that because particularly when you talk about Amazon or Apple or any of these companies that are kind of at the forefront of this, if you look back at big data a few years ago I mean there were a lot of high expectations and then it didn't always pan out and a lot of it didn't seem to pan out because these companies were collecting massive amounts of data but then they couldn't take advantage of it and there were a couple of things that stuck out to me reading how you were describing this. It seems a lot of what you were talking about in terms of these companies being able to generate these data rich markets was about the use of a concept of a data ontology, like a data structure. I was thinking classification, tagging, metadata, basically organization of the data. It seems like if I understood you correctly that a big part of being able to leverage this data is being able to classify it, organize it and present it in a way that makes these transactions work. Am I understanding that right?
Viktor: You're absolutely understanding it right. If I think about how my students go about booking a hotel room in a city they go on one of those platforms and then they select the kind of features that they want from a hotel like free Wi-Fi or like some bar scene close by, maybe some public transport close by and then they look at the reviews, they look at the area through Google Street View and only relatively late in the game they actually look at price. And so price is not just the defining category anymore, it's just one of many categories in a decision process. For all of that you need to have a lot of data and you need to be able to categorize it.
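The booking behaviour Viktor describes here, filtering on features and reviews first and looking at price only late in the process, can be illustrated with a small sketch. This is only a toy illustration, not anything taken from the book or from a real booking platform: the `Hotel` fields, the sample hotels and the function names are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hotel:
    # All fields are hypothetical, chosen only for illustration.
    name: str
    price: float            # nightly price
    features: frozenset     # e.g. {"wifi", "bar_nearby"}
    review_score: float     # 0 to 10


def cheapest(hotels):
    """Price-based choice: every attribute other than price is ignored."""
    return min(hotels, key=lambda h: h.price)


def best_match(hotels, wanted, min_review):
    """Data rich choice: filter on features and reviews, use price last."""
    candidates = [
        h for h in hotels
        if wanted <= h.features and h.review_score >= min_review
    ]
    if not candidates:
        return None
    # Price is only the final tie-breaker among acceptable matches.
    return min(candidates, key=lambda h: h.price)


hotels = [
    Hotel("Budget Inn", 60.0, frozenset({"wifi"}), 6.1),
    Hotel("Central Stay", 95.0,
          frozenset({"wifi", "bar_nearby", "transit_nearby"}), 8.7),
    Hotel("Grand Plaza", 180.0,
          frozenset({"wifi", "bar_nearby", "transit_nearby"}), 9.2),
]
wanted = frozenset({"wifi", "transit_nearby"})

print(cheapest(hotels).name)
print(best_match(hotels, wanted, min_review=8.0).name)
```

In this sketch the price-only choice and the feature-based choice pick different hotels, which is the point Viktor makes about price being just one of many categories in the decision.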
And when you look at the e-commerce landscape like 15 years ago or so there was eBay and there was Amazon and eBay was bigger than Amazon. And today of course Amazon is a 500 pound gorilla and eBay is much much smaller. Why is that? Well because on eBay you had lots and lots of goods but being able to find exactly what we were looking for was much harder on eBay than it was on Amazon. And it's this kind of easily finding through filters and so forth what you're looking for that really differentiates one marketplace from the other. And Amazon very very early on also built in recommendation features, a recommendation engine that supposedly accounts for a third of Amazon's revenues right now. And that is because it works. It just provides us with recommendations that are pretty good. Of course we laugh once in a while about a stupid recommendation that we received but more often than not we get good recommendations and we transact based on those.
Ben: And a question about that too when I was thinking about kind of the classification it seems probably what you're referring to is I've heard this term called the analytics economy as well and I mean I think it's coming from a different perspective but I think it's getting back at that core issue is that yeah you can have all this data but how are you actually going to leverage it. So I mean how do you see the kind of balance between ... well I guess you could just classify a bunch of stuff but if you don't actually have a way of actually processing that and ... I guess what I'm fundamentally getting at is like what's the role of some of the new advances in machine learning and A.I. and these algorithms in general compared to just the classification itself you think?
Viktor: We can now use artificial intelligence not just to help us classify data that's about a particular product into certain categories but we can actually use machine learning now to come up with better ontologies, that is better classification schemes. And there's a real gold rush in this particular field. There is startup companies that come up with new ways and of course every big player in the e-commerce field is gobbling up those startups because being able to come up with better ontologies, being able to come up with better classifications, faster and easier ways to classify new products, is paramount. Especially when you are a large marketplace with millions of products for sale and when your customers want more and more information.
Think about an e-commerce market for clothes, for apparel. You need pictures there. You need to be able to perhaps even take the picture and derive a 3-D model from it so that you can put them on a mannequin and see how it looks and sort of combine them. All of that requires a lot of data that's correctly classified. But that's just one part of the coin. The other is of course that you need artificial intelligence and machine learning to then out of all this data help in the matching process, help bring two transaction partners together. Tell me what kind of product I really should look at. And for that more data just trumps less data. If you have a lot of data about my preferences you are able to find what I'm looking for. It's like if you have a personal shopper and that personal shopper knows you very well, that personal shopper is going to be a real help to you compared with a sales person that doesn't know you at all and just may suggest the three standard products.
Ben: Yeah I think that's pretty compelling and it makes a lot of sense because I've even said this a couple of times myself. It seems to be that I mean that's where the investments are going for these companies right now is finding better ways to, I mean I've called it personalization before and I think the way you describe it makes a lot of sense, it's basically how are you understanding the participants in this transaction to make it basically be a more advantageous exchange I guess would be the right way to say it. We're talking a lot about Amazon. We mentioned Google and you'd have ones like Facebook and things like that all in the same breath, Apple. I mean we're talking about a lot of very dominant companies, companies that have managed to I think take advantage of what you were just talking about, particularly Amazon and Google and Facebook, you call them superstars in your book and particularly in the events over the last year or two I think we've seen that that's not always a good thing. It seems like you come from that perspective as well, that it's not necessarily a good thing to have these companies that are so dominant. So talk a little bit about that. I mean how does that affect this data rich economy that you're talking about?
Viktor: Well so here's the interesting conundrum or the tension that you point at. On the one hand side these data rich markets are phenomenal. They're great. They're vastly better than the traditional markets in trying to find us what we are looking for. But a fundamental feature of the market is that there's decentral decision making, that every market participant, every consumer makes his or her own decision about what to purchase or not purchase. The problem is that these companies that now run data rich markets, think of Amazon, Amazon is not a traditional firm. It's not a traditional company. It's really a company that runs a global marketplace and a global online marketplace that's data rich. And so when you look at that data rich marketplace it provides consumers with good matches. But when you look more closely you see that every third dollar spent by a consumer on Amazon is spent based on a recommendation that Amazon makes. And so that means the decision making in the market isn't necessarily decentral anymore. It's Amazon's recommendation engine that tells a third of the people or that sort of influences a third of the transactions. And that means there is more and more a concentration of decision making in these marketplaces. And that's very worrying. Why?
Decentral structure of the market is really good because it makes the market robust. It makes it less vulnerable. If somebody in the marketplace makes a stupid decision, buys something that he shouldn't buy, he's hurting but the market isn't hurting. But suppose Amazon's recommendation engine has a deep flaw built into it. Nobody knows it. Not even Jeff Bezos knows about it but it's a deep flaw and it makes a mistake. A slight mistake with every recommendation. Then we all make the same mistake. And that brings not just harm to one or two individuals but to a third or half or more of all the shoppers on Amazon. That may bring down the entire marketplace. It's a single point of failure. And the problem is that if you have a marketplace where there is a single decision maker then you don't have a market anymore. You have a planned economy. And the danger of Amazon is that it looks like a marketplace. But it's nothing more than approaching a planned economy. And that should worry us.
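The single point of failure argument can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not a model from the book: the function `harmed_share`, the error rate and the idea that a flawed shared engine misleads every shopper at once are all simplifications invented for the example.

```python
import random


def harmed_share(n_shoppers, error_rate, shared_engine, rng):
    """Return the fraction of shoppers who end up with a bad purchase.

    shared_engine=False: every shopper decides alone and errs independently.
    shared_engine=True: one engine decides for everyone, so a single draw
    determines whether all shoppers receive the same flawed recommendation.
    """
    if shared_engine:
        flawed = rng.random() < error_rate
        return 1.0 if flawed else 0.0
    bad = sum(rng.random() < error_rate for _ in range(n_shoppers))
    return bad / n_shoppers


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
trials = 200

# Worst outcome seen across many simulated market days.
worst_decentral = max(
    harmed_share(1000, 0.05, shared_engine=False, rng=rng)
    for _ in range(trials)
)
worst_central = max(
    harmed_share(1000, 0.05, shared_engine=True, rng=rng)
    for _ in range(trials)
)
print("worst decentral share harmed:", worst_decentral)
print("worst central share harmed:", worst_central)
```

The average harm is the same in both settings under these assumptions. What differs is the worst case: independent mistakes stay spread thinly across shoppers, while a flaw in the shared engine hits everyone simultaneously.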
Ben: Yeah. And one thing you kind of touched upon there as well is it feels like over the last particularly couple of years because of some big stories about data privacy and things like that that there is more of an awareness now that these marketplaces, be it marketplaces for I don't know like social interactions like on Facebook or products with Amazon aren't always fair. So that it can be biased and that these kind of unconscious or even conscious biases but a lot of times unconscious biases work their way into the system and it seems like what you're getting as well I mean that seems like that's gonna be very likely to happen when you have these concentrated points of power within these markets because if Google is the one designing face detection algorithms that would be used to detect and do things on your behalf and some portion of the population is being detected as apes that's not a good thing. How do you think about how bias is creeping into this and how dangerous it is?
Viktor: Well we all know that because machine learning is data driven, is based on learning data, all of the biases in the learning data of course get reflected in the algorithm that comes out of that learning process. So in that sense machine learning is capturing all of the inherent biases of us humans but it could get worse and that is because of that single point of failure. Just think about Facebook and what we heard over the last two years. Facebook uses machine learning to tell us what kind of news feed articles we should read, what kind of posts we should read. And that's a machine learning process that's data driven. Now Russian trolls and companies like Cambridge Analytica and others have helped the hackers to essentially weaponize the Facebook structure, the Facebook platform, to provide us with information that shapes our thinking one way or the other. Essentially it helped Facebook become a biased information provider and that only works if you have a single point of failure. That only works if everybody is getting the news from the same source and if that is Facebook and if that is an algorithm that is fed on data and if I know what data I can feed it so that it creates the sort of bias that then leads to a biased outcome, a biased algorithm, then I can essentially weaponize Facebook to harm or bring down democracy.
Ben: Yeah and that's a ... I mean the way you described it is even more scary than I would have thought. Yeah I think that whole idea of how that ... because in some sense too people build algorithms. So even the algorithms that are being built you got bias in the data, you got bias in the algorithms and you got people actually trying to misuse these data exchanges for nefarious ends. I mean it's a pretty scary world in some sense. I mean it's both amazingly ... there's some amazing horizons that we can actually reach, amazing things we could do but it's also very scary some of the downside of it. So how do you think we as a society I guess in general actually get around this? I mean what's the right way to approach this problem so we can actually make the data rich markets work? Because I think as you say they're here to stay. So how do we make them work?
Viktor: Right. To me the option isn't to keep the data to myself, to not join any of those markets, because actually that means that I'm getting the short end of the stick. I don't get good matches. I don't get good product. I pay more for them and I don't get what I want. The future I think cannot be in being a digital recluse and harking back to times of ignorance but really to try and make sure that the data isn't as concentrated. If data is valuable then it should be spread more evenly. And that is why in the book we suggest one way out of that is to mandate that those large digital superstars share some of the data that they have of course anonymized with smaller startups so that for example we could go on Amazon and shop on Amazon but we wouldn't just have Amazon's recommendation engine available, there could be other recommendation engines from startups and from other sources and I could choose what I'd like. I could pay a small monthly fee and get the consumer report recommendation engine that would be less biased perhaps than the one from Amazon and therefore ensure that there isn't such a concentration of informational power that then influence purchasing and market transactions. And we can do that by spreading the data more evenly and making sure that there isn't that kind of a concentration of data among a few superstar firms.
Ben: Well if I'm understanding you right I mean probably ... am I right to think that there's somewhat of a parallel to I don't know the late 19th century, early 20th century, where you had kind of the Gilded Age and these large trusts that came about and there was kind of this concentration of economic power and industrial power that allowed them to control prices and then we had to have a lot of this anti-trust regulation. So we're kind of in the same spot now you think in the Information Age?
Viktor: Yeah and there's so many examples. The antitrust legislation that came out in the early 20th century as you are referring to is a good example. Patent law is another good example. Patents give those that make great inventions an exclusion right for a temporary period of time. But in order to get that exclusion right you need to be completely transparent about your invention. So once that time has lapsed everybody else can look at your drawings, can look at your description of the process and can then benefit from that process. That's the kind of spreading knowledge or spreading insight that we as a society have done all along. In fact even just about 10 years ago when Google bought a large travel company called ITA the Department of Justice said you can do that but if you do that you need to let others including competitors like Microsoft access the data that you just purchased because otherwise there is an unfair concentration of data that you have. So making data be accessible to more parties, spreading it more evenly in the economy isn't something new. It's there because it enables us and it has been there for 100s of years in the case of patents, over 100 years in the case of antitrust. It's there because it helps us keep the markets competitive and innovation going.
Ben: And particularly from where you're at with your title, where you're at at Oxford Internet Governance and Regulation. So you clearly are having conversations I would guess about and actually have an eye on how this is changing in the world right now and what governments are doing. Do you have a feeling that our kind of governments around the world are actually taking this seriously and moving forward to do this or do you think there's still a lack of recognition of the problem? What do you see?
Viktor: No I see a lot of governments taking note of this. I spoke with the European Commission, they're interested. I know that the Dutch government is interested. The German coalition government, one of the coalition parties, Social Democrats, have data sharing as their I guess a key element in their platform. In Britain this is an idea that has been put forward. The European Council has suggested that the digital superstars are forced to let other companies access some of the data and are pushing in that direction. So this is clearly something that's very heavily debated right now and I'm sure we'll see more action in that policy arena throughout the world in the coming years.
Ben: One last question for you particularly on the international stage is that I think there is a lot of talk sometimes that there's going to be maybe in the more negative light they're calling an arms race around artificial intelligence. But I mean there's an idea that there's a particularly in China versus what's going on in Europe and the United States around the new technologies and trying to race to that next set of technologies. But after talking to you and listening to you I wonder if it's really going to be a race about technology, not a race about data and what that would actually mean for this concept of data rich markets and for the future economy. What do you think? Am I right to think that that might be the case?
Viktor: Yes I think in part it is about technology, it's about the better tools and the better algorithms but perhaps even more importantly it is about access to data. And so I strongly think that the economy that enables startups and smaller and larger companies to have access to data so that they can then use the data in their machine learning attempts that will produce innovation, that will produce new insights and that will produce not just economic growth but hopefully also a digital dividend for society itself.
Take self-driving cars. The more data you have, the better the car can learn how to drive itself. So obviously we want as much data as we can in order to have really good self-driving cars. And therefore an entire economy like Europe and European car makers would actually benefit greatly from being able to pool the data that they have or to share the data that they have about their self-driving car attempts just to have more and more data available from which to learn and which to use to train their systems.
Ben: Yeah that does make sense. Last question for you, what's next? Where are you kind of thinking about and building your plans to do next?
Viktor: Well what's next is thinking about what's essential for us to preserve for humans. If we have such good data rich markets and digital assistants to help us in human decision making whether it's transaction decisions on the market or other decisions, what role is there for humans. And there is a lot of talk about the importance of creativity and soft skills and our ability to be emotional. And that's all fine. But I do think that there should be more than that. And to come up with some really good ways by which we humans can remain relevant. For example by remaining in the loop of decision making is really I think the next big debate that's coming our way.
Ben: I like that, that really does seem to be a theme I see across a lot of different areas. I'm excited to see what you do with that. Well Viktor thank you again for coming on the podcast. I think this was an awesome conversation and excited to continue to follow new books and new ideas that you put out there. Thanks for taking the time.
Viktor: Thank you Ben.
Speaker 3: Masters of Data is brought to you by Sumo Logic. Sumo Logic is a cloud native machine data analytics platform delivering real time continuous intelligence as a service to build, run and secure modern applications. Sumo Logic empowers the people who power modern business. For more information go to SumoLogic.com. For more on Masters of Data go to mastersofdata.com and subscribe and spread the word by rating us on iTunes or your favorite podcast app.