Economics of Digitization: An Agenda
June 6 and 7, 2013
Ajay Agrawal and Nicola Lacetera, University of Toronto and NBER; John Horton, oDesk Research; and Elizabeth Lyons, University of Toronto
Online contract labor globalizes traditionally local labor markets, with platforms that enable employers, most of whom are in high-income countries, to more easily outsource tasks to contractors, primarily located in low-income countries. Agrawal, Lacetera, Horton, and Lyons provide descriptive statistics showing that this market is growing rapidly. They show data from one of the leading platforms where the number of hours worked increased 55% from 2011 to 2012, with the 2012 total wage bill just over $360 million. They outline three lines of inquiry in this market setting that are central to the broader digitization research agenda: 1) How will the digitization of this market influence the distribution of economic activity (the geographic distribution of work, the income distribution, and the distribution of work across firm boundaries)? 2) What are the magnitude and nature of information frictions in these digital market settings, as reflected by user responses to market design features (allocation of visibility, investments in human capital acquisition, machine-aided recommendations)? 3) How will the digitization of this market affect social welfare (for example, through increased efficiency in matching and production)? Drawing upon economic theory as well as evidence from empirical research on online contract labor markets and other related settings, they motivate and contextualize this research agenda.
Michael Baye, Babur De Los Santos, and Matthijs Wildenbeest, Indiana University
Baye, De Los Santos, and Wildenbeest provide a data-driven overview of different platforms where consumers can search for books and booksellers, and show how the use of these platforms has shifted over time. They highlight a number of challenges and open agenda items related to observed data on consumer search, as well as prices of digital and physical books.
Catherine Mann, Brandeis University
As businesses and consumers search, communicate, and transact online, firms gather more and more personal and financial information. On the one hand, all this information can enhance market efficiency and consumer surplus, as firms tailor products to buyers. On the other hand, there is increased risk of information loss, either by accident or through theft. What issues should be on the digital agenda with regard to information loss, and what data are available to underpin both business response and any policy approach? Mann reviews the situation and points out where we need more thought and more data. She looks at: 1) Frameworks for analysis: How should we model the information marketplace, particularly with regard to the benefits and costs of information collection, retention, and aggregation? 2) Quantification and data: What is the evidence on the prevalence and nature of information loss, what are the costs of information loss, and how valuable is this information in the marketplace? 3) Market and policy response: What do we know about the efficacy of market versus other approaches to disciplining market participants, either to avoid loss or remediate after information loss? Throughout, of particular interest is the international dimension of information loss. What issues arise when countries differ in their attitudes and policies toward information acquisition, aggregation, retention, and, importantly, in disclosure of information lost?
Randall Lewis and David Reiley, Google, Inc., and Justin Rao, Microsoft Research
Online advertising offers unprecedented opportunities for measurement. A host of new metrics, clicks being the leading example, have become widespread in advertising science. New data and experimentation platforms open the door for firms and researchers to measure the true causal effects of advertising on a variety of consumer behaviors, such as purchases. Lewis, Rao, and Reiley dissect the new metrics and methods currently used by industry researchers, attacking the question, "How hard is it to reliably measure advertising effectiveness?" They outline the questions that they think can be answered by current data and methods, those that they believe will be in play within five years, and those that they believe could not be answered even with arbitrarily large and detailed data. They pay close attention to the advances in computational advertising that are not only increasing the impact of advertising, but also usefully shifting the focus from "who to hit" to "what do I get."
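The measurement difficulty they emphasize can be seen in a toy randomized experiment: purchase outcomes are rare and heavy-tailed, so even very large samples yield confidence intervals of the same order as a plausible true lift. A minimal simulation sketch, with all parameters invented for illustration and not drawn from the paper:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000                                   # users per arm (hypothetical)

# Purchases are rare (5% buy) and heavy-tailed; the exposed arm gets a
# small constant true lift of $0.01 per user. All numbers are invented.
control = rng.exponential(3.0, n) * (rng.random(n) < 0.05)
exposed = rng.exponential(3.0, n) * (rng.random(n) < 0.05) + 0.01

lift = exposed.mean() - control.mean()        # difference-in-means estimate
se = np.sqrt(exposed.var(ddof=1) / n + control.var(ddof=1) / n)
print(f"lift = {lift:.4f}, 95% CI half-width = {1.96 * se:.4f}")
```

Even with 400,000 users, the confidence half-width is comparable to the true effect, which is the core of the reliability problem the authors dissect.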
Joshua Gans, University of Toronto and NBER, and Hanna Halaburda, Harvard University
With growing digitization has come the emergence of pure digital currencies: currencies that exist only virtually and have no physical component, not even paper. These currencies range from Facebook Credits to BitCoin. In this paper, Gans and Halaburda classify currencies, including digital currencies, as platforms for exchange. Platforms are distinguished by whether other currencies can be exchanged into or out of them. The authors demonstrate that, to the extent that currencies are associated with other platforms (for instance, Facebook activity), the platform sponsor has a weak incentive to allow its currency to be acquired by exchange or by activity, but no incentive to permit its currency to be sold externally for other, extra-platform currencies. Gans and Halaburda conclude by reviewing the regulatory issues posed by fungible currencies (such as BitCoin) and suggest that platform economics may provide a set of tools for future analysis of currencies in general.
Matthew Gentzkow and Jesse Shapiro, University of Chicago and NBER
In a 2011 paper, Gentzkow and Shapiro use individual and aggregate data to evaluate the extent of ideological segregation in the consumption of online news. Using standard metrics of segregation, they find that ideological segregation of online news consumption is low in absolute terms, higher than the segregation of most offline news consumption, and significantly lower than the segregation of face-to-face interactions with neighbors, co-workers, or family members. Here they consider the structure of supply and demand that might give rise to the observed patterns. They present preliminary evidence on the structure of consumer preferences and supplier incentives from some simple structural models of news demand.
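One standard segregation metric of the kind referenced above is an isolation index: the conservative exposure of the average conservative minus that of the average liberal. A minimal sketch of that calculation on hypothetical outlet-level visit counts (the function and all numbers are illustrative, not the authors' data or exact measure):

```python
# Isolation index in the spirit of Gentzkow and Shapiro's segregation
# metrics: conservative exposure of the average conservative reader
# minus that of the average liberal reader. 0 = no segregation, 1 = full.

def isolation_index(visits):
    """visits: list of (conservative_visits, liberal_visits) per outlet."""
    total_cons = sum(c for c, _ in visits)
    total_lib = sum(l for _, l in visits)
    cons_exposure = 0.0   # avg conservative audience share seen by conservatives
    lib_exposure = 0.0    # avg conservative audience share seen by liberals
    for c, l in visits:
        cons_share = c / (c + l)                     # outlet's conservative share
        cons_exposure += (c / total_cons) * cons_share
        lib_exposure += (l / total_lib) * cons_share
    return cons_exposure - lib_exposure

# Two outlets with mildly partisan (hypothetical) audiences:
print(round(isolation_index([(80, 20), (20, 80)]), 3))  # → 0.36
```

With identical audiences everywhere the index is 0; with fully separated audiences it is 1, so a value like 0.36 corresponds to the "low in absolute terms" pattern the authors report for online news.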
Erik Brynjolfsson, MIT and NBER, and Lynn Wu, University of Pennsylvania
Most data sources used in economics, whether from the government or businesses, are typically available only after a substantial lag, at a high level of aggregation, and for variables that were specified and collected in advance. This hampers the effectiveness of real-time predictions. Wu and Brynjolfsson demonstrate how data from search engines like Google provide an accurate but simple way to predict future business activities. Applying their methodology to predict housing market trends, they find that a housing search index is strongly predictive of future housing market sales and prices. The use of search data produces out-of-sample predictions with a smaller mean absolute error than a baseline model that uses conventional data but does not include any search data. Predictions using search terms are 7.1 percent better than the baseline for future home sales and 4.6 percent better for future housing prices. Furthermore, they find that their simple model using search frequencies beats the predictions made by experts from the National Association of Realtors by 23.6 percent for future U.S. home sales. They also demonstrate how these data can be used in other markets, such as laptop sales. In the near future, this type of "nanoeconomic" data can transform prediction in numerous markets, and thus business and consumer decision making.
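The shape of their comparison, a conventional autoregressive baseline versus the same model augmented with a lagged search index, judged by out-of-sample mean absolute error, can be sketched on synthetic data. Everything below (the data-generating process and the plain OLS fit via numpy) is an illustrative stand-in, not the authors' actual specification:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 300
s = np.cumsum(rng.normal(size=T)) * 0.1          # synthetic "search index"
y = np.zeros(T)
for t in range(1, T):                            # sales follow lagged sales and lagged search
    y[t] = 0.5 * y[t - 1] + 0.8 * s[t - 1] + rng.normal(scale=0.2)

def ols_mae(features, target, split):
    """Fit OLS (with intercept) on the first `split` rows; return out-of-sample MAE."""
    X = np.column_stack([np.ones(len(target))] + features)
    beta, *_ = np.linalg.lstsq(X[:split], target[:split], rcond=None)
    return np.mean(np.abs(target[split:] - X[split:] @ beta))

target, lag_y, lag_s = y[1:], y[:-1], s[:-1]
split = 200
mae_base = ols_mae([lag_y], target, split)           # conventional AR baseline
mae_search = ols_mae([lag_y, lag_s], target, split)  # baseline + search index
print(f"baseline MAE {mae_base:.3f} vs search-augmented MAE {mae_search:.3f}")
```

Because the synthetic series is built so that lagged search genuinely drives sales, the augmented model's out-of-sample error comes in below the baseline's, mirroring the direction of the improvement the authors document.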
Timothy Simcoe, Boston University and NBER
Simcoe presents an empirical case study of the Internet architecture from an economic viewpoint. Data collected from the two main Internet standard-setting organizations (the IETF and W3C) demonstrate the modularity of the Internet architecture and the specialized division of labor that produces it. Examining citations to Internet standards provides evidence on the diffusion and commercial applications of the protocols. Simcoe ties these observations together by arguing that modularity helps the Internet (and perhaps digital technology more broadly) avoid long-run decreasing returns by facilitating low-cost adaptation of a shared general-purpose technology to the demands of heterogeneous applications.
Hal Varian, University of California at Berkeley
Varian considers the problem of short-term time series forecasting (nowcasting) when there are more possible predictors than observations. His approach combines three Bayesian techniques: Kalman filtering, spike-and-slab regression, and model averaging. He illustrates this approach using search engine query data as predictors for consumer sentiment and gun sales.
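Of the three techniques, model averaging is the simplest to sketch in isolation: fit candidate regressions and weight their forecasts by approximate posterior model probabilities, here BIC weights over one-predictor models. The Kalman-filter trend and spike-and-slab prior of Varian's full approach are omitted, and all data below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 120, 10
X = rng.normal(size=(n, k))                          # k candidate query-volume predictors
y = 1.5 * X[:, 3] + rng.normal(scale=0.5, size=n)    # only predictor 3 truly matters

def bic(x, y):
    """BIC of a one-predictor OLS model with intercept (2 parameters)."""
    A = np.column_stack([np.ones(len(y)), x])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = np.sum((y - A @ beta) ** 2)
    return len(y) * np.log(rss / len(y)) + 2 * np.log(len(y))

bics = np.array([bic(X[:, j], y) for j in range(k)])
w = np.exp(-0.5 * (bics - bics.min()))               # approximate posterior model weights
w /= w.sum()
print(int(np.argmax(w)))                             # → 3, the truly predictive query
```

In Varian's setting the same weighting idea lets many more candidate search-query predictors than observations compete, with the posterior concentrating on the few that carry signal.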
Joel Waldfogel, University of Minnesota and NBER
Although revenue for recorded music has collapsed since the explosion of file sharing, results elsewhere suggest that the quality of new music has not suffered. One possible explanation is that digitization has allowed a wider range of firms to bring far more music to market using lower-cost methods of production, distribution, and promotion. Record labels have traditionally found it difficult to predict which albums will find commercial success, so many released albums fail while many nascent but unpromoted albums might have been successful. Forces raising the number of products released may allow consumers to discover more appealing choices if they can sift through the offerings. Digitization has promoted both Internet radio and a growing cadre of online music reviewers, providing alternatives to radio airplay as means for new product discovery. To explore this, Waldfogel assembles data on new works of recorded music released between 1980 and 2010, along with data on particular albums' sales, airplay on both traditional and Internet radio, and album reviews at Metacritic since 2000. First, he documents that despite a substantial drop in major-label album releases, the total quantity of new albums released annually has increased sharply since 2000, driven by independent labels and purely digital products. Second, increased product availability has been accompanied by a reduction in the concentration of sales in the top albums. Third, new information channels – Internet radio and online criticism – change the number and kinds of products about which consumers have information. Fourth, in the past dozen years, increasing numbers of albums find commercial success without substantial traditional airplay. Finally, albums from independent labels – which previously might not have made it to market – account for a growing share of commercially successful albums.
Megan MacGarvie, Boston University and NBER, and Petra Moser, Stanford University and NBER
Proponents of stronger copyright argue that longer copyright terms encourage creativity by increasing the profitability of authorship. Empirical evidence, however, is scarce, because data on the profitability of authorship are typically not available to the public. At the current copyright term of the life of the author plus 70 years, further extension may also be unlikely to have significant effects. To investigate the effects of copyright at lower pre-existing levels of protection, MacGarvie and Moser introduce a new data set of publishers’ payments to Romantic Period British authors between 1800 and 1830. These data indicate that payments to authors nearly doubled following an increase in the length of copyright in 1814. These findings suggest that, starting from low pre-existing levels of protection, policies that strengthen copyright terms may, in fact, increase the profitability of authorship.
Tatiana Komarova, London School of Economics; Denis Nekipelov, University of California at Berkeley; and Evgeny Yakovlev, New Economic School
The security of sensitive individual data is a subject of indisputable importance. One of the major threats to sensitive data arises when one can link sensitive information to publicly available data. Komarova, Nekipelov, and Yakovlev demonstrate that even if the sensitive data are never publicly released, the point estimates from an empirical model estimated on the combined public and sensitive data may lead to a disclosure of individual information. Their theory builds on a 2012 paper in which they analyze the individual disclosure that arises from releases of marginal empirical distributions of individual data; the disclosure threat in that case is posed by the possibility of a linkage between the released marginal distributions. Here, they analyze a different type of disclosure: they use the notion of the risk of statistical partial disclosure to measure the threat of inference about sensitive individual attributes from a released empirical model that uses data combined from public and private sources. As their main example, they consider a treatment effect model in which the treatment status of an individual constitutes sensitive information.
Brett Danaher, Wellesley College, and Michael Smith and Rahul Telang, Carnegie Mellon University
Digitization raises a variety of important academic and managerial questions around firm strategies and public policies for the content industries, with many of these questions influenced by the erosion of copyright caused by Internet filesharing. At the same time, digitization has created many new opportunities to empirically analyze these questions by leveraging new data sources and abundant natural experiments in media markets. Danaher, Smith, and Telang describe the open "big picture" questions in this field, and discuss methodological approaches that might be used to leverage the new data and natural experiments available in digital markets. They also offer a specific proof-of-concept research study that analyzes an important academic and managerial question – the impact of legitimate streaming services on the demand for piracy – using a specific natural experiment, namely ABC's decision to add its content to Hulu.com. They find that adding content to Hulu resulted in an economically and statistically significant drop in piracy of that content.
Scott Wallsten, Technology Policy Institute
The Internet has radically transformed the way we live our lives. The net changes in consumer surplus and economic activity, however, are difficult to measure, because some online activities, such as obtaining news, are new ways of doing old activities. In addition, new activities, like social media, have an opportunity cost in terms of the activities they crowd out. Wallsten uses data from the American Time Use Survey from 2003 to 2011 to estimate the crowd-out effects of leisure time spent online. He finds that, on the margin, each minute of online leisure time is correlated with 0.29 fewer minutes spent on all other types of leisure, with about half of that coming from time spent watching TV and video, 0.05 minutes from (offline) socializing, 0.04 minutes from relaxing and thinking, and the balance from time spent at parties, attending cultural events, and listening to the radio. Each minute of online leisure is also correlated with 0.27 fewer minutes working, 0.12 fewer minutes sleeping, 0.10 fewer minutes of travel time, 0.07 fewer minutes in household activities, and 0.06 fewer minutes in educational activities. This evidence suggests that the Internet’s impact on household activity has put it on a path to equal some of the largest transformative changes in history, such as the introduction of broadcast radio and telecommunications.
Susan Athey, Stanford University and NBER, and Scott Stern, MIT and NBER
This paper evaluates the nature, relative incidence, and drivers of software piracy. In contrast to prior studies, Athey and Stern base their study on direct observation of piracy, focusing on a specific product – Windows 7 – which was associated with a significant level of private sector investment. Using anonymized telemetry data, they are able to characterize the ways in which piracy occurs, the relative incidence of piracy across different economic and institutional environments, and the impact of enforcement efforts on choices to install pirated versus paid software. They find that: (a) the vast majority of “retail piracy” can be attributed to a small number of widely distributed "hacks" that are available through the Internet, (b) the incidence of piracy varies significantly with the microeconomic and institutional environment, and (c) software piracy primarily focuses on the most "advanced" version of Windows (Windows Ultimate). After controlling for a small number of measures of institutional quality and broadband infrastructure, one important candidate driver of piracy – GDP per capita – has no significant impact on the observed piracy rate, while strong intellectual property protection in a country significantly reduces piracy, suggesting that intellectual property policy is important even in poor countries. Finally, taking advantage of country-specific enforcement efforts against suppliers of pirated software such as the Pirate Bay, they are able to demonstrate the (heterogeneous) impact of enforcement efforts against software piracy on the choice by users between pirated and paid software.