
Vaccine Hesitancy and the Pandemic: Physician Survey Responses

ICLE Issue Brief

Executive Summary

Vaccines for the SARS-CoV-2 virus saved countless lives and are a modern science miracle. But they had risks, were not as effective as originally claimed, and were effectively forced onto many people in order for them to work, attend school, or travel. There is early evidence that these and other factors may currently be contributing to heightened vaccine hesitancy, with potentially serious consequences for public health.

To better understand the extent and causes of vaccine hesitancy, we surveyed 124 physicians in Montgomery County, Pennsylvania. They reported that most patients (nearly all adults and most children and infants) took the initial COVID-19 vaccines in the winter of 2021. Take-up among adults declined over the subsequent months and was much lower by the second half of 2023 (and almost non-existent among infants). This shift in vaccine uptake was especially prevalent in the more staunchly Republican-voting areas of Montgomery County. It is noteworthy, however, that at-risk populations continue to receive vaccine boosters.

More worryingly, physicians also report an increase in patient distrust for non-COVID vaccines and a more generalized increase in concern about and distrust of public-health advice among their patients. Even some physicians are concerned about governmental public-health advice.

Future research will need to establish whether vaccine uptake has changed and, if so, how it has changed, and what this may mean for public health. If these data are replicated in larger surveys and a trend becomes identifiable, the implications for public health could be serious, indeed.

I. Background

Vaccines are among the most powerful interventions in the field of public health, plausibly saving and improving more lives than anything other than good sanitation and diet. Historically, however, vaccines have typically taken years to develop. When effective vaccines were developed for COVID within a year of the pandemic’s emergence, many were surprised and awed.[1] Remarkably, the makers of the first mRNA vaccines to be granted emergency use authorization by the U.S. Food and Drug Administration (FDA) claimed more than 90% efficacy in trials. This is much higher than the efficacy of vaccines for many other diseases,[2] including influenza, although this level of efficacy was expected to wane over time.[3] For all of these reasons, COVID vaccines were greeted as truly remarkable public-health interventions.

Uptake of COVID vaccines was initially very high.[4] This was likely primarily because of the protection they afforded against a potentially deadly disease. But it was also partly because, for many, vaccination became a requirement for work, travel, and most other social interactions.

But concerns about the vaccines soon arose. While many of these purported risks were plainly false,[5] some—most notably, the risk of myocarditis and pericarditis among the young—were supported by data.[6] It also emerged that the vaccines were far less effective and shorter-lived than originally touted. Moreover, they do not completely prevent disease transmission, although they probably reduce transmission by reducing viral loads.[7] As the virus has mutated over time, it has generally become more transmissible, but less dangerous, which likely has also informed the calculus of those considering whether to take further vaccine boosters.

Despite these concerns, state and federal authorities have recommended and, in many cases, demanded vaccination.[8] These mandates were applied even to those who had just had COVID. This appeared illogical, given that the natural immunity acquired from infection is usually stronger than the immunity conferred by vaccination. This appears to be true of COVID, as well.[9]

U.S. vaccine policy is comprehensive and promotes vaccinations for all ages.[10] It also continues to promote COVID vaccines for everyone over six months old.[11] But skeptics have claimed all sorts of dangers from the vaccines, and many people report that, in their experience, the vaccines really only prevented death among the old, the obese, and those with other co-morbidities. This has resulted in reactive vaccine hesitancy, especially in more Republican-leaning areas. We hypothesize that this is partly because many elected Republicans vocally opposed vaccine mandates and some even voted to prohibit private requirements.[12] Such actions likely influenced opinion in these Republican-voting areas.

A. Aims

The aim of this research is to examine vaccine uptake and patient opinions about vaccines in Montgomery County, Pennsylvania. If, as expected, COVID-vaccine refusal and more general vaccine hesitancy have increased over the past few years, reasons for this will be discussed.

B. Methods

Primary physicians oversee many vaccinations and also address many questions from patients about vaccines, including about efficacy and safety. We undertook a survey of primary physicians in order to obtain information about changes in vaccine uptake, physicians’ interpretations of patients’ opinions about vaccines, and their own opinions about vaccines.

The survey was undertaken in Montgomery County, Pennsylvania, which ranges from the northwestern suburbs of Philadelphia into more rural areas. It is represented by three members of the U.S. House of Representatives: two Democrats (Reps. Madeleine Dean and Mary Gay Scanlon) and one Republican (Rep. Brian Fitzpatrick).

Physicians were surveyed in all three congressional districts. To assess uptake of, and opinions about, vaccines across the political divide, it made sense to find the few Republican-voting areas and compare them to the rest. The two most strongly Republican-voting of the county’s 14 Pennsylvania House of Representatives districts are District 147 and District 131, both of which have 30% more registered Republican voters than Democrats. In many of the other state House districts, there are more than twice as many Democrats as Republicans registered to vote. From within the overall sample, physicians from these two Republican-leaning districts were compared to those from the other 12 state House districts. The tables in the appendix show all relevant political data.

C. Survey

The survey, titled “Vaccine Questionnaire” and republished in full below, was kept short to ensure full participation by physicians. It was undertaken for two weeks starting in mid-January 2024, with responses collected online, over the phone, or in-person.

D. Vaccine Questionnaire

The aim of this short survey is to learn about opinions and uptake of key vaccines in your practice, and to note whether there have been changes in either opinions or uptake over the past few years.

  1. How long have you been at this practice? (Physicians at the practice for less than four years will not be included in the final results.)
  2. How would you describe the initial uptake (2021) of the COVID vaccine for a) adults, b) children, c) infants? (Nearly everyone, most, few, almost none)
  3. How would you describe the uptake of the most recent COVID booster for a) adults, b) children, c) infants? (Nearly everyone, most, few, almost none)
  4. Over the same time period (early 2021 to late 2023), has uptake of other (non-COVID) vaccines changed in a) adults, b) children, c) infants? (Increased, stayed the same, decreased)
  5. Have patient opinions changed over this time period? (COVID vaccines: positive, the same, negative. Other vaccines: positive, the same, negative.)
  6. Please provide any specific comments you recall made by patients.
  7. How have your opinions changed, if at all, about vaccines over the time period?

Thank you for your time.

E. Response Rate

In all, 124 physicians in Montgomery County who had been in their practice for more than four years replied in full to the survey. Rep. Dean’s constituency is the largest, which was reflected in its having the most physicians surveyed (48), compared with 37 in Rep. Fitzpatrick’s district and 39 in Rep. Scanlon’s. Nineteen physicians came from the two most Republican-leaning Pennsylvania House districts.

F. Interpretation of Data

The data we obtained are imprecise, because they are primarily based on the recall (of up to four years) of busy physicians, each of whom deals with dozens of patients. Small differences over time or between districts may well be the result of poor recall or biases due to survey design. Nevertheless, large differences are probably reliable and reflect identifiable trends.

II. COVID Vaccine Uptake

As expected, most physicians reported very high initial uptake of the COVID vaccine among all groups, especially adults. Figure 1 represents the number of responses (Y axis) against time (2021 or 2023) and vaccine-recipient type (adult, child, infant). As the chart demonstrates, most physicians reported nearly every adult receiving a vaccine when first offered (a few may have had medical exemptions to vaccinations).

As noted above, when the mRNA COVID vaccines were first rolled out in late 2020 and early 2021, they were touted as more than 90% effective and as likely (or, at least, able) to reduce transmission. Vaccines were also required for many jobs, for travel, and for other activities. By late 2023, when the latest booster was made available, fewer adults were taking it, as well as far fewer children and almost no infants.

Perhaps this can be explained by a greater appreciation that the vaccines were less effective than originally touted, did not appreciably reduce transmission, were no longer required for jobs or travel, and that their side effects were more widely understood. Additionally, the disease itself changed, becoming more contagious, but less deadly (though the long-term trajectory remains indeterminate).[13] By lowering viral loads, vaccines probably lowered transmission, but many vaccinated individuals still got the disease.[14] Many patients may also regard the side effects of vaccination—such as a sore arm and the possibility of feeling bad for a day or two—as not worthwhile, given that the disease itself appears little worse than a bad cold. Several physicians mentioned this as among the plausible reasons for declining uptake.

FIGURE 1: Montgomery County – COVID Vaccine Uptake

There was very little difference across the three U.S. House districts. While uptake was marginally lower in the Republican Rep. Fitzpatrick’s district, the difference was not statistically significant. In the two most Republican-leaning Pennsylvania House districts, however, there was a notable difference, as can be seen in Figure 2 and Figure 3. Initial uptake was not as great in these two districts, and it is almost non-existent for the most recent booster. This is noteworthy, given that Montgomery County follows Centers for Disease Control and Prevention (CDC) advice that everyone over six months old should receive the latest booster.[15] The vast majority of these districts’ patients are ignoring CDC advice.
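The statistical-significance claims here can be illustrated with a standard two-proportion z-test of the kind commonly used to compare survey response rates between groups. The counts below are invented for illustration only; they are not the survey's actual tallies.

```python
# Hypothetical sketch: comparing the share of physicians in two district
# groups who reported a given level of booster uptake. The numbers are
# invented and do NOT come from the survey described in this brief.
import math

def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Return the z statistic for H0: p_a == p_b, using a pooled standard error."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    return (p_a - p_b) / se

# Example: 30 of 105 physicians in one group vs. 3 of 19 in the other
# reporting "most" adults took the latest booster (invented counts).
z = two_proportion_z(30, 105, 3, 19)
print(round(z, 2))  # |z| > 1.96 would indicate significance at the 5% level
```

With small subgroup samples such as the 19 physicians in the Republican-leaning districts, even sizable observed differences can fail to reach significance, which is why only large gaps in the data are treated as reliable.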

FIGURE 2: Montgomery County (12D) Districts – COVID Vaccine Uptake

FIGURE 3: Montgomery County (2R) Districts – COVID Vaccine Uptake

A. Other Vaccines

COVID is one disease among many. Other diseases that require vaccinations obviously have not disappeared. Question 4 sought to gather information about the uptake of these other vaccines over the same period (early 2021 to late 2023). Here, the data are significant, as demonstrated in Figure 4. Not one physician reported an increase in vaccine uptake for these other diseases. While approximately a third reported no change, fully two-thirds (slightly more in the Republican areas) have seen a decrease in vaccination uptake.

This is a very broad measure, and far more detailed surveys are required to understand exactly which vaccines are being missed—i.e., whether it is the annual (and not particularly effective) flu vaccines, or the far more important and less-frequent (often one-off in childhood) vaccines for diseases such as measles, polio, or tuberculosis. The data below also do not show by how much vaccination rates are falling.

FIGURE 4: Non-COVID Vaccine Change in Uptake, 2021-2023

Nevertheless, that rates are falling is potentially worrying and deserving of attention.

III. Vaccine Opinions

The latter questions in the survey refer to opinions about vaccines and, where quantifiable, how they have changed, as well as specific comments made by patients (and the parents of patients) and physicians about vaccines. These responses do not rise much above anecdotes, but they may provide some insight into patient and physician concerns. These comments could also help to design more detailed surveys in the future.

  1. The vast majority of physicians reported a large decline in support for COVID vaccinations (as reflected in uptake) and a much smaller, but still important, decline in support for all vaccines.
  2. Most physicians report patient concerns about the safety of COVID (and, increasingly, other) vaccines. Patients are uncertain of, but worried about, social-media reports of vaccine harm. Given that social media was the only place that supported the notion that the SARS-CoV-2 virus originated in a lab—and permitted discussion of other theories and concerns, many of which turned out to be true—it is perhaps not surprising that many patients were inclined to worry about reports of vaccine harm that also appeared on social media. These patients were less likely to take the vaccine themselves, but more likely to take it than to let their children do so. This was especially true among the many patients who referred to the “lies” told by health authorities (Anthony Fauci was named repeatedly). A few patients appeared to be very angry about being mandated to take a potentially unsafe vaccine, even if they had recently had the disease.
  3. Patients offered more subtle comments—“nuanced” was the word mentioned by more than one physician—about the scientific illiteracy of health authorities who demanded COVID vaccines even for people who had recently had the disease. This led to a “total” distrust of vaccine policy among some patients, which physicians reported has definitely contributed to lowering flu-vaccine uptake, although one physician reported that “it’s too early to tell for other vaccines.” Some physicians agreed with their patients that the advice was unscientific.
  4. Some physicians also said that their trust in vaccination approval, efficacy, and health authorities’ advice had declined.

A. Discussion

A large NIH survey about vaccination opinions among 737 physicians was undertaken in May 2021, when COVID vaccines were taken in vast numbers.[16] The summary findings were that “10.1% of primary care physicians do not agree that, in general, vaccines are safe, 9.3% do not agree they are effective, and 8.3% do not agree they are important.”  Evidently, the vast majority of physicians accepted their safety, efficacy, and importance, but it is both interesting and relevant that a small minority did not.  One reason reported was that the pharmaceutical industry is not widely trusted and that some vaccinations, such as for flu, are often not that effective.

To ensure rapid distribution of COVID vaccines, pharmaceutical producers were given (temporary) immunity from liability related to vaccine-induced harm, which probably fueled some additional skepticism (a point physicians said a few patients made when refusing COVID vaccines).[17] This is obviously a tricky area, as the manufacturers might not have agreed to sell the vaccines in the United States without such protections.

By mid-2023, federal vaccine mandates as a requirement for federal jobs and international travel had been removed.[18] It is therefore not that surprising that uptake of COVID vaccine boosters collapsed among infants and children, and fell markedly among adults. A few physicians commented that uptake was also close to zero among adults under 40, while being nearly universal among adults over 70 or those with co-morbidities. This likely demonstrates that those most at risk were, indeed, reading the scientific situation correctly and taking the vaccine.

It is important not to overinterpret these results. The data could be the result of faulty recollections by busy physicians. Even if entirely accurate, they may reflect a temporary shift, rather than an actual trend of increased vaccine resistance. But these data are worrying if they are sustained and reflective of trends beyond a single Pennsylvania county.

IV. Conclusion

The development of COVID vaccines was a truly remarkable phenomenon. Within one year of the pandemic’s start, pharmaceutical companies had developed multiple vaccines, while the previous record for the fastest vaccine development (for mumps) was four and a half years.[19] These vaccines saved hundreds of thousands of lives, especially among the old and those with comorbidities who were most at risk from severe COVID.

But the vaccines were oversold: they were not as effective as first touted, did not fully prevent transmission and, like most vaccines, posed some risks. Because they were made mandatory for jobs and travel, people who were disinclined to take them appear to have become more hostile to vaccines in general. The exact reasons for the downturn in COVID vaccination are myriad. Some are due to vaccine failings, some to inappropriate political demands, and some to the disease’s evolution into a more transmissible but less harmful form, which made vaccination less attractive.

Further research should establish whether the results in this survey are replicated over time and in larger groups. More importantly, there is a need to establish whether vaccine hesitancy applies across all vaccines or whether it is limited to COVID and seasonal vaccines with weak efficacy (such as influenza).

V. Appendix

Data from most recent U.S. Census and election for Montgomery County, Pennsylvania.[20]

Data for the two Republican-leaning Pennsylvania House districts.[21]

[1] COVID-19 Vaccines, U.S. Food & Drug Admin. (last visited Feb. 15, 2024).

[2] Kathy Katella, Comparing the COVID-19 Vaccines: How Are They Different?, Yale Medicine (Oct. 5, 2023).

[3] Huong Q. McLean, et al., Interim Estimates of 2022–23 Seasonal Influenza Vaccine Effectiveness — Wisconsin, October 2022–February 2023, Ctr. Disease Control & Prevention (Feb. 24, 2023).

[4] COVID-19 Vaccinations in the United States, Ctr. Disease Control & Prevention (last visited Feb. 16, 2024).

[5] Debunking COVID-19 Myths, Mayo Clinic (Sep. 2, 2021) (last visited Feb. 15, 2024).

[6] Colleen Moriarty, The Link Between Myocarditis and COVID-19 mRNA Vaccines, Yale Medicine (Jun. 24, 2021).

[7] Anouk Oordt-Speets, et al., Effectiveness of COVID-19 Vaccination on Transmission: A Systematic Review, 3(10) COVID 1516–1527 (2023).

[8] Kevin Liptak & Kaitlan Collins, Biden Announces New Vaccine Mandates That Could Cover 100 Million Americans, CNN (Sep. 9, 2021).

[9] Sara Diani, et al., SARS-CoV-2-The Role of Natural Immunity: A Narrative Review, Nat’l Ctr. Biotechnology Info. (Oct. 25, 2022).

[10] Vaccines & Immunizations, U.S. Dept. Health & Human Serv. (last visited Feb. 15, 2024).

[11] COVID-19 Vaccine Effectiveness, Ctr. Disease Control & Prevention (last visited Feb. 15, 2024).

[12] Jonathan Chait, How Vaccine Skeptics Took Over the Republican Party: A Case Study in the Party’s Dysfunction, Intelligencer (Oct. 21, 2022); State Government Policies About Vaccine Requirements (Vaccine Passports), 2021-2022, Ballotpedia (last visited Feb. 15, 2024).

[13] Ádám Kun, et al., Do Pathogens Always Evolve to Be Less Virulent? The Virulence–Transmission Trade-Off in Light of the COVID-19 Pandemic, 74 Biol. Futura 69–80 (2023).

[14] Anouk Oordt-Speets, et al., Effectiveness of COVID-19 Vaccination on Transmission: A Systematic Review, 3(10) COVID 1516–1527 (2023).

[15] CDC Recommends Updated COVID-19 Vaccine for Fall/Winter Virus Season, Ctr. Disease Control & Prevention (Sep. 12, 2023).

[16] Timothy Callaghan, et al., Imperfect Messengers? An Analysis of Vaccine Confidence Among Primary Care Physicians, 40(18) Vaccine 2588–2603 (Apr. 20, 2022).

[17] Shayna Greene, Fact Check: Are Pharmaceutical Companies Immune From COVID-19 Vaccine Lawsuits?, Newsweek (Jan. 19, 2021).

[18] The Biden-Harris Administration Will End COVID-19 Vaccination Requirements for Federal Employees, Contractors, International Travelers, Head Start Educators, and CMS-Certified Facilities, White House (May 1, 2023).

[19] Dave Roos, How a New Vaccine Was Developed in Record Time in the 1960s (Oct. 4, 2023).

[20] ArcGIS (last visited Feb. 15, 2024).

[21] ArcGIS (last visited Feb. 15, 2024).

Innovation & the New Economy

Regulating Producers by Randomizing Consumers

Scholarship Abstract


Consumer law aims to empower consumers with accurate information and to protect them from misinformation. For complex goods and services, however, mandated disclosure laws and other consumer law tools have had only limited success in helping consumers project their satisfaction and costs. Market responses, such as ratings, have limitations that have prevented them from compensating adequately for consumer law’s shortcomings. This Article describes how regulators could improve measurements of quality and cost by borrowing a tool from drug law: randomization. In designated markets, consumers who accept an incentive to volunteer would be randomized among two or more choices, and the government would collect short- and long-term information about each consumer’s experience. For example, in the health insurance market, by providing discounts to consumers who select two or more possible insurers, the government could accomplish the goal of comparing insurers’ health outcomes, free from the current confounding concern that different health insurers serve different pools of insureds. Randomization similarly could help overcome informational problems and abusive conduct in highly regulated markets for health providers, educational institutions, and lawyers, as well as for more ordinary goods and services, such as automobile repair. The information generated through randomization not only could be of direct use to consumers, but also could serve as an input into additional regulation, allowing the government to make more informed decisions about mandating product features where data establishes that producers exploit systematic errors by consumers.

Financial Regulation & Corporate Governance

Modeling Fee Shifting with Computational Game Theory

Scholarship Abstract


While modern mathematical models of settlement bargaining in litigation generally seek to identify perfect Bayesian Nash equilibria, previous computational models have lacked game theoretic foundations. This article illustrates how computational game theory can complement analytical models. It identifies equilibria by applying linear programming techniques to a discretized version of a cutting-edge model of settlement bargaining. This approach makes it straightforward to alter some assumptions in the model, including that the evidence about which the parties receive signals is irrelevant to the merits and that the party with a stronger case on the merits also has better information. The computational model can also toggle easily to explore cases involving liability rather than damages and can incorporate risk aversion. A drawback of the computational model is that bargaining games may have many equilibria, complicating assessments of whether changes in equilibria associated with parameter variations are causal.

Read at SSRN.

Innovation & the New Economy

FCC’s Digital-Discrimination Rules: An Open Invitation to Flood the Field with Schlock

TOTM

A half-dozen lawsuits have been filed to date challenging the digital-discrimination rules recently approved by the Federal Communications Commission (FCC). These cases were consolidated earlier this month and will now be heard by the 8th U.S. Circuit Court of Appeals.

Read the full piece here.

Telecommunications & Regulated Utilities

From Data Myths to Data Reality: What Generative AI Can Tell Us About Competition Policy (and Vice Versa)

Scholarship

I. Introduction

It was once (and frequently) said that Google’s “data monopoly” was unassailable: “If ‘big data’ is the oil of the information economy, Google has Standard Oil-like monopoly dominance — and uses that control to maintain its dominant position.”[1] Similar epithets have been hurled at virtually all large online platforms, including Facebook (Meta), Amazon, and Uber.[2]

While some of these claims continue even today (for example, “big data” is a key component of the U.S. Justice Department’s (“DOJ”) Google Search and AdTech antitrust suits),[3] a shiny new data target has emerged in the form of generative artificial intelligence. The launch of ChatGPT in November 2022, as well as the advent of AI image-generation services like Midjourney and Dall-E, have dramatically expanded people’s conception of what is, and what might be, possible to achieve with generative AI technologies built on massive data sets.

While these services remain in the early stages of mainstream adoption and are in the throes of rapid, unpredictable technological evolution, they nevertheless already appear on the radar of competition policymakers around the world. Several antitrust enforcers appear to believe that, by acting now, they can avoid the “mistakes” that were purportedly made during the formative years of Web 2.0.[4] These mistakes, critics assert, include failing to appreciate the centrality of data in online markets, as well as letting mergers go unchecked and allowing early movers to entrench their market positions.[5] As Lina Khan, Chair of the FTC, put it: “we are still reeling from the concentration that resulted from Web 2.0, and we don’t want to repeat the mis-steps of the past with AI”.[6]

In that sense, the response from the competition-policy world is deeply troubling. Instead of engaging in critical self-assessment and adopting an appropriately restrained stance, the enforcement community appears to be chomping at the bit. Rather than assessing their prior assumptions based on the current technological moment, enforcers’ top priority appears to be figuring out how to deploy existing competition tools rapidly and almost reflexively to address the presumed competitive failures presented by generative AI.[7]

It is increasingly common for competition enforcers to argue that so-called “data network effects” serve not only to entrench incumbents in the markets where that data is collected, but also confer similar, self-reinforcing benefits in adjacent markets. Several enforcers have, for example, prevented large online platforms from acquiring smaller firms in adjacent markets, citing the risk that they could use their vast access to data to extend their dominance into these new markets.[8] They have also launched consultations to ascertain the role that data plays in AI competition. For instance, in an ongoing consultation, the European Commission asks: “What is the role of data and what are its relevant characteristics for the provision of generative AI systems and/or components, including AI models?”[9] Unsurprisingly, the U.S. Federal Trade Commission (“FTC”) has been bullish about the risks posed by incumbents’ access to data. In comments submitted to the U.S. Copyright Office, for example, the FTC argued that:

The rapid development and deployment of AI also poses potential risks to competition. The rising importance of AI to the economy may further lock in the market dominance of large incumbent technology firms. These powerful, vertically integrated incumbents control many of the inputs necessary for the effective development and deployment of AI tools, including cloud-based or local computing power and access to large stores of training data. These dominant technology companies may have the incentive to use their control over these inputs to unlawfully entrench their market positions in AI and related markets, including digital content markets.[10]

Against this backdrop, it stands to reason that the largest online platforms—including Alphabet, Meta, Apple, and Amazon—should have a meaningful advantage in the burgeoning markets for generative AI services. After all, it is widely recognized that data is an essential input for generative AI.[11] This competitive advantage should be all the more significant given that these firms have been at the forefront of AI technology for more than a decade. Over this period, Google’s DeepMind and its AlphaGo system, as well as Meta’s AI research, have routinely made headlines.[12] Apple and Amazon also have vast experience with AI assistants, and all of these firms use AI technology throughout their platforms.[13]

Contrary to what one might expect, however, the tech giants have, to date, been unable to leverage their vast data troves to outcompete startups like OpenAI and Midjourney. At the time of writing, OpenAI’s ChatGPT appears to be, by far, the most successful chatbot,[14] despite the fact that large tech platforms arguably have access to far more (and more up-to-date) data.

This article suggests there are important lessons to be learned from the current technological moment, if only enforcers would stop to reflect. The meteoric rise of consumer-facing AI services should offer competition enforcers and policymakers an opportunity for introspection. As we explain, the rapid emergence of generative AI technology may undercut many core assumptions of today’s competition-policy debates — the rueful after-effects of the purported failure of 20th-century antitrust to address the allegedly manifest harms of 21st-century technology. These include the notions that data advantages constitute barriers to entry and can be leveraged to project dominance into adjacent markets; that scale itself is a market failure to be addressed by enforcers; and that the use of consumer data is inherently harmful to those consumers.

II. Data Network Effects Theory and Enforcement

Proponents of tougher interventions by competition enforcers into digital markets often cite data network effects as a source of competitive advantage and barrier to entry (though terms like “economies of scale and scope” may offer more precision).[15] The crux of the argument is that “the collection and use of data creates a feedback loop of more data, which ultimately insulates incumbent platforms from entrants who, but for their data disadvantage, might offer a better product.”[16] This self-reinforcing cycle purportedly leads to market domination by a single firm. Thus, for Google, for example, it is argued that its “ever-expanding control of user personal data, and that data’s critical value to online advertisers, creates an insurmountable barrier to entry for new competition.”[17]

Right off the bat, it is important to note the conceptual problem with these claims. Because data is used to improve the quality of products and/or to subsidize their use, the idea of data as an entry barrier suggests that any product improvement or price reduction made by an incumbent could be a problematic entry barrier to any new entrant. This is tantamount to an argument that competition itself is a cognizable barrier to entry. Of course, it would be a curious approach to antitrust if this were treated as a problem, as it would imply that firms should under-compete (that is, forego consumer-welfare enhancements) in order to bring about a greater number of firms in a given market simply for its own sake.[18]

Meanwhile, actual economic studies of data network effects are few and far between, with scant empirical evidence to support the theory.[19] Andrei Hagiu and Julian Wright’s theoretical paper offers perhaps the most comprehensive treatment of the topic.[20] The authors ultimately conclude that data network effects can be of different magnitudes and have varying effects on firms’ incumbency advantage.[21] They cite Grammarly (an AI writing-assistance tool) as a potential example: “As users make corrections to the suggestions offered by Grammarly, its language experts and artificial intelligence can use this feedback to continue to improve its future recommendations for all users.”[22]

This is echoed by other economists who contend that “[t]he algorithmic analysis of user data and information might increase incumbency advantages, creating lock-in effects among users and making them more reluctant to join an entrant platform.”[23]

Crucially, some scholars take this logic a step further, arguing that platforms may use data from their “origin markets” in order to enter and dominate adjacent ones:

First, as we already mentioned, data collected in the origin market can be used, once the enveloper has entered the target market, to provide products more efficiently in the target market. Second, data collected in the origin market can be used to reduce the asymmetric information to which an entrant is typically subject when deciding to invest (for example, in R&D) to enter a new market. For instance, a search engine could be able to predict new trends from consumer searches and therefore face less uncertainty in product design.[24]

This possibility is also implicit in the paper by Hagiu and Wright.[25] Indeed, the authors’ theoretical model rests on an important distinction between within-user data advantages (that is, having access to more data about a given user) and across-user data advantages (information gleaned from having access to a wider user base). In both cases, there is an implicit assumption that platforms may use data from one service to gain an advantage in another market (because what matters is information about aggregate or individual user preferences, regardless of its origin).

Our review of the economic evidence suggests that several scholars have, with varying degrees of certainty, raised the possibility that incumbents may leverage data advantages to stifle competitors in their primary market or adjacent ones (be it via merger or organic growth). As we explain below, however, there is ultimately little evidence to support such claims.

Policymakers, however, have largely been receptive to these limited theoretical findings, basing multiple decisions on them, often with little consideration of the caveats that accompany them.[26] Indeed, it is remarkable that the Furman Report’s section on “[t]he data advantage for incumbents” cites only two empirical economic studies, and that those studies reach directly contradictory conclusions about the strength of data advantages.[27] Nevertheless, the Furman Report concludes that data “may confer a form of unmatchable advantage on the incumbent business, making successful rivalry less likely,”[28] and adopts without reservation “convincing” evidence from non-economists that apparently has no empirical basis.[29]

In the Google/Fitbit merger proceedings, the European Commission found that the combination of data from Google services with that of Fitbit devices would reduce competition in advertising markets:

Giving [sic] the large amount of data already used for advertising purposes that Google holds, the increase in Google’s data collection capabilities, which goes beyond the mere number of active users for which Fitbit has been collecting data so far, the Transaction is likely to have a negative impact on the development of an unfettered competition in the markets for online advertising.[30]

As a result, the Commission cleared the merger on the condition that Google refrain from using data from Fitbit devices for its advertising platform.[31] The Commission will likely focus on similar issues during its ongoing investigation into Microsoft’s investment in OpenAI.[32]

Along similar lines, the FTC’s complaint to enjoin Meta’s purchase of a virtual-reality (VR) fitness app called “Within” relied, among other things, on the fact that Meta could leverage its data about VR-user behavior to inform its decisions and potentially outcompete rival VR-fitness apps: “Meta’s control over the Quest platform also gives it unique access to VR user data, which it uses to inform strategic decisions.”[33]

The U.S. Department of Justice’s twin cases against Google also raise data leveraging and data barriers to entry. The agency’s AdTech complaint alleges that “Google intentionally exploited its massive trove of user data to further entrench its monopoly across the digital advertising industry.”[34] Similarly, in its Search complaint, the agency argues that:

Google’s anticompetitive practices are especially pernicious because they deny rivals scale to compete effectively. General search services, search advertising, and general search text advertising require complex algorithms that are constantly learning which organic results and ads best respond to user queries; the volume, variety, and velocity of data accelerates the automated learning of search and search advertising algorithms.[35]

Finally, the merger guidelines published by several competition enforcers cite the acquisition of data as a potential source of competitive concerns. For instance, the FTC and DOJ’s newly published guidelines state that “acquiring data that helps facilitate matching, sorting, or prediction services may enable the platform to weaken rival platforms by denying them that data.”[36] Likewise, the UK Competition and Markets Authority (“CMA”) warns against incumbents acquiring firms in order to obtain their data and foreclose other rivals:

Incentive to foreclose rivals…

7.19(e) Particularly in complex and dynamic markets, firms may not focus on short term margins but may pursue other objectives to maximise their long-run profitability, which the CMA may consider. This may include… obtaining access to customer data….[37]

In short, competition authorities around the globe are taking an aggressive stance on data network effects. Among the ways this has manifested is in basing enforcement decisions on fears that data collected by one platform might confer a decisive competitive advantage in adjacent markets. Unfortunately, these concerns rest on little to no empirical evidence, either in the economic literature or the underlying case records.

III. Data Incumbency Advantages in Generative AI Markets

Given the assertions canvassed in the previous section, it seems reasonable to assume that firms such as Google, Meta, and Amazon would be in pole position to dominate the burgeoning market for generative AI. After all, these firms have not only been at the forefront of the field for the better part of a decade, but they also have access to vast troves of data, the likes of which their rivals could only dream of when launching their own services. Thus, the authors of the Furman Report caution that “to the degree that the next technological revolution centres around artificial intelligence and machine learning, then the companies most able to take advantage of it may well be the existing large companies because of the importance of data for the successful use of these tools.”[38]

At the time of writing, however, this is not how things have unfolded — although it bears noting these markets remain in flux and the competitive landscape is susceptible to change. The first significantly successful generative AI service was arguably not from either Meta—which had been working on chatbots for years and had access to, arguably, the world’s largest database of actual chats—or Google. Instead, the breakthrough came from a then relatively unknown firm called OpenAI.

OpenAI’s ChatGPT service currently holds an estimated 60% of the market (though reliable numbers are somewhat elusive).[39] It broke the record for the fastest online service to reach 100 million users (in only a couple of months), more than four times faster than the previous record holder, TikTok.[40] Based on Google Trends data, ChatGPT is nine times more popular than Google’s own Bard service worldwide, and 14 times more popular in the U.S.[41] In April 2023, ChatGPT reportedly registered 206.7 million unique visitors, compared to 19.5 million for Google’s Bard.[42] In short, at the time of writing, ChatGPT appears to be the most popular chatbot. And, so far, the entry of large players such as Google Bard or Meta AI appears to have had little effect on its market position.[43]

The picture is similar in the field of AI image generation. As of August 2023, Midjourney, Dall-E, and Stable Diffusion appear to be the three market leaders in terms of user visits.[44] This is despite competition from the likes of Google and Meta, who arguably have access to unparalleled image and video databases by virtue of their primary platform activities.[45]

This raises several crucial questions: how have these AI upstarts managed to be so successful, and is their success just a flash in the pan before Web 2.0 giants catch up and overthrow them? While we cannot answer either of these questions dispositively, some observations concerning the role and value of data in digital markets would appear to be relevant.

A first important observation is that empirical studies suggest data exhibits diminishing marginal returns. In other words, past a certain point, acquiring more data does not confer a meaningful edge to the acquiring firm. As Catherine Tucker puts it, following a review of the literature: “Empirically there is little evidence of economies of scale and scope in digital data in the instances where one would expect to find them.”[46]

Likewise, following a survey of the empirical literature on this topic, Geoffrey Manne & Dirk Auer conclude that:

Available evidence suggests that claims of “extreme” returns to scale in the tech sector are greatly overblown. Not only are the largest expenditures of digital platforms unlikely to become proportionally less important as output increases, but empirical research strongly suggests that even data does not give rise to increasing returns to scale, despite routinely being cited as the source of this effect.[47]

In other words, being the firm with the most data appears to be far less important than having enough data, and that lower bar may be within reach of far more firms than one might initially think.
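The intuition behind these diminishing returns can be conveyed with a toy statistical sketch of our own (it is an illustration, not drawn from the studies cited above): when estimating even a simple quantity from samples, the accuracy gained by doubling a dataset shrinks rapidly as the dataset grows.

```python
import random

def estimation_error(n: int, trials: int = 300, p: float = 0.6) -> float:
    """Mean absolute error when estimating a coin's bias p from n samples.

    Toy stand-in for 'model quality as a function of dataset size':
    error falls roughly like 1/sqrt(n), so each doubling helps less.
    """
    rng = random.Random(42)  # fixed seed: deterministic illustration
    total = 0.0
    for _ in range(trials):
        heads = sum(rng.random() < p for _ in range(n))
        total += abs(heads / n - p)
    return total / trials

# Marginal value of doubling the dataset, at small vs. large scale.
for n in (100, 1000):
    gain = estimation_error(n) - estimation_error(2 * n)
    print(f"n={n:>5}: doubling the data cuts error by {gain:.4f}")
```

Under this (admittedly stylized) logic, the gap between a firm with “enough” data and one with vastly more is far smaller than the gap between a firm with very little data and one with enough.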

And obtaining enough data could become even easier — that is, the volume of required data could become even smaller — with technological progress. For instance, synthetic data may provide an adequate substitute for real-world data[48] — or may even outperform real-world data.[49] As Thibault Schrepel and Alex Pentland point out, “advances in computer science and analytics are making the amount of data less relevant every day. In recent months, important technological advances have allowed companies with small data sets to compete with larger ones.”[50]

Indeed, past a certain threshold, acquiring more data might not meaningfully improve a service, whereas other improvements (such as better training methods or data curation) could have a large effect. In fact, there is some evidence that excessive data impedes a service’s ability to generate results appropriate for a given query: “[S]uperior model performance can often be achieved with smaller, high-quality datasets than massive, uncurated ones. Data curation ensures that training datasets are devoid of noise, irrelevant instances, and duplications, thus maximizing the efficiency of every training iteration.”[51]
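The kind of curation the quoted passage describes can be as simple as a few filtering passes over a raw corpus. The sketch below is our own hypothetical illustration (the function name and thresholds are invented, not taken from any cited training pipeline): it drops duplicates, near-empty documents, and markup-heavy noise, leaving a smaller but higher-quality dataset.

```python
def curate(corpus: list[str], min_len: int = 20, min_alpha_ratio: float = 0.8) -> list[str]:
    """Toy curation pass: keep only documents that are long enough,
    mostly natural language, and not duplicates of one another."""
    seen: set[str] = set()
    curated: list[str] = []
    for doc in corpus:
        text = doc.strip()
        if len(text) < min_len:
            continue  # too short to carry training signal
        alpha = sum(ch.isalpha() or ch.isspace() for ch in text) / len(text)
        if alpha < min_alpha_ratio:
            continue  # mostly markup, tables, or encoding noise
        key = " ".join(text.lower().split())  # normalize whitespace for dedup
        if key in seen:
            continue  # duplicates waste training iterations
        seen.add(key)
        curated.append(text)
    return curated

raw = [
    "A basketball is an inflated sphere used in the sport of basketball.",
    "A basketball is an inflated sphere used in the sport of basketball.",  # duplicate
    "<div><span>ad_unit_304x228</span></div>",  # markup noise
    "ok",  # too short
]
print(curate(raw))  # only the first document survives
```

The point of the sketch is that the value added here comes from the filtering decisions, not from the raw size of the input corpus — a smaller curated set can beat a larger uncurated one.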

Consider, for instance, a user who wants to generate an image of a basketball. A model trained indiscriminately on a vast number of public photos in which basketballs appear amid copious other image data may return an inordinately noisy result. By contrast, a model trained with a better method on fewer, more carefully selected images could readily yield far superior results.[52] In one important example,

[t]he model’s performance is particularly remarkable, given its small size. “This is not a large language model trained on the whole Internet; this is a relatively small transformer trained for these tasks,” says Armando Solar-Lezama, a computer scientist at the Massachusetts Institute of Technology, who was not involved in the new study…. The finding implies that instead of just shoving ever more training data into machine-learning models, a complementary strategy might be to offer AI algorithms the equivalent of a focused linguistics or algebra class.[53]

Current efforts are thus focused on improving the mathematical and logical reasoning of large language models (“LLMs”), rather than on maximizing the size of training datasets.[54] Two points stand out. The first is that firms like OpenAI rely largely on publicly available datasets — such as GSM8K — to train their LLMs.[55] Second, the real challenge in creating cutting-edge AI lies not so much in collecting data as in devising innovative AI training processes and architectures:

[B]uilding a truly general reasoning engine will require a more fundamental architectural innovation. What’s needed is a way for language models to learn new abstractions that go beyond their training data and have these evolving abstractions influence the model’s choices as it explores the space of possible solutions.

We know this is possible because the human brain does it. But it might be a while before OpenAI, DeepMind, or anyone else figures out how to do it in silicon.[56]

Furthermore, it is worth noting that the data most relevant to startups operating in a given market may not be those data held by large incumbent platforms in other markets, but rather data specific to the market in which the startup is active or, even better, to the given problem it is attempting to solve:

As Andres Lerner has argued, if you wanted to start a travel business, the data from Kayak or Priceline would be far more relevant. Or if you wanted to start a ride-sharing business, data from cab companies would be more useful than the broad, market-cross-cutting profiles Google and Facebook have. Consider companies like Uber, Lyft and Sidecar that had no customer data when they began to challenge established cab companies that did possess such data. If data were really so significant, they could never have competed successfully. But Uber, Lyft and Sidecar have been able to effectively compete because they built products that users wanted to use — they came up with an idea for a better mousetrap. The data they have accrued came after they innovated, entered the market and mounted their successful challenges — not before.[57]

The bottom line is that data is not the be-all and end-all that many in competition circles rather casually make it out to be.[58] While data may often confer marginal benefits, there is little sense these are ultimately decisive.[59] As a result, incumbent platforms’ access to vast numbers of users and data in their primary markets might only marginally affect their AI competitiveness.

A related observation is that firms’ capabilities and other features of their products arguably play a more important role than the data they own.[60] Examples of this abound in digital markets. Google overthrew Yahoo, despite initially having access to far fewer users and far less data; Google and Apple overcame Microsoft in the smartphone OS market despite having comparatively tiny ecosystems (at the time) to leverage; and TikTok rose to prominence despite intense competition from incumbents like Instagram, which had much larger user bases. In each of these cases, important product-design decisions (such as the PageRank algorithm, recognizing the specific needs of mobile users,[61] and TikTok’s clever algorithm) appear to have played a far greater role than initial user and data endowments (or lack thereof).

All of this suggests that the early success of OpenAI likely has more to do with its engineering decisions than with the data it did (or did not) own. And going forward, the ability of OpenAI and its rivals to offer and monetize compelling marketplaces for custom versions of their generative AI technology will arguably play a much larger role than (and contribute to) their ownership of data.[62] In other words, the ultimate challenge is arguably to create a valuable platform, of which data ownership is a consequence, but not a cause.

It is also important to note that, in those instances where it is valuable, data does not just fall from the sky. Rather, it is through smart business and engineering decisions that firms generate valuable information (which does not necessarily correlate with owning more data).

For instance, OpenAI’s success with ChatGPT is often attributed to its more efficient algorithms and training models, which arguably have enabled the service to improve more rapidly than its rivals.[63] Likewise, the ability of firms like Meta and Google to generate valuable data for advertising arguably depends more on design decisions that elicit the right data from users, rather than the raw number of users in their networks.

Put differently, setting up a business so as to generate the right information is more important than simply owning vast troves of data.[64] Even in those instances where high-quality data is an essential parameter of competition, it does not follow that having vaster databases or more users on a platform necessarily leads to better information for the platform.

Given the foregoing, it seems clear that the early success of OpenAI and other generative AI startups, as well as their chances of prevailing in the future, hinges on a far broader range of factors than the mere ownership of data. Indeed, if data ownership consistently conferred a significant competitive advantage, these new firms would not be where they are today. This does not mean that data is worthless, of course. Rather, it means that competition authorities should not assume that merely possessing data is a dispositive competitive advantage, absent compelling empirical evidence to support such a finding. In this light, the current wave of decisions and competition-policy pronouncements that rely on data-related theories of harm is premature.

IV. Five Key Takeaways: Reconceptualizing the Role of Data in Generative AI Competition

As we explain above, data (network effects) are not the source of barriers to entry that they are sometimes made out to be; rather, the picture is far more nuanced. Indeed, as economist Andres Lerner demonstrated almost a decade ago (and the assessment is only truer today):

Although the collection of user data is generally valuable for online providers, the conclusion that such benefits of user data lead to significant returns to scale and to the entrenchment of dominant online platforms is based on unsupported assumptions. Although, in theory, control of an “essential” input can lead to the exclusion of rivals, a careful analysis of real-world evidence indicates that such concerns are unwarranted for many online businesses that have been the focus of the “big data” debate.[65]

While data can be an important part of the competitive landscape, incumbent data advantages are far less pronounced than today’s policymakers commonly assume. In that respect, five main lessons emerge:

  1. Data can be (very) valuable, but past a certain threshold, the benefits tend to diminish. In other words, having the most data is less important than having enough;
  2. The ability to generate valuable information does not depend on the number of users or the amount of data a platform has previously acquired;
  3. The most important datasets are not always proprietary;
  4. Technological advances and platforms’ engineering decisions affect their ability to generate valuable information, and this effect swamps the effect of the amount of data they own; and
  5. How platforms use data is arguably more important than what data or how much data they own.

These lessons have important ramifications for competition-policy debates over the competitive implications of data in technologically evolving areas.

First, it is not surprising that startups, rather than incumbents, have taken an early lead in generative AI (and in Web 2.0 before it). After all, if data-incumbency advantages are small or even nonexistent, then smaller and more nimble players may have an edge over established tech platforms. This is all the more likely given that, despite significant efforts, the biggest tech platforms were unable to offer compelling generative AI chatbots and image-generation services before the emergence of ChatGPT, Dall-E, Midjourney, etc. This failure suggests that, in a process akin to Christensen’s Innovator’s Dilemma,[66] something about their existing services and capabilities was holding them back in those markets. Of course, this does not necessarily mean that those same services/capabilities could not become an advantage when the generative AI market starts addressing issues of monetization and scale.[67] But it does mean that assumptions of a firm’s market power based on its possession of data are off the mark.

Another important implication is that, paradoxically, policymakers’ efforts to prevent Web 2.0 platforms from competing freely in generative AI markets may ultimately backfire and lead to less, not more, competition. Indeed, OpenAI is currently acquiring a sizeable lead in generative AI. While competition authorities might like to think that other startups will emerge and thrive in this space, it is important not to confuse desires with reality. For, while there is a vibrant AI-startup ecosystem, there is at least a case to be made that the most significant competition for today’s AI leaders will come from incumbent Web 2.0 platforms — although nothing is certain at this stage. Policymakers should take care not to stifle that competition on the misguided assumption that competitive pressure from large incumbents is somehow less valuable to consumers than that which originates from smaller firms.

Finally, even if there were a competition-related market failure to be addressed (which is anything but clear) in the field of generative AI, it is unclear that the contemplated remedies would do more good than harm. Some of the solutions that have been put forward have highly ambiguous effects on consumer welfare. Scholars have shown that mandated data sharing — a solution championed by EU policymakers, among others — may sometimes dampen competition in generative AI markets.[68] This is also true of legislation like the GDPR that makes it harder for firms to acquire more data about consumers — assuming such data is, indeed, useful to generative AI services.[69]

In sum, it is a flawed understanding of the economics and practical consequences of large agglomerations of data that leads competition authorities to believe that data-incumbency advantages are likely to harm competition in generative AI markets — or even in the data-intensive Web 2.0 markets that preceded them. Indeed, competition or regulatory intervention to “correct” data barriers and data network and scale effects is liable to do more harm than good.

[1] Nathan Newman, Taking on Google’s Monopoly Means Regulating Its Control of User Data, Huffington Post (Sep. 24, 2013),

[2] See e.g. Lina Khan & K. Sabeel Rahman, Restoring Competition in the U.S. Economy, in Untamed: How to Check Corporate, Financial, and Monopoly Power (Nell Abernathy, Mike Konczal, & Kathryn Milani, eds., 2016), at 23 (“From Amazon to Google to Uber, there is a new form of economic power on display, distinct from conventional monopolies and oligopolies…, leverag[ing] data, algorithms, and internet-based technologies… in ways that could operate invisibly and anticompetitively.”); Mark Weinstein, I Changed My Mind — Facebook Is a Monopoly, Wall St. J. (Oct. 1, 2021), (“[T]he glue that holds it all together is Facebook’s monopoly over data…. Facebook’s data troves give it unrivaled knowledge about people, governments — and its competitors.”).

[3] See generally Abigail Slater, Why “Big Data” Is a Big Deal, The Reg. Rev. (Nov. 6, 2023),; Amended Complaint at ¶36, United States v. Google, 1:20-cv-03010- (D.D.C. 2020); Complaint at ¶37, United States v. Google, 1:23-cv-00108 (E.D. Va. 2023), (“Google intentionally exploited its massive trove of user data to further entrench its monopoly across the digital advertising industry.”).

[4] See e.g. Press Release, European Commission, Commission Launches Calls for Contributions on Competition in Virtual Worlds and Generative AI (Jan. 9, 2024),; Krysten Crawford, FTC’s Lina Khan warns Big Tech over AI, SIEPR (Nov. 3, 2020), (“Federal Trade Commission Chair Lina Khan delivered a sharp warning to the technology industry in a speech at Stanford on Thursday: Antitrust enforcers are watching what you do in the race to profit from artificial intelligence.”) (emphasis added).

[5] See e.g. John M. Newman, Antitrust in Digital Markets, 72 Vand. L. Rev. 1497, 1501 (2019) (“[T]he status quo has frequently failed in this vital area, and it continues to do so with alarming regularity. The laissez-faire approach advocated for by scholars and adopted by courts and enforcers has allowed potentially massive harms to go unchecked.”);
Bertin Martins, Are New EU Data Market Regulations Coherent and Efficient?, Bruegel Working Paper 21/23 (2023), available at (“Technical restrictions on access to and re-use of data may result in failures in data markets and data-driven services markets.”); Valéria Faure-Muntian, Competitive Dysfunction: Why Competition Law Is Failing in a Digital World, The Forum Network (Feb. 24, 2021),

[6] Rana Foroohar, The Great US-Europe Antitrust Divide, FT (Feb. 5, 2024),

[7] See e.g. Press Release, European Commission, supra note 4.

[8] See infra, Section II. Commentators have also made similar claims. See, e.g., Ganesh Sitaram & Tejas N. Narechania, It’s Time for the Government to Regulate AI. Here’s How, Politico (Jan. 15, 2024) (“All that cloud computing power is used to train foundation models by having them “learn” from incomprehensibly huge quantities of data. Unsurprisingly, the entities that own these massive computing resources are also the companies that dominate model development. Google has Bard, Meta has LLaMa. Amazon recently invested $4 billion into one of OpenAI’s leading competitors, Anthropic. And Microsoft has a 49 percent ownership stake in OpenAI — giving it extraordinary influence, as the recent board struggles over Sam Altman’s role as CEO showed.”).

[9] Press Release, European Commission, supra note 4.

[10] Comment of U.S. Federal Trade Commission to the U.S. Copyright Office, Artificial Intelligence and Copyright, Docket No. 2023-6 (Oct. 30, 2023) at 4, available at (emphasis added).

[11] See, e.g. Joe Caserta, Holger Harreis, Kayvaun Rowshankish, Nikhil Srinidhi, and Asin Tavakoli, The data dividend: Fueling generative AI, McKinsey Digital (Sept. 15, 2023), (“Your data and its underlying foundations are the determining factors to what’s possible with generative AI.”).

[12] See e.g. Tim Keary, Google DeepMind’s Achievements and Breakthroughs in AI Research, Techopedia (Aug. 11, 2023),; see e.g. Will Douglas Heaven, Google DeepMind used a large language model to solve an unsolved math problem, MIT Technology Review (Dec. 14, 2023),; see also A Decade of Advancing the State-of-the-Art in AI Through Open Research, Meta (Nov. 30, 2023),; see also 200 languages within a single AI model: A breakthrough in high-quality machine translation, Meta, (last visited Jan. 18, 2023).

[13] See e.g. Jennifer Allen, 10 years of Siri: the history of Apple’s voice assistant, Tech Radar (Oct. 4, 2021),; see also Evan Selleck, How Apple is already using machine learning and AI in iOS, Apple Insider (Nov. 20, 2023),; see also Kathleen Walch, The Twenty Year History Of AI At Amazon, Forbes (July 19, 2019),

[14] See infra Section III.

[15] See e.g. Cédric Argenton & Jens Prüfer, Search Engine Competition with Network Externalities, 8 J. Comp. L. & Econ. 73, 74 (2012); Mark A. Lemley & Matthew Wansley, Coopting Disruption (February 1, 2024),

[16] John M. Yun, The Role of Big Data in Antitrust, in The Global Antitrust Institute Report on the Digital Economy (Joshua D. Wright & Douglas H. Ginsburg, eds., Nov. 11, 2020) at 233, available at See also e.g. Robert Wayne Gregory, Ola Henfridsson, Evgeny Kaganer, & Harris Kyriakou, The Role of Artificial Intelligence and Data Network Effects for Creating User Value, 46 Acad. of Mgmt. Rev. 534 (2020), final pre-print version at 4, available at (“A platform exhibits data network effects if, the more that the platform learns from the data it collects on users, the more valuable the platform becomes to each user.”). See also Karl Schmedders, José Parra-Moyano & Michael Wade, Why Data Aggregation Laws Could be the Answer to Big Tech Dominance, Silicon Republic (Feb. 6, 2024),

[17] Nathan Newman, Search, Antitrust, and the Economics of the Control of User Data, 31 Yale J. Reg. 401, 409 (2014) (emphasis added). See also id. at 420 & 423 (“While there are a number of network effects that come into play with Google, [“its intimate knowledge of its users contained in its vast databases of user personal data”] is likely the most important one in terms of entrenching the company’s monopoly in search advertising…. Google’s overwhelming control of user data… might make its dominance nearly unchallengeable.”).

[18] See also Yun, supra note 16, at 229 (“[I]nvestments in big data can create competitive distance between a firm and its rivals, including potential entrants, but this distance is the result of a competitive desire to improve one’s product.”).

[19] For a review of the literature on increasing returns to scale in data (this topic is broader than data network effects) see Geoffrey Manne & Dirk Auer, Antitrust Dystopia and Antitrust Nostalgia: Alarmist Theories of Harm in Digital Markets and Their Origins, 28 Geo Mason L. Rev. 1281, 1344 (2021).

[20] Andrei Hagiu & Julian Wright, Data-Enabled Learning, Network Effects, and Competitive Advantage, 54 RAND J. Econ. 638 (2023) (final preprint available at

[21] Id. at 2. The authors conclude that “Data-enabled learning would seem to give incumbent firms a competitive advantage. But how strong is this advantage and how does it differ from that obtained from more traditional mechanisms….”

[22] Id.

[23] Bruno Jullien & Wilfried Sand-Zantman, The Economics of Platforms: A Theory Guide for Competition Policy, 54 Info. Econ. & Pol’y 10080, 101031 (2021).

[24] Daniele Condorelli & Jorge Padilla, Harnessing Platform Envelopment in the Digital World, 16 J. Comp. L. & Pol’y 143, 167 (2020).

[25] See Hagiu & Wright, supra note 20.

[26] For a summary of these limitations, see generally Catherine Tucker, Network Effects and Market Power: What Have We Learned in the Last Decade?, Antitrust (Spring 2018) at 72, available at See also Manne & Auer, supra note 19, at 1330.

[27] See Jason Furman, Diane Coyle, Amelia Fletcher, Derek McAuley & Philip Marsden (Dig. Competition Expert Panel), Unlocking Digital Competition (2019) at 32-35 (“Furman Report”), available at

[28] Id. at 34.

[29] Id. at 35. To its credit, it should be noted, the Furman Report does counsel caution before mandating access to data as a remedy to promote competition. See id. at 75. That said, the Furman Report does maintain that such a remedy should certainly be on the table because “the evidence suggests that large data holdings are at the heart of the potential for some platform markets to be dominated by single players and for that dominance to be entrenched in a way that lessens the potential for competition for the market.” Id. In fact, the evidence does not show this.

[30] Case COMP/M.9660 — Google/Fitbit, Commission Decision (Dec. 17, 2020) (Summary at O.J. (C 194) 7), at 455.

[31] Id. at 896.

[32] See Natasha Lomas, EU Checking if Microsoft’s OpenAI Investment Falls Under Merger Rules, TechCrunch (Jan. 9, 2024),

[33] Amended Complaint at 11, Meta/Zuckerberg/Within, Fed. Trade Comm’n (2022) (No. 605837), available at

[34] Amended Complaint (D.D.C.), supra note 4, at ¶37.

[35] Amended Complaint (E.D. Va.), supra note 4, at ¶8.

[36] U.S. Dep’t of Justice & Fed. Trade Comm’n, Merger Guidelines (2023) at 25,

[37] Competition and Mkts. Auth., Merger Assessment Guidelines (2021) at ¶7.19(e).

[38] Furman Report, supra note 28, at ¶4.

[39] See e.g. Chris Westfall, New Research Shows ChatGPT Reigns Supreme in AI Tool Sector, Forbes (Nov. 16, 2023),

[40] See Krystal Hu, ChatGPT Sets Record for Fastest-Growing User Base, Reuters (Feb. 2, 2023); Google: The AI Race Is On, App Economy Insights (Feb. 7, 2023),

[41] See Google Trends (last visited Jan. 12, 2024).

[42] See David F. Carr, As ChatGPT Growth Flattened in May, Google Bard Rose 187%, Similarweb Blog (June 5, 2023),

[43] See Press Release, Meta, Introducing New AI Experiences Across Our Family of Apps and Devices (Sept. 27, 2023); Sundar Pichai, An Important Next Step on Our AI Journey, Google Keyword Blog (Feb. 6, 2023),

[44] See Ion Prodan, 14 Million Users: Midjourney’s Statistical Success, Yon (Aug. 19, 2023). See also Andrew Wilson, Midjourney Statistics: Users, Polls, & Growth [Oct 2023], ApproachableAI (Oct. 13, 2023),

[45] See Hema Budaraju, New Ways to Get Inspired with Generative AI in Search, Google Keyword Blog (Oct. 12, 2023); Imagine with Meta AI, Meta (last visited Jan. 12, 2024),

[46] Catherine Tucker, Digital Data, Platforms and the Usual [Antitrust] Suspects: Network Effects, Switching Costs, Essential Facility, 54 Rev. Indus. Org. 683, 686 (2019).

[47] Manne & Auer, supra note 20, at 1345.

[48] See e.g. Stefanie Koperniak, Artificial Data Give the Same Results as Real Data—Without Compromising Privacy, MIT News (Mar. 3, 2017), (“[Authors] describe a machine learning system that automatically creates synthetic data—with the goal of enabling data science efforts that, due to a lack of access to real data, may have otherwise not left the ground. While the use of authentic data can cause significant privacy concerns, this synthetic data is completely different from that produced by real users—but can still be used to develop and test data science algorithms and models.”).

[49] See e.g. Rachel Gordon, Synthetic Imagery Sets New Bar in AI Training Efficiency, MIT News (Nov. 20, 2023), (“By using synthetic images to train machine learning models, a team of scientists recently surpassed results obtained from traditional ‘real-image’ training methods.”).

[50] Thibault Schrepel & Alex ‘Sandy’ Pentland, Competition Between AI Foundation Models: Dynamics and Policy Recommendations, MIT Connection Science Working Paper (Jun. 2023), at 8.

[51] Igor Susmelj, Optimizing Generative AI: The Role of Data Curation, Lightly (last visited Jan. 15, 2024),

[52] See e.g. Xiaoliang Dai, et al., Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack, ArXiv (Sep. 27, 2023) at 1, (“[S]upervised fine-tuning with a set of surprisingly small but extremely visually appealing images can significantly improve the generation quality.”). See also Hu Xu, et al., Demystifying CLIP Data, ArXiv (Sep. 28, 2023),

[53] Lauren Leffer, New Training Method Helps AI Generalize like People Do, Sci. Am. (Oct. 26, 2023), (discussing Brendan M. Lake & Marco Baroni, Human-Like Systematic Generalization Through a Meta-Learning Neural Network, 623 Nature 115 (2023)).

[54] Timothy B. Lee, The Real Research Behind the Wild Rumors about OpenAI’s Q* Project, Ars Technica (Dec. 8, 2023),

[55] Id. See also GSM8K, Papers with Code (last visited Jan. 18, 2024), available at; MATH Dataset, GitHub (last visited Jan. 18, 2024), available at

[56] Lee, supra note 55.

[57] Geoffrey Manne & Ben Sperry, Debunking the Myth of a Data Barrier to Entry for Online Services, Truth on the Market (Mar. 26, 2015), (citing Andres V. Lerner, The Role of ‘Big Data’ in Online Platform Competition (Aug. 26, 2014), available at

[58] See e.g., Lemley & Wansley, supra note 18, at 22 (“Incumbents have all that information. It would be difficult for a new entrant to acquire similar datasets independently….”).

[59] See Catherine Tucker, Digital Data as an Essential Facility: Control, CPI Antitrust Chron. (Feb. 2020) at 11 (“[U]ltimately the value of data is not the raw manifestation of the data itself, but the ability of a firm to use this data as an input to insight.”).

[60] Or, as John Yun puts it, data is only a small component of digital firms’ production function. See Yun, supra note 17, at 235 (“Second, while no one would seriously dispute that having more data is better than having less, the idea of a data-driven network effect is focused too narrowly on a single factor improving quality. As mentioned in supra Section I.A, there are a variety of factors that enter a firm’s production function to improve quality.”).

[61] Luxia Le, The Real Reason Windows Phone Failed Spectacularly, History–Computer (Aug. 8, 2023),

[62] Introducing the GPT Store, OpenAI (Jan. 10, 2024),

[63] See Michael Schade, How ChatGPT and Our Language Models are Developed, OpenAI; Sreejani Bhattacharyya, Interesting Innovations from OpenAI in 2021, AIM (Jan. 1, 2022); Danny Hernandez & Tom B. Brown, Measuring the Algorithmic Efficiency of Neural Networks, ArXiv (May 8, 2020), available at

[64] See Yun, supra note 17, at 235 (“Even if data is primarily responsible for a platform’s quality improvements, these improvements do not simply materialize with the presence of more data—which differentiates the idea of data-driven network effects from direct network effects. A firm needs to intentionally transform raw, collected data into something that provides analytical insights. This transformation involves costs including those associated with data storage, organization, and analytics, which moves the idea of collecting more data away from a strict network effect to more of a ‘data opportunity.’”).

[65] Lerner, supra note 58, at 4-5 (emphasis added).

[66] See Clayton M. Christensen, The Innovator’s Dilemma: When New Technologies Cause Great Firms to Fail (2013).

[67] See David J. Teece, Dynamic Capabilities and Strategic Management: Organizing for Innovation and Growth (2009).

[68] See Hagiu & Wright, supra note 21, at 4 (“We use our dynamic framework to explore how data sharing works: we find that it increases consumer surplus when one firm is sufficiently far ahead of the other by making the laggard more competitive, but it decreases consumer surplus when the firms are sufficiently evenly matched by making firms compete less aggressively, which in our model means subsidizing consumers less.”). See also Lerner, supra note 58.

[69] See e.g. Hagiu & Wright, id. (“We also use our model to highlight an unintended consequence of privacy policies. If such policies reduce the rate at which firms can extract useful data from consumers, they will tend to increase the incumbent’s competitive advantage, reflecting that the entrant has more scope for new learning and so is affected more by such a policy.”); Jian Jia, Ginger Zhe Jin & Liad Wagman, The Short-Run Effects of the General Data Protection Regulation on Technology Venture Investment, 40 Marketing Sci. 593 (2021) (finding GDPR reduced investment in new and emerging technology firms, particularly in data-related ventures); James Campbell, Avi Goldfarb, & Catherine Tucker, Privacy Regulation and Market Structure, 24 J. Econ. & Mgmt. Strat. 47 (2015) (“Consequently, rather than increasing competition, the nature of transaction costs implied by privacy regulation suggests that privacy regulation may be anti-competitive.”).

Antitrust & Consumer Protection

Network Reliability as a Differentiated Product

Popular Media

In March 2023 I wrote a post reiterating my argument that grid reliability is not a public good, at least not according to the technical definition of a public good as nonexcludable and nonrival. Instead I argued that grid reliability is a common-pool resource, potentially somewhat excludable or at least differentiable, and if not nonrival at least congestible. Given recent discussions in electricity policy about reliability in general and winter reliability in particular, I think it’s worth reconsidering now.

Read the full piece here.

Innovation & the New Economy

The FTC Should Not Enact a Deceptive or Unfair Marketing Earnings-Claims Rule

TOTM

Back in February 2022, the Federal Trade Commission (FTC) announced an advance notice of proposed rulemaking (ANPRM) on “deceptive or unfair earnings claims.” According to the FTC…

Read the full piece here.

Antitrust & Consumer Protection

From Europe, with Love: Lessons in Regulatory Humility Following the DMA Implementation

TOTM

The European Union’s implementation of the Digital Markets Act (DMA), whose stated goal is to bring more “fairness” and “contestability” to digital markets, could offer some important regulatory lessons to those countries around the world that have been rushing to emulate the Old Continent.

Read the full piece here.


Antitrust & Consumer Protection

Tom Hazlett on the Benefits of Mergers

Presentations & Interviews

ICLE Academic Affiliate Thomas Hazlett was a guest on CNBC’s Squawk Box to discuss his recent Wall Street Journal op-ed arguing that mergers often have consumer benefits. Video of the clip is embedded below.

Antitrust & Consumer Protection