Scholarship (ICLE)

Findings of Fact May Be Stubborn Things on Appeal, Even if They Are Not Facts

The U.S. government’s case in the Google Search matter raised complex questions of law and fact.[1] The former are more likely issues for appeal, due in no small part to the trial court’s traditional function as finder of fact. The findings are interesting all the same, and difficult, too, casual reactions to the decision notwithstanding.

The court’s treatment of the facts in evidence may have implications beyond the instant case. In that regard, two issues seem especially salient. First is the question of default bias, which might be termed a question of the “stickiness” of default settings for general search engines (GSEs) or of switching costs. That is, to what extent does a system’s choice of a default GSE diminish or impede the ability of other GSEs to compete?

Second, what sort of answer to the first question is required to establish liability for monopolization under Section 2 of the Sherman Act? For example, if the degree of stickiness required to establish monopolization is (a) one that prevents actual or would-be rivals from reaching minimum efficient scale or, stronger, (b) one that prevents actual or would-be rivals from reaching minimum efficient scale and would not make economic sense but for that exclusionary effect, then we can ask about the minimum efficient scale required to compete.

I. Behavioral Economics and the Google Search Case

Part of what is interesting about the first question is that Judge Mehta’s decision on liability seemed to credit expert testimony from behavioral economics.[2] Behavioral economics, however distinct from the behavioral sciences generally, continues to develop as a field of research. But whatever its scientific interest and broader policy implications,[3] behavioral economics has made few inroads in antitrust jurisprudence, and perhaps no systematic ones. Further, it has been argued that the apparent dearth of applications in antitrust law (or competition policy) should persist.[4] While I admit to some sympathy for skepticism about behavioral antitrust generally, its demise is hardly settled,[5] and this essay does not mean to resolve the general question whether behavioral research could find more purchase than it has in competition policy.

Rather, I consider the apparent role of the behavioral evidence presented in the Google Search case as a model inroad for the field. Specifically, Judge Mehta’s opinion repeatedly cites testimony from Professor Antonio Rangel on “choice friction” and the “stickiness” of default settings.[6] For example, the opinion states, as a finding of fact, that “as Dr. Rangel convincingly explained, the combination of user habit, Google’s brand, and choice friction creates a powerful default effect that drives most consumers to use the default search access points occupied by Google.”[7] On review of Rangel’s slides for the case, the high-level message seems clear: “Search engine defaults generate a sizable and robust bias towards the default” and “[s]earch engine default effects have stronger effects on mobile devices than on personal computers.”[8]

Some see the application of behavioral evidence in the case to be salutary.[9] I do not. At one level, Professor Rangel’s general observation about defaults seems unsurprising. Changing anything—lunch reservations or seats around a table—implies transition costs (or switching costs), however high or low. Any sort of default might generate a bias toward the default position. That might be a robust observation across domains, whether it has anything to do with mobile phones, browsers, and GSEs or not, and it might have been a general observation before the advent of behavioral economics. The findings of fact do not ascribe any particular weights to the contributions of, say, choice frictions and “Google’s brand,” while other findings of fact suggest that Google earned its reputational assets, at least to the extent that Google innovated in developing what was, and remains, according to other findings of fact in the decision, a superior GSE.

What, then, was the contribution of behavioral economics? “Sizable” is vague, to be sure, but it is not without connotation: “sizable” implies large or very large effects, if not any precise numerical quantity. Additional slides seem to amplify the connotation, even if there are no obvious citations to studies of the search defaults at issue. For an example that might seem familiar to readers of Thaler and Sunstein’s Nudge, Rangel’s slides state: “In Austria, where citizens were registered as organ donors by default, 99% were registered donors. In neighboring Germany, where citizens had to affirmatively register, only 12% were registered donors.”[10] The delta is indeed striking, especially if we assume that switching costs are low and that baseline beliefs and preferences are similar across the two nations. It is merely a correlation, but it is striking, nonetheless.

Commentary on the case has identified that specific correlation, among others, as illustrative. Crémer et al. state that “[t]here is extensive behavioral economics literature documenting substantial default effects in both laboratory and field experiments. For instance, defaulting people into organ donation schemes leads to 99% participation, compared to just 12% when people need to register.”[11]

There is indeed extensive evidence of default effects in different choice settings. Still, one might wonder whether the single correlation presented in the expert testimony may be unrepresentative. As it happens, there is a body of research on the impact of defaults on rates of organ donation, which is no surprise given the medical import of organ donation.[12] That research does not, however, suggest that the story is nearly as simple or striking as the Austria/Germany example might imply. Overall, it suggests that many factors, besides the default, might produce different effects in the real world and that, in fact, diverse factors have been observed to bear on donation rates.[13] Or, as a 2019 review of experimental, cross-sectional, and longitudinal evidence observed, opt-out (presumed consent) defaults are associated with higher rates of donation, but the observed magnitudes vary considerably, and the default is “just one factor among many.”[14]

One study using a time series of European data from 1991 to 2001 controlled for variables documented to affect donation rates, including nations’ transplant infrastructure, educational level, and religion. In nations where the default is donation (opt-out nations), the authors observed “a significant [P<0.02] increase in donation, increasing from 14.1 to 16.4, a 16.3% increase.”[15] The delta is substantial, but it is a far cry from that which might be implied by Rangel’s testimony.

A more recent study, employing panel data from the EU-27 countries plus Croatia, found that “presumed consent countries have 28% to 32% higher cadaveric donation and 27% to 31% higher kidney transplant rates in comparison to informed consent countries, after accounting for potential confounding factors.”[16] Again, we have a substantial effect, but a variable one; and even leaving aside the question of causality, we see an observed effect that is considerably smaller than that implied by Rangel’s Austria/Germany example.
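To make the difference in magnitudes concrete, the figures quoted above can be put on a common footing as relative increases over the opt-in baseline. The following is only a rough sketch: the function name is illustrative, and the comparison is imperfect, because the slide figures concern registration rates while the controlled studies concern donation and transplant rates.

```python
def relative_increase(baseline, treated):
    """Relative change from baseline to treated, expressed in percent."""
    return (treated - baseline) / baseline * 100.0


# Registration rates quoted in the expert slides: Germany (opt-in) 12%, Austria (opt-out) 99%.
slide_gap = relative_increase(12.0, 99.0)

# Donation rates from the controlled time-series study quoted above: 14.1 versus 16.4.
controlled_gap = relative_increase(14.1, 16.4)

print(f"Slides, uncontrolled registration rates: {slide_gap:.1f}% higher")
print(f"Controlled time series, donation rates:  {controlled_gap:.1f}% higher")
print("Panel study, as reported:                28% to 32% higher")
```

On these figures, the uncontrolled comparison in the slides corresponds to a relative increase of several hundred percent, while the controlled estimates fall between roughly 16% and 32%.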

Similar research, and experimental research, suggest that any number of vectors or inputs might make a difference, including (i) the type of decision or subject matter of the default; (ii) the framing (the way the default and the alternatives are presented); (iii) the current default itself; (iv) users’ beliefs about the riskiness of the current default (or of switching); (v) the source of the default; and (vi) the familiarity of the user with the type of choice at issue.[17] Much of this is a way of saying that context matters, which also includes the notion that “the value of an option depends not only on the option in question, but also on the other options in the choice set.”[18] And, indeed, various characterizations of context dependence of choice are endemic in the larger literature.[19]

Rangel’s own body of work, while highly varied, can be situated within this tradition. He has, for example, studied mechanisms of attention as they might be applied to varying grocery shelf placement, if not in actual grocery-store settings. And Google’s default deals have been likened to slotting fees paid for desirable grocery-shelf placement.[20]

For example, he and co-authors have studied visual attention and choice, with experiments displaying certain food choices to investigate, e.g., the influence of visual salience on the amount of attention devoted to a display. Experiments on visual saliency and consumer choice[21] suggest that the visual properties of, e.g., packaging can influence visual attention, which can bias choice in experiments where items were displayed in marked screen “shelves” (not actual grocery shelves), and eye tracking was measured. Rangel and colleagues observed that eliminating peripheral display of alternatives approximately doubled the attention biases (the amount of attention devoted to a displayed food item). The authors “suggest that individuals might be influenceable by settings in which only one item is shown at a time, such as e-commerce.”[22] And they might be. These are legitimate areas of inquiry. But how, if at all, is one supposed to generalize the experimental findings such that they can be applied to the efficacy of GSE default settings? What is the applicable model?

To some extent, the misfit between the empirical evidence and its application in the case may be due to misuse of the available science, and not the experimental research itself.[23] We need not assail the research methods or findings to observe that the experiments do not quite measure or model the extent to which, say, varied cereal displays in grocery stores (for which manufacturers might pay slotting fees) modulate the choices made by marginal consumers, by those with established preferences, or by consumers on average. For example, manipulating the relative amount of attention that research subjects pay to projections of appetitive (desirable) and aversive food items, it was observed that appetitive items were 6–11% more likely to be chosen with long fixation (exposure), with a similar but somewhat different bias for aversive items.[24] The effect of visual salience is most pronounced when consumers are (experimentally) forced to make very fast choices, across a range of image exposures from 70 to 500 milliseconds. That is, the very longest exposure was half a second.[25]

These are, of course, very different magnitudes than the ones we see in the evidentiary slides. They are much smaller magnitudes, and the largest of them are observed to be associated with very rapid forced choices. That is not to say that the results from specific choice experiments are more representative of the degree to which a given default setting for a browser or search engine might or might not impede switching—something the experiments were not designed to test. Rather, given the apparent complexity of the affective, perceptual, and cognitive mechanisms at play, we simply do not know how to generalize from the experimental findings to the market dynamics at issue in the Google Search case. We do not know, and we ought not to pretend we do. Again, observed effects seem to vary greatly across the nature, framing, and context of the choice at issue, and that variation may be one of the most robust observations we can make about the choice literature.[26]

Perhaps more than that, the specific examples of default effects identified by the government’s expert witness and certain commenters seem not so much arbitrary as cherry-picked; that is, they are highly skewed results suggesting that some default effects in some choice domains may be very large, while ignoring conspicuous evidence that observed effects vary considerably and, apparently, due to numerous factors. On balance, it seems more misleading than informative; less charitably, quoting Bentham, we might say that appeals to scientific authority in the case were “nonsense on stilts.”[27]

To be sure, the court’s finding on the stickiness of default status does not rest solely on the evidence from behavioral economics. But the juggling of real-world correlations (e.g., Google’s share of search on Android devices in Europe, where there is no default GSE, and of desktop search on Windows machines, which come preloaded with Microsoft’s Edge browser, on which Bing is the default search engine) seems worse than its handling of the behavioral evidence—indeed, it seems fundamentally confusing.[28]

II. The Limits of Scale

There is another technical issue at the heart of Google’s scale advantage. The issue is not simply whether Google enjoys some scale advantage in data or “compute,” but whether such advantages (a) are the result of the default agreements, (b) are likely to be durable, and (c) prevent rivals, such as Microsoft’s Bing, from reaching minimum efficient scale; and perhaps (or perhaps not) (d) whether the purchase of those scale advantages, via those agreements, would make no economic sense but for the (relative) harm they do to Bing.

There is ample evidence that large language models (LLMs) require significant computational resources, including both the various infrastructure resources that are often bundled under the rubric “compute” and access to large datasets for training data.[29] These resources are fundamental prerequisites of machine learning generally, which includes, but is hardly limited to, LLMs. Entry is far from costless. Scaling training data and model parameters has paid dividends in fact,[30] and it may be understandable that, building on observations about scaling in LLMs,[31] some have suggested scaling “laws” (not classical, general results) that may imply ever-increasing returns to scale.[32] There is, as Narayanan and Kapoor describe it, a “seeming predictability of scaling,” one which rests on “a misunderstanding of what research has shown.”[33]

Scaling is not free, however, and there is no reason to suppose its capacity to generate functionally superior models is boundless.[34] Scale advantages may be context- and application-specific, and may evidence decreasing (and eventually negative) returns to scale. Hence, for example, “downward pressure on model size” has been observed across the industry,[35] and there has been increasing interest in, among other things, the quality of training data and diverse efforts at developing small(er) language models.[36] Interest in “pre-training” data, data filtering, fine-tuning, and “curation” has seen promising results in the development of, e.g., Meta’s Llama 3 (an open source LLM),[37] Berkeley’s Koala,[38] Stanford’s Alpaca,[39] and others. The question about entry barriers may be sharpened further by the recent launch of OpenAI’s ChatGPT Search.[40]

The pace of development has been considerable, if not staggering, challenging (if not confounding) both theoretical frameworks[41] and the predictions of market developments, even in the relatively near term. And whatever the limits or constraints on scale advantages in theory, there remains the question of what we know about the data requirements for minimum efficient scale and the extent to which the defaults at issue in the Google case did or did not impede rivals’ ability to achieve it.

For example, it has been reported that over 1.3 billion unique visitors used Microsoft’s Bing GSE in December 2023.[42] Moreover, as recognized by the district court, Bing is set as the default search engine in Microsoft’s Edge browser, and Edge, in turn, is the default browser in Windows desktop machines. Windows is the leading desktop operating system, with a reported market share of approximately 72% worldwide. Microsoft’s market cap is one of the world’s largest—larger, even, than that of Google’s parent, Alphabet, Inc.

What is the scale of data required to reach minimum efficient scale, such that Microsoft’s access to data falls short, the extremely large installed base of Microsoft consumers across the firm’s product range and, specifically, of Bing users, notwithstanding? One cannot find an answer to that question in the court’s decision in the Google Search case.

If there were, we could ask, further, how much of the shortage is due to the default agreements at issue in the case, given that, as found by the district court, Google first achieved a scale advantage by dint of innovation and its provision of a uniquely excellent GSE. Gregory Werden, former chief counsel for antitrust at the Department of Justice (DOJ), puts the problem as follows: “Plaintiffs argued, ‘By controlling search defaults, Google ensures that it has—and will always have—more scale than any other general search engine. This scale gives Google an impenetrable advantage . . . .’ But if ‘more scale’ generates an ‘impenetrable advantage,’ Google’s monopoly was assured without resort to the alleged exclusionary conduct.”[43]

The assumption is that Google’s quantitative advantage in search yields an unassailable, indeed self-perpetuating, advantage in the scale of training data and, hence, the quality of Google’s GSE. Perhaps, but the relationship between searches and the scale of the training data used in developing relevant foundation models was not in evidence; neither was any account of Microsoft’s presumed lack of—and inability to acquire—sufficient training data to compete.

That seems an issue whether or not the court should have adopted the “no economic sense” test derived from Aspen Skiing,[44] which the U.S. Supreme Court has said “is at or near the outer boundary of §2 liability.”[45] That is, we wonder what degree of “foreclosure” may be attributed to the agreements at issue in the case, setting aside that Google’s default agreements may have made good economic sense, for Google, because (i) they helped Google improve its product (and not merely impair the products of rivals) and (ii) given Google’s earned scale advantage, primary exposure to marginal GSE consumers was valuable both for competing as a GSE and for Google’s consumers in ad markets (including but not limited to the “general text advertising” market found by the court).

III. Conclusion

U.S. antitrust law—like competition policy more widely—has never been a purely analytic matter: settled law does not yield algorithms into which enforcers and courts can input observations, turn the crank, and find decisions on liability (much less remedies) as definite outputs. Cases—like the Google Search case—may present difficult questions of fact and law. And, of course, there is the judicial analog of semantics: of the fit between observed facts and established law. The foregoing is not meant to impugn the role of judgment as a judicial function (whatever its black-box characteristics as a cognitive function).

It is meant to raise questions about the government’s case and the court’s opinion in the Google Search case. There are ex post questions, of course: what facts were presented to, or found by, the court, and were they adequate to sustain the government’s burden of proof? In that regard, I am somewhat more sympathetic to the impressionistic (and vague) account of scale, to the extent that, as I observed above, it is built on roughly established—if not actually law-like—observations of scale advantages in training LLMs and other machine learning systems. That is, I am somewhat more sympathetic about the purported scale advantage, if not to the legal conclusions that were supposed to follow from it.

I am considerably less sympathetic to the evidence from behavioral economics, which seems, as I noted, “nonsense on stilts.” It proved nothing, on any reasonable construction of proof.

Ex ante, the technical issues presented by the case seem to present challenges going forward. And the specific challenges presented by the Google Search case seem in some ways the tip of the iceberg, given the dynamism observed in AI and AI-fueled tech. With what confidence can enforcers project market developments even two years hence? How can they test counterfactuals? We need not insist on anything remotely approaching precision or certainty to worry about the unintended consequences of hand-waving.

[1] The government’s complaint is at https://www.justice.gov/opa/press-release/file/1328941/dl. The U.S. district court’s August 2024 decision on liability is United States v. Google, LLC, case No. 20-cv-3010 (APM) (D.D.C. Aug. 5, 2024).

[2] J. Crémer, A. Fletcher, P. Heidhues, G. Kimmelman, G. Monti, R. Podszun, M. Schnitzer and F. Scott Morton, What We Learn About the Behavioral Economics of Defaults from the Google Search Monopolization Case, ProMarket, Feb. 27, 2024, https://www.promarket.org/2024/02/27/what-we-learn-about-the-behavioral-economics-ofdefaults-from-the-google-search-monopolization-case; O. Vásquez Duque, Antitrust Regulation of Big Tech Needs a Better Understanding of Behavioral Economics, ProMarket, Dec. 19, 2023, https://www.promarket.org/2023/12/19/antitrustregulation-of-big-tech-needs-a-better-understanding-of-behavioral-economics.

[3] For a widely discussed approach, see R. H. Thaler and C. R. Sunstein, Nudge: Improving Decisions About Health, Wealth, and Happiness, Yale University Press, New Haven, 2008. Summarizing diverse views of potential consumer protection interventions based on behavioral economics, see, e.g., J. P. Mulholland, Fed. Trade Comm’n Bureau of Economics, Summary Report on the FTC Behavioral Economics Conference (2007), https://www.ftc.gov/sites/default/files/documents/reports/summary-report-ftcbehavioral-economics-conference/070914mulhollandrpt.pdf.

[4] For fundamentally skeptical views of the role of behavioral economics in competition policy, see, e.g., A. Devlin and M. Jacobs, The Empty Promise of Behavioral Antitrust, Harv. J. L. & Pub. Pol’y, Vol. 37, No. 3, 2015, pp. 1009–1063 (“The recurring question for policymakers is whether behavioral antitrust can advance the state of the art in a meaningful way. The answer, at present, is ‘no.’” Ibid. at 1063); see also, G. A. Manne and T. J. Zywicki, Uncertainty, Evolution, and Behavioral Economic Theory, J. L. Econ. & Pol’y, Vol. 10, No. 3, 2014, pp. 555–580; J. D. Wright and D. H. Ginsburg, Behavioral Law and Economics: Its Origins, Fatal Flaws, and Implications for Liberty, Nw. U. L. Rev., Vol. 106, No. 3, 2012, pp. 1033–1090; K. Zeiler, Cautions on the Use of Economics Experiments in Law, J. Inst. & Theoretical Econ (JITE)/Zeitschrift für die gesamte Staatswissenschaft, Vol. 166, 2010, pp. 178–193.

[5] For more ambitious views about applications, see, e.g., M. E. Stucke, Reconsidering Antitrust’s Goals, B.C. L. Rev., Vol. 53, Issue 2, 2012, pp. 551–630; A. P. Reeves and M. E. Stucke, Behavioral Antitrust, Ind. L.J., Vol. 86, Issue 4, 2011, pp. 1527–1586.

[6] There appear to be 21 citations to Professor Rangel’s testimony in the published decision.

[7] Ibid. at 229 (citing findings of fact ¶¶ 65–74).

[8] The U.S. Department of Justice has posted a redacted version of Professor Rangel’s slides. Prof. Antonio Rangel, Behavioral Economics Expert, Ex. No. UPXD101, 1:20-cv-03010-APM, https://www.justice.gov/d9/2023-09/416682.pdf.

[9] Supra note 2.

[10] Rangel slides, supra note 8 (Defaults Strongly Influence Choice).

[11] Crémer et al., supra note 2 (internal citation omitted).

[12] See, e.g., Organ Donation Statistics, Health Resources & Servs. Admin., U.S. Dep’t Health & Human Servs. Admin. (2024), https://www.organdonor.gov/learn/organdonation-statistics (reporting, e.g., the number of people on the national transplant waiting list and the number who die each day waiting for a transplant).

[13] E. J. Johnson and D. G. Goldstein, Defaults and Donation Decisions, Transplantation, Vol. 78, No. 12, 2004, pp. 1713–1716.

[14] M. Steffel, E. F. Williams and D. Tannenbaum, Does Changing Defaults Save Lives? Effects of Presumed Consent Organ Donation Policies, Behav. Sci. & Pol’y, Vol. 5, Issue 1, 2019, pp. 69–88, at 78.

[15] E. J. Johnson, M. Steffel and D. G. Goldstein, Making Better Decisions: From Measuring to Constructing Preferences, Health Psychol., Vol. 24, No. 4 (Suppl.), 2005, pp. S17–S22, at S19.

[16] Z. B. Ugur, Does Presumed Consent Save Lives? Evidence from Europe, Health Econ., Vol. 24, No. 12, 2015, pp. 1560–1572, DOI: 10.1002/hec.3111.

[17] Ibid.

[18] A. R. Otto, S. Devine, E. Schulz, A. M. Bornstein and K. Louie, Context-Dependent Choice and Evaluation in Real-World Consumer Behavior, Sci. Rep., Vol. 12, No. 1, 2022, art. 17744, https://doi.org/10.1038/s41598-022-22416-5.

[19] A. Tversky and I. Simonson, Context-Dependent Preferences, Mgmt. Sci., Vol. 39, Issue 10, 1993, pp. 1179–1189, https://doi.org/10.1287/mnsc.39.10.1179; J. S. Trueblood, S. D. Brown, A. Heathcote and J. R. Busemeyer, Not Just for Consumers: Context Effects Are Fundamental to Decision Making, Psych. Sci., Vol. 24, No. 6, 2013, pp. 901–908, https://doi.org/10.1177/0956797612464241.

[20] B. Albrecht, Trade Promotions in High Tech, Truth on the Market, Dec. 21, 2020, https://truthonthemarket.com/2020/12/21/trade-promotions-in-high-tech.

[21] See, e.g., B. Eum, S. Dolbier and A. Rangel, Peripheral Visual Information Halves Attentional Bias Choices, Psych. Sci., Vol. 34, Issue 9, 2023, pp. 984–998, https://doi.org/10.1177/09567976231184878; K. C. Armel, A. Beaumel and A. Rangel, Biasing Simple Choices by Manipulating Relative Visual Attention, Judgment & Decision Making, Vol. 3, Issue 5, 2008, pp. 396–403, doi:10.1017/S1930297500000413; I. Krajbich, C. Armel and A. Rangel, Visual Fixations and the Computation and Comparison of Value in Simple Choice, Nature Neurosci., Vol. 13, 2010, pp. 1292–1298; M. Milosavljevic, V. Navalpakkam, C. Koch and A. Rangel, Relative Visual Saliency Differences Induce Sizable Bias in Consumer Choice, J. Consumer Psych., Vol. 22, Issue 1, 2012, pp. 67–74.

[22] Eum et al., supra note 21, at 984.

[23] See Zeiler, supra note 4 (discussing “common misuses of quantitative results from economics experiments”).

[24] Milosavljevic et al., supra note 21.

[25] Ibid.

[26] Notes 14–16 and accompanying text, supra.

[27] J. Bentham, Anarchical Fallacies (1796) (published in, e.g., The works of Jeremy Bentham: Published under the superintendence of his executor, John Bowring, Vol. 2, William Tait, Edinburgh, 1843).

[28] G. A. Manne, A Critical Analysis of the Google Search Antitrust Decision, Int. Ctr. For Law & Econ., ICLE White Paper 2024-08-14, Aug. 14, 2024, https://laweconcenter.org/wp-content/uploads/2024/08/Manne-Google-Search-Decision-Analysis-2024-08-14.pdf; G. J. Werden, Harm to the Competitive Process in the Google Case, Mercatus Working Paper, July 9, 2024, https://www.mercatus.org/research/working-papers/harm-competitive-process-google-case; D. J. Gilman, Some Thoughts on the Google Decision, for Those Who Haven’t “Binged” It Yet, Truth on the Market, Aug. 20, 2024, https://truthonthemarket.com/2024/08/20/some-thoughts-on-the-google-decision-for-those-who-havent-binged-it-yet.

[29] For a review, see M. A. K. Raiaan, Md S. H. Mukta, K. Fatema, N. M. Fahad, S. Sakib, M. M. J. Mim, J. Ahmad, M. E. Ali, S. Azam, A Review on Large Language Models: Architectures, Applications, Taxonomies, Open Issues and Challenges, IEEE Access, Vol. 12, 2024, pp. 26839–26874, https://doi.org/10.1109/ACCESS.2024.3365742.

[30] See, e.g., T. Schrepel and A. Pentland, Competition Between AI Foundation Models: Dynamics and Policy Recommendations, MIT Connection Science Working Paper, July 2024 (describing the increasingly large scale of training datasets and number of parameters across successive versions of OpenAI’s ChatGPT).

[31] See, e.g., J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu and D. Amodei, Scaling Laws for Neural Language Models, Jan. 23, 2020, https://arxiv.org/abs/2001.08361.

[32] That popular (over)generalization may exceed Kaplan et al., ibid., and, for example, Rich Sutton’s well-known statement of pessimism about AI methods that do not continue to scale arbitrarily with increased computation. R. Sutton, The Bitter Lesson, Mar. 13, 2019, https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf.

[33] A. Narayanan and S. Kapoor, AI Scaling Myths, AI Snake Oil, June 27, 2024, https://www.aisnakeoil.com/p/ai-scaling-myths (to be sure, Narayanan and Kapoor recognized that past increases in model size, dataset scale, and training compute have paid tremendous dividends in LLM performance).

[34] Ibid. See also, e.g., S. Hooker, On the Limitations of Compute Thresholds as a Governance Strategy, July 30, 2024, https://doi.org/10.48550/arXiv.2407.05694 (“Evidence to-date suggests we are not good at predicting what abilities emerge at different scales.”); R. Schaeffer, H. Schoelkopf, B. Miranda, G. Mukobi, V. Madan, A. Ibrahim, H. Bradley, S. Biderman and S. Koyejo, Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive?, June 6, 2024, https://arxiv.org/abs/2406.04391.

[35] Narayanan and Kapoor, supra note 33.

[36] See, e.g., Y. Li, S. Bubeck, R. Eldan, A. Del Giorno, S. Gunasekar, and Y. T. Lee, Textbooks Are All You Need II: phi-1.5 technical report, Microsoft Research, Sept. 11, 2023, https://arxiv.org/pdf/2309.05463 (developing natural language processing model, phi-1.5, with apparent performance on natural language tasks comparable to models 5x larger); X. Geng, A. Gudibande, H. Liu, E. Wallace, P. Abbeel, S. Levine and D. Song, Koala: A Dialogue Model for Academic Research, Berkeley A. I. Rsch., Apr. 3, 2023, https://bair.berkeley.edu/blog/2023/04/03/koala (discussing the Koala chatbot and suggesting “that models that are small enough to be run locally can capture much of the performance of their larger cousins if trained on carefully sourced data”).

[37] For an overview, see Meta, Introducing Meta Llama 3: The most capable openly available LLM to date, Apr. 18, 2024, https://ai.meta.com/blog/meta-llama-3.

[38] Geng et al., supra note 36.

[39] R. Taori, I. Gulrajani, T. Zhang, Y. Dubois, X. Li, C. Guestrin, P. Liang and T. B. Hashimoto, Alpaca: A Strong, Replicable Instruction-Following Model, Stanford Univ. Ctr. for Rsch. on Found. Models (2024), https://crfm.stanford.edu/2023/03/13/alpaca.html.

[40] Introducing ChatGPT Search, OpenAI, Oct. 31, 2024, https://openai.com/index/introducing-chatgpt-search.

[41] Raiaan et al., supra note 29.

[42] Worldwide visits to Bing.com from July to December 2023, Statista, 2024, https://www.statista.com/statistics/752270/range-of-bingcom-based-on-unique-visitors.

[43] G. J. Werden, Harm to the Competitive Process in the Google Case, Mercatus Ctr. Antitrust & Comp. Working Papers, July 9, 2024, https://www.mercatus.org/research/working-papers/harm-competitive-process-google-case.

[44] Aspen Skiing Co. v. Aspen Highlands Skiing Corp., 472 U.S. 585 (1985).

[45] Verizon Communications, Inc. v. Law Offices of Curtis V. Trinko, LLP, 540 U.S. 398, 409 (2004).