Rethinking Privacy in the AI Era
Policy Provocations for a Data-Centric World

White Paper
February 2024

Jennifer King
Caroline Meinhardt

Authors

Jennifer King is the Privacy and Data Policy Fellow at the Stanford University Institute for Human-Centered Artificial Intelligence (HAI). An internationally recognized expert in information privacy, her research examines the public's understanding and expectations of online privacy as well as the policy implications of emerging technologies, including artificial intelligence. Her recent research explores alternatives to notice and consent (with the World Economic Forum), the impact of California's new privacy laws, and manipulative design (dark patterns). She also co-directs the Dark Patterns Tip Line repository at Stanford. Prior to joining HAI, she was the Director of Consumer Privacy at the Center for Internet and Society at Stanford Law School from 2018 to 2020. Dr. King completed her doctorate in information management and systems (information science) at the University of California, Berkeley School of Information.

Caroline Meinhardt is the Policy Research Manager at the Stanford Institute for Human-Centered Artificial Intelligence (HAI), where she develops and oversees policy research initiatives. She is passionate about harnessing AI governance research to inform policies that ensure the safe and responsible development of AI around the world, with a focus on research on the privacy implications of AI development, the implementation challenges of AI regulation, and the governance of large-scale AI models. Prior to joining HAI, Caroline worked as a China-focused consultant and analyst, managing and delivering in-depth research and strategic advice regarding China's development and regulation of emerging technologies, including AI. She holds a Master's in International Policy from Stanford University, where her research focused on global governance solutions for AI, and a Bachelor's in Chinese Studies from the University of Cambridge.

Acknowledgments

The authors would like to thank Brenda Leong, Cobun Zweifel-Keegan, Justin West, Kevin Klyman, and Daniel Zhang for their valuable feedback; Nicole Tong and Cole Ford for research assistance; and Jeanina Casusi, Joe Hinman, Nancy King, Shana Lynch, Carolyn Lehman, and Michi Turner for preparing the publication.

Disclaimer

The Stanford Institute for Human-Centered Artificial Intelligence (HAI) is a nonpartisan research institute, representing a range of voices. The views expressed in this White Paper reflect the views of the authors.

Table of Contents

Authors 2
Acknowledgments 2
Table of Contents 3
Executive Summary 4
Chapter 1: Introduction 5
Chapter 2: Data Protection and Privacy: Key Concepts and Regulatory Landscape 7
  a. Fair Information Practice Principles: The framework behind data protection and privacy 9
  b. General Data Protection Regulation: The "global standard" for data protection 10
  c. U.S. State Privacy Laws: Filling the federal privacy vacuum 12
  d. Predictive AI vs. Generative AI: An inflection point for data protection regulation 14
Chapter 3: Provocations and Predictions 17
  a. Data is the foundation of AI systems, which will demand ever greater amounts of data 17
  b. AI systems pose unique risks to both individual and societal privacy that require new approaches to regulation 19
  c. Data protection principles in existing privacy laws will have an implicit, but limited, impact on AI development 22
  d. The explicit algorithmic and AI-based provisions in existing laws do not sufficiently address privacy risks 25
  e. Closing thoughts 29
Chapter 4: Suggestions for Mitigating the Privacy Harms of AI 31
  Suggestion 1: Denormalize data collection by default 33
  Suggestion 2: Focus on the AI data supply chain to improve privacy and data protection 36
  Suggestion 3: Flip the script on the management of personal data 41
Chapter 5: Conclusion 45
Endnotes 46

Executive Summary

In this paper, we present a series of arguments and predictions about how existing and future privacy and data protection regulation will impact the development and deployment of AI systems. Data is the foundation of all AI systems. Going forward, AI development will continue to increase developers' hunger for training data, fueling an even greater race for data acquisition than we have already seen in past decades. Largely unrestrained data collection poses unique risks to privacy that extend beyond the individual level: these risks aggregate to pose societal-level harms that cannot be addressed through the exercise of individual data rights alone.

While existing and proposed privacy legislation, grounded in the globally accepted Fair Information Practices (FIPs), implicitly regulates AI development, it is not sufficient to address the data acquisition race or the resulting individual and systemic privacy harms. Even legislation that contains explicit provisions on algorithmic decision-making and other forms of AI does not provide the data governance measures needed to meaningfully regulate the data used in AI systems.

We present three suggestions for how to mitigate the risks to data privacy posed by the development and adoption of AI:

1. Denormalize data collection by default by shifting from opt-out to opt-in data collection. Data collectors must facilitate true data minimization through "privacy by default" strategies and adopt technical standards and infrastructure for meaningful consent mechanisms.

2. Focus on the AI data supply chain to improve privacy and data protection. Ensuring dataset transparency and accountability across the entire life cycle must be a focus of any regulatory system that addresses data privacy.

3. Flip the script on the creation and management of personal data. Policymakers should support the development of new governance mechanisms and technical infrastructure (e.g., data intermediaries and data permissioning infrastructure) to support and automate the exercise of individual data rights
and preferences.

Chapter 1: Introduction

In the opening months of 2024, artificial intelligence (AI) is squarely in the sights of regulators around the globe. The European Union is set to finalize its AI Act later this year. Other parts of the world, from the United Kingdom to China, are also contemplating and, in some cases, already implementing wide-ranging AI regulation. In the United States, a recent milestone Executive Order on AI marked the clearest signal yet that the Biden administration is poised to take a comprehensive approach to AI governance.1 With federal legislation to regulate AI yet to pass, a growing number of federal agencies and state legislators are clarifying how existing regulation relates to AI within their jurisdictional areas and proposing AI-specific regulation.2

While much of the discussion in the AI regulatory space has centered on developing new legislation to directly regulate AI, there has been comparatively little discourse on the laws and regulations that already impact many forms of commercial AI. In this white paper, we focus on the intersection of AI regulation with two specific areas: privacy and data protection legislation. The connective tissue between privacy and AI is data: Nearly all forms of AI require large amounts of training data to develop classification or decisional capabilities. Whether or not an AI system processes or renders decisions about individuals, if a system includes personal information, particularly identifiable personal information, as part of its training data, it is likely to be subject, at least in part, to privacy and data protection regulations.

We make a set of arguments and predictions about how existing and future privacy and data protection regulations in the United States and the EU will impact the development and deployment of AI systems. We start with the fundamental assumption that AI systems require data, massive amounts of it, for training purposes. It is this need for data, as best evidenced by data-hungry generative AI systems such as ChatGPT, that we predict will fuel an even greater race for data acquisition than we've witnessed over the last decades of the "Big Data" era. This need will in turn impact both individual and societal information privacy, not just through the demand for data, but also through the impacts this need will have on specific issues such as consent, provenance, and the entire data supply pipeline and life cycle more generally.3

We move on to examine AI's unique risks to consumer and personal privacy, which, unlike many technology-fueled privacy harms that primarily impact individuals, aggregate to pose societal-level risks that existing regulatory privacy frameworks are not designed to address. We argue that existing governance approaches, which are based predominantly on the globally accepted Fair Information Practices (FIPs), will not be sufficient to address these systemic privacy risks. Finally, we close with suggested solutions for mitigating these risks while also offering new directions for regulation in this area.

What's at Stake: The Future of Both Privacy and AI

Data is a key component of all AI systems: to date, the most significant improvements in AI systems have been tied to access to very large amounts of training data. This fact does not necessarily
mean that all advancements in AI will require massive amounts of data; as we discuss later, some researchers are observing quality-versus-quantity trade-offs that indicate more may not reliably mean better. Regardless, we are presently at an inflection point where there is considerable pressure on companies to build massive training datasets to maintain their competitive advantage.

A primary concern motivating this paper is that although existing and proposed privacy and data protection laws on both sides of the Atlantic will have an impact on AI, they will not sufficiently regulate the data sources that AI systems require in a way that will substantively preserve, or even improve, our data privacy. In this paper, we explore several related concerns:

1. The framework that underlies data protection laws has weaknesses that will not give individuals the tools they need to preserve their data privacy as AI advances;
2. It also fails to address societal-level privacy risks;
3. Policymakers must expand the scope of how we approach privacy and data protection to address these weaknesses and bolster data privacy in an increasingly AI-dominant world.

We start from the assumption that for most of us the current state of our data privacy ranges from suboptimal to dismal. In the United States, polls have shown that the public largely feels as if they have no control over the data that is collected about them online;4 that the benefits they receive
in exchange for their data are not always worth the bargain of free access; and that in most data relationships, consumers have no ability to negotiate more favorable terms, and in many instances believe they are locked in or have few if any alternatives.5 In short, as we move toward a future in which AI development continues to increase demands for data, data protection regulation that at best maintains the status quo does not inspire confidence that the data rights we have will preserve our data privacy as the technology advances. In fact, we believe that continuing to build an AI ecosystem atop this foundation will jeopardize what little data privacy we have today.

This paper focuses on the core issues that we believe require the most attention to address this state of affairs. It does not claim to address or solve everything. But we do believe that if these issues aren't sufficiently acknowledged and addressed through regulation and enforcement, we leave ourselves open to a situation where privacy protection continues to deteriorate. There are many worries attached to how our world will change as it continues to embrace AI. Concerns related to bias and discrimination have already generated extensive debate and discussion, and we argue that a substantial loss of data privacy is another major risk that deserves our heightened concern.

Chapter 2: Data Protection and Privacy: Key Concepts and Regulatory Landscape

The last two years have seen groundbreaking advances in AI, a period in which generative AI tools became widely available, inspiring and alarming millions of people around the world. Large language models (LLMs) such as GPT-4, PaLM, and Llama, as well as AI image generation systems such as Midjourney and DALL-E, have made a tremendous public splash, while many other less headline-grabbing forms of AI have also continued to advance at breakneck speed.

While recognizing the recent dominance of LLMs in public discourse, in this paper we consider the data privacy and protection implications of a wider array of AI systems, defined more broadly as "engineered or machine-based systems that can, for a given set of objectives, generate outputs such as predictions, recommendations, or decisions influencing real or virtual environments."6 For example, we consider a range of predictive AI systems, such as those based on machine learning, that analyze vast amounts of data to make classifications and predictions, ranging from facial recognition systems to hiring algorithms, criminal sentencing algorithms, behavioral advertising and profiling, and emotion recognition tools, to name a few. These systems operate with varying levels of autonomy, with "automated decision-making" referring to AI systems making decisions (such as awarding a loan or hiring a new employee) with minimal or no human involvement.7 While generative AI systems also rely on predictive processes, those systems ultimately focus on creating new content, ranging from text to images, video, and audio, as their output.

In response to these widely publicized developments, both policymakers and the general public have called for regulating AI technologies. Since 2020, countries around the world have begun passing AI-specific legislation.8 While the EU finalizes the parameters of its AI Act, the bloc's attempt to provide overarching regulation of AI technologies, the United States presently lacks a generalized approach to AI regulation, though multiple federal agencies have released policy statements asserting their authority over AI systems that produce outputs in violation of existing law, such as civil rights and consumer protection statutes.9 Several U.S. states and municipalities have also tackled general consumer regulation of AI systems.10

While some policymakers are keen to demonstrate that they are assuaging the public's growing concerns about the rapid development and deployment of AI by
introducing new legislation, there is a growing debate over whether existing laws provide sufficient protection and oversight of AI systems.

As we discuss in this white paper, privacy and data protection laws in the United States and the EU already do the work of regulating some, though not all, aspects of AI. Whether these existing laws, and proposed ones based on these frameworks, are adequate to anticipate and respond to emergent forms of AI while also addressing privacy risks and harms is a question we will address later in this paper. Before we delve into the details of our arguments, we provide a brief overview of the present state of data protection and privacy regulations in the EU and the United States that impact AI systems, starting with the foundational Fair Information Practices (FIPs). Those familiar with these regulations may wish to skip ahead to the
next chapter.

Data Privacy and Data Protection

Data privacy and data protection are sometimes used interchangeably in casual conversation. While these terms are related and have some overlap, they differ in significant ways.

Data privacy is primarily concerned with who has authorized access to collect, process, and potentially share one's personal data, and the extent to which one can exercise control over that access, including by opting out of data collection. The term's scope is fairly broad, as it pertains not just to personal data but to any kind of data that, if accessed by others, would be seen as infringing on one's right to a private life and personal autonomy. Privacy is often described in terms of personal control over one's information, though this conception has been challenged by the increasing loss of control that many have over their data. But it is this notion of personal control that underlies both existing privacy regulations and frameworks. What is considered "private" is also contextually contingent, in that data shared in one context may be viewed as appropriate by an individual or data subject (e.g., sharing one's real-time location data with a friend) but not in another (e.g., a third party collecting one's real-time location data and using it for advertising purposes without explicit permission). The relational nature of data has also challenged the idea of privacy as personal control, as data that is social in nature (e.g., shared social media posts) or data that can reveal both biological ties and ethnic identities (e.g., genetic data) continue to grow.

Data protection refers to the act of safeguarding individuals' personal information using a set of procedural rights, which includes ensuring that data
is processed fairly, for specified purposes, and collected on the basis of one of six accepted bases for processing.11 Consent is the strictest basis and allows individuals to withdraw it after the fact. By contrast, legitimate interest provides the greatest latitude: this legal ground allows processors to justify data processing on the basis that the data is needed to carry out tasks related to their business activity. Data processors must still respect individuals' fundamental data protection rights, such as providing notice when data is collected, giving access to one's collected information, providing the means to correct errors, to delete information, or to transfer it to other processors (data portability), and affording the right to object to the processing itself. But there is a bias toward accepting as a given the collectibility of some forms of personal data by default.

The EU formally distinguishes between personal privacy (i.e., respect for an individual's private life) and data protection, enshrining each in its European Charter of Fundamental Rights. Nevertheless, there are areas of overlap, and the concepts complement each other. When data protection principles do not apply because the collected information is not personal data (e.g., anonymized body scanner data), the fundamental right to privacy applies, as the collection of bodily information affects a person's individual autonomy. Conversely, data protection principles can ensure limits on personal data processing, even when such processing is not thought to infringe upon privacy.12

a. Fair Information Practice Principles: The framework behind data protection and privacy

Most modern privacy legislation, at its core, is based on the Fair Information Practices (FIPs), a 50-plus-year-old set of principles that are accepted around the globe as the fundamental framework for providing individuals with due process rights for their personal data.13 Proposed as a U.S. federal code of fair information practices for automated personal data systems in the early 1970s, the FIPs introduced five safeguard requirements regarding personal privacy as a
means of ensuring "informational due process."14 They focus on the obligations of record-keeping organizations to allow individuals to know about, prevent alternative uses of, and correct information collected about them.15 As policy expert Mark MacCarthy describes, "All these measures worked together as a coherent whole to enforce the rights of individuals to control the collection and use of information about themselves."16

Rather than framing information privacy as a fundamental human right, as both the United Nations Universal Declaration of Human Rights and the European Charter of Fundamental Rights do with a more general conception of privacy, the FIPs outline a set of rules and obligations between the individual (data subject) and the record-keeper (data processor).17 The FIPs were drafted around a core assumption that the state has a legitimate need to collect data about its citizens for administrative and record-keeping purposes.18 This assumption, that data collection is necessary and appropriate for the workings of the modern state but must be done fairly and with procedural safeguards in place, was incorporated into subsequent revisions of the FIPs, even as they were increasingly applied to the private sector. The most internationally influential version, developed by the Organisation for Economic Co-operation and Development (OECD) in 1980 and amended in 2013, consolidates and expands the original FIPs into eight principles covering collection limitation, data quality, purpose specification, use limitation, security safeguards, openness, individual participation, and accountability.19 The guidelines reflect a broad international consensus on how to approach privacy protection that has translated into a policy convergence around enshrining
the FIPs as a core part of information privacy legislation around the world.20

Despite having been conceived long before the emergence of the commercial internet, let alone social media platforms and generative AI tools, core components of the FIPs, such as data minimization and purpose limitation,21 directly impact today's AI systems by limiting how broadly companies can repurpose data collected for one context or purpose to create or train new AI systems. The EU's General Data Protection Regulation (GDPR), as well as California's privacy regulations and the proposed American Data Privacy and Protection
Act (ADPPA), relies heavily on these principles. These regulations' attempts to clarify the application of the FIPs to privacy controls amid exponentially increasing volumes of online consumer and commercial data shed further light on the impact of privacy regulation on AI.

b. General Data Protection Regulation: The "global standard" for data protection

Passed in 2016 and in effect as of 2018, the General Data Protection Regulation is the EU's attempt to update the 1995 Data Protection Directive, harmonize the previous patchwork of fragmented national data privacy regimes across EU member countries, and enable stronger enforcement of Europeans' data rights.22 At its core, the GDPR is centered on personal data, defined as "any information relating to an identified or identifiable natural person."23 It grants individuals ("data subjects") rights regarding the processing of their personal data, such as the right to be informed and a limited right to be forgotten, and guides how businesses can process personal information. It is arguably the most significant data protection legislation in the world today, spurring copycat legislation and impacting the framing of data protection around the globe. As a result of the GDPR's direct applicability to AI and its dominance across the globe, data protection and privacy concerns are largely absent from the EU's AI Act.

The GDPR contains several provisions that apply to AI systems, even though it does not specifically include the term "artificial intelligence." Instead, Article 22 provides protections to individuals against decisions "based solely on automated processing" of personal data without human intervention, also called automated decision-making (ADM).24 It enshrines the right of individuals not to be subject to ADM where these decisions could produce an adverse legal or similarly significant effect on them. Given the widespread
use of ADM as it relates to health, loan approvals, job applications, law enforcement, and other fields, the article plays a crucial role in enforcing a minimum degree of human involvement in such decision-making processes.

Beyond Article 22, the GDPR also puts in place several key data protection principles that affect AI systems (see table). Most notably, the purpose limitation principle forbids the processing of personal data for purposes other than those specified at collection, and the data minimization principle restricts the collection and retention of data to that which is absolutely necessary. These principles, in theory, curb the unfettered personal data collection (or data mining) that is common for data-intensive AI applications. Despite the commonly held assumption that more data always makes for better AI, and that such constraints on data collection and use will hamper progress in AI, there is extensive research demonstrating that building ADM systems within these constraints is feasible and even desirable.25

Core Data Protection Principles

Data Minimization: Defined in Article 5 of the GDPR as ensuring that collected data is "adequate, relevant and limited to what is necessary in relation to the purposes for which they are processed." This principle prescribes proportionality: Data processors should not collect as much data as possible, particularly out of the context provided for collection. The intent is to prevent data collectors from engaging in indiscriminate data collection.

Purpose Limitation: Defined in Article 5 as data "collected for specified, explicit and legitimate purposes and not further processed in a manner that is incompatible with those purposes." This principle emphasizes the importance of context, restricting uses of data beyond the explicit purpose given at collection. If a data processor wishes to repurpose collected data, they need to seek consent for that new use.

Consent: Defined in Article 7 and Recital 32 as a key requirement for data processing. Consent must be "given by a clear affirmative act establishing a freely given, specific, informed and unambiguous indication of the data subject's agreement to the processing of personal data relating to him or her, such as by a written statement, including by electronic means, or an oral statement." Notably, consent is required for all processing, including if data is collected for multiple purposes. Recital 42 describes the burden of proof data processors must meet to prove data subject consent, noting that "consent should not be regarded as freely given if the data subject has no genuine or free choice or is unable to refuse or withdraw consent without detriment."

The GDPR also enshrines transparency obligations in the form of rules about giving notice to individuals when their personal information is processed for the purpose of profiling or ADM.26 It further establishes rules granting individuals the right to access their own data and to ensure the accuracy of the data processing. Finally, it introduces Data Protection Impact Assessments (DPIAs), an accountability measure that requires the collecting organization to assess the potential risks and harms of data processing activities (as they pertain to the relevant organization but
also potential societal-level harms) prior to conducting them.27

c. U.S. State Privacy Laws: Filling the federal privacy vacuum

As of 2024, the United States still lacks a federal omnibus consumer privacy law similar to the GDPR. The closest it has come to passing consumer privacy regulation is the American Data Privacy and Protection Act (ADPPA), which was introduced in the House in 2022 but did not advance to a floor vote in that session and has yet to be reintroduced.28 Similar to the GDPR, the ADPPA would have imposed limits on the collection, use, and sharing of personal information, requiring that such processing be "necessary and proportionate." It would have acknowledged the connection between information privacy and civil rights, strengthening relevant civil rights laws and essentially enacting the privacy section of the Biden administration's subsequent "Blueprint for an AI Bill of Rights."29 The ADPPA was the result of lengthy bipartisan negotiations, and future privacy legislation is likely to hew closely to the original 2022 bill.

In the absence of consumer-specific federal legislation, several sectoral laws have created a patchwork of privacy protections over the decades, such as the Family Educational Rights and Privacy Act (FERPA), the Children's Online Privacy Protection Act (COPPA), the Health Insurance Portability and Accountability Act (HIPAA), and even the Video Privacy Protection Act (VPPA), to name a few. In this splintered landscape, U.S. states have been passing their own consumer privacy laws. As of 2023, 12 states have passed consumer privacy regulations, though California's Consumer Privacy Act (CCPA) remains the most far-reaching.30 For that reason, we will focus on the CCPA for discussion purposes. Sometimes dubbed California's version of the GDPR, the CCPA, together with its 2022 update, the California Privacy Rights Act (CPRA), is arguably the most significant state-level effort so far to enact both stringent and broad consumer privacy protections.31 While some scholars have argued that the CCPA consciously creates a fundamentally different data privacy regime for California than the GDPR, it nevertheless marks a landmark shift in the U.S. privacy regulation debate.32

The initial version of the CCPA created rights of data access, deletion, and portability, as well as a right to opt out of sales of personal data for two-year cycles, and a purpose limitation provision. Businesses are obliged to provide notice of the types of data they collect, to obtain opt-in consent for data collection from children ages 13 to 16, and to abide by purpose limitations when collecting and using or reusing data, which must be consistent with individuals' general expectations and the purpose specified upon collection. The subsequent CPRA, passed as a ballot proposition (Proposition 24), amends the CCPA to add a data minimization prong as well as a right to correct personal data, a right to opt out of the processing of categories of sensitive personal data, and, similar to the GDPR, a right to opt out of some forms of ADM (those with significant effects, such as on housing and employment), which in draft regulations has been interpreted by California's privacy regulator to include AI systems.33 Businesses must conduct privacy risk assessments and cybersecurity audits, offer alternatives for accessing services for those who opt out, and cannot discriminate against consumers for exercising these rights.

A notable difference between California's privacy regime and those of other states is that California remains the only state to have created an enforcement agency (the California Privacy Protection Agency, or CPPA) with rulemaking authority, rather than delegating this function to the state's attorney general's office, as
many such laws do. In practice, this may mean that the CPPA has more in-house expertise than most state attorneys general and latitude to both engage in proactive enforcement via published guidance and tackle complex and emergent issues at the intersection of AI and personal data.

Beyond the EU and United States: Data Protection in China

In 2021, China's legislature followed the EU's example by promulgating a comprehensive and stringent data privacy law. Heavily inspired by the GDPR, China's Personal Information Protection Law (PIPL) was designed to give Chinese citizens control over their personal and sensitive data by delineating who can access, process, and share their information.34 As such, it incorporates many elements of the FIPs, including data collection limitations, purpose specification requirements, and use limitations. Despite commonly being referred to as a privacy law, the PIPL never directly mentions privacy but instead focuses on curbing the abuse and mishandling of personal information, theoretically by both corporate and state actors, though practically the state's ability to surveil its citizens remains unchecked.35 Like the GDPR and the CCPA, the law contains explicit provisions banning automated decision-making that enables differential treatment of consumers, including price discrimination. More broadly, it introduces limits on what was largely unfettered data collection by data-hungry AI companies, requiring informed consent for all kinds of data-processing activities and granting individuals key rights over their data, including the right to amend, delete, and request copies of information collected about them. Since the PIPL predominantly acts as a framework law that sets out broad principles and requirements, it was followed by a string of more granular implementing regulations, which have been directly impacting AI companies, particularly those with facial recognition products.36 However, the true impact of the PIPL on China's AI ecosystem remains hard to assess given the government's tendency to use it as a political tool. For example, in 2022, when China's ride-hailing giant Didi was fined by the government following a comprehensive cybersecurity review, the regulatory decision cited the PIPL and Didi's illegal collection of data, including facial recognition data.37
However, the unprecedented size of the fine and opaque application of a variety of laws and regulations may point to the PIPL being used as a tool to control the country's tech giants.38

d. Predictive AI vs. Generative AI: An inflection point for data protection regulation

Until generative AI systems broke through into public and policymaker consciousness in late 2022, discussions about AI regulation were focused on predictive AI systems that use data to classify, sort, and predict outcomes. Within the scope of predictive AI, concerns focused primarily on the outputs produced by these systems, with less focus on the data used to train them. Both policy discussions and proposed regulation for AI were primarily concerned with algorithmic audits39 and impact assessments,40 transparency and explainability,41 and enforcing civil rights42 as a means of ensuring decisional outputs were fair and unbiased.43 To the extent that privacy played a role in these discussions, concerns were typically related to the growing awareness of our main argument in this paper: that existing privacy laws such as the GDPR would impact aspects of AI development and that passing AI regulation without comprehensive privacy legislation, as would currently be the case in the United States, would be a job half-finished.44

It is not an overstatement to say that generative AI substantially shifted the terms of the debate. Awe over the capabilities of image generators such as DALL-E or Midjourney and LLMs such as ChatGPT simultaneously raised questions about how these systems were built and what data was used to power them. As it became more widely understood that generative systems are built predominantly on data scraped from across the internet, concerns mounted about exactly what data, and whose data, was powering these systems.45 These weren't novel concerns. Facial recognition software company Clearview AI had already raised the ire of privacy and civil liberties advocates, as well as European policymakers, for its aggressive acquisition of facial images to power its predictive criminal suspect identification app. Clearview built its software by scraping image data from across the internet, including from online services that explicitly prohibit such scraping. But given Clearview's niche product (available only to law enforcement organizations) and targeted impact (used to identify criminal suspects), its data use wasn't widely discussed, despite extensive reporting on the company by Kashmir Hill of The New York Times.46 Clearview has virtually been shut out of the EU marketplace after its data-gathering practices were found to be in gross violation of the GDPR.47 In the United States, a 2020 lawsuit by the American Civil Liberties Union leveraging the state of Illinois's Biometric Information Privacy Act resulted in a settlement that prohibits the company from making its products available to individuals and companies across the country, and prohibits use of its products by law enforcement agencies in Illinois.48

Meanwhile, as generative AI systems gained greater exposure, privacy regulators around the world scrambled to understand the impacts of these systems on the public and whether they violated existing laws.49 The G7 data protection authorities
went so far as to issue a group statement summarizing their concerns, specifically calling out the legal authority generative systems may have for processing personal information, especially related to children; the potential for generative systems to be used for attacks to extract personal information; and the need to produce compliance documentation about the life cycle of the data used to develop and train their models. The statement also called for “privacy by design,” the practice of taking privacy into account throughout all stages of system development, while reiterating the need for developers to respect data protection rights and the data minimization principle.50 The Italian data protection authority went even further, banning ChatGPT until OpenAI, its creator, put specific practices in place (see below). The fact that many generative systems are built at least in part on scraped data raises questions about whether and under what contexts data-scraping practices can be compliant with the GDPR, particularly when personally identifiable data is scraped and included in training data, even if that data is publicly available. In particular, it may place consent and legitimate interest at odds, as companies like Clearview argue (albeit unsuccessfully in this instance) that they do not need consent for publicly accessible data.51 Generative systems raise other crucial questions about training data, such as the extent to which procedural data rights will apply to them, whether individuals can request to
delete their data from training datasets or object to this form of processing, and whether any of this will depend on the context of use of the generative application in making these determinations.

Italy Scrutinizes ChatGPT's Data Practices

On March 20, 2023, the Italian Data Protection Authority (the Garante) received a report that OpenAI, the company that developed GPT-4, the AI model on which ChatGPT is based, had experienced a breach of user data. The Garante swiftly launched an investigation that found OpenAI was collecting user-generated data to train its AI model, including “users' conversations and information on payments by subscribers to the service.”56 It deemed the collection of this data to train ChatGPT's language model unlawful under the GDPR. On March 31, 2023, the Garante demanded that OpenAI block Italian users from having access to ChatGPT. It further required OpenAI to disclose how it utilizes user data to train its AI model, to address concerns that ChatGPT produced inaccurate information about individuals, and to create an age verification mechanism within a month, or risk being fined 20 million euros or 4% of the company's annual turnover.57 Throughout April, OpenAI implemented changes to meet the Garante's demands, including a new information notice describing how personal data is used to train its AI model, as well as a new, ad-hoc form that allows users to opt out from having their data
processed to train the algorithms. They also added an age verification system and gave users the ability to erase personal information they deem inaccurate. However, OpenAI stated that “it is technically impossible, as of now, to rectify inaccuracies.”58 The Garante accepted OpenAI's changes and allowed Italians to access the chatbot again. Yet the regulator continued its investigations into the developer's data practices, concluding on January 29, 2024, that ChatGPT is in breach of the GDPR and giving OpenAI 30 days to respond with a defense against the alleged breaches.59

In the United States, discussions about the permission needed for data used to build generative AI have tended to shift toward copyright given that, in the absence of a federal consumer privacy law, copyright has offered the clearest path for content creators to demand that companies remove their data from training datasets.52 This approach yields mixed results, given the challenge of reverse engineering the existence of a particular item of content in a system's training data absent any transparency obligations for companies to share how and with what they trained their models. It is also a poor approach for resolving privacy issues other than those that may implicate copyrightable content. In July 2023, the Federal Trade Commission (FTC) issued a civil investigative demand to OpenAI with detailed requests concerning its training data.53 This highly specific focus on obtaining information about a company's training data is
not without precedent; the FTC has settled multiple investigations with companies that used AI in their product offerings, demanding that the companies delete their model and the associated data because the data used to train it was improperly acquired.54 Lina Khan, chair of the FTC, argued in a New York Times op-ed that “exploitative collection or use of personal data” falls within the agency's authority to prohibit “unfair or deceptive trade practices.”55

These events demonstrate that both EU and U.S. regulators have some flexibility and regulatory tools at their disposal to adapt enforcement to changes in technology. Nonetheless, relying only on existing legislation, especially in the United States, is akin to bringing a knife to a gunfight. While the GDPR is settled law, as of early 2024 the CCPA remains a work in progress that is unlikely to be finalized until later in the year. As we discuss in the next chapter, incorporating automated decision-making into these regulations provides the necessary latitude for regulators to include AI in their oversight of algorithmic systems, and to potentially broaden their scope to focus on AI-specific issues, such as training data.

Chapter 3: Provocations and Predictions

In this chapter, we present a set of four provocations and predictions that we believe highlight the key issues that must be confronted as we continue with regulating both privacy and AI. First, we predict that continued AI development will further increase developers' hunger for data, the foundation of AI systems. Second, we stress that the privacy harms caused by largely unrestrained data collection extend beyond the individual level to the group and societal levels and that these harms cannot be addressed through the exercise of individual data rights alone. Third, we argue that while existing and proposed privacy legislation based on the FIPs will implicitly regulate AI development, it is not sufficient to address societal-level privacy harms. Fourth, even legislation that contains explicit provisions on algorithmic decision-making and other forms of AI is limited and does not provide the data governance measures needed to meaningfully regulate the data used in AI systems.

a. Data is the foundation of AI systems, which will demand ever greater amounts of data

The era of “Big Data,” the exponentially increased amount of data collected,
created, and stored as the internet expanded and people's online activities grew to encompass virtually every aspect of their lives, created one of the preconditions for the explosive growth of AI. Companies now know more about our personal lives than we ever thought they would: who we are, what we like, where we go, what we do, whom we do it with, and what we think and even feel. We predict that the expansion of AI systems across the globe will continue to increase the demand for data among developers. This growing demand will heighten the pressure on the entire existing data ecosystem to increase the amount and types of data collected from consumers, as well as incentivize companies to violate the principles of data minimization and purpose limitation in their pursuit of ever more data. Both the totality of data, and the surface areas by which data is generated and collected, such as embedded sensors in household objects, smart appliances, and biometric cameras in public spaces, will continue to expand.

AI's appetite for data currently knows few bounds. According to the Global Partnership on AI, “building an AI system typically involves sourcing large amounts of data and creating datasets for training, testing and evaluation, and then deployment. This process is iterative in the sense that it may require several rounds of training, testing and evaluation until the desired outcome is achieved and data plays an important role at each step.”60 None of the AI advances achieved over the past decade would have happened without this broad availability of data combined with the massively more powerful computers, processing capacity, and cloud storage that developed at the same time. As Mark MacCarthy describes, “artificial intelligence, machine learning, cloud computing, big data analytics and the Internet of Things rest firmly on the ubiquity of data collection, the collapse of data storage costs, and the astonishing power of new analytic techniques to derive novel insights that can improve decision-making in all areas of economic, social and political life.”61 Companies have not been incentivized to curb their collection of consumer data, in part due to competitive pressures to maximize targeted, highly personalized services, a task that requires
data collection for analytical purposes even if the initial purpose and value of collecting the data is speculative. As commercial sector AI development increased, so did companies' demands for data, the result, in part, of testing AI systems that generally show improvements in the accuracy and validity of outputs when exposed to greater amounts of sufficiently representative training data.62 Predictive AI in particular demands large datasets in order to complete advanced pattern analysis, where almost any variable could potentially hold the key to reliable correlations or associations between inputs and outputs. However, a growing body of research is increasingly challenging the assumption that more data is better by showing that similar performance levels can be achieved using comparatively less data overall when it is selected with more intentionality and specificity.63

While not all applications of AI require consumer data, the largest technology companies, which have been building massive stores of consumer data for at least fifteen years and in some cases longer, have emerged with a marketplace advantage in the development of AI in part because of their ready access to these immense datasets. Newer AI developers like Anthropic or OpenAI have had to turn to other data sources to acquire the data to build and train their systems.64

While most forms of predictive (machine learning-based) AI are data-dependent for their development, it is the recent emergence of powerful generative AI systems that best illustrates the magnitude of data required for model training. Generative systems such as LLMs (like GPT-4) and user-facing tools built on top of them (like ChatGPT), as well as image generation systems like Stable Diffusion or Midjourney, have dazzled the public with their practical as well as entertaining applications. At the same time, as we discussed in Chapter 2, their high visibility has raised questions about how such systems operate, including what data they are trained on, and the potential privacy and other risks of interacting with these systems.65 There are presently no transparency mandates requiring companies to detail where and how they acquire their training data outside of the EU AI Act, and those requirements only apply to systems designated as high-risk.66 Many of the largest companies building generative AI systems have not been responsive to public inquiries into where they source their data and what procedures they use to strip their training data of personally identifiable information and other sensitive aspects.67 Of course, legal jurisdictions also matter; web scraping that captures personal information and is legal in the United States may not be permissible under the GDPR, and companies are increasingly forced to navigate territorial issues, both
between the United States and the EU and others following the GDPR model.

b. AI systems pose unique risks to both individual and societal privacy that require new approaches to regulation

Existing and proposed privacy regulations are largely a retrospective answer to the past twenty years of technological change and increasing threats to our individual data privacy. However, the rise in the breadth and amount of data collected from individuals across all aspects of their online interactions, as well as new threats posed by AI systems, require that we think prospectively and ensure that we have the tools in place to grapple with the changes ahead. Further, beyond the documented harms to individuals, AI systems also pose considerable societal privacy risks that existing regulations are ill-equipped to address.

Risks and Harms to Individuals from AI Systems

Information privacy can be a difficult concept to specify as it is both multidimensional and highly contextual. Law professors Danielle Citron and Daniel Solove created a taxonomy of information privacy harms, which include physical, economic, reputational, emotional, and relational harms to individuals.68 Citron and Solove also call out discrimination and vulnerability-based harms, those that can occur due to information asymmetries between individuals and data collectors. There are also the harms that the FIPs were intended to address: harms to one's autonomy, the inability to make informed choices, the inability to correct data, and a general lack of control over how one's information is gathered and used. All of these are as relevant to AI-based systems as they were to the technological developments of the past three decades of internet expansion. While these harms predate the application of AI to the consumer sector, commercial AI systems will cause them, exacerbate them, and even pose new ones. For example, recent research based on Solove's own privacy taxonomy69 identified not only existing privacy risks that AI exacerbates but also those the authors argue AI creates, such as new forms of identity-based risks; data aggregation and inference risks; personality and emotional state inferences via applications of phrenology and physiognomy; exposure of previously unavailable or redacted sensitive personal information; and misidentification and defamation.70 Technical advances in AI are also creating new avenues for privacy harm, such as the harms caused by generative AI systems inferring personal information about individuals or providing users with the ability to target individuals by creating content about them that is defamatory or impersonates them. In addition to traditional concerns about individual privacy and personal data, these systems generate predictive or creative output that, through relational inferences, can even impact people whose data was not included in the training datasets or who may never have been users of the systems themselves. When personal data is included in the training dataset, research has demonstrated that these
systems can memorize the data and then expose it to other users as part of the outputs.71 While most generative AI systems advise that individuals not include personal data in prompts or other inputs, many people still do, and when users of these systems input personal information, including confidential or legally protected data, these systems may store this data for future uses, including model retraining, or share it with other users as part of the system outputs.

The evolution of risks to online information privacy over the past two decades is a history of ever-increasing consumer surveillance and individual profiling, primarily driven by the goal of targeting consumers with advertisements and offers based on their behavior both on- and offline. As social media platforms proliferated and grew, they too became an avenue for consumer surveillance, expanding the realm of information that could be collected about consumers across a growing set of contexts. Mobile devices and apps, smart speakers, smart home devices: each new technological development added another layer of information that could be collected beyond the initial ambit of online shopping. Shoshana Zuboff terms the practice of extracting value from and about individuals “surveillance capitalism,” which “unilaterally claims human experience as free raw material for translation into behavioral data.”72 Today it is exceedingly difficult, if not impossible, for an individual using online or connected products or services to escape systematic digital surveillance across most facets of their life. The collection of personal data occurs not only in instances where individuals make the choice to engage directly with an app or a service; in many cases, it also occurs silently by invisible third parties tracking individuals' actions in browsers and mobile apps without giving affirmative notification or securing their consent.

The focus on capturing consumer behavior and using it for predictive purposes expanded with the proliferation of sources for data collection. Individual profiling and inference-making became indispensable for a broadening range of contexts beyond merely serving ads. Profiling for determining credit, insurance, employment, housing, and medicine are but a few examples. Over the past five years, emergent AI systems have increasingly been deployed in these contexts as well, as their predictive capabilities are even greater than those of previous big data applications due to the computational capacities of AI. The Future of Privacy Forum categorizes the harms to individuals from automated systems into four areas: losses of opportunity, losses of liberty, economic losses, and social detriments.73 These can result in harms such as discrimination in housing, employment, education, and other areas; surveillance and incarceration; denials of credit, differential pricing, and an overall narrowing of available choices; and harms to dignity due to bias or opportunity losses, as well as algorithm-based social sorting and filtering that can influence what or whom you connect with in digitally mediated social environments. Profiling in particular increases the scope and scale of data collected about individuals and the related inference-building across a variety of contexts. There is also a lack of transparency about how automated systems function, making it challenging for individuals to alter or limit their impact. AI systems can automate many forms of decision-making and classification, exacerbating the privacy risks and harms already present in our “pre-AI” data ecosystem.74 The potential harms resulting from such privacy infringements aren't limited to the consumer marketplace, where today companies
can not only tailor advertisements to you with fine-grained precision, but in some cases also use the data they have collected and inferred about you for manipulative or discriminatory commercial uses.75 The result is an exploitation of data that undermines societal norms and values by removing the structural and contextual barriers that previously acted as safeguards against its widespread access.76 This is a means of collecting data against which FIPs-based, individual due process rights offer little protection or recourse. Individuals cannot use the FIPs effectively to protect themselves from this form of data collection, especially when it happens without notice, through inferences, and even from sources we may be unaware of (including when data scrapers obtain data from services without permission, such as from social media or photo-sharing sites).77 As nearly all facets of our lives are increasingly mediated through technology, the risks increase for AI systems to perpetuate biases, stereotypes, and errors, manipulate consumers, and enable discrimination, particularly in the absence of regulation or transparency measures designed to keep these harms in check. Already, the scope and scale of our “data relationships” with the companies that collect our data directly (first party) and those that do so indirectly (third party) are too great for individuals to manage in any reasonable way, assuming we even know who is collecting our data. Under existing or proposed privacy laws, the incentives for companies to collect as much data as they possibly can are unlikely to diminish. As generative AI systems continue to proliferate, many built with online data scraped from the internet without consent, individuals stand little chance of addressing these privacy risks themselves through opt-out, correction, or deletion rights.

Societal Risks and Harms to Privacy from AI

The privacy risks and harms posed by AI systems are not limited to individuals; they also threaten groups and society at large in ways that cannot be mitigated through the exercise of individual data rights. Returning to the Future of Privacy Forum's taxonomy, the societal-level harms from automated systems based on group membership include differential access to opportunities such as jobs, housing, education, credit, and goods and services; increased surveillance and disproportionate incarceration of specific groups; and reinforcement of negative stereotypes and biases.78 AI systems create the capacity for large-scale societal risks precisely because they operate at scale, analyzing tremendous amounts of data and in turn making connections and predictions previously not possible through other means. This capacity can result in classifying and applying decisional outcomes to large swaths of the population based on group affiliation, thereby amplifying social biases for particular groups. Harms at the societal level can also pose threats to democracy, as well as impact the benefits that privacy affords individuals, which in turn impact the development of autonomy necessary for cultural and societal flourishing.79

From a privacy perspective, a specific concern is that profiling at a societal level contributes to a widespread erosion of privacy norms and expectations. The expectation that your data will be gathered at every turn, the powerlessness of being unable to do anything about it, and the lack of transparency about how one's data is used or decisions are made about you all feed a growing sense of inevitability that data privacy has already been lost.80 This is not simply a reflection
of changing norms about online sharing and publicness, as Mark Zuckerberg disingenuously argued in 2009 when forcing Facebook users' data to be public by default, and setting the stage for the Cambridge Analytica scandal.81 The growth of generative AI has drawn attention to the pervasiveness of data collection, and the sources of that data, as the connection between scraped data and the ability of generative systems to create their wondrous outputs has raised questions about exactly where the data is coming from. The more we mine the public sphere for data, the more we erode the sense that we should have a right to exist in public, whether a digitally mediated space or a physical one, with any degree of privacy or anonymity.

This shift toward using AI in contexts with civil rights implications, such as hiring,82 criminal justice,83 and policing,84 has profound implications for both individuals and society at large. Individuals interact with systems they may not think of as highly technical (such as applying for a job), or to which they haven't signed up as users, but within which AI calculations are applied to them through inferences: to, say, predict their health outcomes, calculate their insurance rates, or determine whether their employment application gets reviewed. As these systems proliferate, they can amplify existing biases and inequities. At their most extreme, they can be used by governments as tools of social control.

c. Data protection principles in existing privacy laws will have an implicit, but limited, impact on AI development

The application of specific fair information practices in existing regulations, such as requirements for data minimization and purpose limitation, will impact AI development. The question we raise is whether these principles are sufficient for tackling the privacy risks and harms posed by AI. In the United States, lawmakers are increasingly arguing that passing federal privacy legislation, similar to the GDPR, is a necessary precondition to any regulation that explicitly targets AI systems.85 Existing privacy and data protection laws in both the EU and the United States (at the state level) will regulate AI systems that rely on personal data for training purposes or that ingest it as part of the service they offer, but only up to a point. Even if the United States adopts a GDPR-esque law that provides FIPs-based rights, this approach will not be sufficient on its own to address the risks and harms we discussed above.

Indirect Regulation Through Data Minimization and Purpose Limitation

Both the CCPA and the GDPR, as well as other similar federal agency and state-level regulations in the United States and EU member state regulations, impact the development and deployment of AI-based systems by limiting the personal data that companies can collect and use to train and retrain AI models ad infinitum.86 Specifically, the principles of data minimization and purpose limitation, if clearly delineated and enforced, should limit how much personal information is collected and how it can be used and reused for AI systems. Companies need to justify how data collected from consumers in one context for a particular use could be reused in an entirely different context or for a new purpose. However, the degree of protection varies considerably between jurisdictions. With the GDPR as their foundation since 2018, EU member states have a stronger and broader set of enforcement powers than do
 the minority of U.S. states that have passed data privacy laws. In response, researchers and industry practitioners have already developed, tested, and deployed a wide array of techniques to meet data minimization and purpose limitation requirements without compromising performance.87 During the training phase of AI models, privacy-preserving methods (including federated learning) have been employed to minimize data.88 During the model inference phase, experts point to the conversion of personal data into less “human readable” formats, the anonymization of queries, and data shuffling, among other privacy-preserving techniques.89 Still, more research into data minimization and purpose limitation compliance in AI systems is greatly needed.90

Existing privacy laws do address the use of data collected or generated directly by an AI system (e.g., a user's prompts to a chatbot or other generative AI system, or data processed by a predictive AI system, such as recruiting and hiring software). To the extent that these systems directly ingest or process personal data, or make predictions or inferences about individuals based on this collected data, privacy regulations implicitly regulate their operation by requiring compliance with individuals' rights to access, correct, and delete personal data, to request a copy of their data, or to opt out of future sales or sharing of their data. In many AI use cases, companies must conduct the same privacy or data protection impact and/or risk assessments that they would conduct even if they were not utilizing AI, in order to demonstrate that they have adequately considered the risks to individuals of collecting and using personal data as part of deploying their systems.

Limitations of the FIPs-based Framework

The FIPs provide the substantive framework for existing privacy and data protection laws around the globe, based on principles that were developed over 50 years ago. Many policymakers view them as a model for future privacy legislation; even China has adopted a version of the FIPs in its own privacy legislation, largely viewed as modeled after the GDPR (see Chapter 2). However, both the FIPs and the laws based upon them have their critics. Law professor Woodrow Hartzog in particular has criticized the FIPs as inadequate but invaluable, noting that in a modern society awash in data and data collection, “control does not scale.”91

Data Minimization and Purpose Limitation

Enforcement of the
 data minimization and purpose limitation principles should, in theory, translate to more conservative and thoughtful personal data collection. However, these approaches as practiced today fail to address many of the fundamental weaknesses of our current data ecosystem. For example, they do not address the inequitable power dynamics of a data ecosystem in which the data collectors and processors, most of which are powerful private tech companies, hold far more market power over personal data collection than do individuals. Further, it may be reasonably straightforward to hold a company to account if its use of data doesn't match the purpose it gave at collection. However, in the absence of an agreement as to what constitutes too much data, it will be a challenge for regulators to operationalize whether a company is sufficiently practicing data minimization outside of egregious violations. Today, the pursuit of quality (i.e., data that is reliable, relevant, and collected ethically) is still mostly overridden by a pursuit of quantity (i.e., collecting vast amounts of data cheaply and at scale, by any means necessary), especially in markets that lack robust privacy legislation, like the United States.92

The Limits of Privacy Self-Management

A core weakness of the FIPs framework is that individuals are assumed to have a level of control and power equal to that held by the companies and institutions collecting and processing their data.93 However, this is not the case; often individuals cannot simply choose an alternate product or service with more privacy-protective data collection practices. Monopolistic practices in the tech sector, consumer lock-in, and a general incentive for businesses to collect as much data as possible undermine privacy as a competitive factor except in a few cases. Privacy law expert Daniel Solove named the burden on individuals to manage and exercise their rights to curb data collection “privacy self-management.” As our use of digital products and services has increased, privacy self-management has failed to give individuals the tools they need if they want to prevent, or at least reduce, the amount of data collected about them.94 Thus, while the FIPs are a necessary baseline to ensure that individuals have due process rights with respect to their personal information, they fail to empower individuals to have a meaningful impact on their privacy in the age of AI. FIPs-based regulations may be designed to constrain companies from collecting and processing data for AI systems, but they ultimately don't solve the core problem of how to prevent data collection in the first place in a society where it is difficult, if not impossible, for
the majority of people to avoid interacting with technology.

Data Collection by Default: Opt In or Opt Out?

The expansion of the FIPs from their original application to governmental data collection in the early 1970s to the private sector reinforced the approach of allowing data collection by default. There are legitimate reasons to allow governments to collect data in many circumstances without requiring individuals to give their explicit consent: tax collection, census taking, and provisioning public benefits are but a few examples. But applying this rationale to the private sector normalized the idea that individuals should have to opt out, rather than choose to opt in. The GDPR tries but does not fully resolve this dilemma. As Mark MacCarthy notes, the “GDPR provides procedural, not substantive protections. Its goal is not to limit any specific use of information but to ensure that all uses are subject to certain fair procedures to ensure the protection of the rights of data subjects.”95 The GDPR threads the needle between always requiring consent and allowing collection without it by providing six bases for processing data; the two most salient for this discussion are consent and legitimate interest.96 No matter which basis is used, data processors must inform individuals about the processing when collecting their data, and the processing must not “seriously impact” individuals' rights and freedoms.97

Arguably, asking for consent (opt in) is the most straightforward basis for data processing, as individuals maintain the right to withdraw it. When it comes to legitimate interest, the U.K. Information Commissioner's Office describes it as the most “flexible” of the bases.98 The office suggests that legitimate interest may be appropriate when processing offers a clear benefit “to you (the processor) or others,” if there is limited privacy impact on the individual, the use matches an individual's reasonable expectations, or the controller cannot or does not want to give the individual “full upfront control (i.e., consent) or bother them with disruptive consent requests when they are unlikely to object to the processing.”99 Legitimate interest can allow for opt-out-based data collection, though controllers are cautioned that it cannot be used as a basis for all data processing, and controllers must have a clear justification for using it in their particular context. Legitimate interest has been criticized for allowing data collection practices that some argue violate the basis and act as an opt-out rather than an opt-in basis. For example, in March 2023, in response to a ruling by the European Data Protection Board that denied Meta's use of its sites' terms of service as a basis for using behavioral targeting in advertising, Meta switched to the legitimate interest basis,100 which the activist group Noyb argued was a violation of users' fundamental rights.101 The switch requires Facebook and Instagram users to submit an online form to register their objection to the use of their behavioral product usage for targeting; unless they object, however, Meta will proceed with the targeting, placing the burden on individual users to sort out the details.102

While the FIPs provide crucial procedural data protection rights, they fundamentally do not curb data collection by default. Instead, the focus is on rights one can exercise after data has already been collected, leaving the burden of managing one's privacy on individuals who may have little time or inclination to actively participate in this work. They do not provide a right of refusal, or a clear, convenient, non-fatiguing means to interact with digital products or services without having to give up some personal information.103 While the EU's 2002 ePrivacy Directive (discussed in Chapter 4) attempted to
 curb cookie setting by default, and the GDPR has positively impacted the design and simplicity of cookie consents, they remain the hallmark of how not to implement opt in. As we will argue later, there are better ways to implement an opt-in approach.

d. The explicit algorithmic and AI-based provisions in existing laws do not sufficiently address privacy risks

In addition to the implicit impacts of FIPs-based laws discussed above, both existing and proposed privacy regulations include specific provisions targeted at algorithmic systems in a way that will include AI. These include provisions in the GDPR and U.S. state laws, such as California's, that address automated decision-making and profiling and that require data protection impact assessments to obligate companies to identify uses of data that pose risks to their customers.104 These provisions are intended to ensure that privacy and data protection regulations cover specific data-intensive practices that implicate individual data privacy. They use a risk-based framework to place obligations on data processors to incorporate risk mitigation into their data governance practices, including risks to their customers, not just to the business. These measures will have some impact on AI systems, as we discuss below. However, these explicit regulations do not address the limitations of the FIPs framework, nor do they sufficiently focus on the broader data governance measures needed to regulate the data used for AI development. Addressing these challenges will require additional policy measures, which we discuss in Chapter 4.

Automated Decision-Making and AI

The term automated decision-making (ADM) is not a recent invention. Concerns with delegating decision-making about individuals using personal information to automated systems date back at least to the origination of the FIPs in the early 1970s,105 though the FIPs themselves do not address the issue of automation. The U.K. Information Commissioner's Office defines ADM as “the process of making a decision by automated means without any human involvement. These decisions can be based on factual data, as well as on digitally created profiles or inferred data.”106 The GDPR incorporates the concept in Article 22, noting that “the data subject shall have the right not to be subject to a decision based solely on automated processing, including profiling, which produces legal effects concerning him or her or similarly significantly affects him or her.”107 The inclusion of the term “profiling,” defined as “any form of automated processing of personal data consisting of the use of personal data to evaluate certain personal aspects relating to a natural person, in particular to analyse or predict aspects concerning that natural person's performance at work, economic situation, health, personal preferences, interests, reliability, behaviour, location or movements,” is a specific call-out of the data mining and prediction practices that implicate privacy through their use of personal data and their focus on individuals.108 ADM can arguably be construed as including any form of AI that trains on or ingests personal data, or that makes predictions or decisions about individuals, though existing laws narrow the applicability to ADM with significant or legal impacts to
avoid the overinclusion of low- to no-privacy-risk ADM (e.g., an algorithm that takes in an address to find a nearby store but does not store that location data, or a clothing sizing algorithm that asks the customer for their size in specific brands of clothing to calculate a more accurate sizing estimate). The term itself is not specific to any particular technology, describing a process that can be accomplished using rule-based algorithms as well as forms of AI, such as predictive AI.109 The GDPR provides data subjects the rights to contest and withdraw from automated processing, creating guardrails to prevent a Kafkaesque landscape of ADM that cannot be contested.110 In this context, the GDPR expressly connects automated processing to the practice of profiling as a threat to privacy. The scope of the GDPR's automated decision-making provision arguably impacts AI to the extent that a system renders a decision, or makes a prediction, that results in a significant impact on individuals' lives and relies on processing personal data to do so. Regulations that specifically call out ADM follow the GDPR's lead in focusing on systems with “legal effects” or similar impact, such as extending or denying credit, hiring, housing eligibility, and so on.

The present scope of ADM regulations in the EU and the United States focuses on providing notice to consumers that automated processing is occurring, giving them opt-out rights in qualifying contexts (e.g., with significant impacts or legal effects), and requiring ADM systems to provide information about the “logic” of the system design: its purpose, how it renders decisions, the potential safeguards in place, and the extent of human oversight over the system. For example, the crafters of the CCPA111 and the subsequent update to the law (the California Privacy Rights Act of 2020112) tasked the new California Privacy Protection Agency with regulating access and opt-out rights for businesses' use of ADM that processes personal information (including training data113) or otherwise poses a risk to privacy.114 Colorado's 2023 privacy law also includes ADM regulations, distinguishing between solely automated, human-reviewed, or human-involved automated processing, and sets obligations in accordance with the level of human involvement.115 These types of measures closely follow the FIPs (i.e., notice, data access, data correction) in providing procedural rights and protections.

Privacy and Data Protection Impact Assessments

Impact assessments for privacy and data protection have their roots in the growth of environmental protection regulation that emerged in the 1960s.116 In the privacy and data protection sectors, they are used to guide both public and private sector organizations toward proactive risk assessment when planning a new product or service that utilizes personal data. In the United States, Section 208 of the E-Government Act of 2002 obligates federal agencies to conduct privacy impact assessments (PIAs) when “developing or procuring information technology that collects, maintains, or disseminates information that is in an identifiable form.”117 The GDPR requires data protection impact assessments (DPIAs) that are triggered “whenever processing is likely to result in a high risk to the rights and freedoms of individuals,”118 such as large-scale uses of sensitive data or public surveillance, systematic individual profiling, and automated decision-making without human involvement. Once the regulatory process is completed in 2024,119 the CCPA may require DPIAs from companies whose processing of data poses a substantial risk to privacy, including selling or sharing personal information; processing sensitive personal information; using ADM in specific ways (including decisions with “legal or similarly significant effects”); profiling employees, job applicants, students, and consumers in publicly accessible places; behavioral advertising profiling; and processing the personal information of children under 16.120
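Taken together, these triggers read like a checklist an organization could run before processing begins. The sketch below is purely illustrative: the trigger labels and field names are our own simplification for exposition, not language drawn from the GDPR or the draft CCPA regulations, and real obligations turn on legal analysis rather than boolean flags.

```python
# Illustrative only: a simplified checklist loosely inspired by DPIA trigger
# conditions of the kind described above. Hypothetical names throughout.
from dataclasses import dataclass


@dataclass
class ProcessingActivity:
    sells_or_shares_personal_info: bool = False
    processes_sensitive_info: bool = False
    adm_with_significant_effects: bool = False
    profiles_in_public_places: bool = False
    behavioral_ad_profiling: bool = False
    processes_minors_data: bool = False


def dpia_triggers(activity: ProcessingActivity) -> list[str]:
    """Return the labels of the (hypothetical) triggers this activity hits."""
    checks = {
        "sale/sharing of personal information": activity.sells_or_shares_personal_info,
        "sensitive personal information": activity.processes_sensitive_info,
        "ADM with legal or similarly significant effects": activity.adm_with_significant_effects,
        "profiling in publicly accessible places": activity.profiles_in_public_places,
        "behavioral advertising profiling": activity.behavioral_ad_profiling,
        "personal information of children under 16": activity.processes_minors_data,
    }
    return [label for label, hit in checks.items() if hit]


# A hiring tool that automatically scores applicants using sensitive data:
hiring_tool = ProcessingActivity(adm_with_significant_effects=True,
                                 processes_sensitive_info=True)
print(dpia_triggers(hiring_tool))
```

In this toy framing, any nonempty list would prompt a documented assessment before processing begins.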
PIAs and DPIAs are tools to prompt organizations to engage in sufficient planning and self-reflection to foresee potential risks and to integrate mitigations into their design and planning processes (or, in the case of startups, to compel them to adopt such processes) by considering both the data types and the processing activities that pose high risks to individuals. Algorithmic impact assessments have been proposed as a tool for the general oversight of AI systems, and though they may make mention of privacy and data, they are not focused exclusively on these topics.121

How Do These Explicit Provisions Fall Short?

The scope of the ADM provisions in both the GDPR and the CCPA seeks to strike a balance between preventing runaway ADM scenarios and casting such a broad net that every form of ADM requires the application of the full set of notice and opt-out rights. To be sure, an opt-out right is a potentially powerful deterrent against the over-application of ADM. The prospect of creating an alternate non-ADM process to comply with opt-out requests may keep privacy and data protection lawyers up at night. Consider, for example, a large company that receives thousands of job applications per month: requiring a non-automated process for rendering judgments on applicants could be daunting. However, the underlying logic of both opt-outs and notice requirements doubles down on the privacy self-management approach, placing the burden on individuals to understand what automated decision-making is and why they may wish to opt out of it. The notice and consent approach to privacy already places a significant burden on individuals not only to exercise these rights but also to comprehend why one might want to do so.122 Given the complexity of understanding AI systems and how one's data may interact with them, this presents an even heavier lift. And it's unclear what it may accomplish for consumers. The requirement to label ADM systems for the public may raise more questions for consumers than answers. Paradoxically, it's conceivable that people might elect to opt out of ADM systems in favor of a non-automated process that could turn out to be even more arbitrary or biased than an ADM-based one. In essence, this becomes a form of labeling not unlike the practice of genetically modified organism (GMO) labeling on foods; in the absence of clear scientific evidence determining whether the consumption of GMO-based foods is harmful, the decision to consume them or not is punted to the consumer, who may have little to no understanding of the issue, leading them to make uninformed choices that could help or harm them. This is not to say that labeling or providing notice of the use of ADM has no benefit; certainly, to the extent that there are individuals who want to exercise the right not to be subject to ADM, this opt-out right is crucial. But it becomes an exercise of personal preference that has the potential to harm an individual given their unique circumstances. And while existing federal and state laws may prohibit AI systems that produce discriminatory or biased outputs, this approach leaves a loophole in regard to systems that may have negative implications for one's data privacy. A framework that placed strict limits on the use of AI systems with negative privacy impacts, both for individuals and at a societal level, would provide a more consistent approach. Finally, the ADM track may miss most uses of generative AI systems to the extent that they are not used for decision-making purposes, leaving open the possibility that they could be implemented in ways consumers do not recognize as involving AI while escaping notice requirements.

While data protection impact assessments are a necessary and useful regulatory tool for protecting data privacy, they are not a steadfast guarantee against either governments or private companies implementing harmful technologies. For one, they depend on a regulatory or institutional structure that has sufficient authority to act when DPIAs or PIAs are done poorly or fail to anticipate risks.123 Without such structural support, they are little more than a bureaucratic hurdle with no teeth.
In the United States, for example, several of the large technology companies developing AI systems have elected to blow past the advice of their own risk-averse legal staff and responsible innovation teams and market AI tools without fully understanding the risks to their users, let alone the larger public.124 Another issue is that if there isn't a standard against which impact assessments can be evaluated, businesses can turn the process into an exercise in grading their own homework by setting their own internal standards. While the GDPR's DPIA requirements recognize the heightened risks from some forms of data processing, including from fully automated decision-making technologies, this approach assumes one is acting on data that has already been collected. It is possible that the process of anticipating as well as completing a DPIA could dissuade or prevent an organization from electing to launch a product or service due to the risks it identifies. But the fact that these tools presently do not direct organizations to engage in these processes before product creation, perhaps even before collecting training data, means that their ability to surface risky data collection or data management decisions, or to prevent such actions, is lessened. Given the importance of training data to the outputs of AI systems, it is possible that the entire data development pipeline should be subject to data or privacy impact assessments. Mehtab Khan and Alex Hanna review the stages of dataset creation and discuss some of the documentation interventions that are increasingly being suggested to add greater accountability to the dataset development process (we will discuss these in more depth in Chapter 4).125

The proposed California regulations do attempt to tackle some of these issues. For example, Section 7154 places disclosure obligations on businesses that process personal information to train ADM systems or AI, requiring that they disclose to downstream users the appropriate uses of the technology, as well as conduct their own risk assessment that addresses “any safeguards the business has implemented or will implement to ensure that the automated decision-making technology or artificial intelligence is used for appropriate purposes by other persons.”126 Further, for businesses required to conduct a risk assessment as described above, Section 7155 prohibits businesses from processing personal information if the “risks to consumers' privacy outweigh the benefits resulting from processing to the consumer, the business, other stakeholders, and the public.”127 In order for our existing frameworks to fully grapple with AI-based privacy threats, regulators will need to keep refining and expanding provisions like these,
and more.

e. Closing thoughts

Overall, FIPs-based privacy and data protection laws have not anticipated the growth of AI systems or variants such as generative AI. Despite a decade-plus of exposure to the emergence and growth of big data, these regulatory frameworks are not prepared to respond to and oversee the data-intensive aspects of AI systems. Ultimately, existing FIPs-based privacy regulations cannot sufficiently regulate the data that feeds AI development in a way that sustains our existing state of privacy today or, better yet, improves it. A privacy and data protection framework that places the primary responsibility on individuals to manage their data across hundreds, even thousands, of digital relationships and channels fundamentally does not scale, and thus will not succeed in protecting individual privacy. Nor will it solve population- or societal-level risks and harms to privacy. As legal theorist Salomé Viljoen notes, “responding adequately to the economic imperatives and social effects of data production will require moving past proposals for individualist data-subject rights and toward theorizing the collective institutional forms required for responsible data governance.”128 These challenges have become more visible following the explosion of generative AI systems, built primarily from data scraped online. It will be very difficult, if not impossible, for individuals to shoulder the burden of exercising their deletion and correction rights with these massive and nontransparent systems, let alone to proactively prevent the inclusion of their data. Privacy self-management approaches that force individuals to bear the burden of systemic privacy challenges will not substantively improve individual privacy.

Unfortunately, passing more FIPs-based regulations will not resolve the individual privacy challenges or systemic risks posed by AI systems. Even if the United States were to pass the 2022 version of the ADPPA, neither it nor the GDPR provides sufficient oversight of the data used to develop and train AI. While these laws nibble at the edges, they do not confront the bias toward collecting data first and asking questions later. They do not adequately address consent. They do not provide sufficient methods for people to engage with technological systems without ubiquitous data collection. They do not address societal-level privacy harms. And they do not provide a framework for addressing the privacy issues raised by AI training data, whether they come from proprietary datasets, open source or public datasets, or data scraped from the internet. The computer science maxim of “garbage in, garbage out” is as relevant as ever when it comes to building AI systems. Whether they are trained on curated but biased datasets, or on data scraped from questionable parts of the internet, the impact assessment process must direct attention to the antecedents of AI products and not simply their outputs. In the next chapter, we make three suggestions we believe should be adopted in order to address these issues. They alone may not be sufficient to address the issues we have raised above, but we believe they are an important starting point for moving forward in the right direction.

Chapter 4: Suggestions for Mitigating the Privacy Harms of AI

The adoption of AI can bring benefits across many different societal contexts if, and only if, AI systems are designed to center human needs and values. But AI systems' requirements for data, combined with applications that will generate or consume an extraordinary amount of personal
 information, raise several crucial questions: Is data privacy compatible with the growth of AI? Can we have widespread adoption of AI and still preserve our information privacy, even at the minimal state in which it exists today? Can we do better?

The rapid growth and adoption of AI raises legitimate concerns about its possible risks to humanity. At the same time that we are debating whether and how we want to live in a world that utilizes AI, some are questioning whether governments should adopt bright-line rules that forbid particular applications of AI completely.129 We suggest that when evaluating these issues, policymakers must also consider that a side effect of AI's adoption could be a world with substantially diminished data privacy for all of us, unless we specifically take measures to protect it. The suggestions we make below are motivated by this question: What will it take for both data privacy and AI to coexist?

As we note in the introduction, a significant assumption in the framing of our questions is that, especially in the United States but also in the EU (and many countries around the world), the present state of data privacy is suboptimal. Individual data rights are both necessary and insufficient for protecting data privacy in a world with AI. Even in countries and states with data rights in place, the burden continues to be on individuals to exercise their rights after data collection, rather than for their preferences to be respected at or before the initiation of any collection. This approach, as we have argued, also neglects societal risks and threatens our collective data privacy.

While data privacy is the focus of this paper, it isn't the only lens or priority when considering how to regulate AI. For example, the extent to which others are able to gather data about you and potentially make inferences about you has direct implications for issues of bias and discrimination, whether those others are private companies or the government. The reality that non-personal information, as well as others' personal information, can also be used to make inferences about both individuals and groups is yet another reason policymaking on data privacy must move beyond individual control to set clear rules on data collection and use more broadly.

Another key aspect of this debate is that possessing data gives rise to considerable market power. In the AI land grab presently underway
312、,the actors who already possess large datasets have a significant advantage over developers that do not have stores of data and must gather,purchase,or license it.130 Additionally,What will it take for both data privacy and AI to coexist?White PaperRethinking Privacy in the AI Era32one must also hav
313、e the resources to pay for curating and labeling data,to transform it into a quality resource for training AI systems.While advocating for data quality is an important issue in this debate,as higher quality data can help address issues of bias and discrimination,we must acknowledge that it can also
314、be a source of power and advantage as large enterprises can expend the resources to improve their data or license quality data from others.We are also pushing back against a strain of technological determinism in these debates.Much like the arguments that privacy is dead and we should all acquiesce
to a total loss of control over our data in exchange for the bounty of free online services, many of the discussions around AI today, and generative AI specifically, assume that there are no limits on what data should be included in AI models, particularly foundation models. When the scrapable data on the entire public and publicly accessible internet appears up for grabs, we can be forgiven for assuming that this path is inevitable. This line of thinking appears driven in particular by computer scientists and others in AI development who are focused on the lure of quantity over quality and do not consider the sociotechnical context in which data resides. As we point out in Chapter 3, some researchers are already questioning whether bigger will always be better, even with regard to foundation models, given the trade-offs between capabilities and output quality. An LLM that can make
 inferences and reason in a human-like way is only useful if the model produces accurate and reliable outputs. Otherwise, the technology may be nothing more than a “stochastic parrot,” mimicking human language without connection to meaning.131

Finally, it's important to note that these concerns span both commercial and governmental contexts. The primary focus of this paper has been commercial data collection. But governments can (and do) purchase data from the private sector and direct governmental deployment of AI systems that are trained on or that process personal information, which raises concerning questions about the potential for surveillance and the impact on civil liberties.132 To date, several of the public sector uses of AI in the United States that have garnered concern have involved predictive tools for criminal sentencing133 or the assignment of public benefits134 that have perpetuated biases or raised questions about fairness outcomes. Governments building AI systems using administrative data, for example, pose risks that are out of the scope of this paper to explore in depth. But one cannot regulate the commercial sector's data practices and turn a blind eye to how governments may adopt and use this technology, including when it is procured from the private sector, which, in turn, implicates the sources of the data used to train such systems. In other words, the line between the training data used for private and public uses of AI can easily become blurred. Neither usage exists in a vacuum, which points to the need for data provenance, as well as downstream data privacy impacts, to be centered in these debates.135

With these concerns in mind, we offer the following three suggestions that we believe will aid in mitigating the risks to data privacy posed by the adoption of AI. To corrupt a famous quote: “It's the data, stupid.”136 Any problem-solving about the impacts of AI on data privacy must look beyond individual data rights to include strategies for the governance and management of data as a resource in a privacy-respecting and preserving manner, as well as a focus on societal impacts and human rights.

Suggestion 1: Denormalize data collection by default

Shift away from opt-out to opt-in data collection by facilitating true data minimization and adopting technological standards to support it.

As discussed earlier, the
 FIPs provide a crucial framework of rights for our collected data. But the principles of data minimization, purpose limitation, and even consent have been operationalized in ways that normalize data collection by default in many contexts. This normalization can be traced in part to the FIPs' original focus on providing due process rights for government-based data collection rather than for the commercial sector. The FIPs do not include a right to refuse data collection, for example. There is also an assumption of exclusivity and intentionality: that an individual has a one-to-one, known relationship with a data collector with whom one intends (or is required) to interact. The architects of the FIPs did not anticipate the ubiquitous, always-on digital surveillance and data collection enabled by digital networks and mobile devices that emerged in the 2000s. Nor did they foresee that our data would be collected by third parties that have no direct relationship with us. These assumptions have led to practices that prioritize the frictionless operation of the market over adherence to principles.

The one major example of an experiment with both adding friction and surfacing the principle of consent into
 data collection has not gone well: browser cookie consent dialogs. Cookie consents are a prime example of consent fatigue and of how not to denormalize data collection. European regulators put consent for data collection by websites front and center with the adoption of the EU's 2002 ePrivacy Directive, the key regulation governing browser cookie consents.137 This approach quickly backfired: requiring individuals to accept or reject website cookies with every visit inserted too much friction, causing annoyance and confusion for the public. Browser cookies are not only the mechanism that allows websites to identify their visitors; they also allow data collectors to engage in cross-site tracking and profiling. Even today, many internet users struggle to understand what cookies are and how their collection may undermine their privacy. The implementation of the ePrivacy Directive demonstrated the problem with requiring individuals to manage consent on a continual, site-by-site basis in a manner that treats a wide spectrum of possible risks with the same level of notice and choice: the approach does not scale in a world where consumers have relationships with many different online providers. Recent changes sparked by GDPR consent requirements have improved the format of consent notices (e.g., consumers