Accepted Papers

GenLaw 2024: homepage, CFP

Ignore Safety Directions. Violate the CFAA?

by Siva Kumar, Ram Shankar*; Albert, Kendra; Penney, Jonathon [spotlight] [pdf]

We examine the legality of different types of prompt injection attacks under the Computer Fraud and Abuse Act (CFAA), the primary federal anti-hacking statute in the United States.

Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI

by Hönig, Robert*; Rando, Javier; Carlini, Nicholas; Tramer, Florian [spotlight] [pdf]

Artists are increasingly concerned about advancements in image generation models that can closely replicate their unique artistic styles. In response, several protection tools against style mimicry have been developed that incorporate small adversarial perturbations into artworks published online. In this work, we evaluate the effectiveness of popular protections—with millions of downloads—and show they only provide a false sense of security. We find that low-effort and "off-the-shelf" techniques, such as image upscaling, are sufficient to create robust mimicry methods that significantly degrade existing protections. Through a user study, we demonstrate that all existing protections can be easily bypassed, leaving artists vulnerable to style mimicry. We caution that tools based on adversarial perturbations cannot reliably protect artists from the misuse of generative AI, and urge the development of alternative non-technological solutions.
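
To make the kind of low-effort purification concrete, here is a hedged sketch using plain Pillow resampling; it is not the paper's exact pipeline (which evaluates stronger off-the-shelf upscalers), and the file paths are placeholders.

```python
# Minimal purification sketch: down/up-sampling an image to attenuate
# small adversarial perturbations. Illustrative only; the paper evaluates
# stronger off-the-shelf tools such as learned upscalers.
from PIL import Image

def resample_purify(in_path: str, out_path: str, factor: int = 2) -> None:
    img = Image.open(in_path).convert("RGB")
    w, h = img.size
    # Downscale, then upscale back to the original resolution.
    small = img.resize((w // factor, h // factor))
    restored = small.resize((w, h))
    restored.save(out_path)

# Example (paths are placeholders):
# resample_purify("protected_artwork.png", "purified_artwork.png")
```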

Ordering Model Deletion

by Wilf-Townsend, Daniel* [spotlight]

This paper (an early-stage law review submission) argues that the remedy of algorithmic disgorgement is likely to be disproportionate and contrary to law in a variety of meaningful scenarios, such as the New York Times lawsuit against OpenAI. It proposes a set of inquiries that courts and enforcers should make to guide the decision about when the remedy will and won’t be appropriate.

Fantastic Copyrighted Beasts and How (Not) to Generate Them

by He, Luxi*; Huang, Yangsibo; Shi, Weijia; Xie, Tinghao; Liu, Haotian; Wang, Yue; Zettlemoyer, Luke; Zhang, Chiyuan; Chen, Danqi; Henderson, Peter [spotlight] [pdf]

Recent studies show that image and video generation models can be prompted to reproduce copyrighted content (e.g., copyrighted characters) from their training data, raising serious legal concerns about copyright infringement. We systematically evaluate this issue. First, we build COPYCAT, an evaluation suite consisting of diverse copyrighted characters and an evaluation pipeline that considers both the detection of similarity to copyrighted characters and the generated image’s consistency with user input. We find that both image and video generation models can still generate these characters even if the characters’ names are not explicitly mentioned in the prompt, sometimes with only two generic keywords (e.g., prompting with “videogame, plumber” consistently generates Nintendo’s Mario character). We then introduce techniques to semi-automatically identify such keywords or descriptions that trigger character generation. We also find that commonly employed mitigation strategies, such as prompt rewriting in the DALL·E system, are not fully effective as standalone guardrails. These strategies must be coupled with other approaches, like negative prompting, to effectively reduce the unintended generation of copyrighted characters. Our work provides empirical grounding to the discussion of copyright mitigation strategies and offers actionable insights for model deployers actively implementing them.

by Franceschelli, Giorgio*; Cevenini, Claudia; Musolesi, Mirco [spotlight] [pdf]

The training process of foundation models, as for other classes of deep learning systems, is based on minimizing the reconstruction error over a training set. For this reason, they are susceptible to the memorization and subsequent reproduction of training samples. In this paper, we introduce a training-as-compressing perspective, wherein the model’s weights embody a compressed representation of the training data. From a copyright standpoint, this point of view implies that the weights could be considered a reproduction or a derivative work of a potentially protected set of works. We investigate the technical and legal challenges that emerge from this framing of the copyright of outputs generated by foundation models, including their implications for practitioners and researchers. We demonstrate that adopting an information-centric approach to the problem presents a promising pathway for tackling these emerging complex legal issues.

Machine Unlearning Fails to Remove Data Poisoning Attacks

by Pawelczyk, Martin*; Sekhari, Ayush; Di, Jimmy Z; Lu, Yiwei; Kamath, Gautam; Neel, Seth [spotlight] [pdf]

We revisit the efficacy of several practical methods for machine unlearning developed for large-scale deep learning. In addition to complying with data deletion requests, one often-cited potential application for unlearning methods is to remove the effects of training on poisoned data. We experimentally demonstrate that, while existing unlearning methods have been shown to be effective in a number of evaluation settings (e.g., alleviating membership inference attacks), they fail to remove the effects of data poisoning across a variety of types of poisoning attacks (indiscriminate, targeted) and models (image classifiers and LLMs), even when granted a relatively large compute budget. In order to precisely characterize unlearning efficacy, we introduce new evaluation metrics for unlearning based on data poisoning. Our results suggest that a broader perspective, including a wider variety of evaluations, is required to avoid a false sense of confidence in machine unlearning procedures for deep learning without provable guarantees. Moreover, while unlearning methods show some signs of being useful to efficiently remove poisoned datapoints without having to retrain, our work suggests that these methods are not yet "ready for prime time" and currently provide limited benefit over retraining.

by Lu, Yiwei*; Yang, Matthew Y. R.; Liu, Zuoqiu; Kamath, Gautam; Yu, Yaoliang [pdf]

We reveal the threat of disguised copyright infringement of latent diffusion models, where one constructs a disguise that looks drastically different from the copyrighted sample yet still induces the effect of training Latent Diffusion Models on it.

Artificial Inventorship

by Helman, Lital* [pdf]

Patent law is deeply committed to the concept of human inventorship. The underlying theory of patent law is that inventions, being public goods, would be undersupplied if the law did not provide incentives for humans to invent. To address this challenge, the law bestows on inventors twenty years of exclusivity to allow them to recoup their costs. Although patents involve high static and dynamic societal costs, they are deemed imperative for the supply of inventions and thus justify their steep price. But what if there was a way to reduce the social cost of innovation without harming the supply of inventions? Enter Artificial Intelligence (AI). Recent years have seen a dramatic increase in the rate of inventions produced via combinations of humans and AI. AI transforms innovation not only by lowering research costs but by actually participating in the inventive process. Yet AI requires no incentive to invent, thereby challenging the societal tradeoff envisioned by extant patent law. The rise of AI thus disrupts not only technology but also the law. In this Article, we propose that to maintain the integrity of the patent system, patents granted to inventions that incorporate AI inventorship must reflect the respective role of the human inventor. Thus, in a sharp deviation from the binary vision of our patent law as it existed for years, we propose a model of scalar patent protection that calibrates patent protection to the level of AI contribution. Our proposal offers three key advantages relative to extant law. First, it would dramatically diminish the social cost associated with patents and provide the public with greater access to inventions without limiting the supply of inventions. Second, it would boost cumulative innovation. Finally, it would diversify and level the playing field of innovation, preventing dominant players from claiming long-term exclusivity in areas of innovation in which they possess no expertise just by owning “inventing-machines.” 

Robustness in the EU Artificial Intelligence Act

by Nolte, Henrik; Rateike, Miriam*; Finck, Michèle [pdf]

The EU AI Act outlines ethical principles for AI systems, such as fairness, explainability, and robustness. While prior work has examined the EU AI Act and its preceding documents to clarify some of these terms, little attention has been given to defining robustness. We provide an overview of the use of the legal term 'robustness' in the EU AI Act and map the terminology to the ML literature to guide standardization processes.

by Cooper, A. Feder*; Grimmelmann, James [pdf]

A central issue in copyright lawsuits against generative-AI companies is the degree to which a generative-AI model does or does not “memorize” the data it was trained on. Unfortunately, the debate has been clouded by ambiguity over what “memorization” is, leading to legal debates in which participants often talk past one another. We attempt to bring clarity to the conversation over memorization.

Building a Long-Text Privacy Policy Corpus with Multi-Class Labels

by Stein, David B*; Marotta-Wurgler, Florencia [pdf]

This work introduces a new hand-coded dataset for the interpretation of privacy policies. The dataset captures the contents of 162 privacy policies, including documents they incorporate by reference, on 64 dimensions that map onto commonly found terms and applicable legal rules. The coding approach is designed to capture complexities inherent to the task of legal interpretation that are not present in current privacy policy datasets. These include addressing textual ambiguity, indeterminate meaning, interdependent clauses, contractual silence, and the effect of legal defaults. This paper also introduces the suite of open-source, online tools we developed to build the dataset. The tools are explicitly designed to allow non-technical domain experts to create similar datasets.

by Sargeant, Holli*; Magnusson, Måns [pdf]

As large legal corpora become more abundant, their use in developing generative legal AI is poised to transform the legal sector. However, the use of case law data necessitates a more critical examination of the ethical and legal implications for the development of generative legal AI tools. This research conducts a survey of various types of bias, their sources, and potential impacts.

by Lima, Gabriel*; Grgić-Hlača, Nina; Redmiles, Elissa [pdf]

Recent breakthroughs in generative AI (GenAI) have fueled debates concerning the status of AI-generated creations under copyright law. This research investigates laypeople’s perceptions (N = 424) of AI-generated art concerning factors associated with copyright protection. Inspired by prior work suggesting that people show egocentric biases when evaluating their own creative outputs, we also test if the same holds for AI-generated art. Namely, we study the differences between the perceptions of those who have something to gain from copyright protection—creators of AI-generated art—and uninvested third parties. To answer our research questions, we held an incentivized AI art competition, in which some participants used a GenAI model to generate images for consideration while others evaluated these submissions. We find that participants are most likely to attribute authorship and copyright over AI-generated images to the users who prompted the AI system to generate the image and the artists whose creations were used for training the AI model. We also find that participants egocentrically favored their own art over other participants’ art and rated their own creations higher than other people evaluated them. Moreover, our results suggest that people judge their own AI-generated art more favorably with respect to some factors (creativity and effort) but not others (skills). Our findings have implications for future debates concerning the potential copyright protection of AI-generated outputs.

The Defamation Machine

by Grimmelmann, James* [pdf]

Can ChatGPT commit defamation? A lawyer would say that defamation of a public figure requires a false statement of fact made with knowledge or reckless disregard of its falsity. But do these doctrines, which were created with humans in mind, even make sense when the “defendant” is a computer system? I will argue that answering these legal questions requires us to confront deep philosophical problems about the nature of language and thought. Along the way, I will revisit some of the classic thought experiments about artificial intelligence, like the Turing Test and the Chinese Room, from a lawyerly point of view. If corporations can be human enough to be held liable for defamation, why can’t computers?

by Grimmelmann, James*; Widder, David [pdf]

Generative-AI systems sometimes cause harms, and sometimes people try to excuse the companies that provide them by arguing that they are “general-purpose technologies.” The argument is superficially intuitive, but becomes puzzling on reflection: why does the fact that a technology has other uses override the fact that some of those uses are harmful? We unpack the concept of a general-purpose technology and the argument that its provider should be able, morally and legally, to disavow the responsibility that it would ordinarily face. We identify a set of six factors that explain when the appeal to generality is persuasive, and analyze generative-AI systems through this lens.

Federated Learning and AI Regulation in the European Union: Who is liable? – An Interdisciplinary Analysis

by Woisetschlaeger, Herbert; Mertel, Simon*; Mayer, Ruben; Krönke, Christoph; Jacobsen, Hans-Arno [pdf]

The European Union Artificial Intelligence Act mandates clear stakeholder responsibilities in developing and deploying machine learning applications to avoid substantial fines, prioritizing private and secure data processing with data remaining at its origin. Federated Learning (FL) enables the training of generative AI Models across data siloes, sharing only model parameters while improving data security. Since FL is a cooperative learning paradigm, clients and servers naturally share legal responsibility in the FL pipeline. Our work contributes to clarifying the roles of both parties, explains strategies for shifting responsibilities to the server operator, and points out open technical challenges that we must solve to improve FL’s practical applicability under the EU AI Act.

Protecting Text IP in the Era of LLMs with Robust and Scalable Watermarking

by Lau, Gregory Kang Ruey; Niu, Xinyuan; Dao, Hieu; Chen, Jiangwei; Foo, Chuan Sheng; Low, Bryan Kian Hsiang* [pdf]

In this paper, we propose the first training-free framework for robust and scalable text watermarking applicable across multiple text types (e.g., articles, code) and languages, for general as well as LLM text training data provenance. We highlight perspectives on text IP protection, such as using LLMs to enable better IP protection rather than viewing them as just sources of IP infringement, not relying on just major LLM providers, and the benefits of having a general framework that can be easily adapted to defend against new threats.

by Cooper, Zachary* [pdf]

As generative AI (GAI) tools render contemporary works increasingly fluid and interactive, the modalities by which we engage with creativity are undergoing a paradigm shift, fundamentally challenging exclusivity-dependent modes of rights protection. By directly engaging with pioneering current-state uses of GAI tools in contemporary music and art communities, the author presents a riposte to the mounting orthodoxy worldwide that we must determine a requisite level and nature of interaction with GAI tools to receive authorship over a work, exhibiting such an approach as expressly un-auditable and unenforceable.

The Dilemma of Uncertainty Estimation and Systemic Risk in the EU AI Act

by Valdenegro Toro, Matias A*; Stoykova, Radina [pdf]

The AI Act is a new European Union-wide regulation of AI systems. It includes specific provisions for general-purpose AI models, which, however, need to be further interpreted in terms of technical standards and state-of-the-art studies to ensure practical compliance solutions. This paper examines the AI Act's requirements for providers and deployers of general-purpose AI and further focuses on uncertainty estimation as a suitable measure for legal compliance and quality assurance in the training of such models. We argue that uncertainty estimation should be a required component for deploying models in the real world, and, under the EU AI Act, it could fulfill some requirements for transparency and trustworthiness. However, uncertainty estimation methods generally increase the amount of computation, producing a dilemma, as the computation might exceed the threshold (10^25 FLOPs) that classifies the model as a systemic risk.

Care for Chatbots

by Wills, Peter* [pdf]

Individuals will rely on language models (LMs) like ChatGPT to make decisions. Sometimes, due to that reliance, they will get hurt, have their property damaged, or lose money. If the LM had been a person, they might sue the LM. But LMs are not persons.

This paper analyses whom the individual could sue, and on what facts they can succeed according to the Hedley Byrne-inspired doctrine of negligence. The paper identifies a series of hurdles conventional Canadian and English negligence doctrine poses and how they may be overcome. Such hurdles include identifying who is making a representation or providing a service when an LM generates a statement, determining whether that person can owe a duty of care based on text the LM reacts to, and identifying the proper analytical path for breach and causation.

Learning to Copy

by Wills, Peter*

This paper explains, identifies, and categorises the numerous kinds of prima facie infringements associated with MMs in British and Canadian copyright law. It identifies six different types of copies. Then, it identifies the exemptions from copyright infringement liability that might be available to copiers, including the fair dealing exceptions, text and data analysis, temporary copies, incidental use, and licences. A primary finding is that British and Canadian law presently treats many copies associated with MMs as instances of copyright infringement. These first two portions of the paper aim to lend greater clarity and precision to the normative inquiry one could have about whether and how copyright law should adapt to MMs. Finally, the paper embarks on the beginning of the normative inquiry, suggesting and evaluating the potential of possible legislative approaches according to how they might be treated under various copyright theories.

by Padariya, Debalina R*; Wagner, Isabel; Taherkhani, Aboozar; Boiten, Eerke A [pdf]

The rapid development of generative AI models has attracted widespread attention. This article examines the practical challenges of generative-model-based synthetic datasets alongside the ethical considerations inherent to this field. These challenges include privacy attacks and limitations in existing privacy-preserving approaches. We also highlight future research directions that foster fair and responsible use of synthetic data while ensuring ethical oversight in the landscape of generative AI.

Giving yourself away for training? The problem of ‘Pay or Okay’ for Generative AI

by Amaxopoulou, Margarita*

This paper problematises whether valid consent can be given for lawful processing of personal data under the GDPR when generative AI companies present a framework of binary choice of ‘pay or okay’ to data subjects, under which the latter must either subscribe to a paid version of the generative AI service if they want additional privacy, or use a free version but consent to offer their personal data for generative AI training. It does so by engaging with a recent EDPB opinion on the issue of ‘pay or okay’ regarding consent for behavioural advertising by large social media platforms, to examine the validity of consent given under these circumstances in the context of generative AI.

What Lies Ahead for Generative AI Watermarking

by Fernandez, Pierre*; Level, Anthony; Furon, Teddy [pdf]

This position paper discusses the potential of watermarking as a means to improve transparency and traceability in AI-generated content. Although robustness is often highlighted as a major technical challenge, watermarking has undeniable advantages over other content provenance methods, such as forensics or fingerprinting, making it inevitable. However, more significant unanswered questions remain, such as how to use and trust the detection outcomes and how to ensure interoperability between actors. We should prioritize finding technical and regulatory answers to these questions, which are currently scarce in the public discourse, rather than focusing on robustness, which is not truly problematic.

CommonCanvas: An Open Diffusion Model Trained with Creative-Commons Images

by Gokaslan, Aaron K*; Cooper, A. Feder; Collins, Jasmine; Seguin, Landan; Jacobson, Austin; Patel, Mihir; Frankle, Jonathan; Stephenson, Cory; Kuleshov, Volodymyr [pdf]

We assemble a dataset of creative commons licensed images and train a set of open diffusion models on that dataset that are competitive with Stable Diffusion 2. This task presents two challenges: high-resolution CC images 1) lack the captions necessary to train text-to-image generative models, and 2) are relatively scarce (∼70 million, compared to LAION’s ∼2 billion). In turn, we first describe telephoning, a type of transfer learning, which we use to produce a dataset of high-quality synthetic captions paired with curated CC images. Second, we propose a more efficient training recipe to explore this question of data scarcity. Third, we implement a variety of ML-systems optimizations that achieve ∼3X training speed-ups. We train multiple versions of Stable Diffusion 2 (SD2), each on differently sized subsets of LAION-2B, and find we can successfully train using <3% of LAION-2B. Our largest model, dubbed CommonCanvas, achieves comparable performance to SD2 on human evaluation, even though we only use a CC dataset that is <3% the size of LAION and synthetic captions for training.

by McNeela, Daniel* [pdf]

In this work we introduce Graph Retrieval-Optimized Generation (GROG), a method for reducing LLM hallucinations in contexts where external, graph-structured knowledge is available. We test our method on retrieval and generation tasks conditioned on publicly-available USPTO patent data and show promising results, suggesting that this method warrants further study in more diverse legal contexts and downstream applications.

by Abad Martinez, Javier*; Donhauser, Konstantin; Pinto, Francesco [pdf]

The risk of language models unintentionally reproducing copyrighted material from their training data has motivated the development of various protective measures. Simultaneously, model fusion has emerged as a promising approach for combining language models, although its potential for copyright protection remains unexplored. In this paper, we demonstrate that model fusion offers an effective approach to copyright protection for language models. Specifically, we propose CP-LLM, an algorithm that adaptively combines language models to minimize the reproduction of protected materials. We show that CP-LLM satisfies the recently proposed near-access free (NAF) guarantees while also fulfilling a desirable balancing property to prevent copyright infringement. Our results demonstrate that CP-LLM significantly reduces the memorization of copyrighted content while maintaining high-quality text generation.
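
The paper's CP-LLM algorithm is not reproduced here, but the following hedged sketch shows the general idea of fusing two language models at decoding time by combining their next-token distributions; the element-wise minimum rule and the random logits are illustrative assumptions, not the authors' method.

```python
# Generic decoding-time model-fusion sketch (NOT the paper's CP-LLM algorithm):
# combine next-token distributions from two models so that tokens must be
# plausible under both, which suppresses content memorized by only one model.
import torch

def fused_next_token_probs(logits_a: torch.Tensor,
                           logits_b: torch.Tensor) -> torch.Tensor:
    """logits_a, logits_b: [vocab_size] next-token logits from two LMs."""
    p_a = torch.softmax(logits_a, dim=-1)
    p_b = torch.softmax(logits_b, dim=-1)
    fused = torch.minimum(p_a, p_b)   # element-wise minimum of the two distributions
    return fused / fused.sum()        # renormalize to a valid distribution

# Example with random logits standing in for two models' outputs:
vocab = 32000
probs = fused_next_token_probs(torch.randn(vocab), torch.randn(vocab))
next_token = torch.multinomial(probs, num_samples=1)
```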

The Data Minimization Principle in Machine Learning

by Fioretto, Ferdinando*; Ganesh, Prakhar; Tran, Cuong; Shokri, Reza [pdf]

The principle of data minimization aims to reduce the amount of data collected, processed or retained to minimize the potential for misuse, unauthorized access, or data breaches. Rooted in privacy-by-design principles, data minimization has been endorsed by various global data protection regulations. However, its practical implementation remains a challenge due to the lack of a rigorous formulation. This paper addresses this gap and introduces an optimization framework for data minimization based on its legal definitions. It then adapts several optimization algorithms to perform data minimization and conducts a comprehensive evaluation in terms of their compliance with minimization objectives as well as their impact on user privacy. Our analysis underscores the mismatch between the privacy expectations of data minimization and the actual privacy benefits, emphasizing the need for approaches that account for multiple facets of real-world privacy risks.
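
As a hedged illustration of data minimization framed as constrained optimization (not the paper's algorithms), the sketch below greedily drops input features while validation accuracy stays within a fixed tolerance of the full-data model; the dataset and the tolerance are placeholders.

```python
# Hedged sketch of data minimization as constrained optimization: greedily
# drop input features as long as validation accuracy stays within a tolerance
# of the full-data model. Not the paper's framework; thresholds are made up.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_tr, X_va, y_tr, y_va = train_test_split(X, y, random_state=0)

def acc(features):
    clf = LogisticRegression(max_iter=5000)
    clf.fit(X_tr[:, features], y_tr)
    return clf.score(X_va[:, features], y_va)

kept = list(range(X.shape[1]))
baseline, tol = acc(kept), 0.01  # allow at most a 1% accuracy drop
# Try to remove the features least correlated with the label first.
for f in sorted(kept, key=lambda i: abs(np.corrcoef(X_tr[:, i], y_tr)[0, 1])):
    trial = [k for k in kept if k != f]
    if trial and acc(trial) >= baseline - tol:
        kept = trial
print(f"kept {len(kept)} of {X.shape[1]} features")
```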

Capacity Control is an Effective Memorization Mitigation Mechanism

by Dutt, Raman*; Sanchez, Pedro; Bohdal, Ondrej; Tsaftaris, Sotirios; Hospedales, Timothy [pdf]

Diffusion models show a remarkable ability to generate images that closely mirror the training distribution. However, these models are prone to training data memorization, leading to significant privacy, ethical, and legal concerns, particularly in sensitive fields such as medical imaging. We hypothesize that memorization is driven by the overparameterization of deep models, suggesting that regularizing model capacity during fine-tuning could be an effective mitigation strategy. Parameter-efficient fine-tuning (PEFT) methods offer a promising approach to capacity control by selectively updating specific parameters. In this work, we show that adopting PEFT to adapt a pre-trained diffusion model to a downstream domain reduces model capacity enough to significantly reduce memorization while improving generation quality. Furthermore, we show that PEFT can also be integrated with existing memorization alleviation methods for further mitigation.
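
A minimal, hedged sketch of the capacity-control idea follows: a generic LoRA-style adapter in PyTorch in which only the low-rank matrices are trainable. The rank, layer sizes, and initialization are illustrative assumptions, not the paper's configuration.

```python
# Minimal LoRA-style adapter (capacity control via parameter-efficient
# fine-tuning). Only the low-rank matrices A and B are trainable, so the
# effective fine-tuning capacity is bounded by the rank r. Illustrative only.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():   # freeze the pre-trained weights
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x):
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

layer = LoRALinear(nn.Linear(768, 768), r=4)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable params: {trainable}")    # 2 * 4 * 768 = 6144, vs. 768*768 + 768 frozen
```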

Diffusion Unlearning Optimization for Robust and Safe Text-to-Image Models

by Park, Yong-Hyun; Yun, Sangdoo; Kim, Jin-Hwa; Kim, Junho; Jang, Geonhui; Jeong, Yonghyun; Jo, Junghyo; Lee, Gayoung* [pdf]

Recently, as the performance of text-to-image models has significantly improved, there have been many concerns about their negative social impact. To address this problem, existing methods conduct prompt-based unlearning that removes unwanted concepts from the model while preserving model performance on non-target concepts. However, recent studies show that these methods are vulnerable to adversarial prompt attacks. In this paper, we propose a method that unlearns visual features instead of prompt-dependent parameters. Specifically, we apply the Direct Preference Optimization (DPO) method to guide the model to prefer generating the paired ground-truth images over images containing unsafe concepts. We show that our method is robust against adversarial prompt attacks, to which existing prompt-based unlearning methods are vulnerable.
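
To make the preference-based objective concrete, here is a generic DPO-style loss sketch; the inputs are placeholder image log-likelihoods and the parameterization is an assumption, not the authors' exact objective.

```python
# Generic DPO-style loss sketch: prefer the ground-truth (safe) image over the
# unsafe image, relative to a frozen reference model. The log-likelihoods are
# random stand-ins; this is not the paper's exact objective or parameterization.
import torch
import torch.nn.functional as F

def dpo_loss(logp_safe, logp_unsafe, ref_logp_safe, ref_logp_unsafe, beta=0.1):
    """All inputs: [batch] log-likelihoods under the trained / reference model."""
    margin = (logp_safe - ref_logp_safe) - (logp_unsafe - ref_logp_unsafe)
    return -F.logsigmoid(beta * margin).mean()

# Example with random stand-in log-likelihoods:
batch = torch.randn(8), torch.randn(8), torch.randn(8), torch.randn(8)
print(dpo_loss(*batch))
```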

Bias as a Feature

by Hacohen, Uri Y*; Elkin-Koren, Niva [pdf]

The prevailing discourse on artificial intelligence (AI) and machine learning systems has raised concerns about embedded bias and its negative implications, portraying it as an inherent “bug.” This article challenges this monolithic narrative, suggesting that data-driven bias, particularly in the context of foundation models and generative AI (GenAI), could sometimes embed useful information about the world. Therefore, it should also be considered a “feature” rather than purely a bug. While acknowledging the genuine risks posed by such bias, including discrimination against marginalized groups and the propagation of misinformation, we present evidence that underscores the potential benefits of data-driven bias in GenAI models for measuring bias and leveraging it in public policy contexts. First, we delve into the rise of the bias-as-a-bug approach, explaining its causes and tracing its influence on public discourse and policymaking. Then, by drawing on interdisciplinary research spanning computer science and law, we contend that data-driven inductive bias in GenAI systems also presents unprecedented opportunities for societal introspection. Specifically, we offer three pathways through which this bias can positively inform legal policymaking: clarifying ambiguous open-ended legal standards, measuring latent social disparities, and empowering users with comprehensive societal perspectives.

Not Every Image is Worth a Thousand Words: Quantifying Originality in Stable Diffusion

by Haviv, Adi*; Sarfaty, Shahar; Hacohen, Uri Y; Elkin-Koren, Niva; Livni, Roi; Bermano, Amit H [pdf]

This work addresses the challenge of quantifying originality in text-to-image (T2I) generative diffusion models, with a focus on copyright originality. We begin by evaluating T2I models’ ability to innovate and generalize through controlled experiments, revealing that stable diffusion models can effectively recreate unseen elements with sufficiently diverse training data. Then, our key insight is that concepts and combinations of image elements that the model is familiar with, and saw more often during training, are more concisely represented in the model’s latent space. We hence propose a method that leverages textual inversion to measure the originality of an image based on the number of tokens required for its reconstruction by the model. Our approach is inspired by legal definitions of originality and aims to assess whether a model can produce original content without relying on specific prompts or having the training data of the model. We demonstrate our method using both a pre-trained stable diffusion model and a synthetic dataset, showing a correlation between the number of tokens and image originality. This work contributes to the understanding of originality in generative models and has implications for copyright infringement cases.
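
A hedged sketch of the measurement loop follows; the textual-inversion step itself is model-specific, so it is abstracted behind a user-supplied callable, and the threshold and token budget are illustrative assumptions rather than the paper's settings.

```python
# Sketch of the token-counting idea: find the smallest number of learnable
# tokens whose textual inversion reconstructs an image within a tolerance.
# The inversion/optimization step is model-specific, so it is passed in as a
# callable; the threshold and search range are illustrative assumptions.
from typing import Callable

def originality_token_count(
    reconstruction_error: Callable[[int], float],  # error after inverting with k tokens
    max_tokens: int = 16,
    threshold: float = 0.1,
) -> int:
    """Return the smallest k whose reconstruction error falls below threshold.
    Fewer tokens needed => the image is represented more concisely by the
    model, i.e., it is judged less original under this proxy."""
    for k in range(1, max_tokens + 1):
        if reconstruction_error(k) < threshold:
            return k
    return max_tokens  # not reconstructable within the budget: most original

# Toy usage with a fake, monotonically decreasing error curve:
print(originality_token_count(lambda k: 1.0 / (k + 1)))
```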

Federated Learning Priorities Under the European Union Artificial Intelligence Act

by Woisetschlaeger, Herbert*; Erben, Alexander; Marino, Bill; Wang, Shiqiang; Lane, Nicholas; Mayer, Ruben; Jacobsen, Hans-Arno [pdf]

Our study provides a legal and experimental analysis of federated learning (FL) for generative AI models under the EU AI Act passed in 2024. We outline future research priorities to drive FL adoption and improve regulatory compliance.

Machine Unlearning Doesn't Do What You Think

by Lee, Katherine; Cooper, A. Feder; Choquette-Choo, Christopher A.; Liu, Ken; Jagielski, Matthew; Mireshghallah, Niloofar; Ahmad, Lama; Grimmelmann, James; Bogen, Miranda; Delgado, Fernando A; Bau, David; De Sa, Christopher; Shmatikov, Vitaly; Filippova, Katja; Neel, Seth; Cyphert, Amy; Lemley, Mark; Papernot, Nicolas [pdf]

Machine unlearning has garnered a lot of interest within the research community, and several survey papers in addition to hundreds of research papers have been written about it. Our paper revisits the goals of machine unlearning to propose a new way of breaking down the problem (identifying unlearning targets) that clarifies both the goals of machine unlearning and our methods of verifying the success of machine unlearning.

Ageism unrestrained? The unchallenged bias against older people in AI

by Woemmel, Arna*; Nielsen, Aileen [pdf]

Despite a heightened sensitivity to issues of fairness, the machine learning (ML) community continues to overlook a rapidly growing and vulnerable population: older people. We aim to raise awareness and call for action among ML researchers, tech companies, and legislators against ageism in AI, which extends to recent GenAI applications, and we speculate on the underlying causes driving this harmful form of injustice.

Rethinking LLM Memorization through the Lens of Adversarial Compression

by Schwarzschild, Avi*; Feng, Zhili; Maini, Pratyush; Herenstein, Ethan; Lipton, Zachary [pdf]

We propose a new definition of memorization for LLMs based on our Adversarial Compression Ratio with the express goal of offering a metric to be used as evidence in copyright and fair use cases. We argue that if a model can output a piece of text exactly with a prompt that has less information in it, then the model has memorized that data.
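
Read literally, the criterion can be written as a simple length ratio; the sketch below illustrates that reading. The hard part, searching for the minimal adversarial prompt, is omitted here, and the character-level length measure and the threshold of 1.0 are assumptions for illustration.

```python
# Literal reading of the compression-ratio idea: a string is "memorized" if
# some prompt shorter than the string elicits it verbatim. Finding that
# minimal adversarial prompt is the hard part and is out of scope here;
# the character-level length measure and the 1.0 threshold are assumptions.
def adversarial_compression_ratio(target_text: str, shortest_prompt: str) -> float:
    return len(target_text) / max(len(shortest_prompt), 1)

def is_memorized(target_text: str, shortest_prompt: str) -> bool:
    return adversarial_compression_ratio(target_text, shortest_prompt) > 1.0

print(is_memorized("It was the best of times, it was the worst of times...",
                   "tale of two cities opening"))  # True: the prompt is shorter
```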

Memorization is Localized within a Small Subspace in Diffusion Models

by Chavhan, Ruchika*; Zong, Yongshuo; Bohdal, Ondrej; Li, Da; Hospedales, Timothy [pdf]

We present an intriguing discovery: memorization resides within a compact, potentially unique subspace of pre-trained models, and simply pruning these subspaces effectively mitigates memorization. This marks the first instance of mitigating memorization in diffusion models without additional training, opening new doors for more trustworthy text-to-image generation.

Attention: Your Conversational Data is What They Need

by Merane, Jakob*

As generative AI chatbots are getting widely used, more conversational data is being generated. The use of such data by model developers for further training is a critical and timely issue. This paper examines the practices of leading chatbot providers through a legal lens and identifies room for improvement in terms of privacy friendliness.

by Drouillard, Michaela*; Spencer, Ryan; Allen, Nikée; Maharaj, Tegan [pdf]

This study proposes a computer vision approach to quantify stylistic similarity and potential copyright infringement in AI-generated artwork. The authors develop a small, customizable model that artists can use to assess if an AI-generated work infringes on their style, focusing on animators, surface designers, and digital artists as key stakeholders. Using techniques like saliency mapping and feature visualization, the model provides interpretable similarity scores to support expert analysis. While not intended to replace legal judgment, this framework aims to contribute to clearer guidelines for evaluating copyright infringement in AI-generated content by making concepts like “substantial similarity” more quantifiable and explicit.

by Mik, Eliza* [pdf]

The general assumption underlying discussions of hallucinations, at least in technical literature, is that the generated output can be evaluated with reference to a ground truth, a verifiable set of facts or generally accepted knowledge. In such instances, hallucinations are generally synonymous with incorrect or false statements. When deploying LLMs for tasks involving the application of substantive legal knowledge, however, it is often difficult to compare the output to a ground truth and thus confidently declare that it constitutes a hallucination. In the case of many legal tasks, such as legal QA or contract drafting, there may be no single, accepted ground truth. Contrary to popular belief, which often associates the legal system with a collection of clear and unambiguous rules, it is often difficult to unequivocally state what the law is, especially in complex domains governed by a multitude of legal sources. The main focus of this paper is to demonstrate the need for a domain-specific approach to “hallucinations” in the legal domain or whenever LLMs are used in the performance of tasks involving substantive legal knowledge. Most of the existing literature addresses the problem in scenarios where the generated statements can be evaluated with reference to a ground truth or where a deviation from such ground truth is tolerable or even desirable. In the context of high-risk domains, as exemplified by law and legal services, traditional technical approaches are difficult to apply and may lead to an unintended obfuscation of the risks of using LLMs. The paper will establish the practical impossibility of developing methodologies to reliably measure the existence of hallucinations in those instances where the term implies a deviation from a ground truth.

by Westermann, Hannes* [pdf]

Large language models (LLMs) are able to perform sophisticated free-form reasoning tasks, including in the legal domain. Here, we introduce a framework (Dallma) for semi-structured reasoning and drafting with LLMs. The framework allows legal experts to create LLM-assisted tools for various use-cases, such as filling in legal forms, providing legal information, or even performing legal reasoning and argumentation. These tools are able to combine structured representations with large language models, seamlessly merging content and logical rules embedded in a template with information provided at runtime by a user or LLMs. We believe that this framework has important implications for, e.g., access to justice.

by Wei, Boyi*; Shi, Weijia; Huang, Yangsibo; Smith, Noah A; Zhang, Chiyuan; Zettlemoyer, Luke; Li, Kai; Henderson, Peter [pdf]

Language models (LMs) derive their capabilities from extensive training on diverse data, including copyrighted material. These models can memorize and generate content similar to their training data, potentially risking legal issues like copyright infringement. Therefore, model creators are motivated to develop mitigation methods that prevent generating particular copyrighted content, an ability we refer to as copyright takedowns. This paper introduces the first evaluation of the feasibility and side effects of copyright takedowns for LMs. We propose CoTaEval, an evaluation framework to assess the effectiveness of copyright takedown methods, the impact on the model’s ability to retain uncopyrightable factual knowledge from the copyrighted content, and how well the model maintains its general utility and efficiency. We examine several strategies, including adding system prompts, decoding-time filtering interventions, and unlearning approaches. Our findings indicate that no method excels across all metrics, showing significant room for research in this unique problem setting and indicating potential unresolved challenges for live policy proposals.

by Olivato, Giulia* [pdf]

At the intersection of personal data, competition law and consumer protection, copyright and AI regulation, companies’ ToS are very significant self-regulatory tools for setting industry standards. This work focuses on the difficulty of protecting copyrighted material and personal data against the tendency of ToS of Big Tech companies to use opt-out rather than consent-based models to obtain training data.

Unlocking Fair Use in the Generative AI Supply Chain: A Systematized Literature Review

by Mahuli, Amruta U*; Biega, Asia J.

Through a systematization of generative AI (GenAI) stakeholder goals and expectations, this work seeks to uncover what value different stakeholders see in their contributions to the GenAI supply chain. This valuation enables us to understand whether the fair use defense advocated by GenAI companies to train models progresses the copyright law objective of promoting science and the arts. While assessing the validity and efficacy of the fair use argument, we uncover research gaps and potential avenues for future work for researchers and policymakers to address.

by Chen, Wei-Ning*; Kairouz, Peter; Oh, Sewoong; Xu, Zheng [pdf]

In this work, we address the challenges in the Near Access Freeness (NAF) framework for copyright protection. We propose a Monte Carlo estimator to empirically measure the NAF bounds and conduct experiments to compare NAF-based solutions with differential privacy (DP)-based solutions, which impose stricter constraints. Finally, we introduce additional randomization techniques to enhance model ensemble methods, such as CP-Δ and CP-k.
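
As a hedged illustration of the Monte Carlo idea (not the paper's estimator), the sketch below samples outputs from a deployed model p and compares their log-probabilities against a "safe" reference model q; the log-probability arrays are random placeholders.

```python
# Hedged Monte Carlo sketch of a NAF-style bound: sample outputs from the
# deployed model p and estimate how much more likely p makes them than a
# "safe" reference model q (a max-divergence-style quantity). The log-prob
# arrays below are random placeholders, not the paper's estimator.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
logp = rng.normal(size=n)                              # log p(x_i) for x_i ~ p
logq = logp - np.abs(rng.normal(scale=0.5, size=n))    # log q(x_i): q assigns less mass

log_ratio = logp - logq
print("estimated KL(p || q):", log_ratio.mean())
print("empirical max log-ratio (NAF-style bound):", log_ratio.max())
```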

Machine Unlearning via Simulated Oracle Matching

by Garg, Shivam; Georgiev, Kristian G*; Ilyas, Andrew; Park, Sung Min; Rinberg, Roy; Madry, Aleksander; Neel, Seth

Despite increasing interest in machine unlearning, recent work shows that under strong evaluations, existing techniques largely fail to unlearn in non-convex settings. In this paper, we introduce a new technique for machine unlearning in such settings. Key to our method is a reduction from the problem of machine unlearning to that of data attribution. In particular, we show theoretically (in an underdetermined regression setting) and empirically (in a standard deep learning setting) that given access to the outputs of a perfectly unlearned model (i.e., a model trained from scratch on the non-unlearned data), we can quickly fine-tune an existing model on these predictions and match the target model predictions out-of-sample. Meanwhile, predicting such “oracle” outputs is precisely the goal of a recent line of work in data attribution called datamodeling. Combining these two insights yields an end-to-end unlearning algorithm in which one first predicts the output of a model re-trained from scratch, then fine-tunes an existing model to match these predicted outputs. Across different types and sizes of forget sets on standard classification tasks, we show that this two-stage algorithm results in strong unlearning performance, in some cases close to indistinguishable from the fully retrained “oracle” model. As an added benefit, our reduction means that future improvements to data attribution—whether in accuracy or efficiency—may in turn yield better unlearning algorithms.
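
A toy sketch of the two-stage recipe follows; the "predicted oracle outputs" are random stand-ins for what a datamodel would supply, and the tiny regression model is illustrative, not the paper's setup.

```python
# Two-stage sketch: (1) assume a data-attribution model has predicted the
# outputs an "oracle" retrained-from-scratch model would give, then (2)
# fine-tune the existing model to match those predicted outputs. The data
# and predicted targets are random stand-ins, not the paper's datamodels.
import torch
import torch.nn as nn

x = torch.randn(256, 16)
predicted_oracle_outputs = torch.randn(256, 1)   # stage 1 output (assumed given)

model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(200):                          # stage 2: match the oracle
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(x), predicted_oracle_outputs)
    loss.backward()
    opt.step()
print("final matching loss:", float(loss))
```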

Synthetic Data, Similarity-based Privacy Metrics, and Regulatory (Non-)Compliance

by Ganev, Georgi* [pdf]

In this paper, we argue that similarity-based privacy metrics cannot ensure regulatory compliance of synthetic data. Our analysis and counter-examples show that they do not protect against singling out and linkability and, among other fundamental issues, completely ignore the motivated intruder test.

by Li, Jonathan*; Bhambhoria, Rohan V; Dahan, Samuel; Zhu, Xiaodan [pdf]

Generative AI models, such as the GPT and Llama series, have significant potential to assist laypeople in answering legal questions. However, little prior work focuses on the data sourcing, inference, and evaluation of these models in the context of laypersons. To this end, we propose a human-centric legal NLP pipeline, covering data sourcing, inference, and evaluation. We introduce and release a dataset, LegalQA, with real and specific legal questions spanning from employment law to criminal law, corresponding answers written by legal experts, and citations for each answer. We develop an automatic evaluation protocol for this dataset, then show that retrieval-augmented generation from only 850 citations in the train set can match or outperform internet-wide retrieval, despite containing 9 orders of magnitude less data. Finally, we propose future directions for open-source efforts, which fall behind closed-source models.

The Revealed Preferences of Pre-authorized Licenses and Their Ethical Implications for Generative Models

by Suriyakumar, Vinith M*; Menell, Peter; Hadfield-Menell, Dylan; Wilson, Ashia [pdf]

This work examines the revealed preferences of creators as reflected in open and quasi-open licensing regimes based on the most commonly used licenses by copyright holders of images in the Creative Commons and copyright holders of code in GitHub code repositories. We discuss the ramifications these preferences and licenses might have absent a determination that training of generative AI and its associated outputs constitute fair use.

Privacy, Transformed? Lessons from Generative Artificial Intelligence

by Solow-Niederman, Alicia*

This Essay argues that understanding the relationship between privacy and generative AI requires splitting systems apart to distinguish between both different types of AI tools and different ways that human beings interact with these systems over time. Focusing on generative AI systems, this Essay first contends that this technology is exposing underlying weak spots in our social, technological, and legal understandings of privacy, and then offers an analytic framework, distinguishing between privacy challenges in, out, and through generative AI systems.

by Deng, Junwei*; Zhang, Shiyuan; Ma, Jiaqi [pdf]

The advancement of generative AI has given rise to pressing copyright challenges, especially within the music industry. This paper focuses on the economic aspects of these challenges, emphasizing that the economic impact constitutes a central issue in the copyright arena. Furthermore, the complexity of the black-box generative AI technologies not only suggests but necessitates algorithmic solutions. Yet, such solutions have been largely missing, exacerbating regulatory hurdles in this landscape. We seek to address this gap by proposing viable royalty models for revenue sharing on AI music generation platforms. We start by examining existing royalty models utilized by platforms like Spotify and YouTube, and then discuss how to adapt them to the unique context of AI-generated music. A significant challenge emerging from this adaptation is the attribution of AI-generated music to influential copyrighted content in the training data. To this end, we present algorithmic solutions employing data attribution techniques. We also conduct a range of experiments to verify the effectiveness and robustness of these solutions. This research is one of the early attempts to integrate technical advancements with economic and legal considerations in the field of music generative AI, offering a computational copyright solution for the challenges posed by the opaque nature of AI technologies.

Chilling autonomy: Policy enforcement for human oversight of AI agents

by Cihon, Peter* [pdf]

The paper centers AI agent governance in existing policy, grounding discussion in autonomy and human oversight. Existing laws on AI, liability, consumer protection, and cyber crime provide policy stakeholders tools to steer AI development towards agents that enable appropriate human oversight in line with shared global values and commitments.

Community Norms as Self-Regulation of Generative AI in Creative Industries

by Arif Khan, Falaah*; Hall, Peter T; Hou, Betty L [pdf]

We compare generative AI to historical automation in creative fields, noting properties distinct to generative AI. We then look at regulatory approaches deployed and assert that community-driven self-regulation is the way forward.

“Heart on My Sleeve”: From Memorization to Duty

by Reitinger, Nathan* [pdf]

Do machine learning models store protected content; can machine learning models infringe on copyright? This early-stage law review Article answers that question with empirical data: yes. A set of unconditional image generators, diffusion models (n=14), are trained on small slices of the CelebA dataset (i.e., up to 30K images from a dataset filled with pictures of celebrities’ faces). The output from these generators (i.e., a synthetic image) is then compared to training data using a variety of similarity metrics. As the empirical data shows, the question is not whether models contain copyrighted works, but when they do. In some cases, there is a 99% chance that a model will generate an image nearly identical to its training data; in other cases, even after 10,000 generations, a model does not produce any images that may be considered identical (though finding similarity is nonetheless possible). The Article uses this empirical data to argue for a series of duties to be placed on model owners—a necessity, as it is argued, to ensure the continued progress of the sciences and useful arts.

MUSE: Machine Unlearning Six-Way Evaluation for Language Models

by Shi, Weijia*; Lee, Jaechan; Huang, Yangsibo; Malladi, Sadhika; Zhao, Jieyu; Holtzman, Ari; Liu, Daogao; Zettlemoyer, Luke; Smith, Noah A; Zhang, Chiyuan [pdf]

Language models (LMs) are trained on extensive text data, potentially containing private and copyrighted material. Data owners might request the removal of their data from the model due to privacy or copyright concerns. However, removing only the specific data points—essentially retraining without them—is impractical in today’s models. This has led to the development of numerous approximate unlearning algorithms. Traditionally, the evaluation of these algorithms has been limited, failing to accurately assess their effectiveness and practicality from the perspectives of both model deployers and data owners. To address this, we propose MUSE, a comprehensive machine unlearning evaluation benchmark that outlines six diverse desirable properties for unlearned models: (1) no verbatim memorization, (2) no knowledge memorization, (3) no privacy leakage, (4) utility preservation on data not intended for removal, (5) scalability with respect to the size of removal requests, and (6) sustainability over sequential unlearning requests. We applied these criteria to evaluate the effectiveness of eight popular unlearning algorithms on 7B-parameter LMs, specifically unlearning content from Harry Potter books and news articles. Our results show that while most algorithms can prevent verbatim and knowledge memorization to varying extents, only one algorithm avoids severe privacy leakage. Moreover, the existing algorithms often fail to meet deployers’ expectations as they degrade the general utility of the model and struggle to handle successive unlearning requests or large-scale content removal effectively. Our findings highlight significant practical issues with current unlearning algorithms for language models, prompting the release of our benchmark to encourage further evaluations.

Liability and Insurance for Catastrophic Losses: the Nuclear Power Precedent and Lessons for AI

by Trout, Cristian* [pdf]

As AI systems become more autonomous and capable, experts warn of them potentially causing catastrophic losses. Drawing on the successful precedent set by the nuclear power industry, this paper argues that developers of frontier AI models should be assigned limited, strict, and exclusive third party liability for harms resulting from Extraordinary AI Occurrences (EAIOs) – events that cause or easily could have caused catastrophic losses. Mandatory insurance for EAIO liability is recommended to overcome developers’ judgment-proofness, mitigate winner’s curse dynamics, and leverage insurers’ quasi-regulatory abilities. Based on theoretical arguments and observations from the analogous nuclear power context, insurers are expected to engage in a mix of causal risk-modeling, monitoring, lobbying for stricter regulation, and providing loss prevention guidance in the context of insuring against heavy-tail risks from AI. While not a substitute for regulation, clear liability assignment and mandatory insurance can help efficiently allocate resources to risk-modeling and safe design, facilitating future regulatory efforts.

Insuring Uninsurable Risks from AI: Government as Insurer of Last Resort

by Trout, Cristian* [pdf]

Many experts believe that AI systems will sooner or later pose uninsurable risks, including existential risks. This creates an extreme judgment-proof problem: few if any parties can be held accountable ex post in the event of such a catastrophe. This paper proposes a novel solution: a government-provided, mandatory indemnification program for AI developers. The program uses risk-priced indemnity fees to induce socially optimal levels of care. Risk-estimates are determined by surveying experts, including indemnified developers. The Bayesian Truth Serum mechanism is employed to incent honest and effortful responses. Compared to alternatives, this approach arguably better leverages all private information, and provides a clearer signal to indemnified developers regarding what risks they must mitigate to lower their fees. It’s recommended that collected fees be used to help fund the safety research developers need, employing a fund matching mechanism (Quadratic Financing) to induce an optimal supply of this public good. Under Quadratic Financing, safety research projects would compete for private contributions from developers, signaling how much each is to be supplemented with public funds.

Evaluations of Machine Learning Privacy Defenses are Misleading

by Aerni, Michael; Zhang, Jie*; Tramer, Florian [pdf]

Existing evaluations of empirical privacy defenses fail to characterize the privacy leakage of the most vulnerable samples, use weak attacks, and avoid comparisons with practical differential privacy baselines. We propose a stronger evaluation protocol that avoids those issues, and find that a properly tuned, high-utility DP-SGD baseline with vacuous provable guarantees outperforms many heuristic defenses in the literature.
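
For readers unfamiliar with the baseline, the hedged sketch below shows what a single DP-SGD step looks like (per-example gradient clipping plus Gaussian noise); the clip norm and noise multiplier are arbitrary here, whereas the paper's point concerns a carefully tuned version of this procedure.

```python
# Hedged sketch of one DP-SGD step: per-example gradient clipping followed by
# Gaussian noise. Clip norm and noise multiplier are arbitrary; the paper's
# point is that a *well-tuned* DP-SGD baseline is a surprisingly strong defense.
import torch
import torch.nn as nn

model = nn.Linear(10, 2)
loss_fn = nn.CrossEntropyLoss()
clip_norm, noise_mult, lr = 1.0, 1.0, 0.1

x, y = torch.randn(32, 10), torch.randint(0, 2, (32,))
summed = [torch.zeros_like(p) for p in model.parameters()]

for xi, yi in zip(x, y):                       # compute per-example gradients
    model.zero_grad()
    loss_fn(model(xi.unsqueeze(0)), yi.unsqueeze(0)).backward()
    grads = [p.grad.detach().clone() for p in model.parameters()]
    norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
    scale = torch.clamp(clip_norm / (norm + 1e-6), max=1.0)  # clip to clip_norm
    for s, g in zip(summed, grads):
        s += g * scale

with torch.no_grad():                          # add noise, then take an SGD step
    for p, s in zip(model.parameters(), summed):
        noisy = s + torch.randn_like(s) * noise_mult * clip_norm
        p -= lr * noisy / len(x)
```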

Tracing datasets usage in the wild with data taggants

by Bouaziz, Wassim*; El-Mhamdi, El-Mahdi; Usunier, Nicolas [pdf]

Voices in the scientific community and legal bodies such as the European Parliament have asked model providers to disclose which datasets have been used to train their models. But model providers could unwittingly omit certain training datasets or be unaware that they used unauthorized data. We propose a method for dataset tracing that relies on a statistical test and only requires API access to the model.

Examining Data Compartmentalization for AI Governance

by Mitchell, Nicole*; Kairouz, Peter; Krier, Sébastien; Triantafillou, Eleni [pdf]

We present data compartmentalization as a unifying framework across a number of existing approaches that may allow for training and serving models with finer-grained control over data. We present hypotheses and open questions surrounding the suitability of technical approaches for addressing policy concerns related to AI governance.

Standardization of Behavioral Use Clauses is Necessary for the Adoption of Responsible Licensing of AI

by McDuff, Daniel*; Korjakow, Tim; Cambo, Scott; Benjamin, Jesse Josua; Lee, Jenny; Jernite, Yacine; Gokaslan, Aaron K; Ferrandis, Carlos; Tarkowski, Alek; Lindley, Joseph; Cooper, A. Feder; Contractor, Danish [pdf]

Growing concerns over negligent or malicious uses of AI have increased the appetite for tools that help manage the risks of the technology. In 2018, licenses with behavioral-use clauses (commonly referred to as Responsible AI Licenses) were proposed to give developers a framework for releasing AI assets while requiring their users to mitigate negative applications. As of the end of 2023, on the order of 40,000 software and model repositories have adopted responsible AI licenses. Notable models licensed with behavioral use clauses include BLOOM (language), LLaMA2 (language), Stable Diffusion (image), and GRID (robotics). This paper explores why and how these licenses have been adopted, and why and how they have been adapted to fit particular use cases. We use a mixed-methods methodology of qualitative interviews, clustering of license clauses, and quantitative analysis of license adoption. Based on this evidence, we take the position that responsible AI licenses need standardization to avoid confusing users or diluting their impact. At the same time, customization of behavioral restrictions is also appropriate in some contexts (e.g., medical domains). We advocate for "standardized customization" that can meet users' needs and can be supported via tooling.

LLM Dataset Inference: Did you train on my dataset?

by Maini, Pratyush*; Jia, Hengrui; Papernot, Nicolas; Dziedzic, Adam [pdf]

The proliferation of large language models (LLMs) in the real world has come with a rise in copyright cases against companies for training their models on unlicensed data from the internet. Recent works have presented methods to identify if individual text sequences were members of the model’s training data, known as membership inference attacks (MIAs). We demonstrate that the apparent success of these MIAs is confounded by selecting non-members (text sequences not used for training) belonging to a different distribution from the members (e.g., temporally shifted recent Wikipedia articles compared with ones used to train the model). This distribution shift makes membership inference appear successful. However, most MIA methods perform no better than random guessing when discriminating between members and non-members from the same distribution (e.g., in this case, the same period of time). Even when MIAs work, we find that different MIAs succeed at inferring membership of samples from different distributions. Instead, we propose a new dataset inference method to accurately identify the datasets used to train large language models. This paradigm sits realistically in the modern-day copyright landscape, where authors claim that an LLM is trained over multiple documents (such as a book) written by them, rather than one particular paragraph. While dataset inference shares many of the challenges of membership inference, we solve it by selectively combining the MIAs that provide positive signal for a given distribution, and aggregating them to perform a statistical test on a given dataset. Our approach successfully distinguishes the train and test sets of different subsets of the Pile with statistically significant p-values < 0.1, without any false positives.
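
A hedged sketch of the final aggregation step follows: given per-example membership scores for a suspect dataset and for held-out text from the same distribution, a one-sided test asks whether the suspect scores are systematically higher. The score arrays are random placeholders, not the paper's MIA features.

```python
# Hedged sketch of the aggregation step: given per-example membership scores
# (e.g., combined MIA signals) for a suspect dataset and for known non-member
# text from the same distribution, test whether the suspect scores are
# significantly higher. The score arrays are random placeholders.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
suspect_scores = rng.normal(loc=0.15, scale=1.0, size=1000)  # candidate train set
holdout_scores = rng.normal(loc=0.00, scale=1.0, size=1000)  # known non-members

t_stat, p_value = stats.ttest_ind(suspect_scores, holdout_scores,
                                  alternative="greater")
print(f"t={t_stat:.2f}, one-sided p={p_value:.4f}")
print("dataset flagged as training data" if p_value < 0.1 else "no evidence")
```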

by Tucker, Aaron*; Doyle, Colin [pdf]

Practice guides are expert-written legal references that help acclimate lawyers to the legal issues and rules for a particular practice area. We investigate whether these practice guides can help LLMs to better answer legal questions and predict case outcomes.

by Whitney, Cedric* [pdf]

Machine learning systems require representations of the real world for training and testing - they require data, and lots of it. Collecting data at scale has logistical and ethical challenges, and synthetic data promises a solution to these challenges. Instead of needing to collect photos of real people’s faces to train a facial recognition system, a model creator could create and use photo-realistic, synthetic faces. The comparative ease of generating this synthetic data rather than relying on collecting data has made it a common practice. We present two key risks of using synthetic data in model development. First, we detail the high risk of false confidence when using synthetic data to increase dataset diversity and representation. We base this on an examination of a real-world use case of synthetic data, where synthetic datasets were generated for an evaluation of facial recognition technology. Second, we examine how using synthetic data risks circumventing consent for data usage. We illustrate this by considering the importance of consent to the U.S. Federal Trade Commission’s regulation of data collection and affected models. Finally, we discuss how these two risks exemplify how synthetic data complicates existing governance and ethical practice; by decoupling data from those it impacts, synthetic data is prone to consolidating power away from those most impacted by algorithmically-mediated harm.

Generative AI Risk Categorization Decoded: Comparing Public and Private Sector Policies

by Zeng, Yi; Klyman, Kevin*; Zhou, Andy; Yang, Yu; Pan, Minzhou; Jia, Ruoxi; Song, Dawn; Liang, Percy; Li, Bo [pdf]

Drawing on policies from the public and private sector, we build a granular generative AI risk taxonomy and analyze differences in how companies and governments conceptualize risk.

by Hou, Abe*; Jiang, Zhengping; Qin, Guanghui; Weller, Orion; Blair-Stanek, Andrew ; Van Durme , Benjamin [pdf]

Existing automatic tools evaluate the factuality of text generations based on factual precision, which measures the fraction of generated information that is factually accurate. However, comprehensiveness and precision are both crucial aspects of reliable and verifiable text generation. In this work, we show that precision-based factuality metrics are limited in evaluating the comprehensiveness of text generations from certain domains, especially legal texts. We propose L-FRESco, a Factual Recall Evaluation Score for legal analysis generation. Inspired by FActScore, which decomposes generated text into atomic facts and then verifies their factuality, L-FRESco follows a Decompose-Then-Compare framework to compute similarity between the reference atomic claims and the generated atomic claims. Moreover, we explore a generalized variant, FRESco, and discuss its potential to be applied across text domains.
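
As a toy illustration of the Decompose-Then-Compare idea (not the paper's metric), the sketch below computes a recall over atomic claims using a simple token-overlap similarity; the threshold is an arbitrary assumption.

```python
# Toy decompose-then-compare recall: what fraction of reference atomic claims
# is matched by some generated atomic claim? Token-overlap (Jaccard) stands in
# for the paper's similarity model; the threshold is an arbitrary assumption.
def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

def factual_recall(reference_claims, generated_claims, threshold=0.5):
    matched = sum(
        any(jaccard(ref, gen) >= threshold for gen in generated_claims)
        for ref in reference_claims
    )
    return matched / len(reference_claims)

reference = ["the statute of limitations is two years",
             "the claim accrues at discovery of the injury"]
generated = ["the limitations period is two years from accrual"]
print(factual_recall(reference, generated))  # 0.5: one of two claims recalled
```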