How do the White House commitments on AI stack up?

The White House announced this week that it has secured “voluntary commitments” from seven leading AI companies to manage the risks posed by AI.

Getting companies—Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI—to agree to anything is a step forward. They include bitter rivals with subtle but important differences in the ways they approach AI research and development.

Meta, for example, is so eager to get its AI models into the hands of developers that it has opened several of them up, putting their code out in the open for anyone to use. Other labs, such as Anthropic, have taken a more cautious approach, releasing their technology in more limited ways.

But what do these obligations actually mean? And are they likely to change much about how AI companies operate, given that they are not backed by the force of law?

Given the potential risks of regulating AI, the details matter. So let’s take a closer look at what’s been agreed upon here and quantify the potential impact.

Commitment 1: Companies are committed to internal and external security testing of their AI systems prior to release.

Each of these AI companies already runs security testing – what’s often called a “red team” – of their models before they’re released. On one level, this isn’t really a new commitment. It is a vague promise. It does not come with many details about the type of test required, or who will perform the test.

in Accompanying statement of commitmentsThe White House said only that testing of AI models “will be conducted in part by independent experts” and focus on AI risks “such as biosecurity and cybersecurity, as well as its broader societal implications.”

It’s a good idea to have AI companies publicly commit to continuing to run this type of testing, and to encourage more transparency in the testing process. And there are some kinds of AI risks — such as the risk that AI models will be used to develop biological weapons — that government and military officials may be more appropriate than companies to assess.

I’d like to see the AI ​​industry agree to a standard set of safety tests, such as the “standalone copy” tests Alignment Research Center Runs on models previously released by OpenAI and Anthropic. I’d also like to see the federal government fund these types of tests, which can be expensive and require engineers with significant technical expertise. Currently, many safety tests are funded and supervised by companies, which raises obvious questions about conflicts of interest.

Commitment 2: Companies commit to sharing information across the industry and with governments, civil society, and academia about AI risk management.

This commitment is also a bit vague. Many of these companies already publish information about their AI models—usually in academic papers or company blog posts. A few of them, including OpenAI and Anthropic, also publish documents called “System Cards,” which outline the steps they’ve taken to make these models more secure.

But they also withheld information at times, citing safety concerns. When OpenAI released its latest AI model, GPT-4, this year I cut industry habits It chose not to disclose how much data it was trained on, or the size of the model (a metric known as “parameters”). It said it declined to release this information due to competition and safety concerns. It also happens to be the kind of data tech companies want to keep out of competition.

Under these new commitments, will AI companies be forced to release this kind of information? What if doing so threatens to accelerate the AI ​​arms race?

I suspect the White House’s goal is less to force companies to disclose the number of their standards than to encourage them to share information with each other about the risks that their models pose (or do not represent).

But even this kind of information sharing can be fraught with danger. If Google’s AI team prevents a new model from engineering a deadly biological weapon from being used during beta testing, should it share that information outside of Google? Would that risk giving bad parties ideas about how they could get a less guarded model to perform the same task?

Commitment 3: Companies commit to investing in cybersecurity and insider threat safeguards to protect the weights of proprietary and unpublished models.

This one is straightforward and uncontroversial among the AI ​​insiders I spoke with. “Model weights” is a technical term for the mathematical instructions that give AI models the ability to run. Weights are what you want to steal if you’re an agent of a foreign government (or a competing company) who wants to build your own version of ChatGPT or another AI product. And it’s something AI companies have a vested interest in keeping under tight scrutiny.

There have already been well publicized issues with model weights leaking. The weights of the LLaMA language archetype were from Meta, for example Leaked on 4chan and other websites a few days after the form was released to the public. Given the risks of further leaks — and the interest other countries might have in stealing this technology from US companies — asking AI companies to invest more in their own security seems like a no-brainer.

Commitment 4: Companies are committed to making it easier for third parties to discover and report vulnerabilities in their AI systems.

I’m not really sure what this means. Every AI company has discovered vulnerabilities in their models after they launch, usually because users try to do bad things with the models or circumvent their own firewalls (a practice known as “jailbreaking”) in ways the companies didn’t expect.

The White House commitment calls for companies to create a “robust reporting mechanism” for these vulnerabilities, but it’s not clear what that might mean. In-app comments button, similar to the one that allows Facebook and Twitter users to report posts that break the rules? A bug reward program, such as the program OpenAI started this year To reward users who find flaws in its systems? another thing? We’ll have to wait for more details.

Commitment 5: Companies are committed to developing robust technical mechanisms to ensure that users know when content is being generated by AI, such as a system of watermarks.

This is an interesting idea but leaves a lot of room for interpretation. Until now, AI companies have struggled to create tools that let people know whether or not they’re looking at AI-generated content. There are good technical reasons for this, but it is a real problem when people can consider work generated by AI as their own. (Ask any high school teacher.) And many of the tools currently being touted as being able to detect AI output cannot do so with any degree of accuracy.

I’m not optimistic that this problem is completely fixable. But I’m glad companies are pledging to work on it.

Commitment 6: Companies are obligated to publicly communicate the capabilities and limitations of their AI systems and areas of appropriate and inappropriate use.

Another undertaking that makes sense with plenty of wiggle room. How often will companies be required to report the capabilities and limitations of their systems? How detailed will this information be? And given that many companies that build AI systems have been surprised by the capabilities of their own systems after the fact, how well can they be described upfront?

Commitment 7: Companies commit to prioritizing research on the societal risks that AI systems can pose, including avoiding harmful bias and discrimination and protecting privacy.

The commitment to “prioritize research” is about as nebulous as the commitment becomes. Still, I’m sure this commitment will be well received by many in the AI ​​ethics crowd, who want AI companies to make preventing near-term harms like bias and discrimination a priority over worrying about doomsday scenarios, as AI safety experts do.

If you’re confused by the difference between “AI ethics” and “AI security,” just know that there are two warring factions within the AI ​​research community, each believing the other is focused on preventing the wrong kinds of harm.

Commitment 8: Companies are committed to developing and deploying advanced AI systems to help solve society’s biggest challenges.

I don’t think many people would argue that advanced AI should do that no Used to help tackle society’s biggest challenges. The White House lists “cancer prevention” and “climate change mitigation” as two areas where it would like AI companies to focus their efforts, and you won’t get any disagreement from me there.

What makes this goal somewhat complex, though, is that in AI research, what starts out as trivial often has more serious implications. Some of the technology that went into DeepMind’s AlphaGo program — an artificial intelligence system trained to play the board game Go — has been shown to be useful in predicting the 3D structures of proteins, a key finding that has bolstered basic scientific research.

Overall, the White House’s deal with AI companies seems more symbolic than substantial. There is no enforcement mechanism to make sure companies adhere to these obligations, and many of them reflect precautions AI companies are already taking.

However, it is a reasonable first step. Agreeing to follow these rules shows that AI companies have learned from the failures of past technology companies, which waited to engage with government until they got into trouble. In Washington, at least in terms of technology regulation, it pays to show up early.