This week, the White House announced it had taken “voluntary commitments” from seven major AI companies to manage the risks posed by artificial intelligence.
Getting companies like Amazon, Anthropic, Google, Inflection, Meta, Microsoft, and OpenAI to agree on anything is a step forward. These are fierce rivals with subtle but important differences in how they approach AI research and development.
Meta, for example, is so eager to put its AI models into the hands of developers that it open-sources many of them, releasing the code for anyone to use. Other labs, such as Anthropic, have taken a more cautious approach and released their technology in a more limited manner.
But what do these promises really mean? And given that they are not backed by the force of law, are they likely to change much about how AI companies operate?
Given the potential stakes of AI regulation, details matter. So let’s take a closer look at what was agreed to here and size up the likely impact.
Commitment 1: Companies commit to internal and external security testing of AI systems prior to release.
Each of these companies already performs security testing (often called “red-teaming”) of its models before releasing them. On one level, this isn’t really a new commitment. And it’s a vague one: it doesn’t say much about what kind of testing is required or who will do the testing.
In a statement accompanying the commitments, the White House said only that testing of AI models would be “partially conducted by independent experts” and focused on “AI risks such as biosecurity and cybersecurity, and their broader societal implications.”
It’s a good idea to get AI companies to publicly commit to continuing this kind of testing, and to be more transparent about how they do it. And some types of AI risk, such as the danger of AI models being used to develop bioweapons, are probably better assessed by government and military experts than by the companies themselves.
I’d like to see the AI industry agree on a standard battery of safety tests, such as the “autonomous replication” tests that the Alignment Research Center has run on pre-release models from OpenAI and Anthropic. I’d also like the federal government to fund this kind of testing, which can be expensive and requires engineers with deep technical expertise. Right now, many safety tests are funded and overseen by the companies themselves, which raises obvious conflict-of-interest concerns.
Commitment 2: The companies commit to sharing information across the industry and with governments, civil society, and academia on managing AI risks.
This promise is also a bit vague. Some of these companies already publish information about their AI models, usually in academic papers or company blog posts. A few, including OpenAI and Anthropic, also publish documents called “system cards” outlining the steps they have taken to make their models safer.
But they have sometimes withheld information, citing safety concerns. When OpenAI released its latest model, GPT-4, this year, it broke with industry convention and chose not to disclose how much data the model was trained on or how large the model was (a metric known as “parameters”). It said it withheld the information because of competitive and safety concerns. It also happens to be the kind of data tech companies like to keep away from competitors.
Will AI companies be compelled to release that kind of information under these new commitments? And what if doing so risks accelerating the AI arms race?
I suspect that the White House’s goal is not to force companies to disclose parameter numbers, but to encourage companies to exchange information about the risks their models pose (or don’t pose).
But such information sharing can carry risks of its own. If Google’s AI team prevents a new model from being used to engineer a deadly bioweapon during pre-release testing, should it share that information outside Google? Would doing so risk giving bad actors ideas about how to get a less guarded model to perform the same task?
Commitment 3: The companies commit to investing in cybersecurity and insider-threat safeguards to protect proprietary and unreleased model weights.
This one is pretty straightforward and uncontroversial among the AI people I talk to. “Model weights” is a technical term for the mathematical instructions that give an AI model its capabilities. Weights are what you’d want to steal if you were an agent of a foreign government (or a rival company) who wanted to build your own version of ChatGPT or another AI product. And AI companies have a vested interest in keeping tight control over them.
The problem of leaked model weights is already well known. The weights for Meta’s original LLaMA language model, for example, were leaked on 4chan and other websites just days after the model was publicly released. Given the risk of further leaks, and the interest other countries may have in stealing this technology from US companies, asking AI companies to invest more in their own security seems like a no-brainer.
Commitment 4: The companies commit to facilitate the discovery and reporting of vulnerabilities in their AI systems by third parties.
I’m not quite sure what this means. Every AI company discovers vulnerabilities in its models after releasing them, usually because users try to trick the models into misbehaving in ways the company didn’t anticipate, or circumvent their guardrails (a practice known as “jailbreaking”).
The White House pledge calls for companies to establish “robust reporting mechanisms” for these vulnerabilities, but it’s not clear what that means. An in-app feedback button, similar to the ones that let Facebook and Twitter users report rule-breaking posts? A bug bounty program, like the one OpenAI started this year to reward users who find flaws in its systems? Something else? We’ll have to wait for details.
Commitment 5: The companies commit to developing robust technical mechanisms, such as watermarking systems, to ensure that users know when content has been generated by AI.
This is an interesting idea, but one that leaves a lot of room for interpretation. So far, AI companies have struggled to build tools that let people reliably tell whether they are looking at AI-generated content. There are good technical reasons for this, but it’s a real problem when people can pass off AI-generated work as their own. (Ask any high school teacher.) And many of the tools currently advertised as able to detect AI output can’t actually do so with any real accuracy.
I’m not optimistic that this problem will ever be fully solved. But it’s good to see companies committing to work on it.
Commitment 6: Companies commit to publicly report the capabilities, limitations, and areas of appropriate and inappropriate use of their AI systems.
This is another sensible-sounding pledge with lots of room for interpretation. How often will companies be required to report on their systems’ capabilities and limitations? How detailed will that information have to be? And given that many companies building AI systems have been surprised by their systems’ capabilities after the fact, how accurately can they really be expected to describe those capabilities in advance?
Commitment 7: The companies commit to prioritizing research on the societal risks that AI systems can pose, including avoiding harmful bias and discrimination and protecting privacy.
Committing to “prioritize research” is about as fuzzy as a commitment gets. Still, I’m confident this one will be well received by many in the AI ethics community, who want AI companies to make preventing near-term harms like bias and discrimination a priority over worrying about doomsday scenarios, as the AI safety crowd does.
If you’re confused by the difference between “AI ethics” and “AI safety,” know that there are two warring factions within the AI research community, each of which believes the other is focused on preventing the wrong kinds of harm.
Commitment 8: The companies commit to developing and deploying advanced AI systems to help address society’s greatest challenges.
Few people would argue that advanced AI should not be used to help address society’s greatest challenges. The White House lists cancer prevention and climate change mitigation as two areas it wants AI companies to focus on, and I can’t disagree with that.
What complicates this goal somewhat, though, is that in AI research, seemingly frivolous work often turns out to have serious implications. Some of the technology behind DeepMind’s AlphaGo, an AI system trained to play the board game Go, later proved useful for predicting the three-dimensional structures of proteins, a major breakthrough that accelerated basic scientific research.
Overall, the deal between the White House and AI companies seems more symbolic than substantive. There are no enforcement mechanisms to ensure that companies honor these promises, many of which reflect precautions AI companies have already taken.
Still, it’s a reasonable first step. And agreeing to follow these rules suggests the AI companies have learned from the mistakes of earlier tech companies, which waited to engage with the government until they got into trouble. In Washington, at least when it comes to tech regulation, it pays to show up early.