
The Race to Prevent ‘the Worst Case Scenario for Machine Learning’

Dave Willner has had a front-row seat to the evolution of the worst things on the internet.

He started working at Facebook in 2008, back when social media companies were still making up their rules as they grew. As head of the company’s content policy, Mr. Willner wrote Facebook’s first official community standards more than a decade ago, turning what he said was at the time primarily “Hitler and the Naked People” into what is now a vast catalog of defamatory, criminal and otherwise grotesque content that is banned across all of Meta’s platforms.

So last year, when OpenAI, the artificial intelligence lab in San Francisco, was preparing to launch Dall-E, a tool that lets anyone instantly create an image simply by describing it in a few words, the company hired Mr. Willner as its head of trust and safety. Initially, that meant reviewing all of the images and prompts that Dall-E’s filters flagged as potential violations and figuring out ways to keep would-be offenders from succeeding.

It did not take long for Mr. Willner to find himself confronting a familiar threat.

Just as child predators have for years used Facebook and other major technology platforms to circulate images of child sexual abuse, they are now attempting to use Dall-E to create entirely new ones. “I’m not surprised that people would try it,” Mr. Willner said. “But to be clear, neither were the people at OpenAI.”

For all the recent buzz around the hypothetical existential risks of generative AI, experts say it is this immediate threat — child predators already using new AI tools — that deserves the industry’s undivided attention.

In a newly published paper, researchers at the Stanford Internet Observatory and Thorn, a nonprofit that fights the spread of child sexual abuse online, found a small but meaningful uptick since last August in the amount of photorealistic, AI-generated child sexual abuse material circulating on the dark web.

According to Thorn’s researchers, this has manifested mostly in imagery that uses the likeness of real victims but visualizes them in new poses, subjected to new and increasingly egregious forms of sexual violence. The majority of these images, the researchers found, were generated not by Dall-E but by open-source tools that were developed and released with few protections in place.

In their paper, the researchers reported that less than 1 percent of the child sexual abuse material found in a sample of known predatory communities appeared to be photorealistic, AI-generated imagery. But given the breakneck pace of development of these generative AI tools, the researchers expect that number will only grow.

“Within a year, we’re going to reach a very problematic state in this field,” said David Thiel, the chief technologist of the Stanford Internet Observatory, who wrote the paper with Dr. Rebecca Portnoff, Thorn’s director of data science, and Melissa Strobel, Thorn’s head of research. “This is just the worst case scenario for machine learning, in my opinion.”

Dr. Portnoff has been working on machine learning and child safety for more than a decade.

To her, the fact that companies like OpenAI are already thinking about this issue speaks to a learning curve that is at least faster than the one the social media giants were on in their early days.

“I’m in a different position today,” Dr. Portnoff said.

Still, she said, “If I could turn back the clock, it would be a year ago.”

In 2003, Congress passed a law banning “computer-generated child pornography,” a rare instance of Congress looking to the future. At the time, though, creating such images was prohibitively expensive and technically complex.

The cost and complexity of creating those images had been steadily declining, but things changed last August with the public release of Stable Diffusion, a free, open-source text-to-image generator developed by Stability AI, a machine learning company based in London.

In its early iterations, Stable Diffusion placed few restrictions on the kinds of images its model could generate, including ones containing nudity. “We trust our people, we trust our community,” the company’s chief executive, Emad Mostaque, told The New York Times last fall.

Motez Bishara, director of public relations at Stability AI, said in a statement that the company prohibits the use of its technology for “illegal or immoral” purposes, such as creating child sexual abuse material. “We strongly support law enforcement efforts against those who misuse our products for illegal or nefarious purposes,” Bishara said.

Because the model is open source, developers can download the code onto their own computers, modify it, and use it to generate, among other things, realistic adult pornography. In their paper, the researchers at Thorn and the Stanford Internet Observatory found that predators had tweaked these models so that they could also produce sexually explicit images of children. The researchers demonstrated a sanitized version of this in their report by modifying a single AI-generated image of a woman until it looked like an image of Audrey Hepburn as a child.

Stability AI has since released a filter that attempts to block what the company calls “unsafe and inappropriate content.” And a newer version of the technology was built using a data set that excludes content deemed “not safe for work.” But Mr. Thiel said people are still using the older models to produce imagery that the newer ones prohibit.

Unlike Stable Diffusion, Dall-E is not open source and can be accessed only through OpenAI’s own interface. The model was also developed with many more safeguards, prohibiting even the creation of legal nude images of adults. “The models themselves tend to refuse to engage in sexual conversations with you,” Mr. Willner said. “We did that mostly out of caution around these darker sexual topics.”

The company also put guardrails in place early on to prevent people from using certain words or phrases in their Dall-E prompts. But Mr. Willner said predators still try to game the system by using what researchers call “visual synonyms,” creative terms that evade the guardrails while describing the images they want to produce.

“Even if you remove the model’s knowledge of what blood looks like, it still knows what water looks like, and it knows what the color red is,” Mr. Willner said. “That problem also exists for sexual content.”

Thorn has developed a tool called Safer, which scans images for child abuse and helps companies report suspected material to the National Center for Missing & Exploited Children, which operates a federally designated clearinghouse for such reports. OpenAI uses Safer to scan the content people upload to Dall-E’s editing tool. That is useful for catching real images of children, but Mr. Willner said even the most sophisticated automated tools can struggle to accurately identify AI-generated imagery.

That is an emerging concern among child safety experts: AI could be used not only to create new images of real children but also to make explicit images of children who do not exist.

That content is illegal on its own and would need to be reported. But the possibility has also raised concerns that the federal clearinghouse could be further inundated with fake imagery, complicating efforts to identify real victims. Last year alone, the center’s CyberTipline received about 32 million reports.

“Will we know when we start getting reports? Will they be tagged, or will they be distinguishable from images of actual children?” said Yiota Souras, the chief legal officer of the National Center for Missing & Exploited Children.

At least some of those answers will need to come not only from AI companies like OpenAI and Stability AI, but also from the companies that run messaging apps and social media platforms, such as Meta, the top reporter to the CyberTipline.

More than 27 million tips last year came from Facebook, WhatsApp and Instagram alone. Tech companies already use a classification system, developed by an industry alliance called the Tech Coalition, to categorize suspected child sexual abuse material by the apparent age of the victim and the nature of the acts depicted. In their paper, the Thorn and Stanford researchers argue that these classifications should be broadened to also reflect whether an image was computer-generated.

In a statement to The New York Times, Antigone Davis, Meta’s global head of safety, said the company was weighing what information would be most beneficial to include and how it should be conveyed. Ms. Davis said Meta would work with the National Center for Missing & Exploited Children to determine the best way forward.

Beyond the responsibilities of the platforms, researchers argue, the AI companies themselves can do more. Specifically, they could train their models not to create images of child nudity, and they could clearly identify images as AI-generated as those images make their way around the internet. That would mean baking a watermark into images that is harder to remove than the ones Stability AI and OpenAI have already implemented.

As lawmakers consider regulating AI, experts view some form of watermarking or provenance-tracking requirement as key to fighting not only child sexual abuse material but also misinformation.

“We’re only as good as the lowest common denominator here, and that’s why we need a regulatory regime,” said Hany Farid, a professor of digital forensics at the University of California, Berkeley.

Professor Farid is the lead developer of PhotoDNA, a tool launched by Microsoft in 2009 that many tech companies now use to automatically find and block known child sexual abuse imagery. Big tech companies were too slow to develop and deploy that technology, he said, allowing child sexual abuse material to fester for years. He is currently working with a number of tech companies to create a new technical standard for tracing AI-generated imagery. Stability AI is among the companies planning to implement the standard.

Another open question is how courts will treat cases brought against the creators of AI-generated child sexual abuse material, and what liability the AI companies themselves will face. The law banning “computer-generated child pornography” has been on the books for 20 years but has never been tested in court. An earlier law that sought to ban what was then called virtual child pornography was struck down by the Supreme Court in 2002 as an infringement on speech.

Members of the European Commission, the White House and the U.S. Senate Judiciary Committee have been briefed on the Stanford and Thorn findings. It is vital, Mr. Thiel said, that companies and lawmakers find answers to these questions before the technology advances much further to include things like full-motion video. “We have to get it by then,” Mr. Thiel said.

Julie Cordua, Thorn’s chief executive, said the researchers’ findings should be seen as a warning as well as an opportunity. Unlike the social media giants, who realized years too late that their platforms were enabling child predators, she argues, there is still time to keep the problem of AI-generated child abuse imagery from spiraling out of control.

“We know what these companies should be doing,” Ms. Cordua said. “We just have to do it.”
