The chatbots that millions of people use to write term papers, computer code, and fairy tales don’t just work with words. ChatGPT, OpenAI’s artificial-intelligence-powered tool, can also analyze images. It can describe what is in a photo, answer questions about it, and even recognize specific people’s faces. Eventually, someone may be able to upload a picture of a failing car engine or a mysterious rash and have ChatGPT suggest a fix.
What OpenAI doesn’t want is for ChatGPT to become a facial recognition machine.
For the past few months, Jonathan Mosen has been among a select group of people with access to an advanced version of the chatbot that can analyze images. On a recent trip, Mosen, the chief executive of an employment agency for blind people, used the visual analysis to tell which dispensers in a hotel bathroom held shampoo, which held conditioner and which held shower gel. It far surpassed the image-analysis software he had used in the past.
“It gave me the volume in milliliters of each bottle. It told me about the tiles in the shower,” Mosen said. “It described all of this in the way a blind person needs to hear it, and with one picture I had exactly the answers I needed.”
For the first time, Mosen said, he was able to “interrogate images.” He gave an example: the text accompanying an image he came across on social media described it as a “happy blonde woman.” When he asked ChatGPT to analyze the image, the chatbot said it showed a woman in a navy blue shirt taking a selfie in a full-length mirror. He could then ask follow-up questions, such as what kind of shoes she was wearing and what else was visible in the mirror’s reflection.
“It’s extraordinary,” said Mosen, 54, of Wellington, New Zealand, who has demonstrated the technology on his podcast about “living blindfully.”
When OpenAI announced GPT-4, the latest software model powering its AI chatbot, in March, the company said it was “multimodal,” meaning it could respond to text and image prompts. Most users have been able to converse with the bot only in words. But Mosen was given early access to the visual analysis through Be My Eyes, a startup that typically connects blind users with sighted volunteers and provides accessible customer service to corporate clients. Be My Eyes partnered with OpenAI this year to test the chatbot’s “sight” before the feature is released to the general public.
Recently, the app stopped giving Mosen information about people’s faces, saying they had been obscured for privacy reasons. He was disappointed, feeling that he should have the same access to information as a sighted person.
The change reflected OpenAI’s concern that it had built something with capabilities it wasn’t ready to release.
Sandhini Agarwal, an OpenAI policy researcher, said the company’s technology could primarily identify public figures, such as people with Wikipedia pages, but did not work as comprehensively as tools built for finding faces on the internet, such as Clearview AI and PimEyes. The tool can recognize OpenAI’s chief executive, Sam Altman, in photos, Agarwal said, but not other people who work at the company.
Releasing such a feature to the public would push the boundaries of what American technology companies generally consider acceptable practice. It could also cause legal trouble in jurisdictions such as Illinois and Europe, which require companies to obtain citizens’ consent before using their biometric information, including faceprints.
OpenAI was also concerned that the tool would say things it shouldn’t about people’s faces, such as assessing their gender or emotional state. Agarwal said the company was working out how to address these and other safety concerns before releasing the image-analysis feature widely.
“We very much want this to be a two-way conversation with the public,” she said. “If what we hear is, ‘We actually don’t want any of it,’ that’s something we’re very much on board with.”
Beyond feedback from Be My Eyes users, the company’s nonprofit arm is also seeking “democratic input” to help set rules for AI systems.
Agarwal said the development of visual analysis was not “unexpected,” because the model was trained on images and text collected from the internet. She pointed out that facial recognition software for celebrities already exists, such as a tool from Google. Google offers an opt-out for famous people who don’t want to be recognized, and OpenAI is considering that approach.
Agarwal said OpenAI’s visual analysis could also produce “hallucinations,” similar to those seen with text prompts. “If you give it a picture of someone on the verge of being famous, it might hallucinate a name,” she said. “Give it a photo of one famous tech CEO and it might give you a different tech CEO’s name.”
The tool once inaccurately described a remote control to Mosen, confidently telling him there was a button on it that wasn’t there, he said.
Microsoft, which has invested $10 billion in OpenAI, also has access to the visual-analysis tool. Some users of Microsoft’s AI-powered Bing chatbot have seen the feature appear in a limited rollout; after uploading an image there, they received a message saying that “privacy blur hides faces from Bing chat.”
Sayash Kapoor, a computer scientist and Ph.D. candidate at Princeton University, used the tool to decode a CAPTCHA, a visual security check meant to be intelligible only to human eyes. Even as it cracked the code and recognized the two obscured words it was given, the chatbot noted that “CAPTCHAs are designed to prevent automated bots like me from accessing certain websites and services.”
“AI is just blowing through all of the things that are supposed to separate humans from machines,” said Ethan Mollick, an associate professor of innovation and entrepreneurship at the University of Pennsylvania’s Wharton School.
Since the visual-analysis tool turned up in Mollick’s version of the Bing chatbot last month, making him one of the few people with early access without any announcement, he has not shut down his computer for fear of losing it. He gave it a photo of condiments in his refrigerator and asked Bing to suggest recipes using those ingredients. It came up with “whipped cream soda” and a “creamy jalapeño sauce.”
Both OpenAI and Microsoft appear aware of the power, and the potential privacy implications, of this technology. A Microsoft spokesperson said the company wasn’t “sharing technical details” about the face blurring but was “working closely with our partners at OpenAI to uphold our shared commitment to the safe and responsible deployment of AI technologies.”