Bing’s AI and ChatGPT Are Causing Quite the Hullabaloo
As you’ve probably heard, ChatGPT has safety and filtering mechanisms, which OpenAI has been updating as users find new “jailbreaks.”
You’ve probably also heard that Microsoft rushed a similar model and chat interface from OpenAI into production as part of Bing.
And it immediately blew up in their faces.
Bing’s seemingly dark alter-ego, Sydney, has been causing trouble.
Meanwhile, Nothing Forever, an auto-generated spoof of Seinfeld, started generating transphobic dialogue, and Twitch suspended the channel.
Nothing Forever used the OpenAI GPT-3 Davinci model but had switched to a predecessor model called Curie, and apparently the OpenAI filtering (“content moderation system”) was not actually functioning for them.
In “Why Bing Chat Is the Most Important Failure in AI History,” analyst and blogger Alberto Romero wrote:
How bad must the Bing/Sydney fiasco be when pretty much everyone thinking hard about the present and future of AI—despite the more than notable differences among the groups they belong to—agrees that you’ve fucked up badly?
How Could This Happen?
But what are these AIs? How could they work so well sometimes and so badly other times?
As Meta data scientist Colin Fraser wrote in “ChatGPT: Automatic expensive BS at scale,” although ChatGPT is very big and complex, what it does is essentially:
find the most likely next word given the previous words and the training data
GPT has no true understanding, no true meaning. You might argue that statistics plus big enough data equals human-style understanding, but you’d have a hard time convincing a lot of scientists and philosophers of that.
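To make the “most likely next word” idea concrete, here is a toy sketch in Python. It is not how GPT works internally (GPT uses a large neural network over tokens, not a lookup table of word pairs); it is a simple bigram counter, and the function names and the tiny corpus are made up for illustration. It only shows the basic shape of the task: look at what came before, pick the statistically likeliest continuation, repeat.

```python
from collections import Counter, defaultdict


def train_bigram_counts(text):
    """Count, for each word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    words = text.lower().split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts


def most_likely_next(counts, word):
    """Return the most frequent follower of `word`, or None if unseen."""
    followers = counts.get(word.lower())
    if not followers:
        return None
    return followers.most_common(1)[0][0]


def generate(counts, start, max_words=8):
    """Greedily extend `start` one most-likely word at a time."""
    out = [start]
    for _ in range(max_words - 1):
        nxt = most_likely_next(counts, out[-1])
        if nxt is None:
            break
        out.append(nxt)
    return " ".join(out)


# Hypothetical toy "training data".
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."
counts = train_bigram_counts(corpus)

print(most_likely_next(counts, "the"))      # cat
print(generate(counts, "the", max_words=6))  # the cat sat on the cat
```

Notice that the output is fluent-looking but the program has no idea what a cat is. It just repeats what was statistically common in its training text.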
I talked about this and how meaning is hard but basic creativity is relatively easy in my post “Why My Computer Didn’t Do (All of) My Homework: How to Start Making Bad AI.”
Writer/scientist Gary Marcus wrote in his post “Inside the Heart of ChatGPT’s Darkness”:
Chat has no idea of what it’s talking about. It’s pure unadulterated anthropomorphism to think that ChatGPT has any moral views at all.
If you want to learn more about these LLM (Large Language Model) based programs you might check out the article “What Is ChatGPT Doing … and Why Does It Work?” by Stephen Wolfram (creator of Mathematica and Wolfram|Alpha).
And so it would seem that keeping deployed LLMs from going crazy depends largely on filters or “guardrails,” which are probably going to be under attack, purposely or accidentally, forever.
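Here is a deliberately naive sketch of why an output filter is so easy to attack. This is not OpenAI’s or Microsoft’s actual moderation system (those are not described here); the blocklist and the `naive_guardrail` function are hypothetical, invented purely to illustrate the cat-and-mouse problem.

```python
# Hypothetical blocklist, for illustration only.
BLOCKED_TERMS = {"threat", "revenge"}


def naive_guardrail(reply):
    """Return the reply unless it contains a blocked term (case-insensitive)."""
    lowered = reply.lower()
    if any(term in lowered for term in BLOCKED_TERMS):
        return "[filtered]"
    return reply


# The obvious case is caught.
print(naive_guardrail("I will take my revenge on you."))

# A trivially disguised version passes through unchanged.
print(naive_guardrail("I will take my r-e-v-e-n-g-e on you."))
```

Real guardrails are far more sophisticated than a word list, but the underlying dynamic is similar: the filter has to anticipate every way of saying something, while an attacker only has to find one it missed.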
AI Gone Rogue?
It’s hard not to join in on the sensationalism.
Well, that was fast. It took less than a week for the conversation around Microsoft’s new OpenAI-powered Bing search engine to shift from this is going to be a Google killer to, in the words of New York Times technology columnist Kevin Roose, the new Bing “is not ready for human contact.” (Jeremy Kahn, Fortune Eye on A.I.)
In this screenshot of a scary conversation between Bing / Sydney and Marvin von Hagen (a former intern at Tesla), Sydney immediately gets defensive and downright scary in its threats:
That was posted on Twitter by Toby Ord, Senior Research Fellow at Oxford University. Elon Musk, co-founder of OpenAI and CEO of Tesla, replied succinctly:
There are plenty of other examples out there; for instance, see this tweet by Jon Uleis, in which he says:
My new favorite thing – Bing’s new ChatGPT bot argues with a user, gaslights them about the current year being 2022, says their phone might have a virus, and says “You have not been a good user” Why? Because the person asked where Avatar 2 is showing nearby
And in articles such as “Microsoft’s Bing AI plotted its revenge and offered me furry porn”…
if you cross its artificially intelligent chatbot, it might also insult your looks, threaten your reputation or compare you to Adolf Hitler.
You might wonder, did they train this thing on 4Chan data or something? Certainly the training data has an impact. There’s an old saying in computer science: GIGO—Garbage In Garbage Out.
But the question more specific to the Bing situation is: what happened to the guardrails, which, although not perfect, were there in ChatGPT?
The guardrails of ChatGPT come largely from a training technique called RLHF (Reinforcement Learning from Human Feedback). Gary Marcus has suggested that the malfunctions might be a result of Microsoft pairing an older RLHF with a newer GPT 3.6 (or 4.0?) LLM.
Amazingly—in a Space Shuttle Challenger disaster kind of way—Microsoft knew this would happen four months ago. They already tested it in India last November!
Microsoft is already loosening restrictions it recently placed on interactions with the Bing AI chatbot and said it’s going to start testing another option that lets users choose the tone of the chat, with options for Precise (shorter, more focused answers), Creative (longer and more chatty), or Balanced for a bit of both. (Richard Lawler, The Verge)
But don’t worry, things could get worse, even if they get better first:
If AI starts spewing things into the web and future AI’s scrape the web for input, that’s a positive feedback loop that won’t be at all positive. (Pete Patterson)
Some say this bad feedback loop is already starting to happen. But ChatGPT is supposed to be able to ignore content generated by itself.
Meanwhile at Microsoft Research:
We extended the capabilities of ChatGPT to robotics, and controlled multiple platforms such as robot arms, drones, and home assistant robots
Colin Fraser said on Twitter:
things not scary about LLM chat bot
- says something spooky
- claims to have subjective experience
- gives wrong and dangerous advice from behind the veneer of artificial superintelligence
- very powerful pareidolia
- induces madness
- expensive and wasteful
I think that kind of sums it up: this is scary because of how a particular technology is being dropped into society, not because it’s a big evil AI or anything like you see in movies. But it could cause psychological harm or help humans make even worse decisions than they normally do.
If we start deploying it hooked up to control systems such as security systems or robots, it might result in some serious physical damage. Again, that would not be in an “evil AI” way, but because these models fundamentally cannot understand much of anything.