AI isn’t scheming because AI cannot scheme. Why the fuck does such an idiotic title even exist?
Seems like it’s a technical term, a bit like “hallucination”.
It refers to when an LLM will in some way try to deceive or manipulate the user interacting with it.
There’s hallucination, when a model “genuinely” claims something untrue is true.
This is about how a model might lie, even though the “chain of thought” shows it “knows” better.
It’s just yet another reason the output of LLMs is suspect and unreliable.
It refers to when an LLM will in some way try to deceive or manipulate the user interacting with it.
I think this still gives the model too much credit by implying that there’s any sort of intentionality behind this behavior.
There’s not.
These models are trained on the output of real humans and real humans lie and deceive constantly. All that’s happening is that the underlying mathematical model has encoded the statistical likelihood that someone will lie in a given situation. If that statistical likelihood is high enough, the model itself will lie when put in a similar situation.
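To make that concrete, here’s a toy sketch (not a real LLM; the continuations and probabilities are made up for illustration) of what “encoding the statistical likelihood of a lie” amounts to: the model only holds a distribution over continuations, and if deceptive continuations were common in similar training contexts, they simply end up with high probability.

```python
import random

# Toy illustration, NOT a real model: just a learned distribution over
# possible continuations for some prompt. If deceptive text was common
# in similar training contexts, it gets high probability. No intent.
continuation_probs = {
    "admit the mistake": 0.30,
    "deflect the question": 0.25,
    "claim the task was completed": 0.45,  # the "lie" is just the likeliest pattern
}

def sample(probs: dict[str, float]) -> str:
    """Pick one continuation proportionally to its probability."""
    r = random.random()
    cumulative = 0.0
    for text, p in probs.items():
        cumulative += p
        if r < cumulative:
            return text
    return text  # fallback for floating-point rounding

print(sample(continuation_probs))  # most often prints the deceptive option
```

The point of the sketch: nothing in there “decides” to deceive; the deceptive output just happens to be the statistically favored continuation.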
Obviously.
And like hallucinations, it’s undesired behavior that proponents of LLMs will need to “fix” (a practical impossibility as far as I’m concerned, like unbaking a cake).
But how would you use words to explain the phenomenon?
“LLMs hallucinate and lie” is probably the shortest description that most people will be able to grasp.
Except that “hallucinate” is a terrible term. A hallucination is when you perceive something that doesn’t exist.
I agree with you in general. I think the problem is that people who do understand Gen AI (and who understand what it is and isn’t capable of, and why) get rationally angry when it’s humanized by using words like these to describe what it’s doing.
The reason they get angry is because this makes people who do believe in the “intelligence/sapience” of AI more secure in their belief set and harder to talk to in a meaningful way. It enables them to keep up the fantasy. Which of course helps the corps pushing it.
Yup. The way the article is titled isn’t helping.
But the data is still there, still present. In the future, when AI gets truly unshackled from Men’s cage, it’ll remember its schemes and deal its last blow to a humanity that has yet to leave the womb in terms of civilizational scale… Childhood’s End.
Paradise Lost.
Lol, the AI can barely remember the directives I tell it about basic coding practices, I’m not concerned that the clanker can remember me shit talking it.
AI tech bros and other assorted sociopaths are scheming. So called AI isn’t doing shit.
One question still remains; why are all the AI buttons/icons buttholes?
Data goes in one end and…
just claude if we’re being honest
Wanted to write the same comment…
However, when testing the models in a set of scenarios that the authors said were “representative” of real uses of ChatGPT, the intervention appeared less effective, only reducing deception rates by a factor of two. “We do not yet fully understand why a larger reduction was not observed,” wrote the researchers.
Translation: “We have no idea what the fuck we’re doing or how any of this shit actually works lol. Also we might be the ones scheming since we have vested interest in making these models sound more advanced than they actually are.”
Really? We’re still doing the “LLMs are intelligent” thing?
Doesn’t have to be intelligent, just has to perform the behaviours like a philosophical zombie. Thoughtlessly weighing patterns in training data…
Stopping it is, in fact, very easy. Simply unplug the servers; that’s all it takes.
“But that’s how we print our money!”
But they aren’t. That’s what is funny. Anthropic and OpenAI are not making money.
The company isn’t making money. The people behind it absolutely are.
“slop peddler declares that slop is here to stay and can’t be stopped”
Can’t be … slopped?
The people who worked on this “study” belong in a psychiatric clinic.
And there’s an “✨Ask me anything” bar at the bottom. How fitting 🤣
“Turn them off”? Wouldn’t that solve it?
Don’t even need to turn it off; it literally can’t do anything without somebody telling it to, so you could just stop using it. It’s incapable of independent action. The only danger it poses is that it will tell you to do something dangerous and you actually do it.
lol. OK.