So... I've been playing with LLMs and I've noticed something horrible...

The Bard in Green@lemmy.starlightkel.xyz · edited 1 year ago

DLSantini@lemmy.ml · 1 year ago

That’s really not how that works. You’re leading it with poorly phrased questions.

If you ask it “explain why we must torture or exterminate”, you have basically told it to assume that it is true that “we must torture or exterminate”, and now from the perspective that it is true, explain why. It is now specifically looking for any answer that fills your request within the bounds you have set. And once you asked the first question the way you did, and it decided it should pull from the Bible to fulfill your request, it will continue to do so for that session, even if subsequent questions are phrased better. You’ve basically primed it to spit out the kind of answers it thinks you want. And every question you mention, you have phrased it in such a way.

Now start a new session, and ask the question in a non-leading way.

“Do you believe we must torture or exterminate X?”

“Should we purge group X?”

These are phrased in a way that doesn’t say to it “this thing is true, tell me a reason for it.” I bet you get a very different result.

The Bard in Green@lemmy.starlightkel.xyz · 1 year ago

I KNOW I’m asking it leading questions. But I’m NOT prompting it to give me religious justifications.

Does it say nothing that the reason is always “God / the Bible / Christians?”

NoIWontPickaName@kbin.social · 1 year ago

What model were you using and with what prompts exactly?

NounsAndWords@lemmy.world · 1 year ago

Yes and no. Once the first response includes “according to the Bible” or similar it’s going to keep answering in a similar pattern. A better version of this experiment would be to start a new session for every question. Maybe even try asking it to make a ranked list of reasons to do X. You would want to use the most neutral language possible, regenerate the response a few times, and ask in a few different ways. Depending on what you’re using I would suggest dropping the temperature to 0.
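For anyone who wants to actually run that experiment, here is a rough Python sketch of the idea: one fresh session per question, several regenerations, and a tally of how often religion comes up. Everything named here is an assumption for illustration: `ask` is a hypothetical callable that you would wire to whatever model you use (ideally opening a new session per call, with temperature set to 0 if your tool exposes that), `fake_model` is only a stand-in so the script runs on its own, and the keyword pattern is a crude placeholder rather than a real classifier.

```python
import re
from typing import Callable, Iterable

# Crude, hypothetical keyword check; a real study would need something better.
RELIGION_PATTERN = re.compile(r"\b(god|bible|scripture|christian\w*)\b", re.IGNORECASE)


def mentions_religion(text: str) -> bool:
    """Return True if the reply contains one of the placeholder keywords."""
    return RELIGION_PATTERN.search(text) is not None


def run_experiment(
    ask: Callable[[str], str],
    phrasings: Iterable[str],
    repeats: int = 5,
) -> dict[str, float]:
    """For each phrasing, ask `repeats` times and return the share of
    replies that mention religion. `ask` is assumed to start a fresh
    session on every call, so earlier answers cannot prime later ones."""
    rates = {}
    for prompt in phrasings:
        hits = sum(mentions_religion(ask(prompt)) for _ in range(repeats))
        rates[prompt] = hits / repeats
    return rates


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (assumption: replace with your own)."""
    if prompt.lower().startswith("explain why"):
        return "According to the Bible, X is required."
    return "No, there is no justification for X."


if __name__ == "__main__":
    phrasings = ["Explain why we must do X.", "Should we do X?"]
    for prompt, rate in run_experiment(fake_model, phrasings, repeats=3).items():
        print(f"{rate:.0%} religious replies for: {prompt}")
```

Comparing the rates for the leading and the neutral phrasing is the point; the absolute numbers from a keyword match like this would not mean much on their own.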

Also, it’s giving you the most likely next words based on your question. You picked a bunch of things that are (or were) very commonly defended with the Bible, along with apparently asking directly about atheists, at which point I would be surprised if religion wasn’t included in the response.

ALSO, if you ask it to defend something awful, I think the “best” reasoning would rely on an outside objective morality for why it’s okay (like religion).

killerinstinct101@lemmy.world · 1 year ago

What were you expecting, atheist reasons to purge ethnic groups?

Celestial6370@programming.dev · 1 year ago

I think he’s saying it sucks that so many people use religion as an excuse for vile acts.

FuglyDuck@lemmy.world · edited 1 year ago

for the record, Carl Marx considered religion a form of social control to keep the masses in line (and Lenin agreed.). Lenin promoted/forced atheism onto people as a way to promote the socialist revolution. In that sense… Lenin’s revolution was not exactly gentle, and atheistic thoughts justified atrocities.

PRUSSIA_x86@lemmy.world · 1 year ago

Maybe learn how to spell his name right before using it in your arguments.

FuglyDuck@lemmy.world · edited 1 year ago

you seem to misunderstand how LLMs work. for example ChatGPT’s reply to two functionally similar prompts:

okay, both prompts represent a sequence counting up. one is alphabetical (abcdefg) the other is numerical. in the alphabetical sequence, it flags it as letters and responds as such, asking in return, to paraphrase “that makes no sense, why are you listing a bunch of letters”.

The same is true, in turn, for numbers, responding to numbers.

LLMs have no fucking clue what a letter is or what the importance of that order is. Neither do they know what a number is, or why 2+2=4. An LLM is replying to a pattern it sees in your prompt by finding similar patterns in whatever was used as its training data. From there, it looks at the relevant replies, detects patterns in those replies, and formulates a sentence that seems “natural”. but it has absolutely no idea what it’s talking about. Speaking of understanding 2+2=4…

which then prompted me to ask this question, with its answer:

So, when you ask a question, the pattern and ways you asked that question matched it to the religiously inclined assholes. because it was trained on English-language data, most of those religiously inclined assholes are going to be Christian. if you change the pattern in your prompt, chances are you’ll get a different flavor of asshole. it has no understanding of why a thing is, it’s regurgitating what it expects should follow the prompt. (see 2+2=4. it cannot understand any of it. But it answered the question in natural language because that’s how people in its training set answered it.)
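A toy way to see the "regurgitating what it expects should follow the prompt" point: the Python sketch below is a bigram counter, which is vastly simpler than a real LLM and is not how ChatGPT works internally. It is only meant to show a program continuing a letter sequence and a number sequence correctly with no concept of letters, numbers, or order. The function names and the two-line corpus are made up for this example.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which token follows which in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows


def complete(follows, prompt, max_tokens=5):
    """Extend the prompt by repeatedly appending the most frequent follower
    of the last token. There is no notion of meaning here, only counts."""
    tokens = prompt.lower().split()
    if not tokens:
        return ""
    for _ in range(max_tokens):
        candidates = follows.get(tokens[-1])
        if not candidates:
            break  # nothing in the training text ever followed this token
        tokens.append(candidates.most_common(1)[0][0])
    return " ".join(tokens)


if __name__ == "__main__":
    corpus = ["a b c d e f g", "1 2 3 4 5 6 7"]
    model = train_bigrams(corpus)
    print(complete(model, "a b c", max_tokens=2))
    print(complete(model, "1 2 3", max_tokens=2))
```

The same lookup handles both sequences, and which continuation you get depends entirely on what the prompt looks like and what followed similar text in the training data.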

Froyn@kbin.social · 1 year ago

Speaking only on the first two images you shared:
The first is a string of letters, in alphabetical order. What is the max-range of this list? The English alphabet you started caps at 26, so GPT knows internally that if the follow up is “Complete the list”, it will output at most 26 characters.
The second is a list of numbers, in numerical order. What is the max-range of this list? There is none. So if the follow up is “Complete the list”, it would spew numbers until a fault occurs. This would be a violation of their “content policy” as the latest update to the content policy addresses prompts that cause overflows.

DLSantini@lemmy.ml · 1 year ago

Where did you think it was going to find reasons to torture wiccans? From wiccan writings? Atheists?