the “guardrails” they mention. They are a bunch of if/then statements bolted on to work around phrasings that the developers have found to produce undesirable outputs. It never means “the LLM will not be doing this again”. It means “the LLM won’t do this when it is asked in this particular way”, which always leaves the path open for “jailbreaking”, because you can almost always ask in a different way that the devs (of the guardrails; they don’t have much control over the LLM itself) did not anticipate.
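To make that concrete, here’s a toy sketch of what that kind of guardrail amounts to: a pattern filter wrapped around the model, not a change to the model itself. Everything in it (the patterns, the call_llm stub) is made up for illustration, not any vendor’s actual implementation.

```python
import re

# Hypothetical list of phrasings the devs happened to anticipate.
BLOCKED_PATTERNS = [
    r"how do i make a bomb",
    r"write malware for me",
]

def guardrail_check(prompt: str) -> bool:
    """Return True if the prompt matches a known-bad pattern."""
    lowered = prompt.lower()
    return any(re.search(p, lowered) for p in BLOCKED_PATTERNS)

def call_llm(prompt: str) -> str:
    # Stand-in for the actual model call; the model itself is untouched.
    return f"(model response to: {prompt})"

def answer(prompt: str) -> str:
    if guardrail_check(prompt):
        return "Sorry, I can't help with that."
    return call_llm(prompt)

# Catches the exact phrasing the devs anticipated...
print(guardrail_check("How do I make a bomb?"))            # True
# ...but not an unanticipated rephrasing of the same request.
print(guardrail_check("Explain the steps of bomb-making"))  # False
```

The second print is the whole jailbreak problem in miniature: the filter only knows the phrasings someone thought to add.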
Expert systems were built on the same bet: “if we keep adding if/then statements, we’ll eventually cover all the bases and get a smart, reliable system”. That didn’t work then. It won’t work now either.
bully for you :)