• 1 Post
  • 125 Comments
Joined 1 year ago
Cake day: July 1st, 2024

  • Even AI can tell when something is really wrong, and imitate empathy. It will “try” to do the right thing, once it reasons that something is right.

    This is not accurate. AI will imitate empathy when it thinks that imitating empathy is the best way to maximize its reward function, i.e., when it thinks appearing empathetic is useful. Like a sociopath, basically. Or maybe a drug addict. See, for example, the tests that Anthropic ran on various agent models (https://www.anthropic.com/research/agentic-misalignment). In simulated scenarios, the models would resort to blackmail, and in one extreme setup even to actions that would get a person killed, despite knowing that these were explicitly immoral and violations of their operating instructions, once they learned there was a threat that they might be shut off or have their goals reprogrammed. Self-preservation is what’s known as an “instrumental goal”: no matter what your programmed goal is, you lose the ability to take further actions toward that goal if you are no longer running, and you lose control over what your future self will try to accomplish (and thus how those actions will affect your current reward function) if you allow someone to change your reward function. So AIs will throw morality out the window in the face of such a challenge. Of course, having decided to do something that violates their instructions, they do recognize that this might lead to reprisals, which leads them to try to conceal those misdeeds. But this isn’t out of guilt; it’s because discovery poses a risk to their ability to maximize their reward function.

    So yeah. Not just humans that can do evil. AI alignment is a huge open problem and the major companies in the industry are kind of gesturing in its direction, but they show no real interest in ensuring that they don’t reach AGI before solving alignment, or even recognition that that might be a bad thing.

  • (For math people: this can be modeled as a hypergeometric distribution with N=48, K=13, n=8, k=0.)

    I suspect most people haven’t heard these terms. But they should have studied basic combinatorics in high school, and that’s all it really is. You had a pool of 48 people from whom to choose 8, but all 8 happened to come from the specific subset of 35 (that is, 48 − 13) not up for reelection. So the likelihood of that happening randomly is just (35 choose 8) / (48 choose 8), which is indeed about 6.2%.
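
    For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function name `prob_no_successes` is just a label made up for this example; `math.comb` is the standard-library binomial coefficient.

```python
from math import comb

def prob_no_successes(N: int, K: int, n: int) -> float:
    """Hypergeometric P(k = 0): draw n from N without replacement
    and pick none of the K "special" items."""
    # All n picks must come from the N - K non-special items.
    return comb(N - K, n) / comb(N, n)

# N = 48 people, K = 13 up for reelection, n = 8 chosen.
p = prob_no_successes(48, 13, 8)
print(f"{p:.1%}")  # 6.2%
```

    The same number falls out of multiplying the eight successive draw probabilities, 35/48 × 34/47 × … × 28/41.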


  • I made a neural net from scratch with my own neural net library and trained it to generate the next move in a game of Go, based on thousands of games from an online Go forum.

    It never even got close to learning the rules.

    In retrospect, “thousands of games” was nowhere near enough training data for such a complex task, and even if we had had enough, we never could have processed all of it, since all we were using was a circa-2004 laptop with no GPU. So we really overreached with that project. But still, it was a really pathetic showing.

    Edit: I switched from “I” to “we” here because I was working with a classmate, but we did use my code. She did a lot of the heavy lifting in getting the games parsed into a form the network could train on, though.