Description
Author's note: this is somewhat more rushed than ideal, but I think getting this out sooner is pretty important. Ideally, it would be a bit less snarky.

Anthropic[1] recently published a new piece of research: The Hot Mess of AI: How Does Misalignment Scale with Model Intelligence and Task Complexity? (arXiv, Twitter thread). I have some complaints about both the paper and the accompanying blog post.

tl;dr

The paper's abstract says that "in several settings, larger, more capable mod...