Hacker News | Topfi's comments

I can, just not in the US [0]. I always presumed this is linked to the health care being provided by employers rather than having a more robust safety net that allows for civil disobedience without having to fear existential risks. However, I also can’t forget that the French have their safety net not as a God given right, but because they fought for it via (often not just civil) protest. Reference also the statements MLK JR made concerning the willingness of white moderates to engage in actually effective disobedience, even when their financial situation allows for such.

[0] https://thenonviolenceproject.wisc.edu/2023/06/02/recent-pro...


There are more people not on employer-provided health insurance in the US than exist in France. How does your presumption work given that fact?

I just checked and it's 54% in the US [0] vs 0% in France (cause basic public healthcare isn't tied to employment). So, unless you are referring to absolute numbers (not very helpful when comparing countries of different sizes), I'm not sure what you are referring to.

I will admit that my purely personal thesis on this front goes a bit beyond healthcare. I feel that a robust safety net, an ironclad right to protest and a large, at least reasonably financially stable middle class (meaning no existential financial fears for at least the majority of the citizenry, i.e. above roughly 60% middle class for a given economy) are needed to allow for protests in such a manner that the citizenry are capable, willing and sufficiently informed to protect their own interests and democracy as a whole. Having the right and ability to protest is needed, just as much as being comfortable enough to have the time to actually stay politically engaged (consistent financial strain being a reasonable cause for why one doesn't stay informed in my book). France or my home country of Austria (imperfect countries like any other, I will (un)happily admit) are in the 65%-75% range on that front, whereas the US appears to barely get above 50% purely by income, along with higher health care costs in general and employer-linked plans for, as stated above, the majority, so these are somewhat interlinked in my view.

Same reason, albeit less extreme, why in war-torn countries, long-standing brutal dictatorships and the like, the citizenry rarely is able to take any proper action against their oppressors, not because they are accepting of the status quo, weak, or anything of the sort, but because when one is starving and trying to help their family unit survive, even beyond the risk that action can pose, there often isn't any time to actually consider it. "A republic, if you can keep it", in my opinion is a high demand of the public. They need to have the tools, rights and resources to actively defend it. Not saying France is perfect here, but I will say that it is easy to just raise our finger at the US populace without considering the whole picture.

[0] https://www.reuters.com/world/us/portion-insured-americans-w...


SSRIs are a first line treatment across many EU countries too, yet we somehow manage.

When I grew up in Germany, I had some pretty bad phases during my teens. I wonder if I had had easy access to guns together with lots of information and videos about shooters on the internet, maybe I would have thought about that too. I didn't have any of those so I sometimes thought about suicide but never about shooting others.

The US has a combination of SSRIs (maybe that's a factor, we don't know for sure), easy access to guns, gun culture, glorification of violence and vigilantism and over the last decades a lot of school shooters to imitate. Basically a ton of risk factors combined.


I still am struggling to understand why they informed the government about something that is known to be an issue in every LLM. There is no LLM that cannot be jailbroken, so unless this means that we have reached the absolute maximum publicly accessible US made LLMs are allowed to operate at with GPT 5.5, this is not grounded in any sane regulation attempt.

Does anyone know what limits Fable 5 has overstepped in the eyes of the government? Parameter count? Certain benchmark results? Training compute?

Cause if it’s just the ability to assist with cyberattacks and being jailbreakable, there is no model previously released that isn’t equally guilty.

Remember that for GPT 5.5 and 5.4, OpenAI also restricted the cybersecurity focused use under designated models, otherwise rerouting to 5.3-codex like Fable did with Opus 4.8. And both OpenAI models can also be jailbroken all the same.

Basically, what was the reason to tell the government now and not with Opus 4.5 or GPT 5.4? sama has been doing the rounds with apocalyptic predictions…


I submitted separately, but this Axios report has some details that call a lot of the speculation in this thread into question, i.e. that this wasn't much of a "jailbreak" at all and that it's not Anthropic-specific - the White House intends to generally regulate Mythos-class models (whatever exactly that means):

Between the lines: The government's response "seems way out of line with what's actually in the research report," Luta Security CEO Katie Moussouris, who Anthropic shared the Amazon report with, told Axios.

Moussouris said the researchers were able to find security vulnerabilities by asking questions normal defenders would ask AI, which is exactly what the model was intended to do.

An administration official told Axios they do not view other models as national security threats because they do not surpass the bar that Mythos set.

Anything at Mythos level or above would need to go through the administration to ensure the government's national security apparatus is hardened enough, the official added.

https://www.axios.com/2026/06/13/anthropic-amazon-white-hous...


That’s a terrible way to create AI regulations

If they actually cared about this issue we’d have predictable laws and regulatory bodies that let companies actually plan

There’s a reason royal fiat doesn’t lead to healthy economies. It’s just confusing and chaotic. It’s not clear why anyone would invest in a new model now.

Then the next administration comes in and instantly, by fiat, they decide to lift the ban. The market just gets jerked around with no ability to plan long term investments.


It’s a great way to regulate if you’re corrupt. When the rules are opaque and arbitrary, there’s a lot more room for corruption.

And most of the opaque rules predate this administration. We need to be chopping off poorly defined laws that grant the government power rather than trying to figure out the next regulation to add.

Not that I'm ever one to support anything this regime does but I'm kind of okay with them pumping the brakes on this until we really get a handle on what the

The USG has limited capabilities on technologies from GPS chips to thermal imaging with "national security" implications for a while, and now they're doing it here, but it seems people don't like how ill-defined "Mythos-class" is. Would it be better if it was some X% on some benchmark that the frontier model peddlers could just limbo under to make it "acceptable" for release? Do we just accept that jailbreaking will never be prevented?

The part of all this I do have a problem with is the national state cybersecurity cat-and-mouse this kicks off. Will the US tech landscape have enough time to safely get a "Mythos-class" model to harden itself before China releases or leverages a "Mythos-class" cyber munition?


> The USG has limited capabilities on technologies from GPS chips

Are you referring to Selective Availability? That ended decades ago.


> That’s a terrible way to create AI regulations

This administration doesn't do regulations, it's extortion. Same as the tariffs. Just grease someone's palm and then the vague restriction is lifted.


In a parallel universe where we have Biden (or Democratic Party) administration, how different do you think the regulations / approach would be for this fast moving and unpredictable technology?

It’s hard not to see this ban as being motivated by retribution for refusing to use the models for spying and autonomous warfare.

They at least wouldn't depend on how extensively you publicly glaze the President.

Probably using the rule of law in some way? Talking about it in public? Legislating? You know... government type stuff?

They probably would have been in line with Executive Order 14110, the Biden administration's detailed description of a principled approach to regulation of the AI industry. It would have been aligned with the Trump administration's stated goals as well, but a coalition of rich VCs successfully bribed him to rescind it as one of his first acts in office, because the primary principle of Trumpist government is that people who pay Donald Trump a lot of money get what they want.

Interesting. Hope there is any clarification on what "Mythos level" is and why 5.5-cyber doesn't rise to it. Any metric I could come up with (parameters, pre-train compute, benchmark scores, etc.) seems somewhere between imperfect and utterly nonsensical. Pure speculation, but GPT-5 series models including the new 5.5 pre-train appear far closer to Sonnet than Opus or Fable in pure parameter count, so maybe that's it, but the "they do not surpass the bar that Mythos set" line sounds more like there is a belief that Mythos/Fable are more capable in cybersecurity tasks, whereas the data [0] doesn't seem to bear this out. I did not do any cybersecurity assessment of Fable 5 myself, partly due to personal reasons that make that something I'm abstaining from, but my coding evals showed that while task adherence and assessment wise it was neck and neck with 5.5, the task inference was a major jump again (something prior Anthropic models tended to already do incredibly well on) and while that makes it a far better model to work with for UX experiments, I don't see how that translates to cybersecurity, along with the aforementioned publicly available evals by AISI.

Seeing as neither Mythos nor GPT-5.5 had been pre-trained with a particular focus on cybersecurity, this would have to mean any model that benchmarks better than GPT-5.4 or Opus 4.6 on these tasks cannot be used by non-US citizens. If such guidance isn't enforced for all US labs, I think that's irrefutable evidence that this isn't about cybersecurity or "the bar that Mythos set"...

[0] https://xcancel.com/AISecurityInst/status/205458976317312633...


They literally asked for it. Two days ago Amodei wrote an essay urging the government to regulate them. He explicitly cited Mythos, as proof that frontier AI has acquired autonomous hacking capabilities that threaten critical infrastructure and national security.

  "Mythos Preview scrambled the global cybersecurity landscape. But its broader significance is that it proves beyond doubt that AI models are now tools of global and national strategic consequence." 


  "The government should have the power to block or deter deployment of the model if it is determined, in light of third-party assessment, to present unacceptable risks. This power must be scoped to the above four specific risks and there must be protective measures against political favoritism or arbitrary decisions" 
https://darioamodei.com/post/policy-on-the-ai-exponential

A third-party demonstrated that it was possible to jailbreak the safety measures of Fable to access the raw Mythos abilities. Abilities which Anthropic say are too dangerous for the public.

Edit. From David Sacks:

  — A highly credible trusted partner of both Anthropic and the USG who was testing Fable came forward with a jailbreak of those guardrails. The Admin asked Dario to fix the jailbreak or de-deploy the model. Dario refused.

   — In their blog post, Anthropic defended its decision by saying the jailbreak isn’t serious. That is not what the trusted partner and the USG believe; nor is that kind of minimizing language consistent with Anthropic’s brand as the AI safety company. It’s difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not “serious".

Cynically: this is an attempt to quash open source or discount model competition through regulatory capture.

I'm sure it's also a step towards requiring ID and limiting access for us plebeians to real power and keeping it for maintaining or growing power of those in charge. It's all an excuse to give us a Westworld season 3. Probably a better example out there..

> A third-party demonstrated that it was possible to jailbreak the safety measures of Fable to access the raw Mythos abilities. Abilities which Anthropic say are too dangerous for the public.

Pressure test this assumption before getting behind this position.


I will certainly revisit it as more information comes out, but is it your contention that Anthropic solved jailbreaking with Mythos?

What you claim contradicts Anthropic’s statements. I assume that is the contention.

That is a strawman. My contention is what you just implicitly acknowledged - there is not information put out yet to validate the quoted claim. There are claims to the contrary, as well, from Anthropic themselves.

In the absence of information, maybe it’s better to ask which claim is more extraordinary.

That,

A. Anthropic solved the llm jailbreak problem with mythos (despite no claim to have done so on their part)

B. That a full jailbreak of mythos is possible.


That’s not what the claim is though.

Anthropic’s claims are as follows if you read their post:

* this is not a universal jailbreak method

* the jailbreak affords you the same capabilities you get already with other models, not Mythos.

In this situation it’s which party do you trust more and history would suggest this administration is very playful with the truth, especially when it comes to economically damaging the company that’s become their political enemy


There is not an absence of information.

There is information, from Anthropic, concerning the jailbreaks that motivated this action, that directly contradicts the statement.

There is just an absence of information backing the statement I responded to.

I find it so odd this is apparently so contentious a take.


What assumption?

The one I quoted, which contradicts Anthropic’s post and has no supporting evidence publicly available. That a jailbreak was found that accesses the model’s _raw_ capabilities. Something Anthropic has explained was not the case.

It is pretty clear, no? Anthropic claims that the jailbreaks they were made aware of did not access the model’s raw capability, explained that there are protections to mitigate the impact of successful jailbreaks, etc. Coming here and stating something to the contrary with zero explanation or actual evidence is the assumption.

>I still am struggling to understand why they informed the government about something that is known to be an issue in every LLM. There is no LLM that cannot be jailbroken, so unless this means that we have reached the absolute maximum publicly accessible US made LLMs are allowed to operate at with GPT 5.5, this is not grounded in any sane regulation attempt.

I'm wondering where you are getting the idea that there is any sane regulation right now?


The only reason I can see is because Amazon wanted something like this to happen. But I'm not sure what Amazon would gain from that, since they don't have their own competing frontier models.

Of course, Amazon wanted this to happen.

They own 20% of Anthropic.

Anthropic bleeds cash. They have to raise capital.

There are only 2 ways: an IPO or follow-ons from existing investors.

If the IPO gets delayed because of these restrictions, Anthropic will be forced to raise more capital from existing investors.

And existing investors (Amazon) will end up owning more of Anthropic at a cheaper valuation.


There's a much simpler explanation: Amazon's business is selling cloud services. Amazon is constantly under threat of attack and anything that disturbs the balance between attackers and defenders is bad for Amazon. Amazon also needs to keep their AWS customers safe.

This is Amazon prioritizing their 100% stake in AWS over their 20% stake in Anthropic. It's also possible that Amazon knows things that are not public.

The fact that Amazon is willing to report this despite owning shares in Anthropic and being close to a liquidation event points to whatever they found being actually serious.


Why would they have launched Fable on Bedrock if they knew they were going to be shutting it down a day later?

I'm just stating facts:

- Amazon's CEO knew what he was doing and the possible consequences

- Anthropic must raise cash, and there are only 2 ways: an IPO or follow-ons

- If the IPO is blocked, existing investors will be able to increase their stake on Anthropic at a very attractive lower valuation

- Amazon has 20% of Anthropic: so, they benefit from it


My guess is that they liked the status quo with Project Glasswing and didn't want Fable to be public, especially if anyone is jailbreaking it into Mythos and using it for cyber

But then it backfired spectacularly and now it seems they can't use Mythos currently


This is either a complete own goal by Amazon… or a play to consolidate compute/model access.

Will Chinese models be allowed on the market… at all? Will startups be banned from training models of equivalent capacity?


At this point would I be outsourcing my knowledge work or would I be entering self-exile?

Did it cross your mind that Amazon cares about the security of the United States and reported the jailbreak to protect it?

...Not to mention that they're investors in Anthropic.

Claims of retribution aside, one steelman is that Mythos is likely the most capable model that's usable by folks like the NSA [1], and decision-makers across the USG and industry partners have seen a stream of reports of Mythos successfully finding serious vulnerabilities over the past couple months due to Glasswing.

So even if GPT 5.5 is just as capable in these scenarios (which, imo, it largely is), it is not known by the government apparatus as having the same capabilities.

Personally, I think we crossed the threshold of capabilities with Opus 4.6 [2], which translated to an even more capable open-weight GLM 5.1 (which is rumored to have been distilled from Opus 4.6) [3][4]. But the USG and its partners aren't fully rational actors with perfect data, so it's possible they're only viscerally aware of these capabilities in the context of Mythos.

[1]: https://www.reuters.com/business/us-security-agency-is-using...

[2]: Opus 4.6 was used for https://www.noahlebovic.com/testing-an-autonomous-hacker/

[3]: See GLM 5.1 scoring in https://www.cybergym.io/cybergym/

[4]: https://dualuse.dev/posts/chinese-models-are-sometimes-bette...


I doubt that the capabilities of GPT-5.5-cyber aren’t known by the US government considering OpenAI is their primary LLM partner after Anthropic had concerns about using models for autonomous weaponry and mass surveillance of US citizens. If anything, they should have more experience with GPT-5.5’s full feature set due to longer access and may even already have GPT-5.6 access.

Hanlon's razor. Are the people with the right access talking to the right people? Wouldn't be the first time for miscommunication in the executive branch.

Fair point, not unlikely, though my personal assumption is that, like with Nvidia export controls, there will be a sudden reversal with no tangible, actual, technically based reason the second a certain person has their ring kissed...

Why not both? The current executive has missed the mark on appointments pretty badly a number of times due to the prizing of loyalty over competency.

They made a deal for access, but I'm unsure if it's usable, scaled, and has vulnerabilities attributed to it at this point. But I have no inside information here, so I could be wrong.

If it had vulnerabilities the marketing copy would already be written and published.

Probably a con job. The AI companies don't think they will be able to significantly improve their models in the next year or so, so they are stalling with government regulations whilst taking in investor money.

Anthropic themselves have played up the dangers of Mythos, limited its release, etc. So if it can be jailbroken then it specifically deserves controls, per Dario’s own manifestos. David Sacks - the “AI Czar” - also said the government asked Anthropic to patch the issue but they refused, which is bizarre. And that led to the export ban.

The simple answer is that Trump has a stick up his ass against Anthropic and is also fond of stock market manipulation. No need to get too deep when it comes to dealing with that orange shmuck.

This is just another shakedown like with Tylenol etc, knock the product, lower the stock price and have a competitor hostile takeover, or get kickbacks

This is a hypothesis, and a viable one.

But I caution you against drawing conclusions from your hypothesis and calling it a day, instead of taking in the available data and using it to broaden your understanding of what's actually happening.

This could be many things: a shakedown, Trump's pettiness, marketing kayfabe, an actual government reaction to a very weaponizable technology, and so on.

But if you call it "just another shakedown" and go about your day, then you're doing yourself a disservice, because the story is still unfolding and we don't have all the facts.

You don't actually have the full story, so don't delude yourself into thinking you do.


It's been 10 years of historical abuse. You're a battered spouse in a bad relationship with the most audacious narcissist that has ever lived.

I'm not American, and I definitely don't support Trump.

Care to spin the outrage wheel again and lob another unfounded insult at me?

At any rate, feel free to indulge in (plausible) conspiracy theories until further details of the story have emerged.


It's not Fable 5 that overstepped in the eyes of the US government.

It's Anthropic.

This is transparent revenge for them daring to try and push back a little on enabling war crimes.


Anthropic is perfectly fine with the US government using Claude to commit war crimes. The US military has done hundreds of extra-judicial killings in the waters around South America over the last year and Anthropic hasn't had anything to say about that.

Use nuance and judgement, friend. Anthropic notably pushed back on completely autonomous no-human-in-the-loop drone killings and mass surveillance of the US population, where others like OpenAI scrambled to agree. Anthropic isn't perfect but that doesn't make them equally bad.

Trust no one, friend. Believe what you want to believe.

>This is transparent revenge for them daring to try and push back a little on enabling war crimes.

Anthropic wasn't pushing back on enabling war crimes. They said they didn't want the models to work with autonomous weapons because the models weren't good enough.


Arguably it’s a worse (or different) war crime to knowingly target people incompetently and thus kill more innocent civilians. In this respect, they showed themselves against one war crime. Not “war crimes” in general but a specific misuse of ai in war.

That's pushing back. The regime doesn't care if the models are good enough, they want the optics of killing lots of people using cutting edge tech, they don't really care if it's the right people.

Whether you or I or Anthropic think it was pushing back or not is beside the point.

I can agree on revenge, but it's important to not paint it as a good vs evil when it isn't.

It's the AWS CEO being a little snitch to gain favor from the Government. That is what this is about.

Why not both?

Anthropic models are the ones that designated that school as a valid target

What is the basis for that claim? There’s been lots of wild conjecture, but as The Guardian reported, “Almost none of this had any relationship to reality” and “LLMs-gone-rogue dominated coverage, but had nothing to do with the targeting.” https://www.theguardian.com/news/2026/mar/26/ai-got-the-blam...

People designated that school as a valid target - using fancy calculators does not remove that the pass/fail rests with people. AI models have no agency. Even if they are given autonomy - it is given.

>This is transparent revenge for them daring to try and push back a little on enabling war crimes.

Don't be so pessimistic, maybe they're just trying to give their buddy Musk and XAi a chance to catch up.


Anthropic is one of the two consistent revenue sources for XAI via their colossus deal. I have been critical of this man longer than most, but I don’t see him hurting his own bottom line.

It could be the Trump admin incompetently attempting to help Trump’s primary benefactor? (As I haven’t yet seen anyone say that the current actions are a competent approach to AI regulation.)

The reason is pretty obvious. Anthropic tried to play hardball with the government and now they are under their thumb for scrutiny of any and every little thing they do.

That's what this admin is known for. If you do even what a normal person would think is sane but they don't like it, well now they need to make you bow down and break you so you "learn your lesson".

It doesn't help that they themselves marketed this model as being especially dangerous in the public's hands. If this was just another model drop with none of the fear mongering, I don't doubt this probably wouldn't have had any issues.


It is important to note this formula doesn't require understanding any subject.

People keep seeking logic where there is none. We have an internet full of theories assuming there is more to it.


I mean the logic is simple but people don't want to admit it, you must pay the vig if you want in on the action. Before this type of naked corruption would take the form of boardroom seats/book deals/speaking gigs after you leave office but now it's more open so others will take note.

It also helps if you bust a few kneecaps in the process to show what happens if you go astray.


>The reason is pretty obvious. Anthropic tried to play hardball with the government

that is one.

Another is who is going into the first IPO. Troubles for Anthropic IPO would channel all those money into OpenAI's one. Check financial interests of this admin. Hint - they aren't with Anthropic.

Third - most of the export and access controlled tech of the past wasn't a productivity multiplier, nor a human replacement. AI is a different case - the more capable the AI, the greater its general economic benefit. Export and access control of AI allows you to control more and more of the whole domestic and a large part of the global economy, not just military capabilities like in the past.

Political - coming into elections with "this evil new tech was coming after your jobs, yet we reined it in and protected your jobs". After all, such an approach has been working great for decades when it comes to coal miners.

Note that the specific bug-finding capabilities of a specific model are a red herring here, and other leading models are almost there, and definitely will be there in a month.

It is all about revenge, money and power.


Alternatively, this is the best advertising for which Anthropic could hope: "Our product, and nobody else's, is so good that the government declared us a threat to national security." If they bring it back for US nationals only, maybe demanding ID for users, people will think it's the bee's knees: "so dangerous that non-Americans can't have access" probably sounds like a ringing endorsement to some C-level decision makers.

Crowdstrike took down airports in July 2024, and its stock was back up by October; it's double the price now. Everyone saw how systemically important it was and how it took down entire industries, and they asked why they weren't using it themselves if it's so important. See also the 2025 cloud outages.


What good is advertising if they can't actually sell the product?

Customers (especially large ones) don't so much buy individual specific products, they buy into a company and its prospects. Customers don't want to chop and change. They want to lock in with the leader.

This whole thing shrieks out that Anthropic is at the head of the pack, with the most capable models.

It hardly matters in the customer's mind that today they can't buy this specific model.


The same customers that are barred by law from using Anthropic on any government contracts. If they get past that, they then can't have any foreign workers use state-of-the-art Anthropic models. SOTA Anthropic models also can't work in any secure government clouds or with sensitive customer data due to retention policies.

It is hard to see this being a net benefit for Anthropic.


> Crowdstrike took down airports in July 2024, and its stock was back up by October; it's double the price now. Everyone saw how systemically important it was and how it took down entire industries, and they asked why they weren't using it themselves if it's so important. See also the 2025 cloud outages.

Truly, too big to fail. Capitalism is broken when companies aren't punished but rewarded for screwing up. What point do stock markets serve when bad behavior has no incentives at all to be prevented?!


Not even limited to companies, if you prevent a problem you get fired because your work isn't visible, if you create a problem and then fix it you're a hero

>Troubles for Anthropic IPO would channel all those money into OpenAI's one.

Troubles for Anthropic would almost certainly affect OpenAI, significantly. Yesterday just proved that the government sees it within their remit to shut down AI models. All current and future AI investment now has to contend with this risk. You should even see the effect of this decision on SPCX on market open despite X.ai being whatever tiny fraction that it is.


>> Another is who is going into the first IPO. Troubles for Anthropic IPO would channel all those money into OpenAI's one. Check financial interests of this admin. Hint - they aren't with Anthropic.

Yep. Kushner owns private shares of OpenAI.


> The reason is pretty obvious

I would argue the simple reason is that Amazon wanted to fsck Anthropic to set them back, despite whatever partnership they may claim. The competition at that level is intense and these guys do not play by the same rules that regular people do. They can't flat out murder each other (yet) so they find other ways to do it.


Why? Amazon makes tons of money serving Anthropic models through Bedrock and they seem to have basically given up on their own frontier models.

Previous administration was same way… intentionally not including Tesla in an EV summit

This is lacking any nuance. The CEO not being invited to a meaningless ceremony vs being designated a supply chain risk by the DoD and being forced to shut down your product. Use judgment.

It's astonishing how that summit sparkles the Tesla snowflakes. We gave them tens of billions of dollars in subsidies and a 100% tariff on the Chinese competition! Huge, substantive policy assistance! But Biden wanted to pal around with some union supporters and that's supposed to be some horrible slight? Please.

Elon didn't drop millions on the Trump campaign and throw a double Sieg Heil at the 2025 US presidential inauguration because Biden refused a photo-op. He did those things because he believes in them, because he believes the things he says on twitter. The EV summit thing is the least believable "you made me do it" excuse I've ever seen.


You'll notice the tariffs were helping legacy auto more than Tesla

> intentionally not including Tesla in an EV summit

this comparison is orders of magnitude different


Wasn't that a UAW summit about EVs? Tesla does not work with UAW, so they wouldn't appear at a UAW event.

Give me a break with this. You are not so thick as to think the two things are remotely comparable.

Because based upon what Anthropic has told the “AI people” and military, it is dangerous if an adversary gets its hands on the cyber capabilities. Knowing that if they ignored it and something did happen, heads would roll. Blame Anthropic for that, or wait, if they are all for safety, they shouldn't complain.

> I still am struggling to understand

And? Does it matter?


Reminds me of people freaking out about the Grok Bikini thing, when GPT and Google's image models all show the same behavior. Clearly biased against Elon Musk despite it being a problem for every single image model out there.

Great article, the worst offender is compact tab mode in the current Safari. The animations they implemented make that unusable, sometimes it’ll move tabs away from where the tab was when clicking, the animations always look clunky and the entire experience feels utterly untested. Doesn’t just look poor, but violates quite a lot of HIG rules Apple recommends for third party devs. Maybe something to focus on in a part two of this article.

My personal issue in comparing LLM progress and risk as labs publicly predict it with nuclear power in the middle of the 20th century is that the processes by which nuclear power works were fairly quickly well understood and the risk could thus be realistically assessed. Some power plant operators did not adhere to best practices, but building a relatively safe nuclear power plant was not impossible given appropriate effort and spending. Heck, according to some, we could have even gone with far more fail-safe approaches (molten salt) if military interest hadn’t been at play.

With what is predicted by frontier labs for LLMs, all of this is not the case. We are far further from any understanding of how these models work internally than in the early days of fission and, if this was actually creating a truly intelligent, autonomous entity, alignment seems unsolvable as well, at least the way it is proposed.

It’s why I have from the get go been critical of this doomsday framing and tended to always dislike it. This is basically the outcome that was inevitable given the framing and it was bought to prevent far less stringent, but more actionable possible regulation that labs very much wanted to avoid.


  > We are far further from any understanding of how these models work internally than in the early days of fission
OMG. I really don't want to be offensive or something, but everyone always knew "HOW" these models work exactly. It's an easy enough principle to explain to a 10-year-old if you take something like Karpathy's article on MicroGPT:

https://karpathy.github.io/2026/02/12/microgpt/

None of the SOTA LLMs are any different - they are just much, much larger and have a lot of optimizations.

The fact that LLM companies are trying to sell it as some kind of magic is just proof of how many lies there are here.

All it does is just predict the next "word" at any given time.

  > and, if this was actually creating a truly intelligent, autonomous entity, alignment seems unsolvable as well, at least the way it is proposed.
This is obviously true. It's very hard to predict whatever you're gonna decompress from a lossily "compressed" dataset using floating point math.

This is why you can't solve it all with pre-training or censorship on top, but instead you need good sandboxes and harnesses.


By how, I meant specifically the internal activations, which no person in the field claims to have a comprehensive understanding of, not next token prediction as the underlying technology. The whole interpretability of it all is the crux I was referring to, though I will grant that you are right, that’s not really the "how it works" and I worded it sloppily.

Anthropic are putting more effort than most into this and I find their work fascinating in that area, though like with OpenAI, I will maintain that if they truly believed this problem must be solved to stave off major catastrophe, they’d solely focus on interpretability of other labs’ models, not work on and market their own.


All humans do is predict the next action at any given time. You roll your eyes, it's a tired argument, but still. You have memories, a personality, thoughts ranging from the long running to the mere reflexive, you have a rich conscious experience, and all of this in service of generating the next thing that you do at any given time. If you actually knew how LLMs worked, you could rewrite them as code, refactor it, disable jailbreaks, and put out a superior product. Your description only covers what an LLM does, not how. Part of the how is that it necessarily predicts multiple words ahead. It wouldn't be possible to write couplets otherwise, and they could do that in the GPT-3 era.

> Should you be able to use a Samsung SoC in an Apple phone?

How did we go in less than two comments from providing access to APIs that are already present, implemented and actively used by Apple (who in their holy wisdom deem us mortals not worthy to access these the way we choose) to a completely different hypothetical of requiring actively building support for another company's hardware?

Such slippery slopes really aren't helpful, nor in any way comparable to what the DMA actually intends or states.


Great effort and a bit closer to my private evals than DeepSWE. I greatly appreciate the focus on false negatives and positives, along with simply being far more focused on actual, mergeable quality output over plain passing. Could see a lot of others adopt your list of metrics as a basis, they are very well defined and offer solid coverage of everything one should want out of code provided, not just focused on one or two narrow targets. Will incorporate a lot of these ideas in my own tests and polish some other parts where I somewhat unintentionally already went into a roughly similar direction.

> […] fed large chunks of an article multiple times to an LLM […]

So they had to prompt? An LLM? I got this argument before and still don’t get what it’s trying to say. These models do not output anything unless prompted, that’s not any kind of gotcha.

On the code outputting front there is a lot of relevant evidence beyond the NYT lawsuit [0].

If I slightly modify GPL code, that doesn’t give me the right to relicense.

[0] https://arxiv.org/html/2601.02671 and https://arxiv.org/abs/2506.12286 and https://ai.stanford.edu/blog/verbatim-memorization/


50+TFlops is nothing, I got that in my MacBook, but besides that, when, a few years/decades from now, whatever arbitrary compute limit we think prevents Armageddon comes down to enthusiast and consumer level, what then? This isn’t Uranium, compute is not a physical resource.

This is the “SGI” regulation issue I have never read a reasonable answer to. If one believes this is possible and should be prevented, then either they want to restrict every computing system sold from here on out to some arbitrary metric (and somehow prevent users from just creating clusters to get around such a compute restriction) or what?

If compute alone directly leads to “SGI” or whatever, then we might as well put paper bags on our heads and lie down in some English pub.

Not to mention, if one really wanted to cause harm, training a current day LLM and using it for Stuxnet-esque attacks is reasonably possible long before any arbitrary compute limit we might introduce now, no machine God needed to cause major harm.

That’s why I prefer advocacy for LLM regs that focus on current day impact. Mental health concerns, training data licensing questions and the like. There I can formulate reasonable regulation that can hold. For “SGI”, I do not know anyone who actually has done that and I have looked hard. That’s why I consider these things more of a distraction from actually necessary and possible regulation, one that just draws attention via a flashy doomsday scenario.

Occasionally, I will click on one of the AI Doomsday Youtube videos recommended to me. And far more often than not, these will posit that "SGI" requires only compute and will inevitably cause devastation. Fair enough, I still think we should put a bit more focus on e.g. LLM induced psychosis, the labs rarely compensating those whose training data they used, etc. but if it is their opinion that "SGI" is possible, I can get why they'd ignore such concerns. But at the end, they never state how to regulate or prevent this, they more often than not have a call to action ("If you want to prevent this...") linking to a website where we can actually read about how they think we should deal with this. Inevitably, I click on said site, finding it to A be an Effective Altruism aligned project and B always just contain some blabla about "aligning AI training with human values", which is absolutely meaningless nonsense, not least after having watched a video in which someone spends 15 minutes espousing that "we could never fully control "SGI"".

Makes all these feel more like industry efforts to stave off necessary regulation and not actually serious, but if one can formulate how to regulate “SGI” in a way that isn't laughable, nonsense or both, I am not opposed, I just don’t think that person exists…


Good. I am firmly of the opinion that if any ADAS sold to require limited intervention (Supervised FSD, Mobileye, Nvidia's offerings, etc.) actually was reliable and as safe as or safer than a human (even if only the former, there is no fatigue, intoxication, etc. that impact performance), insurers would financially incentivize their use hard.

The fact that they aren't doing that in countries where such models are available means they made the calculus proving that these solutions aren't yet reliable. Same for the companies offering the tech, who can't have that much confidence in claims about superior safety if the responsibility (both legal and safety wise) still lies solely with the human driver at all times.

If BYD truly pays out any cases that come from proper use, that has suddenly changed and they would have to be far ahead of the competition. That or they have a major scandal about not paying out, in either case, a good indicator for the rest of the ADAS industry.

Supervised FSD (oxymoron of the century right there) is not available where I live so I have no hands on experience, but I occasionally read experiences on social media. It's fascinating how often I read something akin to:

> "Version X+2 is so great, it drives me around 24/7 without interventions, nothing like the dangerous things I had with Version X+1"

Then I click on their account and see comments from a few weeks back that essentially go:

> "Version X+1 is so great, it drives me around 24/7 without interventions, nothing like the dangerous things I had with Version X"

The newest version of Supervised FSD always seems to be above reproach and the prior one always is retroactively called dangerous, often by the same commenter. Obviously, that isn't really possible and makes me doubt any experiences I read, along with the traffic law violations and unsafe driving I see in videos.

Whether Tesla, Mercedes, Lucid, Kia, etc., if the insurance isn't cheaper or on the manufacturer during usage, there is no reason to believe their claims on superior safety. They have done the maths and it says we aren't there yet.

