The big AI labs are also accumulating huge datasets of expert work in a wide range of fields, which are very expensive to re-create. It seems pretty plausible that this gives them a big advantage that is compounded by their larger training runs and larger models.
The Team plan is ~125 USD / month / user. Big enterprises like Uber are paying upwards of 1500 USD / month / user. Anthropic can raise their revenue a lot more by selling to big enterprises than they can by selling more Team plan seats.
I gave GPT-4 some source code and my existing tests, and asked it to write a new test, and it did it! It didn’t even run straight away, I had to fix it, but it still blew my mind.
Later, I wrote a ~5k line proxy for work in C, and gave the whole thing to ChatGPT o1 and asked it to review it. It found several real memory bugs, and that service has been running with no problems ever since.
Just this week, I was trying to write a greedy solver to pick the best subset of block sizes to keep from a larger sweep for shorter testing. Opus 4.8 suggested that this could actually be solved as a MILP problem, and found the perfect solution in 5 mins. I’d never even heard of MILP before.
I find this version unlikely, since companies very rarely genuinely believe what they preach in PR campaigns. It's always some sales and marketing dudes and gals trying to polish something up as more than it is. Which is very annoying. We can now choose between Anthropic being the one exception to this, while having a huuuuge incentive to hype up their product, or we just write it off as more marketing fluff.
I would be very surprised if this is an actual thought-out PR strategy. I am far more inclined to believe that their employees are just bought in to the future where AI is genuinely transformative.
Whether they are right or wrong is another matter, but their claims also don’t seem too far out of the realm of possibility to me.
Coding agents have fundamentally changed my day-to-day job. In the last year, my work has shifted from me writing all of my code, to me writing very little code and spending most of my time on understanding problems better and setting direction, and reviewing, verifying, and polishing the output of coding agents. It has been quite a drastic change.
It is not that outlandish to suggest that coding agents could continue to improve at such a drastic rate over the next year. And the implications of that could be quite large! Even just the implications of more white-collar workers adopting tools like Cowork seem potentially very large, with tools that already exist today. It seems sensible to at least consider this as a possibility.
Dario is no John Hammond though. That'd be Altman. Dario actually has the discipline and background as an AI scientist to tell what the potential failure modes are. You're right, he might still be just hyping things up, but generally I'd give more benefit of the doubt to Anthropic. Precisely because Dario was a scientist, and I'd stand by that. People who get their PhD in science already self-select, or have at least proven to be made of different stuff.
Likewise, people don't as readily accuse Ilya of 'hyping things up' when he said these things.
Also, if we're talking about incentives, there are incentives to lower their valuation too. If you want to be vigilant against social engineering, I'd be wary of that as well.
These are moot anyway though, because the article isn't even making any super strong claim. If you read it, it's no big deal.
This is obviously the case to me, but I think HN is very anti-AI.
I genuinely don't believe that they sat down in a board room and said "yeah let's specifically release this now before an IPO so we can juice it!" They haven't even announced an IPO date. So is every blog on capabilities before that date just "pumping up the value of the stock before the IPO?"
If they actually have concerns they can communicate them directly and privately. There are fewer than 10 companies, in only 2 countries, with advanced enough AI programs to qualify for this type of concern. And Anthropic has the phone numbers for all of them.
Companies do tons of communication and work directly, without press releases or blog posts. If a statement is released publicly, it is done for a PR purpose.
It is not solely or even primarily the big AI labs that would need to prepare. They have a better idea of what’s coming, and they’re positioned to benefit from it.
It is governments, big companies, and individuals who could all experience fundamental changes if any of these predictions come true. If people within the labs believe these possibilities are around the corner, it would be responsible to try to let people know so they can be more ready if RSI suddenly hits and in a couple of years' time all our work is fundamentally changed.
That’s not to say I agree with their predictions, but rather I’m just saying that there are good reasons for Anthropic to publish stuff like this that are not just PR.
Maybe my bar for what constitutes a breakthrough is lower than other people's, but all of these seem like breakthroughs to me:
NLP as a field saw huge shifts. NLP tasks that used to be complex and inaccurate can now be set up very easily and quickly using structured outputs from LLMs, often with greater accuracy.
A small charity I help with has now been able to build their own website to manage their day-to-day operations. It saves them a lot of time, and it was vibe-coded using Manus. I don't think people appreciate how much room there is left for bespoke software to have big impacts on small organisations that can't afford to hire developers. The cost for software like the one they made has gone from tens of thousands of dollars to $10/month and volunteer hours.
My brother has recently been setting up Cowork to do an automatic review of contracts before human review, and he said it is far more diligent than people when it comes to routine things to check. This is another huge breakthrough for not just efficiency, but the quality of work.
I really don't think we can discount AI finding bugs and vulnerabilities. If you care about code quality and keep up review standards, LLMs can help you write more robust software. AI has found a huge number of bugs for me before they hit production, including potential out-of-bounds memory accesses and segfaults.
ChatGPT has 1 billion MAU. People are now getting life advice, financial advice, and mental health help from chatbots at a scale and cost that no human support network could match.
Also, they have done a good job shutting down the psychotic behavior you could get from 4o era models. If there are remaining issues like that they ought to fix them too.
> ChatGPT has 1 billion MAU. People are now getting life advice, financial advice, and mental health help from chatbots at a scale and cost that no human support network could match.
Definitely, it is quite an extreme change. But the upsides of better access to support and advice are huge, even if the potential downsides are scary as well. This feels like one area where we need better transparency and regulation due to how much ChatGPT and others can affect people who listen to them.
You can claim the use of AI is unethical, or that the work is derivative, but AI being used as a tool in no way precludes something from being art. It is thought-provoking and challenging. It seems like textbook art to me, and it’s clearly struck a chord here. There is no “minimum effort” required for something to be art.
I personally found the contrast with the original “They’re made out of meat” to be really interesting. I don’t care that AI was used during its creation at all.
It seems to me that since the advent of image generators, art has been firmly defined by artists to mean that it was made by a human. But there might be a spectrum of human involvement where the less a human is involved the less it's art.
What happens too often during these discussions is that someone who writes "make me a cool image" gets conflated with someone who used AI to fix up a small rock in their natural landscape drawing (the two extreme ends).
One problem though, is that we don't really know how much the supposed human author was involved in the piece. Now that it's becoming hard to judge, people against AI art can proudly change their opinion on a piece once they learn that it was made by AI. I've come to think this is somewhat respectable, like when you see a video of some extraordinary event (before AI) and then you learn that it was fake, just for views or something.
But on top of all this, there are different ways to "consume" art. Artists may think more about who the artist is as a person and what they felt when they made the piece, while non-artists may just enjoy the piece for what it is, detached from the artist. These two perspectives clash a lot.