nullbio · 2026-06-14T12:24:04 1781439844

I had the same problem. Was using this a few months ago, had it running for weeks, and noticed that code was disappearing. I no longer use it. Aside from that, I decided there's no point anyway, considering that LLMs are great at figuring out merge conflicts.

nullbio · 2026-06-14T01:36:41 1781401001

Times are changing. The open-weight models have needed time to catch up, but they're finally at a point now where we can get almost frontier level capabilities for coding.

I just wish we had a way to actually benchmark them properly though. Still seems no one has solved the problem of software architecture, brittleness and bloat as the codebase grows. Models love to add stuff, but they rarely clean up as they go. In a perfect world they'd do both near equally as they're developing.

It would be nice if there was an "architecture quality" benchmark that distilled the essence of what it means to have a good architecture, but I suppose that's an open research question with a lot of variables? Like how is good architecture actually quantified and measured? Is there a mechanism that can be re-used across all codebases to clearly denote one that is good and one that is bad, or is it highly subjective and dependent on the lens you're looking at it from? Is there a lot more to it than just "how much refactoring effort is required to extend this in the future?".

Surely this is something that has been well researched - yet I never really hear anything about it. Makes me wonder why.

irishcoffee · 2026-06-14T03:00:35 1781406035

> Surely this is something that has been well researched - yet I never really hear anything about it. Makes me wonder why.

Occam’s razor rings true here: where’s the money in it?

nullbio · 2026-06-14T01:22:55 1781400175

I know the big labs like to pretend that their models are trillion parameter. But how likely is that really to be the case when Qwen 3.6 35B A3B gets so close to their performance? Seems that with the best research applied, best training data, they'd be able to top the charts with a 60B model quite easily.

MisterKent · 2026-06-14T01:28:11 1781400491

They want people to believe they have massive models, that is effectively their moat at this point.

Because if they don't imply that size is needed for every task, they'll end up tanking their valuations.

https://blog.nilesh.io/post/ai-profit-race

redox99 · 2026-06-14T08:27:02 1781425622

Qwen 35B isn't even remotely close to the big models. It's just people overhyping small models. Ignore the benchmarks; they are almost meaningless.

If you want something comparable you need the trillion parameter open models like DeepSeek.

otabdeveloper4 · 2026-06-14T13:07:28 1781442448

Number of parameters doesn't make the model smarter, it just makes it know more stuff out of the box.

At some point there are diminishing returns and your coding LLM performs worse because you encoded useless stuff like Pokemon combinations or languages you don't speak into its parameter space.

The "smartness" of the model comes from RLHF post-training, which is orthogonal to model size.

Also, if you're using an agentic harness a much better approach is to let the model control its own context. If you ever reach a point where your coding LLM needs to know about Pokemon, just give it a web search tool and let it google the Pokemons.

redox99 · 2026-06-14T13:39:45 1781444385

That's just... not true. Just compare any open model which is trained with the same recipe but multiple sizes.

oneshtein · 2026-06-15T08:29:20 1781512160

You can compare models on the OpenRouter site. Qwen 3.6 dense is in the top 24% for coding.

otabdeveloper4 · 2026-06-15T10:33:06 1781519586

> Just compare any open model which is trained with the same recipe but multiple sizes.

That's exactly what I did.

nullbio · 2026-06-14T00:55:36 1781398536

Can we stop pretending that Mythos is a good model yet?

sixothree · 2026-06-14T01:03:24 1781399004

What was your particular experience with it?

nullbio · 2026-06-14T02:50:06 1781405406

Hallucination city, doing whatever it feels like and far more than what I've bargained for, and performance on par with Opus 4.8 for coding in a large production codebase. I still have far more success with GPT 5.5, it actually follows my instructions and doesn't try to automate my entire job, which allows me to build skills and pipelines around the things I actually want automated.

sixothree · 2026-06-14T20:43:06 1781469786

Interesting. I only used Fable. And not for very much time. But one thing I did notice was improved adherence. Maybe it was just improved adherence to the claude.md and the instructions from the skills in use that I was noticing.

nullbio · 2026-06-13T06:57:18 1781333838

It's great for people who are just maintaining something. Less so for someone building something from scratch, in the earlier phases.

nullbio · 2026-06-13T04:19:50 1781324390

When Jensen (Nvidia) was doing interviews at his recent public talks, he was asked something along the lines of: "Why release these new laptops which are a low margin market, if your other businesses are vastly more profitable?" and his answer was basically that if they can build the coolest and best technology and push the frontier, they will do it. It's not all about making tons of money. He seemed genuinely excited about the tech.

It highlights the difference between companies like Nvidia and Anthropic to me, where one is clearly all about the money and power, and the other is doing it because they genuinely want to accelerate progress and make cool stuff as the driving factor. It's no surprise, therefore, that Nvidia is the world's largest open-source contributor to AI, with over 800 open-weight models.

Of course, these models run on Nvidia hardware, so they benefit from it as a company. But with that healthy mindset, they found a way to contribute that not only benefits everyone, but also benefits themselves.

Contrast that with Anthropic, which has gone the complete opposite direction. Closed off everything, restricting everything, fearmongering progress, regulatory capture attempts, the list goes on. I mean, they won't even agree on using AGENTS.md as a standard because CLAUDE.md is free marketing for them. That's the level of disgusting greed we are dealing with...

From a game theory perspective, the cooperative strategies tend to win. As a result, Nvidia has set themselves up for a lifetime. Anthropic, however, is playing a strategy of winner takes all, and they're happy to see the world and the entire AI industry collapse in the process.

ThrowawayR2 · 2026-06-13T05:51:21 1781329881

Amazing that anyone in 2026 can still believe in "don't be evil" marketing from multibillion dollar corporations.

nullbio · 2026-06-13T08:04:11 1781337851

The proof is in the pudding though. I'm judging based on their actions, not on their words. They're making AI models and AI research widely accessible, including selling consumer grade hardware to run them locally, and to use open-weight models. They could have just gone all in on selling to Anthropic, OpenAI, and all the other big tech companies, but they aren't. Meanwhile, Anthropic is trying to price people out of the market, increasing their restrictions, cutting the latest model from subscription plans, etc.

WarmWash · 2026-06-13T11:09:01 1781348941

Yeah but Claude has a cream white background, intelligent font, and fun hand drawn graphics cues... Anthropic must be pure

SXX · 2026-06-13T04:31:58 1781325118

Nvidia and "open source" are like opposite things. Nvidia has only ever opened stuff that helps their bottom line or improves vendor lock-in.

But yeah, they are a good shovel seller and a competitor to actually evil companies that literally want to eat all the world's chips and energy supply.

nullbio · 2026-06-13T08:08:34 1781338114

Strongly disagree: https://build.nvidia.com/models

Their license terms are also incredibly generous and allow commercial use, modification, etc, at no cost.

SXX · 2026-06-13T08:20:42 1781338842

How soon do you think this generosity would end if AMD, Intel, or some Chinese competitor were able to provide price-competitive hardware?

zozbot234 · 2026-06-13T05:57:26 1781330246

In the open source space, the Nemotron models from Nvidia are quite real, including a Nemotron Ultra variety meant to be large enough for near-SOTA.

SXX · 2026-06-13T08:18:10 1781338690

Nvidia is not doing it out of the goodness of their hearts or love for open source. If at any point their CUDA vendor lock-in moat fails because Intel or AMD manage to get working software, they'll go back to keeping everything locked and proprietary ASAP.

Basically everything Nvidia does in open source is there to make sure their proprietary stack has a good moat and no competitor stack can catch up.

cwnyth · 2026-06-13T04:31:27 1781325087

That's not really the impression I get from Anthropic, but if you have the links to back it up, I'm always willing to change my mind.

Compared to businesses like Oracle, Microsoft, or Facebook, I felt that Anthropic was more interested in progress (not to the neglect of business; AI training is expensive at the end of the day), but maybe I've just not seen what you've seen.

nullbio · 2026-06-13T04:38:56 1781325536

https://clawd.rip

nullbio · 2026-06-13T04:09:53 1781323793

This is a good idea. I've been hoping that a large player with enough social reach would create an open-source fund that everyone can contribute to, to develop a company that trains and releases open-source models at the cutting edge. We can crowdfund the training costs, and the whole world benefits.

It's the most logical solution for AI anyway, considering that it's training on humanity's collective knowledge. It should be more of a public-funded and public-access resource, rather than something greedy tech companies distribute like crumbs while they use unlocked powers internally to clone all of our businesses and swallow the economy.

nullbio · 2026-06-13T04:03:59 1781323439

It's actually the opposite. Democratization of intelligence is the only way to stop existential threats and render them useless.

Right now, and likely forever, because biological threats can be sanctioned at a supply-chain level, the risk of AI is all digital. Fraud, phishing scams, spam, hacks, etc.

The only way we harden the world's infrastructure to the point that it can withstand attack from bad AI is if we have an abundance of access to frontier intelligence to develop countermeasures.

Otherwise, bad actors will develop these capabilities behind closed doors and use them to hold the world hostage and cause irreparable harm. There's no putting the genie back in the bottle. Good and open-access AI and the people using it are the digital immune system.

If there's an asymmetry where the bleeding edge is gated off to only a small group, and allowed to gain exponential power over the immune system's defense grid, the slightest infection will lead to the death of the host.

SubiculumCode · 2026-06-13T06:47:09 1781333229

That's a thesis.

nullbio · 2026-06-13T03:47:45 1781322465

That's a huge grasp. Anthropic have been making this bed for years now. Altman did not need to do a single thing for this outcome to materialize.

nullbio · 2026-06-13T03:45:49 1781322349

It's not only tenable, it is a necessity. Unless you want humanity to be enslaved in perpetuity to a single figurehead.

Bad AI is only countered by having a majority of good, open-access and open-source AI to keep it in check, where the good AI can overpower the bad. The moment you destroy that balance is the moment a bad actor gains exponential advantage and the ability to hold the whole world hostage forever.