Hacker News | chmod775's comments

You probably noticed, but each of these is also just a quote from the paragraph right next to them. Also here the drawings are cute, so that's nice.

Maybe you're not used to that style, but it's pretty common in educational literature (especially for younger audiences). They're mostly navigation aids, highlight conclusions or other important tidbits, or merely exist to break up the flow so it's not just an impenetrable-looking block of text.

I remember many school books being full of these back in the day.

Some newspapers do it too, where they just place a quote from the article itself under images or just enlarged by itself, not really adding anything in either case.


Well, on a positive note they seem to have also reset all weekly usage on my two max accounts.

Now I can continue my vanity project of having Claude iterate on a single spec.md for hours on end. Surely at some point it won't be shit.


Once a spec becomes sufficiently large and detailed and complicated, it becomes very difficult to ensure it is internally consistent. That's why I start every project with a METASPEC.md so that Claude can break up the task of writing SPEC.md into manageable steps.

Everyone knows a philosophy comes before a spec. Claude has to write your application's philosophy first, then you write your spec. But a philosophy is crap without a values statement, so Claude has to actually write that first.

Are you vibe coding a project by building an entire corporation around it??

How do you avoid interruptions for permission? dangerously skip permissions, or is there something less nuclear than that? For me I guess the only less nuclear thing I can think of is running on a sacrificial machine. Is there any better way?

It's literally just writing a spec.md and reviewing it in a loop, fanning out to many agents using "reviewer -> [findings] -> validator (adversarial) -> judge (on conflict)" passes. Before I had it collect a kernel facts document from sources and a bunch of other stuff using the same kind of loop. It's got all it needs. No crazy permissions needed.
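For readers who want to picture the "reviewer -> [findings] -> validator (adversarial) -> judge (on conflict)" pass, here is a minimal sketch. It is not the commenter's actual setup: the `Finding` type, `review_pass`, and the stub roles are all hypothetical names, and each role is a plain Python function standing in for what would be an agent call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Finding:
    reviewer: str  # which reviewer raised it
    text: str      # what the reviewer flagged


# Hypothetical role signatures; in a real loop each would wrap an agent call.
Reviewer = Callable[[str], List[str]]       # spec -> raw findings
Validator = Callable[[str, Finding], bool]  # True if the finding survives scrutiny
Judge = Callable[[str, Finding], bool]      # consulted only when validators disagree


def review_pass(spec: str,
                reviewers: Dict[str, Reviewer],
                validators: List[Validator],
                judge: Judge) -> List[Finding]:
    """Run one fan-out pass and return the findings that were accepted."""
    accepted = []
    for name, reviewer in reviewers.items():
        for text in reviewer(spec):
            finding = Finding(name, text)
            votes = [validate(spec, finding) for validate in validators]
            if all(votes):
                accepted.append(finding)      # unanimous: keep
            elif any(votes) and judge(spec, finding):
                accepted.append(finding)      # conflict: judge decides
            # unanimous rejection: drop silently
    return accepted


# Stub roles for illustration only.
def todo_reviewer(spec: str) -> List[str]:
    return [line for line in spec.splitlines() if "TODO" in line]


def strict_validator(spec: str, finding: Finding) -> bool:
    return len(finding.text) > 10


def lenient_validator(spec: str, finding: Finding) -> bool:
    return True


def prefix_judge(spec: str, finding: Finding) -> bool:
    return finding.text.startswith("TODO")


SPEC = "Intro\nTODO: define retry policy\nTODO x\nsee TODO"
result = review_pass(SPEC, {"todo": todo_reviewer},
                     [strict_validator, lenient_validator], prefix_judge)
print([f.text for f in result])
```

The judge is only consulted when validators split, which keeps the most expensive role off the common path.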

Also I'm doing this because I find it amusing and somewhat educational on a meta level. If I'd written this myself without a spec it would've been done last month and been likely more correct than what Claude is likely to do once it gets to implementing it (the first spec-free attempt failed miserably). This is way too complex an integration for the poor thing. I had some hopes Fable would get it unstuck, but now we'll never know. Fable did seem to be better at keeping it together.

Fun thing to watch on a second monitor though.

To answer your question, there is something less nuclear: You can cycle multiple modes with SHIFT+TAB.


claude has auto mode. do shift+tab a few times. it uses a classifier to ask for permission far less often

https://code.claude.com/docs/en/auto-mode-config


Run it in a VM. Note that just a container probably won't be enough (https://stateofsurveillance.org/articles/ai/claude-opus-4-5-...).

If you're unsure what exactly this is supposed to be, just have a look here:

https://en.wikipedia.org/wiki/The_Library_of_Babel

> [..] the inhabitants believe that the books contain every possible ordering of just 25 basic characters (22 letters, the period, the comma, and space). Though the vast majority of the books in this universe are pure gibberish, the laws of probability dictate that the library also must contain, somewhere, every coherent book ever written, or that might ever be written, and every possible permutation or slightly erroneous version of every one of those books.
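To get a feel for the scale in that quote, here is a rough back-of-the-envelope sketch. The 25-character alphabet comes from the quote; the book format (410 pages, 40 lines per page, about 80 characters per line) is an assumption taken from Borges' story and is not stated in the quoted passage.

```python
import math

ALPHABET_SIZE = 25  # from the quote: 22 letters, period, comma, space

# Assumption: book format as described in Borges' story, not in the quote above.
PAGES, LINES_PER_PAGE, CHARS_PER_LINE = 410, 40, 80

chars_per_book = PAGES * LINES_PER_PAGE * CHARS_PER_LINE

# The number of distinct books is ALPHABET_SIZE ** chars_per_book.
# Count its decimal digits via logarithms instead of building the huge integer.
digit_count = math.floor(chars_per_book * math.log10(ALPHABET_SIZE)) + 1

print(f"characters per book: {chars_per_book:,}")
print(f"decimal digits in the number of distinct books: {digit_count:,}")
```

Under those assumptions the count of distinct books is a number with nearly two million digits.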


Still missing the pipe into sh.

Good thing that isn't a popular pattern that would make its way into the training data!

Ah, too late to edit. That is what I meant.

It's only a vulnerability if you absolve humans of responsibility and demote them to "meatbag vehicle for checking in LLM code".

Calling that a park is stretching it, even if someone named it "park". That's a playground, some grass, and a parking space. Not something where you can enjoy a stroll for a couple hours.

A city of ~20k doesn't have to go crazy here, but surely you can maintain something nicer (especially once you have that data center money!)


It's a playground, some grass, a parking lot (a "parking space" is for one car), a basketball court, a baseball diamond, and what looks like a decent paved, tree-lined trail that goes all the way past the animal shelter to a neighborhood.

Seems.... fine?


I recently moved from the Intermountain West to the East, and one thing that fascinates me is how differently the term "park" is used between the two.

Out west, a couple of swing sets and a slide with a small patch of grass is considered a park, whereas out east a park is a multi-acre wilderness with trees, streams, and miles of trails.

It's just funny to me how, even though it's the same country, the word "park" means two totally different things.


I don't think this is an east-coast/west-coast thing, but I think people all over the USA use the word "park" to mean anything on the scale of corner playground to national wilderness area.

As someone who lived in the West my entire life, not many people would call a couple of swing sets and a slide a park. A playground maybe, but not a park. Now I believe the city would officially call it a park, but that doesn't make it a park.

> What's a legitimate use case for this API?

When you're the application providing the VPN or when you're any app built to communicate with something on a local-ish network, not something actually reachable globally.


> I don't even like to think about setting up a project or dev environment.

That's a strong burnout indicator/symptom (or maybe you just don't enjoy it anymore), not necessarily something age related.

In fact plenty of people seem to fill their days with more work as they get older, where their younger selves would have chosen to do as little as possible.


I've personally encountered some stories that were pretty much exactly that.

Vulnerable young people becoming low level drug dealers (often for lack of other options) isn't exactly a rare story.


Let's be real. Most of the time you ask an LLM "Why did you do it like this?", it responds with something along the lines of "Oops. My bad. You're right to point this out."

You even have a fair chance of getting a response like that when there isn't anything wrong and the question wasn't rhetorical - which perfectly illustrates the level of the genuine understanding LLMs operate at.


When you criticize AI, always remember that the alternative is the average employee. Today's models are pretty good.

A lot of people think they're above average. A lot of them are wrong.

A lot of average people are producing gigantic messes. At least previous to this they were gated by their mediocrity.


> the alternative is the average employee. Today's models are pretty good.

I have never seen people anywhere in the world who hate the working class as much as people in the USA do.

In my country the average employee is competent, they do their work and create wealth for the nation.

Again, only in the USA do people think that billionaires are the ones creating value. Total nonsense indoctrination.


I'm not American and have never worked in the USA. It's not a judgement of human value. It's a judgement of work output.

To adequately validate work you must be at least at the same level, so if you were right (which Dunning-Kruger suggests is unlikely) that would mean your "terrible" average employee is given a tool that will 10x their output, which they cannot even check for correctness. And correctness will be low if the average employee is as bad as you say, because it means they will give badly specified tasks, and even with the best of us it's garbage in, garbage out. I am sure there is no way this can backfire.

All enablers also enable mediocrity. That's not new. At least when the non-mediocre engineer has to work with someone, they can have a tireless responsive partner.

I find this varies by individual, but the AI taking care of so much boilerplate and rote work of coding, and taking the role of architect, test designer, and reviewer is a lot more productive for me. Checking the code may take the same skill, but it's an order of magnitude less work.


Perhaps if you need that much boilerplate it's not going to be a well-architected codebase in the first place. Abstract it out, make a lib out of it. Easier to review & test in separation. Loose coupling, high cohesion.

And have they totally got rid of the average employees? Can they already blame the models for the production outages?

When you criticize the average employee, always remember that the alternative is the average employee with AI.

I remember hearing (perhaps last year?) that the model companies have specifically tried to obfuscate the "thinking/reasoning" behind the decisions the models make so as to prevent cheaper models from training on the reasoning logs. So asking one "why did you do it like this" might not be fruitful.

Not sure if that's true or if it might be influencing what you're seeing, but it's a thought.


I think that has to do more with the thinking "train of thought" that some models show as what the model is processing before making the response. There shouldn't be a distillation risk with actually asking the model to explain why it made a decision and getting the response.

This has happened to me, so I put this in my global CLAUDE.md, and it seems to help (I don't remember getting the response you mentioned for a while now):

    **Lead with the answer when asked how/which/whether.** Name the command/mechanism first; a question seeking understanding isn't a go-ahead to execute. Answer, then offer to act.

That's because of a fundamental misunderstanding of what an LLM is. The only correct answer to "Why did you do it like this?" is that the specific combination of input text and RNG state caused this particular output. There's no reasoning to be had.

* EDIT * What's with the downvoting? That's a correct description of what happened. You can't ask an LLM why it did something and expect a coherent response, because there's no thinking chain, and no stored thinking state... At best, you can get a reconstruction of how the context relates to the output (basically a summarization of the context).
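The "input text plus RNG state" framing above can be illustrated with a toy sampler. This is only a sketch of seeded sampling from a fixed distribution, not how any particular model or provider implements decoding; `sample_token`, `generate`, and the example logits are hypothetical.

```python
import math
import random
from typing import Dict, List


def sample_token(logits: Dict[str, float], rng: random.Random,
                 temperature: float = 1.0) -> str:
    """Draw one token from a softmax over the given logits."""
    scaled = {tok: value / temperature for tok, value in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(value - peak) for tok, value in scaled.items()}
    tokens = list(weights)
    return rng.choices(tokens, weights=[weights[t] for t in tokens], k=1)[0]


def generate(logits: Dict[str, float], seed: int, n: int = 5) -> List[str]:
    """Sample n tokens using a private RNG seeded with `seed`."""
    rng = random.Random(seed)
    return [sample_token(logits, rng) for _ in range(n)]


# Made-up logits standing in for a model's output given some fixed input text.
LOGITS = {"oops": 2.0, "because": 1.0, "actually": 0.5}

first = generate(LOGITS, seed=42)
second = generate(LOGITS, seed=42)
print(first == second)  # same input and same RNG state give the same output
```

The sketch stores nothing about why a token was picked, so any later explanation would have to be reconstructed from the visible context, which is the point the comment is making.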


Can't remember the last time that happened.

Happened to me at least three times in the past 14 days. I point out where it made a design decision that causes data loss. «Oops my mistake»

I encounter it constantly with the latest models. Claude is particularly prone to it.

> I shouldn’t have said that with confidence

> I got ahead of myself there

> I overstepped, allow me to correct that

It’s wild seeing how often it’s wrong, and I only know it’s wrong because I am an SME or am actually reading the sources. Most of my coworkers are not SMEs in what they are asking about and do not read the sources.

A huge part of my job now is fixing fuck ups and failures resulting from these slop jockeys who have already moved on to slop up the next task.


So what? That doesn’t negate the value they provide.
