Hacker News | NiloCK's comments

I'm working on a framework for general purpose interactive tutoring systems. An SRS background process over a pluggable system of pedagogy protocols over a given curriculum. This is at https://github.com/patched-network/vue-skuilder, or https://patched.network/skuilder

With this framework, I'm making (among other things) an early literacy app at https://letterspractice.com. My aim here is to hit >= 75% efficacy of Mentava at <= 1% of the price.

The app is near to production readiness, and I'd be happy to share access now with anyone who has verbal but non-literate kids. Be in touch if interested at colin at letterspractice.com


At that time, nobody believed a dead internet was technically feasible. Maybe this is hard to remember now.

The "danger" was in terms of spam / misinformation proliferation, not the same category as the capabilities-adjacent risks currently discussed.

You can hold your own opinions on spam/misinformation as a problem, but to say there was no credibly anticipated outsized downside to a sudden jump in human-passing text generation feels pretty off to me.


I remember the arguments back then. Those alarmists were wrong. Nothing happened or could've happened just because you could generate drunken ramblings.

These are the kind of people who want to ban anything because some small theoretical harm is technically possible. We're lucky it's not more prevalent or we'd still be in the stone age.


It wasn't just "drunken ramblings"; they were right about the dangers they called out. Reddit is largely LLMs arguing with each other, and it's easy, costing only a few thousand dollars, to spin up a mass misinformation campaign.

They were right about the risks.


It's amazing to me that people actually thought GPT2 produced "human-passing" text, while I'm still tripping over obvious LLMisms in the output of recent models on a daily basis.

(It's also amazing to me that it took mere minutes for this observation, deep in a sub-thread, to get downvoted without any reply, with no obvious reason for it.)


People may perceive you to be cheaply mischaracterizing the argument.

Nobody believed or suggested that GPT2 could do longform or produce novel text that stood up against careful scrutiny as insightful or well informed. But because the capabilities were novel, people would have no strong alternative but to believe some person wrote it.

Your current tripping over LLMisms is irrelevant. You have years of antibodies, both personal and herd-immunity (eg, the many, many articles and comments that describe LLMisms).


LLM-isms are much less prevalent in base models, which is what GPT-2 was. It had significant problems with maintaining coherence, but GPT-2 generated text did not have the obvious tells of today's LLMs.

... so the mechanic produced an invoice, itemized.

changing the CSS - $0.05

knowing which CSS to change - $30


For those that don't know, this is a reference to a lovely story involving Charles Proteus Steinmetz https://www.smithsonianmag.com/history/charles-proteus-stein...

overflow is CSS 101

Sure, live in shame, but don't let go of the humor in it all :)

I did get quite a laugh when the comments made me realize what an asshat I was

> Automation doesn't make operators more careful. It makes them forget how to be. The more reliable the system, the less ready the human.

The entire premise of a system is that it removes the need for careful attention.

system: signal lights tell me whether or not I can pass through an intersection, so that I do not have to attend to potentially high speed traffic from a variety of directions.

system: the side of my knife blade sits on my arched guide fingers, so that I do not have to attend to the edge of the blade or the location of my fingers.

etc etc.


>The entire premise of a system is that it removes the need for careful attention.

I think this premise is flawed or, at best, too narrow. A system is just a logical grouping of items that perform a function. Sometimes that function can be to reduce cognitive burden, but it doesn't have to be. A "vision system" like what humans use does not reduce attention, but increases/enables it, while an autonomic nervous system can reduce attention. The ability to increase/reduce attention is not the central principle of a system.


What I'd say you're pointing out is that the word "system" is overloaded.

A vision system does allow you to pay less attention: you don't need to carefully remember how far away the door is, you just need to look! I tried this often as a kid: if you want to navigate a hallway with your eyes closed, you need to pay far more attention to your other senses than you need to pay with your eyes -- where attention here is not the volume of data, but rather the complexity of conscious bookkeeping -- I can (ironically) "play it by ear" with my eyes open, but eyes closed I must plan every step!

It just so happens to be that the ability to pay less attention makes more things possible and hence the demand for attention overall may increase -- if not intrinsically, due to your competitors (who can also see!)


I would argue this take conflates attention with the cognitive overhead required because of a lack of training. Navigating with our eyes closed feels like it takes more attention because we’ve practiced so many hours navigating by sight that it no longer feels cognitively burdensome. A bat would have no trouble navigating without sight for the same reason. I don’t think most people would say giving up our sight for echolocation would reduce our attention; it just transforms it.

> system: signal lights tell me whether or not I can pass through an intersection, so that I do not have to attend to potentially high speed traffic from a variety of directions.

You know I noticed this... I lived in a country where people obey traffic laws, and in a country where they very much don't.

I witnessed many more traffic accidents in the country where people are used to relying on the traffic lights to tell them if it's safe or not.

Whereas in the other country, everyone correctly assumes that the other drivers are completely insane, and so they stay vigilant.


Apart from the road fatality data that disproves this anecdote, my own anecdote is that this is the false sense of security people get in countries that don’t have traffic laws: "Oh, see, people have to look all the time, so it’s much safer." When you live it for a long time you realize it’s not true. Many more fatalities.

Now I do think the science shows that if you design roads and systems to make drivers more thoughtful, it can improve outcomes. Size roads for the speed limit, roundabouts, etc. These can make a difference, as they balance the system.


Having lived, driven, and crossed roads in both -- what I find is essentially that drivers from poor systems pay far more attention, but the system is a lot more effective than attention.

The difference here is one of stability: in a developing country, I can just walk across a street (often there is no traffic light) by essentially signalling with my body language -- both I and the drivers are paying attention. And if one party fails, the other has a good chance of catching that mistake.

Now, in a developed country, neither side is paying attention. If I walk across the street, I'm in danger, no matter how clear my body language (I tried it on British streets a few times -- it works in some areas, but usually very poorly!), and no one expects a crazy driver to come barreling through a red light.

The developing countries fall behind because in the crazy * sane intersection, sometimes the sane person is just not fast enough -- whereas the crazy * crazy intersection is extremely dangerous and happens often enough.

On the other hand, a developed country makes every interaction sane * sane regardless of the personalities or moods of those involved -- but God forbid a bit of crazy leaks out!


>Other than the data of road fatalities that disproves this anecdote,

You can't make that assertion (well you can, it's called "lying with statistics" but that's beside the point) without knowing if the fatalities are the result of the accident rate or just a higher conversion ratio as a result of reduced safety equipment, reduced seatbelt usage, more motorcycles, worse emergency services, etc, etc.

INB4 other people start whining on your behalf, I'm not saying those countries aren't less safe to drive, just that you can't do a straight comparison of accident rates and fatalities without considering the conversion ratio.
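
A sketch of the conversion-ratio point with made-up numbers. Both countries and every figure below are hypothetical, chosen only to show the arithmetic: deaths are accidents times the share of accidents that turn fatal, so a death count on its own says little about the accident rate.

```python
def fatalities(accidents_per_100k: float, fatal_share: float) -> float:
    """Deaths per 100k people = accident rate x share of accidents that kill."""
    return accidents_per_100k * fatal_share


# Hypothetical country A: many accidents, good safety equipment and emergency care.
country_a = fatalities(accidents_per_100k=1000, fatal_share=0.01)

# Hypothetical country B: half the accidents, but each is 4x as likely to kill.
country_b = fatalities(accidents_per_100k=500, fatal_share=0.04)

if __name__ == "__main__":
    print(country_a, country_b)
```

In this toy setup B has fewer accidents than A but twice the deaths, so fatality statistics alone can't settle which country has more crashes.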


Of course you have to consider confounders. That’s why transportation data usually includes best efforts.

But at some point you have to look at the totality of the evidence. Countries with better road infrastructure, enforcement, vehicle standards, and driving behavior generally produce better safety outcomes. The fact that multiple factors contribute doesn’t make the observed outcome meaningless.

As I already stated, there are absolutely systems that increase the perceived sense of risk that can help outcomes (road width sizing, roundabouts, minimal signs/lines), but those typically work best in a system where there is already some sense of order.

Less Reddit style snark would go a long way too.


Whether you witness something or not is a function of a ton of other things too, so much that it makes your anecdotes useless if not actively harmful.

For example, if you live somewhere where you use the highway more often, that sure as heck can skew the result.

Or if you live(d) somewhere where people tend to hit and run instead of waiting... you're obviously not going to witness them as often.

Also, note that accidents and injuries are not the same thing. You can totally have fewer accidents but more injuries or fatalities.

Without knowing the neighborhoods you've lived in (so people can compare the data for themselves) you're really not going to make a compelling case.


There are some documented studies of removing all the street clutter and lines from residential area intersections, forcing drivers to be more careful, especially around pedestrians, and reducing overall accidents. But this does reduce throughput slightly.

This is an example of risk compensation. When people perceive greater protections around themselves, they tend to become more aggressive at the margin, such as with the driving habits that you mentioned or hitting more violently in American football because of improvements in helmets and padding.

https://en.wikipedia.org/wiki/Risk_compensation


Football might be a weak example, because being able to hit harder is an overwhelming competitive advantage. A player who acted like they were not wearing a helmet would be effectively dysfunctional.

In contrast, most careless driving habits don't actually get anybody to their destination any quicker.


Update: someone replied below with a link to traffic deaths statistics. Turns out the data shows the opposite of what I witnessed.

The insane driving country has double the traffic-related deaths of the chill, lawful driving country.


Except if you look at this map: https://en.wikipedia.org/wiki/List_of_countries_by_traffic-r...

People die on the road more in countries that conventionally don't follow traffic laws.


Not sure why this would be downvoted. Go to a less developed nation where traffic laws are not important and it’s one of those false-sense-of-security ideas.

When I lived in India everyone would always tell me how everyone drives so much safer there because they're more aware etc (similar reasoning as we're seeing in this thread), but man, the national crash statistics say otherwise.

Not to mention I lost count of how many dozens of accidents I witnessed in my year there. I've personally been in 3 rickshaw crashes.


My experience as well in Vietnam. The first time I was there I figured they were right, but it’s just a false narrative people sell to make themselves feel safe. Don’t even get me started on methed-out, American-size semi trucks speeding with no concern of running you over.

>signal lights tell me whether or not I can pass

No they don't, they tell you and other vehicles to stop. You would fail your driving test if you depend only on the traffic lights and don't bother to verify it is safe to pass yourself.


The main function of a traffic signal is the green phase, not the red phase. A traffic signal increases throughput by allowing drivers to ignore crossing traffic.

(If safety/the red phase was the purpose, the intersection would use a roundabout instead.)


I mean, it depends on where you take your driving test. In a lot of places in the US (especially in some rural areas), you may still pass. In some cases you might not even drive near a stoplight during the test.

If you "know a guy" you can even pass a driving test without ever getting behind the wheel of a car. Road licensing has been in complete shambles for the last 10 years. A lot of "workarounds" and corruption.

You definitely still should be paying attention to cross traffic, regardless of what the lights indicate. The lights just make it easier by stopping most traffic for you, so you only have to do a quick scan for outliers.

The distinction is subtle.

Someone learning to fly may be described as paying careful attention: to every little sound, vibration, and sensation. A common tactic by student pilots is overcontrolling the aircraft, e.g., large sudden changes rather than smooth pressures from flying with a light touch.

Automation requires active, intentional attention particularly when flying in clouds. What are my instruments telling me? Are they all telling the same story? Have any failed? Which ones?

A significant part of flight training and testing emphasizes the ability to divide attention between multiple competing needs, being able to correctly prioritize them, and responding promptly and safely in order of priority.


replication: https://en.wikipedia.org/wiki/Quine_(computing)

autonomous replication: https://en.wikipedia.org/wiki/Computer_worm

nb that writing your own quine remains in general terms a fun and challenging exercise in many programming languages, but not python.
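
For anyone curious why Python makes this easy, here is a minimal sketch of the classic `%r` quine. The names `TEMPLATE`, `QUINE_SOURCE`, and `run_and_capture` are mine, for illustration only; the quine is held in a string so the claim can be checked by actually running it.

```python
import contextlib
import io

# Template for the classic two-line quine. %r inserts repr() of the
# template into itself; %% collapses to a literal % after formatting.
TEMPLATE = 's = %r\nprint(s %% s)'

# The actual quine program text.
QUINE_SOURCE = TEMPLATE % TEMPLATE


def run_and_capture(source: str) -> str:
    """Execute Python source and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


if __name__ == "__main__":
    print(QUINE_SOURCE)
    # print() appends one trailing newline to the program's output.
    print(run_and_capture(QUINE_SOURCE) == QUINE_SOURCE + "\n")
```

The whole trick is that `repr()` hands back a string literal that re-creates the string, which is the part that takes real work in languages without an equivalent.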


We are talking about physical replication, please let me know when a computer worm can turn my laptop into 2 laptops

Otherwise you’re just arguing that Sims are totally alive because Sims can make baby Sims.


I'm no decision theorist but I think they should wait for the rewards to outweigh the harms in expectation rather than being statistically equal.

Fortune favors the bold.

If they took the right gamble, that is :-)

You miss 100% of the shots you don't take.

And sometimes 1000% of the shots that you do. (See, e.g., derivative trading.)

By that logic, you never stop playing a double-or-nothing game. Good luck!
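
To put rough numbers on the double-or-nothing point: a sketch, assuming a hypothetical game where each round doubles the whole stake with probability p and loses it all otherwise. The value p = 0.6 and the function names are illustrative assumptions, not anything from the thread.

```python
def expected_multiple(p: float, rounds: int) -> float:
    """Expected final bankroll, as a multiple of the initial stake."""
    # Each round multiplies the stake by 2 with probability p, else by 0,
    # so the per-round expected multiple is 2 * p.
    return (2 * p) ** rounds


def survival_probability(p: float, rounds: int) -> float:
    """Chance of still holding anything after betting it all every round."""
    return p ** rounds


if __name__ == "__main__":
    p = 0.6  # hypothetical favorable odds
    for rounds in (1, 5, 10, 20):
        print(rounds, expected_multiple(p, rounds), survival_probability(p, rounds))
```

With any p above 0.5, every single round is worth taking in expectation, yet the chance of walking away with anything shrinks toward zero as the rounds pile up. That is the sense in which "rewards outweigh harms in expectation" can't be the whole stopping rule.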

Once or twice I've experienced extreme pain, and it was downstream of a bright light shining on a wet rock for millions of years.

I try to imagine myself long ago, on the outside looking in, with someone explaining to me that extreme pain, wondrous art, hunger, triumph, and despair would all unfold in due time where the rocks were wet and the lights bright enough.

I can imagine myself calling this clear nonsense.


> You're assuming that because Claude produces text that appears to express these qualities, Claude must have them.

Not to be confrontational, but the OP assumed no such thing. OP asserted that it's important for Claude to have the qualities - not that it's important for Claude to present as-if it had them.


The OP said Claude and similar LLMs "exhibit objective, human-like behaviours". That's a claim about what is true, not about what is important. That's the claim I'm disputing: we don't have evidence that Claude exhibits such behaviors, we only have evidence that it produces text that is similar to the text humans produce when they claim to exhibit such behaviors. Which is not good evidence, for the reasons I gave.

OP wants to assert that it's "important" for these systems to have those qualities, while completely brushing aside the question of whether such systems can in principle actually have those qualities (or their opposites). Which is at best nonsensical, and at worst an attempt to argue by assertion that they can.

I find this line of reasoning highly dubious.

Yes, Anthropic is compute constrained, even after the SpaceX Colossus deal.

But supply constraints are the normal operating mode of any market. Anthropic could choose to serve whatever models it pleases at whatever price points it chooses and let the market decide where the value is.

If Mythos at $X overwhelms their capacity, they could just charge $X+1. If still overwhelmed, there are larger prices as well.


During periods of market exuberance, it’s in the vendor’s interest not to reveal where exactly x+1 is. At the moment, everyone just guesstimates the company’s TAM. Bringing certainty to that guesstimate cuts Anthropic off from the most exuberant market participants, bringing their post-IPO price down unnecessarily.

The question is, will anyone pay enough for Mythos to offset the opportunity cost of offering that much Opus? You don't want to end up in a spot where you don't have enough compute and your service's reliability degrades to an unusable state like xAI.

I feel like there's always a demand for the very best models, even at insane prices. If the opportunity cost is x times Opus, there may be few, but there will always be companies willing to pay x+1 times Opus.

Isn't that kind of what they're doing with this rollout? Except they're just hand picking the companies.

Only if the price is under the competition, which does exist now.

> Mythos at $X overwhelms their capacity, they could just charge $X+1

This may not be as valuable in the long term as getting committed customers hooked at a sustainable price.


Sort of, but valuation models depend on X being in a certain range. If it exceeds this range, revenue and therefore valuation are impacted.

And then the bubble would collapse. Corps are already putting limits on token usage across the board because of costs. Increasing costs would significantly contract the hype bubble.

No insider info, but just wanted to mention that pricing signals things too. If Mythos is only servable at $X*Y and isn’t Y times better than $X of compute at another provider, it’s quite possible that this affects the IPO price negatively, versus the halo of having the world’s most expensive model that is “too powerful to release” unpriced and unbenchmarked.

I think that most people at Anthropic are true believers, from my interactions with them, so I don’t believe this theory anecdotally. The simplest explanation is that it really is taking a while to gain confidence they won’t be used for a spree of bad cyber attacks. Knowing how long it takes institutions to fix security issues when filed by humans, I would be more surprised if this wasn’t the case.

But I would forgive anyone who did think it was deliberately sandbagged; given the staggering sums at play, true believers might believe the ends justify the means to a little “marketing” like this.

