I suspect the big picture isn't just "governments restricting the availability of strong LLMs to the public". It's a group of tech lobbyists who have managed to push a narrative that's plausible enough to the majority but serves their masters' interests in stifling competition, whether that comes from Anthropic or from those who know how to use their tools effectively.
The fact that Anthropic are willing to dumb down their own model responses to "Prevent foreign competitors from using the model to accelerate R & D and protect our leading position." [1] lends credence to this speculation. Anthropic are scared of their own model's power in the hands of competitors; it has nothing to do with security.
I wonder if this is the real problem: it was too good, and a lobby of companies feeling threatened by the competition decided to push the jailbreak narrative as a scapegoat.
In a sense they do care. Anthropic / OpenAI care that your projects are successful because that means more revenue for themselves. Therefore, their models are designed to care that your products work.
It seems to me the incentive, if anything, is to make the codebase so complex and “write-only” that developers become entirely dependent on LLMs to make any change whatever. They care that you keep burning tokens, not that those tokens accomplish anything for you.
UUIDs make client code so much simpler. Just create a UUID, use it client side to create your object graph and commit or not as appropriate. No need to retrieve an incremented integer.
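A rough sketch of what that looks like in Python, in case it helps. The `Order` and `LineItem` classes and the `build_order` helper are made up for illustration; the point is only that child rows can reference the parent's key before anything has touched the database.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class LineItem:
    order_id: uuid.UUID  # foreign key, known before any INSERT happens
    sku: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)


@dataclass
class Order:
    customer: str
    id: uuid.UUID = field(default_factory=uuid.uuid4)
    items: list = field(default_factory=list)

    def add_item(self, sku: str) -> LineItem:
        # The child can point at the parent immediately; no round trip needed.
        item = LineItem(order_id=self.id, sku=sku)
        self.items.append(item)
        return item


def build_order(customer: str, skus: list) -> Order:
    """Build the whole object graph client side (hypothetical helper)."""
    order = Order(customer=customer)
    for sku in skus:
        order.add_item(sku)
    return order


order = build_order("alice", ["sku-1", "sku-2"])
# At this point the graph can be committed in one batch, or simply discarded.
```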
Every DB, even MySQL, can return the autoincrementing integer for you as part of the insert. Postgres, SQLite, and MariaDB (likely others, I’m just not familiar) can even return the rest of the data, should you need that.
IME, most of the arguments for why UUIDs make things better are due to developer ignorance of RDBMS features (or B+tree performance).
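For anyone who hasn't used this, here is a small sketch with Python's built-in `sqlite3` module. The `users` table and both helper functions are invented for the example, and `RETURNING` needs SQLite 3.35 or newer, so treat the second function as conditional on your build.

```python
import sqlite3


def insert_user(conn: sqlite3.Connection, name: str) -> int:
    """Insert a row and return the integer key the database generated."""
    cur = conn.execute("INSERT INTO users (name) VALUES (?)", (name,))
    return cur.lastrowid


def insert_user_returning(conn: sqlite3.Connection, name: str) -> tuple:
    """Same insert, but ask the database for the whole row back.

    Assumes SQLite >= 3.35, which is where RETURNING was added.
    """
    cur = conn.execute(
        "INSERT INTO users (name) VALUES (?) RETURNING id, name, created_at",
        (name,),
    )
    return cur.fetchone()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " name TEXT NOT NULL,"
    " created_at TEXT DEFAULT CURRENT_TIMESTAMP)"
)

first_id = insert_user(conn, "alice")
second_id = insert_user(conn, "bob")
```

Either way the key comes back from the same statement that did the insert, so there is no separate "fetch the new id" query.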
I’m seeing a similar improvement with Opus 4.8, which is acting like an engineer that cares about correctness. The harder the problem the better it seems to do.
I think a golden age of software is just starting for indie software. It’s just going to take a while to see the first really good results.
I like the concept. Assuming you were the inspiration for this (very possible) how do you feel about the usability?
I spent an hour today trying to get it working the way I’d expect and it still does odd things, like after disabling automatic reordering based on usage the order is different when 3 finger swiping previews as opposed to actual windows. The visual order is as expected but the swipe order is not linear.
My criticism of macOS Spaces is one I have of all Apple's window management efforts over the years: It's a great start and a foundation to build upon, but they never really followed through with iterative improvements informed by real-world usage.
In Spaces' case, the problem is a combination of an overly rigid model (only two full-height windows per space) and high friction to management (the process of moving full-height windows from one space to another requires multiple steps). Traditional free-form windows are a little better, but it's easy to lose track of them because the overview itself requires two steps to access (open Mission Control, hover the top bar to expand thumbnails). These could have been gradually improved over the past 14 years, but Apple has somewhat frustratingly left these core workspace features to wither on the vine.
> rather than showing you a preview, the bar just says "Desktop 1", "Desktop 2"
I never noticed that behaviour because I only use mission control in full-screen mode. If you swipe up with three (or four) fingers from a full-screen window the previews are visible immediately. I have no idea why we need a different preview for desktop vs full screen however.
The part of this UX that annoys me is the spaces get re-ordered for no apparent reason. I usually have a few IDE windows open and it's tiring to have to double-check the window hasn't moved.
The full-screen mode handling might be a clue about what went wrong: if you swipe up from a space that contains a full screen app, it has an animation where the app goes into a slot in the preview strip, but that animation doesn't make sense visually for a non-full-screen space. So, perhaps someone was implementing that animation, didn't want to implement an alternate animation for the non-fullscreen case, and decided to minimize the preview strip instead? And because this was after Steve Jobs had died, there was no one left in charge of UX to explain why that was a bad idea?
The animation for the full-screen case serves a useful purpose: drawing the eye to the window in the preview.
The non-fullscreen (desktop) case uses an animation for the same purpose, locating the current app window in a sea of others.
So what would the preview be in the swipe-from-desktop case? A preview of the window-sea, or the desktop as is? What should the animation be? I suspect those questions are why they chose to just name the desktop.
I think it would be more consistent if the tab based preview only existed for the desktop window-sea and transitioned to the actual space previews when swiping between spaces.
> If you swipe up with three (or four) fingers from a full-screen window the previews are visible immediately.
Previews are also visible immediately if you set Mission Control as a hot-corner action. I never see the title-only spaces; I forgot it even did that until this discussion.
I also wish I could name the Spaces. "Desktop N" is pretty useless.
Turning off "Automatically rearrange Spaces based on most recent use" keeps the spaces in the order I left them. That's nice. Three finger swipe between spaces when not using the preview seems to work.
However, when swiping between the previews, it sometimes jumps to random places in the order, which is not nice.
Possibly a bug, but I might as well just write this as a letter to Santa because it's got more chance of being read than a feedback report.
Maybe someone can explain why an encoder would ever create the padding bytes allowed in LEB128. I contributed the parser for LEB128 in apple/swift-binary-parsing and I’m still none the wiser. I’m genuinely mystified.
Let's say you are writing into a byte[] and have a LEB128 length prefix followed by a payload, but determining the length involves nontrivial encoding work. For example, you have a UTF-16 string and want to write it out as UTF-8: you go over the characters and write them out, but the UTF-8 length is not known until all of that work is done.
If you can choose a fixed number of bytes for the length prefix, you can skip that number, do the encoding and find out the length, and then come back and fill in the length-prefix after.
But you don't know how many bytes the prefix will take without doing all of the work to find the payload length (since larger payloads take more bytes to represent the length).
If you allow overlong representations, you can reserve a few bytes up front, and any you turn out not to need become effectively no-op padding bytes. If you don't allow them, you can't do this.
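A rough Python sketch of that reserve-then-backfill approach, assuming unsigned LEB128 and a 5-byte reserved prefix (enough for lengths below 2^35). All function names and the `width` parameter are invented for the example and don't come from any particular library.

```python
def encode_uleb128(value: int, width: int = 0) -> bytes:
    """Encode a non-negative int as unsigned LEB128.

    With width > 0, pad with redundant continuation groups so the result
    is exactly `width` bytes long (an overlong, non-canonical encoding).
    """
    if value < 0:
        raise ValueError("unsigned LEB128 cannot encode negative values")
    out = bytearray()
    while True:
        group = value & 0x7F
        value >>= 7
        # Keep going while real bits remain, or while padding is still owed.
        more = value != 0 or len(out) + 1 < width
        out.append(group | (0x80 if more else 0x00))
        if not more:
            break
    if width and len(out) != width:
        raise ValueError("value does not fit in the reserved width")
    return bytes(out)


def decode_uleb128(data: bytes, offset: int = 0) -> tuple:
    """Lenient decoder: accepts padded encodings. Returns (value, next_offset)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        shift += 7
        if (byte & 0x80) == 0:
            return result, offset


def write_length_prefixed_utf8(buf: bytearray, text: str, prefix_width: int = 5) -> None:
    """Reserve the prefix, stream the payload, then backfill the length."""
    start = len(buf)
    buf.extend(b"\x00" * prefix_width)  # placeholder for the length prefix
    payload_start = len(buf)
    for ch in text:  # stand-in for an encoder whose output size isn't known up front
        buf.extend(ch.encode("utf-8"))
    length = len(buf) - payload_start
    # Same number of bytes as reserved, so nothing has to be moved.
    buf[start:payload_start] = encode_uleb128(length, prefix_width)


buf = bytearray()
write_length_prefixed_utf8(buf, "héllo")
length, payload_offset = decode_uleb128(buf)
```

Without padding you would either have to encode twice or shift the payload once the real prefix size is known.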
It allows you to fill in padding in a buffer. For example, all data in a buffer will be interpreted by a downstream system, and someone pre-calculated the size of that buffer. Rather than encode everything twice (once to figure out the exact size needed, and a second time to actually populate the buffer) the buffer size was calculated using foreknowledge of how many values would be written to that buffer and then just pessimistically assuming all of them are max-size so writing will never fail. Another situation is when you're rewriting part of an already-encoded file. If you want to change a bit of payload then using padding bytes gives you more flexibility so you can do that without having to do any memcpy into a new buffer.
It's uncommon but I've definitely seen it done (with media containers like Matroska, not actually LEB128) in extremely high-throughput systems that can't spare any cycles.
Maybe you want to byte align some data, or pack to a certain size but keep compat. I think they're going to be rare cases, but I can see it being used.
The issue is that non-unique encodings are an attack vector, because parsers may in practice behave differently for noncanonical (or nominally invalid) encodings.
For example, you have an envelope format that goes: length prefix in LEB128, message, signature. One party controls the length prefix and signature, a different party controls the message. The message-writing party carefully crafts the message so that, in isolation, it appears innocuous, but when wrapped in the envelope, the first few bytes of the message look like continuation of the length prefix. Best case is the receiving party safely fails to parse the message, worst case is the receiving party successfully parses the message, verifies its digital signature, and interprets it differently than the signer did.
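One mitigation is for the receiver to accept only the shortest encoding, so every value has exactly one byte representation. A minimal Python sketch, assuming unsigned LEB128; `decode_uleb128_strict` is a made-up name and the 10-byte cap is an arbitrary choice for the example, not something from a spec.

```python
def decode_uleb128_strict(data: bytes, offset: int = 0, max_bytes: int = 10) -> tuple:
    """Decode unsigned LEB128, rejecting padded (non-canonical) encodings.

    Returns (value, next_offset). Raises ValueError on truncated input,
    on encodings longer than max_bytes, and on any multi-byte encoding
    whose final group is zero, since that group carries no information.
    """
    result = 0
    for i in range(max_bytes):
        if offset + i >= len(data):
            raise ValueError("truncated LEB128 value")
        byte = data[offset + i]
        result |= (byte & 0x7F) << (7 * i)
        if (byte & 0x80) == 0:
            if i > 0 and byte == 0:
                raise ValueError("non-canonical LEB128: trailing zero group")
            return result, offset + i + 1
    raise ValueError("LEB128 value exceeds max_bytes")
```

This doesn't fix the envelope design by itself, but it removes the case where two parsers disagree about where the length prefix ends because one of them tolerates padding.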
Laziness probably. Maybe there's an argument if you want to avoid branches and just blast the integer out in a fixed number of statements/instructions/bytes, but that sounds a bit fringe.
I happen to be guilty of a variant of this, where I don't bother emitting a 16-bit floating point number instead of a 32-bit one in my CBOR encoder even if it can be represented exactly. That one is laziness.
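For what it's worth, the exactness check itself is small. A sketch in Python using the standard `struct` module's half-precision format; `fits_in_half` is a name I made up, and a real CBOR encoder would also need its own policy for NaN, which this simply reports as not fitting.

```python
import struct


def fits_in_half(value: float) -> bool:
    """Return True if `value` survives a round trip through IEEE 754 binary16."""
    try:
        packed = struct.pack(">e", value)  # "e" is the half-precision format
    except (OverflowError, struct.error):
        return False  # magnitude too large for 16 bits
    (roundtrip,) = struct.unpack(">e", packed)
    # NaN never compares equal to itself, so it is reported as not fitting.
    return roundtrip == value
```

An encoder could call this first and fall back to the 32-bit form whenever it returns False.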
It's useful whenever you don't know the value of an integer but would like to allocate space for it now, and then fill in the value later. Many have mentioned length-prefixed data, which is a good example. Another use case is static linking. I believe LLVM uses this when generating WASM object files.
I think this is probably the real reason such encodings are considered valid. The WebAssembly spec is explicit about allowing over-wide encodings:
Because it's not a real standard and there is no blessed RFC for it. The DWARF spec is as close as you'll get and it says, "The integer zero is a special case, consisting of a single zero byte." So in a way, it doesn't.
Either way, a properly written decoder (and it's like ten lines) should really not have any problems with it. I was agreeing with you.
Edit: to clarify, I was talking about the author's argument being strange, not yours.
Working on my own product using Claude, I feel like front-end coding hasn’t changed much. It still requires a lot of manual tweaking and understanding users at a human level.
Personally I’m happy that the backend and algorithmic side writes itself.
That's refreshing to read (frontend is my wheelhouse). I mostly agree. It seems like most people using AI treat FE as a solved problem, satisfied using tailwind and settling for "looks close enough".
I think there will always be space for good artisanal FE. This is a Ford Model T moment, the software production line has just been invented, but that didn’t stop smaller sports car manufacturers pushing the envelope.
Just an anecdote on UK local government tech incompetence: I received a ticket “Failing to comply with a prohibition on certain types of vehicle” from Hackney council. Initially I thought my car had been cloned as I haven’t driven for months, but either a person or an AI had misread my number plate. It was all just such a waste of time, especially navigating the AI designed to annoy you into paying.
Addendum: The really annoying part was when the AI asked me for my ticket number, then asked me for the two digit contravention code.
First, surely they know the contravention code because they gave me the ticket?!
Second, the two-digit contravention code was actually part of a three-character alphanumeric code found in this sentence: "52m Failing to comply with a prohibition".
[1] https://eu.36kr.com/en/p/3848820681636481