Grok would likely have an advantage there as well: it's got better coupling to X/Twitter, a better web search index, and fewer of the safety guardrails in pretraining and system-prompt modifications that distort reality. It's easy to envision random market realities that would trigger ChatGPT or Claude into adjusting their output to be more politically correct. DeepSeek would be subject to the most pretraining distortion, but would have the least distortion in practice if a random neutral host were selected. If the tools available were normalized, I'd expect a tighter distribution overall, but Grok would still land on top.

Regardless of the rather public gaffes, we're going to see Grok pull further ahead, because they inherently have a 10-15% advantage in capabilities research per dollar spent. OpenAI, Anthropic, and Google are all diffusing their resources on corporate safetyism while xAI is not. That advantage, all else being equal, compounds, and I hope at some point it inspires the other labs to give up the moralizing, politically correct, self-righteous "we know better" and just focus on good AI.

I would love to see a frontier lab swarm approach, though. It'd also be interesting to do multi-agent collaborations that weight source inputs based on past performance, or to use some sort of orchestration algorithm that lets the group exploit the strengths of each individual model. Imagine 20 instances of each frontier model in a self-evolving swarm, doing custom system prompt revision with a genetic-algorithm-style process, so that over time you get 20 distinct individual modes and roles for each model.

It'll be neat to see the next couple of years play out. OpenAI had the clear lead up through Q2 this year, I'd say, but Gemini, Grok, and Claude have clearly caught up, and the Chinese models are just a smidge behind. We live in wonderfully interesting times.
Generally a good writeup, but the article seems a bit confused about undefined behavior.

> What is the dreaded UB? I think the best way to understand it is to remember that, for any running program, there are FATES WORSE THAN DEATH. If something goes wrong in your program, immediate termination is great actually!

This has nothing to do with UB. UB is what it says on the tin: something for which no definition is given in the execution semantics of the language, whether intentionally or unintentionally. It's basically saying, "if this happens, who knows". Here's an example in C:

    #include <stdio.h>

    int main(void) {
        int x = 555;
        long long *l = (long long *)&x;  /* aliases an int as a long long */
        x = 123;
        printf("%lld\n", *l);            /* with optimizations on, likely prints 555 */
    }

This is a violation of the strict aliasing rule, which is undefined behavior. Unless it's compiled with no optimizations, or with -fno-strict-aliasing, which effectively disables this rule, the compiler is "free to do whatever it wants". In practice, though, it'll likely just print 555 instead of 123.

All undefined behavior is just stuff like this: the compiler's output deviates from what you'd expect given the source, and even then only maybe. You can imagine this kind of thing gets rather tricky with more aggressive optimizations, but this potential deviation is all that occurs. Race conditions, silent bugs, etc. can occur as the result of the compiler mangling your code thanks to UB, but so can crashes and a myriad of other things. It's also possible UB is completely harmless, or even beneficial.

It's really hard to reason about that kind of thing, though. Optimizing compilers can be really hard to predict across a huge codebase, especially if you aren't a compiler dev yourself. That unpredictability is why we say it's bad. If you're compiling code with something like TCC instead of clang, it's a completely different story.

That's it. That's all there is to UB.
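For contrast, here's a minimal sketch of the kind of bit-level reinterpretation the standard does define, using memcpy between same-size types (assuming a platform where float and unsigned int are both 32 bits):

    #include <stdio.h>
    #include <string.h>

    int main(void) {
        float f = 1.0f;
        unsigned int bits;
        /* memcpy is the well-defined way to reinterpret an object's bytes;
           optimizers compile this to a plain register move, not a real copy */
        memcpy(&bits, &f, sizeof bits);
        printf("0x%08x\n", bits);        /* 0x3f800000 on IEEE-754 machines */
    }

There's nothing here for the optimizer to get clever about, which is really the whole point.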
Yeah, that's an interesting point. It feels like it should be even better than it is now (I might be ignorant of the quality of the best coding agents atm). Rust seems particularly well suited to an agent-based workflow, in that in theory an agent with a task could keep `cargo check`-ing its solutions, maybe pulling from docs.rs or source for imported modules, and get to a solution that works with some confidence (assuming the requirements were well defined/possible, etc.).

I've had a mixed bag of an experience trying this with various Rust one-off projects. It's definitely gotten me some prototype things working, but the evolving development of Rust and the crates in the ecosystem means there's always some patchwork to get things to actually compile. Anecdotally, I've found that once I've learned more about the problem/library/project I'll end up scrapping or rewriting a lot of the LLM code. It seems pretty hard to tailor/sandbox the context and workflow of an agent to the extent that's needed.

I think the Bun acquisition by Anthropic could shift things too. I wouldn't be surprised if the majority of code generated/requested by users of LLMs is JS/TS, and Anthropic potentially being able to push for agentic integration with the Bun runtime itself could be a huge boon for Bun, and maybe for Zig (which Bun is written in) as a result. It'd be one thing for an agent to run cargo check; it'd be another for the agent to monitor garbage collection/memory use while code is running to diagnose potential problems/improvements devs might not even notice until later.

I feel like I know a lot of devs who would never touch any of the langs in this article (thinking about memory? too scary!) and would love to continue writing JS code until they die lol
I really hate the anti-RAII sentiments and arguments. I remember the Zig community lead going off about RAII before and making claims like "linux would never do this" ( https://github.com/torvalds/linux/blob/master/include/linux/cleanup.h ; there's a sketch of that pattern at the bottom of this comment). There are bad cases of RAII APIs for sure, but it's not all bad.

Andrew himself posted a while back about feeling bad for Go devs who never get to debug by seeing 0xaa memory segments, and sure, I get it, but you can't make over-extended claims about non-initialization when you're implicitly initializing with a magic value; that's a bit of a false equivalence. And sure, maybe you don't always want a zero scrub instead. I'm not sold on Go's mantra of making zero values always be useful; I've seen really bad code come out of people doing backflips to try to make that true. A constructor API is a better pattern as soon as there's any real challenge; the "rule" only fits when it's easy, so don't force it.

Back to RAII, though, or what people think of when they hear RAII. Scope-based or automatic cleanup is good. I hate working with Go's mutexes in complex programs after spending life in the better world. People make mistakes and people get clever, and the outcome is almost always bad in the long run: bugs that "should never get written/shipped" do come up, and it's awful. I think Zig's errdefer is a cool extension of the defer pattern, but defer patterns are strictly worse than scope-based automation for key tasks. I do buy the argument that sometimes you want to deviate from scope-based controls, and primitives offering both are reasonable, but the default case for a ton of code should be optimized for avoiding human effort and human error.

In the end I feel similarly about allocation. I appreciate Zig trying to push for a different world, and that's an extremely valuable experiment to be doing. I've fought allocation in Go programs (and Java, etc.), and had fights with C++ that was "accidentally" churning too much (classic hashmap string spam, hi ninja, hi GN), but I don't feel like the right trade-off anywhere is "always do all the legwork" vs. "never do any of the legwork". I wish Rust was closer to the optimal path, and it's decently ergonomic a lot of the time, but when I really want control I sometimes want something more like Zig. When I spend too much time in Zig I get a bit bored of the ceremony too.

I feel like the next innovation we need is some sanity around the real, useful value that is global and thread state. Far too much toxic hot air is spilled over these, and there are bad outcomes from mis/overuse, but innovation could spend far more time on _sanely implicit context_ that reduces programmer effort without being excessively hidden, while allowing for local specialization that is easy and obvious. I imagine it looks somewhere between the Rust and Zig solutions, but I don't know exactly where it should land. It's a horrible set of layer violations that the purists don't like, because we base a lot of ABI decisions on history, but I'd still like to see more work here.

So RAII isn't the big evil monster, and we need to stop talking about RAII, globals, etc. in these ways. We need to evaluate what's good and what's bad, and try out new arrangements that maximize the good and minimize the bad.
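Since cleanup.h came up: here's a minimal sketch of the scope-based cleanup pattern it builds on, GCC/Clang's cleanup attribute (the helper names below are mine, not the kernel's):

    #include <stdio.h>
    #include <stdlib.h>

    /* Called automatically with a pointer to the annotated variable
       when it goes out of scope, on every exit path. */
    static void free_buf(char **p) {
        free(*p);
    }

    int main(void) {
        __attribute__((cleanup(free_buf))) char *buf = malloc(64);
        if (!buf)
            return 1;           /* cleanup still runs here; free(NULL) is fine */
        snprintf(buf, 64, "hello");
        printf("%s\n", buf);
        return 0;               /* buf freed automatically at scope exit */
    }

Same basic shape as RAII or defer: the cleanup is tied to the scope, so nobody has to remember it on every early return.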
> So RAII isn't the big evil monster, and we need to stop talking about RAII, globals, etc, in these ways.

I disagree: I place RAII as the dividing line of programming language complexity, and it is THE "Big Evil Monster(tm)". Once your compiled language gains RAII, a cascading and interlocking set of language features now needs to accrete around it to make it ... not excruciatingly painful. This practically defines the difference between a "large" language (Rust or C++) and a "small" language (C, Zig, C3, etc.).

For me, the next programming language innovation is getting the garbage-collected/managed-memory languages to finally quit ceding so much of the performance programming language space to the compiled languages. A managed runtime doesn't have to be so stupidly slow. It doesn't have to be so stupidly non-deterministic. It doesn't have to have a pathetic FFI that is super complex. I see "strong typing everywhere" as the first step along this path. Fil-C might become an interesting existence proof in this space.

I view having to pull out any of C, Zig, C++, Rust, etc. as a higher-level programming language failure. There will always be a need for something like them at the bottom, but I really want their scope to be super small. I don't want to operate at their level if I can avoid it. And I say all this as someone who has slung more than 100KLoC of Zig code lately.

For a concrete example, look at Ghostty, which was written in Zig. There is no strong performance reason for it to be in Zig (except that implementations in every other programming language other than Rust seem to be so much slower). There is no strong memory reason for it to be in Zig (except that implementations in every other programming language other than Rust chewed up vast amounts of it). And yet a relatively new, unstable, low-level programming language was chosen to greenfield Ghostty. And all the other useful terminal emulators seem to be using Rust.

Every adherent of managed-memory languages should take it as a personal insult that people are choosing to write modern terminal emulators in Rust and Zig.
- Strict team separation (frontend versus backend)

- Moving all state management out of the backend and onto the frontend, in a supposedly easier-to-manage system

- Page refreshes are indeed jarring to users and more prone to leading to sudden context losses

- Desktop applications did not behave like web apps: they are "SPA"s in their own sense, without jarring refreshes or code that gets "yanked" out of execution. Since the OS has been increasingly abstracted under the browser, and the average computer user has moved more and more towards web apps[1], it stands to reason that the behavior of web apps should become more like that of desktop apps (i.e. "SPA"s)[2]

(Not saying I agree with these, merely pointing them out.)

[1] These things are not entirely independent. It can be argued that the same powers that be (big corps) that pushed SPAs onto users are also pushing the "browser as OS" concept.

[2] I know you can get desktop-like behavior from non-SPAs, but it is definitely not as easy to do, or at least to _learn_, now.

My actual opinion: I think it's a little bit of everything, with a big part of it coming from the fact that the web was the easiest way to build something you could share with people effortlessly. Sharing desktop apps wasn't particularly easy (different targets, Java was never truly "run everywhere", etc.), but to share a web app you just put it online and have someone else point their browser at a URL -- often all they'll do is click a link! And in general it is definitely easier to build an SPA (from the frontender's perspective) than something else.

This creates a chain: if I can create and share easily -> I am motivated to do things easily -> I learn the specific technology that is easiest -> the market is flooded with people who know this technology better than everything else -> the market must now hire from this pool to get the cheapest workers (or those who cost less to acquire due to quicker hiring processes) -> new devs know that they need to learn this technology to get hired -> the cycle continues.

So, TL;DR: much lower barrier to entry + quick feedback loops.

P.S. (and on topic): I am an extremely satisfied Django developer, and very very very rarely touch frontend. Django is A-M-A-Z-I-N-G.
In a high-stakes, challenging environment, every possible human weakness becomes a huge, career-impeding liability. Very few people are truly all-around talented. If you are a Stanford-level scientist, it doesn't take a lot of anxiety to make it difficult to compete with other Stanford-level scientists who don't have any anxiety. Without accommodations, you could still be a very successful scientist after going to a slightly less competitive university.

Rising disability rates are not limited to the Ivy League. A close friend of mine is faculty at a medium-sized university and specializes in disability accommodations. She is also deaf. Despite being very bright and articulate, she had a tough time in university, especially the lecture-heavy undergrad years. In my eyes, most of the students she deals with are "young and disorganized" rather than crippled. Their experience of university is wildly different from hers.

Being diagnosed doesn't immediately mean you should be accommodated. The majority of student cases receive extra time on exams and/or attendance exemptions. But the sheer volume of these cases takes away a lot of badly needed time and funding from students who are talented but are also blind or wheelchair-bound. Accommodating them can require many months of planning to arrange appropriate lab materials, electronic equipment, or textbooks.

As the article mentions, a deeply distorted idea of normal is being advanced by the DSM (changing ADHD criteria) as well as social media (enjoying doodling, wearing headphones a lot, putting water on the toothbrush before toothpaste: these and many other everyday things are suggested as signs of ADHD/autism/OCD/whatever). This is a huge problem of its own. Though it is closely related to over-prescribing educational accommodations, it is still distinct.

Unfortunately, psycho-educational assessments are not particularly sensitive. They aren't good at catching pretenders, and they cannot distinguish between a 19-year-old who genuinely cannot develop time management skills despite years of effort & support and one who simply hasn't fully developed them yet, especially after moving out and moving to a new area with new (sub)cultures. Occasionally, she sees documents saying "achievement is consistent with intelligence", a polite way of saying that a student isn't very smart and that poor grades are not related to any recognized learning disability.

Really and truly, not everyone needs to get an undergrad degree.
Those are all great examples where I agree that an accommodation seems uncontroversial. But to quote the article linked in the parent comment:

> The increase is driven by more young people getting diagnosed with conditions such as ADHD, anxiety, and depression, and by universities making the process of getting accommodations easier.

These disabilities are more complex for multiple reasons.

One is the classification criteria. A broken hand or blindness is fairly discrete; anxiety is not. All people experience some anxiety: some experience very little, some a great deal, and everything in between. The line between regular anxiety and clinical anxiety is inherently fuzzy. Further, a clinical anxiety diagnosis is usually made on the basis of patient questionnaires and interviews where a patient self-reports their symptoms. This is fine in the context of medicine, but if patients have an incentive to game these interviews (like more test time), it is pretty trivial to game a GAD-7 questionnaire for the desired outcome. There are no objective biomarkers we can use to make a clinical anxiety diagnosis.

Another is the scope of accommodation. The above examples have an accommodation narrowly tailored to the disability in a way that maintains fairness. Blind students get a braille test that is of no use to other students anyway. A student with a broken hand might get more time on an essay test, but presumably would receive no extra time on a multiple-choice test, and their accommodation lasts for a period of months, not indefinitely.