That’s exactly the hard part, and I agree it matters more than the happy path. A few concrete things we do today:

1. It’s fully agentic rather than a fixed replay script. The model is prompted to treat the GUI as one route among several, to prefer simpler / more reliable routes when available, and to switch routes or replan after repeated failures instead of brute-forcing the same path. In practice, we’ve also seen cases where, after GUI interaction becomes unreliable, the agent pivots to macOS-native scripting / AppleScript-style operations. I wouldn’t overclaim that path, though: it works much better on native macOS surfaces than on arbitrary third-party apps.

2. GUI grounding has an explicit validation-and-retry path. Each action is grounded from a fresh screenshot, not stored coordinates. In the higher-risk path, the runtime does prediction, optional refinement, a simulated action overlay, and then validation; if validation rejects the candidate, that rejection feeds the next retry round. And if the target still can’t be grounded confidently, the runtime returns a structured `not_found` rather than pretending success.

3. The taught artifact has some built-in generalization. What gets published is not a coordinate recording but a three-layer abstraction: an intent-level procedure, route options, and GUI replay hints as a last resort. The execution policy is adaptive by default, so the demonstration is evidence for the task, not the only valid tool sequence.

In practice, when things go wrong today, the system often gets much slower: it re-grounds, retries, and sometimes replans quite aggressively, and we definitely can’t guarantee that it will always recover to the correct end state. That’s also exactly the motivation for Layer 3 in the design: when the system does find a route / grounding pattern / recovery path that works, we want to remember it and reuse it later instead of rediscovering it from scratch every time.
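The validation-and-retry shape described in point 2 can be sketched roughly like this. To be clear, this is a minimal illustration, not our actual runtime: every name here (`ground_with_retries`, `predict`, `validate`, etc.) is made up for the sketch.

```python
# Hypothetical sketch of a ground -> validate -> retry loop that ends in a
# structured not_found instead of pretending success. Names are illustrative.
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class GroundingResult:
    status: str                              # "ok" or "not_found"
    coords: Optional[Tuple[int, int]] = None  # screen coords on success
    reason: Optional[str] = None              # validator feedback on failure

def ground_with_retries(target_desc, take_screenshot, predict, validate,
                        max_retries=3):
    """Ground a GUI target from fresh screenshots; each rejection feeds the
    next attempt, and exhaustion returns a structured not_found."""
    feedback = None
    for _ in range(max_retries):
        shot = take_screenshot()                     # always re-ground; no cached coords
        candidate = predict(shot, target_desc, feedback)
        verdict = validate(shot, candidate)          # simulated-action check
        if verdict.accepted:
            return GroundingResult("ok", coords=candidate)
        feedback = verdict.reason                    # rejection informs next round
    return GroundingResult("not_found", reason=feedback)
```

The point of the structured return value is that the caller (the planner) can react to `not_found` by switching routes, rather than clicking a bad guess.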
I think the people that dismiss this concern are just as bad and unscientific. It's a pretty decent Bayesian prior to assume that regular exposure to synthetic organic compounds, in quantities or concentrations that hominids didn't receive exposure to in our evolution, is likely to be problematic. This is especially true when they don't occur naturally in any organism's biochemical processes and yet are active enough to interact with many of them.

This is OBVIOUSLY true for things used as pesticides / herbicides. We have evolved a natural aversion to areas where all the plants and animals are dead and rotting. There are good reasons for this to be a really good heuristic. I'll go so far as to say that almost any pesticide or herbicide is likely to be bad for vertebrates and invertebrates alike. This is likely the case for preservatives as well, for what should be obvious reasons. It's really not that crazy to assume they're probably not good as a default assumption.

Go into a hardware store and almost every chemical, solvent, paint, etc. that you encounter is not good for you. Eat a salmon and enjoy billions of plastic particles. Open almost any prepackaged food and you'll be ingesting all manner of dyes, preservatives, anti-caking agents, etc. that simply weren't around in your food environment during our evolution. It's a surprisingly good baseline assumption that these things aren't likely to be good for you.

If you think about the design of epidemiologic studies, it should be clear that it's going to be very difficult to prove harm in a lot of cases: things that are only a little harmful, or only harmful in combination, or harmful only after 20 years have passed since exposure, etc. ...except that the science is VERY clear: something (or lots of things) associated with "processed food" is really bad for you.
> In a nutshell, an IPv4x packet is a normal IPv4 packet, just with 128‑bit addresses. The first 32 bits of both the source and target address sit in their usual place in the header, while the extra 96 bits of each address (the “subspace”) are tucked into the first 24 bytes of the IPv4 body. A flag in the header marks the packet as IPv4x, so routers that understand the extension can read the full address, while routers that don’t simply ignore the extra data and forward it as usual.

So you have to ship new code to every 'network element' to support IPv4x. Just like with IPv6.

You have to update DNS to create new resource record types ("A" is hard-coded to 32 bits) to support the new longer addresses, and have all user-land code start asking for, using, and understanding the new record replies. Just like with IPv6. (And the DNS idea won't work, as a lot of legacy code did not have room in its data structures for multiple reply types: sure, you'd get the "A", but unless you updated the code to get the "AX" record (for IPv4x addresses) you could never get to the longer address… just like IPv6 needed code updates to recognize AAAA, otherwise you were A-only.)

You need to update socket APIs to hold new data structures for longer addresses so your app can tell the kernel to send packets to the new addresses. Just like with IPv6.

A single residential connection that gets a single IPv4 address also gets to use the whole /96 'behind it' with this IPv4x? People complain about the "wastefulness" of /64s now, and this is even more so (to the tune of 32 bits).

You'd probably be better served by pushing the new bits to the other end… like…

* https://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresses
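Taking the quoted description at face value, the address split would look something like the sketch below. This is my reading of the proposal, not a real spec; the function names and the big-endian byte order are assumptions.

```python
# Hypothetical sketch of the quoted IPv4x layout: the first (high) 32 bits of
# each 128-bit address go in the normal IPv4 header slots, and the remaining
# 96 bits of each ("subspace") occupy the first 24 bytes of the payload.

def split_ipv4x(addr128: int):
    """Split a 128-bit address into (header_32bits, subspace_96bits)."""
    return addr128 >> 96, addr128 & ((1 << 96) - 1)

def pack_ipv4x_body(src128: int, dst128: int, payload: bytes) -> bytes:
    """Prepend the two 96-bit subspaces (12 bytes each, 24 total) to the
    payload, as the quote says the extension would."""
    _, src_sub = split_ipv4x(src128)
    _, dst_sub = split_ipv4x(dst128)
    return (src_sub.to_bytes(12, "big")
            + dst_sub.to_bytes(12, "big")
            + payload)
```

Even in this toy form you can see the deployment problem: any middlebox that only reads the 32-bit header fields sees two truncated addresses and a payload whose first 24 bytes it misinterprets as application data.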
> But will it?

My prediction is no, because productivity gains must benefit the lower classes to see a multiplier in the economy.

For example, ATMs did cause a drop in teller jobs, but fast money at any time does increase the velocity of money in the economy. It decreases the savings rate and encourages spending among the class of people whose money imparts the highest multiplier. AI does not. All the spending on AI goes to a very small minority, who have a high savings rate. Junior employees that would have productively joined the labor force at good wages must now compete to join the labor force at lower wages, depressing their purchasing power and reducing the flow of money.

Look at all the most-used things for AI: cutting out menial decisions such as customer service. There are no "productivity" gains for the economy here. Each person in the US hired to do that job would spend their entire paycheck. Now instead, that money goes to a mega-corp and the savings are passed on to execs. The price of the service provided is not dropping (yet). Thus, no technology savings are occurring, either.

In my mind, the outcomes are:

* Lower-quality services
* Higher savings rate
* K-shaped economy catering to the high earners
* Sticky prices
* Concentration of compute in AI companies
* Increased price of compute preventing new entrants from utilizing AI without paying rent-seekers, the AI companies
* The cycle continuing through all previous steps

We may reach a point where the only ones able to afford compute are AI companies and those that can pay AI companies. Where is the innovation then? It is a unique failure outcome I have yet to see anyone talk about, even though the supply and demand issues are present right now.
Newer does not mean better, or even imply it. Money spent on facilities has almost no correlation with educational outcomes. What matters are the peers you go to school with, supported by a decent curriculum and moderately competent teachers. None of which is expensive. Oh, and administrators who actually care about teaching being done vs. being terrified of the lawsuit fairy.

It’s the peers that matter by far the most - and that means parents. Parents that self-select into good districts tend to skew heavily towards “involved” and some definition of functional. This can mean buying a home or renting an apartment in a good district, or finding some clever and/or creative workaround to get the same outcome. The latter is even better in most cases, since those families are motivated at an even higher level to make sure it’s a success.

The best school I went to as a kid was a highly selective private school in “the ghetto” where my dad lived growing up. Nearly every kid there was on some form of subsidized or full-ride tuition, with very “working class” parents. The facilities were barebones at best. The vast majority of kids had parents who held them to extreme expectations even if they didn’t have the financial means, or even time, to be highly involved day to day. The uber-rich brand-new high school I went to the next year in the suburbs wasn’t even close. The difference was in the kids who attended the school and the expectations put on them for classroom behavior, engagement, and work ethic. Shitty disruptive kids were kicked out within a matter of days so as to let kids who wanted to be there actually learn.

Anything beyond that is close to a rounding error for outcomes. The inner-city school district I pay taxes into spends more per student than many of the suburbs.
You could triple it again and get zero change in outcomes - in fact, since I've lived here, school budgets have been inversely correlated with outcomes, although I don’t see a causation there in either direction. Schools that are allowed to be run like schools and hold students to high expectations and standards do well. Schools that are run like social programs trying to correct for all of society's ills do not. It’s pretty simple in the end.
From their front page:

> *Full legal indemnification: Through our offshore subsidiary in a jurisdiction that doesn't recognize software copyright*

Heh, ok. So, the thinking is:

1. You contract them.
2. The actual copyright infringement is done by an __offshore__ company.
3. If you get sued by the original software devs, you seek indemnification from the offshore subsidiary.
4. That offshore subsidiary is in a country without copyright laws, or with weak laws, so "you're good!" ...
5. Profit.

This is a ridiculous legal defense, since this "one-way street" legal process will almost certainly result in you being sued first... you, the company actually using the infringing code. The indemnification is likely worthless, since the offshore company won't have any assets anyway and will dissolve once there's a lawsuit and legal process is established.

The "guarantee" is absurd: their "MalusCorp Guarantee" promises a refund and moving headquarters to international waters if infringement is found. This is not a real legal remedy and is written to sound like a joke, which is telling about their seriousness...

This whole "clean room as a service" concept is a legal gray area at best. In practice, it's extremely difficult to prove that a "clean room" process was truly clean, especially with AI models that have been trained on vast amounts of existing code (including the very projects they are "recreating"). The indemnification is a marketing gimmick to make a legally dangerous service seem safe. It creates a facade of protection while ensuring that any financial liability stays with you, the customer who wants to avoid infringement.
An interesting aspect of this, especially their blog post ( https://malus.sh/blog.html ), is that it acknowledges a strain in our legal system I've been observing for decades, but don't think the legal system or people in general have dealt with, which is that generally *costs matter*.

A favorite example of mine is speed limits. There is a difference between "putting up a sign that says 55 mph and walking away", "putting up a sign that says 55 mph and occasionally enforcing it with expensive humans when they get around to it", and "putting up a sign that says 55 mph and rigidly enforcing it to the exact mph through a robot". Nominally, the law is "don't go faster than 55 mph". Realistically, those are three completely different policies in every way that matters.

We are all making a continual and ongoing grave error: taking what were previously de jure policies that were de facto quite different in the real world, and thoughtlessly "upgrading" the de jure policies directly into de facto policies without realizing that that is in fact a huge change in policy. One that nobody voted for, one that no regulator even really thought about, one that we are just thoughtlessly putting into place because "well, the law is 55 mph" without realizing that, no, in fact that never was the law before. That's what the law *said*, not what it *was*. In the past those could never really be the same thing. Now, more and more, they can. This is a big change!

Cost of enforcement matters. The exact same nominal law that is very costly to enforce has completely different costs and benefits than that same law becoming all but free to rigidly enforce. And without very many people consciously realizing it, we have centuries of laws that were written with the subconscious realization that enforcement is difficult and expensive, and that the discretion of that enforcement is part of the power of the government.
Blindly translating those centuries of laws into rigid, free enforcement is a terrible idea for *everyone*. Yet we still have almost no recognition that this is an issue. This could, perhaps surprisingly, be one of the first places we directly grapple with it in a legal case someday soon: that the legality of something may be at least partially influenced by the expense of the operation.