My current expectation is that the Cowork/Codex set of "professional agents" for non-technical users, i.e. agents for knowledge workers who are not software engineers, will be one of the most important and fastest-growing product categories of all time, so far.

A few thoughts and questions:

1. I expect that this set of products will be extremely disruptive to many software businesses. It's like when a new VP joins a company: they often rip and replace some of the software vendors with their personal favorites. Most software was designed for human users. Now, people's agents will use software for them. Agents have different needs for software than humans do. Some they'll need more of, much they'll no longer need at all. What will this result in? It feels like a much swifter and more significant version of Google taking excerpts/summaries from webpages and putting them at the top of search results, taking away visits and ad revenue from sites.

2. I've tried dozens of products in this space. For most, onboarding is confusing, then the user gets dropped into a blank space, usage limits are uncompetitive compared to the subsidized tokens offered by OpenAI/Anthropic, etc. It's a tough space to compete in, but also clearly going to be a massive market. I'm expecting big investment from Microsoft, Google, etc. in this segment.

3. How will startups in this space compete against labs who can train models to fit their products?

4. Eventually, will the UI/interface be generated/personalized for the user by the model? Presumably. Harnesses get eaten by model-generated harnesses?
A few more thoughts collected here: https://chrisbarber.co/professional-agents/

Products I've tried: AI browsers like dia, comet, claude for chrome, atlas, and dex; claw products like openclaw, kimi claw, klaus, viktor, duet, atris; automation things like tasklet and lindy; code agents like devin, claude code, cursor, codex; desktop automation tools like vercept, nox, liminary, logical, and raycast; and email products like shortwave, cora and jace. And of course, Claude Cowork, Codex CLI and app, and Claude Code CLI and app.

Edit: Notes on trying the new Codex update

1. The permissions workflow is very slick.

2. Background browser testing is nice and the shadow cursor is an interesting UI element. It did do some things in the foreground for me / take control of focus a few times, though.

3. It would be nice if the apps had quick ways to demo their new features. My workflow was to ask an LLM to read the update page and tell me what new things I could test, and then to ask Codex to demo those things to me, but it doesn't quite understand its own new features well enough to invoke them (without quite a bit of steering).

4. I cannot get it to show me the in-app browser.

5. Generating image mockups of websites and then building them is nice.
> My current expectation is that the Cowork/Codex set of "professional agents" for non-technical users will be one of the most important and fastest growing product categories of all time, so far.

I agree this is going to be big. I threw a prototype of a domain-specific agent into the proverbial hornets' nest recently and it has altered the narrative about what might be possible.

The part that makes this powerful is that the LLM is the ultimate UI/UX. You don't need to spend much time developing user interfaces and testing them against customers. Everyone understands the affordances around something that looks like iMessage or WhatsApp. UI/UX development is often the most expensive part of software engineering.

Figuring out how to intercept, normalize and expose the domain data is where all of the magic happens. Compared with UI/UX work, this part is usually trivial. If most of the business lives in SQL databases, your job is basically done for you: one tool to list the databases and another tool to execute queries against them. That's basically it.

I think there is an emerging B2B/SaaS market here. There are businesses that want bespoke AI tools and don't have the discipline to deploy them in-house. I don't know if it is ever possible for OAI & friends to develop a "hyper" agent that can produce good outcomes here automatically. There are often people problems that make connecting the data sources tricky. Having a human consultant come in and make a case for why they need access to everything is probably more persuasive and likely to succeed.
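To make the "two tools" idea concrete, here is a minimal sketch, assuming SQLite via Python's standard `sqlite3` module and an in-memory demo database. The names `make_demo_registry`, `list_databases`, `execute_query`, the `sales` database and the `orders` table are all hypothetical, made up for illustration; the comment above does not say which database or agent framework was used, and a real deployment would need proper permissions instead of the crude SELECT-only check shown here.

```python
import sqlite3


def make_demo_registry():
    """Build a hypothetical registry: database name -> open connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [("acme", 120.0), ("globex", 75.5)],
    )
    conn.commit()
    return {"sales": conn}


def list_databases(registry):
    """Tool 1: return every known database with the names of its tables."""
    result = {}
    for name, conn in registry.items():
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        result[name] = [row[0] for row in rows]
    return result


def execute_query(registry, database, sql, max_rows=100):
    """Tool 2: run a query and return column names plus at most max_rows rows.

    Assumption for this sketch: only statements starting with SELECT are
    accepted. This is a crude guard, not a security boundary.
    """
    if not sql.lstrip().lower().startswith("select"):
        raise ValueError("only SELECT statements are allowed in this sketch")
    cursor = registry[database].execute(sql)
    columns = [description[0] for description in cursor.description]
    rows = cursor.fetchmany(max_rows)
    return {"columns": columns, "rows": [list(row) for row in rows]}


if __name__ == "__main__":
    registry = make_demo_registry()
    print(list_databases(registry))
    print(execute_query(registry, "sales", "SELECT customer, total FROM orders"))
```

An agent would typically call `list_databases` first to discover the schema, then issue queries through `execute_query`; returning plain dicts and lists keeps the results easy to serialize back to the model.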
Let's say we take Anthropic's security and alignment claims at face value, and they have models that are really good at uncovering bugs and exploiting software. What should Anthropic do in this case?

Anthropic could immediately make these models widely available. The vast majority of their users just want to develop non-malicious software. But some non-zero portion of users will absolutely use these models to find exploits and develop ransomware and so on. Making the models widely available forces everyone developing software (e.g., whatever browser and OS you're using to read HN right now) into a race where they have to find and fix all their bugs before malicious actors do.

Or Anthropic could slow-roll their models. Gatekeep Mythos to select users like the Linux Foundation and so on, and nerf Opus so it does a bunch of checks to make it slightly more difficult to have it automatically generate exploits. Obviously, they can't entirely stop people from finding bugs, but they can introduce some speedbumps to dissuade marginal hackers. Theoretically, this gives maintainers some breathing space to fix outstanding bugs before the floodgates open.

In the longer run, Anthropic won't be able to hold back these capabilities because other companies will develop and release models that are more powerful than Opus and Mythos. This is just about buying time for maintainers.

I don't know that the slow release model is the right thing to do. It might be better if the world suffers through some short-term pain of hacking and ransomware while everyone adjusts to the new capabilities. But I wouldn't take that approach for granted, and if I were in Anthropic's position I'd be very careful about opening the floodgates.
If the vendors of programs do not want bugs to be found in their programs, they should search for them themselves and ensure that there are no such bugs.

The "legit security firms" have no right to be considered more "legit" than any other human for the purpose of finding bugs or vulnerabilities in programs.

If I buy and use a program, I certainly do not want it to have any bug or vulnerability, so it is my right to search for them. If the program is not commercial, but free, then it is also my right to search for bugs and vulnerabilities in it.

I might find it acceptable not to search for bugs or vulnerabilities in a program only if the authors of that program assumed full liability in perpetuity for any kind of damage that would ever be caused by their program, in any circumstances. That is the opposite of what almost every software company currently does, which is to disclaim all liability.

There exists absolutely no scenario where Anthropic has any right to decide who deserves to search for bugs and vulnerabilities and who does not.

If someone uses tools or services provided by Anthropic to perform some illegal action, then such an action is punishable under existing laws, and that does not concern Anthropic any more than a vendor of screwdrivers should be concerned if someone used one as a tool during some illegal activity.

I am really astonished by how much younger people are willing to put up with behaviors from modern companies that anyone would have considered absolutely unacceptable a few decades ago.
My parents made a home in a nice suburban neighborhood, where today some good restaurants and a coffeehouse are in walking distance, and grocery shopping is a short car ride. Yet we grew up still rather attached to neighborhoods further away, where our schools and grandparents were. There was no possibility of bicycles or “kid power” to reach there; Mom and Dad always, always drove us everywhere!

Today I find myself in an urban hellscape without owning a vehicle. Nothing is walkable. I am crammed in, thanks to Equal Housing, with immigrants and people of utterly alien races and cultures (I consider myself the minority.) If I expect to find people like me or shop within my demographic, nothing is adjacent and it’s all several miles’ worth of transportation.

Car culture and forced integration have fragmented every possible family unit that could have been cohesive or collectivist. If I am celebrating a religious or cultural festival, I can count on none of my neighbors sharing that celebration, or in fact on their raising conflicts on the days most sacred to me.

Anywhere I may choose to walk, or even if I drive, I am trudging through vast empty parking lots of asphalt because of cars. The roads are laid out for cars. A cop told me yesterday I shouldn’t ride my e-scooter at 17mph in the street but on the sidewalk. Every motorist also hates those scooters, whether in motion or properly parked. Every motorist also hates the light rail train, and hate for Waymo is fomented by motorist and pedestrian alike.

There is no place I could move to or live that would change this equation in any useful way. I do not hate cars, but I hate what they have done to our lives and our landscape.