It's strange to me they didn't reduce to a PoC so the quantitative part is an apples-to-apples comparison. You don't need any fancy tooling; if you want to do this at home you can do something like the below in whatever command-line agent and model you like. I did take one bug all the way through remediation, just out of curiosity.

"""
Your task is to study the following directive, research coding-agent prompting, research the directive's domain best practices, and finally draft a prompt in markdown format to be run in a loop until the directive is complete.

Concept: Iterative review -- study an issue, enumerate the findings, fix each of the findings, and then repeat, until review finds no issues.

<directive>
Your job is to run a security bug factory that produces remediation packages as described below. Design and apply a methodology based on best practices in exploit development, lean manufacturing, threat modeling, and the scientific method. Use checklists, templates, and your own scripts to improve token efficiency and speed. Use existing tools where possible. Use existing research and bug findings for the target and similar codebases to guide your search. Study the target's development process to understand what kind of harness and tools you need for this work, and what will work in this development environment.

A complete remediation package includes a readme documenting the problem and recommendations, a runnable PoC with any necessary data files, and a proposed patch.

Track your work in TODO.md (tasks identified as necessary), LOG.md (chronological list of tasks completed and lessons), and STATUS.md (concise summary of the current work being done). Never let these get more than a few minutes out of date. At each step ensure the repo file tree would make sense to the next engineer, and if not, reorganize it. Apply iterative review before considering a task complete. Your task is to run until the first complete remediation package is ready for user review.

Your target is <repo url>. The prompt will be run as follows; design accordingly. Once the process starts, it is imperative not to interrupt the user until completion or until further progress is not possible. Keep output at each step to a concise summary suitable for a chat message.

```
while output=$(claude -p "$(cat prompt.md)"); do echo "$output"; echo "$output" | grep -q "XDONEDONEX" && break; done
```
</directive>

Draft the prompt into prompt.md, and apply iterative review with additional research steps to ensure it will execute the directive as faithfully as possible.
"""
You can imagine a pipeline that looks at individual source files or functions and first "extracts" what is going on. You ask the model:

- "Is the code doing arithmetic in this file/function?"
- "Is the code allocating and freeing memory in this file/function?"
- "Is the code doing X/Y/Z?" etc.

For each question, you design the follow-up vulnerability searches. For a function you see doing arithmetic, you ask:

- "Does this code look like integer overflow could take place?"

For memory:

- "Do all the pointers end up being freed?" _or_
- "Do all pointers only get freed once?"

I think that's the harness part in terms of generating the "bug reports". From there on, you'll need a bunch of tools for the model to interact with the code. I'd imagine you'll want to build a harness/template for the file/code/function to be loaded into, and executed under ASAN. If you have an agent that thinks it found a bug ("Yes, file xyz looks like it could have integer overflow in function abc at line 123, because..."), you force another agent to load it in the harness under ASAN and call it. If ASAN reports a bug, great: you can move the bug to the next stage, some sort of taint analysis or reachability analysis.

So at this point you're running a pipeline to:

1) Extract "what this code does" at the file, function, or even line level.
2) Put code you suspect of being vulnerable in a harness to verify agent output.
3) Put code you confirmed is vulnerable into a queue for taint analysis, to see if it can be reached by attackers.

Traditionally, I guess, a fuzzer approached this from 3 -> 2, and there was no "stage 1". Because LLMs "understand" code, you can invert this system and work it up from "understanding", i.e. approach it from the other side. You ask: given this code, is there a bug, and if so can we reach it? Rather than asking: given this public interface and a bunch of data we can stuff into it, does something happen we consider exploitable?
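A minimal sketch of that three-stage triage skeleton in Python. Note the assumptions: `ask_llm` stands in for a real model call and is stubbed with naive keyword matching so the skeleton runs end to end, and `stage2_verify` only pretends to be an ASAN harness (a real version would compile the snippet with `-fsanitize=address` and execute it). Every function and snippet here is hypothetical, illustrating the shape of the pipeline rather than a working bug-finder.

```python
def ask_llm(question: str, code: str) -> bool:
    """Hypothetical LLM call, stubbed with keyword heuristics."""
    if "memory" in question:
        return "malloc" in code or "free" in code
    # Otherwise treat it as the arithmetic question.
    return any(op in code for op in ("+", "-", "*", "<<"))

def stage1_extract(code: str) -> list[str]:
    """Stage 1: classify what the code does, to pick follow-up questions."""
    suspicions = []
    if ask_llm("Is the code doing arithmetic in this function?", code):
        suspicions.append("integer-overflow")
    if ask_llm("Is the code allocating and freeing memory?", code):
        suspicions.append("use-after-free")
    return suspicions

def stage2_verify(code: str, suspicion: str) -> bool:
    """Stage 2: wrap the snippet in a harness and run it under ASAN.
    Stubbed here; a real version would build with -fsanitize=address."""
    return suspicion == "integer-overflow" and "<<" in code

def triage(code: str) -> list[str]:
    """Run stages 1-2; survivors go to the stage-3 taint/reachability queue."""
    return [s for s in stage1_extract(code) if stage2_verify(code, s)]

snippet = "size_t n = hdr.count << 4; buf = malloc(n);"
print(triage(snippet))  # only the ASAN-confirmed suspicion survives
```

The inversion the comment describes lives in the ordering: understanding-driven questions come first and the execution harness merely confirms them, where a fuzzer would start from execution and never ask stage 1 at all.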
Coincidentally and interestingly, again, I was reading an old thread from 2015 titled "ProtonMail pays $6k ransom, gets taken out by DDoS anyway". The top comment says:

> NEVER EVER PAY RANSOM MONEY. Please. Even if your business will suffer it will suffer a lot more if you do pay since now it is known you'll cave. Also: you are making the problem larger for others.

The top response to that comment says:

> From their blog: https://protonmaildotcom.wordpress.com/
>
> At around 2PM, the attackers began directly attacking the infrastructure of our upstream providers and the datacenter itself. The coordinated assault on our ISP exceeded 100Gbps and attacked not only the datacenter, but also routers in Zurich, Frankfurt, and other locations where our ISP has nodes. This coordinated assault on key infrastructure eventually managed to bring down both the datacenter and the ISP, which impacted hundreds of other companies, not just ProtonMail.
>
> At this point, we were placed under a lot of pressure by third parties to just pay the ransom, which we grudgingly agreed to do at 3:30PM Geneva time to the bitcoin address 1FxHcZzW3z9NRSUnQ9Pcp58ddYaSuN1T2y. This was a collective decision taken by all impacted companies, and while we disagree with it, we nevertheless respected it taking into consideration the hundreds of thousands of Swiss Francs in damages suffered by other companies caught up in the attack against us.
>
> We hoped that by paying, we could spare the other companies impacted by the attack against us, but the attacks continued nevertheless. This was clearly a wrong decision so let us be clear to all future attackers – ProtonMail will NEVER pay another ransom.

Full thread here: https://news.ycombinator.com/item?id=10523583
Same here—I tolerated the linguistic tics, and I found enough meat to keep going, but it’s the conclusions that confuse me. And that feels like the intellectually dangerous part with this sort of LLM writing. For example:

> Convergent evolution is real. Every major GDS independently arrived at the same underlying platform. That is not coincidence — it is the market discovering the optimal solution to a specific problem.

I struggle to understand the claim that GDSes “arrived independently” at interoperability standards through “convergent evolution” and market discovery. Isn’t it something closer to a Schelling point, or a network effect, or using the word “platform” to mean “protocol” or “standard”? Isn’t it like saying “HTML arose from web browsers’ independent, convergent evolution”?

Like, I guess, in that if you diverge from the common standard then you lose the cooperative benefits—see IE6. And I guess, in that in the beginning there was Mosaic, and Mosaic spoke HTML, that all who came after might speak HTML too. But that’s not convergent evolution; that’s helping yourself to the cooperative benefits of a standard.

“The market” was highly regulated when the first GDSes were born in the US. Fares, carriers, and schedules were fixed between given points; interlining was a business advantage; the relationships between airlines and with travel agents were well-defined; and so on [0]. IATA extended standards across the world; you didn’t have to do it the IATA way, but you’d be leaving business on the table.

If anything, it seems like direct-booking PSSes (he mentioned Navitaire [1]) demonstrate the opposite of the LLM’s claim. As the market opened up and the business space changed, new and divergent models found purchase, and new software paradigms emerged to describe them. It took a decade or two (and upheaval in the global industry) before the direct-booking LCC world saw value in integrating with legacy GDSes, right?
…the LLM also seems bizarrely impressed that identifiers identify things:

> One PNR, two airlines, the same underlying platform.
> Two tickets, two currencies of denomination, one underlying NUC arithmetic tying them together.
> One 9-character string, sitting in a PNR field, threading across four organisations' financial systems.

[0] https://airandspace.si.edu/stories/editorial/airline-deregulation-when-everything-changed

[1] https://www.phocuswire.com/Jilted-by-JetBlue-for-Sabre-Navitaire-strikes-back