It is weird to read because they bring up many things a lot of people have been critiquing for years.

> But as impressive as these feats are, they obscure a simple truth: being a "test-taker" is not what most people need from an AI.

> In all these cases, humans aren't relying solely on a fixed body of knowledge learned years ago. We are learning, in real-time, from the context right in front of us.

> To bridge this gap, we must fundamentally change our optimization direction.

I'm glad the conversation is changing, but it's been a bit frustrating that when these issues were brought up, people blindly pointed to benchmarks. That made this type of research difficult to do (difficult enough that many were pushed out).

Then it feels weird to say "harder than we thought" because, well... truthfully, they even state why this result should be expected:

> They rely primarily on parametric knowledge—information compressed into their weights during massive pre-training runs. At inference time, they function largely by recalling this static, internal memory, rather than actively learning from new information provided in the moment.

And that's only a fraction of the story. Online algorithms aren't enough. You still need a fundamental structure to codify and compress information, to determine what needs to be updated (that is, what is low confidence), to actively seek out new information to update that confidence, to make hypotheses, and so much more.

So I hope the conversation keeps going in a positive direction, but I hope we don't fall into an "RL will solve everything" trap. RL is definitely a necessary component and it will no doubt result in improvements, but it also isn't enough.

It's really hard to do deep introspection into how you think. It's like trying to measure your measuring stick with your measuring stick. It's so easy to get caught up in oversimplification, and it seems like the brain wants to avoid that kind of introspection.
To quote Feynman: "The first principle is that you must not fool yourself, and you are the easiest person to fool." It's even easier when things are exciting. It's so easy because you have evidence for your beliefs (like I said, RL will make improvements). It's so easy because you're smart, and smart enough to fool yourself.

So I hope we can learn a bigger lesson: learning isn't easy, and scale is not enough. I really do think we'll get to AGI, but it's going to be a long, bumpy road if we keep putting all our eggs in one basket and hoping there are simple solutions.
> You're beating it around the bush going offtopic and ignoring my question:

No I'm not, you just don't like the answer. But at least you've edited to remove the "3 guys yelling at you" portion, as I think even you can see how that might be a reasonable thing to do to a creep going around your business filming everything.

> daycare having a misspelled sign and boarded up windows?

The answer to this question is simple: a poor one. And I suspect that a daycare that primarily gets its funds from people using government welfare likely isn't rolling in the dough. Broken windows are expensive to fix; boards are cheap. A misspelled sign is embarrassing, but again it could easily be something that the owner of the facility just couldn't be assed to pay to replace and properly fix.

My spouse worked for years in that sort of daycare, which is why it's unsurprising to me that a daycare in that state exists. She, for example, did a full summer in Utah without AC while the kids were fed baloney sandwiches every day. Hers wasn't a daycare committing fraud; it was just an owner cutting costs at every corner to make sure their own personal wealth wasn't impacted.

A shitty daycare isn't an indicator of fraud. It's an indicator that the state has low regulation standards for daycares. Lots of states have that, and a lot of these places end up staying in operation because states decide that keeping an F-grade daycare open is cheap and better for the community than closing it for being crap quality. They certainly don't often want to take control of such a business, and they know a competently run one isn't likely to replace it if it is shut down.
Call me cynical, but from the comments online, I think this research is being used as marketing propaganda to push for MV3 adoption: to reassure lay users that all the hue and cry is for nothing and they shouldn't abandon Chrome, and to give other browsers that are holding out an excuse to embrace MV3 completely. (Don't be surprised if Firefox suddenly changes its mind and decides to fully embrace MV3 and drop MV2.)

Here is the actual research - Privacy vs. Profit: The Impact of Google's Manifest Version 3 (MV3) Update on Ad Blocker Effectiveness - https://arxiv.org/html/2503.01000v2

> This study comes with several limitations ... this study faces limitations related to the dynamic behavior of websites and extensions ... Second, consistent with previous research, the automated browser-based experiment does not replicate all aspects of user interactions, such as visiting sub-pages of a website’s homepage, which introduce different trackers than the website’s homepage ... Third, the study explores ad blocker effectiveness using a European IP address and default ad blocker settings on a select sample of websites, focusing on display ads. As a result, the web’s long tail is under-represented. Consequently, we cannot rule out that MV3’s 30,000-rule limit could disproportionately affect less popular websites that rely on rarely used rule ... Because EU traffic often involves GDPR-driven website changes that reduce third-party activity irrespective of consent, baseline ad and tracker counts are lower ... Fourth, our study cannot exclude that any future changes to the Chrome extension ecosystem might hinder ad blocker effectiveness, as feared by some ad blocker providers ...

Note that the intent of this study was to determine whether MV3 ad blockers are as effective as MV2-based ad blockers. Nobody has said that MV3 cannot be used to block ads or trackers.
What critics of MV3 have always said is that it makes blocking harder than MV2 did, and that the difficulty of implementing dynamic filtering with it gives advertisers the upper hand over ad blockers. When the ad-block extension developers tell you that MV3 cripples their extension and makes it harder to block ads and trackers in your browser, who are you going to believe: the experts at blocking ads and trackers, or a corporation that makes its money from data harvesting and online advertising and has a real incentive to cripple ad blocking in browsers?
The Monad Called Free (blog.sigfpe.com)
It's also important to note that in Haskell and other pure functional programming languages, there is no implied order of operations. You need a Monad type in order to express that certain things are supposed to happen after other things. Monads can also express that certain things happen "in between" two operations, which is why we have different kinds of Monads and mathematical axioms for what they're all supposed to do.

Outside of FP, however, this seems really stupid. We're used to operations that happen in the order you wrote them, and function applications that just so happen to also print things to the screen or send bits across the network. If you live in this world, like most people do, then "flatmap" is a good metaphor for Monads because that's basically all they do in an imperative language[1].

Well, that, and async code. JavaScript decided to standardize on a Monad-shaped "thenable" specification for representing asynchronous processes, where most other programming languages would have gone with green threads or some other software-transparent async mechanism. To be clear, it's better than the callback soup you'd normally have[0], but working with bare thenables is still painful. Just like working with bare Monads, which is why Haskell and JavaScript both have syntax to work around them (async/await, do, etc.).

Maybe/Either get talked about because they're the simplest Monads you can make, but that makes Monads sound like a spicy container type.

[0] The FP people call this "continuation-passing style"

[1] To be clear, Monads don't have to be list-shaped and most Monads aren't.
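To make the "flatmap" and thenable comparison concrete, here is a rough TypeScript sketch, not a definitive treatment. Every name in it (`Maybe`, `flatMapMaybe`, `parsePort`, `lookupUserId`, `lookupScore`) is made up for illustration; only `Array.prototype.flatMap`, `Promise.prototype.then` and `async`/`await` are real language features. It shows the same "run the next step on the result of the previous one" shape three times: on a list, on a hand-rolled Maybe, and on a Promise.

```typescript
// 1. Lists: flatMap maps each element to an array, then flattens one level.
const pairs: number[] = [1, 2, 3].flatMap((n) => [n, n * 10]);

// 2. A hypothetical minimal Maybe with the same flatMap shape.
type Maybe<T> = { kind: "just"; value: T } | { kind: "nothing" };

const just = <T>(value: T): Maybe<T> => ({ kind: "just", value });
const nothing = <T>(): Maybe<T> => ({ kind: "nothing" });

// Run f only if there is a value; otherwise short-circuit to nothing.
function flatMapMaybe<A, B>(m: Maybe<A>, f: (a: A) => Maybe<B>): Maybe<B> {
  return m.kind === "just" ? f(m.value) : nothing<B>();
}

function parseIntMaybe(s: string): Maybe<number> {
  const n = Number.parseInt(s, 10);
  return Number.isNaN(n) ? nothing<number>() : just(n);
}

function validPort(n: number): Maybe<number> {
  return n > 0 && n < 65536 ? just(n) : nothing<number>();
}

// Two steps chained: the second only runs if the first produced a value.
function parsePort(s: string): Maybe<number> {
  return flatMapMaybe(parseIntMaybe(s), validPort);
}

// 3. Promises: .then plays the flatMap role for asynchronous steps.
// These two lookups are stand-ins that resolve immediately.
function lookupUserId(userName: string): Promise<number> {
  return Promise.resolve(userName.length);
}

function lookupScore(id: number): Promise<number> {
  return Promise.resolve(id * 2);
}

// Bare thenable chaining: a callback returning a Promise gets flattened.
function viaThen(userName: string): Promise<number> {
  return lookupUserId(userName).then((id) => lookupScore(id));
}

// The same sequencing written with the async/await syntax.
async function viaAwait(userName: string): Promise<number> {
  const id = await lookupUserId(userName);
  return lookupScore(id);
}
```

`viaThen` and `viaAwait` do the same thing; the second is just the sugared form, in the same way Haskell's `do` notation is sugar over chained binds. One caveat on the analogy: `.then` also accepts callbacks that return plain values, so it is only Monad-shaped rather than a strict bind.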
I always find this 'character' argument disingenuous. The character of the neighbourhood is only invoked for perceived negative externalities. No one complains when the cracked sidewalks get repaved, when fiber internet lines replace slow copper, when increasing affluence means that houses are better maintained, or when a new sewer line allows people to remove septic tanks. All of that changes the character of a neighbourhood, but it never gets fought.

Go ahead and commit to the bit, lock in on the character in ALL ways: make sure you fight any alteration to any building; any change in the shade of paint should be fought! Your neighbour replacing their front door? Denied! Replacing a concrete driveway with pavers? Unacceptable! Replacing incandescent bulbs with LED? Uncharacteristic! Increasing home values changing who can afford to live there? Not acceptable, gotta sell your home for what you paid to maintain the character!

> If someone moves into an area that is zoned for particular types of properties, then new zoning is imposed by outside fiat (not a vote of the people who live there) is not appropriate.

How small are we going to allow the "area" to be defined? Is it one vote per property owner, or one vote per resident? Can we call a block an area? Who decides the arbitrary boundaries? Do people living on the boundary line get to vote on projects in adjacent properties in adjacent jurisdictions?

Just call NIMBYism what it is: a selfish justification for control of other people's property. Your position is, explicitly, that other people and property owners should be made less well off for your comfort. "The Character of the Neighbourhood" is a red herring.