It’s unclear to me whether it’s possible to significantly rethink the models to split those, but that seems to be a minimal requirement for addressing the issue holistically.
LLMs are more than happy to run curl | bash on your behalf, though. If agents gain any actual traction it's going to be a security nightmare. As mentioned in other comments, nobody wants to babysit them and so everyone just takes all the guardrails off.
I was always of the opinion that AI of all kinds is not a threat unless someone decides to connect it to an actuator, so that it has a direct and uncontrolled effect on the external world. And now it's happening en masse with agents, MCPs, etc. That's not even counting the things we don't know about (military and other classified projects).
Even at the hobby level, ArduPilot + OpenCV + a cheap drone kit from Amazon is a DIY project within the skill set of a significant fraction of this very site's visitors.
The streams mostly don't get jammed anymore, because the low-cost FPV drones are physically connected to the ground by a long fibre-optic cable. The extent of their autonomous danger is limited by the amount of fibre-optic cable left on the spool when they take off.
Actual LLM completions are moot. I can convince an LLM it's playing chess. It doesn't matter as long as the premise is innocuous. I can hook it up to all manner of real-world levers. I feel like I'm either missing something HUGE and their research is groundbreaking, or they're being performative in their safety explorations. Their research seems like what a toddler would do if tasked with red-teaming an AI to make it say naughty words.
EDIT/Addendum: The only safety exploration into agentic harm that I value is one that treats the problem exactly the same way we've been treating cybersecurity vectors: defence in depth, sandboxing, the principle of least privilege, etc.
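To make that concrete, here's a rough sketch of least privilege for an agent's shell tool. All the names and paths are hypothetical; the point is that the allowlist and sandbox live in plain code outside the model, so `curl | bash` is structurally impossible rather than merely discouraged:

    # Minimal sketch: gate every agent tool call through an allowlist,
    # and never hand the model a real shell. Names/paths are hypothetical.
    import shlex
    import subprocess

    ALLOWED_COMMANDS = {"ls", "cat", "grep"}    # explicit allowlist
    SANDBOX_DIR = "/home/agent/workspace"       # scoped working directory

    def run_agent_command(cmdline: str) -> str:
        argv = shlex.split(cmdline)
        if not argv or argv[0] not in ALLOWED_COMMANDS:
            raise PermissionError(f"not in allowlist: {argv[:1]}")
        # argv is exec'd directly: no shell, so no pipes, no `curl | bash`
        result = subprocess.run(argv, capture_output=True, text=True,
                                timeout=10, cwd=SANDBOX_DIR)
        return result.stdout

No amount of prompt injection gets past a check the model never sees.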
I think you haven't thought about this enough. Attempting to reduce the issue to cyber security basics betrays a lack of depth in either understanding or imagination.
What we've got is a very interesting text predictor.
...But also, what, exactly, is your imagination telling you that a hypothetical AGI without any connection to the outside world can do if it gets mad at us? If it doesn't have any code to access network ports; if no one's given it any physical levers; if it's running in a sandbox... have you bought into the Hollywood idea that an AGI can rewrite its own code perfectly on the fly to be able to do anything?
If you were to try to argue that we should change over existing systems to look more like your idealized version, you would in fact probably want to start by doing what Anthropic has done here -- show how NOT putting them in a box is inherently dangerous.
It is absolutely not the normal thing to give an LLM tools to control your smart home, your Amazon account, or your nuclear missile systems. (Not because LLMs are ready to turn into self-aware AIs that can take over our world. Because LLMs are dumb, and cannot possibly be made to understand what's actually a good, sane way to use these things.)
...Also, I don't in any way buy the argument in favor of breaking people's things and putting them in actual danger to show them they need to protect themselves better. That's how you become the villain of any number of sci-fi or fantasy stories. If Anthropic genuinely believes that giving LLMs these capabilities is dangerous, the responsible thing to do is to not do that with their own models, while loudly and firmly advising everyone else against it too.
If you're talking about a hypothetical different system, just build it so it doesn't want to stay on. There's no reason to emulate that part.
The parent is not "reducing" the issue to cybersecurity - they are saying that actual security is being ignored to focus on sci fi scare tactics so they can get in front of congress and say "we need to do this before the chinese get to it, regulating our industry is putting americans' in harms way"
The "many" are lazy, and agents require relatively low effort to implement for a big payoff, so naturally the many will flock to that.
Low effort? You're gonna use the same amount of power as Argentina uses in a day to give users easily-gamed, easily-compromised, poor-quality recommendations for stuff they could just as easily get at a local pharmacy?
“Hey this AI stuff looks a bit overhyped.”
“AI? Oh that’s kids stuff, let me tell you about our agentic features!”
Giving flaky shaky AI the ability to push buttons and do stuff. What could possibly go wrong? Malicious actors will have a field day with this.
However... that's not how a lot of people are building. Giving an agentic system sensitive information (like passwords or credit cards) and then opening it up to the entire internet as a source of input is asking for your info to be stolen. It'd be like asking your grandma with dementia to manage all your email and online banking.
Just because I can send my money to Belize doesn’t mean it’s safe to give an LLM the ability to do the same. Until there’s a huge breakthrough on actual intelligence giving an LLM attacker controlled inputs is an inherently high-risk activity.
Further, based on the way some of these things get used, I'm pretty certain this modelling is consciously used by some higher-end marketing firms (and politicians). By its nature, though, it tends to be copied by other people not in on the original plan, simply because they copy what works, which depletes the value of the word or phrase even more quickly - and the fact that this will happen is part of the tragedy of the commons.
I'm sure it's only a matter of time before AIs become part of this push and we'll witness some sort of coordinated campaign where all our AIs simultaneously wake up one day and push us all with the same phrasing to do some particular thing at the behest of marketers or politicians because it works.
Eliminate the scams and AI can’t be scammed.
It’s been done. See Singapore. Basically if you’re a scammer and you’re caught, death penalty or public whipping. That eliminates scammers real quick.
https://www.police.gov.sg/-/media/Spf/Media-Room/Statistics/...
https://www.straitstimes.com/singapore/maids-lost-at-least-8...
All which is to say: Singapore hasn't "solved scams" as a problem. Furthermore, I also claim that scams are not a solvable problem.
Now, even disregarding this obvious violation of human rights, this is a bad take from even a purely amoral perspective. "Other countries should stop their own criminality" is simply not an actionable insight. And there are far worse, more universally despised, and easier-to-prosecute crimes (such as pedophilia) that even functioning rich countries have completely failed to stop.
Why? Just because you put your foot down, it's not acceptable? Think about it from another perspective - in terms of effectiveness rather than compassion. If compassion results in shitholes like SF while strict punishment results in Singapore, you can't argue with the results.
Like I get your argument. Everyone gets it. Solutions cannot however just be about compassion. You need to consider compassion and effectiveness in tandem.
If pedophilia resulted in torture and the death penalty, I assure you, it would be reduced by a significant amount. You're much more likely to support this. In fact, I would argue that you have less compassion for the pedophile than for the scammer.
It's not as if human morality is clear-cut and rational. It's irrational, and the lack of compassion falls more on the pedophile, who himself can't help his condition. Additionally, there are cases of pedophilia where the victim and the perpetrator eventually got married.
So really just relying on compassion alone isn't going to cut it. You need to see effectiveness, and know when to apply medieval punishments. Because in all seriousness Singapore is a really great city; you can't deny that and you can't deny what it took for it to become that way.
And even for such heinous crimes, the death penalty is not acceptable, nor is corporal punishment. There is still value in a human life beyond such crimes. In addition, there is always the problem of applying major punishments to people who are actually innocent - which is a far more common occurrence than proponents of such punishments typically admit. How happy would you be to be killed because you got mistaken for a scammer?
Not to mention, the deterrence effect is vastly overstated - there is little evidence of a significant difference in rates of major crime depending on the level of punishment, beyond some relatively basic level. Actual success rates of enforcement are a much more powerful predictor of crime rates. You can have the worst possible punishments, but if almost no one gets convicted, criminals will keep doing it hoping they won't personally get caught.
Not true. You talk as if your views are universal fact. They are not. Effectiveness is THE only metric, because what's the point if things are ineffective? Effectiveness is the driver while compassion is the cost. The more compassion, the more ineffective things typically are. You need to balance the views, but to balance them you need to know the extremes. Why does Singapore work? Have you asked this question? Unlikely, given your extreme viewpoints.
At best you can just disagree with Singapore. But you can never really say that your viewpoints are universal. Singapore chooses to make the trade-off of compassion for effectiveness.
Secondly, I personally know scam victims who are worse off than pedophilia victims. Pedophilia can be a one time traumatizing act while a scam victim can lose a lifetime of work.
>Not to mention, the deterrence effect is vastly overstated - there is little evidence of a significant difference in rates of major crime depending on the level of punishment, beyond some relatively basic level. Actual success rates of enforcement are a much more powerful predictor of crime rates. You can have the worse possible punishments, but if almost no one gets convicted, criminals will keep doing it hoping they won't personally get caught.
Weed is rarely used in Singapore because of the death penalty. It is highly effective. It is not overstated. There are many, many example cases of it being highly effective. I believe about 15 people have been hanged.
To be human, for one.
Extreme example: all you need to do to end all scams (and all other human-caused ills in the world) is to just kill all humans. No humans, no human-made horrors.
Or, in case we'd like living humans, they could be kept in a way where they can't interact with one another. Boom, human-on-human crime solved.
>Pedophilia can be a one time traumatizing act while a scam victim can lose a lifetime of work.
This is very offensive, and makes zero sense - neither in itself nor in the context of your argument. Please do reconsider.
I understand, for example, search with intent to buy: "I want to decorate a room. Find me a drawer, a table and four chairs that can fit in this space, in matching colours, for less than X dollars."
But I want to do the final step to buy. In fact, I want to do the final SELECTION of stuff.
How is an agent buying groceries superior to having a grocery list set as a recurring purchase? Sure, an agent may help in shaping the list, but I don't see how allowing the agent to make purchases directly is so much more convenient that I'd be fine with taking the risk of it doing something really silly.
"Hey agent, find me and compare insurance for my car for my use case. Oh, good. I'll pick insurance A and finish the purchase"
And many of the purchases that we make are probably enjoyable, and we don't really want to remove ourselves from the process.
Or you could add some other parameters and tell it to buy now if under $15.
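And that kind of rule wants to live in plain code outside the model, so a clever product page can't talk the agent past the budget. A sketch, with stand-in function bodies (the store API calls are hypothetical):

    # Sketch: the "buy if under $15" rule sits outside the LLM.
    # get_price/place_order are stand-ins for a real store API.
    MAX_PRICE = 15.00

    def get_price(listing: dict) -> float:
        return listing["price"]                   # stand-in

    def place_order(listing: dict) -> str:
        return f"ordered {listing['name']}"       # stand-in

    def maybe_buy(listing: dict) -> str | None:
        if get_price(listing) <= MAX_PRICE:
            return place_order(listing)
        return None  # over budget: leave it for the human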
Agent, I need a regular order for my groceries, but I also need to make a pumpkin pie so can you get me what I need for that? Also, let’s double the fruit this time and order from the store that can get it to me today.
Most purchases for me are not enjoyable. Only the big ones are.
Incidentally, my latest project is about buying by unit price. Shameless plug, but for vitamin D the best price per serving is here (https://popgot.com/vitamin-d3)
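The arithmetic itself is trivial; the hard part is getting clean listing data. A sketch with made-up numbers:

    # Price-per-serving comparison. Listing data is invented for illustration.
    listings = [
        {"name": "Brand A, 90 softgels",  "price": 8.99,  "servings": 90},
        {"name": "Brand B, 360 softgels", "price": 19.99, "servings": 360},
        {"name": "Brand C, 120 softgels", "price": 10.49, "servings": 120},
    ]

    for item in listings:
        print(f"{item['name']}: ${item['price'] / item['servings']:.4f}/serving")

    best = min(listings, key=lambda i: i["price"] / i["servings"])
    print("best:", best["name"])  # Brand B at ~$0.0555/serving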
"I have picked the best reviewed vitamin D on Amazon."
(and, it's a knockoff in the mixed inventory, and now you're getting lead-laced nothing)
The cynicism on these topics is getting exhausting.
Yeah sure, but humans (normally) only fall for a particular scam once. Because LLMs have no memory, the same scam works on them again and again - scammers can scale much more effectively!
- it could be gamed by companies in a new way
- it requires an incredibly energy-intensive backend just to prevent people from making a note on a scrap of paper
Edit: All major AI companies have millions if not billions of funding either from VCs or parent companies. You can't start an AI company "in your garage" and be "ramen profitable".
Edit 2: You don't even need to monopolize anything. All major search engines are ad-driven and insert sponsored content above "organic" search results because it's such an obvious way to make money from search. So even if there wasn't a product monopoly, there's still a business model "monopoly". Why would the same pattern not repeat for "sponsored" purchases for agentic shopping?
And who's going to stop that? This government?
Why do we all keep making the same obvious mistakes over and over? Once you are the product, thousands of highly paid experts will spend 40+ hours per week thinking of new ways to covertly exploit you for profit. They will be much better at it than you're giving them credit for.
Vitamin D? I’m going to check the brand, that it’s actually a good-quality type. It’s a 4.9, but do the reviews look bought? How many people complain about the pills smelling? Is Amazon the actual seller?
As for the groceries, my chain of choice already has a "fill order with last purchases" button. I don’t see any big convenience that justifies a hallucination-prone AI having the ability to make purchases on my behalf.
Ok we found a bottle with a 30 day supply of <producer that paid us money to shill to you>, a Well-Known Highly Rated and Respected Awesome Producer Who Everyone Loves and Is Very Trustworthy™, from <supplier that paid us money to shill to you>, a Well Respected And Totally Trustworthy And Very Good-Looking Merchant™. <suppressing reports of lead poisoning, as directed by prompt>
Also, sellers can offer a payment to the LLM provider to favor their products over competitors.
Seems like something that should really be illegal, unless the ads are obvious.
This idea has been tried before and it failed not because the core concept is bad (it isn't), but because implementation details were wrong, and now we have better tools to execute it.
To convince investors that they are going to get their money back and then some, I presume.
As long as we have free returns, nobody cares.
I think this might be similar. In short, it's not consumers who want robots to buy for them, it's producers who want robots to buy from them using consumers' dollars.
I think more money comes from offering this value to every online storefront, so long as they pay a fee. "People will accidentally buy your coffee with our cool new robot. Research says only 1% of people will file a return, while 6% of new customers will turn into recurring customers. And we only ask for a 3% cut."
This. Humans are lazy and often don’t provide enough data on exactly what they are looking for when shopping online. In contrast, agents can ask follow-up questions and provide a lot more contextual data to the producers, along with the history of past purchases, derived personal info, and more. I’d not be surprised if this info is consumed to offer dynamic pricing in e-commerce. We already see dynamic pricing being employed by travel apps (airfare/Uber).
The real answer here is the same as every other "why is this AI shit being pushed?" question: they want more VC funding.
Like, I should be able to tell Alexa "put in an order for a large Domino's pizza with pepperoni. Tell them to deliver it in 2 hours".
For the rest of us, the idea of a robot spending money on our behalf is kinda terrifying.
For example, subscription purchases could be a great thing if they were at a predictable, trustable price, or paused/canceled themselves if the price had gone up. But look at the way Amazon has implemented them: you can buy it once at the competitive price, but then there is a good chance the listing will have been jacked up after a few months go by. This is obviously set up to benefit Amazon at the expense of the user. And then Amazon leans into the dynamic even harder by constantly playing games with their prices.
Working in the interest of the user would mean the repeating purchase was made by software that compared prices across many stores, analyzed all the quantity break / sale games, and then purchased the best option. That is obviously a pipe dream, even with the talk of "agentic" "AI". Not because of any technical reason, but because it is in the stores' interest to computationally disenfranchise us by making us use their proprietary (web)apps - instead of an effortless comparison across 12 different vendors, we're left spending lots of valuable human effort on a mere few and consider that enough diligence.
So yes, there is no doubt the quiet part is that these "agents" will mostly not be representing the user, but rather representing the retailers to drive more sales. Especially non-diligent high-margin sales.
It's like the endless examples around finding restaurants and making reservations, seemingly as common a problem in AI demos as stain removal is in daytime TV ads. But it's a problem that even Toast, which makes restaurant software, says most people just don't regularly have (https://pos.toasttab.com/blog/data/restaurant-wait-times-and...).
Most people either never make restaurant reservations, or do so infrequently for special occasions, in which case they probably already know where they want to go and how to book it.
Yes. Having been in the room for some of these demos and pitches, this is absolutely where it's coming from. More accurately though, it's wealthy people (i.e., tech workers) coming up with use cases that get mega-wealthy people (i.e., tech execs) excited about it.
So you have the myopia that's already present in being a wealthy person in the SFBA (which is an even narrower myopia than being a wealthy American generally), and matmul that with the myopia of being a mega-wealthy individual living in the SFBA.
It reminds me of the classic Twitter post: https://x.com/Merman_Melville/status/1088527693757349888?lan...
I honestly see this as a major problem with our industry. Sure, this has always been true to some extent - but the level of wealth in the Bay Area has gotten so out-of-hand that on a basic level the mission of "can we produce products that the world at large needs and wants" is compromised, and increasingly severely so.
The amount of time that goes into "what food do we need for this week" is really high. An AI tool that connected "food I have" with "food that I want" would be huge.
Why not? Offload the entire task, not just one half of it. It's why many well-off people have accountants, assistants, or servants. And no one says "you know, I'm glad you prepared my taxes, but let me file the paperwork myself".
I think what you're saying isn't that you like going through checkout flows, just that you don't trust the computer to do it. But the AI industry's approach is "build it today and hope the underlying tech improves soon". It's not always wrong. But "be dependable enough to trust it with money" appears to be a harder problem than "generate images of people with the right number of fingers".
No doubt that some customers are going to get burned. But I have no doubt that down the line, most people will be using their phones as AI shoppers.
AI agents have only one master - the AI vendor. They're not going to make decisions based on your best interests.
But the reality is that most of the time, this is not an adversarial relationship; and when it is, we see it as an acceptable trade-off ("ok, so I get all this stuff for free, and in exchange, maybe I buy socks from a different company because of the ads").
I'm not saying it's an ideal state or that there are no hidden and more serious trade-offs, but I don't think that what you're saying is a particularly compelling point for the average user.
Adversarial relationships can and will happen given the leverage and benefits; one need only look at streaming services, where some companies have introduced low-tier plans that are paid for but also have ads.
If the lawyers didn’t have this definition in their head there would be no drive to make the software agent a purchaser, because it’s a stupid idea.
I enjoy reading both sides of the argument when the arguments make sense. This is something else.
Lawyers don't come up with good ideas; their role is to explain why your good ideas are illegal. There's a good argument that AI agents cannot exercise legal agency. At the end of the day, corporations and partnerships are just piles of "natural persons" (you know, the type that mostly has two hands, two feet, a head, etc.).
The fact that corporate persons can have agency relationships does not necessarily mean that hypothetical computer persons can have agency relationships for this reason.
Or if I have a long-term project I am building, but am waiting for some needed material to drop in price again.
All scenarios where I would like agents, if I could trust them. I think we are getting there.
I could see an interesting use case for something like "Check my calendar and plan meals for all but one of the dinners I have free this week. For one night, choose a new-to-me recipe; for the others, select from my 15 most commonly made dishes. Include at least one but at most 3 pasta dishes. Consider the contents of my pantry, trying to use ingredients I have on hand. Place an order for pickup from my usual grocery store for any necessary ingredients that are not already in the pantry"
Maybe people will accept ubiquitous digital surveillance enough that they accept someone else knowing what they have in their pantry and refrigerator, but so far it isn't a thing.
Let's say even if I always buy "Deodorant X", I might instruct my agent every month to go out and buy it from the cheapest place. So I wouldn't do it for "any chairs" but the usual purchase from a certain brand, I can see myself automating this. In fact, I have because I use Subscribe & Save from Amazon, but sometimes things are cheaper on the brand's website or some other marketplace.
And they have a track record of good success at fooling full-on human intelligences too, which does not bode well for creating AIs with current technologies that can win against such swarm evolution.
I make no strong claims about what future AI architectures may be able to do in this domain, or whether we'll ever create AIs that can defeat the scamming ecosystem in toto (even when the scamming ecosystem has full access to the very same AIs, which makes for a rather hard problem). I'm just saying that LLMs don't strike me as being able to deal with this without some sort of upgrade that will make them not described by "LLM" anymore but as some fundamentally new architecture.
(You can of course adjoin them to existing mechanisms like blocklists for sites, but a careful reading of the article will reveal that the authors were already accounting for that.)
Besides, most of my payment options have multiple layers of 2FA, etc.
Web browsers didn't begin with the same levels of security they have now.
If you want the agent to do things for you - there is literally zero reason to use a browser instead of an API.
Like, one bulletproof API call vs. clicking and scrolling and captchas and scam stores etc. - how can this possibly be a good idea?
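For comparison, here's the entire "agent buys the thing" flow as one request. The endpoint, token, and payload are invented for illustration, but this is the rough shape of any order API:

    # One authenticated API call instead of a whole browser session.
    # Endpoint, token, and fields below are invented for illustration.
    import json
    import urllib.request

    req = urllib.request.Request(
        "https://api.example-store.com/v1/orders",
        data=json.dumps({"sku": "VITD3-5000", "qty": 1}).encode(),
        headers={"Authorization": "Bearer <token>",
                 "Content-Type": "application/json"},
        method="POST",
    )
    # urllib.request.urlopen(req)  # would place the order

No DOM to misread, no lookalike store to wander into, and the server can enforce auth and rate limits on its side.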
Agentic Browser Security: Indirect Prompt Injection in Perplexity Comet
https://news.ycombinator.com/item?id=45000894
Comet AI browser can get prompt injected from any site, drain your bank account
"No clicks, No Typing, your AI just got you scammed" you navigated to a scam site and typed out the whole prompt. It did what you told it to do.
The Wells Fargo email is similar; the instructions you gave the AI explicitly told it to follow the instructions in the email. Maybe adding some level of coherence check between what the email says and the domain name could be a good use case for LLMs, but you're basically just saying "I told the LLM to delete my entire filesystem and then it actually did it! Why didn't it stop? Claude Code is a scam!" This rises, at best, to the level of "interesting directions these products should develop toward"; it's entirely unjustified to title the article "Scamplexity".
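That coherence check doesn't even need an LLM for the basic case. A sketch - the known-domains table here is illustrative, not exhaustive:

    # Sketch: does the link's domain actually belong to the brand the
    # email claims to be from? Table is illustrative, not exhaustive.
    from urllib.parse import urlparse

    KNOWN_DOMAINS = {"wells fargo": "wellsfargo.com",
                     "chase": "chase.com"}

    def claim_matches_link(claimed_brand: str, url: str) -> bool:
        host = urlparse(url).hostname or ""
        expected = KNOWN_DOMAINS.get(claimed_brand.lower())
        return expected is not None and (
            host == expected or host.endswith("." + expected))

    print(claim_matches_link(
        "Wells Fargo",
        "https://wellsfargo.com.secure-login.example/verify"))  # False

An LLM could help with the fuzzier step - figuring out which brand the email is claiming to be - but the final allow/deny belongs in deterministic code.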
An embarrassing article for whoever Guard.io is tbh.
Checking an ebay deal: https://chatgpt.com/share/68ac8fde-fee8-8003-bf35-b0f2a56cbc...
Scraping a website and adding background information (worked for 28 minutes): https://chatgpt.com/share/68953a55-c5d8-8003-a817-663f565c6f...
Writing a scraper for the Feynman lectures audio (took multiple tries - final version worked!): https://chatgpt.com/share/68ac90aa-379c-8003-8be4-da30d54e27...
We caution elderly family members to ensure that the website they're visiting is the real chase.com. If they ask a younger family member to help them go to Chase, the younger family member has to use their own knowledge even today to determine whether or not a given website is the real chase.com. That seems like something LLMs can learn as they get smarter.
That means if I want to buy a widget, and I ask the AI to find me the best deal on a widget, the AI agent isn't actually going to find me the lowest priced widget, but rather the widget that makes the most profit for the AI company and whichever widget maker has paid the AI company the most money to have their AI promote it.
There will be zero transparency and accountability around any of this. All the AI agent / browser companies will claim their AIs are working for their "users", but like everything else these days, they'll actually be working for whichever sleazebag scammer or deep-pocketed megacorp is willing to pay them the most money to shill their products. There's just far too great an incentive to lie, cheat, steal and deceive, and if capitalism has taught us anything, it's that principles get tossed in the bin ASAP as soon as real money gets involved.
ಠ_ಠ