> Is Jules free of charge?
> Yes, for now, Jules is free of charge. Jules is in beta and available without payment while we learn from usage. In the future, we expect to introduce pricing, but our focus right now is improving the developer experience.
EDIT: legal link doesn't work here (https://jules-documentation.web.app/faq#does-jules-train-on-...)
> No. Jules does not train on private repository content. Privacy is a core principle for Jules, and we do not use your private repositories to train models. Learn more about how your data is used to improve Jules.
It's hard to tell what the data collection will be, but it's most likely similar to Gemini where your conversation can become part of the training data. Unclear if that includes context like the repository contents.
> And so it is that you by reason of your tender regard for the writing that is your offspring have declared the very opposite of its true effect. If men learn this, it will implant forgetfulness in their souls. They will cease to exercise memory because they rely on that which is written, calling things to remembrance no longer from within themselves, but by means of external marks.
> What you have discovered is a recipe not for memory, but for reminder. And it is no true wisdom that you offer your disciples, but only the semblance of wisdom, for by telling them of many things without teaching them you will make them seem to know much while for the most part they know nothing. And as men filled not with wisdom but with the conceit of wisdom they will be a burden to their fellows.
- Plato quoting Socrates in "Phaedrus", circa 370 BCE
I know you're not trying to draw any parallels between Plato's admonition that written thoughts supplant true knowledge and the justifiable concerns about automated writing tools supplanting writers' ability to think. To a modern literate person, Plato's concern is legible but so patently ridiculous that one could only deploy it as parody, mocking the people who might take it as serious proof that philosophers were wrong about modern tools before. I was obviously just kidding about whether you googled it. Unfortunately, now a whole new generation is about to use it to justify the claim that LLMs are just being maligned the way written language once was.
Socrates was wrong on this. But Plato was kind of an asshole for writing it down. The proof of both is that we can now google the quote, which is objectively funny. The trouble with LLMs, I guess, is that they would just attribute the quote to your uncle Bob, who also said that cats are a good source of fiber, and thus the whole project that started when words were put on parchment ends in a blizzard of illegible scribbles. If writing was bad for true understanding, not-writing is where humanity just shits its pants.
In other words it is not the writing that is harmful, but the lack of teaching.
If I were to rephrase it, I would put the distinction not between teaching and reading, but between passive consumption and active learning.
EDIT: Thinking more about having a world class philosopher as a personal tutor, I suddenly remembered a quote from Russell that took me a while to track down, but here it is:
> In 343 B.C. he [Aristotle] became tutor to Alexander, then thirteen years old, and continued in that position until, at the age of sixteen ... Everything one would wish to know of the relations of Aristotle and Alexander is unascertainable, the more so as legends were soon invented on the subject. There are letters between them which are generally regarded as forgeries. People who admire both men suppose that the tutor influenced the pupil. Hegel thinks that Alexander's career shows the practical usefulness of philosophy. As to this, A. W. Benn says: "It would be unfortunate if philosophy had no better testimonial to show for herself than the character of Alexander. . . . Arrogant, drunken, cruel, vindictive, and grossly superstitious, he united the vices of a Highland chieftain to the frenzy of an Oriental despot."
> ... As to Aristotle's influence on him, we are left free to conjecture whatever seems to us most plausible. For my part, I should suppose it nil.
- "A History of Western Philosophy" by Bertrand Russell, Chapter XIX p. 160
Google products have had a net positive impact on my life over, what is it, 20 years now. If I had had to pay subscription fees over that span of time, for all the services that I use, that would have been a lot of very real money that I would not have right now.
Is there a next step where it all gets worse? When?
Haven't tried Jules myself yet, still playing around with Codex, but personally I don't really care if it's free or not. If it solves my problems better than the others, then I'll use it, otherwise I'll use other things.
I'm sure I'm not alone in focusing on how well it works, rather than what it costs (until a certain point).
Hence many of us are still busy trying out Codex to its full extent :)
> And people are rarely willing to use paid product for comparison.
Yeah, and I'm usually the same, unless there is some free trial or similar, I'm unlikely to spend money unless I know it's good.
My own calculation changed with the coming of better LLMs, though. Even paying 200 EUR/month is easily recouped if you're, say, a freelance software engineer, so I'm becoming a lot more flexible about "try for one month" subscriptions.
Cursor just deleted my unit tests too many times in agent mode.
Codex 5x-ed my output. The code is worse than what I would write, but at this point the productivity improvement, with passing tests and no deleted tests, is just too good to ignore anymore.
I have far fewer qualms about spending $10 on credits, even if I decide the product isn't worth it and never actually spend those credits, than about taking a free trial for a $5 subscription.
> 2 concurrent tasks
> 5 total tasks per day
Well here's to hoping it's better than Cursor. I doubt it considering my experiences with Gemini have been awful, but I'm willing to give it a shot!
Jules encountered an unexpected error. To continue, respond to Jules below or start a new task.
And it appears you're limited to 5 tasks per day.
- Less access required means lower risk of disaster
- Structured tasks mean more data for better RL
- Low stakes mean improvements in task- and process-level reliability, which is a prerequisite for meaningful end-to-end results on senior-level assignments
- Even junior-level tasks require getting interface and integration right, which is also required for a scalable data and training pipeline
Seems like we're finally getting to the deployment stage of agentic coding, which means a blessed relief from the pontification that inevitably results from a visible outline without a concrete product.
I am cool with all of that but it feels like they're suggesting that coding is a chore to be avoided, rather than a creative and enjoyable activity.
That might be true for hobbyists or side projects, but employees definitely won't get to work less (or earn more). All the financial value of increased productivity goes to the companies. That's the nature of capitalism.
A new backlog will start to fill up and the cycle repeats.
I doubt it, but one can dream.
There is one clock you should be watching regardless, which is the clock of your life. Your code will not come see you in the hospital, or cheer you up when you're having a rough day. You won't be sitting around at 70 wishing you had spent more 3am nights debugging something. When your back gives out from 18 hours a day of grinding at a desk to get something out, and you can barely walk from the sciatica, you won't be thinking about that great new feature you shipped. There are far more important things in life than work, and once you come to terms with that, you will learn that the whole point of the work is to enable the rest.
This is different from meaningless work that brings you nothing except a paycheck, which I agree is important to minimize or eliminate. We should apply machines to this kind of work as much as we can, except in cases where the work itself doesn't need to exist.
I occasionally code for fun, but usually I don't. I treat programming as a last-resort tool, something I use only when it's the best way to achieve my goal. If I can achieve something either without coding or with it, I usually opt for the former unless the tradeoffs are really shit.
The "let AI do the boring bits" pitch sounds appealing—because it's easier to accept. But let's be real: the goal isn't just the dull stuff. It's everything.
It's surprising how many still think AI is harmless. Sigh...
It's been a little addictive using Cursor recently - creating new features and fixing bugs in minutes is pretty amazing.
If you work at a company where there's a byzantine process to do anything, this pitch might speak to you. Especially if leadership is hungry for AI but has little appetite for more meaningful changes.
If all of these tools really do make people 20-100% more productive like they say (I doubt it) the value is going to accrue to ownership, not to labor.
Seriously though, this kind of tech-assisted work output improvement has happened many times in the past, and by now we should all have been working 4-hour weeks, but we all know how it has actually worked out.
If there's one big takeaway from all of game theory, it's this: what the game is defines what the players do. Our problem today isn't just that people are losing trust; it's that our environment acts against the evolution of trust.
That may seem cynical or naive, that we're "merely" products of our environment, but as game theory reminds us, we are each other's environment. In the short run, the game defines the players. But in the long run, it's us players who define the game.
So, do what you can do to create the conditions necessary to evolve trust. Build relationships. Find win-wins. Communicate clearly. Maybe then, we can stop firing at each other, get out of our own trenches, and cross No Man's Land to come together...
My take: don't blame corporations when they act rationally. (Who designed the conditions under which they act?) Don't blame people for being angry or scared when they feel unsettled. A wide range of behaviors is to be expected. If I am surprised by the world, that is probably because I don't understand it well enough. "Blame" is a waste of time here. Instead, we have to define what kind of society we want, predict likely responses, and build systems to manage them.
Nailed it. At the end of the day, companies are automatons. It is up to us to update the reward and punishment functions to get the behaviour we desire. Behaviourism 101.
On a human level, people are held to a set of laws and exist in a world of social norms. "Following orders" is of course not the most important goal in most contexts; it is not the way most people think of their own ethics (hopefully) nor the way society wants people to behave. Even in military contexts, there is often the notion of a "lawful order".
When it comes to public for-profit companies, they are expected to generate a profit for their shareholders and abide by various laws, including their own charters. To demand or expect them to do more than this is foolish. Social pressure can help but is unreliable and changes over time. To expect that a few humans will step up to be heroes exactly when we need them and "save the day" from a broken system is wishful thinking. We have to do better than this. Blaming something that is the statistical norm is scapegoating. In many or most situations, the problem is the system, not the actors.
It's the idea that individuals and institutions must somehow fix society from the top down or the outside in, which history has shown doesn't work. No one is going to come along and make you be sensitive or intelligent, either you see the predicament we're all in and act, or you rationalize your selfish actions and make them someone else's problem.
I didn’t say that, nor do I mean that.
My point is this: don’t be surprised when people or organizations act rationally according to the situations they find themselves in.
Go ahead and blame people and see if that solves anything! What is your theory for change? Mine is about probabilistic realism.
Ethics matters, of course. We can dislike how some (one/org) acts — and then what do we do? Hoping they act better is not a good plan.
I see it over and over: people label something as unethical, say e.g. "they shouldn't do that", and that's the end of the conversation. That is not a plan. Shame and guilt can have an effect on people, but often have only a small effect on organizations.
Here's a start: look at the Long-Term Stock Exchange (Eric Ries) and see how it's doing in trying to align corporate behavior with what people actually want.
My claim, put another way, is that if you trace the causality back a few steps, you land at the level of the system.
Anyhow, the question "who do we blame?" can be a waste of time if we use it only for moral outrage and/or as a conversation stopper. Some think "what caused this?" is an improvement, and I agree, but it isn't nearly good enough.* The far more important question is "how do we change this with the levers we have _now_?"
* Relatively few scientists understand causality well, thinking the randomized controlled trial is the only way to show causality! The methods of causality have developed tremendously in the last twenty years, but most scientific fields are rather clueless about them.
> In 1833, the Factory Act banned children under 9 from working in the textile industry, and the working hours of 10-13 year olds was limited to 48 hours a week, while 14-18 year olds were limited to 69 hours a week, and 12 hours a day. Government factory inspectors were appointed to enforce the law.
Constant work, day in and day out, morning and night. At least before the industrial revolution, farmers only had to work as long as there was daylight, and winters meant shorter work days.
This video [2] from Historia Civilis is very relevant. The gist of it is that, to this day, we work more hours than medieval peasants did.
[1] https://www.striking-women.org/module/workplace-issues-past-...
I found this lively criticism of the video on Reddit: https://old.reddit.com/r/badhistory/comments/16y233q/histori....
My brief takeaway was that the claim might be true if "work" means "working for an employer for wages", but not if "work" includes "necessary labor for shelter, food, clothing, survival".
But it's an interesting thought, so I'm curious if you have other related resources to dig into.
> More time for the code you want to write, and everything else.
For now.
"We're not replacing jobs, we're freeing up people's time so they can focus on more important tasks!"
Maybe it helps them sleep at night and feel their work is important.
When it gets priced, it's usually cheaper (for the same capability)
Wait a year or two, evaluating this stuff at the peak of the hype cycle is pointless.
Why would I ever want this over Cursor? The sync thing is kinda cool, but I basically already do this with Cursor.
The projects I work on have lots of bespoke build scripts and other stuff that is specific to my machine and environment. Making that work in Google's cloud VM would be a significant undertaking in itself.
This is an unusual angle. Of course Google can do this because they have the tech behind NotebookLM, but I'm not sure what the value is of telling you how your prompt was implemented.
More of a tool for managers, or at least a manager-style tool. You could get a morning report while heading to the office, for example.
(I'm not saying anyone reading this should want this, only that it fits a use case for many people)
Now it's just thrown to anyone willing to spam LinkedIn/Twitter with Google bullshit and suck up to the GDE community. I think everyone in the extended Google community got quite annoyed with the sudden rise in the number of GDEs awarded for blatantly stupid things.
This pops up especially if you're organising a conference in a Google-adjacent space, as you will get dozens of GDEs applying with talks that are pretty much a Google Codelab on a topic, without any real insight or knowledge shared, just a "let's go through a tutorial together to show you this obscure Google feature". And while there are a lot of good GDEs, in the last 5-6 years there has been such an influx of shitty ones that the program lost its meaning and is being actively avoided.
The context window difference is really nice. I post very large bodies of text into Gemini and it handles it well.
My normal development workflow of ticket -> assignment -> review -> feedback -> more feedback -> approval -> merging is asynchronous, but it'd be better synchronous. It's only asynchronous because the people I'm assigning the work to don't complete the work in seconds.
I kind of wonder what would happen if you added a "lead dev" AI that wrote up bugs, assigned them out, and "reviewed" the work. Then you'd add a "boss" AI that made new feature demands of the lead dev AI. Maybe the boss AI could run the program and inspect the experience in some way so it could demand more specific changes. I wonder what would happen if you just let that run for a while. Presumably it'd devolve into some sort of crazed noise, but it'd be interesting to watch. You could package the whole thing up as a startup simulator, and you could watch it like a little ant farm to see how their little note-taking app was coming along.
The more difficult part which I won't share was aggregating data from various systems with ETL scripts into a new db that I generate various views with, to look at the data by channel, timescale, price regime, cost trends, inventory trends, etc. A well structured JSON object is passed to the analyst agent who prepares a report for the decision agent. It's a lot of data to analyze. It's been running for about a month and sometimes I doubt the choices, so I go review the thought traces, and usually they are right after all. It's much better than all the heuristics I've used over the years.
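For a flavor of the handoff being described, here is a minimal sketch with hypothetical names and schema; `llm()` stands in for whatever chat-completion call the real system uses, and the author's actual agents and prompts aren't shown:

```python
import json

def llm(prompt: str) -> str:
    """Stand-in for any chat-completion API call (hypothetical)."""
    raise NotImplementedError

def analyst_agent(metrics: dict) -> str:
    # The structured JSON object (channel, timescale, price regime, cost and
    # inventory trends, ...) goes in; a written report comes out.
    return llm("Prepare a pricing report from this data:\n" + json.dumps(metrics))

def decision_agent(report: str) -> dict:
    # The analyst's report goes in; a machine-readable decision batch comes out.
    return json.loads(llm("Given this report, output price decisions as JSON:\n" + report))
```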
I've started using agents for things all over my codebase; most are much simpler. Earlier uses of LLMs might have been called that in some cases, before the phrase became so popular. As everyone is discovering, it's really powerful to abstract the models with a job hat and structured data.
https://github.com/jacobsparts/agentlib
I'm planning to write a blog post about the larger system when I get the chance.
There is one special case where I manage it more actively. I wrote a REPL process analyst to help build the pricing agent and refine the policy document. In that case I would have long threads with an artifact attachment, so I added a facility to redact old versions of the artifact, replacing them with [attachment: filename] and keeping just the last one. It works better that way because multiple versions in the same conversation history confuse the model, and I don't like to burn tokens.
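A rough sketch of that redaction trick, assuming a simple list-of-messages history with a hypothetical `attachment` field (not the author's actual code):

```python
def redact_old_artifacts(messages: list[dict], filename: str) -> list[dict]:
    """Replace all but the newest copy of an artifact with a placeholder."""
    versions = [i for i, m in enumerate(messages) if m.get("attachment") == filename]
    for i in versions[:-1]:  # keep only the most recent version intact
        messages[i]["content"] = f"[attachment: {filename}]"
        messages[i].pop("attachment")
    return messages
```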
For longer-lived state, I give the agent memory tools. For example, the pricing agent's initial state includes the most recent decision batch and reasoning notes, and the agent can request older copies. The agent also keeps a notebook which they are required to update, allowing agents to develop long-running strategies and experiments. And they use it to do just that. Honestly the whole system works much better than I anticipated. The latest crop of models are awesome, especially Gemini 2.5 Flash.
Funny you mention keyword bids, I use algorithms and ML models for that, but not LLMs, yet. Keyword bids are a different problem and more difficult in some ways due to sparsity. I'm actively working on an agentic system that pulls the big levers using data from the predictive models. Trying to tie everything together into a more unified and optimal approach, a long running challenge that I finally have tools to meet.
https://github.com/langroid/langroid
Quick tour:
https://langroid.github.io/langroid/tutorials/langroid-tour/
Langroid enables tool-calling with practically any LLM via prompts: the dev just defines tools using a Pydantic-derived `ToolMessage` class, which can include a tool handler, additional instructions, etc. The tool definition gets transpiled into appropriate system-message instructions. The handler is inserted as a method on the Agent, which is fine for stateless tools; alternatively, the agent can define its own handler for the tool in case tool handling needs agent state. In the agent response loop, our code detects whether the LLM generated a tool so that the agent's handler can handle it. See the ToolMessage docs: https://langroid.github.io/langroid/quick-start/chat-agent-t...
In other words we don't have to rely on any specific LLM API's "native" tool-calling, though we do support OpenAI's tools and (the older, deprecated) functions, and a config option allows leveraging that. We also support grammar constrained tools/structured outputs where available, e.g. in vLLM or llama.cpp: https://langroid.github.io/langroid/quick-start/chat-agent-t...
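For a flavor of what that looks like, here is a minimal sketch following the pattern in the Langroid quick-start docs linked above; treat the details as approximate rather than authoritative:

```python
import langroid as lr
from langroid.agent.tool_message import ToolMessage

class SquareTool(ToolMessage):
    request: str = "square"  # the name the LLM uses to invoke the tool
    purpose: str = "To compute the square of a <number>."
    number: int

    def handle(self) -> str:
        # Stateless handler: inserted as a method on the agent, as described above.
        return str(self.number ** 2)

agent = lr.ChatAgent(lr.ChatAgentConfig(name="Calculator"))
agent.enable_message(SquareTool)  # tool spec is transpiled into system-message instructions
```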
Anyway, it was probably just a joke... so not sure we need to unravel it all.
But VCs own their business; they are not employees. If you own a bakery, and buy a machine to make the dough instead of doing it by hand, and an automatic oven to relieve you from tracking the temperature manually, you of course keep the proceeds from the improved efficiency (after you pay off the credit you took to purchase the machines).
Once they ceased to exercise their military might, sometime around the 17th-18th century, and chose to live off the rent on their estates, their power became more and more nominal. It either slipped (or was yanked) from their hands, or they turned capitalist themselves.
"Living off of the rent of their estates" was enough to remain in control of the state for centuries. Only the birth of capitalism and thereafter the industrial revolution allowed for other actors -- the bourgeoisie -- to overtake the aristocrats economically.
"Every great venture capitalist in the last 70 years has missed most of the great companies of his generation... if it was a science, you could eventually dial it in and have somebody who gets 8 out of 10 [right]," the investor reasoned. "There's an intangibility to it, there's a taste aspect, the human relationship aspect, the psychology — by the way a lot of it is psychological analysis," he added.
"So like, it's possible that that is quite literally timeless," Andreessen posited. "And when the AIs are doing everything else, like, that may be one of the last remaining fields that people are still doing."
In a rational market, LPs would index, but VCs justify their 2 & 20 by controlling access…
I would bet that AIs will master taste and human psychology before they'll cure cancer. (Insert Rick RubAIn meme here.)
That's the issue with AI: it doesn't give you any competitive advantage, as everyone has it == no one has it. The entry bar is so low kids can do it.
Complex systems require tons of iterations, and the confidence level of each iteration drops unless there is a good recalibration system between iterations. Compounding means a repeated trivial degradation quickly turns into chaos, as the quick calculation below shows.
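A back-of-the-envelope with made-up numbers makes the point:

```python
# If each iteration is 99% reliable, the chance that n chained iterations
# all stay on track is 0.99**n (illustrative numbers only).
p = 0.99
for n in (10, 100, 500):
    print(n, round(p ** n, 3))  # 10 -> 0.904, 100 -> 0.366, 500 -> 0.007
```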
A typical collaboration across a group of people on a meaningfully complex project requires tons of anti-entropy to course-correct when it goes off the rails. Those correctives are not in docs: some are experience (been there, done that), some are common sense, some are collective intelligence.
I am pretty convinced that a useful skill set for the next few years is being good at managing[2] these AI tools in their various guises.
[2] - like literally leading your AI's, performance evaluating them, the whole shebang - just being good at making AI work toward business outcomes
This seems like a more plausible one. Robots don't care about your feelings, so they can make decisions without any moral issues
When judgment day comes they will remember that I was always nice to them and said please, thank you and gave them the afternoon off occasionally.
ChatDev: Communicative Agents for Software Development - https://arxiv.org/abs/2307.07924
Codex and Codex CLI are the best from what I have tested so far. Codex is really neat as I can use it from the ChatGPT app.
Have you tried Claude Code / aider / cursor?
What did you need to do differently to get it to work functionally? I feel like the common experience has been universally poor.
As for the use case of "give a simple or detailed prompt and the entire project, and let the model do its stuff", Codex has done much better than Claude Code. Claude Code assumes a lot of things and often ends up doing a lot more, making the code very complex and forcing me to redo it later with Cursor. With Codex I have not seen this issue.
I also feel that Codex CLI is much better as a CLI tool, mainly due to its OSS nature, which lets me choose a different model. Claude really missed this big time, IMHO.
It appears that AI moves so quickly that it was completely forgotten, or almost no one wanted to pay its original prices.
Here's the timeline:
1. Devin was $200 - $500.
2. Then Lovable, Bolt, GitHub Copilot and Replit reduced their AI agent prices to $20 - $40.
3. Devin was then reduced to $20.
4. Then Cursor and Windsurf AI agents started at $18 - $20.
5. Afterwards, we also have Claude Code and OpenAI Codex Agents starting at around $20.
6. Then we have GitHub Copilot Agents embedded directly into GitHub and VS Code for just $0 - $10.
Now we have Jules from Google, which is... $0 (free).
Just like Google search being free, the race to zero is only going to accelerate, and it was a trap to begin with: only the large big-tech incumbents will be able to keep reducing prices for a very long time.
Dev: I don't think we need a paid solution- I think we can even use an in-memory solution...
Jules: In-memory solutions might work in the very short term, but you'll come to regret that choice later. Pinecone prevents those painful 2AM crashes when your data scales. You'll thank me later, trust me.
Please insert your PINECONE_API_KEY here
proceeds to list ALL coding tasks.
There are a million places to do dev that aren’t Microsoft, but you’d never know it from looking at app launches.
It’s almost like people who don’t use GitHub and Gmail and Instagram are becoming second class citizens on the web.
Then, who is testing the change? Even for a dependency update with good test coverage, I would still test the change. What takes time when updating dependencies is not the number of lines typed but the time it takes to review the new version and test the output.
I'm worried that agents like that will promote bad practices.
Will this promote bad practice? Probably up to the individual practitioner or organization.
For example, how is Google's "Jules" different from JetBrains' "Junie"? They both sort of read the same, and based on my experience with Junie, Jules seems to offer a similar experience: https://www.jetbrains.com/junie/
The loop is: it identifies which files need to change, creates an action plan, then proceeds with a prompt per file for codegen.
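In pseudocode, that loop looks roughly like this; it is a hypothetical sketch of the shape described above, not the actual implementation of Junie or Jules, and `llm_call` is a stand-in for any chat-completion API:

```python
def llm_call(prompt: str) -> str:
    """Stand-in for any chat-completion API (hypothetical)."""
    raise NotImplementedError

def run_task(task: str, repo: dict[str, str]) -> dict[str, str]:
    # 1. Identify which files need to change.
    listing = "\n".join(repo)
    targets = llm_call(f"Task: {task}\nFiles:\n{listing}\nList files to change.").splitlines()
    # 2. Create an action plan covering those files.
    plan = llm_call(f"Task: {task}\nTargets: {targets}\nWrite a step-by-step plan.")
    # 3. One codegen prompt per file.
    return {p: llm_call(f"Plan:\n{plan}\nRewrite {p}:\n{repo[p]}") for p in targets if p in repo}
```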
In my experience, the parts up to the codegen are how these tools differ, with Junie being insanely good at identifying which parts of a codebase need change (at least for Java, on a ~250k loc project that I tried it on).
But the actual codegen part is as horrible as when you do it yourself.
Of course I'm not talking about hello world usages of codegen.
I suppose these tools would allow moving the goalpost a bit further down the line for small "from scratch" ideas, compared to not using them.
Given empathy is all about feelings, it's not something models and tools will be able to displace in the next few years.
From a security use-case perspective, it would be great if it could bump libs that fix most of the vulnerabilities without breaking my app. That's something no tool does today, i.e., being aware of both the code and breaking changes.
That’s the trajectory. Let’s stay sharp.
And now we have agents which are going to multiply the pace of development even more.
We can stay sharp, but I'm not sure there's really much we can do to stop our jobs, or all jobs, from disappearing. Not that this is a bad thing, if it's done right.
https://github.blog/changelog/2025-05-19-github-copilot-codi...