"Good to merge? Test suite passes locally?" is perfectly valid English. You need to make sure that the bot is configured to not insist on arbitrary prescriptivist style guides that nobody cares about.
And an NGI, like a human given this task, would consider the spirit of the law, the goal we want to achieve, and enforce it to reach that. Next-token AI doesn't model that. It just predicts the next token, and understanding the spirit of the law doesn't seem to be among the emergent capabilities of that approach.
I think LLMs are actually great for catching things like this, and you generally don't need some higher-level understanding about the goals involved to notice the ambiguity. My point wasn't that bots shouldn't be used like this, just that they need to be given the right instructions.
This is the part I'm talking about. I also think LLMs are very capable of detecting different types and levels of grammar, but they can't decide which ones should be filtered out to meet a certain goal. They need detailed instructions for that, and that's somewhat inefficient and causes issues like this one right here.
We have done this song and dance many times with AI. It's the bitter lesson: you need a system that learns these things; you can't just give it a rule-based engine to patch the things it can't learn. That works in the short term but leads to a dead end. We need something with the "common sense" to see when grammar is fine versus hindering communication, and that just isn't there yet. So it needs detailed instructions to do so, which may or may not be sustainable.
It sounds like you mean that curating messages purely to conform to a particular style guide requires context-dependent information and can never be accomplished reliably with some unmodified generic prompt across many different projects.
I'm saying that while this is true, if you ignore grammar guidelines and just look for cases of ambiguous and confusing wording specifically, then this can actually be accomplished reliably with a generic prompt, if you get that prompt right. Not 100% accurately of course, but good enough that it would be beneficial overall in the long run.
But yeah the bot needs to loosen the boundary a bit there
https://www.reddit.com/r/memes/comments/7lkgdk/video_game_hu...
Our Slack channel is also completely slopped, to the point that it's mostly bot conversations constantly spamming review reminders, pull request statuses, and other useless info you could just look up yourself if you needed it.
The signal-to-noise ratio is so bad that I just ignore everything.
There's no utility in reading them, which means people ask more questions directly in Slack instead, and that genuinely makes the entire process slower for everyone. I know for a fact that no developer on my team would struggle to write 4-5 bullet points for any given PR in under 2 minutes. My boss just genuinely thinks the slop is better (probably because he's not usually the one reading it).
Our Confluence is also rapidly turning into nothing but totally uninformative Claude-generated noise.
It's annoying to be repeatedly called a Luddite for questioning this stuff when the end results can be so obviously wasteful and unhelpful.
I've been employing the strategy of pushing on the customer and product delivery expectations. Ask some variants of the following in your next team meeting:
"Where are we at with Customer X?"
"How does Bob (the customer's CEO) feel about the project right now?"
"How will this sell us more copies or make the customer pay more money?"
These kinds of questions can rapidly reorient a team if a member with the power to terminate other employees is present. If the most senior management seems apathetic to these questions and you are interested in building an actual career, it might be time to move on. At some point, the person(s) paying for everything need to ask themselves whether they intend to conduct a serious business or simply run a daycare for neurotic tech enthusiasts.
AI code review really does help, but only when used correctly.
I'm building the tool to fix that
I kept reading and reading and the "violation of our guidelines" phrase wasn't appearing, so I got bored
In all honesty, it works better for me in a vanilla browser on mobile. GitHub app is less predictable IMHO.
A goddamn staff engineer candidate with 20 years of experience submitted an auto-generated, broken app, and then tried to pass it off as though it didn't work because our API was down. No, the AI hard-coded the wrong URL and he didn't notice (the correct URL was specified in the email containing the test instructions).
I want off Mr Altman's wild ride lol
I basically just stop reading on sight
I don't know this particular project, but seeing threads like this kills any motivation to contribute.
"We are a fast-moving startup (even at 3-4 years old); we believe in moving fast and breaking things... That's why we don't do code reviews or unit tests. We just edit live running code and restart the server"
Vs
"This one-line change needs 4-5 commits to add feature flags, unit tests, and integration tests - all to be reviewed by different teams - and then waits 1-2 months to be properly deployed to production"
I feel this is a needlessly obtuse statement. I'll explain why, as I've worked professionally with frontend development; from your comment it seems you don't have that type of context.
The text that is expected to appear in a UI element is a critical factor in cross-cutting concerns such as product management and UX design, and it involves things like internationalization and accessibility support. This means that if you change a line of text, it needs to be translated into all supported languages, and each translation needs to meet usability and GUI requirements. This check has to be done for each and every supported locale.
I can give you a very concrete example. Once I was tasked with changing a single line of text on a button in a dialog. It turned out the French translation was so long that it forced line breaks. The UI framework didn't handle those line breaks well and reflowed the whole UI, causing a huge mess. This required new translation requests, but the new translations turned out to be too vague and ambiguous. Product managers got involved because the French translation resulted in a poor user experience. Ultimately the whole dialog was redesigned.
But to you it's just the text on a single button, isn't it?
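Incidentally, this is the kind of thing a cheap automated check can catch before translators and PMs get dragged in. A minimal sketch (the function name, the key set, and the 1.4 expansion budget are all assumptions of mine, not from any real pipeline):

```typescript
// Flag translations that blow past a length budget relative to the
// source string. The 1.4 ratio is an assumed budget, not a real constant.
const LENGTH_RATIO_LIMIT = 1.4;

function flagLongTranslations(
  source: Record<string, string>,
  translated: Record<string, string>,
): string[] {
  const flagged: string[] = [];
  for (const key of Object.keys(source)) {
    const t = translated[key];
    if (t !== undefined && t.length > source[key].length * LENGTH_RATIO_LIMIT) {
      flagged.push(key);
    }
  }
  return flagged;
}
```

Run over a hypothetical locale file, it would flag a `save` key whose "Save" came back as "Enregistrer les modifications", which is exactly the overflow case above.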
I'm not saying a developer should also be a UI specialist; that's nonsense. But a developer should know whom to ask for advice, and that contact should be easy. The model where a dev doesn't even know a French translation exists is wrong. The correct model is a dev thinking, "wait, this might affect translation, better send a Slack message to the UI guy and let him know."
Actually, what I said is a pipe dream. The reality is that most devs are average devs, and companies need to optimize processes for that. This results in very official communication that takes ages to get anything done, because you can't trust that the average developer will contact the UI guy. That leads to a lot of frustration for above-average developers, who simply need a different environment to shine (more freedom, but also more responsibility).
Shouldn't automation be at least somewhat useful? All these bot comments — do they really add more value than the distraction they create?
> Antiwork emerged from Gumroad's mission to automate repetitive tasks. In 2025, we're taking a bold step by open-sourcing our entire suite of tools that helped run and scale Gumroad. We believe in making powerful automation accessible to everyone.
So yeah, it does look like they bring value for genuinely more reliable code.
Some of the commit descriptions: "fix", "fixes", "clean up".
How is that self-service?
+ const safeToken = typeof token === "string" && token.length > 0 && token.length < 256 ? token : "";
Of arguable quality, I would say. The length limit is arbitrary, and the > 0 check is ridiculous.
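If you want the guard at all, naming the bound at least surfaces the arbitrariness. A sketch under my own assumptions (the 256 cap is carried over from the quoted line, but `MAX_TOKEN_LENGTH` and `sanitizeToken` are names I made up):

```typescript
// Same behavior as the quoted one-liner, with the magic number named.
// 256 remains arbitrary; it should be justified by whatever issues the token.
const MAX_TOKEN_LENGTH = 256;

function sanitizeToken(token: unknown): string {
  if (typeof token !== "string") return "";
  if (token.length === 0 || token.length >= MAX_TOKEN_LENGTH) return "";
  return token;
}
```

Note the empty-string branch is still pointless: mapping `""` to `""` is a no-op, which is presumably the "> 0 is ridiculous" complaint.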
> Your expertise about the system's constraints helps provide important context that static analysis tools can't capture.
So much fawning bullshit bloating the message and the token count. I think this might be the thing with LLMs I dislike most.
Suggestion for prompt writers: “Don’t waste tokens. Keep messages succinct and direct.”
Fuck off.
Dear humans who advocated for installing the bot, let me use anodyne, US corporate bullshit language so you'll understand:
Your bot does not add value. Get rid of it, before it drives out all voluntary contributors.
Mathematicians use calculators, and so too do elementary school students, and grocery store clerks, and civil engineers. What each person needs from a calculator can be similar, but would you give a graphing calculator to the store clerk and expect them to be more "productive?"
Admittedly, my metaphor is leaky—and I also can't comment on the participants of the PR—but after reading the comments and the code itself, I'm getting a lot of "here’s a new calculator with a bunch of graphing functions, trigonometric menus, and poem generators—now go do the basic arithmetic you were already doing, but you work for the calculator now" vibes.
Said another way, it took me a lot more time and effort to understand what the bots were saying and if I agreed, than it did for me to formulate my own thoughts and questions.
Like the saying goes, "the best calculator is the one you have with you," and I'd much rather just use my own.
It's just slop and useless context
Code review should only help with the repetitive and tedious parts of review
I'm building a tool to fix that
When
https://github.com/antiwork/flexile/pull/427#issuecomment-30...
results in
https://github.com/antiwork/flexile/pull/427#issuecomment-30...
More incredible examples where an LLM flags contributors' pull requests because their comments contain minor grammar errors:
https://github.com/antiwork/flexile/pulls?q="our+contributin...