
Ask HN: Is there a general, multi-PL programming task dataset?

1•quartztz•1y ago
Hello!

Being a student interested in PL design, I have had this idea floating around for a while: the gist is finding out which programming languages LLMs might be the most proficient in, to study their design choices and syntactic features with the goal of designing the perfect language for LLMs. This is, of course, gimmicky, but I entertained the idea for a while as a fun after-school project.

The challenge is: what would be the best way to evaluate programming performance _in specific languages_? There are two main hypotheses here:

1. There are intrinsic syntactic/structural features that the transformer architecture is uniquely able to parse/reproduce/understand, leading to higher-quality generated code. For example: Lisp dialects make parsing code structure and blocks very easy, so one could assume an LLM can "understand their code better".

2. There is so much Python/JS out there that the question isn't even worth asking, and the performance in those will beat whatever other language you throw at it. This is probably less of a concern thanks to newer transformer architectures, but the question is still open.

I suspect the answer can be made somewhat interesting by considering performance relative to language popularity, but the ground question is: is there a general dataset containing different programming challenges, of varying difficulty, in multiple languages, with standard solutions? I couldn't find anything when I looked around, but I might have missed something obvious. It wouldn't be impossible to build a simple website to crowdsource, but I'm thinking that if I missed something obvious I'd rather find out early than late. Also, if you have any input on the project itself, I'd love to hear your ideas!
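To make "performance relative to language popularity" a bit more concrete, here is a minimal sketch of one possible approach, not an established method: fit pass rate against the log of each language's share of some training corpus and look at the residuals. The function name `popularity_residuals` and both input dictionaries are hypothetical; the pass rates and corpus shares would have to come from whatever benchmark and corpus statistics end up being used.

```python
import math


def popularity_residuals(pass_rate, corpus_share):
    """Fit pass_rate ~ a + b * log(corpus_share) by ordinary least squares.

    Both arguments are dicts keyed by language name (hypothetical inputs).
    Returns a dict mapping each language to its residual: observed pass
    rate minus the pass rate predicted from popularity alone.
    """
    langs = sorted(pass_rate)
    xs = [math.log(corpus_share[lang]) for lang in langs]
    ys = [pass_rate[lang] for lang in langs]
    n = len(langs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("corpus shares must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return {
        lang: y - (intercept + slope * x)
        for lang, x, y in zip(langs, xs, ys)
    }
```

A language with a positive residual does better than its popularity alone would predict, which is roughly the signal hypothesis 1 is after; a log-linear fit is only an assumption here, and other baselines may fit better.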

Comments

Someone•1y ago
> For example: Lisp dialects make parsing code structure and blocks very easy, so one could assume an LLM can "understand their code better"

I would expect the reverse: Lisp has no syntactic sugar, making it harder for an LLM to glue code fragments together in a way that produces valid Lisp code. Even guaranteeing that parentheses are correctly nested can already be a challenge.
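For scoring generated Lisp on exactly that point, a minimal sketch of a nesting check could look like the following. It assumes double-quoted strings with backslash escapes and `;` line comments, and it does not handle character literals such as `#\(` or block comments; the name `parens_balanced` is just illustrative.

```python
def parens_balanced(source: str) -> bool:
    """Return True if every '(' in source has a matching ')'.

    Parentheses inside double-quoted strings and ';' line comments
    are ignored. Character literals and block comments are NOT handled.
    """
    depth = 0
    in_string = False
    in_comment = False
    escaped = False
    for ch in source:
        if in_comment:
            # A line comment ends at the next newline.
            if ch == "\n":
                in_comment = False
            continue
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == ";":
            in_comment = True
        elif ch == '"':
            in_string = True
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                # A closing parenthesis with nothing open.
                return False
    return depth == 0 and not in_string
```

Running this over model output gives a cheap validity filter before attempting to evaluate the code at all.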

As to a set of programs: they aren't exactly what you're looking for, but I would consider https://projecteuler.net (does not contain solutions, but searching for "Project Euler solutions" finds some) or https://benchmarksgame-team.pages.debian.net/benchmarksgame.

sargstuff•1y ago
Very open-ended questions. GeeksforGeeks is loosely organized around computer science topics of study: https://www.geeksforgeeks.org/

nit-pick details:

Ignoring hardware differences, "performance" comparisons can be based on differences between the algorithm(s) used vs. how an algorithm is implemented. For a given language, "algorithm implementation performance" can be defined as the trade-offs in how a given algorithm is implemented in that language (compared to other programming languages, but also ease of use/flexibility based on 'language generation level' -> https://www.geeksforgeeks.org/generation-programming-languag... )

----------------------

1) General computation language specialty 'modules' notwithstanding, "languages" are built/optimised around core algorithmic concepts or the anticipated area of concentration of the targeted professional environment, e.g. OpenCL (GPU), R (statistics), Lisp (engineering design), C (OS level), SQL (data selection), Jasper Reports, COBOL (business), etc. Languages tend to be 'popular' because of the ecosystem provided around/for a given language.

snarky side note -> one can always write a more standard language that compiles to an esolang & provide appropriate emacs/vim/sed/spacemacs IDE support: https://esolangs.org/wiki/Main_Page

LLMs are very useful at curating information and recognizing/summarizing "statistical" relevance. E.g. APL is great for an engineering mindset, but not so good for the business use cases COBOL targets. An LLM might recognize, for a given user, which 'APL' aspects and which 'COBOL' aspects that user commonly relies on, and recommend language(s) with suitable commonalities.


2) Searching for 'coding challenges' or 'algorithmic coding challenges' brings up many types of answers/sites for honing one's coding skills (various languages, beginner to expert, etc). Coding 'algorithms' vs. coming up with algorithm(s) to code is sort of a side aspect. There are also differences between 'competition' challenges vs. 'technical challenges' (e.g. 512 c64 vs. 1 Raspberry Pi) vs. "computer science coding challenges" vs. 'computational genomic challenges'.

?? how easy / hard based on 'profession', e.g. artist vs. software designer with 20 years experience programming in Scheme; environment -- NASA vs. Google vs. insurance company.

?? from scratch : https://synoptek.com/insights/it-blogs/10-challenges-every-software-product-developer-faces/

?? based on industry standards ?? ; just trying to keep skills honed ??
