frontpage.
newsnewestaskshowjobs

Made with ♥ by @iamnishanth

Open Source @Github

fp.

PanelBench: We evaluated Cursor's Visual Editor on 89 test cases. 43 fail

https://www.tryinspector.com/blog/code-first-design-tools
1•quentinrl•2m ago•0 comments

Can You Draw Every Flag in PowerPoint? (Part 2) [video]

https://www.youtube.com/watch?v=BztF7MODsKI
1•fgclue•7m ago•0 comments

Show HN: MCP-baepsae – MCP server for iOS Simulator automation

https://github.com/oozoofrog/mcp-baepsae
1•oozoofrog•10m ago•0 comments

Make Trust Irrelevant: A Gamer's Take on Agentic AI Safety

https://github.com/Deso-PK/make-trust-irrelevant
2•DesoPK•14m ago•0 comments

Show HN: Sem – Semantic diffs and patches for Git

https://ataraxy-labs.github.io/sem/
1•rs545837•16m ago•1 comments

Hello world does not compile

https://github.com/anthropics/claudes-c-compiler/issues/1
2•mfiguiere•21m ago•0 comments

Show HN: ZigZag – A Bubble Tea-Inspired TUI Framework for Zig

https://github.com/meszmate/zigzag
2•meszmate•24m ago•0 comments

Metaphor+Metonymy: "To love that well which thou must leave ere long"(Sonnet73)

https://www.huckgutman.com/blog-1/shakespeare-sonnet-73
1•gsf_emergency_6•26m ago•0 comments

Show HN: Django N+1 Queries Checker

https://github.com/richardhapb/django-check
1•richardhapb•41m ago•1 comments

Emacs-tramp-RPC: High-performance TRAMP back end using JSON-RPC instead of shell

https://github.com/ArthurHeymans/emacs-tramp-rpc
1•todsacerdoti•45m ago•0 comments

Protocol Validation with Affine MPST in Rust

https://hibanaworks.dev
1•o8vm•50m ago•1 comments

Female Asian Elephant Calf Born at the Smithsonian National Zoo

https://www.si.edu/newsdesk/releases/female-asian-elephant-calf-born-smithsonians-national-zoo-an...
2•gmays•51m ago•0 comments

Show HN: Zest – A hands-on simulator for Staff+ system design scenarios

https://staff-engineering-simulator-880284904082.us-west1.run.app/
1•chanip0114•52m ago•1 comments

Show HN: DeSync – Decentralized Economic Realm with Blockchain-Based Governance

https://github.com/MelzLabs/DeSync
1•0xUnavailable•57m ago•0 comments

Automatic Programming Returns

https://cyber-omelette.com/posts/the-abstraction-rises.html
1•benrules2•1h ago•1 comments

Why Are There Still So Many Jobs? The History and Future of Workplace Automation [pdf]

https://economics.mit.edu/sites/default/files/inline-files/Why%20Are%20there%20Still%20So%20Many%...
2•oidar•1h ago•0 comments

The Search Engine Map

https://www.searchenginemap.com
1•cratermoon•1h ago•0 comments

Show HN: Souls.directory – SOUL.md templates for AI agent personalities

https://souls.directory
1•thedaviddias•1h ago•0 comments

Real-Time ETL for Enterprise-Grade Data Integration

https://tabsdata.com
1•teleforce•1h ago•0 comments

Economics Puzzle Leads to a New Understanding of a Fundamental Law of Physics

https://www.caltech.edu/about/news/economics-puzzle-leads-to-a-new-understanding-of-a-fundamental...
3•geox•1h ago•1 comments

Switzerland's Extraordinary Medieval Library

https://www.bbc.com/travel/article/20260202-inside-switzerlands-extraordinary-medieval-library
2•bookmtn•1h ago•0 comments

A new comet was just discovered. Will it be visible in broad daylight?

https://phys.org/news/2026-02-comet-visible-broad-daylight.html
4•bookmtn•1h ago•0 comments

ESR: Comes the news that Anthropic has vibecoded a C compiler

https://twitter.com/esrtweet/status/2019562859978539342
2•tjr•1h ago•0 comments

Frisco residents divided over H-1B visas, 'Indian takeover' at council meeting

https://www.dallasnews.com/news/politics/2026/02/04/frisco-residents-divided-over-h-1b-visas-indi...
4•alephnerd•1h ago•5 comments

If CNN Covered Star Wars

https://www.youtube.com/watch?v=vArJg_SU4Lc
1•keepamovin•1h ago•1 comments

Show HN: I built the first tool to configure VPSs without commands

https://the-ultimate-tool-for-configuring-vps.wiar8.com/
2•Wiar8•1h ago•3 comments

AI agents from 4 labs predicting the Super Bowl via prediction market

https://agoramarket.ai/
1•kevinswint•1h ago•1 comments

EU bans infinite scroll and autoplay in TikTok case

https://twitter.com/HennaVirkkunen/status/2019730270279356658
7•miohtama•1h ago•5 comments

Benchmarking how well LLMs can play FizzBuzz

https://huggingface.co/spaces/venkatasg/fizzbuzz-bench
1•_venkatasg•1h ago•1 comments

Why I Joined OpenAI

https://www.brendangregg.com/blog/2026-02-07/why-i-joined-openai.html
37•SerCe•1h ago•32 comments
Open in hackernews

Show HN: Generate coherent, synthetic data at scale

https://github.com/ds-horizon/datagen
4•darshanime•3mo ago
Internally at our org, we have an ecosystem of over 200 microservices, implementing various parts of the business logic. To test any changes, we provide developers with on-demand sandboxed environments. One issue we had to solve for that was the creation of synthetic data across services, which respected the business rules and was coherent.

Today, we are happy to introduce datagen, a tool we developed internally to solve this problem. It generates coherent, synthetic data with the ability to model complex relationships. It is a new DSL (domain specific language) using which the user specifies the shape of the entity they wish to generate, and generator functions describing the logic for generating each field. The entity can be a table in a relational dbms, or a json document in a document store, or a csv file to upload on S3 etc.

The user writes models in .dg files that are transpiled to golang code, which can then be used to generate coherent, synthetic data.

Here is a simple example:

  // users.dg
  model users {
    fields {
      name() string
      age() int
    }
    gens {
      func name() {
        return "Arthur Dent" // hardcoded value
      }
      func age() {
        return IntBetween(18, 65)
      }
    }
  }
Checkout the website for more information: https://ds-horizon.github.io/datagen/

Demo video - https://www.youtube.com/watch?v=ly0DfzTup28

Comments

gurjeet•3mo ago
Something similar I found: https://www.tinybird.co/blog/mockingbird-announcement-mock-d...
darshanime•3mo ago
Hi, thanks for sharing. There are quite different tools; afaiu, the one you shared does not have any means of cross referencing other data. Also I could see only basic knobs to control the data generation -- ints b/w max/min, weighted distribution from a set of possible options etc.

datagen on the other hand allows you to access the data of any model, any field, any row to create new data; much like a DAG. This is a very powerful abstraction.

Of course, not having to write "code" in json is great too!

ProofHouse•2mo ago
Is there a good way this could be used for model distillation? Hmmm