We discuss how to generate synthetic Text2SQL datasets for training and evaluation using Dataframer and Haiku. We have also open-sourced a dataset on HuggingFace containing hundreds of executable Text2SQL pairs, validated end-to-end in a Postgres/MySQL/SQLite execution environment.
pjoshi30•17h ago