Today, we are happy to introduce datagen, a tool we developed internally to solve this problem. It generates coherent synthetic data and can model complex relationships. datagen is a new DSL (domain-specific language) in which the user specifies the shape of the entity they wish to generate, along with generator functions that describe the logic for generating each field. The entity can be a table in a relational DBMS, a JSON document in a document store, a CSV file to upload to S3, and so on.
The user writes models in .dg files, which are transpiled to Go code that can then be used to generate the data.
Here is a simple example:
    // users.dg
    model users {
      fields {
        name() string
        age() int
      }
      gens {
        func name() {
          return "Arthur Dent" // hardcoded value
        }
        func age() {
          return IntBetween(18, 65)
        }
      }
    }
Check out the website for more information: https://ds-horizon.github.io/datagen/
Edit: demo video: https://www.youtube.com/watch?v=ly0DfzTup28