We had some users asking for a way to constrain the scope of the visual perception. Specifically, they were processing scanned forms where a field like "Gender" or "Payment Mode" would return inconsistent raw text (e.g., "M", "Male", or just a checked box symbol) depending on the document layout.
To solve this, we added an Enum type to the builder.
You can now visually map a region and strictly define the allowed states (e.g., ["Male", "Female"] or ["Sedan", "SUV", "Truck"]). The engine will now force the visual signal into one of those pre-defined buckets instead of returning ambiguous strings.
It’s a small change, but it makes the JSON output deterministic and saves you from writing extra code to normalize the data downstream.
Happy to hear any feedback.