The tool trains an ensemble of Word2Vec models on each text independently, aligns them into a shared vector space via Generalized Procrustes Analysis, and exposes an API for computing cosine similarity with 95% confidence intervals across the ensemble. That enables two things. First, comparing semantic drift between authors: how does the semantic neighbourhood of "rent" in Smith compare to "rent" in Ricardo? Where do they converge, and where do they diverge? Second, quantifying reliability: with shorter texts, embeddings get noisy, and confidence intervals help you distinguish genuine semantic drift from a model that simply didn't have enough data to learn a stable relationship.
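The actual pipeline isn't shown here, but the core idea can be sketched in a few lines of NumPy. The function names are hypothetical, and this uses plain pairwise orthogonal Procrustes as the alignment step, where the tool itself runs full Generalized Procrustes Analysis over the whole ensemble (iterating alignments against a mean configuration):

```python
import numpy as np

def procrustes_align(source, target):
    """Rotate `source` onto `target` via the orthogonal Procrustes solution.

    This is just the pairwise step; Generalized Procrustes Analysis repeats
    it against an evolving mean shape across the full ensemble.
    """
    u, _, vt = np.linalg.svd(source.T @ target)
    return source @ (u @ vt)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def similarity_with_ci(models_a, models_b, idx_a, idx_b):
    """Cosine similarity between word idx_a (corpus A) and idx_b (corpus B),
    with a 95% percentile interval taken across the aligned ensemble."""
    sims = [cosine(A[idx_a], B[idx_b]) for A, B in zip(models_a, models_b)]
    lo, hi = np.percentile(sims, [2.5, 97.5])
    return float(np.mean(sims)), (float(lo), float(hi))
```

The spread of `sims` across independently trained models is exactly what separates "these words really drifted" from "this corpus is too small for a stable embedding."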
Currently running on works by Smith, Ricardo, Mill, Steuart, and Bastiat, sourced from Project Gutenberg.
Stack: Python, FastAPI, Lambda, DynamoDB, React. Six containerized Lambdas, zero idle cost.
Backend: https://github.com/areebms/embedding-analytics
Frontend: https://github.com/areebms/embedding-analytics-frontend
I think reliability measurements in embeddings may have applications well beyond historical texts. Would love feedback from anyone working on embedding evaluation or uncertainty quantification.