Hi HN! I’ve been working on crawl.ws for the last couple of months.
I created this because I was personally frustrated by how cluttered and difficult to navigate the original Wayback Machine can be. My goal is to build an independent, truly permanent, open-access web archive focused on a clean, fast, and simple user experience.
Current Technical Status & Future Plans
To provide immediate access to history and solve the 'cold start' problem, the frontend is currently built on top of the Wayback Machine API. When a user requests an archive, we automatically cache that snapshot on our own servers to improve performance and guarantee long-term availability.
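For anyone curious what that flow looks like, here's a rough Python sketch. The Availability API endpoint is the real entry point; the local cache layout and function names below are just illustrative, not the actual crawl.ws implementation:

    # Sketch of the lookup-then-cache flow: ask the Wayback Machine for the
    # closest snapshot, then persist it locally so repeat requests are served
    # from our own disk. Cache layout here is an illustrative assumption.
    import hashlib
    import json
    import pathlib
    import requests

    CACHE_DIR = pathlib.Path("snapshot_cache")  # hypothetical local cache root

    def fetch_and_cache(url, timestamp=""):
        # 1. Ask the Availability API for the snapshot closest to `timestamp`.
        resp = requests.get(
            "https://archive.org/wayback/available",
            params={"url": url, "timestamp": timestamp},
            timeout=10,
        )
        resp.raise_for_status()
        closest = resp.json().get("archived_snapshots", {}).get("closest")
        if not closest or not closest.get("available"):
            return None

        # 2. Download the snapshot and store it under a stable key so the next
        #    request for the same URL/timestamp never leaves our servers.
        snapshot = requests.get(closest["url"], timeout=30)
        snapshot.raise_for_status()
        key = hashlib.sha256(f"{url}:{closest['timestamp']}".encode()).hexdigest()
        CACHE_DIR.mkdir(exist_ok=True)
        path = CACHE_DIR / f"{key}.html"
        path.write_bytes(snapshot.content)
        (CACHE_DIR / f"{key}.json").write_text(json.dumps(closest))  # keep metadata
        return path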
For future indexing, I'm building our own Python-based crawler with a focus on resilience and storage efficiency. The plan is to phase out reliance on the Wayback Machine API as that index grows.
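To make "resilience and storage efficiency" concrete, this is roughly the shape of it: bounded retries with backoff on the fetch side, and content-addressed dedup on the storage side so an unchanged page costs no extra space. Names and paths are illustrative only, not the real crawler:

    # Minimal sketch under the assumptions above, not the production crawler.
    import hashlib
    import pathlib
    import time
    import requests

    STORE = pathlib.Path("page_store")  # hypothetical storage root

    def fetch_with_retries(url, attempts=3):
        # Resilience: retry transient failures with exponential backoff
        # instead of dropping the URL on the first error.
        for i in range(attempts):
            try:
                resp = requests.get(url, timeout=15)
                resp.raise_for_status()
                return resp.content
            except requests.RequestException:
                time.sleep(2 ** i)
        return None

    def store_deduplicated(body):
        # Storage efficiency: identical payloads hash to the same file,
        # so re-crawling an unchanged page adds nothing to disk.
        digest = hashlib.sha256(body).hexdigest()
        STORE.mkdir(exist_ok=True)
        path = STORE / f"{digest}.bin"
        if not path.exists():
            path.write_bytes(body)
        return path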
I also plan to integrate AI tools (coming soon) to help researchers derive more insight from the archived data.
What I’m Looking For
I’m here to answer questions all day, but I’m specifically looking for feedback on two main areas:
Search Experience: Does the simple search feel fast, accurate, and easy to use?
Archival Gaps: What kind of sites do you find consistently missing or broken on other archives? Submitting those URLs helps us test and improve our next-generation crawler.
Thanks for checking it out!