The crawlers were actually my second attempt at gathering data for this side project (I first tried HN posts, but never quite found a good way to parse them), which I'd been toying around with on paper since a previous job. To expand a bit on the headline: it lets you upload search results in JSON, "swipe" through them (I haven't gotten to touch interactions yet) as a basic way of sorting the results you do and don't want to move forward with, and track your overall interview process.
I recently shipped a first draft of the UI, which is the main reason I'm posting this, but the full project also includes an API for storing and tracking job results. In particular, I built this to have basic duplicate detection baked in -- both while crawling and at upload -- so I can aggregate results from multiple searches (including across different sources) without making extraneous network calls. The frontend will also show off (mocks of) some other quality-of-life items, like some basic filters for the crawlers, default interview questions, and templated cover letters.
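To give a flavor of the upload-side check, here's a minimal sketch of dedupe-at-upload over plain Deno KV. To be clear, this is illustrative rather than the project's actual code -- the key scheme and helper names are made up, and the real implementation goes through kvdex (more on that in the notes below):

```ts
// Illustrative only: escape-rope's real dedupe goes through kvdex,
// not raw Deno KV, and this key scheme is hypothetical.
interface JobResult {
  name: string;    // job title -- required
  company: string; // required
  [extra: string]: unknown;
}

const kv = await Deno.openKv();

// Normalize the two required fields into a stable key, so the same
// posting surfaced by two different searches collides on insert.
const keyFor = (job: JobResult): Deno.KvKey => [
  "jobs",
  job.company.trim().toLowerCase(),
  job.name.trim().toLowerCase(),
];

// Insert only unseen results; report what got skipped as duplicates.
async function uploadResults(results: JobResult[]) {
  const skipped: JobResult[] = [];
  for (const job of results) {
    const existing = await kv.get(keyFor(job));
    if (existing.value !== null) {
      skipped.push(job);
      continue;
    }
    await kv.set(keyFor(job), job);
  }
  return { added: results.length - skipped.length, skipped };
}
```

A real version would presumably want fuzzier matching than exact company + title, but that's the shape of the check.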
I don't really have a "roadmap" per se, because it's not really a product, but some top-of-mind items include: a more cohesive design for creating and configuring custom datasources; some basic quality-of-life fixes -- manually adding/editing jobs, more flexible units for pay, etc.; and some more medium- to long-term projects like company tracking.
On that note, the crawler workflows themselves are one of the next things I need to clean up, and they live in a private branch in the meantime. By design, though, the tool only strictly requires `name` and `company` fields on each result, and doesn't require any specific workflow for gathering that data.
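Put differently, a datasource here can be anything that eventually produces objects with those two fields. The shape below is mine, not the project's actual interface, but it captures the contract:

```ts
// Any async step that yields objects with `name` and `company` can feed
// the tool; whether it's browser automation, an API call, or a hand-typed
// list is up to you. (This type is illustrative, not escape-rope's own.)
type JobResult = { name: string; company: string } & Record<string, unknown>;

const fromStaticList = async (): Promise<JobResult[]> => [
  { name: "Senior TypeScript Developer", company: "Example Corp" },
  { name: "Platform Engineer", company: "Acme", url: "https://acme.example/jobs/42" },
];
```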
To give you an idea of the intended scale -- it's a tool I built for me, both to cut out tedium and to have something I can show off a bit. But ideally, someone with a self-hosting bent might also get some use out of it in the event of any sudden need to start updating resumes.
Some random side notes:
- This is all hand-rolled CSS. I'm still figuring out a scrollbar issue around `dvh` units, and a couple of layout issues on smaller phones... but again, this is all hand-rolled CSS.
- There's also a lingering animation glitch in Firefox. That might be a rendering problem, which is a separate concern: I'm considering a migration away from React now that I've got the basic functionality scaffolded.
- My time on this one has been split between (in loose order of priority): using it; dealing with critical-path usability issues, including crawler troubleshooting; building it out (both frontend and backend) to extend my use cases; building it out to show off; and handling deploy-specific tasks, like the test data and demo branch.
- Getting this working also meant contributing to multiple open-source projects. The duplicate detection is based on an analog to MongoDB's `findOne`/`findOneAndUpdate` that I contributed to kvdex, a document database built on Deno KV, and I contributed user agent switching to Astral, a browser automation lib also built on Deno. (There's a rough sketch of the latter just after these notes.)
- Keeping the project light on external dependencies is a goal, but keeping it entirely free of them is not.
- Some of the other, smaller issues you'll find -- around stuff like data not persisting -- are unique to the demo branch and its mocked API calls.
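Since Astral came up: this is roughly what driving a crawl with a custom user agent can look like. One caveat -- this sketch passes Chromium's generic `--user-agent` flag through launch args (which I'm assuming Astral forwards to the browser process) rather than demonstrating the dedicated user agent option itself, so check Astral's docs for the first-class API:

```ts
// Hedged sketch: a crawl through Astral with a custom user agent.
// The `--user-agent` flag is Chromium's; the UA string is made up,
// and forwarding it via `args` is an assumption about launch options.
import { launch } from "jsr:@astral/astral";

const browser = await launch({
  args: ["--user-agent=escape-rope-crawler/0.1"],
});

const page = await browser.newPage("https://example.com/jobs");

// Pull out whatever your source-specific workflow needs; the tool
// itself only ultimately cares about `name` and `company`.
const title = await page.evaluate(() => document.title);
console.log(title);

await browser.close();
```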
Demo: escape-rope.bhmt.dev
Backend repo: github.com/chaosharmonic/escape-rope
Frontend repo: github.com/chaosharmonic/escape-rope-ui
Lengthy scraping writeup: bhmt.dev/blog/scraping