For each app, I added the privacy policy URL to a database, then made a Perplexity Sonar API call to fetch the text and process it into a summary, then manually cleaned the result. Currently, I manually get the URL for the privacy policy and have cataloged the top 100 apps, but I plan to automate it further w/ web scrapers, which should allow for blitzscaling of the app database.
This is just a beta so far, let me know if anyone has any suggestions, and if anyone knows what web scraper is best to use? Currently, I'm thinking most likely ParseHub or Scrapy.
MajEagle•3h ago