It comes in two formats: One is a text document with column-width restrictions that make it very hard to read, worse than the text version of an IETF RFC. The second is a machine-readable XML document, which itself isn't easily read.
Are there any good tools for viewing these? I did find GovTrack.us but it seems to be down so I'm not sure if it solves this problem.
[1]: https://www.congress.gov/bill/119th-congress/house-bill/1/text
beej71•9h ago
Except that it's a government thing so the parser's probably not going to be little. :)
Edit: The thing's basically XHTML without any kind of header. UTF-8 encoding, it looks like. So a conversion tool would just need to wrap it up and add styling.
Edit: Despite hints that it's XHTML, it's not valid XHTML.
Edit: Stick this at the top of the file:
--------------------- 8< ---------------------
<!DOCTYPE html>
<html>
<head>
</head>
--------------------- 8< ---------------------
And add this to the bottom of the file:
--------------------- 8< ---------------------
</html>
--------------------- 8< ---------------------
I'll leave it as an exercise to the reader to write a script to do that. Automatically extracting the bill title should be Fun.
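For what it's worth, here is a minimal Python sketch of that wrapping script. It assumes the file is UTF-8 (as it appeared to be above) and simply adds the header and footer shown in this comment. The function names `wrap_as_html` and `wrap_file` are made up for illustration, and title extraction is not attempted.

```python
from pathlib import Path

# Header and footer exactly as given in the comment above.
HEADER = "<!DOCTYPE html>\n<html>\n<head>\n</head>\n"
FOOTER = "</html>\n"


def wrap_as_html(body: str) -> str:
    """Return body with the header prepended and the footer appended."""
    # Make sure the footer starts on its own line.
    if not body.endswith("\n"):
        body += "\n"
    return HEADER + body + FOOTER


def wrap_file(src: str, dst: str) -> None:
    """Read src as UTF-8 (an assumption), wrap it, and write the result to dst."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(wrap_as_html(text), encoding="utf-8")
```

Calling something like `wrap_file("bill.xml", "bill.html")` (both file names are placeholders) should produce a file a browser will open. Styling would still need to be added inside the `<head>`.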
gabrielsroka•8h ago
https://developer.mozilla.org/en-US/docs/Web/HTML/Reference/...