I sort of need to pull all the data at initialization because I need to map out how every post affects every other - the links between posts are what take up the majority of the storage, not the text inside the posts. It's also kind of the only way to preserve privacy.
Why are you serving so much data personally instead of just reformatting theirs?
Even if you're serving it locally... I mean, a regular 100 Mbit line should easily support tens or hundreds of text users...
What am I missing?
Because then you only need to download 40MB of data and do minimal processing. If you were to take the dumps off of Wikimedia, you would need to download 400MB of data and do processing on that data that would take minutes of time.
And also it's kind of rude to hotlink half a gig of data on someone else's site.
> What am I missing?
40MB per second is 320 Mbps, so even 3 visitors per second maxes out a gigabit connection.
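For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes decimal units (1 MB = 8 Mbit, gigabit = 1000 Mbps) and ignores protocol overhead; the function names are made up for illustration.

```python
PAYLOAD_MB = 40    # megabytes downloaded per visit (figure from the thread)
GIGABIT_MBPS = 1000  # a gigabit line in megabits per second (decimal units)


def required_mbps(visitors_per_second, payload_mb=PAYLOAD_MB):
    """Sustained bandwidth needed if every visitor pulls the full payload."""
    return visitors_per_second * payload_mb * 8  # 8 bits per byte


def max_visitors_per_second(link_mbps, payload_mb=PAYLOAD_MB):
    """How many full downloads per second a link can sustain."""
    return link_mbps / (payload_mb * 8)


print(required_mbps(1))                       # 320 Mbps for one visitor per second
print(required_mbps(3))                       # 960 Mbps, just under a gigabit
print(max_visitors_per_second(GIGABIT_MBPS))  # 3.125 visitors per second
```

So the ceiling is a bit over 3 new visitors per second before the line is saturated, and real-world overhead would push it lower.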
All I'm getting from your server is a title, a sentence, and an image.
Why not give me say the first 20 and start loading the next 20 when I reach the 10th?
That way you're not getting hit with 40MB for every single click but only a couple of MB per click and a couple more per scroll for users that are actually using the service?
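Roughly what I mean, as a minimal Python sketch. This is not how the site works; `BatchedFeed` and `fetch_batch` are hypothetical names, and a real page would do the fetch asynchronously in the browser rather than synchronously like this.

```python
from typing import Callable, List


class BatchedFeed:
    """Load posts in batches and prefetch the next batch mid-way through the latest one."""

    def __init__(self, fetch_batch: Callable[[int, int], List[str]],
                 batch_size: int = 20, prefetch_at: int = 10):
        self.fetch_batch = fetch_batch    # hypothetical: (offset, limit) -> list of posts
        self.batch_size = batch_size      # "the first 20"
        self.prefetch_at = prefetch_at    # "when I reach the 10th"
        self.items: List[str] = []
        self.exhausted = False
        self._load_next()                 # initial batch

    def _load_next(self) -> None:
        if self.exhausted:
            return
        batch = self.fetch_batch(len(self.items), self.batch_size)
        if len(batch) < self.batch_size:  # short batch means nothing is left
            self.exhausted = True
        self.items.extend(batch)

    def on_view(self, position: int) -> None:
        """Call with the 1-based position of the post the reader just reached."""
        # Trigger point: the prefetch_at-th post of the most recently loaded batch.
        trigger = len(self.items) - self.batch_size + self.prefetch_at
        if position >= trigger:
            self._load_next()


# Usage with a fake in-memory source of 45 posts.
posts = [f"post {i}" for i in range(45)]
feed = BatchedFeed(lambda offset, limit: posts[offset:offset + limit])
feed.on_view(10)        # reaching the 10th post pulls in posts 21-40
print(len(feed.items))  # 40
```

Someone who bounces after a few posts only ever costs you the first batch.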
Look at your logs. How many people only ever got the first 40 and clicked off because you're getting ddosed? Every single time that's happened (which is more than a few times based on HN posts), you've not only lost a user but weakened the experience of someone that's chosen to wait by increasing their load time by insisting that they wait for the entire 40MB download.
I am just having trouble understanding why you've decided to make me and your server sit through a 40MB transfer for text and images...
Because you need all of the cross-article link data, which is the majority of the 40MB, to run the algorithm. The algorithm does not run on the server, because I care about both user privacy and internet preservation.
Once the 40MB is downloaded, you can go offline, and the algorithm will still work. If you save the index.html and the 40MB file, you can run the entire thing locally.
> actually using the service
This is a fun website, it is not a "service".
> you've not only lost a user but weakened the experience of someone that's chosen to wait by increasing their load time
I make websites for fun. Losing a user doesn't particularly affect me, I don't plan on monetizing this, I just want people to have fun.
Yes, it is annoying that people have to wait a bit for the page to load, but that is only because the project has hundreds of thousands more eyes on it than I expected. I expected this project to get a few hundred visits within the first few hours, in which case the bandwidth wouldn't have been an issue whatsoever.
> I am just having trouble understanding why you've decided to make me and your server sit through a 40MB transfer for text and images...
Running the algorithm locally, privacy, stability, preservation, the ability to look at and play with the code, the ability to go offline, ease of maintenance and hosting, etc.
Besides, sites like Twitter use up like a quarter of that for the JavaScript alone.
Also, all three of the examples are projects that have years of dev effort and hosting infrastructure behind them - Xikipedia is a project I threw together in less than a day for fun. I don't want to put effort into server-side maintenance and upkeep for such a small project. I just want a static index.html I can throw in /var/www/ and forget.
And re: hosting, my bare metal box is fine. It's just slow right now because it's getting a huge spike of attention. I don't want to pay for a CDN, and I doubt I could host a file getting multiple gigabits per second of traffic for free.
Thank you for making my day a little brighter.
WP is already shit, why should anyone doomscroll it?
I easily have over 100 tabs of Wikipedia open at any one time, reading about the most random stuff ever. I'm the guy who will unironically look up the food I'm eating on Wikipedia while I'm eating it.
No need to try to make it "doomscrollable" when it's already got me by the balls.
DuckDB loaded in the browser via WebAssembly and Parquet files in S3.
I think it would be nice if you could do a non-Simple English version, but I'm nevertheless happy with what you've put together, and I've added a shortcut on my phone. Please don't let the negativity stop you from continuing to work on it.
Thank you.
The United States Virgin Islands are a group of islands in the Caribbean Sea. They are currently owned and under the authority of the United States Government. They used to be owned by Denmark (and called Danish West Indies). They were sold to the U.S. on January 17, 1917, because of fear that the Germans would capture them and use them as a submarine base in World War I.
https://simple.wikipedia.org/wiki/United_States_Virgin_Islan...
esperent•6h ago
I gave up after about a minute.