That said, it's often generated via Org-mode or WYSIWYG tools these days.
All the conferences I'm looking at have word and latex templates. Word clearly isn't going to replace LaTeX for many reasons.
Truth be told: as long as you're given a template and have to stick to it, LaTeX is all you could ever want. With Overleaf there's barely a tooling learning curve either.
Years ago I had a student in my class who was unable to make out written material, but had a machine that read text aloud. My class notes (some 300+ pages) are chock full of mathematics. I went to the student's house to see how that reading machine worked. Fed LaTeX input, it said a lot of things like "backslash alpha" and "begin open brace equation close brace". I wrote a quick Perl script to change that, so it said "alpha" and "begin equation". Presto: it was exactly what the student needed. This was, as I say, many years ago. Maybe now there is software that can handle MS Word files, etc., but it definitely did not exist at the time. The result? The student was able to take the class, and did very well in it.
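The script was nothing fancy, just a handful of textual substitutions applied before the text reached the reader. A rough sketch of the idea in Python rather than the original Perl (the substitution rules here are illustrative, not the actual ones I used):

```python
import re

# Illustrative substitutions only -- the real script had many more rules.
SUBSTITUTIONS = [
    (re.compile(r'\\begin\{(\w+)\}'), r'begin \1'),              # \begin{equation} -> "begin equation"
    (re.compile(r'\\end\{(\w+)\}'), r'end \1'),                  # \end{equation}   -> "end equation"
    (re.compile(r'\\(alpha|beta|gamma|delta)'), r'\1'),          # \alpha           -> "alpha"
    (re.compile(r'\\frac\{([^}]*)\}\{([^}]*)\}'), r'\1 over \2'),  # \frac{x}{y}    -> "x over y"
]

def speakable(latex: str) -> str:
    """Rewrite LaTeX markup into text a screen reader can say naturally."""
    for pattern, replacement in SUBSTITUTIONS:
        latex = pattern.sub(replacement, latex)
    return latex
```

For example, `speakable(r'\begin{equation} \alpha = \frac{x}{y} \end{equation}')` comes out as "begin equation alpha = x over y end equation".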
In mechanical and aerospace engineering broadly, I see less use of LaTeX over time. It's hard to estimate precisely what percentage of the market is LaTeX vs. Word vs. something else, but I think I can see trends. Almost no one where I work uses LaTeX, though LaTeX used to be more popular there.
I think it probably varies a lot by narrow specialty and publication venue too. Papers submitted to Journal of Fluid Mechanics seem to overwhelmingly use LaTeX. The main conference I would submit papers to during my PhD is primarily Word (though I used LaTeX). I have seen at least one Word-only engineering journal, though it wasn't something I would publish in.
It appears functional, efficient, well organized, and fast, as you observed.
https://www.npr.org/2025/04/09/g-s1-59090/trump-officials-ha... ("Trump officials halt $1 billion in funding for Cornell, $790 million for Northwestern")
0. https://blog.arxiv.org/2023/06/
1. https://www.syracuse.com/news/2025/04/central-ny-college-sue...
Why do I smell that someone from G was there and sold them a fancy cloud story (or they wanted VMs and a reseller sold them Cloud Run)? Anyway, goodbye simplicity and stability, hello exorbitant monthly costs for the same or less service quality. Would love to be wrong.
But in all likelihood someone was probably just like "we're tired of doing ops on 2 decades old stacks"
I bet arXiv was run on server hardware costing under $10k before...
And now it'll end up costing $10k per month (with free credit from Google which will eventually go away and then arXiv will shut down or be forced to go commercial)
(pdf) https://info.arxiv.org/about/reports/arXiv_CY19_midyear.pdf
They are already using VMs, but one of the things the modernization will do is:
> containerize all, or nearly all arXiv services so we can deploy via Kubernetes or services like Google Cloud Run
And further state:
> The modernization will enable:
> - arXiv to expand the subject areas that we cover
> - improve the metadata we collect and make available for articles, adding fields that the research community has requested such as funder identification
> - deal with the problem of ambiguous author identities
> - improve accessibility to support users with impairments, particularly visual impairments
> - improve usability for the entire arXiv community
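For context, "containerize" here usually means little more than wrapping each service in an image definition along these lines. A hypothetical sketch, not arXiv's actual setup; the file names, port, and entry point are invented:

```dockerfile
# Hypothetical container image for one arXiv-style Python web service.
FROM python:3.12-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY . .

# Cloud Run and Kubernetes both route traffic to a port the container listens on.
EXPOSE 8080
CMD ["gunicorn", "--bind", "0.0.0.0:8080", "app:application"]
```

Once every service looks like this, deploying via Kubernetes or Cloud Run is largely interchangeable, which is presumably the point.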
Getting creative is often just a pain in the ass. Doing the standard things, walking the well-trod path, is generally easier to do, even if it may not be the cheapest or most hardware/software-efficient thing to do.
I think you should offer better thoughts instead of mad-libbing in buzzwords where they don't apply. Enshittification is actually a useful concept, and I don't want it to go the way of "FUD", which had a similar trajectory in the later years of Slashdot where people just reduced it to a useless catch-all phrase whenever Microsoft said anything about anything.
I believe there's a real chance of that happening here as a result of this transition. I've personally experienced the results of several similar transitions over the course of my career. What I haven't experienced are problems with arXiv that would motivate such a change. There might be actual problems they're trying to solve, but I still believe things will probably get worse as a result.
Note that Google doesn’t outright define an architecture for anyone, but people who worked at Google who come in as the hot hire bring that with them. Ditto for other large employers of the day. One of my mentors had to deal with this when Yahoo was the big deal.
In some cases, when abstractions are otherwise correct, this hasn’t been a big deal for the software projects and companies I’ve been involved with. It’s simply “there’s an off the shelf well supported industry standard we can use so we can focus on our customer/end goal/value add/etc.” Using an alternative docker runtime “that Google recommends” (aka is suggested by Kubernetes) is just a choice.
Where people get bit and then look at this with a squint, is when you work at several places where on the suggestion of the new big hire from Google/Amazon/Meta/etc, software that runs just fine on a couple server instances and has a low and stable KTLO budget ends up being broken down into microservices on Kubernetes and suddenly needs a team of three and doesn’t provide any additional value.
The worst I’ve experienced is a company that ended up listing the cost of maintaining their fork of Kubernetes and development environment as a material reason for a layoff.
Google’s marketing arm also has made deals to help move people to Google Cloud from AWS. Where I am working now this didn’t work to plan on either side it seems so we’re staying dual cloud, a situation no one is happy about. Before my time there was an executive on the finance side that believed Google was always the company to bet on and didn’t see Amazon as more than a book store. Also money. Different type of hubris, different type of pressure, same technical outcome as a CTO that runs on a “well Google says” game plan.
At the end of the day, Google is a big place and employs a lot of people. You’re going to have a lot of individuals who experience hucksters trying to parlay Google experience into an executive or high ranked IC role and they’re going to lean on what they know. That has nothing to do with Google itself, but their attempts to pry people away from AWS are about the same flavor from my personal experience.
Are we though? I see pockets of world-expert-level knowledge, some reasonable shop talk, and quite a bit of really dumb nonsense that is contradicted by my professional experience. Or just pedestrian-level wrong. I mostly shitpost.
I don't have an opinion about arXiv's hosting, but it does read like one of those projects that includes cleaning up long-standing technical debt that they probably couldn't get funded if not for the flashy goal. The flashy goal will, regardless of its own merits, also be credited for improvements that they should have made all along.
Kubernetes does add complexity, but it adds a lot of good things too: auto-scaling, cycling of unhealthy pods, and failover of failed nodes, among others. I know there is a feeling here sometimes that cloud services and orchestrated containers are too much for many applications, but if you are running a very busy site like arXiv I can't see how running on bare metal is going to be better for your staff or the user experience. I don't think they are naive and got conned into GCP as the OP alludes to. They are smart people dealing with scaling and tech-debt issues, just like we all end up with at some point in our careers.
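To make "cycling of unhealthy pods" concrete: it amounts to a liveness probe in the deployment manifest, and the kubelet restarts any container that fails it. A hypothetical fragment (names, image, and health endpoint are all invented):

```yaml
# Hypothetical deployment fragment; not arXiv's actual configuration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: arxiv-web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: arxiv-web
  template:
    metadata:
      labels:
        app: arxiv-web
    spec:
      containers:
      - name: web
        image: example/arxiv-web:latest
        ports:
        - containerPort: 8080
        livenessProbe:            # kubelet restarts the container if this fails
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 10
          periodSeconds: 15
```

Getting the same behavior on bare metal means writing and babysitting your own health-check and restart scripts.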
If we can get past the fact that it requires JavaScript, Cloudflare Workers is literally the best single thing to happen, at least to me. With a single domain, I have built so many personal projects for problems I found interesting, all for literally free. No credit card, no worries whatsoever.
I might ditch writing server code in other languages like golang, even though I like golang more, just because Cloudflare Workers exists.
However Workers supports WASM so you don’t necessarily have to switch to JavaScript to use it.
I wrote some Rust code that I run in Cloudflare Functions, which is a layer on top of Cloudflare Workers which also supports WASM. I wrote up the gory details if you’re interested:
https://127.io/2024/11/16/generating-opengraph-image-cards-f...
JavaScript is most definitely the path of least resistance but it’s not the only way.
Funny, I also work on academic sites (much smaller than arXiv) and we're looking at moving from AWS to bare metal for the same reason. The $90/TB AWS bandwidth exit tariff can be a budget killer if people write custom scripts to download all your stuff; better to slow down than 10x the monthly budget.
(I never thought about it this way, but Amazon charges less to same-day deliver a 1TB SSD drive for you to keep than it does to download a TB from AWS.)
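The arithmetic, assuming AWS's common ~$0.09/GB internet egress tier and a ballpark $60 retail price for a 1 TB SSD (both figures are assumptions, and real egress pricing is tiered):

```python
# Ballpark comparison: downloading 1 TB from AWS vs. buying a 1 TB SSD outright.
# Rates are assumptions: ~$0.09/GB is the common AWS internet egress tier.
EGRESS_USD_PER_GB = 0.09
TB_IN_GB = 1024

egress_cost = EGRESS_USD_PER_GB * TB_IN_GB   # cost to serve 1 TB out of AWS
ssd_retail = 60.0                            # rough retail price of a 1 TB SSD

print(f"1 TB egress: ${egress_cost:.2f}")    # 1 TB egress: $92.16
print(f"1 TB SSD:    ${ssd_retail:.2f}")
```

So every single full-archive scrape costs more than shipping the scraper a drive, which is the absurdity being pointed out.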
It's way more predictable, in my opinion: you pay a fixed amount per month for your storage. It also helps that it's on the edge, so users get content way faster than, let's say, going to bare metal (unless you're provisioning a multi-server approach, and then you're probably using Kubernetes, which might be a mess to handle, I guess?).
https://www.reddit.com/r/sales/comments/134u0mq/cloudflare_c...
They got rid of all of the “underperforming” sales people and hired new ones. That nightmare is the result. I suspect the higher the sales performance, the more likely they were doing things like this.
If crawling is a problem, (1) it is pretty easy to rate limit crawlers, (2) point them at a requester-pays bucket, and (3) offer a torrent with anti-leech.
That said, I agree that transit costs are too high.
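Point (1) really is a few lines of code. A minimal token-bucket sketch, illustrative and not tied to any particular server stack:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: sustain `rate` requests/sec, burst up to `capacity`."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

In practice you'd keep one bucket per client IP (e.g. in a dict) and answer HTTP 429 whenever `allow()` returns False.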
The reason to switch away from fiber should be sustained aggregate throughput, not transfer cost.
The original example cited people writing custom scripts to download all your stuff blowing your budget. A reasonable equivalent to that is shipping the interested party a storage device.
More generally, despite the two things being different their comparison can nonetheless be informative. In this case we can consider the up front cost of the supporting infrastructure in addition to the energy required to use that infrastructure in a given instance. The result appears to illustrate just how absurd the current pricing model is. Bandwidth limits notwithstanding, there is no way that the OPEX of the postal service should be lower than the OPEX of a fiber network. It just doesn't make sense.
Can you not reliably block crawlers in this day and age?
but I think cloudflare is the answer to this thing as well.. (Sorry if I am being annoying) (Cloudflare isn't sponsoring me, I just love their service so much)
Azure? Microsoft as a company is still the choice of most IT departments due to its ubiquity and low barrier to entry. I personally wouldn't use Azure if I had the choice, because it's easy and cheap on the surface, with hell underneath (except for products based on other things, like AD, which was just a nice LDAP server, or C#, which was modelled after Java).
I’d have gone with AWS. EKS isn’t bad to set up and is solid once it’s up. As far as the health of Amazon itself, China entering their space hasn’t significantly changed their retail business, though eventually they’ll likely be in price wars.
The greatest risk to any cloud provider I think would be a war that could force national network isolation or government taking over infrastructure. And the grid would go down for a good while and water would stop, so everyone would start migrating, drinking polluted water, then maybe stealing food. At that point, whether or not they chose GC doesn’t matter anymore.
Google Cloud is profitable these days, and advertising or other income streams drying up will only entice Google to invest further in cloud to ensure they are more diversified. Google isn't going to go away overnight, and cloud is perhaps the least risky business they operate.
Oh noes ... they got scammed
No to the second, unless they've come to some sort of agreement with Cornell that lets them.
Also worth noting that gcp has over a decade of continuous service with no indication to think it should disappear any time soon. It's not clear why Google's consumer product strategies can be used to infer how their cloud products are run.
In any case, are there any documented instances of Google Cloud discontinuing service or terminating a client's hosting for <reasons>?
I disagree, but it depends on what we mean by "doing".
If this is motivated by the prospect of being menaced by the current US government then, while Google might be a safer home, arXiv is still vulnerable to having its funding disrupted by malicious actors.
I'd like to know if GCP is covering part of the bill? Or will Cornell be paying all of it? The new architecture smells of "[GCP] will pay/credit all of these new services if you agree to let one of our architects work with you". If GCP is helping, stay tuned for a blog post from google some time around the completion of the migration with a title like "Reaffirming our commitment to science" or something similarly self affirming.
Our Supporters
...
Gold Sponsors
Google, Inc (USA)
"Google pays to run an enormous intellectual resource in exchange for a self-congratulatory blogpost" seems like a perfectly acceptable outcome for society here.
This is an odd criticism. If a company is footing the bill, it can’t even talk about it to gain some publicity/good will?
While I understand that something is more genuine if done in secret, it doesn't stop being a real commitment to science just because you make a pr post about it.
If company X contributes to Y open source foundation, that's real and they get to claim clout, nobody cares about a post anyways.
That's not what Google says: https://support.google.com/a/answer/2891389?hl=en
I didn't know how that situation had evolved since I last used GCP.
I can't be angry at Google for following US law, any more than I can be angry at Huawei for following Chinese law.
We don't know where the people who will make scientific breakthrough will be. Imagine losing the cure for cancer, or a form of clean energy (or anything that could change the world for everyone) due to this.
Source: I applied to a Cornell-related lab in March. A week after submitting my application the role was rescinded and my contact emailed me explaining the situation.
> Together with all of American higher education, Cornell is entering a time of significant financial uncertainty. The potential for deep cuts in federal research funding, as well as tax legislation affecting our endowment income, has now been added to existing concerns related to rapid growth and cost escalations. It is imperative that we navigate this challenging financial landscape with a shared understanding and common purpose, to continue to advance our mission, strengthen our academic community, and deepen our impact. [0]
https://investinopen.org/blog/ioi-partners-with-arxiv-to-dev... https://blog.arxiv.org/2023/06/12/arxiv-is-hiring-software/
So costs went from maybe $80k a year for an engineer a decade ago, plus a few thousand for servers, to $200k for an engineer you would struggle to find, or $100k for a 'cloud engineer / architect' plus $100k to a cloud provider.
This sounds great in theory. Except that cloud providers are messy, and once you're vendor locked-in, you're in a big spiral. Secondly, the costs can be hidden and climb exponentially if you don't know exactly what you're doing. You might also hit weird bugs that, on your own servers, could be fixed over a Monday by patching some package you control, but on a cloud provider might take months to resolve, or never happen at all. The reality of moving to the cloud is not as rosy as it sounds.
Universities used to be the birth place of big projects that were created to solve problems they ran into hosting / running their own infrastructure. I hope that is still true.
I wonder if Ginsparg is finally retiring and relinquishing access.
I didn't realize arXiv was started in 1991. And then I wondered why I had never heard of it while I was at Cornell from 1997-2001. Apparently it only assumed the arXiv name in 1999.
I like that it was a bunch of shell scripts :)
Long before arXiv became critical infrastructure for scientific research, it was a collection of shell scripts running on Ginsparg’s NeXT machine.
Interesting connections:
As an undergrad at Harvard, he was classmates with Bill Gates and Steve Ballmer; his older brother was a graduate student at Stanford studying with Terry Winograd, an AI pioneer.
On the move to the web in the early 90's:
He also occasionally consulted with a programmer at the European Organization for Nuclear Research (CERN) named Tim Berners-Lee
And then there was a 1994 move to Perl, and 2022 move to Python ...
Although my favorite/primary language is Python, I can't help but wonder if "rewrite in Python" is mainly a social issue ... i.e. maybe they don't know how to hire Perl programmers and move to the cloud. I guess rewrites are often an incomplete transmission of knowledge about the codebase.
FAQ 1: Why did you create arXiv if journals already existed? Has it developed as you had expected?
Answer: Conventional journals did not start coming online until the mid to late 1990s. I originally envisioned it as an expedient hack, a quick-and-dirty email-transponder written in csh to provide short-term access to electronic versions of preprints until the existing paper distribution system could catch up, within about three months.
So it was in csh on NeXT. Tim Berners-Lee also developed the web on NeXT!
By all rights arxiv should be moving towards decentralization as opposed to being picked up by one of the largest centralized players.
Are all preprints on arXiv public?
Or is there actually a private unlisted preprint queue?
From behavior I've observed, I'm guessing maybe authors have the ability to hide papers and send private invites for select peer-review?
Postings go up once a day as a batch, so you could wait 24ish hours to see your paper appear, longer if your posting is just before the weekend.
No they do not.
In fact, authors cannot even delete submitted papers after they have officially appeared in the (approximately) daily cycle. You can update your paper with a new version (which happens frequently) or mark it as withdrawn (which happens rarely). But in either case all the old versions remain available.
I stand corrected.
Too bad it's US only, devops is actually a good role for full remote setups. If you can't make it work asynchronously across timezones you are doing it wrong :)
But I meant things like having good and automated tests, deploys etc so everyone can be productive on their own without multiple people getting involved ("hey can you please deploy X for me when you are here?").
That way nobody has to work nights and you still get 24/7 coverage.
There are also plenty of North Korean bad actors masquerading as US remote workers. I'm betting as European remote workers too. I like remote work, but shit like that is why we can't have nice things.
https://www.yahoo.com/news/thousands-north-korean-workers-in...