I'm tired of "argument list too long" on my 96GB system.
This does not detract from the point that having to use this is inconvenient, but it's there as a workaround.
xargs --show-limits --no-run-if-empty </dev/null
...to see nicely formatted output, including the -2048 POSIX recommendation on a separate line.

However, that sounds like solving the wrong end of the problem to me. I don't really know what a 4k JPEG worth of command line arguments is even supposed to be used for.
I didn't either, until I learned about compiler command line flags.
But also: the nice thing about command line flags is they aren't persisted anywhere (normally). That's good for security.
Typing an extra space should not invoke advanced functionality. Bad design, etc.
This would be really killer if it was always enabled and the same across shells, but "some shells support something akin to it, and you have to check whether it is actually enabled on the ones that do" is just annoying enough that I probably won't bother adopting this on my local machine, even though it sounds convenient as a concept.
YMMV with other shells and base distros
in a large directory of image files with json sidecars
Somebody will say "use a database", but when working, for example, with ML training data, one label file per image is a common setup and what most tooling expects, and this extends further up the data preparation chain.
I mean I get that you're suggesting to provide only one directory on the argv. But it sucks that the above solution to add json files to an archive while ignoring non-json files only works below some not-insane number of files.
Does it suck that you can't use globbing for this situation? Sure, yeah, fine, but by the time it's a problem you're already starting to push the limits of other parts of the system too.
Also using that glob is definitely going to bite you when you forget some day that some of the files you needed were in subdirectories.
It also just extends the original question. If I have a system with 96GB RAM and terabytes of fast SSD storage, why shouldn't I be able to put tens of thousands of files in a directory and write a glob that matches half of them? I get that this was inconceivable in v6 unix, but in modern times those are entirely reasonable numbers. Heck, Windows Explorer can do that in a GUI, on a network drive. And that's a program that has been treated as essentially feature complete for nearly 30 years now, on an OS with a famously slow file system stack. Why shouldn't I be able to do the same on a linux command line?
Then we agree :)
From a quick look at the tar manual, there is a --files-from option to read more command line parameters from a file; I haven't tried, but you could probably combine it with find through bash's process substitution to create the list of files on the fly.
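Untested sketch with GNU tar (the archive name and the flat-directory assumption are mine):

    # NUL-separated names survive spaces/newlines; --files-from=- reads the list from stdin
    find . -maxdepth 1 -type f -name '*.json' -print0 |
        tar --null --files-from=- -czf json-sidecars.tar.gz

or, with bash process substitution as suggested:

    tar -czf json-sidecars.tar.gz --files-from <(find . -maxdepth 1 -type f -name '*.json')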
    perl -p -i -e 's#foo#bar#g' **/*

    find . -type f -exec perl -p -i -e 's#foo#bar#g' {} \;

    find . -type f -exec perl -p -i -e 's#foo#bar#g' {} +

“The command line for command is built up until it reaches a system-defined limit (unless the -n and -L options are used). The specified command will be invoked as many times as necessary to use up the list of input items. In general, there will be many fewer invocations of command than there were items in the input. This will normally have significant performance benefits.”
Your only risk is that it won't handle an input that is, on its own, too long.
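For comparison, the xargs spelling of roughly the same thing (GNU findutils; -print0/-0 keep odd filenames safe):

    find . -type f -print0 | xargs -0 perl -p -i -e 's#foo#bar#g'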
Most linkers have workarounds; I think you can write the paths, separated by newlines, to a file and make the linker read the object file paths from that file. But it would be nice if such workarounds were unnecessary.
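With GCC/Clang-style drivers (and GNU ld) the usual spelling is an @file "response file"; roughly like this (file names made up):

    find build -name '*.o' > objects.rsp
    cc -o app @objects.rsp    # the driver reads additional arguments from objects.rsp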
Looking back, it's unfortunate that the Unix authors offered piping of input and output streams but did not extend that to an arbitrary number of streams, making process arguments just a list of streams (with some shorthand form for constants to type on the command line, and a universal grammar). We could have become used to programs that react to multiple inputs or produce multiple outputs.
It is obvious that it made sense in the '70s to just copy the call string to some free chunk of memory in the system record for the starting process, and let it parse those bytes in any way it wants, but, as a result, we can't just switch from list of arguments to arbitrary stream without rewriting the program. In that sense, argument strings are themselves a workaround, a quick hack which gave birth to ad-hoc serialisation rules, multi-level escaping chains, lines that are “too long” for this random system or for that random system, etc.
They did though - file handles are inherited by child processes which allows you to pass one end of a pipe and then feed things into the other end. E.g. make uses this to communicate between recursive invocations for concurrency control.
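A toy bash illustration of the idea (fd number picked arbitrarily; /dev/fd is Linux-specific):

    exec 3< <(printf 'line 1\nline 2\n')   # parent opens fd 3 on the read end of a pipe
    cat /dev/fd/3                          # the child inherits fd 3; nothing goes through argv
    exec 3<&-                              # close it again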
Imagine having this discussion for every array in a modern system ...
But using a pipe to move things between processes, rather than the command buffer, is easier...
Some build systems (eg Debian + python + dh-virtualenv) like to produce very long paths, and I'd be inclined to just let them.
Was too young to benefit from Y2K
All those 32 bit arm boards that got soldered into anything that needed some smarts won't have a Debian available.
Say, what's the default way to store time in an ESP32 runtime? Haven't worked so much with those.
This means Trixie won't have it?
So Trixie does not have 64-bit time for everything.
Granted, the article, subtitle and your link all point out that this is intentional and won't be fixed. But in the strictest sense that GP was likely going for, Trixie does not have what the headline of this article announces.
"... still better than leap seconds."
or we'll have failed to make it through the great filter and all be long extinct.
--- start quote ---
<erno> hm. I've lost a machine.. literally _lost_. it responds to ping, it works completely, I just can't figure out where in my apartment it is.
--- end quote ---
    ~$ host gmail.com
    gmail.com has address 142.250.69.69
    gmail.com has IPv6 address 2607:f8b0:4020:801::2005
    gmail.com mail is handled by 10 alt1.gmail-smtp-in.l.google.com.
    gmail.com mail is handled by 30 alt3.gmail-smtp-in.l.google.com.
    gmail.com mail is handled by 5 gmail-smtp-in.l.google.com.
    gmail.com mail is handled by 20 alt2.gmail-smtp-in.l.google.com.
    gmail.com mail is handled by 40 alt4.gmail-smtp-in.l.google.com.

    ~$ host gmail-smtp-in.l.google.com.
    gmail-smtp-in.l.google.com has address 142.250.31.26
    gmail-smtp-in.l.google.com has IPv6 address 2607:f8b0:4004:c21::1a
If you spend 2 days vibe coding some chat app and then you have to spend 2 further days debugging why file sharing doesn't work for IPv4 users behind NAT, you might just say it isn't supported for people whose ISPs use 'older technology'.
After that, I reckon the transition will speed up a lot.
The only sane thing to do in a SLAAC setup is block everything. So no, it isn’t a solved problem just because you used ipv6.
None of these are actually the game/app developers' problem. The OS takes care of them for you (you may need code for e2e connectivity when both are behind a NAT, but STUN/TURN/whatever we do nowadays is trivial to implement).
Except people complain to the game/app developer when it doesn't work.
If all the mobile is removed, what's the percentage then?
https://radar.cloudflare.com/explorer?dataSet=http&groupBy=i...
Obligatory XKCD:
RTX[0-7] would do. For time dilation purposes, we can have another 512 bit set to adjust ticking direction and frequency.
Or shall we go 1024 bits on both to increase resolution? I'd agree...
We’ve seen it before with 32 bit processors limited to 20 or 24 bits addressable because the high order bits got repurposed because “nobody will need these”.
That makes it slightly safer to use those bits, doesn't it? As long as your code asks the OS how many bits the hardware supports, and only uses the ones it requires to be zero, the worst that can happen if you forget to clear the bits before following a pointer is a segfault, not reading ‘random’ memory.
https://www.rfc-editor.org/rfc/rfc2550.txt
* Published on 1999-04-01
ext4 moved some time ago to 30 bits of fractional resolution (on the order of nanoseconds) and 34 bits of seconds resolution. It punts the problem 400 years or so into the future. I'm sure we will eventually settle on 128-bit timestamps with 64 bits of seconds and 64 bits of fractional resolution, and that should sort things for foreseeable human history.
I wonder what the zfs/btrfs type file systems do. I am a bit lazy to check, but I expect btrfs is using 64 bits; zfs, I would not be surprised if it matches ext4.
So 580 years or so till problems (but probably patchable ones? I believe the on disk format is already 2x uint64s, this is just the gethrtime() function I saw).
https://man7.org/linux/man-pages/man3/timespec.3type.html
it is a convenient unit because 10^9 fits neatly into a 32-bit integer, and it is unlikely that anyone would need more precision than that for any general-purpose use.
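Quick sanity check in bash: 10^9 fits even in a signed 32-bit integer, with room to spare:

    echo $(( 10**9 )) $(( 2**31 - 1 ))    # 1000000000  2147483647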
That is roughly 585 billion years[1].
[1]: https://www.wolframalpha.com/input?i=how+many+years+is+2%5E6...
Nit: time_t is a data type, not a variable.
A couple significant things I found much clearer in the wiki page than in the article:
* "For everything" means "on armel, armhf, hppa, m68k, powerpc and sh4 but not i386". I guess they've decided i386 doesn't have much of a future and its primary utility is running existing binaries (including dynamically-linked ones), so they don't want to break compatibility.
* "the move will be made after the release of Debian 13 'Trixie'" means "this change is included in Trixie".
If the old ABI used a 32-bit time_t, breaking the ABI was inevitable. Changing the package name prevents problems by signaling the incompatibility proactively, instead of resulting in hard-to-debug crashes due to structure/parameter mismatches.
I suppose in theory, if there's one simple library that differs in ABI, you could have code that tries to dlopen() both names and uses the appropriate ABI. But that seems totally impractical for complex ABIs, and forget about it when glibc is one of the ones involved.
There's no ABI breakage anyway if you do static linkage (+ musl), but that's not practical for GUI stuff for example.
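For what it's worth, that can be as simple as this (assuming musl's gcc wrapper, packaged as musl-tools on Debian, and a trivial app.c of your own):

    musl-gcc -static -O2 -o app app.c
    file app    # should report "statically linked"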
I suppose you could bundle a wrapper .so for each that essentially converts one ABI to the other and include it in your rpath. But again, that doesn't seem easy given the number/complexity of libraries affected.
Other platforms make different trade-offs. Most of the pain is because on Debian, it's customary for applications to use system copies of almost all libraries. On Windows, each application generally ships its own copies of the libraries it uses. That prevents these incompatibility issues, at the cost of it being much harder to patch those libraries (and a little bit of disk space).
There's nothing technical preventing you from taking the same approach as Windows on Debian: as you pointed out, the libc ABI didn't change, so if you ship your own libraries with your application, you're not impacted by this transition at all.
However, other libraries (Qt, Gtk, ...) don't do that compatibility stuff. If you consider those to be also system libraries then yeah, it's breaking the ABI of system libraries. Though a pre-compiled program under Linux could just bundle all* of its dependencies and either use glibc (probably a good idea), statically link musl, or even do system calls on its own (probably not a good idea). Linux has a stable system call interface!

(*) One can certainly argue about that point. I'm not sure about it myself anymore when thinking about it, since there are things like libpcap, libselinux, libbpf, libmount, libudev etc., and I don't know if any of them use time_t anywhere and, if they do, whether they support the -D_FILE_OFFSET_BITS=64 and -D_TIME_BITS=64 stuff.
It also isn't strictly necessary until 2038 (depending on your needs for future timestamps) so you'd be creating problems now for people who might have migrated to something else in the 13 years that the current solution will still work for.
I guess a tricky thing might be casts from time_t to data types that are actually 64-bit, e.g. for something like

    struct Callback {
        int64_t (*fn)(int64_t);
        int64_t context;
    };
If a time_t is used for context and the int64_t is then downcast to int32_t, that could be hard to catch. Maybe you would need some runtime type information to annotate what the int64_t actually is.

* NetBSD in 2012: https://www.netbsd.org/releases/formal-6/NetBSD-6.0.html
* OpenBSD in 2014: http://www.openbsd.org/55.html
For packaging, NetBSD uses their (multi-platform) Pkgsrc, which has 29,000 packages, which probably covers a large swath of open source stuff:
On FreeBSD, the last platform to move to 64-bit time_t was powerpc in 2017:
* https://lists.freebsd.org/pipermail/svn-src-all/2017-June/14...
but amd64 was in 2012:
* https://github.com/freebsd/freebsd-src/commit/8f77be2b4ce5e3...
with only i386 remaining:
* https://man.freebsd.org/cgi/man.cgi?query=arch
* https://github.com/freebsd/freebsd-src/blob/main/share/man/m...
And AFAIK glibc provides both sets of functions; you can choose which one you want via compiler flags (-D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64). So a pre-built program that ships all its dependencies except for glibc should also work.
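If you want to see what the flags do on a 32-bit target, something like this should work (sketch; assumes gcc with the 32-bit multilib, i.e. -m32 on an amd64 host, and glibc 2.34 or newer; file name made up):

    echo '#include <time.h>' > timecheck.c
    echo '_Static_assert(sizeof(time_t) == 8, "time_t is not 64 bits");' >> timecheck.c
    gcc -m32 -c timecheck.c -o /dev/null && echo ok                                       # expected to fail: the old i386 ABI defaults to 32-bit time_t
    gcc -m32 -D_FILE_OFFSET_BITS=64 -D_TIME_BITS=64 -c timecheck.c -o /dev/null && echo ok  # should pass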
Is it also known as that? It's a cute name but I've never seen anyone say it before this article. I guess it's kind of fetch though.
Do I spot a Mean Girls reference?!
>The year 2038 problem (also known as Y2038, Y2K38, Y2K38 superbug, or the Epochalypse)
So yeah, but only since 2017 and as a joke.
~12 years, 5 months, 22 days, 13 hours, 22 minutes.....
12+ years is a long time to prepare for this. Normally I wouldn't have much faith in test/dev systems, network time being set up properly, etc., but it's a long time. Even if none of my assumptions are true, could we really not, in a decade, at least identify where 32-bit time is being used and plan for contingencies? That seems unlikely.
But hey, let me know when Python starts supporting nanosecond-precision time :'(
https://stackoverflow.com/a/10612166
Although, it's been a while since I checked whether they support it. In Windows-land at least, everything system-side uses 64-bit/nsec precision, as far as I've had to deal with it at least.
An embedded device bought today may be easily in use 12 years from now.
Many of the devices going into production now won't have 64-bit time; they'll still run a version of Linux that was certified, or randomly worked, in 2015. I hope you're right, but in any case it will be worse than Y2K.
Are you still involved in Debian?
You kind of have to pick your poison; do you want a) reasonable signed behavior for small differences but inability to represent large differences, b) only able to represent non-negative differences, but with the full width of the type, c) like a, but also convincing your programming system to do a mixed signed subtraction ... like for ptrdiff_t.
The "long long" standard integer type was only standardized with C99, long after Linux established it's 32-bit ABI. IIRC long long originated with GCC, or at least GCC supported it many years before C99. And glibc had some support for it, too. But suffice it to say that time_t had already been entrenched as "long" in the kernel, glibc, and elsewhere (often times literally--using long instead of time_t for (struct timeval).tv_sec).
This could have been fixed decades ago, but the transition required working through a lot of pain. I think OpenBSD was the first to make the 32-bit ABI switch (~2014); they broke backward binary compatibility, but induced a lot of patching in various open source projects to fix time_t assumptions. The final pieces required for glibc and musl-libc to make the transition happened several years later (~2020-2021). In the case of glibc it was made opt-in (in a binary backward compatible manner if desired, like the old 64-bit off_t transition), and Debian is only now opting in.
For one, OpenBSD (and others?) did this a while ago. If it breaks software when Debian does it, it was probably mostly broken.
For another, most people are using a 64-bit OS and 64-bit userland. These have been running 64-bit time_t forever (or at least a long time), so there's no change there. Also, someone upthread said no change for i386 in Trixie... I don't follow Debian closely enough to know when they're planning to stop i386 releases in general, but it might not be that far away?
But I agree that Debian is still too slow to move forward with critical changes even with that in mind. I just don't think that OpenBSD is the best comparison point.
FreeBSD did it in 2012 (for the 2014 release of 10.0?):
* https://github.com/freebsd/freebsd-src/commit/8f77be2b4ce5e3...
And has compat layers going back many releases:
* https://www.freshports.org/misc/compat4x/
* https://wiki.freebsd.org/BinaryCompatibility
So newly (re-)compiled programs can take advantage of newer features, but old binaries continue to work.
=3
This seems overly harsh/demeaning.
1. those 2 bytes were VERY expensive on some systems or usages, even into the mid-to-late 90's
2. software was moving so fast in the 70s/80's/90's that you just didn't expect it to still be in use in 5 years, much less all the way to the mythical "year 2000"
They could have represented dates as a simple int value zeroed at 1900. The math to convert a day number to a day/month/year is pretty trivial even for '70s computers, and the end result would have been saving more than just a couple of bytes. 3 bytes could represent days from 1900->~44,000 (unsigned).
Even 2 bytes would have bought ~1900->2070
That gives you something like 20,000BCE -> 22,000CE.
Doesn't really change the math to spit out a year and it uses fewer bytes than what they did with dates.
I will say the math gets more tricky due to calendar differences. But, if we are honest, nobody really cares a lot about March 4, 43 BCE.
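For a rough feel for those ranges, GNU date will happily do the day-number arithmetic (taking day 0 as 1900-01-01):

    date -d '1900-01-01 +65535 days' +%Y-%m-%d      # 2-byte unsigned max lands in mid-2079
    date -d '1900-01-01 +8388607 days' +%Y-%m-%d    # 3-byte signed max lands around the year 24,870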
See eg. https://www.mainframemaster.com/tutorials/cobol/picture-clau...
Many systems stored numbers only in BCD or text formats.
In standard COBOL? No, they couldn't have.
COBOL has datatypes built into it, even in COBOL 60. Date, especially for what COBOL was being used for, would have made a lot of sense to add as one of the supported datatypes.
It’s difficult to understand in an era of cheap terabyte SSDs, but in the 1960s and 1970s, DASD (what IBM mainframes called hard drives) was relatively tiny and very expensive.
And so programmers did the best they could (in COBOL) to minimize the amount of data stored. Especially for things that there were lots of, like say, bank transactions. Two bytes here and two bytes there and soon enough you’re saving millions of dollars in hardware costs.
Twenty years later, and that general ledger system that underlies your entire bank’s operations just chugging along solidly 24/7/365 needs a complete audit and rewrite because those saved bytes are going to break everything in ten years.
But it was probably still cheaper than paying for the extra DASD in the first place.
Also, it was a bit dumb to imagine the computers would crash at 00:00 on Jan 1st 2000; bugs started to happen earlier, as it's common to work with dates in the future.
That is why people have the "nothing happened" reaction. There were doomers predicting planes would literally fall out of the sky when the clock rolled over, and other similar Armageddon scenarios. So of course when people were making predictions that strong, everyone notices when things don't even come close to that.
30 year mortgages were the first thing that was fixed, well before my time. But we still had heaps of deadlines through the 90’s as future dates passed 2000.
The inter-bank stuff was the worst: lots of coordination needed to get everyone ready and tested before the critical dates.
It’s difficult to convey how much work it all was, especially given the primitive tools we had at the time.
For example, credit cards often use the mm/yy format for expiration dates because it is more convenient to write and considering the usual lifetime of a credit card, it is sufficient. But it means there is a two digit date somewhere in the system, and if the conversion just adds 2000, we are going to have a problem in 2100 if nothing changes, no matter how many bytes we use to represent and store the date. A lot of the Y2K problem was simple UI problems, like a text field with only 2 characters and a hardcoded +1900.
One of the very few Y2K bugs I personally experienced was an internet forum going from the year 1999 to the year 19100. Somehow, they had the correct year (2000), subtracted 1900 (=100) and put a "19" in front as a string. Nothing serious, it was just a one-off display error, but that's the kind of thing that happened in Y2K, it wasn't just outdated COBOL software and byte savings.
POSIX struct tm (which e.g. PHP wraps directly) contains the year as a counter since 1900.
These field sizes have to be hard-coded into all parts of the COBOL program, including data access, UI screens, batch jobs, intermediate files, and data transfer files.
That is incorrect. USAGE COMP will use binary, with the number of bytes depending on the number of digits in the PIC. COMP-1 specifically takes 4 bytes. COMP-3 uses packed decimal (4 bits per digit).
Y = (yy < 90) ? (2000 + yy) : (1900 + yy);
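// pivot-year windowing: two-digit years 00-89 are read as 20xx, 90-99 as 19xx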
This would have to be handled differently than something that was required to be IBM PC or IBM AT compatible with every compatibility quirk. It's simply a way to save 8 bits of battery-backed SRAM or similar.

That's inaccurate. We actually switched over all 32-bit ports except i386 because we wanted to keep compatibility for this architecture with existing binaries.
All other 32-bit ports use time64_t, even m68k ;-). I did the switch for m68k, powerpc, sh4 and partially hppa.
Except x86.
pilif•6mo ago
(snark aside: I understand the arguments for and against making the change of i386 and I think they did the right thing. It's just that I take slight issue with the headline)
pilif•6mo ago
Plus, keeping i386 the same also means any still available support for running 32 bit binaries on 64 bit machines.
All of these cases (especially the installable 32 bit support) must be as big as or bigger than the number of ARM machines out there.
bobmcnamara•6mo ago
Note also that the numbers are log-scale, so while it looks like Arm64 is a close third over all bitwidths, it isn't.
mananaysiempre•6mo ago
Also, I’m sorry to have to tell you that the 80386 came out in 1985 (with the Compaq Deskpro 386 releasing in 1986) and the Pentium Pro in 1995. That is, i686 is three times closer to i386 than it is to now.
jart•6mo ago
People still buy 16-bit i8086 and i80186 microprocessors too. Particularly for applications like defense, aerospace, and other critical systems where they need predictable timing, radiation hardening, and don't have the resources to get new designs verified. https://www.digikey.at/en/products/detail/rochester-electron...
wongarsu•6mo ago
Linux is a lot more uniform in its software, but when emulating windows software you can't discount i386
pm215•6mo ago
One notable use case is Steam and running games under Wine -- there are apparently a lot of 32 bit games, including still some relatively recent releases.
Of course if your main use case for the architecture is "run legacy binaries" then an ABI change probably induces more pain than it solves, hence its exclusion from Debian's transition here.
zokier•6mo ago
> From trixie, i386 is no longer supported as a regular architecture: there is no official kernel and no Debian installer for i386 systems.
> Users running i386 systems should not upgrade to trixie. Instead, Debian recommends either reinstalling them as amd64, where possible, or retiring the hardware.
https://www.debian.org/releases/trixie/release-notes/issues....
IsTom•6mo ago
Contrast of age of retired hardware with Windows 11 is a little funny.
account42•6mo ago
pavon•6mo ago