I don't know what "significant" means in this case, but a password is worth something only to those who know what the password is for or are willing to find out. I'm pretty sure all those seemingly popular "editing" plugins that read everything on the screen to send to a cloud service for "AI assistance suggestions" do far worse... and given what I've seen people do with accidentally pasting things into Google, it likely already knows a lot more than you thought it did.
I also copy-paste my username from KeePass, so you'd pretty quickly get everything.
It's like coming across a key someone dropped on the road. You don't even know what it's for.
Of course all this assumes that there's even someone paying any special attention to the probably huge volume of data that these services are going to get.
There are a lot of keys that are self-identifying, even real keys. My key has "Apartment Name, Apartment Number" engraved into the head, and searching the apartment name on Google brings it up in the first 5 results.
Let's say you find the following plaintext on the network: "sk-xxx....". Do you know what it's for? What if it's AKIAIOSFODNN7EXAMPLE?
What if it's a list of words from the BIP-39 wordlist?
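To make the "self-identifying" point concrete, here is a minimal sketch of how little effort it takes to guess what a captured string is for. The `classify` function and the 12-word `BIP39_SAMPLE` set are illustrative assumptions (the real BIP-39 English list has 2048 words, and a real mnemonic check would also verify the checksum); the `sk-` pattern is only a loose guess at that family of API keys.

```python
import re

# Hypothetical, tiny stand-in for the real 2048-word BIP-39 English list.
BIP39_SAMPLE = {
    "abandon", "ability", "able", "about", "above", "absent",
    "absorb", "abstract", "absurd", "abuse", "access", "accident",
}

# BIP-39 mnemonics come in these lengths (in words).
BIP39_LENGTHS = {12, 15, 18, 21, 24}

# AWS access key IDs: "AKIA" followed by 16 uppercase letters or digits.
AWS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")
# Loose, assumed pattern for "sk-" prefixed API keys.
SK_TOKEN = re.compile(r"\bsk-[A-Za-z0-9_-]{8,}")


def classify(text, wordlist=BIP39_SAMPLE):
    """Return a list of guesses about what kind of secret `text` contains."""
    guesses = []
    if AWS_KEY_ID.search(text):
        guesses.append("aws-access-key-id")
    if SK_TOKEN.search(text):
        guesses.append("sk-prefixed-api-key")
    words = text.lower().split()
    # Word-count and membership check only; no checksum validation here.
    if len(words) in BIP39_LENGTHS and all(w in wordlist for w in words):
        guesses.append("bip39-mnemonic-candidate")
    return guesses
```

Calling `classify("AKIAIOSFODNN7EXAMPLE")` flags the string as an AWS access key ID without any other context, which is the whole problem with sending selections over plaintext.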
> Of course all this assumes that there's even someone paying any special attention to the probably huge volume of data that these services are going to get.
It only takes one person, and since this is HTTP traffic, not HTTPS, the number of people who can see it is huge. Everyone on your Wi-Fi (i.e. the whole coffee shop, remember Firesheep), your ISP, each router between your ISP and China, and so on.
I wouldn't be surprised if someone is scanning all traffic that they see for bitcoin private keys and BIP-39 phrases since both of those could lead to some significant financial gain.
Heck, back in the day in my college dorm I ran a wifi hotspot only to sniff plaintext traffic and poke around, since I had a less strong sense of morals, and I bet the kids these days are still doing that.
Hotels learned not to do such silly things several decades ago.
I'm surprised that your building management lacks such obvious wisdom.
And yes, I too usually copy-paste both the username and the password, one right after the other. I have often thought that it seems very risky, but good to learn that Wayland already prevents clipboard sniffing.
> The stardict has "Scan" function, when user enable this function, after user select some text, it will trigger stardict do translate for this selected text... Why the user selects some confidential data to query dictionary?
"Sir, we have intel, the enemy is having translation server errors."
stardict --install en_US hi_IN ta_IN
For a trilingual person, just 100MB of storage. Problem solved, no?
Edit: it's a full dictionary with all sorts of information. Example entry:
ABANDONED A*ban"doned, a.
1. Forsaken, deserted. "Your abandoned streams." Thomson.
2. Self-abandoned, or given up to vice; extremely wicked, or sinning without restraint; irreclaimably wicked ; as, an abandoned villain.
Syn. -- Profligate; dissolute; corrupt; vicious; depraved; reprobate; wicked; unprincipled; graceless; vile. -- Abandoned, Profligate, Reprobate. These adjectives agree in expressing the idea of great personal depravity. Profligate has reference to open and shameless immoralities, either in private life or political conduct; as, a profligate court, a profligate ministry. Abandoned is stronger, and has reference to the searing of conscience and hardening of heart produced by a man's giving himself wholly up to iniquity; as, a man of abandoned character. Reprobate describes the condition of one who has become insensible to reproof, and who is morally abandoned and lost beyond hope of recovery. God gave them over to a reprobate mind. Rom. i. 28.
Sogou, a keyboard for Windows, iOS and Android used by most Chinese users just sends everything (badly) encrypted to the cloud and nobody minds [1]. So I'm not very surprised the developer of stardict enabled this feature by default.
[1]: https://citizenlab.ca/2023/08/vulnerabilities-in-sogou-keybo...
- The clipboard cannot be read by backgrounded applications
- Apps by default are unable to use HTTP
Also Wayland breaks a lot of stuff. It's certainly a move in the right direction on the whole but I wouldn't blindly interpret something like this as a win.
> That does mean that it breaks StarDict's scan feature, though.
Better does not necessarily mean good, though. The Mac approach of blocking by default but letting users enable these things for specific apps in settings would be a great improvement.
It's not like "define current selection" is some niche feature either. It's a default feature in macOS, iOS and Android.
You either do it the macOS way or the Windows/X11 way. You cannot half-ass something in between. That is just security theatre and is utterly ridiculous. Every Wayland release until it gets a macOS-style permission system (I don't care whether the default is accept or deny) is pure cancer. And every distro/DE that pushes Wayland onto you until that point is also cancer.
Still doesn't prevent an ad library from bundling libcurl and doing HTTP calls manually, of course, but it's a sane default.
Just like any modern web browser. /s
Not even /s makes sense here.
I wouldn't say that is just a given. If I've apt-get installed a dictionary, I might expect that the whole thing is on my machine. It's not like we haven't had dictionaries in physical books for centuries... It seems like stardict is very much an online thing, which I suppose could be legit, but the whole thing does seem like a trap.
Additionally, a typical spell checker feature is to provide alternative, correct, spellings, rather than just telling you whether a word is correctly spelled.
I bet there's some cool way to do this with zero-knowledge or homomorphic cryptography though!
But you might still be able to use some frequency sampling to predict the words used, unless those chunks are very very carefully constructed.
The code for which would almost certainly be larger than a fully local dictionary for any human language.
People in your coffee shop on the same WiFi could read it.
I get that some people don't realize that's how TCP/IP works, and the Firesheep stuff all happened 15 years ago. But it's a bit worrying to see a frequent HN contributor challenging that.
That's why we now push for HTTPS everywhere.
If we search for the author's bio, that seems to check out. They are a well-credentialed CS person; obviously they know that dictionary programs such as translation pop-ups can have offline dictionaries, and they mention that. But they are a person of their time with an according set of "of courses".
Today, an application that is locally installed and works with offline data is like a statement of quaint chivalry, promulgated by a few remaining Don Quixotes of computing. (It saddens me to say. So much that this analogy brings me insufficient amusement.)
~> wc -cl /usr/share/dict/words
235976 2493885 /usr/share/dict/words
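A word list of that size (about 2.5 MB) fits comfortably in memory, so a fully local check is cheap. The following is a minimal sketch, assuming a newline-separated word file at the conventional `/usr/share/dict/words` path; the function names `load_words` and `is_known` are made up for illustration and are not taken from any dictionary program.

```python
from pathlib import Path


def load_words(path="/usr/share/dict/words"):
    """Read a newline-separated word list into a lowercase set."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def is_known(word, words):
    """Check a word against the in-memory set; nothing leaves the machine."""
    return word.strip().lower() in words
```

Loading the set once at startup and answering every query from it means a selected password or key never has to travel over the network just to be told it is not a word.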
One might even expect a program to use a common Unix preinstalled dictionary.
Yea, because, how else am I going to run shady poorly maintained dictionary software that ignores system settings from a hostile country? What kind of world are we living in with X11?!
The software could just as well hook into your downloads folder and transparently "translate" any downloaded text or PDF file for you. In which case the method by which pixels arrive on your screen would not be relevant.
How is this an X11 vs Wayland issue and not a distribution hygiene issue? Why is this package even a part of the distribution? In the desire to force one desktop system to stop existing, for whatever reason, I think they've missed the broader point.
It's not really a bug if it's an advertised feature you don't like, so security team cannot do much in theory.
Correct, which is why Wayland is only one piece of improving security; you still need proper sandboxing.
I know that there is a flag to disable the installation of "recommended" packages. I just think the default is a disservice here.
For a brief moment `--break-system-packages` surpassed it, then I discovered `pip` accepts abbrev flags so `--br` is enough, and sounds like bruh.
You can avoid that clusterfuck using `uv tool install`. E.g. `uv tool install pre-commit`.
First of all, "Recommends" is reserved for packages which enhance the functionality of the package you're installing. Without these the package will not break, but some very useful functionality might be disabled.
The package-class you're talking about is "suggests", IOW, "these packages might also be useful for you, wanna look?" section. These are not installed by default already.
On the other hand, apt and aptitude provide previews before doing something. You don't have to accept them. In aptitude's case, you can fine tune before the final commit, even.
There's a tension. Minimalism vs. user utility. Somebody said in the Debian 13 release comments that "Debian will never be an end-user friendly distro". Now, you're saying that packages shouldn't install recommends by default.
What should Debian be? "An IKEAesque DIY distro", or "A more user friendly, yet very stable and vanilla distro"? I vote for the latter, personally. Plus, as I said before, advanced users are free to change whatever they want.
If you want to change the default, the configuration files are at /etc/apt/apt.conf.d/. If you want to disable the feature for a single install, it's --no-install-recommends.
And that's perfectly fine, it just means I don't align with Debian on this one. And that freedom is what Linux is all about, I guess. So it seems it's working as intended :)
Edit: And I totally get that users might often want that kind of maximalism. It's just not for me. Although starting network daemons by default might sometimes be a bridge too far, or the case described in the article here.
...and this is what Debian Testing is actually for. To catch these types of issues.
Of course, people are free to select what resonates with them. I'm not against more DIY distributions (I'm also contemplating using an LFS VM to explore things even further, but time is an issue), and I'm not against your personal choices. I just wanted to note the tension, and share my observations about Debian.
> On the other hand, apt and aptitude provide previews before doing something. You don't have to accept them. In aptitude's case, you can fine tune before the final commit, even.
You can't expect the average user to understand the entire dependency tree and read the description of dozens of random packages that the average program pulls in. RTFM is not a valid excuse for bad defaults.
Wouldn't be the first (or last) time a Debian maintainer has pulled the "you should read the descriptions of all (hundreds) of your packages (most installed as dependencies)" card in response to a bug report.
If someone started reading all the package descriptions and READMEs we're meant to be thoroughly familiar with when Trixie was released a few days ago, they'd still be reading them.
Intent or not, that developer is a risk to the project.
Note that clipboard data can be just about anything and is a valuable dataset, more so if the source of the data isn't aware of being a source.
I think it's just a cultural difference. Sogou, a super popular Chinese input program for Windows, iOS and Android, does the same with everything you type and nobody cares.
Just because Microsoft did it doesn't make it a valid defense; in fact it shows the opposite (after all, they too did not have the best interests of their users at heart). Given that the recipient of the data sits on the other side of the GFW and that clipboards can contain very interesting data, you really should wonder about the intentions of the author; they do not get the benefit of the doubt. In fact, open source software that to all intents and purposes looks like it runs locally but pumps your (private) data out without your consent is a very large red flag to me: it gains access to data that otherwise likely would never be found in the wild. At a minimum this is a fairly serious GDPR violation.
I have been told to "RTFM!" countless times in many places. Some of them were legitimately the correct answer in that context, in hindsight. Some were knee-jerk reactions like this.
Debian's discussion culture might be a little edgy sometimes, but this has nothing to do with Debian.
On Windows, I remember some translation programs going to extremes: they hijacked all GDI calls and scanned for all strings in GUIs, trying to translate and replace them inline. Local dictionaries were pretty limited, so many of them used online services. What happens when the user inputs something "sensitive" in the GUI?
Well, it goes straight to the translation service.
With the GDI hijacking programs you usually download them for specific languages with the knowledge they're internet connected.
As an ESL user, I vehemently disagree. You're only going to need translations as long as you keep relying on translations. Like it or not, English is the lingua franca of the computing age and you're doing yourself a disservice if you don't learn it.
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=806960
Given enough eyeballs, all bugs are closed as WONTFIX.
StarDict on Wayland has a different issue: it causes a segfault.
Sat, 02 Aug 2025: Bug#1003710: stardict crash in gnome with message Segmentation fault
https://www.mail-archive.com/debian-bugs-dist@lists.debian.o...
Whether it is malicious or not isn't the point to me. The point is that I, as an individual, deserve the illusion of control over my data and communication. I have neither the time nor the inclination to read all release notes. Furthermore, as someone who has spent enough time writing code, I recognize that humans make mistakes and don't always update release notes with salient details. All the automation in the world, and AI (yes, I've tried AI for release notes), just doesn't help.
Hey, an area I finally know something about. It depends on what you're trying to do.
The slimmed down version of a Finnish dictionary I provide in `tsk` [1] weighs in at around 30 MB, for about 250,000 Finnish words. It's small enough that I embed the whole dictionary directly into the binary and reconstruct the prefix search on the fly every time the user starts the app.
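One common way to rebuild a prefix search at startup is to sort the word list once and binary-search into it. The sketch below is a generic illustration of that idea, not taken from `tsk`; `build_index` and `prefix_search` are hypothetical names, and the sample words are just placeholders.

```python
from bisect import bisect_left


def build_index(words):
    """Sort and de-duplicate the word list once, at startup."""
    return sorted(set(w.lower() for w in words))


def prefix_search(index, prefix, limit=10):
    """Return up to `limit` entries of the sorted index that start with `prefix`."""
    prefix = prefix.lower()
    # All words sharing a prefix are contiguous in sorted order,
    # beginning at the first position not less than the prefix itself.
    start = bisect_left(index, prefix)
    results = []
    for word in index[start:start + limit]:
        if not word.startswith(prefix):
            break
        results.append(word)
    return results
```

Each keystroke then costs one binary search plus a short scan, which stays fast even for a few hundred thousand entries.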
However, the much larger database which contains things like lemmatization and etymology information easily balloons up to many, many gigabytes in size. My problem domain is providing Truly Instant Lookup, keystroke by keystroke, so I can't really get around this level of memoization. The work to figure all this out was sufficient that I decided to make future versions a paid product instead [2].
Most other use cases would just call out to a server, because it's silly to think most people are going to download a giant database for that use case alone. A hybrid approach could also make a lot of sense, e.g. cache the most common 10,000 words locally and call out for the next 1.5 million, which are statistically extremely rare.
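The hybrid idea can be sketched roughly as follows. The `HybridDictionary` class and its `remote_lookup` callable are assumptions made up for illustration; the callable stands in for whatever server request a real application would make.

```python
class HybridDictionary:
    """Answer common words locally; fall back to a remote lookup for the rest."""

    def __init__(self, local_entries, remote_lookup):
        # local_entries: mapping of common word -> definition.
        # remote_lookup: hypothetical callable(word) -> definition or None.
        self.local = dict(local_entries)
        self.remote_lookup = remote_lookup
        self.remote_calls = 0

    def lookup(self, word):
        key = word.lower()
        if key in self.local:
            return self.local[key]
        # Only words missing from the local cache reach the network.
        self.remote_calls += 1
        definition = self.remote_lookup(key)
        if definition is not None:
            self.local[key] = definition  # remember it for next time
        return definition
```

Note that in this design every rare query is still disclosed to the server, so the transport and the user's consent matter just as much as in the fully online case.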
[1]: https://github.com/hiandrewquinn/tsk
[2]: https://taskusanakirja.com/ (offline for now until I get Digicert to certify my downloads wholesome for Windows resale)
pabs3•3h ago
https://wiki.debian.org/PrivacyIssues
Luckily there are things like opensnitch that can block some of these issues:
https://github.com/evilsocket/opensnitch