Meta and TikTok are obstructing researchers' access to data, EU commission rules

https://www.science.org/content/article/meta-and-tiktok-are-obstructing-researchers-access-data-european-commission-rules

147•anigbrowl•4h ago

Comments

paxys•3h ago

Remember that Cambridge Analytica was "research" as well. Laws like these sound good on paper, but it's the company that has to deal with the fallout when the data is used improperly. Unless the government can also come up with a foolproof framework for data sharing and enforce adequate protections, it's always going to be better for the companies to just say no and eat the fines.

shakna•3h ago

Which is why the EU doesn't use a letter-of-the-law system, and also has an ethics regulation system.

So it falls on those misusing the data, unless you knew it would be misused but collected it anyway.

Golden rule: Don't need the data? Don't collect it.

verst•3h ago

As I recall it Cambridge Analytica was a ton of OAuth apps (mostly games and quizzes) requesting all or most account permissions and then sharing this account data (the access for which had been expressly (foolishly) granted by the user) with a third-party data aggregator, namely Cambridge Analytica. Only this re-sharing of data with a third party was against Facebook Terms of Service.

I would not classify Cambridge Analytica as research. They were a data broker that used the data for political polling.

paxys•2h ago

From https://en.wikipedia.org/wiki/Cambridge_Analytica

> The New York Times and The Observer reported that the company had acquired and used personal data about Facebook users from an external researcher who had told Facebook he was collecting it for academic purposes.

tguvot•1h ago

link from sentence that you copy pasted https://en.wikipedia.org/wiki/Facebook%E2%80%93Cambridge_Ana...

> The data was collected through an app called "This Is Your Digital Life", developed by data scientist Aleksandr Kogan and his company Global Science Research in 2013.[2] The app consisted of a series of questions to build psychological profiles on users, and collected the personal data of the users' Facebook friends via Facebook's Open Graph platform.[2] The app harvested the data of up to 87 million Facebook profiles

pms•1h ago

This "research" and data access wouldn't be allowed under the DSA, because (i) the researcher didn't provide any data protection safeguards, (ii) his university (and data protection officer) didn't assume legal liability for his research, (iii) his research isn't focused on systemic risks to society.

tguvot•1h ago

not sure what's the point that you are making. but under "common sense comments act of 2054" unclear comments are not allowed.

brendoelfrendo•54m ago

The article for this post is about the EU's Digital Services Act (DSA). Since the original comment argues against research access to data by arguing that "Cambridge Analytica was research as well," another poster chimed in to rebut that assertion by arguing that Aleksandr Kogan's research would not have been allowed access to user data under the DSA and thus, that specific legal concern is moot.

tguvot•40m ago

kogan "research" harvested data through application and he was outside of eu.

so even if it was happening today, whatever he did is irrelevant to EU/DSA unless they plan to chase everybody across the globe. somewhat like ofcom going after 4chan

_--__--__•3h ago

I don't think you get it: the EU has a law that says these researchers need to find casus belli to wrestle the norms of online freedom of speech away from American corporations. Therefore they get to request data on every account that has ever interacted with certain political parties on those platforms, as a treat.

pms•2h ago

Republicans and Elon Musk have become very skilled at exerting political influence in the US [1] and Europe [2] through social media in ways the public isn't really aware of. Is this really that far from the goal of Cambridge Analytica of influencing elections without people's knowledge? Is it fine for large online platforms to influence election outcomes? Why wouldn't an online platform be used to this end if that's beneficial for it and there is no regulation discouraging it?

[1] https://www.techpolicy.press/x-polls-skew-political-realitie...

[2] https://zenodo.org/records/14880275

santadays•1h ago

I can’t imagine this is not happening. There exists the will, the means and the motivation, with not a small dose of what pg might call naughtiness.

terminalshort•1h ago

I can't stand this "influencing elections" nonsense. It's a term meant to mislead with connotations of manipulating the voting tabulation when what is actually going on is influencing people to vote the way you want them to, which is perfectly legal and must always be legal in a functioning democracy.

pms•1h ago

Long story short, this "research" and data access wouldn't be allowed under the DSA, because (i) the researcher didn't provide any data protection safeguards, (ii) his university (and their data protection officer) didn't assume legal liability for his research, (iii) his research isn't focused on systemic risks to society.

loeg•1h ago

Platforms (reasonably!) do not trust random academic researchers to be safe custodians of user data. The area of research focus and assumption of liability do not matter. Once a researcher's copy of data is leaked, the damage is done.

SilverElfin•3h ago

Sorry but this sounds like a privacy nightmare. No one should trust random researchers with mountains of data. They clearly won’t know how to secure it. But also the angle of trying to police information and elections is disturbing, especially after several instances of the EU interfering in elections in recent times.

anonymousDan•3h ago

You talk as if the US hasn't attempted to interfere in elections.

If online ads can be trivially used by big US tech companies to sway our elections using misinformation without it being observable to anyone or possible to refute (as would be the case for newspaper or TV ads) then why shouldn't it be monitored?

andsoitis•3h ago

Can you tell us more about how big tech has used ads to sway elections?

Also, minor detail, TikTok in the EU is not a US tech company.

pms•2h ago

I don't think it's about the US in its entirety, nor ads, but Republicans and Elon Musk have become very skilled at exerting political influence in the US [1] and Europe [2] through social media in ways the public isn't really aware of:

[1] https://www.techpolicy.press/x-polls-skew-political-realitie...

[2] https://zenodo.org/records/14880275

anigbrowl•2h ago

The platforms themselves strike many as a privacy nightmare. I'm not aware of any mass data breaches that can be attributed to poor academic security in recent memory.

> instances of the EU interfering in elections

Do tell.

loeg•1h ago

Eh. Users choose to use the platform. Users concerned about privacy don't have to use it.

How can I, an end user who doesn't trust the ability of these researchers to keep my data private, prevent the platform from sharing my data with them?

dmix•3h ago

I'd hate to be the engineer that has to deal with these requests. Not even a formal government investigation, just any number of random 3rd party researchers demanding specialized access.

consumer451•3h ago

> Engineer

Let me know when devs get stamps that make them legally liable for their decisions. Only then will that honor be applicable to software.

tdb7893•2h ago

Most of my friends are mechanical or aerospace engineers and it's all the same job in a different medium (many do a significant amount of software as part of their work). They don't have stamps and aren't any more legally liable than we are, and saying we aren't engineers just seems to be a misunderstanding of what engineering is.

consumer451•2h ago

I grew up in a structural and civil engineering family. My issue is that there is no path to "professional engineer" or "architect" in software, which as a Hammurabi old, makes me suspect of the entire profession. I am involved in software dev, and I would never call it engineering. This might be niche semantics, and yet it feels very important to me.

https://en.wikipedia.org/wiki/Regulation_and_licensure_in_en...

https://en.wikipedia.org/wiki/Engineering_law

https://en.wikipedia.org/wiki/Code_of_Hammurabi

zeroonetwothree•2h ago

Yes well unfortunately you aren’t the English language semantics overlord. So it doesn’t much matter what you think compared to general usage.

consumer451•1h ago

That's "literally" fine with me.

noir_lord•2h ago

I was an industrial electrician before I was a paid programmer.

I worked with engineers, what we generally do isn’t engineering by the standards of those engineers.

Which isn’t to say that all software development isn’t.

People writing avionics software and medical software etc are doing what I’d recognise as engineering.

It’s about the process more than anything.

Software in the wild is simply a young field and we aren’t there yet widely.

consumer451•2h ago

> Collins Aerospace: Sending text messages to the cockpit with test:test

https://news.ycombinator.com/item?id=45747804

___

Think about how physical engineering orgs are formed. It's a collective of engineers, as it should be. The reason is that zero consequence management abstraction layers cannot exist in a realm with true legal responsibility. Real engineering org structure is written in blood.

I wonder what it will take for software to face that reality. I know that lack of regulation leads to faster everything, and I really do appreciate and love that... but as software continues to eat the world, there will be real consequences eventually, right?

The reason that real engineering liability goes back to at least the Code of Hammurabi is that people got killed by bad decisions and corner cutting.

What will that look like in software history?

cwillu•1h ago

In the 90's, “Software is a young field” had a point. In the 2020's though, I think we have to start admitting to ourselves that it is largely an immature/developmentally-delayed field.

terminalshort•58m ago

If your definition of legitimacy rests on credentials rather than skill your personality sounds more suited for a lawyer than an engineer to me.

consumer451•31m ago

Personal insults aside, I hear you. In an attempt to get on the same page: do you know why the Code of Hammurabi called out deficient architecture/engineering?

This is not a gotcha. My understanding is that bad physical engineering kills people. Is that your understanding as well?

As software takes over more and more control of everything... do you see what I am getting at? Or, not at all?

To be clear, my understanding is that physical professional engineer (PE) legal responsibility is not like the medical ethical code of "do no harm." It's just follow best practices and adopted standards, don't allow test:test login on things like fighter jets, etc. If you fail that, then there may be legal repercussions.

We have allowed software "engineering" to skip all levels of basic responsibility, haven't we?

terminalshort•8m ago

I do. And I get where the Hammurabi Code is coming from. But note that it makes no mention of credentials or who is allowed to do the engineering. Only that if you screw it up the penalty is death.

And I suspect that if you instituted such a system today the results wouldn't be what you like. Failures in complex engineering are typically multiple failures that happen simultaneously when any individual failure would have been non fatal. The bugs are lurking always and when different systems interact in unpredictable ways you get a catastrophic failure. And the way that N systems can interact is on the order of 2^N, so it's impossible to think of everything. Applying the Hammurabi Code to software engineering wouldn't lead to safer software, it would lead to every engineer getting a lottery ticket every time they push a feature, and if the numbers come up you die.

Nextgrid•3h ago

The key is to make all the data public so there is no concept of “specialized access” and then you’re golden.

pms•2h ago

If applications and datasets are fragmented, then that's going to be a nightmare for all stakeholders, including:

* researchers, because they will have to write data access applications, including a sufficient description of planned safeguards, detailed enough to the point that their university is ready to take a legal liability (and you can imagine how easy this will be), and

* digital service coordinators, because it will take ages for them to process applications from thousands of researchers each requesting a slightly different dataset.

In the end, we need to develop standardized datasets across platforms and to streamline data access processes so that they're safe and efficient.

terminalshort•1h ago

Who is this better for? Not me, the user, that's for damn sure. So instead of just sharing my data with Meta, now I have to share it with Meta and everybody in the damn world?

paxys•2h ago

Engineers don't deal with the requests, lawyers do. No regular engineer at any big tech company is ever going to be in a position where they are responsible for such decisions.

lanyard-textile•1h ago

Legal will be the face of it, but engineers often handle the actual underlying request.

Over a couple large public companies, I’ve had to react to a court ruling and stop an account’s actions, work with the CA FTB for document requests, provide account activity for evidence in a case, things like that.

thrwaway55•1h ago

Uhhhhhhhh who do you think builds those tools to enforce the things legal stamps?

Delete all docs we aren't legally required to retain on topic Y before we get formally subpoena'd. We expect it to be on XXX based on our contact.

hexage1814•3h ago

Do their own scraping, for God's sake.

pms•3h ago

Except platforms don't allow it and have sued for it...

https://www.google.com/search?q=x+sues+researchers

Esophagus4•2h ago

It's a shame the legal system favors big money, because courts typically rule that scraping public data is not against the law[1]. In spite of how much the platforms protest.

Sadly, big companies can bully the scrapers with nuisance lawsuits as long as they wear the scrapers down in legal costs before it gets to trial.

[1]https://www.proskauer.com/release/proskauer-secures-dismissa...

loeg•1h ago

Good!

Workaccount2•3h ago

Europe doing everything possible to scare away modern industry.

pms•2h ago

I hope it's quite the opposite, since this can lead to innovation as we figure out how to depolarize social media platforms or how to develop more politically neutral online media.

mc32•2h ago

I agree on the need for depolarization (go back to timelines and get rid of recommendation engines) but once you cede control of content to government even if it's for things people would agree on like "nuke that misinformation" you will end up being a mouthpiece for the government in power -whoever it be. Look at how "innocent" embeds wholly shaped the messaging on Covid and how they sidelined all dissent (lots of people of renown and stature within relevant disciplines)

pms•1h ago

That's a great point. I agree that's a danger, but please note the DSA doesn't cede control of content to government. Rather, it creates an institution of (national) Digital Service Coordinators (DSCs) that decide whether a researcher's access to data is well-reasoned. In most cases that institution will be in a different country (the country of the company's EU headquarters) than the researcher. That said, there could be malicious players involved, e.g., researchers and respective DSCs secretly recruited by a government to influence elections. This, however, sounds implausible, since in principle both the DSCs and researchers are independent from national governments.

Also, we can have depolarized recommendation algorithms. We don't need to go back all the way to timelines.

thesmtsolver•1h ago

That's quite optimistic given EU's track record in practice.

E.g., DieselGate. Europe was more impacted but US caught Volkswagen cheating.

https://en.wikipedia.org/wiki/Volkswagen_emissions_scandal#E...

pms•1h ago

It's also quite optimistic to think that the industry will self-regulate, as the recent history of Boeing 737 MAX shows...

mc32•1h ago

It’s also optimistic to think the gov will do what’s good for the people as exemplified by Chernobyl.

There is no “good” answer. Each has its pros and cons.

thesmtsolver•56m ago

No one is saying that the industry will self-regulate. There is a right amount of regulation and all evidence points to the EU being over that limit. The US is below (probably) but closer.

terminalshort•52m ago

This reads to me like: "please note DSA doesn't cede the control of content to government, but rather it creates a more obfuscated and shady government that pretends not to be a government, but is actually 10x worse and completely devoid of democratic control, and then it cedes control to that."

terminalshort•56m ago

What you want is censorship. Using words like "depolarize" to hide it doesn't fool anyone. The polarization comes not from the platform, but from its users.

anigbrowl•2h ago

What's the product of this industry? It certainly generates huge negative externalities.

abtinf•2h ago

Chat Control is a never ending battle.

ggm•2h ago

American corporations don't want to accede to European rules about access to data, but it would be grossly simplistic to say all the problem lies on either side. I am not an EU resident but I tend to think there are sound reasons for some of what they want. As American corporate entities, the majors are bound up in some tricky issues around US State Department expectations and FTC expectations, but it isn't just trade, it's also civil liberties and privacy-vs-security political autonomy issues.

I would have preferred that companies like this had emerged as federated entities, with European data staying in European DCs and subject to European laws. I think it would have avoided a lot of this if they had not constructed themselves to be a single US corporate shield, with transfer pricing on the side to maximise profit.

user3939382•2h ago

These services shouldn’t exist in the first place.

charcircuit•2h ago

Allowing mass scraping like researchers want will further push people away from public platforms towards private group chats like Discord.

pms•2h ago

[1] https://www.techpolicy.press/x-polls-skew-political-realitie...

[2] https://zenodo.org/records/14880275

nothrowaways•1h ago

Hi X

nothrowaways•1h ago

They left out X because he will bitch_ about it to his 600 million followers lol.

nothrowaways•1h ago

Interesting times

sva_•1h ago

I'm not really sure what the argument is for letting these researchers make users of these platforms non-consenting participants in their studies? Is it even possible to opt out?

MiiMe19•1h ago

Nope :D The EU has determined that you WILL be a part of these studies!

_el1s7•1h ago

Yes, there is an opt out, make your profile private.

_el1s7•1h ago

That's why https://tikapi.io exists.

Nio1024•28m ago

Because of the EU’s strict regulations, many internet products have simply given up on the European market. In some ways, this makes Europe seem a bit “behind the times.” Of course, the world still needs some conservatives to keep things in balance.
