I think AI companies are afraid of being a huge flop, of throwing huge amounts of money in the trash, of being publicly exposed for chasing a pipe dream.
They're paranoid that some foreign state is compromising their efforts (never the problem; the issue is that LLMs suck), paranoid that adoption is being dragged down by developers afraid of losing their jobs (again, not the problem; the issue is that LLMs suck), and so on.
From this point, there are only two outcomes: either the biggest advancement in tech (cure cancer, solve democracy, etc.) or the biggest failure (everyone invested in a useless toy). No one is interested in a disappointing miracle, and all the hype is the responsibility of the companies themselves.
I think they're scared of being in this position, and they're making unreasonable decisions based on that fear.
That being said:
1. Most human-generated media is soulless slop too.
2. The gen-AI genie is out of the bottle. It is hubris and delusion to believe that it can be wished away. For every redditor, journalist, and social media influencer tilting at that windmill, foaming at the mouth in impotent rage, there are tens of thousands of people who think it's useful and neat. With the current US administration being much more friendly to it than the previous one, with China rapidly catching up, and with the sheer amount of capital behind it, gen-AI is here to stay.
tonetegeatinst:
In any case, having a tag that lets you recognize what is and is not AI-generated can be useful, as it can point to the model or method that created the data.
Take math notes or proof data points: some could be AI-generated and some could be produced by students working as interns.
The ability to discern AI content from human-generated output is valuable to those who will be using this data in the future, and it is also another way to sort and catalogue the data.
Would I still check AI output? Of course. Part of building good datasets is tagging the correct and incorrect data; computer vision relies on a similar labeling method to teach object recognition.
Overall, I see the value in tagging the data as AI-generated. My concern would be bias: someone either trusting the AI output completely or ignoring the data samples entirely because they came from an AI.
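As a minimal sketch of what such provenance tagging could look like in Python (the schema and field names here are hypothetical, not drawn from any particular dataset standard):

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class DataPoint:
        """One dataset record carrying provenance metadata with its content."""
        content: str
        source: str                     # "ai" or "human"
        model: Optional[str] = None     # which model produced it, when source == "ai"
        verified: bool = False          # True once a human has checked the sample
        correct: Optional[bool] = None  # set during review: did the sample check out?

    # Mixing AI-generated and intern-written proof sketches in one corpus:
    corpus = [
        DataPoint("Proof: by induction on n ...", source="ai", model="some-llm"),
        DataPoint("Proof: assume the contrary ...", source="human"),
    ]

    # Sorting and cataloguing by provenance is then a simple filter:
    ai_samples = [d for d in corpus if d.source == "ai"]
    human_samples = [d for d in corpus if d.source == "human"]

Keeping verified and correct separate from source would also guard against the bias worry above: whether a sample gets used should depend on whether it checked out under review, not merely on who or what produced it.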