I wouldn't mind these groups keeping their models private except that their success sucks all the air out of the room when it comes to developing fully open models. The vast majority of users are satisfied with the app or API, so if you aren't, you're going it alone. (Of course a for-profit company could have the same effect, but it feels extra bad when it's a non-profit/government agency doing it.)
contingencies•5h ago
Shame! IMHO open data input should yield open data output. The community contributes far too much time, data, expertise and money to tolerate this kind of BS, which raises questions about fundamental compatibility with science.
iNaturalist should remove non-open data and commit to fully open output within a fixed period of time to maintain community support.
scellus•4h ago
If they feel like keeping the models to themselves, I think it's fair game. I give them observations, they give me the ID service for free. Maybe they even sell the models to fund their development efforts? I wouldn't mind... they need to fund their functions somehow anyway.
And remember, their observation databases are open. In fact my observations are automatically copied to the databases of a national biodiversity institution (which is open as well, except for some critical species).
Institutions need to maintain themselves and be able to pay their employees so that they can feed their kids, etc.
contingencies•4h ago
scellus•4h ago
That's probably not a sustainable situation.
contingencies•4h ago
IMHO funds received by well-run non-profits will be banked, not spent, therefore they yield ongoing returns which are used to meet costs and sustain the organization. The fund origin is immaterial.
scellus•3h ago
momoschili•4h ago
scellus•3h ago
Meanwhile, the whole idea of iNaturalist has evolved around voluntary reporting, community involvement, and open data, and I think some of that needs to stay. They can't turn fully commercial.
chongli•43m ago
xattt•4h ago
contingencies•4h ago
scellus•3h ago
Especially selling identification services, which is related to keeping the models private, would make sense. Museums and various kinds of biodiversity monitoring schemes need mass identification, and having AI there to partially replace people would be a cost saving for the researchers and potential funding for iNaturalist. Offering such a service for free is neither practical nor justified.
(Meanwhile, I can imagine there are lots of naturalists who hate the idea of their services being partially replaced by AI. It may lower the quality, but the cost margin between a human and an iNat model is really wide.)
I think the EU had a plan to use AI identification in some of their monitoring schemes. It could have been iNaturalist or someone else; either way, it demonstrates the need.
kube-system•4h ago
They're a scientific 501(c)(3), not a FOSS 501(c)(3), right? It seems like their mission should be to support scientific progress, and sometimes that means using data that is encumbered with IP baggage. It seems like it would be against their mission (and borderline a violation of tax law) to take a stance on IP law... that isn't what they do.
contingencies•3h ago
This aligns with the suggestion to commit to fully open data and fully open models.
kube-system•3h ago
Using scientific data that they can do science with but can't share is 100% legit.
contingencies•3h ago
IMHO it's very hard to argue that something is in the public interest if the public can't see it, hold it, analyze it, criticize it, and replicate it: particularly in the field of science where we have a replication crisis.
If it's a black-box service, it's not science.
If it's replicable and open, thus provable, it's science.
kube-system•3h ago
There is no requirement that a 501(c)(3) post everything publicly.
I completely understand and agree that sharing science is a good thing... but it is also dumb to suggest that scientists must put their head in the sand and ignore data that just happens to be under copyright. And just because it is, doesn't mean that it can't be reviewed -- it means it can't be redistributed.
I mean, for heaven's sake, every science textbook I ever read in school was encumbered by copyright. That doesn't mean we should burn science textbooks or that the data in them is subject to some replication crisis.
I think you're building a mountain out of a molehill here.
contingencies•2h ago
kube-system•2h ago