I.e., where they are making money off of this.
One clear indication that there are strings attached is that they're bundling a specialized GenAI assistant with the IDE.
Wish this were made clear in the FAQ; it doesn't cover it at all.
tl;dr is that the desktop app (including remote SSH sessions) is free to use with a permissive license (no account needed, no subscription, commercial use is OK, etc), but using Positron in a server mode does require a paid subscription.
Why did you relicense it under Elastic License 2.0 from VSCode’s MIT?
A better alternative would be proprietary extensions under a different license like Microsoft does.
We talk a bit about why we chose the Elastic license here: https://positron.posit.co/licensing.html
We have thought pretty carefully about what kind of functionality works well in extensions (in fact, we build and maintain a number of extensions!) and came to the decision that the more integrated data science experience we wanted to make required forking.
Also, are you using Open VSX, and what’s your take on the recent malware extension story?
We do use OpenVSX, yes, like the other forks, and our company is a major sponsor of OpenVSX. Security around the extension ecosystem is a pretty messy, complicated issue both for the proprietary Microsoft marketplace and OpenVSX. For example, the recent Amazon Q story! I currently think about it as conceptually fairly similar to the risks of using packages from PyPI or npm.
It actually defined a whole new class of ultra-portable 3D printers (Positron-style printers; its drive system is named the "Positron drive").
A sibling of the Positron is the JourneyMaker:
> https://github.com/mcfazio2001/JourneyMaker-Positron
A cost-reduced (no CNC-machined parts) variant of the JourneyMaker with a unibody chassis that you can 3D-print yourself is the Lemontron:
And not to beat a dead horse, but I'm also not a huge fan of the broad claims around it being OSS when it very clearly has some strict limitations.
I've already had to migrate from R Connect Server / Posit Server at work, because of the extreme pricing for doing simple things like having auth enabled on internal apps.
We found a great alternative that's much better anyway, plus made our security folks a lot happier, but it was still a massive pain and frustrated users. I've avoided any commercial products from Posit since then, and this one makes me hesitant, especially with these blurry lines.
It paid for itself in terms of scientists spinning up their own projects without having to provision server hardware, VMs, or anything else.
We're still investing in RStudio and while the products have some overlap there's no attempt to convert people from one to the other.
(I work at Posit on both of these products)
This is what we use: https://domino.ai/ The marketing is a bit intense on the website, but the docs are pretty good: https://docs.dominodatalab.com/en/cloud/user_guide/71a047/wh...
They definitely target large companies, but you can use their SaaS offering, and it can be relatively affordable. The best part is the flexibility and scaling, but the license model is great too: there's no usage-based billing. You just pay a flat license fee per user who writes code, plus the underlying cloud costs, and they'll deploy it on GCP, AWS, or Azure.
They're used by a lot of large companies, but academia as well to replace or augment on-prem HPC clusters. That's what we used them for as well.
I'm interested in your opinion as a user on a bit of a new conundrum for me: for as many jobs / contracts as I can remember, the data science was central enough that we were building it ourselves from like, the object store up.
But in my current role, I'm managing a whole different kind of infrastructure that pulls in very different directions and the people who need to interact with data range from full-time quants to people with very little programming experience and so I'm kinda peeking around for an all-in-one solution. Log the rows here, connect the notebook here, right this way to your comprehensive dashboards and graphs with great defaults.
Is this what I should be looking at? The code that needs to run on the data is your standard statistical and numerics Python type stuff (and if R was available it would probably get used but I don't need it): I need a dataframe of all the foo from date to date and I want to run a regression and maybe set up a little Monte Carlo thing. Hey that one is really useful, let's make it compute that every night and put it on the wall.
I think we'd pay a lot for an answer here and I really don't want to like, break out pyarrow and start setting up tables.
For me the core of the solution - parquet in object store at rest and arrow for IPC - hasn't changed in years, but I'm tired of rebuilding the whole metadata layer and job dependency graph at every new place. Of course the building blocks get smarter with time (SlateDB, DuckDB, etc.), but it's all so tiresome.
On the front end I've always had reasonable outcomes with `wandb` for tracking runs once you kind of get it all set up nicely, but it's a long tail of configuration and writing a bunch of glue code.
In this situation I'm dealing with a pretty medium amount of data and very modest model training needs (closer to `sklearn` than some mega-CUDA thing) and it feels like I should be able to give someone the company card and just get one of those things with 7 programming languages at the top of the monospace text box for "here's how to log a row", we do Smart Things and now you have this awesome web dashboard and you can give your quants this `curl foo | sh` snippet and their VSCode Jupyter will be awesome.
One other big thing: Domino isn't a database or data warehouse. You pair it with something like BigQuery or Snowflake or just S3, and it takes away a huge amount of the headache of using those things for the staff you're describing. The best way to understand it is to look at this page: https://docs.dominodatalab.com/en/cloud/user_guide/fa5f3a/us...
People at my work, myself included, absolutely love this feature. We have an incredibly strict and complex cloud environment, and this makes it so people can skip the setup nonsense and it just works.
This isn't to say that you can't store data in Domino; it's just not a SQL engine. Another well-loved feature is their datasets. It's just EFS masquerading as NFS, but Domino handles permissions and mounting, and it's great for non-SQL file storage. https://docs.dominodatalab.com/en/cloud/user_guide/6942ab/us...
So, with those constraints in mind, I'd say it's great for what you're describing. You can deploy apps or API endpoints. You can create on-demand large scale clusters. We have people using Spark, Ray, Dask, and MPI. You can schedule jobs and you can interact with the whole platform programmatically.
The problem? You need a license to use it; it's not OSS. You are not allowed to offer it as a hosted/managed service:
```
Limitations

You may not provide the software to third parties as a hosted or managed service, where the service provides users with access to any substantial set of the features or functionality of the software.

You may not move, change, disable, or circumvent the license key functionality in the software, and you may not remove or obscure any functionality in the software that is protected by the license key.

You may not alter, remove, or obscure any licensing, copyright, or other notices of the licensor in the software. Any use of the licensor's trademarks is subject to applicable law.
```
this seems unmaintained https://dayssincelastvscodefork.com/
The API is so good that a lot of core VS Code behavior (e.g. GitHub integration, support for lots of languages) is implemented in the form of built-in extensions.
It is possible to get 80% of VS Code's functionality with 10-20% of the code if you just bake everything into one monolith, but this has been tried repeatedly and it keeps failing in part because the extension ecosystem and attendant network effects form a wide moat.
(disclaimer - I work on Positron)
(But that’s just the editor component, if you need all the other IDE stuff you’ll have to build it :-)
For something non-browser, I’m currently using Zed and it’s pretty good: https://zed.dev/
(Hiding behind my couch after writing that)
That said, I do know that the type of person who likes configuring things very in-depth can set up intricate and powerful workflows in Emacs. I don't know what kind of data science IDE specifically you're interested in putting together, but here's a general article:
https://michaelneuper.com/posts/replace-jupyter-notebook-wit...
There's also this MOOC on reproducible research in French and English from Inria, where you're encouraged to follow the course in one of three ways: Jupyter, RStudio, or in Emacs' Org-Mode. I'd love to do it, but can't really justify spending the time at the minute.
https://www.fun-mooc.fr/en/courses/reproducible-research-met...
The creator of Org-mode, Carsten Dominik, is an astronomer by trade, so it's very much a scientist's tool. A few of his talks are listed on this page, if you're interested in going straight to the source:
I would say that Positron is better for folks who use more than one language (not only R) or want to customize/extend their IDE in a way that is not possible in RStudio.
If you find that interesting, there are actually quite a few others
https://github.com/airbytehq/airbyte-platform/blob/main/LICE...
https://github.com/apollographql/federation/blob/main/LICENS...
I appreciate that full IDEs are heavy tools, but when I just need an editor I go with vim; if I have to do real work, why not take out the power tools?
Positron provides a batteries-included experience that lets you work with Python and R out of the box; it's easier to get started, everything's already set up for data work, and the tools all work together smoothly. At least, that's the goal. :-)
(disclaimer - I work on Positron)
Also, my impression is that this is a big part of why MATLAB still exists, despite outrageous prices.
I think the common theme among these tools' main user groups is that they are not developers. They are not comfortable fiddling a lot with a dev environment, but can be productive in an environment where everything just works.
Thus, if Positron can get the same smooth and rock-solid out-of-the box experience, it will be able to reach a lot of these non-developer user groups.
At least that's my 5c.
Graphical table creator? View and export ER schemas? SQL Syntax and autocomplete in .sql files AND within literal strings in your code? Query explainer?
Yeah, I don't think so.
But we have some pretty big aspirations around expanding our SQL support, based on the features we have already built like that Connections pane, our Data Explorer, our Observable support via Quarto, etc. We plan to invest in this area over the coming months, starting in Q4 this year.
I'll keep tabs on you guys, my DS colleagues might be interested in the project.
On Linux it's slow and buggy unfortunately. It's improving though.
But thanks for the adrenaline spike. Now I can start my morning.
I have different specific habits for R and Python, and I think it'll take a bit of time for people like me to switch. I also tried R in VS Code for a week, but something didn't feel right. I'm excited for the Connections pane if it works smoothly.
tl;dr is that we build a lot of extensions ourselves and are familiar with what you can and can't do with the API. Our goals for a data science IDE require a more fully integrated experience than is possible with the VS Code extension API.