
Tailscale state file encryption no longer enabled by default

https://tailscale.com/changelog
98•traceroute66•1h ago•53 comments

Sugar industry influenced researchers and blamed fat for CVD (2016)

https://www.ucsf.edu/news/2016/09/404081/sugar-papers-reveal-industry-role-shifting-national-hear...
539•aldarion•7h ago•342 comments

LMArena is a cancer on AI

https://surgehq.ai/blog/lmarena-is-a-plague-on-ai
42•jumploops•17h ago•13 comments

NPM to implement staged publishing after turbulent shift off classic tokens

https://socket.dev/blog/npm-to-implement-staged-publishing
95•feross•3h ago•13 comments

Shipmap.org

https://www.shipmap.org/
372•surprisetalk•7h ago•62 comments

US will ban Wall Street investors from buying single-family homes

https://www.reuters.com/world/us/us-will-ban-large-institutional-investors-buying-single-family-h...
351•kpw94•2h ago•346 comments

Eat Real Food

https://realfood.gov
280•atestu•4h ago•523 comments

LaTeX Coffee Stains (2021) [pdf]

https://ctan.math.illinois.edu/graphics/pgf/contrib/coffeestains/coffeestains-en.pdf
255•zahrevsky•7h ago•54 comments

Health care data breach affects over 600k patients, Illinois agency says

https://www.nprillinois.org/illinois/2026-01-06/health-care-data-breach-affects-600-000-patients-...
124•toomuchtodo•5h ago•45 comments

Claude Code Emergent Behavior: When Skills Combine

https://vibeandscribe.xyz/posts/2025-01-07-emergent-behavior.html
32•ryanthedev•2h ago•16 comments

We found cryptography bugs in the elliptic library using Wycheproof

https://blog.trailofbits.com/2025/11/18/we-found-cryptography-bugs-in-the-elliptic-library-using-...
23•crescit_eundo•6d ago•2 comments

Native Amiga Filesystems on macOS / Linux / Windows with FUSE

https://github.com/reinauer/amifuse
53•doener•4d ago•10 comments

2026 Predictions Scorecard

https://rodneybrooks.com/predictions-scorecard-2026-january-01/
8•calvinfo•31m ago•3 comments

Notion AI: Unpatched data exfiltration

https://www.promptarmor.com/resources/notion-ai-unpatched-data-exfiltration
25•takira•2h ago•1 comment

Creators of Tailwind laid off 75% of their engineering team

https://github.com/tailwindlabs/tailwindcss.com/pull/2388
776•kevlened•6h ago•486 comments

A4 Paper Stories

https://susam.net/a4-paper-stories.html
262•blenderob•9h ago•130 comments

Many hells of WebDAV

https://candid.dev/blog/many-hells-of-webdav
97•candiddevmike•6h ago•55 comments

Building voice agents with Nvidia open models

https://www.daily.co/blog/building-voice-agents-with-nvidia-open-models/
59•kwindla•6h ago•3 comments

Michel Siffre: This man spent months alone underground – and it warped his mind

https://www.newscientist.com/article/mg23931900-400-this-man-spent-months-alone-underground-and-i...
6•Anon84•6d ago•1 comment

ChatGPT Health

https://openai.com/index/introducing-chatgpt-health/
89•saikatsg•2h ago•85 comments

What *is* code? (2015)

https://www.bloomberg.com/graphics/2015-paul-ford-what-is-code/
99•bblcla•6d ago•40 comments

A glimpse into V8 development for RISC-V

https://riseproject.dev/2025/12/09/a-glimpse-into-v8-development-for-risc-v/
17•floitsch•17h ago•2 comments

Meditation as Wakeful Relaxation: Unclenching Smooth Muscle

https://psychotechnology.substack.com/p/meditation-as-wakeful-relaxation
116•surprisetalk•7h ago•76 comments

Show HN: I visualized the entire history of Citi Bike in the browser

https://bikemap.nyc/
12•freemanjiang•3h ago•5 comments

So you wanna de-bog yourself (2024)

https://www.experimental-history.com/p/so-you-wanna-de-bog-yourself
6•calvinfo•55m ago•1 comment

Optery (YC W22) Hiring a CISO and Web Scraping Engineers (Node) (US and Latam)

https://www.optery.com/careers/
1•beyondd•10h ago

My first paper: A practical implementation of Rubiks cube based passkeys

https://ieeexplore.ieee.org/document/11280260
5•acorn221•33m ago•1 comment

Show HN: An LLM response cache that's aware of dynamic data

https://blog.butter.dev/on-automatic-template-induction-for-response-caching
4•raymondtana•1h ago•0 comments

A tab hoarder's journey to sanity

https://twitter.com/borisandcrispin/status/2008709479068794989
68•borisandcrispin•4h ago•74 comments

Polymarket refuses to pay bets that US would 'invade' Venezuela

https://www.ft.com/content/985ae542-1ab4-491e-8e6e-b30f6a3ab666
205•petethomas•19h ago•199 comments

Launch HN: Tamarind Bio (YC W24) – AI Inference Provider for Drug Discovery

75•denizkavi•1d ago
Hi HN, we're Deniz and Sherry from Tamarind Bio (https://www.tamarind.bio). Tamarind is an inference provider for AI drug discovery, serving models like AlphaFold. Biopharma companies use our library of leading open-source models to design new medicines computationally.

Here’s a demo: https://youtu.be/luoMApPeglo

Two years ago, I was hired at a Stanford lab to run models for my labmates. A post-doc would ask me to run a set of 1-5 models in sequence on tens of thousands of inputs, and I would email them back the results after setting up the workflow on the university cluster.

At some point, it became unreasonable for all of an organization's computational biology work to go through an undergrad, so we built Tamarind as a single place for all molecular AI tools, usable at massive scale with no technical background needed. Today, we are used by many of the top 20 pharma companies, dozens of biotechs, and tens of thousands of scientists.

When we started getting adoption at the big pharma companies, we found that this problem persisted there too. I know directors of data science half of whose job could be described as running scripts for other people.

Lots of companies have also deprecated their internally built solutions to switch over; dealing with GPU infra and onboarding Docker containers is not a very exciting problem when the company you work for is trying to cure cancer.

Unlike non-specialized inference providers, we build both a programmatic interface for developers and a scientist-friendly web app, since most of our users are non-technical. Some of them used to extract proteins from animal blood before replacing that process with AI-generated proteins designed on Tamarind.

Besides building container images for each of the models we serve, we've designed a standardized schema for sharing each model's data format. We've also built a custom scheduler and queue optimized for horizontal scaling (each inference call takes minutes to hours and runs on one GPU at a time), while splitting jobs across CPUs and GPUs for optimal timing.
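As a rough illustration of what such a standardized schema can look like (a hypothetical sketch, not Tamarind's actual data model; all names here are invented), each model declares its I/O contract and compute needs once, and incoming jobs are validated against it:

```python
from dataclasses import dataclass

@dataclass
class ModelSpec:
    """Describes one hosted model's I/O contract and compute needs."""
    name: str                      # e.g. "alphafold2"
    input_schema: dict             # input field -> format, e.g. {"sequence": "fasta"}
    output_schema: dict            # output field -> format, e.g. {"structure": "pdb"}
    gpu_count: int = 1             # each inference call runs on one GPU
    typical_runtime_min: int = 30  # minutes-to-hours scale

@dataclass
class Job:
    """One queued inference call against a registered model."""
    model: ModelSpec
    inputs: dict
    tenant: str                    # the org submitting the job

    def validate(self) -> bool:
        # A job is well-formed if it supplies every input field the model declares.
        return set(self.model.input_schema) <= set(self.inputs)

# Usage: register a model once, then validate jobs against its declared schema.
af2 = ModelSpec("alphafold2", {"sequence": "fasta"}, {"structure": "pdb"})
job = Job(af2, {"sequence": ">prot\nMKV"}, tenant="acme-bio")
print(job.validate())  # True
```

A shared contract like this is what lets one web UI and one API surface dozens of otherwise inconsistent models.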

As we've grown to handle a substantial portion of the biopharma R&D AI demand on behalf of our customers, we've expanded beyond just offering a library of open source protocols.

A common use case we saw from early on was the need to connect multiple models together into pipelines, and having reproducible, consistent protocols to replace physical experiments. Once we became the place to build internal tools for computational science, our users started asking if they could onboard their own models to the platform.

From there, we now support fine-tuning, building UIs for arbitrary Docker containers, connecting to wet-lab data sources, and more!

Reach out to me at deniz[at]tamarind.bio if you're interested in our work; we are hiring! Check out our product at https://app.tamarind.bio and let us know if you have any feedback on how we can support the biotech industry's use of AI today.

Comments

brandonb•1d ago
Congrats on the launch. I always love to see smart ML founders applying their talents to health and bio.

What were the biggest challenges in getting major pharma companies onboard? How do you think it was the same or different compared to previous generations of YC companies (like Benchling)?

denizkavi•1d ago
Thanks! I think the advantage we had over previous generations of companies is that the demand for and value of software has become much clearer to biopharma. The models are beginning to actually work on practical problems, most companies have AI, data science, or bioinformatics teams that apply these workflows, and AI has management buy-in.

Some of the same problems exist: large enterprises don't want to process their un-patented, future billion-dollar drug via a startup, because leaking the data could destroy 10,000 times the value of the product being bought.

Pharma companies are especially not used to buying products rather than research services, and there are historical issues with the industry not being served by high-quality software, so building custom things internally is something of a habit.

But I think the biggest unlock was just that the tools are actually working as of a few years ago.

idontknowmuch•1d ago
What tools are "actually working" as of a few years ago? Foundation models, LLMs, computer vision models? Lab automation software and hardware?

If you look at the recent research on ML/AI applications in biology, the majority of work has, for the most part, not provided any tangible benefit for improving the drug discovery pipeline (e.g. clinical trial efficiency, drugs with low ADR/high efficacy).

The only areas showing real benefit have been off-the-shelf LLMs for streamlining informatic work, and protein folding/binding research. But protein structure work is arguably a tiny fraction of the overall cost of bringing a drug to market, and the space is massively oversaturated right now with dozens of startups chasing the same solved problem post-AlphaFold.

Meanwhile, the actual bottlenecks—predicting in vivo efficacy, understanding complex disease mechanisms, navigating clinical trials—remain basically untouched by current ML approaches. The capital seems to be flowing to technically tractable problems rather than commercially important ones.

Maybe you can elaborate on what you're seeing? But from where I'm sitting, most VCs funding bio startups seem to be extrapolating from AI success in other domains without understanding where the real value creation opportunities are in drug discovery and development.

unignorant•20h ago
These days it's almost trivial to design a binder against a target of interest with computation alone (tools like boltzgen, many others). While that's not the main bottleneck to drug development (imo you are correct about the main bottlenecks), it's still a huge change from the state of technology even 1 or 2 years ago, where finding that same binder could take months or years, and generally with a lot more resources thrown at the problem. These kinds of computational tools only started working really well quite recently (e.g., high enough hit rates for small scale screening where you just order a few designs, good Kd, target specificity out of the box).

So both things can be true: the more important bottlenecks remain, but progress on discovery work has been very exciting.

machbio•1d ago
Looks good - would have really appreciated if the pricing page contained any examples of pricing instead of book a meeting
denizkavi•1d ago
That's fair. I wish we were able to just add a calculator for getting a price on a per-hour basis, given your models of interest and intended volume.

We actually did have this available early on; our rationale for structuring it differently now is that there is a lot of diversity in how people use us. We have some examples where a twenty-person biotech will consume more inference than a several-hundred-person org. Each tool has very different compute requirements, and people may not know exactly which model they will be using. Basically, we weren't able to let people calculate usage, annual commitment, and integration and security requirements in one place.

We do have a free tier, which tends to give a decent estimate of usage hours, and a form you can fill out so we can get back to you with a more precise price.
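For a back-of-envelope sense of what a per-hour calculator like the one described above computes, here is a tiny sketch; the model names and rates are invented for illustration, not Tamarind's actual pricing:

```python
# Hypothetical per-GPU-hour rates; real pricing is quoted per customer.
RATES_PER_GPU_HOUR = {"alphafold2": 3.00, "boltz": 2.50}

def estimate_monthly_cost(model: str, runs_per_month: int, hours_per_run: float) -> float:
    """Back-of-envelope estimate: runs x hours-per-run x per-hour rate."""
    return runs_per_month * hours_per_run * RATES_PER_GPU_HOUR[model]

print(estimate_monthly_cost("alphafold2", runs_per_month=100, hours_per_run=0.5))
# 100 * 0.5 * 3.00 = 150.0
```

The hard part, per the comment above, is that hours-per-run varies wildly by model and neither it nor the model mix is known up front.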

Akshay0308•1d ago
That's really cool! How much do scientists at big pharma use open-source models as opposed to models trained on their proprietary data? Do you guys have tie-ups to provide inference for models used internally at big pharma trained on proprietary data?
denizkavi•1d ago
Good amount of both! I would say proprietary models tend to be fine-tuned versions of the published ones, although many are completely new architectures. We also let folks fine-tune models on their proprietary data directly on Tamarind.

We do let people onboard their own models too: users just see a separate tab for their org, which is where all the scripts, Docker images, and notebooks their developers built interfaces for live on Tamarind.

washedDeveloper•1d ago
The org I work for develops HTCondor. We have a lot of scientists who end up running AlphaFold and other bio-related models on our pool of GPUs and CPUs. I am curious how and why your team implemented yet another job scheduler. HTCondor is agnostic to the software being run, so maybe there is more clever scheduling you can come up with. That said, HTCondor also offers pretty high flexibility with regard to policy.
denizkavi•1d ago
That’s interesting. We’ve developed a Kubernetes-based scheduler that, we've found, better accounts for our custom job-priority needs, allows stricter data isolation between tenants, and gives us a production-grade control plane, though the core scheduling could certainly be implemented in something like HTCondor.

Originally, my first instinct was to use Slurm or AWS Batch, but we started having problems once we tried to go multi-cloud. We're also optimizing for being able to onboard an arbitrary codebase as fast as possible, so building a custom structure natively compatible with our containers (which are now generated automatically from Linux machines with the relevant models deployed) has been helpful.
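To make the scheduling discussion concrete, here is a minimal sketch of the general idea (job priorities plus per-tenant isolation via quotas) in Python. This is an illustration only, not Tamarind's actual Kubernetes-based scheduler; all names are invented:

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

@dataclass(order=True)
class QueuedJob:
    priority: int                        # lower value = more urgent
    seq: int                             # FIFO tiebreaker within a priority level
    tenant: str = field(compare=False)   # org the job belongs to
    payload: str = field(compare=False)  # opaque job description

class Scheduler:
    """Priority queue with per-tenant GPU quotas (a stand-in for tenant isolation)."""

    def __init__(self, gpu_quota: dict):
        self.queue: list[QueuedJob] = []
        self.quota = dict(gpu_quota)     # tenant -> free GPUs
        self._seq = count()

    def submit(self, tenant: str, payload: str, priority: int = 10) -> None:
        heapq.heappush(self.queue, QueuedJob(priority, next(self._seq), tenant, payload))

    def next_job(self):
        # Pop the most urgent job whose tenant still has a free GPU;
        # jobs skipped over a quota limit are re-queued unchanged.
        skipped, job = [], None
        while self.queue:
            cand = heapq.heappop(self.queue)
            if self.quota.get(cand.tenant, 0) > 0:
                self.quota[cand.tenant] -= 1
                job = cand
                break
            skipped.append(cand)
        for s in skipped:
            heapq.heappush(self.queue, s)
        return job

# Usage: the most urgent job with remaining tenant quota runs first.
sched = Scheduler({"acme-bio": 2, "globex-rx": 1})
sched.submit("acme-bio", "fold job A", priority=5)
sched.submit("globex-rx", "fold job B", priority=1)
print(sched.next_job().payload)  # "fold job B"
```

A real control plane adds preemption, node placement, and multi-cloud failover on top, but the priority-with-quota core is the same shape.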

the__alchemist•1d ago
Cool project! I have a question based on the video: what sort of work is it doing for the "Upload mmCIF file and specify number of molecules to generate" query? That seems like a broad ask. For example, is it performing ML inference on a data set of protein characteristics, or on pockets in that protein? Using a ligand DB, or generating ligands? How long does that run take?
denizkavi•1d ago
In this case the input to the model is the structure of the protein target, i.e. you can define the whole search space in which it tries to find a binder/drug against the target. We let you pick a preset recipe at the top, which are basically the common ways people use this protocol. The model itself can find a pocket, or the user can specify one if they know it ahead of time. There is a very customizable variant of this tool, where you can set distances between individual atoms or make a custom scaffold for your starting molecule, but 90% of the time the presets tend to be sufficient.

Runs vary significantly between the models/protocols used: some generative models can take several hours, while some run in a few seconds. We have tools that will screen against DBs if the goal is to find an existing molecule to act against the target, but often people will import an existing starting point and modify it, or design completely novel molecules on the platform.

t_serpico•1d ago
nice stuff! how do you handle security concerns big pharma may have? wouldn't they just run their stuff on-prem?
denizkavi•1d ago
It certainly was an investment for us to meet the security and enterprise-readiness criteria of our enterprise users. As an n of 1: we don't tend to do on-prem, and even the most skeptical companies will find a way to use cloud if they want your product enough.

I think most large companies have similar expectations around security requirements, so once those are resolved most IT teams are on your side. We occasionally do specific things like allowing our product to run in a VPC in the customer's cloud, but I imagine this is just what most enterprise-facing companies do.

johnsillings•22h ago
selling to big pharma companies as a startup is hard, so huge props on getting adoption there. the product looks very slick.
conradry•17h ago
You may find this library I wrote a couple of years ago interesting: https://github.com/conradry/prtm. Curious why you chose to make separate images for each model instead of copy-pasting source code into a big monorepo (similar to Hugging Face transformers).
denizkavi•16h ago
Oh yeah, I've seen this before! Cool stuff

I would say the primary concerns were: dependency issues; needing more than model weights to be able to consume a model (Multiple Sequence Alignment needs to be split out and has its own always-on server, and so on); and that it's more convenient if the inputs and outputs are hardened interfaces between different envs.

Our general finding in BioML is that the models are not at all standardized, especially compared to, say, the diffusion-model world, so treating each one with its own often-weird dependencies helped us get more tools out quicker.
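The one-image-per-model approach can be sketched as a thin wrapper that hands each model its inputs as JSON and keeps its dependencies walled off inside its own container image. The image name and flags below are illustrative assumptions, not Tamarind's actual interface:

```python
import json

def container_cmd(image: str, inputs: dict, gpus: int = 1) -> list[str]:
    """Assemble a `docker run` invocation for one model image (hypothetical).
    Each model lives in its own image, so its unusual dependencies never leak
    into another model's environment; the only shared contract is JSON in."""
    return [
        "docker", "run", "--rm",
        f"--gpus={gpus}",                          # one GPU per inference call
        image,                                     # per-model image, e.g. built per codebase
        "--inputs", json.dumps(inputs, sort_keys=True),
    ]

# Usage: the caller never imports the model's code, only its image name.
cmd = container_cmd("example/alphafold2:latest", {"sequence": "MKV"})
print(" ".join(cmd))
```

The trade-off versus a monorepo is exactly the one described above: hardened boundaries and faster onboarding per model, at the cost of maintaining many images.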