LLM-feat: Python library for automated feature engineering with Pandas

https://pypi.org/project/llm-feat/
1•srinivaskumarr•1mo ago

Comments

srinivaskumarr•1mo ago
*What My Project Does:*

llm-feat is a Python library that uses OpenAI LLMs (such as GPT-4) to generate feature engineering code for pandas DataFrames. You provide your DataFrame plus metadata describing what each column means, and the LLM generates feature engineering code that takes your domain into account.

The library works directly in Jupyter notebooks - when you call the function, the generated code automatically appears in the next cell. You can also get detailed reports explaining the rationale behind each feature, which helps you understand what the LLM is thinking and why certain features were created.

Under the hood, it uses GPT-4's understanding of domain context to generate features that are specific to your problem. For example, when tested on a medical dataset, it generated clinically relevant features like lipid ratios (LDL/HDL) and BMI interactions that a generic rule-based library wouldn't know to create.
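To make the kind of feature mentioned above concrete, here is a minimal hand-written pandas sketch of a lipid ratio and a BMI interaction. It is not output from llm-feat and does not use its API; the column names (`ldl`, `hdl`, `bmi`, `bp`), the helper `add_domain_features`, and the toy values are all assumptions made for illustration.

```python
import pandas as pd

# Toy values for illustration only; the column names are assumed,
# not taken from llm-feat or from any particular dataset.
df = pd.DataFrame({
    "ldl": [130.0, 100.0, 160.0],
    "hdl": [50.0, 40.0, 0.0],
    "bmi": [22.0, 30.0, 27.5],
    "bp": [80.0, 95.0, 88.0],
})


def add_domain_features(frame: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of `frame` with two hand-written domain features."""
    out = frame.copy()
    # Lipid ratio (LDL/HDL). Zero HDL values are replaced with NaN first,
    # so the ratio becomes NaN instead of infinity.
    safe_hdl = out["hdl"].where(out["hdl"] != 0)
    out["ldl_hdl_ratio"] = out["ldl"] / safe_hdl
    # Interaction term between BMI and blood pressure.
    out["bmi_x_bp"] = out["bmi"] * out["bp"]
    return out


features = add_domain_features(df)
print(features)
```

Features like these are easy to review line by line, which is the practical benefit of getting code back rather than an opaque transformed matrix.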

*Target Audience:*

This library is designed for:

- Data Scientists and ML Engineers building predictive models who want to speed up the feature engineering process without sacrificing domain relevance.

- ML Practitioners working on real projects who need production-ready tools (I've been using it in my own work). It is especially useful during the exploratory phase, when you're trying to figure out what features might work.

- Anyone who is tired of manually engineering features and wants an intelligent assistant that understands context rather than just applying generic transformations.

*Comparison:*

vs. Rule-based libraries (featuretools, tsfresh): These libraries use predefined transformation rules that work across all domains but don't understand context. llm-feat uses LLMs to understand your specific domain and generate features that are relevant to your problem. For example, on a medical dataset, it generated lipid ratios and composite risk scores that a generic library wouldn't create.

vs. AutoML tools (AutoGluon, H2O AutoML): AutoML tools are black boxes that handle the entire ML pipeline. llm-feat gives you the actual code to review, modify, and understand. You maintain full control over your feature engineering process while getting intelligent suggestions.

vs. Manual feature engineering: Obviously much faster - what would take hours of domain research and coding happens in seconds. Plus, the LLM often suggests features you might not have thought of.

*Results:*

Tested on the Diabetes dataset:

- Baseline: RMSE 54.33 with 10 original features

- With LLM features: RMSE 53.53 with 20 features (10 original + 10 generated)

- Improvement: 1.47% RMSE reduction, R² improved from 0.44 to 0.46

The generated features included lipid ratios, BMI interactions, and composite risk scores that were clinically relevant and improved model performance.
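For anyone who wants to check the arithmetic behind the reported improvement, the short sketch below recomputes the relative RMSE reduction from the two RMSE values quoted above. The helper name `relative_reduction` is my own and is not part of llm-feat.

```python
def relative_reduction(baseline: float, improved: float) -> float:
    """Percentage by which `improved` is lower than `baseline`."""
    return (baseline - improved) / baseline * 100.0


# RMSE values as reported in the results above.
baseline_rmse = 54.33
llm_rmse = 53.53

reduction = relative_reduction(baseline_rmse, llm_rmse)
print(f"RMSE reduction: {reduction:.2f}%")  # prints "RMSE reduction: 1.47%"
```

The same helper can be reused for any error metric where lower is better; R² is reported on its own scale (0.44 to 0.46), so it is quoted as an absolute change rather than passed through this function.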

*Links & Source:*

GitHub: https://github.com/codeastra2/llm-feat

PyPI: pip install llm-feat

I would love feedback on the API design or suggestions for improvements!