
Ask HN: How are you cleaning and transforming data before imports/uploads?

21•dataflowmapper•1h ago
Hi all,

I’m curious how folks handle the prep work for data imports/uploads into systems like Salesforce, Workday, NetSuite, or really any app that uses template-based imports for data loading, migration, or implementation.

Specifically:

- How do you manage conversions/transformations like formatting dates, getting everything aligned with the templates, mapping old codes to new ones, etc.?

- Are you primarily using Excel, custom scripts, Power Query or something else?

- What are the most tedious/painful parts of this process and what have you found that works?

Really appreciate any insights and am curious to learn from everyone's experience.

Comments

PaulHoule•1h ago
"Scripts" in Python, Java and other conventional programming languages (e.g. whatever it is you already use)

Not Bash, not Excel, not any special-purpose tool because the motto of those is "you can't get there from here". Maybe you can get 80% of the way there, which is really seductive, but that last 20% is like going to the moon. Specifically, real programming languages have the tools to format dates correctly with a few lines of code you can wrap into a function, fake programming languages don't. Mapping codes is straightforward, etc.
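
As a rough illustration of the "few lines of code you can wrap into a function" point, here is a minimal Python sketch. The list of input formats and the code mapping are hypothetical, invented for this example; a real import would take them from the source system and the target template. It also assumes slash-separated dates are month-first, which would need checking against the actual data.

```python
from datetime import datetime

# Hypothetical input formats and code mapping, invented for illustration.
# Assumption: slash-separated dates are month/day/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d-%b-%Y", "%Y%m%d")
CODE_MAP = {"ACT": "Active", "INA": "Inactive", "TRM": "Terminated"}


def normalize_date(raw: str) -> str:
    """Return the date as ISO 8601 (YYYY-MM-DD), trying each known format."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def map_code(old: str) -> str:
    """Translate a legacy code, failing loudly on anything unmapped."""
    key = old.strip().upper()
    if key not in CODE_MAP:
        raise KeyError(f"no mapping for code: {old!r}")
    return CODE_MAP[key]
```

Raising on unknown dates and unmapped codes, rather than passing them through, is one possible design choice: it surfaces bad rows before the import instead of after.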

chaos_emergent•1h ago
for the longest time I envisioned some sort of configuration specification that could retrieve URLs, transform and map data, handle complex conditional flows...and then I realized that I wanted a Normal Programming Language for Commerce and started asking o3 to write me Python scripts.
aaronbrethorst•1h ago
Hell, for me, would be what you described and implemented in Yaml.
dataflowmapper•43m ago
Yeah, programming definitely offers the most flexibility if you have that skillset. I'm particularly interested in your 'last 20% is like going to the moon' analogy for special-purpose tools or even Excel/Bash. Do you have any examples off the top of your head of the kinds of transformation or validation challenges that you find fall into that really difficult 20%, where only a 'real programming language' can effectively get the job done?
stop50•1h ago
Python's csv module is extremely powerful. It has done what I needed to do with it.
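
For readers who have not used it, a minimal sketch of the csv module reshaping a source file into a template's columns might look like the following. The column names and the sample row are hypothetical, invented for this example.

```python
import csv
import io

# Hypothetical source-to-template column mapping, invented for illustration.
COLUMN_MAP = {"Cust Name": "AccountName", "Cust ID": "ExternalId", "Status": "Status"}


def to_template(source, target) -> int:
    """Copy rows from source to target, renaming columns to the template's names."""
    reader = csv.DictReader(source)
    writer = csv.DictWriter(
        target, fieldnames=list(COLUMN_MAP.values()), lineterminator="\n"
    )
    writer.writeheader()
    count = 0
    for row in reader:
        # Missing cells come back as None, so fall back to an empty string.
        writer.writerow(
            {new: (row.get(old) or "").strip() for old, new in COLUMN_MAP.items()}
        )
        count += 1
    return count


# In-memory files keep the sketch self-contained; real code would use open().
source = io.StringIO('Cust ID,Cust Name,Status\n42,"Acme, Inc. ",ACT\n')
target = io.StringIO()
rows_written = to_template(source, target)
```

The module handles the quoting of embedded commas on both the read and the write side, which is the part that tends to go wrong with hand-rolled string splitting.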
chaos_emergent•1h ago
as I sit in front of my computer waiting for a transformation-for-import job to complete, I can describe my basic workflow:

1. define a clean interface target - for me, that's an interface that I made for my startup to import call data.

2. explore the data a little to get a sense of transformation mappings.

3. create a PII-redacted version of the file, upload it to ChatGPT along with the shape of my interface, ask it to write a transformation script in Python

4. run it on a subset of the data locally to verify that it works.

5. run it in production against my customer's account.

I'm curious - that seems like a reasonably standard flow, and it involves a bit of manual work, but it seems like the right tradeoff between toil and automation. Do you struggle with that workflow or think it could be better somehow?
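
Steps 3 and 4 above could be sketched roughly as follows. The column names are hypothetical, and the unsalted hash is only a stand-in for redaction: it hides values from a casual reader but is not strong anonymization, so treat it as an assumption to replace with whatever your privacy requirements call for.

```python
import csv
import hashlib
import io
from itertools import islice

# Hypothetical PII column names, invented for illustration.
PII_COLUMNS = {"email", "phone"}


def redact(value: str) -> str:
    """Replace a value with a short, stable token so equal values stay equal.

    Note: an unsalted hash is not strong anonymization; this is a sketch.
    """
    if not value:
        return ""
    return "redacted-" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def redacted_sample(source, limit: int = 100) -> list[dict]:
    """Return the first `limit` rows with the PII columns tokenized."""
    reader = csv.DictReader(source)
    return [
        {k: redact(v) if k in PII_COLUMNS else v for k, v in row.items()}
        for row in islice(reader, limit)
    ]
```

The same `limit` parameter gives a small local subset for verifying a generated script before running it against the full file.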

dataflowmapper•1h ago
Thanks for sharing that workflow. For more straightforward flows, that sounds like a decent approach. My main thoughts on where it could be improved, or where I see potential struggles, are when:

  - People aren't comfortable or familiar with coding/Python.
  - You get into more complex imports like historical data, transactional data, etc. There you might have like 15 transaction types that have to be mapped, all with different fields, math, and conditional logic where the requirements become too much for just prompting ChatGPT effectively, and iterating on the Python can get pretty involved.
  - The source data structure and transformation needs aren't consistently the same, leading to a lot of 'throwaway' or heavily modified scripts for each unique case.
  - Tasks like VLOOKUPs or enriching data come into play, which might add manual coding or complexity beyond a simple 1-to-1 source-to-destination script.
These are the areas where I'm exploring if a more structured way could offer benefits in terms of repeatability and accessibility for a wider range of users or complex scenarios. Appreciate the insight into your process and your thoughts on this.
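
To make the per-transaction-type and VLOOKUP cases concrete, one common scripting pattern is a table of handler functions plus a dictionary lookup. This is only a sketch: the transaction types, field names, and account table below are hypothetical, invented for this example.

```python
# Hypothetical reference table and transaction rules, invented for illustration.
ACCOUNT_LOOKUP = {"1000": "Cash", "2000": "Accounts Payable"}


def handle_invoice(row: dict) -> dict:
    """Invoices derive their amount from quantity and unit price."""
    return {"amount": round(row["qty"] * row["unit_price"], 2), "direction": "debit"}


def handle_refund(row: dict) -> dict:
    """Refunds are always stored as negative amounts."""
    return {"amount": -abs(row["amount"]), "direction": "credit"}


# One entry per transaction type; adding a type means adding one function.
HANDLERS = {"INV": handle_invoice, "REF": handle_refund}


def transform(row: dict) -> dict:
    """Apply the rule for the row's type, then enrich it from the lookup table."""
    handler = HANDLERS.get(row["type"])
    if handler is None:
        raise ValueError(f"unsupported transaction type: {row['type']!r}")
    out = {"type": row["type"], **handler(row)}
    # dict.get plays the role of VLOOKUP; None marks a missing reference.
    out["account_name"] = ACCOUNT_LOOKUP.get(row["account"])
    return out
```

Keeping each type's math in its own small function is one way to stop the script from turning into a single long chain of conditionals.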
schmookeeg•1h ago
We're an AWS shop, so for lightweight or one-off stuff, it's a typescript lambda. Everything else ends up in a python script to output glue-friendly stuff to S3.

Assume at some point, the data will bork up.

If you ingest Excel (ugh), treat it like free range data. I have a typescript lambda that just shreds spreadsheets in a "ok scan for this string, then assume the thing to the right of it is this value we want" style -- it's goofy AF but it's one of my favorite tools in the toolbox, since I look magical when I use it. It allows me to express-pass janky spreadsheets into Athena in minutes, not days.

It is based on the convert-excel-to-json library and once you grok how it wants to work (excel -> giant freaky JSON object with keys that correspond to cell values, so object.A, object.B, object.C etc for columns. array index for row number), you can use it as a real blunt-force chainsaw approach to unstructured data LARPing as an excel doc :D
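
The "scan for this string, then take the thing to the right of it" idea can be sketched in a few lines. Note the swap: this is Python over a plain list of rows, not the TypeScript lambda or the convert-excel-to-json output described above, and the sheet contents are hypothetical.

```python
from typing import Optional


def value_right_of(grid: list[list[object]], label: str) -> Optional[object]:
    """Return the cell immediately right of the first cell matching `label`."""
    wanted = label.strip().lower()
    for row in grid:
        # Skip the last cell of each row: nothing sits to the right of it.
        for col, cell in enumerate(row[:-1]):
            if isinstance(cell, str) and cell.strip().lower() == wanted:
                return row[col + 1]
    return None


# Hypothetical sheet contents, invented for illustration.
sheet = [
    ["Quarterly report", None, None],
    [None, "Invoice No:", "INV-1042"],
    ["Total due", 1250.0, "USD"],
]
```

Here `value_right_of(sheet, "invoice no:")` finds the label wherever it sits in the sheet and returns the neighbouring cell, which is what makes the approach tolerant of spreadsheets whose layout shifts between files.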

aerhardt•1h ago
DuckDB, Python and LLMs. I can write in more detail when I have time!
francisofascii•56m ago
Writing C# / LINQ scripts for this gives you the flexibility to deal with whatever impedance mismatch you have. It gets tedious and maybe makes less sense when you have dozens of model properties that are straight copy property X from A to B. Then maybe an ETL tool like FME makes more sense.

Date example:

var dateValue = DateTime.ParseExact(yyyymmdd, "yyyyMMdd", null);
var dateString = dateValue.ToString("yyyy-MM-dd HH:mm:ss");