Show HN: LLM Based Spark Profiler

27•ambrood•1w ago

Hey HN,

Spark event logs can run to hundreds of MBs and offer a wealth of insight into your workloads, but making sense of them has always been prohibitively difficult. We’ve recently built a lightweight tool that automatically parses Spark event logs and surfaces targeted insights to help you optimize your data jobs.

Whether you’re chasing down a bottleneck or balancing performance vs. cost, the profiler has you covered with real-time configuration recommendations, data skew analysis, and more.

Curious how it works in action? Check out this quick Loom video for a walk-through: https://www.loom.com/share/07348eb54f6b440da93f96753937792a?...

We’d love your feedback — check it out at https://app.datasre.ai and let us know what you think!

Comments

emgeee•1w ago

Fellow co-founder here! One fun thing about this project is that the entire frontend was vibe-coded using Bolt in a few days.

skeptrune•1w ago

Very awesome. Not having to burn time on a UI that looks and feels nice is a huge win.

vector_spaces•1w ago

Maybe you mentioned it in your demo and I missed it, but how does this differ from pasting the log messages to ChatGPT / Claude / another LLM? Is it mainly that yours can iterate over a large logfile without blowing up the context window?

Does it suffer from the same issue as other LLMs, where it will always identify potential optimizations or improvements even if none are truly needed?

ambrood•1w ago

> Maybe you mentioned it in your demo and I missed it, but how does this differ from pasting the log messages to ChatGPT / Claude / another LLM? Is it mainly that yours can iterate over a large logfile without blowing up the context window?

We do quite a bit of aggregation over the log file, generate summary stats, and choose which bits to stuff into the LLM. We plan to support more platforms than just Spark.
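To make the aggregation idea concrete, here is a minimal sketch of how per-stage summary stats could be derived from an event log before anything is sent to an LLM. It is not the tool's actual implementation. It assumes the standard Spark event log layout (one JSON object per line, with `SparkListenerTaskEnd` events carrying `Stage ID` and `Task Info` launch/finish times); the function name `summarize_stages` and the `skew_ratio` metric (max over median task duration) are hypothetical choices for illustration.

```python
import json
from collections import defaultdict
from statistics import median


def summarize_stages(lines):
    """Aggregate per-stage task durations from Spark event-log lines.

    `lines` is any iterable of strings, one JSON event per line.
    Returns {stage_id: {"tasks", "median_ms", "max_ms", "skew_ratio"}}.
    """
    durations = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Only task-end events carry the timings we need here.
        if event.get("Event") != "SparkListenerTaskEnd":
            continue
        info = event.get("Task Info", {})
        launch = info.get("Launch Time")
        finish = info.get("Finish Time")
        if launch is None or finish is None:
            continue
        durations[event["Stage ID"]].append(finish - launch)

    summary = {}
    for stage_id, ds in sorted(durations.items()):
        med = median(ds)
        summary[stage_id] = {
            "tasks": len(ds),
            "median_ms": med,
            "max_ms": max(ds),
            # Hypothetical skew indicator: slowest task relative to the median.
            "skew_ratio": max(ds) / med if med > 0 else None,
        }
    return summary
```

A compact dictionary like this is a few hundred bytes per stage, so it fits in a prompt even when the raw log is hundreds of MBs. Usage would look like `summarize_stages(open("eventlog.json"))`, where the file name is a placeholder.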

> Does it suffer from the same issue as other LLMs, where it will always identify potential optimizations or improvements even if none are truly needed?

Funnily enough, instructing sonnet-3.7 to not suggest unnecessary optimisations seems to have done the trick!

ztratar•1w ago

Also curious how the agent works?
