Root cause: a missing block-size division in the tensor stride calculation inside create_tensor() in llama-model-loader.cpp. The wrong stride cascades into an overflow in ggml_nbytes(), exceeding max_buffer_size on 32-bit platforms, where size_t is only 32 bits.
On 64-bit devices the inflated value is silently masked: it is wrong, but still within GPU memory limits, so nobody noticed. The bug has likely been there for years.
Fix and context: https://github.com/Perinban/llama.cpp/tree/axon-dev