Show HN: AnydocAI – Every file exists as all file types

22•grandslammer•1mo ago

I work with AI tools often, and I find myself constantly swapping between file types to navigate different sites and use cases. So I built anydoc: I imagined a universal document format drive layer that would let me have a file that is essentially many different file types at once. CSV for organizing, markdown for feeding into LLMs, HTML for fast sharing... So here it is, and I am working on making it better all the time. I just shipped an update that makes the most relevant format types available faster on document upload. I'm @boshjerns on X.

Comments

gus_massa•1mo ago

I think calling it a "drive" will confuse a lot of people. I expected a device driver, and non-technical people have no idea what drives are; they just have documents and photos.

I tried with a text file

  X Y
  1 2
  3 4

and for some reason the converted version has 1 2 3 4 in the same row.
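For comparison, a deterministic conversion of that input is small. This is a minimal sketch, assuming the file is whitespace-delimited with one record per line; it is not AnydocAI's actual pipeline, and `whitespace_table_to_csv` is a made-up name for illustration.

```python
import csv
import io


def whitespace_table_to_csv(text: str) -> str:
    """Convert whitespace-separated text to CSV, one output row per input line."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for line in text.splitlines():
        fields = line.split()  # split on any run of spaces or tabs
        if fields:  # skip blank lines
            writer.writerow(fields)
    return out.getvalue()


if __name__ == "__main__":
    sample = "  X Y\n  1 2\n  3 4\n"
    print(whitespace_table_to_csv(sample), end="")
```

The point is only that row boundaries come from the line breaks in the input, so each line stays its own row.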

gus_massa•1mo ago

Too late to edit: Google Drive is called "drive", so I guess it's more usual than I noticed.

alentred•1mo ago

Hm, I don't know, I am OK with "drive". Google Drive, Microsoft OneDrive, iCloud Drive.

grandslammer•1mo ago

It really does just make sense too. We have hard drives, so why wouldn’t we have digital drives?

unsnap_biceps•1mo ago

Have you considered writing this as a FUSE system rather than a web service?

nkrisc•1mo ago

So it turns one file into many? Or is it actually one file that is simultaneously a valid HTML document and PNG?

IAmBroom•1mo ago

According to what I read, the latter.

c0wb0yc0d3r•1mo ago

And you’d be misled. The video shows the original file is converted to different formats, depending on the user’s selection. The video shows jpeg to html (using AI to perform OCR?).

Pandoc but extra AI steps.

grandslammer•1mo ago

That argument really skips over what most people actually need. Nobody outside of a tech bubble wants to learn half a dozen Pandoc flags, stitch together shell commands and temp files, or write Lua filters just to reshape a document. With our drive layer you literally rename a file or type “make this header bold and export as PDF” and the work just happens, no scripts required.

This isn’t about replacing power-user workflows, it’s about giving anyone on your team the ability to reshape data and documents without ever opening a terminal. You get flexibility with the simple UX of renaming a file. Calling it “Pandoc plus AI” misses the fact that 90 percent of users neither know nor care about Pandoc’s internals. They just want “I have a file, make it look like this, or formatted with these sections to share with X person who works in X field...” and that’s exactly what our natural-language, filesystem-driven approach delivers.

grandslammer•1mo ago

It’s basically an access layer that gives you quick access to all the different conversions of the files in one place, but it also allows you to redesign them with natural language so that you can configure them for your needs on the fly

RileyJames•1mo ago

It’s an interesting idea.

I’ve definitely felt the pain of file formats in some unexpected ways recently.

Like airdropping a photo from iPhone only to discover a .HEIC file, which nothing will accept.

I’ve previously used “whatever turns up first on Google”, but I now won’t for anything of significance (privacy).

I’ve recently discovered Automator (on Mac) and the quick actions menu. Which can achieve a lot of image and pdf related conversions, but takes some setup (not a mass market solution)

I like the idea of this product. But I think the challenge will be:

- reaching the user at the moment they have this problem

- making your solution frictionless to solve their immediate problem, while also bootstrapping to solve it next time around (without them forgetting it exists)

If you can nail that experience for a single use case, I think this will be a winner.

DontchaKnowit•1mo ago

I think the real problem is getting it to actually work....

grandslammer•1mo ago

i think i hit credit limits because so many people were using the app all of a sudden and i'm just like using my own funds for api costs and had a cap on my openai account

grandslammer•1mo ago

Hey, I’m just catching up here and I really appreciate the feedback. I’m going to work on integrating all of it into the application and repost about it again. I really appreciate you.

grandslammer•1mo ago

Let me actually work on this HEIC issue right now. I think that I know a fix for this.

globular-toast•1mo ago

I like the idea if it was deterministic. So if there are standard ways to convert to/from document types, like Pandoc, being able to write to any one of them and have it update the rest would be interesting.

I hate it if it's built with "AI". Can't imagine a use case for this apart from just shit you don't care about. Why would I be hoarding data I don't care about?

grandslammer•1mo ago

It’s not about hoarding data; rather, it’s about the malleability of the data itself. I’m constantly working with data but need to change the way it is displayed, whether that’s the file type or the way the data is laid out within a specific file (a CSV, for example). An application like this lets me quickly reformat files with natural language, for instance turning them into HTML so I can share a certain form of the document, or taking that information and converting it back into a CSV format that I need to configure for a management system or something like that.

SPBS•1mo ago

the page is really laggy on Edge, kills any interest in wanting to explore more (strangely, it's much snappier on Chrome)

grandslammer•1mo ago

I will work to fix this asap I just caught up here

ramoz•1mo ago

AI as a use case doesn’t make sense to me.

You’re using AI to create a transpilation of whatever modality. It’s a wasted step if the purpose is to feed back into AI.

cyanydeez•1mo ago

keep in mind, almost all the uses of the current AI are to generate some unstable product that whimsically can change given a butterfly's wings in Japan.

grandslammer•1mo ago

I literally find myself using this tool every day because I need to use natural language to reformat files and the data in them, like CSVs or markdown. So maybe this isn’t useful for you, but it definitely is useful to have the LLM interpret your natural language and redesign the file so it presents the information the way you want.

ramoz•1mo ago

You're talking about a commodity interaction. At this point your tool offers nothing different from a chatbot other than your confusing semantics and abstraction.

What I'm saying: if the point is to "convert this csv to markdown so I can feed the markdown to an LLM to ask questions about it" etc... it is a completely unnecessary step.

Your service is nothing more than:

1. augmented metadata for files; btw if that requires a whole new drive-oriented solution then you're doing too much.

2. llm api wrapper for a commoditized capability (custom format/or transpilation)

grandslammer•1mo ago

The friction isn’t “can I call an LLM,” it’s every time I want to do anything with this file I have to:

open it in a tool that understands the format,

export / paste the part I care about,

phrase an LLM prompt,

paste the result back,

do this all again if i want the data formatted differently for different use cases.

adding the ability to format your data and view/download that natively, fast is like giving python scripting capabilities to normal users. You're thinking like a dev not like a business owner who may want to take a picture of a timesheet and have that immediately become a CSV then have it reformatted for a management system they use, all on the fly through natural language... there's so many ways that normal people navigate files and formats and I want to give these people some superpowers that they won't seek out themselves.

the gpt-wrapper argument is so played out. just like you’d say “my app is a GPT-wrapper” (it wraps the OpenAI API in a file-centric UX), you could say “Google Drive is a distributed-storage-wrapper” or “a cloud-storage-and-sync wrapper.” It’s the polished frontend and glue that makes the raw backend useful to end users.

voidUpdate•1mo ago

So I can have my exe convert into a shapefile and an mp3?

voidUpdate•1mo ago

Well I tried to convert an exe into a pptx and it outputted a file that looked like an attempt at html, saying that the conversion wasn't feasible due to its nature and size

lloeki•1mo ago

You have it the wrong way around? Usually you are handed a pptx by customers and your job is to turn that into an exe.

voidUpdate•1mo ago

I mean the website says it can convert anything to anything ("every file exists as all file types all of the time") so it should be able to do exe to pptx

thebeardisred•1mo ago

You missed the joke

jeroenhd•1mo ago

Based on important research like https://www.youtube.com/watch?v=uNjxe8ShM-8, it should definitely be possible to generate a .pptx that will run Windows inside of an emulator inside PowerPoint slides. That HTML file is lying to you!

grandslammer•1mo ago

i need to work on the pptx conversions and some of the other file types specifically- now i'm on this

grandslammer•1mo ago

I’m actually working on some really interesting conversions right now

dgan•1mo ago

I've read the title 5 times, and can't make sense of it. Is this even valid English?

Akronymus•1mo ago

>Imagine a drive[,] where every file exists[,] as all file types[,] all of the time

Basically treating one file type as if it were any arbitrary other file type

quesera•1mo ago

Punctuated like that, I can't help reading it in the movie trailer guy[0] voice.

[0] https://en.wikipedia.org/wiki/Don_LaFontaine .. wow, dead for 17 years!

grandslammer•1mo ago

Hahahahaha thank you??

grandslammer•1mo ago

I may need to work on the short pitch

sigmaisaletter•1mo ago

It's a fancy file format conversion utility.

Am I missing something?

dsr_•1mo ago

Yes: it's a fancy file format conversion service that adds errors so your QA people have more work.

emadda•1mo ago

Could have called it quantumdoc

grandslammer•1mo ago

It’s not too late

jy14898•1mo ago

Now make it an HTTP API where content negotiation always succeeds

ramses0•1mo ago

Back in the day there were a bunch of `x2y` programs[1], like html2pdf, xls2csv, rst2odt, jpg2png, png2jpg, etc...

You could imagine something like `any2zip`, or `any2tgz` or `iso2mp4` or something.

It seems like there could/should be some sort of virtual filesystem where you could say "cat inventory.xls.csv", or "wine.exe excel.exe inventory.csv.xls" (please bear with me on these examples). Effectively "$BLOB.format.format", where "." becomes a sort of "convert to this $TYPE".

Imagine being able to say:

    `echo "# Hello\n\n * World" > README.md ; cat README.md.html`
    (effectively invoking `md2html`)
    
    `printer README.md.html.pdf`
    (eg: `cat README.md | md2html | html2pdf | printer`)

...if you requested `README.md.pdf`, maybe it could intuit the intermediate `md2html2pdf` (HTML) portion?
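A rough sketch of how that lookup could work, under loud assumptions: the `DIRECT` table below is a hypothetical registry of single-step `x2y` tools, `plan` and `shortest_chain` are made-up names, and nothing here touches a real filesystem. It treats the longest prefix of the requested name that is a real file as the source, and fills in missing intermediate formats with a breadth-first search.

```python
from collections import deque

# Hypothetical registry: each key is a source format, each value lists the
# formats a single x2y-style tool could produce from it directly.
DIRECT = {
    "md": ["html"],
    "html": ["pdf"],
    "xls": ["csv"],
    "csv": ["xls"],
}


def shortest_chain(src, dst):
    """Breadth-first search for the shortest list of formats from src to dst."""
    queue = deque([[src]])
    seen = {src}
    while queue:
        path = queue.popleft()
        if path[-1] == dst:
            return path
        for nxt in DIRECT.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no known route between the two formats


def plan(requested, real_files):
    """Resolve a virtual name like 'README.md.pdf' into (real file, format chain).

    Names that are already real files are assumed to be served directly and
    are not handled here; this returns None for them.
    """
    parts = requested.split(".")
    # Prefer the longest prefix that names a real file.
    for cut in range(len(parts) - 1, 1, -1):
        base = ".".join(parts[:cut])
        if base in real_files:
            chain = [parts[cut - 1]]
            for target in parts[cut:]:
                hop = shortest_chain(chain[-1], target)
                if hop is None:
                    return None
                chain.extend(hop[1:])
            return base, chain
    return None


if __name__ == "__main__":
    print(plan("README.md.pdf", {"README.md"}))
```

With this table, `README.md.pdf` and `README.md.html.pdf` resolve to the same chain of formats, so the intermediate HTML step is inferred when it is left out.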

I really wish local linux filesystems (for end-users) would at least match Apple's capabilities. eg: `$RECENT`, spotlight, auto-OCR. We've really regressed since the era of `locate`[2], but I'd _LOVE_ some sort of modern equivalent.

Imagine: `inotify`, `auditd`, just anything that can avoid full-disk scans during "normal end user" daily operation... wired up to `llm-summary $FILE >> sqlite.db ; llava-describe $IMAGE >> sqlite.db ; etc...`

For bonus points, catch anything missed with some sort of full daily/weekly backup operation. We're on the cusp of a much more intimate "partnership" with the compute boxes underneath our desks, but so much is getting sucked into the void of "the network is the computer".

[1]: compgen -c | grep 2 | grep -v '2$' | grep -v '\.2' | grep -v '2\.'

[2]: https://en.wikipedia.org/wiki/Locate_(Unix)

RetroTechie•1mo ago

> Back in the day there were a bunch of `x2y` programs[1], like html2pdf, xls2csv, rst2odt, jpg2png, png2jpg, etc...

They're still around. A problem is loss of information on each conversion. For example, wav->mp3 loses info. Converting back (mp3->wav) won't get you the exact .wav you started with. Similar thing with file types supporting different resolution graphics, vector vs. bitmap, metadata being stripped, features in format A not supported in format B, etc.

Another problem is the explosion of M:N file format combinations. A possible fix would be a universal (?) in-between format, functioning as a container for [portions of a file] + whatever metadata was extracted from original. That way you can at least do conversions along the lines of video container formats, where container type is changed but video inside does not get decompressed/re-encoded. Or similar operations like extracting/shuffling pages in a pdf document.
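To put rough numbers on that, here is a small worked count. It assumes every format should convert to every other format, that converters are one-directional, and that a hub format needs one reader and one writer per format; the function names are made up for illustration.

```python
def pairwise_converters(n: int) -> int:
    """Directed converters needed when every format converts straight to every other."""
    return n * (n - 1)


def hub_converters(n: int) -> int:
    """Converters needed with one in-between format: one reader plus one writer each."""
    return 2 * n


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, pairwise_converters(n), hub_converters(n))
```

For 10 formats that is 90 direct converters versus 20 through a hub. The count says nothing about the information lost at each step, which an in-between format only helps with if it can carry everything the source format holds.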

All in all this is not an easy problem & therefore unlikely to be solved anytime soon.

grandslammer•1mo ago

really appreciate you adding to the discourse here - I'm not sure if you got a chance to test out the site but I refilled my credits after the surge of attention and would love if you checked it out! also @boshjerns on X if you want to reach out to chat

TomMasz•1mo ago

I got "No video with supported format and MIME type found." in the How It Works section.

troyvit•1mo ago

I think you can get past that if you download the video, then upload it back up to anydoc and ask it to translate it to Markdown.

edit: /s

grandslammer•1mo ago

hahahaahhaha nice one

grandslammer•1mo ago

Video files are not supported right now, which is probably the issue. I'm working on this; I’m going to have to parse the video into frames and then feed the frames into a model, and I just need to work this out a little bit more.

Y_Y•1mo ago

I thought this was something like a FUSE driver that would on-the-fly generate any file you tried to read, with some consistency. Like if you open stories/zombie-party.txt it will have some generative network make it, and cache it. If you later ask for stories/zombie-party.odt it can just do a conversion.
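The generate-once, convert-afterwards behaviour can be modelled without FUSE at all. This is a toy in-memory sketch, not the demo mentioned here: `LazyDrive` is a hypothetical name, and the `generate` and `convert` callables stand in for whatever generative model and converter would actually be used.

```python
import os


class LazyDrive:
    """Toy model: generate a document once per stem, convert it for other extensions."""

    def __init__(self, generate, convert):
        self._generate = generate  # hypothetical: callable(stem, ext) -> bytes
        self._convert = convert    # hypothetical: callable(data, src_ext, dst_ext) -> bytes
        self._cache = {}           # stem -> {ext: bytes}

    def read(self, path):
        stem, ext = os.path.splitext(path)
        ext = ext.lstrip(".")
        versions = self._cache.setdefault(stem, {})
        if ext not in versions:
            if versions:
                # A version already exists: convert it so the content stays consistent.
                src_ext, data = next(iter(versions.items()))
                versions[ext] = self._convert(data, src_ext, ext)
            else:
                # First request for this stem: generate it and cache the result.
                versions[ext] = self._generate(stem, ext)
        return versions[ext]
```

A real version would sit behind a filesystem interface and persist the cache; the sketch only shows the lookup order.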

I vibe-coded a demo of such a thing, with the idea of making game assets like textures/outdoor/wall.jpg etc. You can do it easily enough, but you need to be patient, and not particularly discerning.

raphman•1mo ago

FWIW, I wrote a small paper on this general topic a few years ago, collecting earlier work and own ideas.

"Files as Directories: Some Thoughts on Accessing Structured Data within Files"

https://dl.acm.org/doi/pdf/10.1145/3191697.3214323

RetroTechie•1mo ago

Says "get access" with a "locked" icon.

Is this paper freely available somewhere?

kurtoid•1mo ago

Here: https://www.shift-society.org/salon/papers/2018/revised/file...

Link found here: https://scholar.google.com/scholar?cluster=14832107127874645...

raphman•1mo ago

Thanks! Sorry - I didn't realize that the paper in the ACM DL is not open-access.

grandslammer•1mo ago

i think we have some similar thoughts - i am working on a file format that accomplishes something like this

quesera•1mo ago

I played with this idea for media servers.

I want iTunes and Audiobookshelf and beets and Jellyfin, etc to all work on the same filesystem and media archive.

There are challenges.

lawlessone•1mo ago

Wouldn't this make every file a lot bigger?

grandslammer•1mo ago

It’s not really saving everything into one file type; rather, it’s a layer that gives access to all the file types easily and fast.

jeroenhd•1mo ago

You might not want to use https://anydocai.com/result/<incremental number> for URLs like that. Anyone can enumerate the ~300 files from the home page and look at what others have uploaded.
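One common way to avoid guessable result URLs is to use random tokens instead of a counter. A minimal sketch using Python's standard `secrets` module; `new_result_id` is a made-up name, and how AnydocAI actually stores results is not known here.

```python
import secrets


def new_result_id() -> str:
    """Return a URL-safe random token built from 16 random bytes (128 bits)."""
    return secrets.token_urlsafe(16)


if __name__ == "__main__":
    # Example of an identifier that could replace the incremental number in the URL.
    print(new_result_id())
```

Random identifiers only stop enumeration and hide the file count; the server still has to check that the requester owns the file.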

That said, the website doesn't seem to work anymore. It just errors out.

pavel_lishin•1mo ago

I wonder if they ran out of credits.

grandslammer•1mo ago

Yeah, this is exactly what happened. I did not expect this or catch up until just now, but I just fixed it.

grandslammer•1mo ago

I didn’t expect this to go semi-viral on here, so I just refilled the credits. It actually ran out of credits for OpenAI.

woleium•1mo ago

gemini is cheaper, probably

grandslammer•1mo ago

Also, as far as the enumeration goes, users are only authorized to access the files that they’ve created in our system, but I should definitely obscure the file count.

grandslammer•1mo ago

I’m like a mid-level developer though, so if I messed up the authorization access and you worked around it in some way, it would be sick if you let me know. @boshjerns on X

_wire_•1mo ago

Imagine no file types

♪ It's easy if you try No hell below us Above us only sky Imagine all the people Visualize whirled peas Ah ah ah oooo!

You may say I'm a dreamer...

grandslammer•1mo ago

real one
