The pattern is effectively long-running research tasks that drive a search tool. You give them a prompt, they churn away for 5-10 minutes running searches and they output a report (with "citations") at the end.
This Tongyi model has been fine-tuned to be really good at using its search tool in a loop to produce a report.
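To make the "search tool in a loop" pattern concrete, here is a minimal sketch. Everything in it is hypothetical: `search` is a stand-in that scans a tiny in-memory corpus, and `nextAction` stands in for the model's decision about whether to search again or write up. It is not how Tongyi DeepResearch or any real product is implemented, only the general shape of the loop.

```javascript
// Hypothetical corpus standing in for the web.
const CORPUS = [
  { url: "https://example.com/a", text: "Agents call a search tool in a loop." },
  { url: "https://example.com/b", text: "The final report lists its sources." },
];

// Hypothetical search tool: case-insensitive substring match over the corpus.
function search(query) {
  const q = query.toLowerCase();
  return CORPUS.filter((doc) => doc.text.toLowerCase().includes(q));
}

// Hypothetical stand-in for the model: take the next planned query,
// or decide to write the report once none are left.
// Note that it consumes (mutates) the pending list.
function nextAction(pendingQueries) {
  return pendingQueries.length > 0
    ? { type: "search", query: pendingQueries.shift() }
    : { type: "report" };
}

function runResearch(prompt, plannedQueries, maxSteps = 10) {
  const pending = [...plannedQueries];
  const notes = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = nextAction(pending);
    if (action.type === "report") break;
    for (const hit of search(action.query)) {
      // Keep each source only once so citation numbers stay stable.
      if (!notes.some((n) => n.url === hit.url)) notes.push(hit);
    }
  }
  // The "report" here is just the collected snippets with numbered citations.
  const body = notes.map((n, i) => `${n.text} [${i + 1}]`).join(" ");
  const citations = notes.map((n, i) => `[${i + 1}] ${n.url}`);
  return { prompt, body, citations };
}
```

In a real system the planned queries would come from the model itself on each step, which is where the fine-tuning for tool use matters; `maxSteps` plays the role of the 5-10 minute budget.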
So unless a product says which model it uses, it's really hard to compare one against another: we can't tell whether it's better because of the underlying model or because of the tooling, which feels like an important distinction.
Constraints are the fun part here. I know this isn't the 8x Blackwell Lamborghini, that's the point. :)
If you really do have a 2080 Ti with 128GB of VRAM, we'd love to hear more about how you did it!
Get the biggest one that will fit in your VRAM.
This comfortably fits FP8 quantized 30B models that seem to be "top of the line for hobbyists" grade across the board.
- Ryzen 9 9950X
- MSI MPG X670E Carbon
- 96GB RAM
- 2x RTX 3090 (24GB VRAM each)
- 1600W PSU
Of course this is in a single-user environment, with vLLM keeping the model warm.
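As a rough back-of-the-envelope check on "will it fit in VRAM", weight memory is approximately parameter count times bytes per parameter. The sketch below is only an estimate under stated assumptions: the bytes-per-parameter values are nominal (real quantization formats carry extra metadata), the 20% overhead for KV cache and runtime buffers is a guess rather than a measured figure, and the nominal 24GB per card is treated as 24 GiB.

```javascript
// Nominal bytes per parameter for a few precisions (approximate).
const BYTES_PER_PARAM = { fp16: 2, fp8: 1, q4: 0.5 };

// Estimated size of the weights alone, in GiB.
function weightGiB(paramsBillions, precision) {
  const bytes = paramsBillions * 1e9 * BYTES_PER_PARAM[precision];
  return bytes / 2 ** 30;
}

// overheadFraction is an assumed allowance for KV cache, activations and
// runtime buffers; the real number depends on context length and runtime.
function fitsInVram(paramsBillions, precision, vramGiB, overheadFraction = 0.2) {
  return weightGiB(paramsBillions, precision) * (1 + overheadFraction) <= vramGiB;
}

const totalVram = 2 * 24; // two 24GB cards, treated as one pool

console.log(weightGiB(30, "fp8").toFixed(1)); // 27.9
console.log(fitsInVram(30, "fp8", totalVram)); // true
console.log(fitsInVram(30, "fp16", totalVram)); // false
```

Under these assumptions a 30B model at FP8 leaves headroom on 48GB, while the same model at FP16 does not fit, which matches the "biggest one that will fit" rule of thumb.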
It's slower than a rented Nvidia GPU, but usable for all the models I've tried (even gpt-oss-120b), and works well in a coffee shop on battery and with no internet connection.
I use Ollama to run the models, so can't run the latest until they are ported to the Ollama library. But I don't have much time for tinkering anyway, so I don't mind the publishing delay.
Given the size of frontier models I would assume that they can incorporate many specializations and the most lasting thing here is the training environment.
But there is probably already some tradeoff, as GPT-3.5 was awesome at chess and current models don't seem to be trained extensively on chess anymore.
As far as I remember, it's post-training that kills chess ability for some reason (GPT-3 wasn't post-trained).
Domain-specific models have been on the roadmap for most companies for years now, for both competitive (why give up your moat to OpenAI or Anthropic?) and financial (why finance OpenAI's margins?) reasons.
That you can individually train and improve smaller segments as necessary
  function replaceInTextNodes(node) {
    if (node.nodeType === Node.TEXT_NODE) {
      // Swap non-breaking spaces (U+00A0) for regular spaces.
      node.nodeValue = node.nodeValue.replace(/\u00A0/g, ' ');
    } else {
      node.childNodes.forEach(replaceInTextNodes);
    }
  }
  replaceInTextNodes(document.body);
The Chinese version of the link says "通义 DeepResearch" in the title, so the "agree" reading doesn't look to be the case. Completely agreed that it would be hilarious.
1: https://www.alibabacloud.com/en/solutions/generative-ai/qwen...
I switch between Gemini and ChatGPT whenever I feel one fails to fully grasp what I want; I do my coding in Claude.
How are they supposed to become the 1 trillion dollar company they want to be, with strong competition and open source disruptions every few months?
Arguably, with LLMs (1) it is far easier to switch between models than it is today to switch between AWS / GCP / Azure systems, and (2) LLMs will rapidly decrease the switching costs of porting your legacy systems to new ones, and those switching costs are the whole business model of Oracle and the like.
Meanwhile, the whole world is building more chip fabs, data centers, AI software/hardware architectures, etc.
Feels more like we're headed to commodification of the compute layer than to a few giant AI monopolies.
And if true, that's actually even more exciting for our industry and "letting 100 flowers bloom".
Not too different from a lot of consulting reports, in fact, and pretty much of no value if you're actually trying to learn something.
China is full of people who want communism to dominate the world with totalitarian control so no one wants China to dominate anything at all because they are bad...
The whole country is going down the drain right now. There is nothing about it that sane people outside the Republican bubble would consider "freedom".
n-cpu-moe in https://github.com/ggml-org/llama.cpp/blob/master/tools/serv...
All this to ask: if I host these open source models locally, how do I get the user-interface layer that remembers and picks the right data from my previous sessions, the agentic automation, and so on? Do I have to build it myself, or are there free options for that?
What is the state of AI in China? My personal feeling is that it doesn't dominate the zeitgeist in China the way it does in the US. Despite that, they have such a massive amount of intellectual capital that just a small portion of their software engineering talent working on this is enough to go head to head with us, even though it takes only a fraction of their attention.