The pattern is effectively a long-running research task that drives a search tool. You give them a prompt, they churn away for 5-10 minutes running searches, and they output a report (with "citations") at the end.
This Tongyi model has been fine-tuned to be really good at using its search tool in a loop to produce a report.
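The loop described above can be sketched roughly like this (a hypothetical illustration: `search` is a stub standing in for a real web-search API, and a real agent would use the model itself to refine queries and write the prose):

```python
# Hypothetical sketch of the "search tool in a loop" pattern.
def search(query):
    # Stub: returns fake results; a real tool would hit a search engine.
    return [{"title": f"Result for: {query}",
             "url": f"https://example.com/{hash(query) % 100}"}]

def deep_research(prompt, max_rounds=3):
    notes, citations = [], []
    query = prompt
    for round_no in range(max_rounds):
        for r in search(query):
            notes.append(f"- {r['title']}")
            citations.append(r["url"])
        # In reality the model would refine the query based on results.
        query = f"{prompt}, details round {round_no + 1}"
    return "\n".join(notes) + "\n\nSources:\n" + "\n".join(citations)

report = deep_research("Tongyi DeepResearch benchmarks")
print(report)
```

The fine-tuning presumably teaches the model when to keep searching, how to refine queries, and how to weave results into the final report.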
So without specifying which model is being used, it's really hard to know whether one system is better than another: is it better because of the underlying model itself, or because of the tooling around it? That feels like an important distinction.
Constraints are the fun part here. I know this isn't the 8x Blackwell Lamborghini, that's the point. :)
If you really do have a 2080 Ti with 128GB of VRAM, we'd love to hear more about how you did it!
Get the biggest one that will fit in your VRAM.
(If nothing else Tongyi are currently winning AI with cutest logo)
This comfortably fits FP8 quantized 30B models that seem to be "top of the line for hobbyists" grade across the board.
- Ryzen 9 9950X
- MSI MPG X670E Carbon
- 96GB RAM
- 2x RTX 3090 (24GB VRAM each)
- 1600W PSU
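As a back-of-envelope sanity check on the FP8 claim (the ~20% overhead factor for KV cache and activations is my assumption, not a measured number):

```python
# Rough VRAM estimate for an FP8-quantized 30B model.
params = 30e9
bytes_per_param = 1.0            # FP8 = 1 byte per parameter
weights_gb = params * bytes_per_param / 1e9
total_gb = weights_gb * 1.2      # assumed ~20% overhead (KV cache, activations)
print(f"~{total_gb:.0f} GB needed vs. 48 GB across 2x RTX 3090")
```

So a 30B FP8 model lands around 36GB with working memory, which fits in the 48GB that two 3090s provide.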
Of course this is in a single-user environment, with vLLM keeping the model warm.
It's slower than a rented Nvidia GPU, but usable for all the models I've tried (even gpt-oss-120b), and works well in a coffee shop on battery and with no internet connection.
I use Ollama to run the models, so I can't run the latest until they are ported to the Ollama library. But I don't have much time for tinkering anyway, so I don't mind the publishing delay.
Given the size of frontier models I would assume that they can incorporate many specializations and the most lasting thing here is the training environment.
But there is probably already some tradeoff, as GPT 3.5 was awesome at chess and current models don't seem trained extensively on chess anymore.
As far as I remember, it's post-training that kills chess ability for some reason (GPT-3 wasn't post-trained).
Domain-specific models have been on the roadmap for most companies for years now, from both a competitive perspective (why give up your moat to OpenAI or Anthropic) and a financial one (why finance OpenAI's margins).
Also, you can individually train and improve smaller segments as necessary.
Right now, I believe we're seeing that the big general-purpose models outperform approximately everything else. Special-purpose models (essentially: fine-tunes) of smaller models make sense when you want to solve a specific task at lower cost/lower latency, and you transfer some or most of the abilities in that domain from a bigger model to a smaller one. Usually, people don't do that, because it's quite a costly process, and the frontier models develop so rapidly that you're perpetually behind them (so in fact, you're not providing the best possible abilities).
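The transfer step mentioned above is usually done via knowledge distillation. A toy sketch (all logits are made up for illustration): the student is trained to match the teacher's temperature-softened output distribution.

```python
import numpy as np

# Toy distillation sketch with hypothetical logits.
def softmax(z, T=1.0):
    e = np.exp(z / T - np.max(z / T))
    return e / e.sum()

teacher_logits = np.array([4.0, 1.0, 0.5])
student_logits = np.array([2.0, 1.5, 1.0])
T = 2.0  # higher temperature = softer targets

p = softmax(teacher_logits, T)           # teacher "soft labels"
q = softmax(student_logits, T)           # student predictions
kl = float(np.sum(p * np.log(p / q)))    # distillation loss term
print(f"KL(teacher || student) = {kl:.4f}")
```

Training minimizes this KL term (often mixed with the ordinary hard-label loss) over a large corpus, which is exactly the costly part.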
If/when frontier model development speed slows down, training smaller models will make more sense.
So yeah, I think there are different levels of thinking; maybe future models will have some sort of internal models once they recognize patterns at some level of thinking. I'm not that knowledgeable about the internal workings of LLMs, so maybe this is all nonsense.
function replaceInTextNodes(node) {
  if (node.nodeType === Node.TEXT_NODE) {
    // Swap non-breaking spaces (U+00A0) for regular spaces
    node.nodeValue = node.nodeValue.replace(/\u00A0/g, ' ');
  } else {
    node.childNodes.forEach(replaceInTextNodes);
  }
}

replaceInTextNodes(document.body);
The script is great!
The Chinese version of the link says "通义 DeepResearch" in the title, so that doesn't appear to be the case. Completely agreed that it would be hilarious.
1: https://www.alibabacloud.com/en/solutions/generative-ai/qwen...
I switch between Gemini and ChatGPT whenever I feel one fails to fully grasp what I want; I do coding in Claude.
How are they supposed to become the 1 trillion dollar company they want to be, with strong competition and open source disruptions every few months?
Arguably, LLMs are both (1) far easier to switch between than it is today to switch among AWS/GCP/Azure systems, and (2) rapidly decreasing the switching costs of porting your legacy systems to new ones, i.e., the whole business model of Oracle and the like.
Meanwhile, the whole world is building more chip fabs, data centers, AI software/hardware architectures, etc.
Feels more like we're headed toward commoditization of the compute layer than toward a few giant AI monopolies.
And if true, that's actually even more exciting for our industry and "letting 100 flowers bloom".
The underlying architecture isn't special, and the underlying skills and tools aren't special.
There is nothing openAI brings to the table other than a willingness to lie, cheat, and steal. That only gives you an edge for so long.
I ask a loaded "filter question" that I more or less know the answer to, and mostly skip the prose and go straight to the links to its sources.
Not too different from a lot of consulting reports, in fact, and pretty much of no value if you’re actually trying to learn something.
Edit to add: even the name “deep research” to me feels like something designed to appeal to people who have never actually done or consumed research, sort of like the whole “PhD level” thing.
China is full of people who want communism to dominate the world with totalitarian control, so no one wants China to dominate anything at all, because they are bad...
The whole country is going down the drain right now. There is nothing about it that sane people outside the Republican bubble would consider "freedom".
See the --n-cpu-moe option in https://github.com/ggml-org/llama.cpp/blob/master/tools/serv...
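For example (the model filename here is just an assumption), keeping the MoE expert weights of the first N layers on the CPU lets the rest of the model fit in VRAM:

```shell
# Keep MoE expert weights of the first 20 layers on CPU;
# everything else goes to the GPU.
llama-server -m Qwen3-30B-A3B-Q4_K_M.gguf \
  --n-gpu-layers 99 \
  --n-cpu-moe 20
```

Since only a few experts are active per token, the CPU-resident layers cost far less throughput than their size would suggest.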
All this to ask the question: if I host these open-source models locally, how is the user-interface layer implemented, the part that remembers and picks the right data from my previous session, along with the agentic automation and the rest? Do I have to build it myself, or are there free options for that?
What is the state of AI in China? My personal feeling is that it doesn't dominate the zeitgeist there as it does in the US. And yet, because of the massive amount of intellectual capital they have, just a small portion of their software engineering talent working on this is enough to go head to head with us, even though it only takes a fraction of their attention.
No, the reason you don't see many open source models coming from the rest-of-world (other than Mistral in France) is that you still need a ton of capital to do it. China can compete because the CCP used a combination of the Great Firewall and lax copyright/patent enforcement to implement protectionism for internet services, which is a unique policy (one that obviously came with massive costs too). This allowed China to develop home grown tech companies which then have the datacenters, capital and talent density to train models. Rest of world didn't do this and wasn't able to build up domestic tech industries competitive with the USA.