A couple of notable things: first, that you can do this at all (fine-tuning a left-to-right model into out-of-order diffusion), which is really interesting. Second, the final version beats the original by a small margin on some benchmarks. Third, it’s in the ballpark of Gemini Diffusion, although not competitive, which is to be expected for any 7B-parameter model.
A diffusion model comes with a lot of benefits in terms of parallelization and therefore speed; to my mind, the architecture is a better fit for coding than strict left-to-right generation.
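For intuition, here’s a toy sketch (not Apple’s actual decoder) of the confidence-based parallel unmasking loop that masked-diffusion text models typically build on; the stand-in “model” just returns random proposals, but the structure shows why one forward pass can commit many tokens at once:

```python
import random

MASK = "<mask>"

def fake_model(seq):
    """Stand-in for a real network: propose (position, token, confidence)
    for every masked slot in one simulated forward pass."""
    vocab = ["def", "add", "(", "a", ",", "b", ")", ":", "return", "+"]
    return [(i, random.choice(vocab), random.random())
            for i, tok in enumerate(seq) if tok == MASK]

def diffusion_decode(length=16, steps=4):
    seq = [MASK] * length
    per_step = length // steps
    for _ in range(steps):                      # a handful of passes in total...
        proposals = fake_model(seq)             # ...each scoring ALL masked slots
        proposals.sort(key=lambda p: p[2], reverse=True)
        for pos, tok, _conf in proposals[:per_step]:
            seq[pos] = tok                      # commit the most confident tokens
    return seq

print(" ".join(diffusion_decode()))
```

An autoregressive model needs one forward pass per token; here the same length takes length/steps passes, which is where the speed win comes from.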
Overall, interesting. At some point these local models will get good enough for ‘real work’ and they will be slotted in at API providers rapidly. Apple’s game is on-device; I think we’ll see descendants of these start shipping with Xcode in the next year as just part of the coding experience.
Why can’t I back up an iOS device to a local NAS, the way I can with Time Machine, for example? (Rhetorical question; the answer is obviously that they want to sell more iCloud storage for that all-important services revenue.)
> Step-by-Step Guide: How to Backup iPhone to Synology NAS
https://www.ubackup.com/phone-backup/backup-iphone-to-synolo...
There are two methods presented: one only backs up the camera roll; the other requires plugging into a computer and manually clicking around, at which point you might as well use the first-party backup built into Finder (or iTunes on Windows? Is that still a thing?), no random third-party application needed. I also highly doubt their “backup every single content” claim.
It’s also a sneaky marketing article for that third-party application, following the common SEO practice of offering a half-assed solution that captures a frequent search term (in this case, “backup iPhone to Synology”), then plugging their own questionable product as the better solution.
That’s a guide on how to back up an iPhone to a NAS using a computer.
Unsurprisingly, a reasonably capable general-purpose OS supports network file systems in a way transparent to applications, but that doesn’t help people using only an iOS device.
You can back up your iPhone using Finder.
Finder -> Locations -> Your iPhone -> Backup all the data on this iPhone to your Mac.
Once you have done this, you can find the backup in "Manage Backups"; right-click an entry and select "Show in Finder". From there you can copy it to your NAS.
Not as smooth as a Time Machine backup but it is possible.
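If you want to script that last copy step, here’s a minimal sketch, assuming the NAS share is already mounted (the /Volumes/nas-backups path is hypothetical; adjust to your setup). Finder keeps local iOS backups under ~/Library/Application Support/MobileSync/Backup, and your terminal app needs Full Disk Access to read it:

```python
import shutil
from pathlib import Path

# Where Finder stores local iOS backups (one folder per device backup).
SRC = Path.home() / "Library/Application Support/MobileSync/Backup"
# Hypothetical mount point for the NAS share; change to match your setup.
DST = Path("/Volumes/nas-backups/iphone")

for backup_dir in SRC.iterdir():
    if not backup_dir.is_dir():
        continue
    target = DST / backup_dir.name
    shutil.copytree(backup_dir, target, dirs_exist_ok=True)
    print(f"copied {backup_dir.name} -> {target}")
```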
I’d personally call it “absurdly clunky and intentionally impractical for a big chunk of Apple’s user base”.
When I connect my iPhone to my iMac it does a local backup to a file, which then gets backed up via Time Machine (and SuperDuper/CarbonCopyCloner).
"How to back up your iPhone, iPad, and iPod touch with your Mac":
* https://support.apple.com/en-ca/108796
There's also a checkbox for Wi-Fi syncing, so a cable isn't necessarily needed.
iOS natively supports SMB over any network connection (including wired Ethernet), mounting encrypted APFS volumes on USB storage at 10 Gbps, etc.
It’s Apple’s explicit vision that an iPad Pro can replace a Mac, even for some professional users. Why don’t those users deserve local backups?
Apple already provides first-party software to handle iDevice backups on Windows or Mac.
Backing up an Android device to a PC using adb is significantly more difficult, especially for the less technically minded.
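For comparison, the adb route looks roughly like this; a sketch assuming adb is installed and USB debugging is enabled (note that adb backup is deprecated on recent Android versions, and many apps opt out of it):

```python
import subprocess

# Back up app data, APKs, and shared storage to a single .ab archive.
# The phone prompts on-screen to confirm before the backup starts.
subprocess.run(
    ["adb", "backup", "-apk", "-shared", "-all", "-f", "android-backup.ab"],
    check=True,
)
```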
That’s arguably the wrong question: I bet a lot more people would own one if they could easily back up their iOS devices to it.
There aren’t that many people who are willing to own a device from a company while not trusting that company with their data.
I think EVs destroying Ultra Large Container Ships had better odds, and both are extremely unlikely. Datacenter advantages Apple won't be able to overcome: compute density, cooling, cheap power, physical security to protect the software, scale + bandwidth, and the lower costs to customers of using contract manufacturers and/or commodity hardware.
There is no universe where large enterprises ditch their geo-located racks, let alone hyperscalers, especially now that they are scrounging for energy, reneging on renewables pledges, and paying big bucks to bring nuclear power stations online.
We have to see if it produces better results. Humans have a planning phase, followed by a part-by-part implementation phase. This is reasonably well emulated by plan/architect + codegen tools.
But yeah, RWKV also ends up in a similar performance range at similar sizes - I wish someone would finally start using it at scale...
It's possible that some of these new architectures/optimizations would let us go beyond current benchmark scores, but probably only with more training data and money. And to get money you need to show results, which is what you see today. Scaling remains king; maybe one of these techniques is 2025's "attention" paper, but even that one needed a lot of scaling to go from the 2017 version to ChatGPT.
Are these small models good enough for anything but autocomplete?
They predict more than just the second half of a word you are typing, but at the end of the day they're still just predicting what a human would have typed.
I had a similar notion and am excited to see this research being done. My experience of writing code is that the structure of the whole system influences each individual part, which has always felt like a better match for a diffusion type model.
I suspect this is a 7B model because it’s an experiment, but I do like seeing Apple play with smaller models - I think Google’s “no moat” memo is still fundamentally correct, whether via better architectures or via Moore’s law, and it seems Apple thinks the same.
The Zed team recently posted a pretty good intro to diffusion models for text: https://www.youtube.com/watch?v=oot4O9wMohw
It's notable that this was an intern project.
Did you use the normal JetBrains AI Assistant, or was it Junie?
> Apple’s model is built on top of Qwen2.5‑7B, an open-source foundation model from Alibaba. Alibaba first fine-tuned that model for better code generation (as Qwen2.5‑Coder‑7B), then Apple took it and made its own adjustments.
> it still doesn’t quite reach the level of GPT-4 or Gemini Diffusion.
So even though it is faster (than what?) it still doesn’t beat top models?
On a purely tech point: I'm not working anywhere near the cutting edge of AI research, but I hadn't realised so much had been done - and was possible - with diffusion models for text. I'll need to dig into that, as it looks fascinating.
I've been looking for good/best-practice workflow setups for working with Dockerized Python/FastAPI backends and Vue frontends... but I haven't found much.
If anyone has tips for where to look, I'd genuinely appreciate it!
Other, non-programming-related tasks are a different story though
Apple's is open weights, so that's a big deal.