Yes.
This is more of a good old classic partitioning, which was rare outside of hardware with special support for it.
Most RTOS + non-RTOS combinations use the RTOS doubling as a hypervisor, with RT tasks running in guaranteed timeframes and the non-RTOS guest running in a more relaxed form.
Unlike in other Linux virtualization solutions such as User Mode Linux (or the aforementioned VMware), special driver software on the host operating system is used to execute the coLinux kernel in a privileged mode (known as ring 0 or supervisor mode).
By constantly switching the machine's state between the host OS state and the coLinux kernel state, coLinux is given full control of the physical machine's MMU (i.e., paging and protection) in its own specially allocated address space, and is able to act just like a native kernel, achieving almost the same performance and functionality that can be expected from a regular Linux which could have run on the same machine standalone.
So my understanding is that it's a Windows driver which contains a full Linux kernel and does some (scary sounding!) time sharing with the Windows kernel running at the same CPL.
The coLinux home page also says:
To cooperatively share hardware with the host operating system, coLinux does not access I/O devices directly. Instead, it interfaces with emulated devices provided by the coLinux drivers in the host OS. For example, a regular file in Windows can be used as a block device in coLinux. All real hardware interrupts are transparently forwarded to the host OS, so this way the host OS's control of the real hardware is not being disturbed and thus it continues to run smoothly.
So just like UML, coLinux hooks int 80h (or sysenter) and forwards the request to Windows. Thus, while it may make use of direct access to the MMU, most devices IIRC are virtualized.
If you need more security/isolation, go to a VM or bare metal.
https://www.digiater.nl/openvms/doc/alpha-v8.3/83final/aa_re...
This sounds like running multiple kernels in a shared security domain, which reduces the performance cost of transitions and sharing, but you lose the reliability and security advantages that a proper VM gives you. It reminds me of coLinux (essentially, a Linux kernel as a Windows NT device driver).
Does anyone have more details on how OpenVMS Galaxy was actually implemented? I believe it was available for both Alpha and Itanium, but not yet x86-64 (and probably never…)
The firmware support was mainly there to provide booting of separate partitions, but otherwise no virtualisation was involved - all resources were exclusively owned.
I think Linux will have to move to a microkernel architecture before this can work. Once you have separate "processes" for hardware drivers, running two userlands side-by-side should be a piece of cake (at least compared to the earlier task of converting the rest of the kernel).
Will be interesting to see where this goes. I like the idea, but if I were to go in that direction, I would choose something like a Genode kernel to supervise multiple Linux kernels.
I think it works for:

- Enhanced security through kernel-level separation
- Better resource utilization than traditional VMs (KVM, Xen, etc.)

but I don't think it works for:

- Improved fault isolation between different workloads
- Potential zero-downtime kernel update with KHO (Kexec HandOver)

since if the "main" kernel crashes or is supposed to get upgraded, then you have to hand hardware back to it.

Isn't that similar to starting up from hibernate to disk? Basically all of your peripherals are powered off and so probably cannot keep their state.
Also, you can actually stop a disk (a member of a RAID device), remove the PCIe-SATA HBA card it is attached to, replace it with a different one, and connect everything back together without any user-space application noticing.
Here's my graphics chip getting reset:
[drm:mes_v11_0_submit_pkt_and_poll_completion.constprop.0 [amdgpu]] *ERROR* MES failed to response msg=3
[drm:amdgpu_mes_unmap_legacy_queue [amdgpu]] *ERROR* failed to unmap legacy queue
[drm:gfx_v11_0_hw_fini [amdgpu]] *ERROR* failed to halt cp gfx
amdgpu 0000:c6:00.0: amdgpu: MODE2 reset
amdgpu 0000:c6:00.0: amdgpu: GPU reset succeeded, trying to resume

This allowed cheap "logical partitioning" of machines without actually using a hypervisor or special hardware support.
Today, you can grab a physical NIC and create some number of virtual NICs. Same for GPUs.
I guess the idea is that you have some hardware, and each kernel (read "virtual machine") will get:
- some dedicated CPU
- some physical memory
- some virtual NICs
- some storage, maybe (if dedicated; if through network, then nothing to do here)
- maybe a virtual GPU for the AI hype train
Every kernel will mostly think it owns real hardware, while in fact it only deals with part of it (all of this due to virtualized hardware support that can be found in many places).

This does not seem like a general-purpose feature that could be used on our laptops.
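For the CPU and memory part of such a carve-out, today's kernel already has boot parameters that could fence resources off for a second kernel. A purely illustrative fragment (the actual multikernel patchset may configure this differently, and all values here are made up):

```
# Hypothetical command line for the boot ("main") kernel:
# keep CPUs 4-7 and the 4 GiB of RAM starting at the 4 GiB mark
# out of its hands, so a second kernel could later claim them.
isolcpus=4-7 nohz_full=4-7      # keep the scheduler and timer ticks off those CPUs
memmap=4G$0x100000000           # mark that physical range as reserved
```

(Note that `$` needs escaping in some bootloader configs; see the kernel's kernel-parameters documentation for the exact `memmap=` syntax.)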
I think the architecture assumes all loaded kernels are trusted, and imposes no isolation other than having them running on different CPUs.
Given the (relative) simplicity of the PoC, it could be really performant.
Which of the kernels does the PCI enumeration, for instance, and how is it determined which kernel gets ownership of a PCI device? What about ACPI? Serial ports?
How does this architecture transfer ownership of RAM between kernels, or is it a fixed configuration? What about NUMA awareness? (Likely you would want to partition systems so that RAM stays with the CPUs of the same NUMA node.)
Looks to me like one kernel would need to have 'hypervisor'-like behavior in order to divvy up resources to the other kernels. I think PVM (https://lwn.net/Articles/963718/) would be a preferred solution in this case, because the existing software stack for managing hypervisor resources can be reused with it.
Could the new kernel be genetically scored for effectiveness (security, performance, etc), and iterated upon automatically, by e.g. an AI?