Kernel
ROSA Kernel
This text was started on 27 Feb 2014. It is only a first draft, an early work in progress.
When complete, it will contain the main specs of each flavour and suggestions for use.
ROSA has a large number of kernel series available:
- Kernel ONE with three flavour series: basic, nrj, nrjQL
- Kernel ONE is the most complex and complete source, with so many configs and features that it can configure and generate many different specialized flavours; it is also called (69) Yin/Yang for its completeness
- Kernel Vanilla with one basic vanilla flavour plus two vanilla + nrj based flavours
- Kernel RT with one basic rt flavour plus one rt based flavour (rtQL)
The sources are shared with OpenMandriva Linux, so the same sources can generate a large number of different kernel flavours for ROSA Linux http://www.rosalab.com/, OpenMandriva Lx http://openmandriva.org/, and their spin-offs such as MagOS http://www.magos-linux.ru/, MoonDrake http://moondrake.org/, Unity Linux http://unity-linux.org/, ...
The Kernel ONE
The basic, nrj and nrjQL kernel flavours
nrj and nrjQL are two codenames used to distinguish the two advanced flavour series from the basic one.
These three flavour series can be generated from different kernel sources (srpms):
1> from the kernel source recognizable by release number rel.1, for basic and nrj
2> from the kernel source recognizable by release number rel.69, for nrjQL only
3> from the kernel source recognizable by release number rel.70, also called the Kernel ONE, which generates all basic, nrj and nrjQL flavours
History about the nrj and nrjQL kernels
The link below is where it all began:
http://mib.pianetalinux.org/forum/viewtopic.php?f=38&t=3463
Main configs and features
The same source can configure and generate three levels of kernel flavours:
1> basic flavours use the 'old mdv model': a fully featured kernel set with simple common configs and the same old mdv names:
examples: kernel-desktop-i586, kernel-desktop, kernel-server, ...
2> nrj flavours contain the same patches and features as 1> plus a few extended features:
- Full Preemption
- RCU preemption
- RCU boosting
- BFQ (disk I/O sched) enabled by default (instead of standard CFQ)
examples: kernel-nrj-desktop, kernel-nrj-laptop, kernel-nrj-realtime, ...
ROSA Linux has chosen kernel-nrj-desktop as its default kernel flavour, but once the OS is installed on the HD or SSD it is possible to install and use any of the other specialized kernel flavours. Some examples:
- If you have a laptop PC and need better CPU cooling and energy saving, we suggest kernel-nrj-laptop
- If you have a laptop / netbook PC and need better energy saving, the lightest flavour is kernel-nrj-netbook
- If you need the most responsive system for audio applications, we suggest installing kernel-nrj-realtime
- If you need to prepare a server for your LAMP applications, we suggest installing kernel-server
- It is also possible to install one or more flavours from the nrjQL flavour list just below
3> nrjQL flavours contain the same patches and features as 2> nrj plus a few extended features:
- C.K. patches, designed to improve system responsiveness and interactivity
- BFS (Process scheduler), enabled by default (instead of standard CFS)
- UKSM (Ultra KSM), memory deduplication, enabled by default
- TOI (Tux On Ice), suspend-to-disk or hibernate, enabled by default
examples: kernel-nrjQL-desktop, kernel-nrjQL-laptop, kernel-nrjQL-realtime, ...
OpenMandriva Lx has chosen kernel-nrjQL-desktop as its default kernel flavour, but once the OS is installed on the HD or SSD it is possible to install and use any of the other specialized kernel flavours. Some examples:
- If you have a laptop PC and need better CPU cooling and energy saving, we suggest nrjQL-laptop
- If you have a laptop / netbook PC and need better energy saving, the lightest flavour is nrjQL-netbook
- If you need the most responsive system for audio applications, we suggest installing nrjQL-realtime
- If you need to prepare a server for your LAMP applications, we suggest installing nrjQL-server
- If you need to prepare a game server for CounterStrike or other FPS games, you have nrjQL-server-games
- If you need a high-performance server for encoding/decoding or building sources, you have nrjQL-server-computing
Table of configs and features with descriptions
Column groups: basic configs = Hertz, TkLs, Gov., Xen; nrj model = C.Pr, R.Pr, R.bs, Disk; nrjQL model = C.K., Prcs, MemD, Hybr; misc = #1, #2, #3

kernel | Hertz | TkLs | Gov. | Xen | C.Pr | R.Pr | R.bs | Disk | C.K. | Prcs | MemD | Hybr | #1 | #2 | #3 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
desktop | 1000 | yes | OnD | no |  |  |  |  |  |  |  |  | yes |  |  |
laptop | 300 | yes | OnD | no |  |  |  |  |  |  |  |  | yes |  |  |
netbook | 250 | yes | OnD | no |  |  |  |  |  |  |  |  | yes |  |  |
server | 100 | yes | OnD | yes |  |  |  |  |  |  |  |  | yes |  |  |
nrj-desktop | 1000 | yes | OnD | no | yes | yes | yes | BFQ |  |  |  |  | yes | yes |  |
nrj-laptop | 300 | yes | OnD | no | yes | yes | yes | BFQ |  |  |  |  | yes | yes |  |
nrj-netbook | 250 | yes | OnD | no | yes | yes | yes | BFQ |  |  |  |  | yes | yes |  |
nrj-realtime | 2000 | no | Perf | no | yes | yes | yes | BFQ |  |  |  |  | yes | yes |  |
nrjQL-desktop | 1000 | no | OnD | no | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-laptop | 300 | yes | OnD | no | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-netbook | 250 | yes | OnD | no | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-realtime | 2000 | no | Perf | no | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-server | 100 | yes | OnD | yes | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-server-computing | 100 | yes | OnD | yes | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
nrjQL-server-games | 3000 | yes | Perf | no | yes | yes | yes | BFQ | yes | BFS | UKSM | TOI | yes | yes | yes |
LEGEND
Hertz=Scheduler frequency - TkLs=TickLess mode - Gov.=Power Governor - Xen=XEN Server
C.Pr=CPU Preempt - R.Pr=RCU Preempt - R.bs=RCU Boost - Disk=Disk I/O scheduler
C.K.=Con Kolivas patches - Prcs=Process scheduler - MemD=Memory Deduplicator - Hybr=Hibernation/Suspend
#1 (misc1) basic available features: 3rd party, AUFS3, OverlayFS, NdisWrapper
#2 (misc2) nrj available features: 3rd party, AUFS3, OverlayFS, NdisWrapper, ReiserFS4, ESF, ...
#3 (misc3) nrjQL available features: 3rd party, AUFS3, OverlayFS, NdisWrapper, ReiserFS4, ESF, ...
Hertz > Timer Wheel, Jiffies and HZ (or, the way it was)
http://elinux.org/Kernel_Timer_Systems#Timer_Wheel.2C_Jiffies_and_HZ_.28or.2C_the_way_it_was.29
The original kernel timer system (called the "timer wheel") was based on incrementing a kernel-internal value (jiffies) every timer interrupt. The timer interrupt becomes the default scheduling quantum, and all other timers are based on jiffies. The timer interrupt rate (and jiffy increment rate) is defined by a compile-time constant called HZ. Different platforms use different values for HZ. Historically, the kernel used 100 as the value for HZ, yielding a jiffy interval of 10 ms. With 2.4, the HZ value for i386 was changed to 1000, yielding a jiffy interval of 1 ms. Recently (2.6.13) the kernel changed HZ for i386 to 250. (1000 was deemed too high).
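The relation between HZ and the jiffy interval is plain arithmetic (one second divided by HZ). A minimal sketch follows; the helper name jiffy_us is made up for this example, and the HZ values are the ones listed in the flavour table:

```shell
#!/bin/sh
# Jiffy interval in microseconds for a given HZ value: 1 second / HZ.
# Microseconds are used so that HZ=300 or HZ=3000 do not round down to 0 ms.
# (jiffy_us is a hypothetical helper name, not part of any kernel tool.)
jiffy_us() {
    echo $(( 1000000 / $1 ))
}

for hz in 100 250 300 1000 2000 3000; do
    echo "HZ=$hz -> jiffy interval $(jiffy_us "$hz") us"
done
```

Integer division truncates, so HZ=300 is reported as 3333 us rather than 3333.33 us.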
Tickless Mode / Dynamic ticks (TkLs)
http://elinux.org/Kernel_Timer_Systems
Tickless kernel, dynamic ticks or NO_HZ is a config option that enables a kernel to run without a regular timer tick. The timer tick is a timer interrupt that is usually generated HZ times per second, with the value of HZ being set at compile time and varying between around 100 to 1500. Running without a timer tick means the kernel does less work when idle and can potentially save power because it does not have to wake up regularly just to service the timer. The configuration option is CONFIG_NO_HZ and is set by Tickless System (Dynamic Ticks), on the Kernel Features configuration menu.
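To see whether a given kernel was built with CONFIG_NO_HZ, and with which HZ value, one can look the options up in its config file. This is only a sketch: the helper name config_value is invented here, and the path /boot/config-$(uname -r) is an assumption about where the distribution installs the config (some kernels expose /proc/config.gz instead).

```shell
#!/bin/sh
# Print the value of a CONFIG_ option from a kernel config file,
# or "unset" when the option is absent or commented out.
# Usage: config_value <config-file> <OPTION>
# (config_value is a hypothetical helper name.)
config_value() {
    val=$(sed -n "s/^$2=//p" "$1" | head -n 1)
    echo "${val:-unset}"
}

# Assumed location of the running kernel's config; skipped if not readable.
cfg="/boot/config-$(uname -r)"
if [ -r "$cfg" ]; then
    echo "CONFIG_NO_HZ=$(config_value "$cfg" CONFIG_NO_HZ)"
    echo "CONFIG_HZ=$(config_value "$cfg" CONFIG_HZ)"
fi
```

The pattern requires "=" right after the option name, so CONFIG_HZ does not accidentally match CONFIG_HZ_1000.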
Power Governors (Gov.)
https://www.kernel.org/doc/Documentation/cpu-freq/governors.txt
The OpenMandriva desktop kernel uses OnD (OnDemand); the most responsive realtime flavours use Perf (Performance).
Ondemand
The CPUfreq governor "ondemand" sets the CPU depending on the current usage. To do this the CPU must have the capability to switch the frequency very quickly.
Performance
The CPUfreq governor "performance" sets the CPU statically to the highest frequency within the borders of scaling_min_freq and scaling_max_freq.
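Besides cpupower, the active governor of each CPU can be read from the cpufreq sysfs files. The sketch below assumes the usual /sys/devices/system/cpu layout; the function name list_governors and its optional directory argument are made up for this example (the argument only exists so the function can be pointed at another tree).

```shell
#!/bin/sh
# Print "cpuN governor" for every CPU that exposes a cpufreq policy.
# Optional argument: the sysfs CPU directory (defaults to the real one).
# (list_governors is a hypothetical helper name.)
list_governors() {
    root=${1:-/sys/devices/system/cpu}
    for f in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip silently when cpufreq is not available (glob did not match).
        [ -r "$f" ] || continue
        cpu=${f#"$root"/}
        echo "${cpu%%/*} $(cat "$f")"
    done
}

list_governors
```

On a machine without cpufreq support the function simply prints nothing.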
Xen hypervisor (Xen)
http://en.wikipedia.org/wiki/Xen
http://www.xenproject.org/
Xen in server flavour allows the kernel to boot in a paravirtualized environment under the Xen hypervisor.
CPU Preemption (C.Pr)
http://en.wikipedia.org/wiki/Preemption_%28computing%29
"Preemptive multitasking allows the computer system to more reliably guarantee each process a regular "slice" of operating time. It also allows the system to rapidly deal with important external events like incoming data, which might require the immediate attention of one or another process."
For the user experience, CPU preemption makes the operating system more reactive and responsive to input, more effective at multitasking, and better suited to use as a multimedia workstation.
RCU Preemption (R.Pr)
http://www.rdrop.com/users/paulmck/RCU/whatisRCU.html https://lwn.net/Articles/541037/
RCU has 4 modes; the nrj model is configured as in item 4 (SMP && PREEMPT: TREE_PREEMPT_RCU):
1. !SMP && !PREEMPT: TINY_RCU, which is used for embedded systems with tiny memories (tens of megabytes).
2. !SMP && PREEMPT: TINY_PREEMPT_RCU, for deep sub-millisecond realtime response on small-memory systems.
3. SMP && !PREEMPT: TREE_RCU, which is used for high performance and scalability on server-class systems where scheduling latencies in milliseconds are acceptable.
4. SMP && PREEMPT: TREE_PREEMPT_RCU, which is used for systems requiring high performance, scalability, and deep sub-millisecond response.
So, if you currently use TINY_PREEMPT_RCU, please go forth and test TREE_PREEMPT_RCU on your hardware and workloads.
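The four combinations listed above amount to a small lookup on the SMP and PREEMPT settings. As an illustration only (the function name rcu_flavour and its y/n arguments are invented for this sketch):

```shell
#!/bin/sh
# Map the SMP and PREEMPT settings (each "y" or "n") to the RCU
# implementation given in the four-item list.
# (rcu_flavour is a hypothetical helper name.)
rcu_flavour() {
    case "$1$2" in
        nn) echo TINY_RCU ;;
        ny) echo TINY_PREEMPT_RCU ;;
        yn) echo TREE_RCU ;;
        yy) echo TREE_PREEMPT_RCU ;;
        *)  echo "usage: rcu_flavour <smp:y|n> <preempt:y|n>" >&2; return 1 ;;
    esac
}

# The nrj model: SMP && PREEMPT
rcu_flavour y y
```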
RCU Boosting (R.bs)
http://cateee.net/lkddb/web-lkddb/RCU_BOOST_PRIO.html
This option specifies the real-time priority to which long-term preempted RCU readers are to be boosted. If you are working with a real-time application that has one or more CPU-bound threads running at a real-time priority level, you should set RCU_BOOST_PRIO to a priority higher than the highest-priority real-time CPU-bound thread. The default RCU_BOOST_PRIO value of 1 is appropriate in the common case, which is real-time applications that do not have any CPU-bound threads.
Some real-time applications might not have a single real-time thread that saturates a given CPU, but instead might have multiple real-time threads that, taken together, fully utilize that CPU. In this case, you should set RCU_BOOST_PRIO to a priority higher than the lowest-priority thread that is conspiring to prevent the CPU from running any non-real-time tasks. For example, if one thread at priority 10 and another thread at priority 5 are between themselves fully consuming the CPU time on a given CPU, then RCU_BOOST_PRIO should be set to priority 6 or higher.
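The rule in the last paragraph (go above the lowest-priority thread of the group that saturates the CPU) can be written as a one-line calculation. A small sketch, with the hypothetical helper name min_boost_prio, using the priorities 10 and 5 from the example:

```shell
#!/bin/sh
# Smallest RCU_BOOST_PRIO that satisfies the rule quoted above:
# one more than the lowest priority among the real-time threads
# that together saturate a CPU.
# Usage: min_boost_prio <prio> [<prio> ...]
# (min_boost_prio is a hypothetical helper name.)
min_boost_prio() {
    lowest=$1
    for p in "$@"; do
        if [ "$p" -lt "$lowest" ]; then
            lowest=$p
        fi
    done
    echo $(( lowest + 1 ))
}

# Threads at priority 10 and 5 -> prints 6
min_boost_prio 10 5
```

Any value at or above the printed one satisfies the rule; the sketch does not check that priorities stay within the valid real-time range.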
Disk I/O scheduler (Disk)
http://algo.ing.unimo.it/people/paolo/disk_sched/
http://lwn.net/Articles/275978/
We are currently using BFQv7r2 and waiting for v7r3, which promises to double the throughput under random load.
"BFQ is a proportional-share storage-I/O scheduler that also supports hierarchical scheduling with a cgroups interface. Here are the main nice features of BFQ.
Low latency for interactive applications: According to our results, whatever the background load is, for interactive tasks the storage device is virtually as responsive as if it was idle."
Just a video: http://www.youtube.com/watch?feature=player_embedded&v=J-e7LnJblm8
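To check which disk I/O scheduler is actually in use, the kernel lists the available schedulers in /sys/block/<dev>/queue/scheduler and marks the active one with square brackets. The sketch below only parses that line; active_scheduler is a made-up helper name and "sda" is a placeholder device.

```shell
#!/bin/sh
# Extract the active scheduler (the name in square brackets) from the
# contents of /sys/block/<dev>/queue/scheduler.
# (active_scheduler is a hypothetical helper name.)
active_scheduler() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# "sda" is a placeholder device name; skipped if the file is not readable.
f=/sys/block/sda/queue/scheduler
if [ -r "$f" ]; then
    echo "sda: $(active_scheduler "$(cat "$f")")"
fi
```

As root, writing a scheduler name into the same file (for example echo bfq > /sys/block/sda/queue/scheduler) switches the scheduler for that device until the next reboot.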
C.K. > Con Kolivas patches
http://users.on.net/~ckolivas/kernel/
These are patches designed to improve system responsiveness and interactivity with specific emphasis on the desktop, but suitable to any workload.
BFS - Process Scheduler (Prcs)
http://ck.kolivas.org/patches/bfs/3.0/3.12/3.12-sched-bfs-444.patch
BFS is the Brain Fuck Scheduler. It was designed to be forward looking only, make the most of lower spec machines, and not scale to massive hardware. ie, it is a desktop orientated scheduler, with extremely low latencies for excellent interactivity by design rather than "calculated", with rigid fairness, nice priority distribution and extreme scalability within normal load levels.
UKSM - Memory Deduplicator (MemD)
http://www.phoronix.com/scan.php?page=news_item&px=MTEzMTI http://kerneldedup.org/en/projects/uksm/
The Ultra KSM (UKSM) patch-set for the Linux kernel continues to be maintained for providing transparent full-system memory de-duplication for Linux. UKSM is about de-duplication of data in system memory rather than being another de-duplicating file-system. UKSM can work for KVM virtualization as well to reduce memory usage for guest virtual machines and there is also a KernelDeDup project for supporting Xen virtualization too, in an effort to reduce memory pressure.
TuxOnIce - TOI (Hybr)
http://en.wikipedia.org/wiki/TuxOnIce http://tuxonice.nigelcunningham.com.au/
TuxOnIce (formerly known as Suspend2) is an implementation of the suspend-to-disk (or hibernate) feature which is available as patches for the 2.6 Linux kernel. During the 2.5 kernel era, Pavel Machek forked the original out-of-tree version of swsusp (then at approximately beta 10) and got it merged into the vanilla kernel, while development continued in the swsusp/Suspend2/TuxOnIce line. TuxOnIce includes support for SMP, highmem and preemption.
AUFS3 (misc#1)
http://en.wikipedia.org/wiki/Aufs http://aufs.sourceforge.net/ http://sourceforge.net/p/aufs/aufs3-standalone/ci/master/tree/
aufs (AnotherUnionFS in version 1, but advanced multi layered unification filesystem since version 2) implements a union mount for Linux file systems.
Developed by Junjiro Okajima in 2006, aufs is a complete rewrite of the earlier UnionFS. It aimed to improve reliability and performance, but also introduced some new concepts, like writable branch balancing, and other improvements - some of which are now implemented in the UnionFS 2.x branch.
OverlayFS (misc#1)
http://sourceforge.net/projects/olfs/
A FUSE filesystem module that transparently merges the content of several directories into a single directory.
Commands and Tools
Some command-line tools are generated from the same kernel srpm:
- cpupower
to monitor and / or change the Power Governor profile
- kernel-header
it contains the needed headers for some applications
- perf
to execute some interesting performance comparison tests
To manage the energy profiles we can use these command-line tools, mainly cpupower and perf.
The available governors are: conservative, userspace, powersave, ondemand, performance.
All kernel flavours for OpenMandriva and ROSA are configured with OnDemand by default, except the realtime flavours and server-games, which need the most responsiveness and are configured with the Performance governor.
If these tools are not installed, you can install them now:
# urpmi cpupower perf
We can see the list of command options
[root@localhost ~]# cpupower
Usage: cpupower [-d|--debug] [-c|--cpu cpulist ] <command> [<args>]
Supported commands are:
frequency-info
frequency-set
idle-info
idle-set
set
info
monitor
help
To query the configuration currently in use (in the case below it is ondemand):
[root@localhost ~]# cpupower frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.0 us.
hardware limits: 1000 MHz - 1.67 GHz
available frequency steps: 1.67 GHz, 1.33 GHz, 1000 MHz
available cpufreq governors: conservative, userspace, powersave, ondemand, performance
current policy: frequency should be within 1000 MHz and 1.67 GHz.
The governor "ondemand" may decide which speed to use
within this range.
current CPU frequency is 1.67 GHz (asserted by call to hardware).
boost state support:
Supported: no
Active: no
If you prefer to enable "powersave" for the best CPU cooling and longer battery life, at the cost of the lowest performance:
[root@localhost ~]# cpupower frequency-set -g powersave
Setting cpu: 0
Setting cpu: 1
Now we check that powersave is active (and it is):
[root@localhost ~]# cpupower frequency-info
analyzing CPU 0:
driver: acpi-cpufreq
CPUs which run at the same hardware frequency: 0
CPUs which need to have their frequency coordinated by software: 0
maximum transition latency: 10.0 us.
hardware limits: 1000 MHz - 1.67 GHz
available frequency steps: 1.67 GHz, 1.33 GHz, 1000 MHz
available cpufreq governors: conservative, userspace, powersave, ondemand, performance
current policy: frequency should be within 1000 MHz and 1.67 GHz.
The governor "powersave" may decide which speed to use
within this range.
current CPU frequency is 1000 MHz (asserted by call to hardware).
boost state support:
Supported: no
Active: no
To make the change permanent, we can edit the config file
/etc/sysconfig/cpupower
and replace 'ondemand' with the preferred governor. Then save the file and reboot.
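The edit of /etc/sysconfig/cpupower can also be done from the command line. The sketch below is only an illustration: it makes no assumption about the exact layout of the file beyond the word 'ondemand' appearing in it, and the function name set_default_governor is invented for this example. It keeps a .bak copy of the original file.

```shell
#!/bin/sh
# Replace every occurrence of the word "ondemand" in a config file with
# another governor, keeping a .bak copy of the original.
# Usage: set_default_governor <config-file> <governor>
# (set_default_governor is a hypothetical helper name.)
set_default_governor() {
    # Accept only the governors listed by cpupower frequency-info.
    case "$2" in
        conservative|userspace|powersave|ondemand|performance) ;;
        *) echo "unknown governor: $2" >&2; return 1 ;;
    esac
    cp "$1" "$1.bak" &&
    sed "s/ondemand/$2/g" "$1.bak" > "$1"
}

# Run as root on the real file, for example:
#   set_default_governor /etc/sysconfig/cpupower powersave
```

If the file already names a governor other than ondemand, nothing is changed and it has to be edited by hand.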
Other Kernels
Vanilla Kernels flavours
OpenMandriva has other kernels available; these are generated from different SRPMS.
The vanilla flavours are prepared with the most basic vanilla features and configs, with no 3rd party patches added.
What is a Vanilla Kernel?
It is the basic kernel source available at http://www.kernel.org
- kernel-vanilla
the basic vanilla
- kernel-vanilla-nrj-desktop
vanilla plus the full nrj preemption mode for the CPU and RCU tree, RCU boosting
- kernel-vanilla-nrj-laptop
vanilla plus the full nrj preemption mode for the CPU and RCU tree, RCU boosting, but at 300Hz
RT Kernels flavours
Kernel RT and the -rt flavours are based on the -rt (PREEMPT_RT) patchset by Ingo Molnar and Thomas Gleixner
What is an RT Kernel ?
It is the basic vanilla kernel source plus the -rt patches
https://rt.wiki.kernel.org/
- kernel-rt
the basic -rt flavour
- kernel-rtQL
rt plus other features, mostly from QL, such as AUFS3, BFQ, REISERFS4, TOI, UKSM
How to Install
Once you have chosen the kernel flavour you want to install, we suggest also installing the kernel source rpm or, at least, the -devel package of that flavour
example:
kernel-nrjQL-laptop + kernel-source
otherwise you may install:
kernel-nrjQL-laptop + kernel-nrjQL-laptop-devel
To simplify the installation and get further updates automatically, we can install everything through the metapackages named "-latest"
example:
kernel-nrjQL-laptop-latest + kernel-source-latest
otherwise you may install:
kernel-nrjQL-laptop-latest + kernel-nrjQL-laptop-devel-latest
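The package pairs above follow a fixed naming pattern, so the urpmi command line for any flavour can be put together mechanically. A minimal sketch (kernel_packages is a hypothetical helper name; the command is printed, not executed, so it can be reviewed first):

```shell
#!/bin/sh
# Build the package list for one flavour, following the naming used above:
# the "-latest" metapackage of the kernel plus the one of its -devel package.
# Usage: kernel_packages <flavour>      e.g. nrjQL-laptop
# (kernel_packages is a hypothetical helper name.)
kernel_packages() {
    echo "kernel-$1-latest kernel-$1-devel-latest"
}

# Print the command rather than run it; execute it as root once checked.
echo "urpmi $(kernel_packages nrjQL-laptop)"
```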
VMWare Virtualization
The current versions of the widespread virtualization software, VMware Workstation 10.0.1 and VMware Player 6.0.1,
work fine with the stock kernel only up to kernel 3.11.
With kernel 3.12, the vmci and vsock modules do not build when the kernel has NAMESPACES with UIDGID enabled.
With kernel 3.13 there is a further build problem, this time with the vmnet module, which needs to be fixed properly.
We searched for unofficial patches from other communities, but with no luck.
MIB has prepared the solutions and the source patches to solve these troubles
The patched archives (vmci.tar, vsock.tar) for the kernel 3.12 are available here:
http://mib.pianetalinux.org/MIB/rosa2012.1/others/vmware-kernel312/
The patched archive (vmnet.tar) for kernel 3.13.6+ can be downloaded from here:
http://mib.pianetalinux.org/MIB/rosa2012.1/others/vmware-kernel313/
Of course, you also need to replace the two patched archives for kernel 3.12 mentioned above.
Make a safety backup of the original contents of the source folder
/usr/lib/vmware/modules/source/
then put our archives with the fixes into
/usr/lib/vmware/modules/source/
Now all the needed VMware modules should build properly!
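The backup and replacement steps could be scripted roughly as follows. This is a sketch under assumptions: the function name replace_vmware_sources, the ".orig" backup suffix and the download directory in the usage comment are all made up for illustration, and the patched .tar files are assumed to have been downloaded already.

```shell
#!/bin/sh
# Back up the VMware module sources, then copy the patched archives over them.
# Usage: replace_vmware_sources <source-dir> <dir-with-patched-tars>
# (replace_vmware_sources and the ".orig" suffix are hypothetical choices.)
replace_vmware_sources() {
    src=$1
    patched=$2
    backup="$src.orig"
    # Keep an existing backup untouched so a second run cannot overwrite the originals.
    [ -e "$backup" ] || cp -a "$src" "$backup" || return 1
    for t in "$patched"/*.tar; do
        [ -f "$t" ] || continue
        cp "$t" "$src"/ || return 1
    done
}

# Run as root; ~/vmware-patched is a hypothetical download location:
#   replace_vmware_sources /usr/lib/vmware/modules/source ~/vmware-patched
```

To roll back, copy the files from the backup directory over the source folder again.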
Who is the developer and maintainer
NicCo (Nicolò Costanza)
Kernel designer, engineer, maintainer and tester for ROSA Desktop and OpenMandriva Lx OSes
System admin, and moderator for the User Community of ROSA and OpenMandriva Linux OSes
MIB Blog > http://mib.pianetalinux.org/blog - MIB Forum > http://mib.pianetalinux.org/forum