OpenSource kan sneller (Open source can be faster)
Unix/Linux has traditionally been used on systems that are
always on. The speed at which such a server starts up therefore matters
little. For embedded and real-time systems, however, it does matter: the
user expects to be able to use the system immediately. With EQSL (now:
Embedded-Linux Kick-Start), using Linux has become easy. Now
we study the boot process to see where time can be saved,
and also how to measure this reliably.
Transcript
OpenSource kan sneller! Historically, Linux has been used on systems where boot time is not relevant. With EQSLinux it has become easy to start using embedded Linux. Now we want to make Linux boot faster! But Linux is complex ... There are always a thousand solutions ... And we are not even sure about the problem ... So, how to solve it? Time to investigate, to study, ...
Today
Is there a problem?
Linux is a slow starter...
How does Linux boot?
Some theory and a few quick wins...
How to measure?
We need to KNOW, not hope or guess
Measure, change, re-measure and compare!
Results and solutions
Not only for PTS’ EQSLinux, but for all (embedded) Linux systems!
Typical Linux ...
Linux was never designed for embedded systems
Unix/Linux used to be a ‘server’
Boot-time is not important, flexibility is
Linux@theDesktop is ‘hot’
People have gotten used to waiting for computers
There is a lot of ‘user-space’ waiting
‘Building the desktop’ takes a lot more time than the system boot
Example: Booting one of our development-systems takes 143 seconds!
suse-9.1, with lots of HW
On most embedded systems, that is not acceptable
We are used to milliseconds, not thousands of them!
Booting Linux is complex
Flexibility
Unix-brands, Linux-distributions, SystemUse
E.g. Banking versus SW-development
Hardware, Systems
Legacy
RunLevels, Compatibility
Lack of design-vision
Background & know-how of developers
Some developers have other ideas, or “enrich” (don’t understand) the concept
Focus on ‘C’, ‘desktop’, ‘new’
No time left for: Makefiles, engineering, booting, embedded
Booting Linux Systems
There are 5 phases
With several sub-phases; some of them are time-outs!
And thousands of steps
[Figure: ‘UP-ness’ (%) versus time for the example development system (143 seconds!). Phases in order: POST (bios), BOOT (grub), Kernel (vmlinuz), modules (*.ko), Start-up (rc-scripts), READY (login/application)]
How to measure?
Reliably measuring the boot steps is difficult
Everything is SW-only, so no external (scope) measurement is possible
Multiple domains, HW, SW & versions
E.g. bios; grub, uboot, redboot; Linux-kernel-2.*.*; scripts
Instrumenting is difficult
Can’t change the BIOS, can hardly change ‘BOOT’ (there is no space!)
Lots & lots of code in a Linux system; fork() is architecture-dependent
Time-resolution: seconds or microseconds?
Some steps take less than 1 millisecond!
Every statement takes time! a_time() takes too much!
We need to see the ‘big picture’; not only details
Note: monitoring 143 s in ms-resolution is (about) 500 meters of print-out!!
StartLog
Concept
Make it fast
Capture data in real-time, transfer later
Process data off-line
Make it simple
Minimize the changes (simple to port to every Linux)
For kernel, modules and start-up scripts alike
Make it reliable
Measure the measurement
Measure, change Linux, re-measure AND compare!
Result: StartLog (patch & post-processing)
Details: patch
Log system-call ‘exec’
Start a new program
It will miss ‘source *.sh’ (a sourced script runs inside the current shell, so no exec happens)
Filename is available
Arch independent (Linux)
Use ‘dmesg’ storage (the kernel log buffer)
It’s available
Easy to read
Increase its size! (Watch for build-bug)
Use ‘jiffies’ for time
Counter (unsigned long integer)
Can wrap; random start
About 1 ms (most systems)
do_execve(... filename ...) {
    static int PTS_startlog = 0;   /* sequence number of the log entry */
    // int PTS_i;
    /* One entry per exec: sequence number, current jiffies, program name */
    printk(" @PTS@startlog=%07d, jiffies=%012lu, do_execve(%s) ",
           PTS_startlog++, jiffies, filename);
    /* // Optional: also log the first 9 argument and environment strings
    for (PTS_i = 0; PTS_i < 9; PTS_i++) printk(" %s, ", argv[PTS_i]);
    printk(" ; ");
    for (PTS_i = 0; PTS_i < 9; PTS_i++) printk(" %s, ", envp[PTS_i]);
    printk(" ) ");
    */
    ...
}
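Because the jiffies counter can wrap and starts at an arbitrary value, differences have to be taken modulo the counter width. The sketch below shows one way to do that off-line. The 32-bit width and the tick rate of 1000 per second are assumptions (the slides only say "about 1 ms on most systems"), and the function names are made up for this illustration.

```python
# Sketch: wrap-safe elapsed time between two jiffies readings.
# Assumptions (not from the slides): a 32-bit counter and HZ = 1000.

COUNTER_BITS = 32   # assumed width of 'unsigned long' on the target
HZ = 1000           # assumed tick rate; one jiffy is then 1 ms


def jiffies_delta(start, now, bits=COUNTER_BITS):
    """Return the ticks elapsed from start to now, allowing one wrap-around."""
    return (now - start) % (1 << bits)


def jiffies_to_ms(ticks, hz=HZ):
    """Convert a tick count to milliseconds for the given tick rate."""
    return ticks * 1000.0 / hz
```

The modulo only gives the right answer when less than one full counter period lies between the two readings, which should hold comfortably for a single boot.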
Details: processing & Quality
Store Log
dmesg -s 512000 > aFile
ftp/scp to host
Off-line
Some awk script
Filter, recount jiffies, csv-format
Excel macros & graphing
X-Y (scatter) for timeline
Histogram (bar), #calls
Pie, for bottleneck
Customisable
It is easy to adapt to find each problem.
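The filter, recount and CSV steps could look roughly like the sketch below. It is written in Python rather than awk and is not the original script. The regular expression follows the printk format of the patch; the 32-bit counter width, the column names and the function names are assumptions for illustration.

```python
import csv
import io
import re

# Matches entries in the format printed by the do_execve patch.
LOG_ENTRY = re.compile(
    r"@PTS@startlog=(\d+),\s*jiffies=(\d+),\s*do_execve\((.*?)\)"
)


def parse_startlog(text):
    """Filter: return (sequence, jiffies, filename) for every log entry."""
    return [(int(seq), int(jif), name)
            for seq, jif, name in LOG_ENTRY.findall(text)]


def to_csv(entries, bits=32):
    """Recount jiffies relative to the first entry and return CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["seq", "ticks_since_first", "program"])
    if entries:
        first = entries[0][1]
        for seq, jif, name in entries:
            # Modulo keeps the difference correct across a counter wrap.
            writer.writerow([seq, (jif - first) % (1 << bits), name])
    return out.getvalue()
```

Feeding the saved dmesg output through `parse_startlog` and then `to_csv` yields a file that a spreadsheet can open directly for the timeline and histogram graphs.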
Influence (real-time)
Each log: 200-300 µsec
Depending on CPU!
Near linear slowdown
Especially for interesting parts
Accurate for slow steps!
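Since each log entry costs 200 to 300 µs and the slowdown is close to linear, the total measurement overhead can be bounded from the number of entries alone. A small sketch of that arithmetic follows; the function name is hypothetical, and the real per-entry cost depends on the CPU, as noted above.

```python
def overhead_range_s(n_logs, low_us=200.0, high_us=300.0):
    """Return (min, max) seconds that n_logs log entries add to the boot."""
    return (n_logs * low_us / 1_000_000.0,
            n_logs * high_us / 1_000_000.0)
```

For 1000 logged exec calls this gives between 0.2 and 0.3 seconds, which is small against a step that takes many seconds but not against a step of a few milliseconds.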
Repeatability
Variation ∆jiffies: <5 (95%)
Double, triple log for check
Use series of three or more
Example: after flashing the memory, the first boot always loses a few seconds somewhere!
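Checking repeatability over a series of runs can be sketched as follows. The assumption is that every run logged the same steps in the same order, with timestamps already recounted from the first entry; the function names and the default limit of 5 jiffies (taken from the variation figure above) are illustrative.

```python
def step_spread(runs):
    """For each step, return max minus min of its timestamp over all runs."""
    return [max(values) - min(values) for values in zip(*runs)]


def fraction_within(runs, limit=5):
    """Return the fraction of steps whose spread stays below limit jiffies."""
    spreads = step_spread(runs)
    return sum(1 for s in spreads if s < limit) / len(spreads)
```

A step with a large spread in one of three runs (such as the first boot after flashing) then stands out immediately instead of being mistaken for a real bottleneck.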
Environment
Watch it! It affects boot speed
E.g. dhcp timeout!
Some Results
The following sheets show some results.
They show what CAN be (& is) measured
Some general examples are selected
Measurements are very project specific
And are –often– very boring for others
Often, the POST and BOOT phases are excluded
Not generic, not changeable,
Not measured with StartLog (but can be measured and added to graphs!)
For that reason, details are not explained
Please contact me directly/offline for your specific questions
All times (numbers) are in jiffies
All systems are NON-optimized!
A first impression
Boot time and clock time do give some info, but
Only limited information
No real phases
But close enough for a 1st look
What to improve?
Hard to explain
(Where) is the system
Busy (‘overloaded’), or
Waiting?
We need more detail!
And concentrate on
Linux/OpenSource parts
Delays (system is doing nothing)
Some timelines
Gumstix
2 timelines
Exactly the same, except for
the horizontal line, which is
a network timeout (see next sheet)
A lot fewer processes are started:
Only look at the % of the total number!
It uses fewer modules than above
Embedded PC
4 timelines
Similar but different
Notice:
Speed:
Fastest: yellow
Slowest: purple
Horizontal lines:
No progress!
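A horizontal line in the timeline means that time passes without any new exec being logged. Such stalls can also be found numerically by sorting the gaps between successive entries, as in the sketch below. The function name and the input layout (a time-ordered list of `(ticks, program)` pairs) are assumptions for illustration.

```python
def longest_gaps(entries, top=3):
    """Return the largest gaps as (gap_ticks, program_before, program_after).

    entries: list of (ticks, program) tuples, sorted by time.
    """
    gaps = [
        (later[0] - earlier[0], earlier[1], later[1])
        for earlier, later in zip(entries, entries[1:])
    ]
    return sorted(gaps, key=lambda gap: gap[0], reverse=True)[:top]
```

The program logged just before the largest gap is the first candidate to inspect: it is either busy for a long time or waiting on a timeout.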
Bottleneck && Network timeout
Normal (top)
Kernel 23%
Modules 35%
Networking 28%
*-mount 10%
S01 + S10 + S11
Network Timeout (bottom)
Times are equal, except for
S20networking
When the dhcp-server is gone (no network, cable or server)
It takes an extra 16 seconds to boot
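Percentages per phase, like the ones listed above, follow from the timestamps at which each phase starts. The sketch below shows the calculation; the function name and the input layout are assumptions, and any numbers fed into it should come from your own log.

```python
def phase_shares(boundaries):
    """Return {phase: percentage of total boot time}.

    boundaries: list of (phase_name, start_ticks) in time order, closed by
    a final ("READY", end_ticks) entry that only marks the end of the boot.
    """
    total = boundaries[-1][1] - boundaries[0][1]
    return {
        name: 100.0 * (following[1] - start) / total
        for (name, start), following in zip(boundaries, boundaries[1:])
    }
```

Comparing the shares of a normal boot with those of a boot without a dhcp-server shows at a glance which phase absorbs the extra time.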
[Figure legend: Kernel, /etc/rcS.d/S01mountvirtfs, /etc/rcS.d/S05module-init-tools, /etc/rcS.d/S10mountall, /etc/rcS.d/S11RAMdisk, /etc/rcS.d/S15hostname, /etc/rcS.d/S17sysklogd, /etc/rcS.d/S20networking, /etc/rcS.d/S25cron, /etc/rcS.d/S30thttpd]
Program Count
Often-exec’ed programs are good candidates to optimize
Found some surprises
‘ hotplug ’ is called directly by the kernel
Even when it does not exist!
Called 123 (aside) to 1315 (below) times!
Most: called a few times
Some: an awful lot
Use a logarithmic 2nd axis
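Counting how often each program is exec'ed takes only a few lines once the filenames have been extracted from the log. A minimal sketch, assuming a plain list of logged filenames as input (the function name is hypothetical):

```python
import os
from collections import Counter


def program_counts(filenames):
    """Return (program, count) pairs, most frequently exec'ed first."""
    names = (os.path.basename(name) for name in filenames)
    return Counter(names).most_common()
```

The head of the resulting list holds the candidates worth optimizing; the long tail of programs called once or twice is why a logarithmic axis helps when plotting.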
Summary (1/4)
Problems & Solutions
Linux is a slow starter
It needs more attention than a traditional RTOS
There are thousands of ‘improvements’ on the Net
Google for ‘make embedded Linux boot faster’
1.4 million hits, of which
198 PowerPoint presentations in past 3 months (excluding this one)
Usually they are (more or less ‘good’) ideas
But, what do they improve? Or change?
Does it apply to your HW, system, version, ... too?
Do you know your bottleneck?
How to measure that improvement?
Summary (2/4)
Booting Linux
Flexibility & Power do come with a cost
Embedded Linux boots a lot faster than ‘normal Linux’
There is much more than just ‘the Linux kernel’
At least 5 phases
Thousands of steps
The ‘environment’ has influence
dhcp example: 16 extra seconds without networking!
Linux is OpenSource ...
You can change it!
There is more OpenSource than just ‘Linux’
Non-kernel stuff; other OS (both Unix-alike and others)
Summary (3/4)
StartLog
Capture, measure and visualize the Linux start-up
Simple, reliable, repeatable
Cheap
It is a concept, with little, free code
Easy and fast to operate
Interpreting the results does require Linux know-how
Usable on all Linux versions
So you can improve your system!
Improve
‘Measure, pin-point, change, re-measure’
Summary (4/4)
Generic Quick Wins
Disable timeouts
Disable/remove unneeded kernel-modules
Trade-off time/space
Uncompressed images are (usually) faster!
Advice
Make specific (non-generic) boot-scripts
Use delayed/background processing
E.g. Start network (dhcp) late, background fsck (BSD only)
Measure and compare what is going on!
Questions and More info
Although the sheets are overloaded with info, it’s only a fraction of what’s available.
More info:
Most patches & scripts are available
This presentation is available
See the ‘note-pages’! (Print hidden sheets!)
See the website(s) for the latest versions.
Questions:
http://albert.mietus.nl [email_address]
http://www.PassieVoorTechniek.nl