From acb86de34e3262cd6233da66bf2fa0b9c8a22171 Mon Sep 17 00:00:00 2001 From: Scott Rifenbark Date: Thu, 17 Jan 2013 12:57:16 -0800 Subject: profile-manual: Added oprofile usage section. No re-writing at all. (From yocto-docs rev: f42230e3665903a7603212696949241244555f8b) Signed-off-by: Scott Rifenbark Signed-off-by: Richard Purdie --- .../profile-manual/profile-manual-usage.xml | 552 +++++++++++++++++++++ 1 file changed, 552 insertions(+) (limited to 'documentation/profile-manual') diff --git a/documentation/profile-manual/profile-manual-usage.xml b/documentation/profile-manual/profile-manual-usage.xml index 39a0c5c91f..f2bc868b28 100644 --- a/documentation/profile-manual/profile-manual-usage.xml +++ b/documentation/profile-manual/profile-manual-usage.xml @@ -2221,6 +2221,558 @@ +
+ oprofile + + + oprofile itself is a command-line application that runs on the + target system. + + +
+ Setup + + + For this section, we'll assume you've already performed the + basic setup outlined in the + "General Setup" + section. + + + + For the the section that deals with oprofile from the command-line, + we assume you've ssh'ed to the host and will be running + oprofile on the target. + + + + oprofileui (oprofile-viewer) is a GUI-based program that runs + on the host and interacts remotely with the target. + See the oprofileui section for the exact steps needed to + install oprofileui on the host. + +
+ +
+ Basic Usage + + + Oprofile as configured in Yocto is a system-wide profiler + (i.e. the version in Yocto doesn't yet make use of the + perf_events interface which would allow it to profile + specific processes and workloads). It's relies on hardware + counter support in the hardware (but can fall back to a + timer-based mode), which means that it doesn't take + advantage of tracepoints or other event sources for example. + + + + It consists of a kernel module that collects samples and a + userspace daemon that writes the sample data to disk. + + + + The 'opcontrol' shell script is used for transparently + managing these components and starting and stopping + profiles, and the 'opreport' command is used to + display the results. + + + + The oprofile daemon should already be running, but before + you start profiling, you may need to change some settings + and some of these settings may require the daemon not + be running. One of these settings is the path the the + vmlinux file, which you'll want to set using the --vmlinux + option if you want the kernel profiled: + + root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r` + The profiling daemon is currently active, so changes to the configuration + will be used the next time you restart oprofile after a --shutdown or --deinit. + + You can check if vmlinux file: is set using opcontrol --status: + + root@crownbay:~# opcontrol --status + Daemon paused: pid 1334 + Separate options: library + vmlinux file: none + Image filter: none + Call-graph depth: 6 + + If it's not, you need to shutdown the daemon, add the setting + and restart the daemon: + + root@crownbay:~# opcontrol --shutdown + Killing daemon. + + root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r` + root@crownbay:~# opcontrol --start-daemon + Using default event: CPU_CLK_UNHALTED:100000:0:1:1 + Using 2.6+ OProfile kernel interface. + Reading module info. + Using log file /var/lib/oprofile/samples/oprofiled.log + Daemon started. + + If we get the status again we now see our updated settings: + + root@crownbay:~# opcontrol --status + Daemon paused: pid 1649 + Separate options: library + vmlinux file: /boot/vmlinux-3.4.11-yocto-standard + Image filter: none + Call-graph depth: 6 + + We're now in a position to run a profile. For that we used + 'opcontrol --start': + + root@crownbay:~# opcontrol --start + Profiler running. + + In another window, run our wget workload: + + root@crownbay:~# rm linux-2.6.19.2.tar.bz2; wget http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2; sync + Connecting to downloads.yoctoproject.org (140.211.169.59:80) + linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA + + To stop the profile we use 'opcontrol --shudown', which not + only stops the profile but shuts down the daemon as well: + + root@crownbay:~# opcontrol --start + Stopping profiling. + Killing daemon. + + Oprofile writes sample data to /var/lib/oprofile/samples, + which you can look at if you're interested in seeing how the + samples are structured. This is also interesting because + it's related to how you dive down to get further details + about specific executables in OProfile. + + + + To see the default display output for a profile, simply type + 'opreport', which will show the results using the data in + /var/lib/oprofile/samples: + + root@crownbay:~# opreport + + WARNING! The OProfile kernel driver reports sample buffer overflows. + Such overflows can result in incorrect sample attribution, invalid sample + files and other symptoms. See the oprofiled.log for details. + You should adjust your sampling frequency to eliminate (or at least minimize) + these overflows. + CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated) + Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 464365 79.8156 vmlinux-3.4.11-yocto-standard + 65108 11.1908 oprofiled + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 64416 98.9372 oprofiled + 692 1.0628 libc-2.16.so + 36959 6.3526 no-vmlinux + 4378 0.7525 busybox + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 2844 64.9612 libc-2.16.so + 1337 30.5391 busybox + 193 4.4084 ld-2.16.so + 2 0.0457 libnss_compat-2.16.so + 1 0.0228 libnsl-2.16.so + 1 0.0228 libnss_files-2.16.so + 4344 0.7467 bash + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 2657 61.1648 bash + 1665 38.3287 libc-2.16.so + 18 0.4144 ld-2.16.so + 3 0.0691 libtinfo.so.5.9 + 1 0.0230 libdl-2.16.so + 3118 0.5359 nf_conntrack + 686 0.1179 matchbox-terminal + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 214 31.1953 libglib-2.0.so.0.3200.4 + 114 16.6181 libc-2.16.so + 79 11.5160 libcairo.so.2.11200.2 + 78 11.3703 libgdk-x11-2.0.so.0.2400.8 + 51 7.4344 libpthread-2.16.so + 45 6.5598 libgobject-2.0.so.0.3200.4 + 29 4.2274 libvte.so.9.2800.2 + 25 3.6443 libX11.so.6.3.0 + 19 2.7697 libxcb.so.1.1.0 + 17 2.4781 libgtk-x11-2.0.so.0.2400.8 + 12 1.7493 librt-2.16.so + 3 0.4373 libXrender.so.1.3.0 + 671 0.1153 emgd + 411 0.0706 nf_conntrack_ipv4 + 391 0.0672 iptable_nat + 378 0.0650 nf_nat + 263 0.0452 Xorg + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 106 40.3042 Xorg + 53 20.1521 libc-2.16.so + 31 11.7871 libpixman-1.so.0.27.2 + 26 9.8859 emgd_drv.so + 16 6.0837 libemgdsrv_um.so.1.5.15.3226 + 11 4.1825 libEMGD2d.so.1.5.15.3226 + 9 3.4221 libfb.so + 7 2.6616 libpthread-2.16.so + 1 0.3802 libudev.so.0.9.3 + 1 0.3802 libdrm.so.2.4.0 + 1 0.3802 libextmod.so + 1 0.3802 mouse_drv.so + . + . + . + 9 0.0015 connmand + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 4 44.4444 libglib-2.0.so.0.3200.4 + 2 22.2222 libpthread-2.16.so + 1 11.1111 connmand + 1 11.1111 libc-2.16.so + 1 11.1111 librt-2.16.so + 6 0.0010 oprofile-server + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 3 50.0000 libc-2.16.so + 1 16.6667 oprofile-server + 1 16.6667 libpthread-2.16.so + 1 16.6667 libglib-2.0.so.0.3200.4 + 5 8.6e-04 gconfd-2 + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 2 40.0000 libdbus-1.so.3.7.2 + 2 40.0000 libglib-2.0.so.0.3200.4 + 1 20.0000 libc-2.16.so + + The output above shows the breakdown or samples by both + number of samples and percentage for each executable. + Within an executable, the sample counts are broken down + further into executable and shared libraries (DSOs) used + by the executable. + + + + To get even more detailed breakdowns by function, we need to + have the full paths to the DSOs, which we can get by + using -f with opreport: + + root@crownbay:~# opreport -f + + CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated) + Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 + CPU_CLK_UNHALT...| + samples| %| + + 464365 79.8156 /boot/vmlinux-3.4.11-yocto-standard + 65108 11.1908 /usr/bin/oprofiled + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 64416 98.9372 /usr/bin/oprofiled + 692 1.0628 /lib/libc-2.16.so + 36959 6.3526 /no-vmlinux + 4378 0.7525 /bin/busybox + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 2844 64.9612 /lib/libc-2.16.so + 1337 30.5391 /bin/busybox + 193 4.4084 /lib/ld-2.16.so + 2 0.0457 /lib/libnss_compat-2.16.so + 1 0.0228 /lib/libnsl-2.16.so + 1 0.0228 /lib/libnss_files-2.16.so + 4344 0.7467 /bin/bash + CPU_CLK_UNHALT...| + samples| %| + ------------------ + 2657 61.1648 /bin/bash + 1665 38.3287 /lib/libc-2.16.so + 18 0.4144 /lib/ld-2.16.so + 3 0.0691 /lib/libtinfo.so.5.9 + 1 0.0230 /lib/libdl-2.16.so + . + . + . + + Using the paths shown in the above output and the -l option to + opreport, we can see all the functions that have hits in the + profile and their sample counts and percentages. Here's a + portion of what we get for the kernel: + + root@crownbay:~# opreport -l /boot/vmlinux-3.4.11-yocto-standard + + CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated) + Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 + samples % symbol name + 233981 50.3873 intel_idle + 15437 3.3243 rb_get_reader_page + 14503 3.1232 ring_buffer_consume + 14092 3.0347 mutex_spin_on_owner + 13024 2.8047 read_hpet + 8039 1.7312 sub_preempt_count + 7096 1.5281 ioread32 + 6997 1.5068 add_preempt_count + 3985 0.8582 rb_advance_reader + 3488 0.7511 add_event_entry + 3303 0.7113 get_parent_ip + 3104 0.6684 rb_buffer_peek + 2960 0.6374 op_cpu_buffer_read_entry + 2614 0.5629 sync_buffer + 2545 0.5481 debug_smp_processor_id + 2456 0.5289 ohci_irq + 2397 0.5162 memset + 2349 0.5059 __copy_to_user_ll + 2185 0.4705 ring_buffer_event_length + 1918 0.4130 in_lock_functions + 1850 0.3984 __schedule + 1767 0.3805 __copy_from_user_ll_nozero + 1575 0.3392 rb_event_data_length + 1256 0.2705 memcpy + 1233 0.2655 system_call + 1213 0.2612 menu_select + + Notice that above we see an entry for the __copy_to_user_ll() + function that we've looked at with other profilers as well. + + + + Here's what we get when we do the same thing for the + busybox executable: + + CPU: Intel Architectural Perfmon, speed 1.3e+06 MHz (estimated) + Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 100000 + samples % image name symbol name + 349 8.4198 busybox retrieve_file_data + 308 7.4306 libc-2.16.so _IO_file_xsgetn + 283 6.8275 libc-2.16.so __read_nocancel + 235 5.6695 libc-2.16.so syscall + 233 5.6212 libc-2.16.so clearerr + 215 5.1870 libc-2.16.so fread + 181 4.3667 libc-2.16.so __write_nocancel + 158 3.8118 libc-2.16.so __underflow + 151 3.6429 libc-2.16.so _dl_addr + 150 3.6188 busybox progress_meter + 150 3.6188 libc-2.16.so __poll_nocancel + 148 3.5706 libc-2.16.so _IO_file_underflow@@GLIBC_2.1 + 137 3.3052 busybox safe_poll + 125 3.0157 busybox bb_progress_update + 122 2.9433 libc-2.16.so __x86.get_pc_thunk.bx + 95 2.2919 busybox full_write + 81 1.9542 busybox safe_write + 77 1.8577 busybox xwrite + 72 1.7370 libc-2.16.so _IO_file_read + 71 1.7129 libc-2.16.so _IO_sgetn + 67 1.6164 libc-2.16.so poll + 52 1.2545 libc-2.16.so _IO_switch_to_get_mode + 45 1.0856 libc-2.16.so read + 34 0.8203 libc-2.16.so write + 32 0.7720 busybox monotonic_sec + 25 0.6031 libc-2.16.so vfprintf + 22 0.5308 busybox get_mono + 14 0.3378 ld-2.16.so strcmp + 14 0.3378 libc-2.16.so __x86.get_pc_thunk.cx + . + . + . + + Since we recorded the profile with a callchain depth of 6, we + should be able to see our __copy_to_user_ll() callchains in + the output, and indeed we can if we search around a bit in + the 'opreport --callgraph' output: + + root@crownbay:~# opreport --callgraph /boot/vmlinux-3.4.11-yocto-standard + + 392 6.9639 vmlinux-3.4.11-yocto-standard sock_aio_read + 736 13.0751 vmlinux-3.4.11-yocto-standard __generic_file_aio_write + 3255 57.8255 vmlinux-3.4.11-yocto-standard inet_recvmsg + 785 0.1690 vmlinux-3.4.11-yocto-standard tcp_recvmsg + 1790 31.7940 vmlinux-3.4.11-yocto-standard local_bh_enable + 1238 21.9893 vmlinux-3.4.11-yocto-standard __kfree_skb + 992 17.6199 vmlinux-3.4.11-yocto-standard lock_sock_nested + 785 13.9432 vmlinux-3.4.11-yocto-standard tcp_recvmsg [self] + 525 9.3250 vmlinux-3.4.11-yocto-standard release_sock + 112 1.9893 vmlinux-3.4.11-yocto-standard tcp_cleanup_rbuf + 72 1.2789 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec + + 170 0.0366 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec + 1491 73.3038 vmlinux-3.4.11-yocto-standard memcpy_toiovec + 327 16.0767 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec + 170 8.3579 vmlinux-3.4.11-yocto-standard skb_copy_datagram_iovec [self] + 20 0.9833 vmlinux-3.4.11-yocto-standard copy_to_user + + 2588 98.2909 vmlinux-3.4.11-yocto-standard copy_to_user + 2349 0.5059 vmlinux-3.4.11-yocto-standard __copy_to_user_ll + 2349 89.2138 vmlinux-3.4.11-yocto-standard __copy_to_user_ll [self] + 166 6.3046 vmlinux-3.4.11-yocto-standard do_page_fault + + Remember that by default OProfile sessions are cumulative + i.e. if you start and stop a profiling session, then start a + new one, the new one will not erase the previous run(s) but + will build on it. If you want to restart a profile from scratch, + you need to reset: + + root@crownbay:~# opcontrol --reset + + +
+ +
+ OProfileUI - A GUI for OProfile + + + Yocto also supports a graphical UI for controlling and viewing + OProfile traces, called OProfileUI. To use it, you first need + to clone the oprofileui git repo, then configure, build, and + install it: + + [trz@empanada tmp]$ git clone git://git.yoctoproject.org/oprofileui + [trz@empanada tmp]$ cd oprofileui + [trz@empanada oprofileui]$ ./autogen.sh + [trz@empanada oprofileui]$ sudo make install + + OprofileUI replaces the 'opreport' functionality with a GUI, + and normally doesn't require the user to use 'opcontrol' either. + If you want to profile the kernel, however, you need to either + use the UI to specify a vmlinux or use 'opcontrol' to specify + it on the target: + + + + First, on the target, check if vmlinux file: is set: + + root@crownbay:~# opcontrol --status + + If not: + + root@crownbay:~# opcontrol --shutdown + root@crownbay:~# opcontrol --vmlinux=/boot/vmlinux-`uname -r` + root@crownbay:~# opcontrol --start-daemon + + Now, start the oprofile UI on the host system: + + [trz@empanada oprofileui]$ oprofile-viewer + + To run a profile on the remote system, first connect to the + remote system by pressing the 'Connect' button and supplying + the IP address and port of the remote system (the default + port is 4224). + + + + The oprofile server should automatically be started already. + If not, the connection will fail and you either typed in the + wrong IP address and port (see below), or you need to start + the server yourself: + + root@crownbay:~# oprofile-server + + Or, to specify a specific port: + + root@crownbay:~# oprofile-server --port 8888 + + Once connected, press the 'Start' button and then run the + wget workload on the remote system: + + root@crownbay:~# rm linux-2.6.19.2.tar.bz2; wget http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2; sync + Connecting to downloads.yoctoproject.org (140.211.169.59:80) + linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA + + Once the workload completes, press the 'Stop' button. At that + point the OProfile viewer will download the profile files it's + collected (this may take some time, especially if the kernel + was profiled). While it downloads the files, you should see + something like the following: + + + + + + + + Once the profile files have been retrieved, you should see a + list of the processes that were profiled: + + + + + + + + If you select one of them, you should see all the symbols that + were hit during the profile. Selecting one of them will show a + list of callers and callees of the chosen function in two + panes below the top pane. For example, here's what we see + when we select __copy_to_user_ll(): + + + + + + + + As another example, we can look at the busybox process and see + that the progress meter made a system call: + + + + + + + + Tying It Together: oprofile does have build options to enable + use of the perf_event subsystem and benefit from the perf_event + infrastructure by adding support for something other than + system-wide profiling i.e. per-process or workload profiling, + but the version in danny doesn't yet take advantage of + those capabilities. + +
+ +
+ Documentation + + + Yocto already has some information on setting up and using + OProfile and oprofileui. As this document doesn't cover + everything in detail, it may be worth taking a look at the + "Profiling with OProfile" + section in the Yocto Project Development Manual + + + + The OProfile manual can be found here: + OProfile manual + + + + The OProfile website contains links to the above manual and + bunch of other items including an extensive set of examples: + About OProfile + +
+ + + + + + + + + +