diff options
author | Michael Opdenacker <michael.opdenacker@bootlin.com> | 2024-03-13 13:47:11 +0100 |
---|---|---|
committer | Richard Purdie <richard.purdie@linuxfoundation.org> | 2024-03-19 15:22:27 +0000 |
commit | 063b0cf8f99e2a438e35c4796a48cf946a5e5329 (patch) | |
tree | 7933f394c1f905bd358af5f31bdc22a47c182e24 | |
parent | e987c5151c34cb388cc1a106a21bdc118206cfac (diff) | |
download | poky-063b0cf8f99e2a438e35c4796a48cf946a5e5329.tar.gz |
profile-manual: usage.rst: formatting fixes
Plus a few text styling improvements, some reported by "make stylecheck"
(From yocto-docs rev: c29a1bf4dd5223e4af9c1be5a8a8ea6a88bb7a58)
Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
-rw-r--r-- | documentation/profile-manual/usage.rst | 467 |
1 files changed, 235 insertions, 232 deletions
diff --git a/documentation/profile-manual/usage.rst b/documentation/profile-manual/usage.rst index 6f0b0418e7..bd45dc2137 100644 --- a/documentation/profile-manual/usage.rst +++ b/documentation/profile-manual/usage.rst | |||
@@ -10,10 +10,10 @@ Basic Usage (with examples) for each of the Yocto Tracing Tools | |||
10 | This chapter presents basic usage examples for each of the tracing | 10 | This chapter presents basic usage examples for each of the tracing |
11 | tools. | 11 | tools. |
12 | 12 | ||
13 | perf | 13 | Perf |
14 | ==== | 14 | ==== |
15 | 15 | ||
16 | The 'perf' tool is the profiling and tracing tool that comes bundled | 16 | The perf tool is the profiling and tracing tool that comes bundled |
17 | with the Linux kernel. | 17 | with the Linux kernel. |
18 | 18 | ||
19 | Don't let the fact that it's part of the kernel fool you into thinking | 19 | Don't let the fact that it's part of the kernel fool you into thinking |
@@ -61,7 +61,7 @@ Basic Perf Usage | |||
61 | ---------------- | 61 | ---------------- |
62 | 62 | ||
63 | The perf tool is pretty much self-documenting. To remind yourself of the | 63 | The perf tool is pretty much self-documenting. To remind yourself of the |
64 | available commands, simply type 'perf', which will show you basic usage | 64 | available commands, simply type ``perf``, which will show you basic usage |
65 | along with the available perf subcommands:: | 65 | along with the available perf subcommands:: |
66 | 66 | ||
67 | root@crownbay:~# perf | 67 | root@crownbay:~# perf |
@@ -97,19 +97,19 @@ along with the available perf subcommands:: | |||
97 | Using perf to do Basic Profiling | 97 | Using perf to do Basic Profiling |
98 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 98 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
99 | 99 | ||
100 | As a simple test case, we'll profile the 'wget' of a fairly large file, | 100 | As a simple test case, we'll profile the ``wget`` of a fairly large file, |
101 | which is a minimally interesting case because it has both file and | 101 | which is a minimally interesting case because it has both file and |
102 | network I/O aspects, and at least in the case of standard Yocto images, | 102 | network I/O aspects, and at least in the case of standard Yocto images, |
103 | it's implemented as part of BusyBox, so the methods we use to analyze it | 103 | it's implemented as part of BusyBox, so the methods we use to analyze it |
104 | can be used in a very similar way to the whole host of supported BusyBox | 104 | can be used in a very similar way to the whole host of supported BusyBox |
105 | applets in Yocto. :: | 105 | applets in Yocto:: |
106 | 106 | ||
107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ | 107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ |
108 | wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 108 | wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
109 | 109 | ||
110 | The quickest and easiest way to get some basic overall data about what's | 110 | The quickest and easiest way to get some basic overall data about what's |
111 | going on for a particular workload is to profile it using 'perf stat'. | 111 | going on for a particular workload is to profile it using ``perf stat``. |
112 | 'perf stat' basically profiles using a few default counters and displays | 112 | This command basically profiles using a few default counters and displays |
113 | the summed counts at the end of the run:: | 113 | the summed counts at the end of the run:: |
114 | 114 | ||
115 | root@crownbay:~# perf stat wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 115 | root@crownbay:~# perf stat wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
@@ -135,9 +135,9 @@ Many times such a simple-minded test doesn't yield much of | |||
135 | interest, but sometimes it does (see Real-world Yocto bug (slow | 135 | interest, but sometimes it does (see Real-world Yocto bug (slow |
136 | loop-mounted write speed)). | 136 | loop-mounted write speed)). |
137 | 137 | ||
138 | Also, note that 'perf stat' isn't restricted to a fixed set of counters | 138 | Also, note that ``perf stat`` isn't restricted to a fixed set of counters |
139 | - basically any event listed in the output of 'perf list' can be tallied | 139 | --- basically any event listed in the output of ``perf list`` can be tallied |
140 | by 'perf stat'. For example, suppose we wanted to see a summary of all | 140 | by ``perf stat``. For example, suppose we wanted to see a summary of all |
141 | the events related to kernel memory allocation/freeing along with cache | 141 | the events related to kernel memory allocation/freeing along with cache |
142 | hits and misses:: | 142 | hits and misses:: |
143 | 143 | ||
@@ -164,22 +164,22 @@ hits and misses:: | |||
164 | 164 | ||
165 | 44.831023415 seconds time elapsed | 165 | 44.831023415 seconds time elapsed |
166 | 166 | ||
167 | So 'perf stat' gives us a nice easy | 167 | So ``perf stat`` gives us a nice easy |
168 | way to get a quick overview of what might be happening for a set of | 168 | way to get a quick overview of what might be happening for a set of |
169 | events, but normally we'd need a little more detail in order to | 169 | events, but normally we'd need a little more detail in order to |
170 | understand what's going on in a way that we can act on in a useful way. | 170 | understand what's going on in a way that we can act on in a useful way. |
171 | 171 | ||
172 | To dive down into a next level of detail, we can use 'perf record'/'perf | 172 | To dive down into a next level of detail, we can use ``perf record`` / |
173 | report' which will collect profiling data and present it to use using an | 173 | ``perf report`` which will collect profiling data and present it to use using an |
174 | interactive text-based UI (or simply as text if we specify ``--stdio`` to | 174 | interactive text-based UI (or simply as text if we specify ``--stdio`` to |
175 | 'perf report'). | 175 | ``perf report``). |
176 | 176 | ||
177 | As our first attempt at profiling this workload, we'll simply run 'perf | 177 | As our first attempt at profiling this workload, we'll simply run ``perf |
178 | record', handing it the workload we want to profile (everything after | 178 | record``, handing it the workload we want to profile (everything after |
179 | 'perf record' and any perf options we hand it --- here none, will be | 179 | ``perf record`` and any perf options we hand it --- here none, will be |
180 | executed in a new shell). perf collects samples until the process exits | 180 | executed in a new shell). perf collects samples until the process exits |
181 | and records them in a file named 'perf.data' in the current working | 181 | and records them in a file named ``perf.data`` in the current working |
182 | directory. :: | 182 | directory:: |
183 | 183 | ||
184 | root@crownbay:~# perf record wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 184 | root@crownbay:~# perf record wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
185 | 185 | ||
@@ -189,7 +189,7 @@ directory. :: | |||
189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] | 189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] |
190 | 190 | ||
191 | To see the results in a | 191 | To see the results in a |
192 | 'text-based UI' (tui), simply run 'perf report', which will read the | 192 | "text-based UI" (tui), simply run ``perf report``, which will read the |
193 | perf.data file in the current working directory and display the results | 193 | perf.data file in the current working directory and display the results |
194 | in an interactive UI:: | 194 | in an interactive UI:: |
195 | 195 | ||
@@ -199,24 +199,24 @@ in an interactive UI:: | |||
199 | :align: center | 199 | :align: center |
200 | :width: 70% | 200 | :width: 70% |
201 | 201 | ||
202 | The above screenshot displays a 'flat' profile, one entry for each | 202 | The above screenshot displays a "flat" profile, one entry for each |
203 | 'bucket' corresponding to the functions that were profiled during the | 203 | "bucket" corresponding to the functions that were profiled during the |
204 | profiling run, ordered from the most popular to the least (perf has | 204 | profiling run, ordered from the most popular to the least (perf has |
205 | options to sort in various orders and keys as well as display entries | 205 | options to sort in various orders and keys as well as display entries |
206 | only above a certain threshold and so on --- see the perf documentation | 206 | only above a certain threshold and so on --- see the perf documentation |
207 | for details). Note that this includes both userspace functions (entries | 207 | for details). Note that this includes both userspace functions (entries |
208 | containing a [.]) and kernel functions accounted to the process (entries | 208 | containing a ``[.]``) and kernel functions accounted to the process (entries |
209 | containing a [k]). (perf has command-line modifiers that can be used to | 209 | containing a ``[k]``). perf has command-line modifiers that can be used to |
210 | restrict the profiling to kernel or userspace, among others). | 210 | restrict the profiling to kernel or userspace, among others. |
211 | 211 | ||
212 | Notice also that the above report shows an entry for 'busybox', which is | 212 | Notice also that the above report shows an entry for ``busybox``, which is |
213 | the executable that implements 'wget' in Yocto, but that instead of a | 213 | the executable that implements ``wget`` in Yocto, but that instead of a |
214 | useful function name in that entry, it displays a not-so-friendly hex | 214 | useful function name in that entry, it displays a not-so-friendly hex |
215 | value instead. The steps below will show how to fix that problem. | 215 | value instead. The steps below will show how to fix that problem. |
216 | 216 | ||
217 | Before we do that, however, let's try running a different profile, one | 217 | Before we do that, however, let's try running a different profile, one |
218 | which shows something a little more interesting. The only difference | 218 | which shows something a little more interesting. The only difference |
219 | between the new profile and the previous one is that we'll add the -g | 219 | between the new profile and the previous one is that we'll add the ``-g`` |
220 | option, which will record not just the address of a sampled function, | 220 | option, which will record not just the address of a sampled function, |
221 | but the entire callchain to the sampled function as well:: | 221 | but the entire callchain to the sampled function as well:: |
222 | 222 | ||
@@ -238,11 +238,11 @@ took the most time, but we can also see a summary of how those functions | |||
238 | were called and learn something about how the program interacts with the | 238 | were called and learn something about how the program interacts with the |
239 | kernel in the process. | 239 | kernel in the process. |
240 | 240 | ||
241 | Notice that each entry in the above screenshot now contains a '+' on the | 241 | Notice that each entry in the above screenshot now contains a ``+`` on the |
242 | left-hand side. This means that we can expand the entry and drill down | 242 | left-hand side. This means that we can expand the entry and drill down |
243 | into the callchains that feed into that entry. Pressing 'enter' on any | 243 | into the callchains that feed into that entry. Pressing ``Enter`` on any |
244 | one of them will expand the callchain (you can also press 'E' to expand | 244 | one of them will expand the callchain (you can also press ``E`` to expand |
245 | them all at the same time or 'C' to collapse them all). | 245 | them all at the same time or ``C`` to collapse them all). |
246 | 246 | ||
247 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry | 247 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry |
248 | and several subnodes all the way down. This lets us see which callchains | 248 | and several subnodes all the way down. This lets us see which callchains |
@@ -253,13 +253,13 @@ As a bit of background explanation for these callchains, think about | |||
253 | what happens at a high level when you run wget to get a file out on the | 253 | what happens at a high level when you run wget to get a file out on the |
254 | network. Basically what happens is that the data comes into the kernel | 254 | network. Basically what happens is that the data comes into the kernel |
255 | via the network connection (socket) and is passed to the userspace | 255 | via the network connection (socket) and is passed to the userspace |
256 | program 'wget' (which is actually a part of BusyBox, but that's not | 256 | program ``wget`` (which is actually a part of BusyBox, but that's not |
257 | important for now), which takes the buffers the kernel passes to it and | 257 | important for now), which takes the buffers the kernel passes to it and |
258 | writes it to a disk file to save it. | 258 | writes it to a disk file to save it. |
259 | 259 | ||
260 | The part of this process that we're looking at in the above call stacks | 260 | The part of this process that we're looking at in the above call stacks |
261 | is the part where the kernel passes the data it has read from the socket | 261 | is the part where the kernel passes the data it has read from the socket |
262 | down to wget i.e. a copy-to-user. | 262 | down to wget i.e. a ``copy-to-user``. |
263 | 263 | ||
264 | Notice also that here there's also a case where the hex value is | 264 | Notice also that here there's also a case where the hex value is |
265 | displayed in the callstack, here in the expanded ``sys_clock_gettime()`` | 265 | displayed in the callstack, here in the expanded ``sys_clock_gettime()`` |
@@ -270,8 +270,8 @@ busybox. | |||
270 | :align: center | 270 | :align: center |
271 | :width: 70% | 271 | :width: 70% |
272 | 272 | ||
273 | The above screenshot shows the other half of the journey for the data - | 273 | The above screenshot shows the other half of the journey for the data --- |
274 | from the wget program's userspace buffers to disk. To get the buffers to | 274 | from the ``wget`` program's userspace buffers to disk. To get the buffers to |
275 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to | 275 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to |
276 | the kernel, which then takes care via some circuitous path (probably | 276 | the kernel, which then takes care via some circuitous path (probably |
277 | also present somewhere in the profile data), to get it safely to disk. | 277 | also present somewhere in the profile data), to get it safely to disk. |
@@ -281,7 +281,7 @@ of how to extract useful information out of it, let's get back to the | |||
281 | task at hand and see if we can get some basic idea about where the time | 281 | task at hand and see if we can get some basic idea about where the time |
282 | is spent in the program we're profiling, wget. Remember that wget is | 282 | is spent in the program we're profiling, wget. Remember that wget is |
283 | actually implemented as an applet in BusyBox, so while the process name | 283 | actually implemented as an applet in BusyBox, so while the process name |
284 | is 'wget', the executable we're actually interested in is BusyBox. So | 284 | is ``wget``, the executable we're actually interested in is BusyBox. So |
285 | let's expand the first entry containing BusyBox: | 285 | let's expand the first entry containing BusyBox: |
286 | 286 | ||
287 | .. image:: figures/perf-wget-busybox-expanded-stripped.png | 287 | .. image:: figures/perf-wget-busybox-expanded-stripped.png |
@@ -311,16 +311,15 @@ To generate the debug info for the packages in the image, we can add | |||
311 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" | 311 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" |
312 | 312 | ||
313 | Additionally, in order to generate the type of debuginfo that perf | 313 | Additionally, in order to generate the type of debuginfo that perf |
314 | understands, we also need to set | 314 | understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE` |
315 | :term:`PACKAGE_DEBUG_SPLIT_STYLE` | ||
316 | in the ``local.conf`` file:: | 315 | in the ``local.conf`` file:: |
317 | 316 | ||
318 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' | 317 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' |
319 | 318 | ||
320 | Once we've done that, we can install the | 319 | Once we've done that, we can install the debuginfo for BusyBox. The |
321 | debuginfo for BusyBox. The debug packages once built can be found in | 320 | debug packages once built can be found in ``build/tmp/deploy/rpm/*`` |
322 | ``build/tmp/deploy/rpm/*`` on the host system. Find the busybox-dbg-...rpm | 321 | on the host system. Find the ``busybox-dbg-...rpm`` file and copy it |
323 | file and copy it to the target. For example:: | 322 | to the target. For example:: |
324 | 323 | ||
325 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: | 324 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: |
326 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 | 325 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 |
@@ -336,7 +335,7 @@ their functions symbolically: | |||
336 | :align: center | 335 | :align: center |
337 | :width: 70% | 336 | :width: 70% |
338 | 337 | ||
339 | If we expand one of the entries and press 'enter' on a leaf node, we're | 338 | If we expand one of the entries and press ``Enter`` on a leaf node, we're |
340 | presented with a menu of actions we can take to get more information | 339 | presented with a menu of actions we can take to get more information |
341 | related to that entry: | 340 | related to that entry: |
342 | 341 | ||
@@ -346,7 +345,7 @@ related to that entry: | |||
346 | 345 | ||
347 | One of these actions allows us to show a view that displays a | 346 | One of these actions allows us to show a view that displays a |
348 | busybox-centric view of the profiled functions (in this case we've also | 347 | busybox-centric view of the profiled functions (in this case we've also |
349 | expanded all the nodes using the 'E' key): | 348 | expanded all the nodes using the ``E`` key): |
350 | 349 | ||
351 | .. image:: figures/perf-wget-busybox-dso-zoom.png | 350 | .. image:: figures/perf-wget-busybox-dso-zoom.png |
352 | :align: center | 351 | :align: center |
@@ -355,8 +354,8 @@ expanded all the nodes using the 'E' key): | |||
355 | Finally, we can see that now that the BusyBox debuginfo is installed, | 354 | Finally, we can see that now that the BusyBox debuginfo is installed, |
356 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry | 355 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry |
357 | mentioned previously is now resolved, and shows that the | 356 | mentioned previously is now resolved, and shows that the |
358 | sys_clock_gettime system call that was the source of 6.75% of the | 357 | ``sys_clock_gettime`` system call that was the source of 6.75% of the |
359 | copy-to-user overhead was initiated by the ``handle_input()`` BusyBox | 358 | ``copy-to-user`` overhead was initiated by the ``handle_input()`` BusyBox |
360 | function: | 359 | function: |
361 | 360 | ||
362 | .. image:: figures/perf-wget-g-copy-to-user-expanded-debuginfo.png | 361 | .. image:: figures/perf-wget-g-copy-to-user-expanded-debuginfo.png |
@@ -365,15 +364,15 @@ function: | |||
365 | 364 | ||
366 | At the lowest level of detail, we can dive down to the assembly level | 365 | At the lowest level of detail, we can dive down to the assembly level |
367 | and see which instructions caused the most overhead in a function. | 366 | and see which instructions caused the most overhead in a function. |
368 | Pressing 'enter' on the 'udhcpc_main' function, we're again presented | 367 | Pressing ``Enter`` on the ``udhcpc_main`` function, we're again presented |
369 | with a menu: | 368 | with a menu: |
370 | 369 | ||
371 | .. image:: figures/perf-wget-busybox-annotate-menu.png | 370 | .. image:: figures/perf-wget-busybox-annotate-menu.png |
372 | :align: center | 371 | :align: center |
373 | :width: 70% | 372 | :width: 70% |
374 | 373 | ||
375 | Selecting 'Annotate udhcpc_main', we get a detailed listing of | 374 | Selecting ``Annotate udhcpc_main``, we get a detailed listing of |
376 | percentages by instruction for the udhcpc_main function. From the | 375 | percentages by instruction for the ``udhcpc_main`` function. From the |
377 | display, we can see that over 50% of the time spent in this function is | 376 | display, we can see that over 50% of the time spent in this function is |
378 | taken up by a couple tests and the move of a constant (1) to a register: | 377 | taken up by a couple tests and the move of a constant (1) to a register: |
379 | 378 | ||
@@ -382,7 +381,7 @@ taken up by a couple tests and the move of a constant (1) to a register: | |||
382 | :width: 70% | 381 | :width: 70% |
383 | 382 | ||
384 | As a segue into tracing, let's try another profile using a different | 383 | As a segue into tracing, let's try another profile using a different |
385 | counter, something other than the default 'cycles'. | 384 | counter, something other than the default ``cycles``. |
386 | 385 | ||
387 | The tracing and profiling infrastructure in Linux has become unified in | 386 | The tracing and profiling infrastructure in Linux has become unified in |
388 | a way that allows us to use the same tool with a completely different | 387 | a way that allows us to use the same tool with a completely different |
@@ -392,7 +391,7 @@ traditional tools can also make use of the expanded possibilities now | |||
392 | available to them, and in some cases have, as mentioned previously). | 391 | available to them, and in some cases have, as mentioned previously). |
393 | 392 | ||
394 | We can get a list of the available events that can be used to profile a | 393 | We can get a list of the available events that can be used to profile a |
395 | workload via 'perf list':: | 394 | workload via ``perf list``:: |
396 | 395 | ||
397 | root@crownbay:~# perf list | 396 | root@crownbay:~# perf list |
398 | 397 | ||
@@ -528,13 +527,13 @@ workload via 'perf list':: | |||
528 | .. admonition:: Tying it Together | 527 | .. admonition:: Tying it Together |
529 | 528 | ||
530 | These are exactly the same set of events defined by the trace event | 529 | These are exactly the same set of events defined by the trace event |
531 | subsystem and exposed by ftrace/tracecmd/kernelshark as files in | 530 | subsystem and exposed by ftrace / tracecmd / kernelshark as files in |
532 | /sys/kernel/debug/tracing/events, by SystemTap as | 531 | ``/sys/kernel/debug/tracing/events``, by SystemTap as |
533 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. | 532 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. |
534 | 533 | ||
535 | Only a subset of these would be of interest to us when looking at this | 534 | Only a subset of these would be of interest to us when looking at this |
536 | workload, so let's choose the most likely subsystems (identified by the | 535 | workload, so let's choose the most likely subsystems (identified by the |
537 | string before the colon in the Tracepoint events) and do a 'perf stat' | 536 | string before the colon in the Tracepoint events) and do a ``perf stat`` |
538 | run using only those wildcarded subsystems:: | 537 | run using only those wildcarded subsystems:: |
539 | 538 | ||
540 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 539 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
@@ -607,8 +606,8 @@ and tell perf to do a profile using it as the sampling event:: | |||
607 | 606 | ||
608 | The screenshot above shows the results of running a profile using | 607 | The screenshot above shows the results of running a profile using |
609 | sched:sched_switch tracepoint, which shows the relative costs of various | 608 | sched:sched_switch tracepoint, which shows the relative costs of various |
610 | paths to sched_wakeup (note that sched_wakeup is the name of the | 609 | paths to ``sched_wakeup`` (note that ``sched_wakeup`` is the name of the |
611 | tracepoint --- it's actually defined just inside ttwu_do_wakeup(), which | 610 | tracepoint --- it's actually defined just inside ``ttwu_do_wakeup()``, which |
612 | accounts for the function name actually displayed in the profile: | 611 | accounts for the function name actually displayed in the profile: |
613 | 612 | ||
614 | .. code-block:: c | 613 | .. code-block:: c |
@@ -631,10 +630,10 @@ receive paths that presumably end up waking up wget (busybox) when | |||
631 | network data is ready. | 630 | network data is ready. |
632 | 631 | ||
633 | Note that because tracepoints are normally used for tracing, the default | 632 | Note that because tracepoints are normally used for tracing, the default |
634 | sampling period for tracepoints is 1 i.e. for tracepoints perf will | 633 | sampling period for tracepoints is ``1`` i.e. for tracepoints perf will |
635 | sample on every event occurrence (this can be changed using the -c | 634 | sample on every event occurrence (this can be changed using the ``-c`` |
636 | option). This is in contrast to hardware counters such as for example | 635 | option). This is in contrast to hardware counters such as for example |
637 | the default 'cycles' hardware counter used for normal profiling, where | 636 | the default ``cycles`` hardware counter used for normal profiling, where |
638 | sampling periods are much higher (in the thousands) because profiling | 637 | sampling periods are much higher (in the thousands) because profiling |
639 | should have as low an overhead as possible and sampling on every cycle | 638 | should have as low an overhead as possible and sampling on every cycle |
640 | would be prohibitively expensive. | 639 | would be prohibitively expensive. |
@@ -645,10 +644,10 @@ Using perf to do Basic Tracing | |||
645 | Profiling is a great tool for solving many problems or for getting a | 644 | Profiling is a great tool for solving many problems or for getting a |
646 | high-level view of what's going on with a workload or across the system. | 645 | high-level view of what's going on with a workload or across the system. |
647 | It is however by definition an approximation, as suggested by the most | 646 | It is however by definition an approximation, as suggested by the most |
648 | prominent word associated with it, 'sampling'. On the one hand, it | 647 | prominent word associated with it, ``sampling``. On the one hand, it |
649 | allows a representative picture of what's going on in the system to be | 648 | allows a representative picture of what's going on in the system to be |
650 | cheaply taken, but on the other hand, that cheapness limits its utility | 649 | cheaply taken, but on the other hand, that cheapness limits its utility |
651 | when that data suggests a need to 'dive down' more deeply to discover | 650 | when that data suggests a need to "dive down" more deeply to discover |
652 | what's really going on. In such cases, the only way to see what's really | 651 | what's really going on. In such cases, the only way to see what's really |
653 | going on is to be able to look at (or summarize more intelligently) the | 652 | going on is to be able to look at (or summarize more intelligently) the |
654 | individual steps that go into the higher-level behavior exposed by the | 653 | individual steps that go into the higher-level behavior exposed by the |
@@ -661,7 +660,7 @@ applicable to our workload:: | |||
661 | -e syscalls:sys_enter_read -e syscalls:sys_exit_read -e syscalls:sys_enter_write -e syscalls:sys_exit_write | 660 | -e syscalls:sys_enter_read -e syscalls:sys_exit_read -e syscalls:sys_enter_write -e syscalls:sys_exit_write |
662 | wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 661 | wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
663 | 662 | ||
664 | We can look at the raw trace output using 'perf script' with no | 663 | We can look at the raw trace output using ``perf script`` with no |
665 | arguments:: | 664 | arguments:: |
666 | 665 | ||
667 | root@crownbay:~# perf script | 666 | root@crownbay:~# perf script |
@@ -692,7 +691,7 @@ arguments:: | |||
692 | This gives us a detailed timestamped sequence of events that occurred within the | 691 | This gives us a detailed timestamped sequence of events that occurred within the |
693 | workload with respect to those events. | 692 | workload with respect to those events. |
694 | 693 | ||
695 | In many ways, profiling can be viewed as a subset of tracing - | 694 | In many ways, profiling can be viewed as a subset of tracing --- |
696 | theoretically, if you have a set of trace events that's sufficient to | 695 | theoretically, if you have a set of trace events that's sufficient to |
697 | capture all the important aspects of a workload, you can derive any of | 696 | capture all the important aspects of a workload, you can derive any of |
698 | the results or views that a profiling run can. | 697 | the results or views that a profiling run can. |
@@ -717,7 +716,7 @@ problem at hand. For example, if we wanted to make use of a 'counter' | |||
717 | that maps to the value of the time difference between when a process was | 716 | that maps to the value of the time difference between when a process was |
718 | scheduled to run on a processor and the time it actually ran, we | 717 | scheduled to run on a processor and the time it actually ran, we |
719 | wouldn't expect such a counter to exist on its own, but we could derive | 718 | wouldn't expect such a counter to exist on its own, but we could derive |
720 | one called say 'wakeup_latency' and use it to extract a useful view of | 719 | one called say ``wakeup_latency`` and use it to extract a useful view of |
721 | that metric from trace data. Likewise, we really can't figure out from | 720 | that metric from trace data. Likewise, we really can't figure out from |
722 | standard profiling tools how much data every process on the system reads | 721 | standard profiling tools how much data every process on the system reads |
723 | and writes, along with how many of those reads and writes fail | 722 | and writes, along with how many of those reads and writes fail |
@@ -726,7 +725,7 @@ right tools easily extract and present that information, but we'd need | |||
726 | something other than pre-canned profiling tools to do that. | 725 | something other than pre-canned profiling tools to do that. |
727 | 726 | ||
728 | Luckily, there is a general-purpose way to handle such needs, called | 727 | Luckily, there is a general-purpose way to handle such needs, called |
729 | 'programming languages'. Making programming languages easily available | 728 | "programming languages". Making programming languages easily available |
730 | to apply to such problems given the specific format of data is called a | 729 | to apply to such problems given the specific format of data is called a |
731 | 'programming language binding' for that data and language. Perf supports | 730 | 'programming language binding' for that data and language. Perf supports |
732 | two programming language bindings, one for Python and one for Perl. | 731 | two programming language bindings, one for Python and one for Perl. |
@@ -744,15 +743,15 @@ two programming language bindings, one for Python and one for Perl. | |||
744 | created an elaborate compiler-based machinery to translate its | 743 | created an elaborate compiler-based machinery to translate its |
745 | language into kernel modules written in C. | 744 | language into kernel modules written in C. |
746 | 745 | ||
747 | Now that we have the trace data in perf.data, we can use 'perf script | 746 | Now that we have the trace data in ``perf.data``, we can use ``perf script |
748 | -g' to generate a skeleton script with handlers for the read/write | 747 | -g`` to generate a skeleton script with handlers for the read / write |
749 | entry/exit events we recorded:: | 748 | entry / exit events we recorded:: |
750 | 749 | ||
751 | root@crownbay:~# perf script -g python | 750 | root@crownbay:~# perf script -g python |
752 | generated Python script: perf-script.py | 751 | generated Python script: perf-script.py |
753 | 752 | ||
754 | The skeleton script simply creates a Python function for each event type in the | 753 | The skeleton script simply creates a Python function for each event type in the |
755 | perf.data file. The body of each function simply prints the event name along | 754 | ``perf.data`` file. The body of each function simply prints the event name along |
756 | with its parameters. For example: | 755 | with its parameters. For example: |
757 | 756 | ||
758 | .. code-block:: python | 757 | .. code-block:: python |
@@ -766,7 +765,7 @@ with its parameters. For example: | |||
766 | print "skbaddr=%u, len=%u, name=%s\n" % (skbaddr, len, name), | 765 | print "skbaddr=%u, len=%u, name=%s\n" % (skbaddr, len, name), |
767 | 766 | ||
768 | We can run that script directly to print all of the events contained in the | 767 | We can run that script directly to print all of the events contained in the |
769 | perf.data file:: | 768 | ``perf.data`` file:: |
770 | 769 | ||
771 | root@crownbay:~# perf script -s perf-script.py | 770 | root@crownbay:~# perf script -s perf-script.py |
772 | 771 | ||
@@ -795,8 +794,8 @@ perf.data file:: | |||
795 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 | 794 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 |
796 | 795 | ||
797 | That in itself isn't very useful; after all, we can accomplish pretty much the | 796 | That in itself isn't very useful; after all, we can accomplish pretty much the |
798 | same thing by simply running 'perf script' without arguments in the same | 797 | same thing by simply running ``perf script`` without arguments in the same |
799 | directory as the perf.data file. | 798 | directory as the ``perf.data`` file. |
800 | 799 | ||
801 | We can however replace the print statements in the generated function | 800 | We can however replace the print statements in the generated function |
802 | bodies with whatever we want, and thereby make it infinitely more | 801 | bodies with whatever we want, and thereby make it infinitely more |
@@ -817,8 +816,8 @@ event. For example: | |||
817 | 816 | ||
818 | Each event handler function in the generated code | 817 | Each event handler function in the generated code |
819 | is modified to do this. For convenience, we define a common function | 818 | is modified to do this. For convenience, we define a common function |
820 | called inc_counts() that each handler calls; inc_counts() simply tallies | 819 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` simply tallies |
821 | a count for each event using the 'counts' hash, which is a specialized | 820 | a count for each event using the ``counts`` hash, which is a specialized |
822 | hash function that does Perl-like autovivification, a capability that's | 821 | hash function that does Perl-like autovivification, a capability that's |
823 | extremely useful for kinds of multi-level aggregation commonly used in | 822 | extremely useful for kinds of multi-level aggregation commonly used in |
824 | processing traces (see perf's documentation on the Python language | 823 | processing traces (see perf's documentation on the Python language |
@@ -836,7 +835,7 @@ binding for details): | |||
836 | 835 | ||
837 | Finally, at the end of the trace processing run, we want to print the | 836 | Finally, at the end of the trace processing run, we want to print the |
838 | result of all the per-event tallies. For that, we use the special | 837 | result of all the per-event tallies. For that, we use the special |
839 | 'trace_end()' function: | 838 | ``trace_end()`` function: |
840 | 839 | ||
841 | .. code-block:: python | 840 | .. code-block:: python |
842 | 841 | ||
@@ -865,7 +864,7 @@ The end result is a summary of all the events recorded in the trace:: | |||
865 | syscalls__sys_exit_write 8990 | 864 | syscalls__sys_exit_write 8990 |
866 | 865 | ||
867 | Note that this is | 866 | Note that this is |
868 | pretty much exactly the same information we get from 'perf stat', which | 867 | pretty much exactly the same information we get from ``perf stat``, which |
869 | goes a little way to support the idea mentioned previously that given | 868 | goes a little way to support the idea mentioned previously that given |
870 | the right kind of trace data, higher-level profiling-type summaries can | 869 | the right kind of trace data, higher-level profiling-type summaries can |
871 | be derived from it. | 870 | be derived from it. |
@@ -878,29 +877,29 @@ System-Wide Tracing and Profiling | |||
878 | 877 | ||
879 | The examples so far have focused on tracing a particular program or | 878 | The examples so far have focused on tracing a particular program or |
880 | workload --- in other words, every profiling run has specified the program | 879 | workload --- in other words, every profiling run has specified the program |
881 | to profile in the command-line e.g. 'perf record wget ...'. | 880 | to profile in the command-line e.g. ``perf record wget ...``. |
882 | 881 | ||
883 | It's also possible, and more interesting in many cases, to run a | 882 | It's also possible, and more interesting in many cases, to run a |
884 | system-wide profile or trace while running the workload in a separate | 883 | system-wide profile or trace while running the workload in a separate |
885 | shell. | 884 | shell. |
886 | 885 | ||
887 | To do system-wide profiling or tracing, you typically use the -a flag to | 886 | To do system-wide profiling or tracing, you typically use the ``-a`` flag to |
888 | 'perf record'. | 887 | ``perf record``. |
889 | 888 | ||
890 | To demonstrate this, open up one window and start the profile using the | 889 | To demonstrate this, open up one window and start the profile using the |
891 | -a flag (press Ctrl-C to stop tracing):: | 890 | ``-a`` flag (press ``Ctrl-C`` to stop tracing):: |
892 | 891 | ||
893 | root@crownbay:~# perf record -g -a | 892 | root@crownbay:~# perf record -g -a |
894 | ^C[ perf record: Woken up 6 times to write data ] | 893 | ^C[ perf record: Woken up 6 times to write data ] |
895 | [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ] | 894 | [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ] |
896 | 895 | ||
897 | In another window, run the wget test:: | 896 | In another window, run the ``wget`` test:: |
898 | 897 | ||
899 | root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 898 | root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
900 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 899 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
901 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 900 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
902 | 901 | ||
903 | Here we see entries not only for our wget load, but for | 902 | Here we see entries not only for our ``wget`` load, but for |
904 | other processes running on the system as well: | 903 | other processes running on the system as well: |
905 | 904 | ||
906 | .. image:: figures/perf-systemwide.png | 905 | .. image:: figures/perf-systemwide.png |
@@ -909,12 +908,12 @@ other processes running on the system as well: | |||
909 | 908 | ||
910 | In the snapshot above, we can see callchains that originate in libc, and | 909 | In the snapshot above, we can see callchains that originate in libc, and |
911 | a callchain from Xorg that demonstrates that we're using a proprietary X | 910 | a callchain from Xorg that demonstrates that we're using a proprietary X |
912 | driver in userspace (notice the presence of 'PVR' and some other | 911 | driver in userspace (notice the presence of ``PVR`` and some other |
913 | unresolvable symbols in the expanded Xorg callchain). | 912 | unresolvable symbols in the expanded Xorg callchain). |
914 | 913 | ||
915 | Note also that we have both kernel and userspace entries in the above | 914 | Note also that we have both kernel and userspace entries in the above |
916 | snapshot. We can also tell perf to focus on userspace but providing a | 915 | snapshot. We can also tell perf to focus on userspace but providing a |
917 | modifier, in this case 'u', to the 'cycles' hardware counter when we | 916 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we |
918 | record a profile:: | 917 | record a profile:: |
919 | 918 | ||
920 | root@crownbay:~# perf record -g -a -e cycles:u | 919 | root@crownbay:~# perf record -g -a -e cycles:u |
@@ -925,25 +924,25 @@ record a profile:: | |||
925 | :align: center | 924 | :align: center |
926 | :width: 70% | 925 | :width: 70% |
927 | 926 | ||
928 | Notice in the screenshot above, we see only userspace entries ([.]) | 927 | Notice in the screenshot above, we see only userspace entries (``[.]``) |
929 | 928 | ||
930 | Finally, we can press 'enter' on a leaf node and select the 'Zoom into | 929 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into |
931 | DSO' menu item to show only entries associated with a specific DSO. In | 930 | DSO`` menu item to show only entries associated with a specific DSO. In |
932 | the screenshot below, we've zoomed into the 'libc' DSO which shows all | 931 | the screenshot below, we've zoomed into the ``libc`` DSO which shows all |
933 | the entries associated with the libc-xxx.so DSO. | 932 | the entries associated with the ``libc-xxx.so`` DSO. |
934 | 933 | ||
935 | .. image:: figures/perf-systemwide-libc.png | 934 | .. image:: figures/perf-systemwide-libc.png |
936 | :align: center | 935 | :align: center |
937 | :width: 70% | 936 | :width: 70% |
938 | 937 | ||
939 | We can also use the system-wide -a switch to do system-wide tracing. | 938 | We can also use the system-wide ``-a`` switch to do system-wide tracing. |
940 | Here we'll trace a couple of scheduler events:: | 939 | Here we'll trace a couple of scheduler events:: |
941 | 940 | ||
942 | root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup | 941 | root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup |
943 | ^C[ perf record: Woken up 38 times to write data ] | 942 | ^C[ perf record: Woken up 38 times to write data ] |
944 | [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ] | 943 | [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ] |
945 | 944 | ||
946 | We can look at the raw output using 'perf script' with no arguments:: | 945 | We can look at the raw output using ``perf script`` with no arguments:: |
947 | 946 | ||
948 | root@crownbay:~# perf script | 947 | root@crownbay:~# perf script |
949 | 948 | ||
@@ -962,10 +961,10 @@ Filtering | |||
962 | ^^^^^^^^^ | 961 | ^^^^^^^^^ |
963 | 962 | ||
964 | Notice that there are a lot of events that don't really have anything to | 963 | Notice that there are a lot of events that don't really have anything to |
965 | do with what we're interested in, namely events that schedule 'perf' | 964 | do with what we're interested in, namely events that schedule ``perf`` |
966 | itself in and out or that wake perf up. We can get rid of those by using | 965 | itself in and out or that wake perf up. We can get rid of those by using |
967 | the '--filter' option --- for each event we specify using -e, we can add a | 966 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a |
968 | --filter after that to filter out trace events that contain fields with | 967 | ``--filter`` after that to filter out trace events that contain fields with |
969 | specific values:: | 968 | specific values:: |
970 | 969 | ||
971 | root@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != perf' -e sched:sched_wakeup --filter 'comm != perf' | 970 | root@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != perf' -e sched:sched_wakeup --filter 'comm != perf' |
@@ -991,16 +990,16 @@ specific values:: | |||
991 | kworker/0:3 1209 [000] 7932.326214: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 | 990 | kworker/0:3 1209 [000] 7932.326214: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 |
992 | 991 | ||
993 | In this case, we've filtered out all events that have | 992 | In this case, we've filtered out all events that have |
994 | 'perf' in their 'comm' or 'comm_prev' or 'comm_next' fields. Notice that | 993 | ``perf`` in their ``comm``, ``comm_prev`` or ``comm_next`` fields. Notice that |
995 | there are still events recorded for perf, but notice that those events | 994 | there are still events recorded for perf, but notice that those events |
996 | don't have values of 'perf' for the filtered fields. To completely | 995 | don't have values of ``perf`` for the filtered fields. To completely |
997 | filter out anything from perf will require a bit more work, but for the | 996 | filter out anything from perf will require a bit more work, but for the |
998 | purpose of demonstrating how to use filters, it's close enough. | 997 | purpose of demonstrating how to use filters, it's close enough. |
999 | 998 | ||
1000 | .. admonition:: Tying it Together | 999 | .. admonition:: Tying it Together |
1001 | 1000 | ||
1002 | These are exactly the same set of event filters defined by the trace | 1001 | These are exactly the same set of event filters defined by the trace |
1003 | event subsystem. See the ftrace/tracecmd/kernelshark section for more | 1002 | event subsystem. See the ftrace / tracecmd / kernelshark section for more |
1004 | discussion about these event filters. | 1003 | discussion about these event filters. |
1005 | 1004 | ||
1006 | .. admonition:: Tying it Together | 1005 | .. admonition:: Tying it Together |
@@ -1028,9 +1027,9 @@ Using Dynamic Tracepoints | |||
1028 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | 1027 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
1029 | 1028 | ||
1030 | perf isn't restricted to the fixed set of static tracepoints listed by | 1029 | perf isn't restricted to the fixed set of static tracepoints listed by |
1031 | 'perf list'. Users can also add their own 'dynamic' tracepoints anywhere | 1030 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere |
1032 | in the kernel. For instance, suppose we want to define our own | 1031 | in the kernel. For instance, suppose we want to define our own |
1033 | tracepoint on do_fork(). We can do that using the 'perf probe' perf | 1032 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf |
1034 | subcommand:: | 1033 | subcommand:: |
1035 | 1034 | ||
1036 | root@crownbay:~# perf probe do_fork | 1035 | root@crownbay:~# perf probe do_fork |
@@ -1042,8 +1041,8 @@ subcommand:: | |||
1042 | perf record -e probe:do_fork -aR sleep 1 | 1041 | perf record -e probe:do_fork -aR sleep 1 |
1043 | 1042 | ||
1044 | Adding a new tracepoint via | 1043 | Adding a new tracepoint via |
1045 | 'perf probe' results in an event with all the expected files and format | 1044 | ``perf probe`` results in an event with all the expected files and format |
1046 | in /sys/kernel/debug/tracing/events, just the same as for static | 1045 | in ``/sys/kernel/debug/tracing/events``, just the same as for static |
1047 | tracepoints (as discussed in more detail in the trace events subsystem | 1046 | tracepoints (as discussed in more detail in the trace events subsystem |
1048 | section:: | 1047 | section:: |
1049 | 1048 | ||
@@ -1076,7 +1075,7 @@ existence:: | |||
1076 | probe:do_fork (on do_fork) | 1075 | probe:do_fork (on do_fork) |
1077 | probe:schedule (on schedule) | 1076 | probe:schedule (on schedule) |
1078 | 1077 | ||
1079 | Let's record system-wide ('sleep 30' is a | 1078 | Let's record system-wide (``sleep 30`` is a |
1080 | trick for recording system-wide but basically do nothing and then wake | 1079 | trick for recording system-wide but basically do nothing and then wake |
1081 | up after 30 seconds):: | 1080 | up after 30 seconds):: |
1082 | 1081 | ||
@@ -1084,7 +1083,7 @@ up after 30 seconds):: | |||
1084 | [ perf record: Woken up 1 times to write data ] | 1083 | [ perf record: Woken up 1 times to write data ] |
1085 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] | 1084 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] |
1086 | 1085 | ||
1087 | Using 'perf script' we can see each do_fork event that fired:: | 1086 | Using ``perf script`` we can see each do_fork event that fired:: |
1088 | 1087 | ||
1089 | root@crownbay:~# perf script | 1088 | root@crownbay:~# perf script |
1090 | 1089 | ||
@@ -1125,7 +1124,7 @@ Using 'perf script' we can see each do_fork event that fired:: | |||
1125 | matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460) | 1124 | matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460) |
1126 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) | 1125 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) |
1127 | 1126 | ||
1128 | And using 'perf report' on the same file, we can see the | 1127 | And using ``perf report`` on the same file, we can see the |
1129 | callgraphs from starting a few programs during those 30 seconds: | 1128 | callgraphs from starting a few programs during those 30 seconds: |
1130 | 1129 | ||
1131 | .. image:: figures/perf-probe-do_fork-profile.png | 1130 | .. image:: figures/perf-probe-do_fork-profile.png |
@@ -1170,7 +1169,7 @@ section can be found here: | |||
1170 | - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__. | 1169 | - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__. |
1171 | 1170 | ||
1172 | Normally, you should be able to invoke the man pages via perf itself | 1171 | Normally, you should be able to invoke the man pages via perf itself |
1173 | e.g. 'perf help' or 'perf help record'. | 1172 | e.g. ``perf help`` or ``perf help record``. |
1174 | 1173 | ||
1175 | To have the perf manpages installed on your target, modify your | 1174 | To have the perf manpages installed on your target, modify your |
1176 | configuration as follows:: | 1175 | configuration as follows:: |
@@ -1179,7 +1178,7 @@ configuration as follows:: | |||
1179 | DISTRO_FEATURES:append = " api-documentation" | 1178 | DISTRO_FEATURES:append = " api-documentation" |
1180 | 1179 | ||
1181 | The man pages in text form, along with some other files, such as a set | 1180 | The man pages in text form, along with some other files, such as a set |
1182 | of examples, can also be found in the 'perf' directory of the kernel tree:: | 1181 | of examples, can also be found in the ``perf`` directory of the kernel tree:: |
1183 | 1182 | ||
1184 | tools/perf/Documentation | 1183 | tools/perf/Documentation |
1185 | 1184 | ||
@@ -1190,7 +1189,7 @@ Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ | |||
1190 | ftrace | 1189 | ftrace |
1191 | ====== | 1190 | ====== |
1192 | 1191 | ||
1193 | 'ftrace' literally refers to the 'ftrace function tracer' but in reality | 1192 | "ftrace" literally refers to the "ftrace function tracer" but in reality |
1194 | this encompasses a number of related tracers along with the | 1193 | this encompasses a number of related tracers along with the |
1195 | infrastructure that they all make use of. | 1194 | infrastructure that they all make use of. |
1196 | 1195 | ||
@@ -1204,16 +1203,16 @@ ftrace, trace-cmd, and kernelshark run on the target system, and are | |||
1204 | ready to go out-of-the-box --- no additional setup is necessary. For the | 1203 | ready to go out-of-the-box --- no additional setup is necessary. For the |
1205 | rest of this section we assume you've ssh'ed to the host and will be | 1204 | rest of this section we assume you've ssh'ed to the host and will be |
1206 | running ftrace on the target. kernelshark is a GUI application and if | 1205 | running ftrace on the target. kernelshark is a GUI application and if |
1207 | you use the '-X' option to ssh you can have the kernelshark GUI run on | 1206 | you use the ``-X`` option to ssh you can have the kernelshark GUI run on |
1208 | the target but display remotely on the host if you want. | 1207 | the target but display remotely on the host if you want. |
1209 | 1208 | ||
1210 | Basic ftrace usage | 1209 | Basic ftrace usage |
1211 | ------------------ | 1210 | ------------------ |
1212 | 1211 | ||
1213 | 'ftrace' essentially refers to everything included in the /tracing | 1212 | "ftrace" essentially refers to everything included in the ``/tracing`` |
1214 | directory of the mounted debugfs filesystem (Yocto follows the standard | 1213 | directory of the mounted debugfs filesystem (Yocto follows the standard |
1215 | convention and mounts it at /sys/kernel/debug). Here's a listing of all | 1214 | convention and mounts it at ``/sys/kernel/debug``). Here's a listing of all |
1216 | the files found in /sys/kernel/debug/tracing on a Yocto system:: | 1215 | the files found in ``/sys/kernel/debug/tracing`` on a Yocto system:: |
1217 | 1216 | ||
1218 | root@sugarbay:/sys/kernel/debug/tracing# ls | 1217 | root@sugarbay:/sys/kernel/debug/tracing# ls |
1219 | README kprobe_events trace | 1218 | README kprobe_events trace |
@@ -1229,7 +1228,7 @@ the files found in /sys/kernel/debug/tracing on a Yocto system:: | |||
1229 | free_buffer set_graph_function | 1228 | free_buffer set_graph_function |
1230 | 1229 | ||
1231 | The files listed above are used for various purposes | 1230 | The files listed above are used for various purposes |
1232 | - some relate directly to the tracers themselves, others are used to set | 1231 | --- some relate directly to the tracers themselves, others are used to set |
1233 | tracing options, and yet others actually contain the tracing output when | 1232 | tracing options, and yet others actually contain the tracing output when |
1234 | a tracer is in effect. Some of the functions can be guessed from their | 1233 | a tracer is in effect. Some of the functions can be guessed from their |
1235 | names, others need explanation; in any case, we'll cover some of the | 1234 | names, others need explanation; in any case, we'll cover some of the |
@@ -1238,30 +1237,30 @@ the ftrace documentation. | |||
1238 | 1237 | ||
1239 | We'll start by looking at some of the available built-in tracers. | 1238 | We'll start by looking at some of the available built-in tracers. |
1240 | 1239 | ||
1241 | cat'ing the 'available_tracers' file lists the set of available tracers:: | 1240 | cat'ing the ``available_tracers`` file lists the set of available tracers:: |
1242 | 1241 | ||
1243 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers | 1242 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers |
1244 | blk function_graph function nop | 1243 | blk function_graph function nop |
1245 | 1244 | ||
1246 | The 'current_tracer' file contains the tracer currently in effect:: | 1245 | The ``current_tracer`` file contains the tracer currently in effect:: |
1247 | 1246 | ||
1248 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1247 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1249 | nop | 1248 | nop |
1250 | 1249 | ||
1251 | The above listing of current_tracer shows that the | 1250 | The above listing of current_tracer shows that the |
1252 | 'nop' tracer is in effect, which is just another way of saying that | 1251 | ``nop`` tracer is in effect, which is just another way of saying that |
1253 | there's actually no tracer currently in effect. | 1252 | there's actually no tracer currently in effect. |
1254 | 1253 | ||
1255 | echo'ing one of the available_tracers into current_tracer makes the | 1254 | echo'ing one of the available_tracers into ``current_tracer`` makes the |
1256 | specified tracer the current tracer:: | 1255 | specified tracer the current tracer:: |
1257 | 1256 | ||
1258 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer | 1257 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer |
1259 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1258 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1260 | function | 1259 | function |
1261 | 1260 | ||
1262 | The above sets the current tracer to be the 'function tracer'. This tracer | 1261 | The above sets the current tracer to be the ``function`` tracer. This tracer |
1263 | traces every function call in the kernel and makes it available as the | 1262 | traces every function call in the kernel and makes it available as the |
1264 | contents of the 'trace' file. Reading the 'trace' file lists the | 1263 | contents of the ``trace`` file. Reading the ``trace`` file lists the |
1265 | currently buffered function calls that have been traced by the function | 1264 | currently buffered function calls that have been traced by the function |
1266 | tracer:: | 1265 | tracer:: |
1267 | 1266 | ||
@@ -1318,11 +1317,11 @@ great way to learn about how the kernel code works in a dynamic sense. | |||
1318 | .. admonition:: Tying it Together | 1317 | .. admonition:: Tying it Together |
1319 | 1318 | ||
1320 | The ftrace function tracer is also available from within perf, as the | 1319 | The ftrace function tracer is also available from within perf, as the |
1321 | ftrace:function tracepoint. | 1320 | ``ftrace:function`` tracepoint. |
1322 | 1321 | ||
1323 | It is a little more difficult to follow the call chains than it needs to | 1322 | It is a little more difficult to follow the call chains than it needs to |
1324 | be --- luckily there's a variant of the function tracer that displays the | 1323 | be --- luckily there's a variant of the function tracer that displays the |
1325 | callchains explicitly, called the 'function_graph' tracer:: | 1324 | callchains explicitly, called the ``function_graph`` tracer:: |
1326 | 1325 | ||
1327 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer | 1326 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer |
1328 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1327 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1437,7 +1436,7 @@ callchains explicitly, called the 'function_graph' tracer:: | |||
1437 | 3) + 13.784 us | } | 1436 | 3) + 13.784 us | } |
1438 | 3) | sys_ioctl() { | 1437 | 3) | sys_ioctl() { |
1439 | 1438 | ||
1440 | As you can see, the function_graph display is much easier | 1439 | As you can see, the ``function_graph`` display is much easier |
1441 | to follow. Also note that in addition to the function calls and | 1440 | to follow. Also note that in addition to the function calls and |
1442 | associated braces, other events such as scheduler events are displayed | 1441 | associated braces, other events such as scheduler events are displayed |
1443 | in context. In fact, you can freely include any tracepoint available in | 1442 | in context. In fact, you can freely include any tracepoint available in |
@@ -1455,9 +1454,9 @@ The 'trace events' Subsystem | |||
1455 | ---------------------------- | 1454 | ---------------------------- |
1456 | 1455 | ||
1457 | One especially important directory contained within the | 1456 | One especially important directory contained within the |
1458 | /sys/kernel/debug/tracing directory is the 'events' subdirectory, which | 1457 | ``/sys/kernel/debug/tracing`` directory is the ``events`` subdirectory, which |
1459 | contains representations of every tracepoint in the system. Listing out | 1458 | contains representations of every tracepoint in the system. Listing out |
1460 | the contents of the 'events' subdirectory, we see mainly another set of | 1459 | the contents of the ``events`` subdirectory, we see mainly another set of |
1461 | subdirectories:: | 1460 | subdirectories:: |
1462 | 1461 | ||
1463 | root@sugarbay:/sys/kernel/debug/tracing# cd events | 1462 | root@sugarbay:/sys/kernel/debug/tracing# cd events |
@@ -1505,9 +1504,9 @@ subdirectories:: | |||
1505 | drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback | 1504 | drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback |
1506 | 1505 | ||
1507 | Each one of these subdirectories | 1506 | Each one of these subdirectories |
1508 | corresponds to a 'subsystem' and contains yet again more subdirectories, | 1507 | corresponds to a "subsystem" and contains yet again more subdirectories, |
1509 | each one of those finally corresponding to a tracepoint. For example, | 1508 | each one of those finally corresponding to a tracepoint. For example, |
1510 | here are the contents of the 'kmem' subsystem:: | 1509 | here are the contents of the ``kmem`` subsystem:: |
1511 | 1510 | ||
1512 | root@sugarbay:/sys/kernel/debug/tracing/events# cd kmem | 1511 | root@sugarbay:/sys/kernel/debug/tracing/events# cd kmem |
1513 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al | 1512 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al |
@@ -1529,7 +1528,7 @@ here are the contents of the 'kmem' subsystem:: | |||
1529 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain | 1528 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain |
1530 | 1529 | ||
1531 | Let's see what's inside the subdirectory for a | 1530 | Let's see what's inside the subdirectory for a |
1532 | specific tracepoint, in this case the one for kmalloc:: | 1531 | specific tracepoint, in this case the one for ``kmalloc``:: |
1533 | 1532 | ||
1534 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# cd kmalloc | 1533 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# cd kmalloc |
1535 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al | 1534 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al |
@@ -1540,12 +1539,12 @@ specific tracepoint, in this case the one for kmalloc:: | |||
1540 | -r--r--r-- 1 root root 0 Nov 14 23:19 format | 1539 | -r--r--r-- 1 root root 0 Nov 14 23:19 format |
1541 | -r--r--r-- 1 root root 0 Nov 14 23:19 id | 1540 | -r--r--r-- 1 root root 0 Nov 14 23:19 id |
1542 | 1541 | ||
1543 | The 'format' file for the | 1542 | The ``format`` file for the |
1544 | tracepoint describes the event in memory, which is used by the various | 1543 | tracepoint describes the event in memory, which is used by the various |
1545 | tracing tools that now make use of these tracepoint to parse the event | 1544 | tracing tools that now make use of these tracepoint to parse the event |
1546 | and make sense of it, along with a 'print fmt' field that allows tools | 1545 | and make sense of it, along with a ``print fmt`` field that allows tools |
1547 | like ftrace to display the event as text. Here's what the format of the | 1546 | like ftrace to display the event as text. Here's what the format of the |
1548 | kmalloc event looks like:: | 1547 | ``kmalloc`` event looks like:: |
1549 | 1548 | ||
1550 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format | 1549 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format |
1551 | name: kmalloc | 1550 | name: kmalloc |
@@ -1580,11 +1579,11 @@ kmalloc event looks like:: | |||
1580 | long)(( gfp_t)0x08u), "GFP_MOVABLE"}, {(unsigned long)(( gfp_t)0), "GFP_NOTRACK"}, {(unsigned long)(( gfp_t)0x400000u), "GFP_NO_KSWAPD"}, | 1579 | long)(( gfp_t)0x08u), "GFP_MOVABLE"}, {(unsigned long)(( gfp_t)0), "GFP_NOTRACK"}, {(unsigned long)(( gfp_t)0x400000u), "GFP_NO_KSWAPD"}, |
1581 | {(unsigned long)(( gfp_t)0x800000u), "GFP_OTHER_NODE"} ) : "GFP_NOWAIT" | 1580 | {(unsigned long)(( gfp_t)0x800000u), "GFP_OTHER_NODE"} ) : "GFP_NOWAIT" |
1582 | 1581 | ||
1583 | The 'enable' file | 1582 | The ``enable`` file |
1584 | in the tracepoint directory is what allows the user (or tools such as | 1583 | in the tracepoint directory is what allows the user (or tools such as |
1585 | trace-cmd) to actually turn the tracepoint on and off. When enabled, the | 1584 | ``trace-cmd``) to actually turn the tracepoint on and off. When enabled, the |
1586 | corresponding tracepoint will start appearing in the ftrace 'trace' file | 1585 | corresponding tracepoint will start appearing in the ftrace ``trace`` file |
1587 | described previously. For example, this turns on the kmalloc tracepoint:: | 1586 | described previously. For example, this turns on the ``kmalloc`` tracepoint:: |
1588 | 1587 | ||
1589 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 1 > enable | 1588 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 1 > enable |
1590 | 1589 | ||
@@ -1596,7 +1595,7 @@ events in the output buffer:: | |||
1596 | root@sugarbay:/sys/kernel/debug/tracing# echo nop > current_tracer | 1595 | root@sugarbay:/sys/kernel/debug/tracing# echo nop > current_tracer |
1597 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on | 1596 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on |
1598 | 1597 | ||
1599 | Now, if we look at the 'trace' file, we see nothing | 1598 | Now, if we look at the ``trace`` file, we see nothing |
1600 | but the kmalloc events we just turned on:: | 1599 | but the kmalloc events we just turned on:: |
1601 | 1600 | ||
1602 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1601 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1643,17 +1642,17 @@ but the kmalloc events we just turned on:: | |||
1643 | <idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | 1642 | <idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC |
1644 | matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT | 1643 | matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT |
1645 | 1644 | ||
1646 | To again disable the kmalloc event, we need to send 0 to the enable file:: | 1645 | To again disable the ``kmalloc`` event, we need to send ``0`` to the ``enable`` file:: |
1647 | 1646 | ||
1648 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 0 > enable | 1647 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 0 > enable |
1649 | 1648 | ||
1650 | You can enable any number of events or complete subsystems (by | 1649 | You can enable any number of events or complete subsystems (by |
1651 | using the 'enable' file in the subsystem directory) and get an | 1650 | using the ``enable`` file in the subsystem directory) and get an |
1652 | arbitrarily fine-grained idea of what's going on in the system by | 1651 | arbitrarily fine-grained idea of what's going on in the system by |
1653 | enabling as many of the appropriate tracepoints as applicable. | 1652 | enabling as many of the appropriate tracepoints as applicable. |
1654 | 1653 | ||
1655 | A number of the tools described in this HOWTO do just that, including | 1654 | A number of the tools described in this HOWTO do just that, including |
1656 | trace-cmd and kernelshark in the next section. | 1655 | ``trace-cmd`` and kernelshark in the next section. |
1657 | 1656 | ||
1658 | .. admonition:: Tying it Together | 1657 | .. admonition:: Tying it Together |
1659 | 1658 | ||
@@ -1661,27 +1660,27 @@ trace-cmd and kernelshark in the next section. | |||
1661 | ftrace, but by many of the other tools covered in this document and | 1660 | ftrace, but by many of the other tools covered in this document and |
1662 | they form a central point of integration for the various tracers | 1661 | they form a central point of integration for the various tracers |
1663 | available in Linux. They form a central part of the instrumentation | 1662 | available in Linux. They form a central part of the instrumentation |
1664 | for the following tools: perf, lttng, ftrace, blktrace and SystemTap | 1663 | for the following tools: perf, LTTng, ftrace, blktrace and SystemTap |
1665 | 1664 | ||
1666 | .. admonition:: Tying it Together | 1665 | .. admonition:: Tying it Together |
1667 | 1666 | ||
1668 | Eventually all the special-purpose tracers currently available in | 1667 | Eventually all the special-purpose tracers currently available in |
1669 | /sys/kernel/debug/tracing will be removed and replaced with | 1668 | ``/sys/kernel/debug/tracing`` will be removed and replaced with |
1670 | equivalent tracers based on the 'trace events' subsystem. | 1669 | equivalent tracers based on the "trace events" subsystem. |
1671 | 1670 | ||
1672 | trace-cmd/kernelshark | 1671 | trace-cmd / kernelshark |
1673 | --------------------- | 1672 | ----------------------- |
1674 | 1673 | ||
1675 | trace-cmd is essentially an extensive command-line 'wrapper' interface | 1674 | trace-cmd is essentially an extensive command-line "wrapper" interface |
1676 | that hides the details of all the individual files in | 1675 | that hides the details of all the individual files in |
1677 | /sys/kernel/debug/tracing, allowing users to specify specific particular | 1676 | ``/sys/kernel/debug/tracing``, allowing users to specify specific particular |
1678 | events within the /sys/kernel/debug/tracing/events/ subdirectory and to | 1677 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to |
1679 | collect traces and avoid having to deal with those details directly. | 1678 | collect traces and avoid having to deal with those details directly. |
1680 | 1679 | ||
1681 | As yet another layer on top of that, kernelshark provides a GUI that | 1680 | As yet another layer on top of that, kernelshark provides a GUI that |
1682 | allows users to start and stop traces and specify sets of events using | 1681 | allows users to start and stop traces and specify sets of events using |
1683 | an intuitive interface, and view the output as both trace events and as | 1682 | an intuitive interface, and view the output as both trace events and as |
1684 | a per-CPU graphical display. It directly uses 'trace-cmd' as the | 1683 | a per-CPU graphical display. It directly uses trace-cmd as the |
1685 | plumbing that accomplishes all that underneath the covers (and actually | 1684 | plumbing that accomplishes all that underneath the covers (and actually |
1686 | displays the trace-cmd command it uses, as we'll see). | 1685 | displays the trace-cmd command it uses, as we'll see). |
1687 | 1686 | ||
@@ -1689,7 +1688,7 @@ To start a trace using kernelshark, first start kernelshark:: | |||
1689 | 1688 | ||
1690 | root@sugarbay:~# kernelshark | 1689 | root@sugarbay:~# kernelshark |
1691 | 1690 | ||
1692 | Then bring up the 'Capture' dialog by | 1691 | Then bring up the ``Capture`` dialog by |
1693 | choosing from the kernelshark menu:: | 1692 | choosing from the kernelshark menu:: |
1694 | 1693 | ||
1695 | Capture | Record | 1694 | Capture | Record |
@@ -1707,10 +1706,10 @@ gets them for kernelshark. | |||
1707 | 1706 | ||
1708 | In the above screenshot, we've decided to explore the graphics subsystem | 1707 | In the above screenshot, we've decided to explore the graphics subsystem |
1709 | a bit and so have chosen to trace all the tracepoints contained within | 1708 | a bit and so have chosen to trace all the tracepoints contained within |
1710 | the 'i915' and 'drm' subsystems. | 1709 | the ``i915`` and ``drm`` subsystems. |
1711 | 1710 | ||
1712 | After doing that, we can start and stop the trace using the 'Run' and | 1711 | After doing that, we can start and stop the trace using the ``Run`` and |
1713 | 'Stop' button on the lower right corner of the dialog (the same button | 1712 | ``Stop`` button on the lower right corner of the dialog (the same button |
1714 | will turn into the 'Stop' button after the trace has started): | 1713 | will turn into the 'Stop' button after the trace has started): |
1715 | 1714 | ||
1716 | .. image:: figures/kernelshark-output-display.png | 1715 | .. image:: figures/kernelshark-output-display.png |
@@ -1721,7 +1720,7 @@ Notice that the right-hand pane shows the exact trace-cmd command-line | |||
1721 | that's used to run the trace, along with the results of the trace-cmd | 1720 | that's used to run the trace, along with the results of the trace-cmd |
1722 | run. | 1721 | run. |
1723 | 1722 | ||
1724 | Once the 'Stop' button is pressed, the graphical view magically fills up | 1723 | Once the ``Stop`` button is pressed, the graphical view magically fills up |
1725 | with a colorful per-cpu display of the trace data, along with the | 1724 | with a colorful per-cpu display of the trace data, along with the |
1726 | detailed event listing below that: | 1725 | detailed event listing below that: |
1727 | 1726 | ||
@@ -1729,8 +1728,8 @@ detailed event listing below that: | |||
1729 | :align: center | 1728 | :align: center |
1730 | :width: 70% | 1729 | :width: 70% |
1731 | 1730 | ||
1732 | Here's another example, this time a display resulting from tracing 'all | 1731 | Here's another example, this time a display resulting from tracing ``all |
1733 | events': | 1732 | events``: |
1734 | 1733 | ||
1735 | .. image:: figures/kernelshark-all.png | 1734 | .. image:: figures/kernelshark-all.png |
1736 | :align: center | 1735 | :align: center |
@@ -1779,12 +1778,12 @@ systemtap | |||
1779 | SystemTap is a system-wide script-based tracing and profiling tool. | 1778 | SystemTap is a system-wide script-based tracing and profiling tool. |
1780 | 1779 | ||
1781 | SystemTap scripts are C-like programs that are executed in the kernel to | 1780 | SystemTap scripts are C-like programs that are executed in the kernel to |
1782 | gather/print/aggregate data extracted from the context they end up being | 1781 | gather / print / aggregate data extracted from the context they end up being |
1783 | invoked under. | 1782 | invoked under. |
1784 | 1783 | ||
1785 | For example, this probe from the `SystemTap | 1784 | For example, this probe from the `SystemTap |
1786 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a | 1785 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a |
1787 | line every time any process on the system open()s a file. For each line, | 1786 | line every time any process on the system runs ``open()`` on a file. For each line, |
1788 | it prints the executable name of the program that opened the file, along | 1787 | it prints the executable name of the program that opened the file, along |
1789 | with its PID, and the name of the file it opened (or tried to open), | 1788 | with its PID, and the name of the file it opened (or tried to open), |
1790 | which it extracts from the open syscall's argstr. | 1789 | which it extracts from the open syscall's argstr. |
@@ -1809,19 +1808,19 @@ file containing the above text is trace_open.stp:: | |||
1809 | # stap trace_open.stp | 1808 | # stap trace_open.stp |
1810 | 1809 | ||
1811 | What systemtap does under the covers to run this probe is 1) parse and | 1810 | What systemtap does under the covers to run this probe is 1) parse and |
1812 | convert the probe to an equivalent 'C' form, 2) compile the 'C' form | 1811 | convert the probe to an equivalent "C" form, 2) compile the "C" form |
1813 | into a kernel module, 3) insert the module into the kernel, which arms | 1812 | into a kernel module, 3) insert the module into the kernel, which arms |
1814 | it, and 4) collect the data generated by the probe and display it to the | 1813 | it, and 4) collect the data generated by the probe and display it to the |
1815 | user. | 1814 | user. |
1816 | 1815 | ||
1817 | In order to accomplish steps 1 and 2, the 'stap' program needs access to | 1816 | In order to accomplish steps 1 and 2, the ``stap`` program needs access to |
1818 | the kernel build system that produced the kernel that the probed system | 1817 | the kernel build system that produced the kernel that the probed system |
1819 | is running. In the case of a typical embedded system (the 'target'), the | 1818 | is running. In the case of a typical embedded system (the "target"), the |
1820 | kernel build system unfortunately isn't typically part of the image | 1819 | kernel build system unfortunately isn't typically part of the image |
1821 | running on the target. It is normally available on the 'host' system | 1820 | running on the target. It is normally available on the "host" system |
1822 | that produced the target image however; in such cases, steps 1 and 2 are | 1821 | that produced the target image however; in such cases, steps 1 and 2 are |
1823 | executed on the host system, and steps 3 and 4 are executed on the | 1822 | executed on the host system, and steps 3 and 4 are executed on the |
1824 | target system, using only the systemtap 'runtime'. | 1823 | target system, using only the systemtap "runtime". |
1825 | 1824 | ||
1826 | The systemtap support in Yocto assumes that only steps 3 and 4 are run | 1825 | The systemtap support in Yocto assumes that only steps 3 and 4 are run |
1827 | on the target; it is possible to do everything on the target, but this | 1826 | on the target; it is possible to do everything on the target, but this |
@@ -1838,12 +1837,12 @@ systemtap Setup | |||
1838 | --------------- | 1837 | --------------- |
1839 | 1838 | ||
1840 | Those are a lot of steps and a lot of details, but fortunately Yocto | 1839 | Those are a lot of steps and a lot of details, but fortunately Yocto |
1841 | includes a script called 'crosstap' that will take care of those | 1840 | includes a script called ``crosstap`` that will take care of those |
1842 | details, allowing you to simply execute a systemtap script on the remote | 1841 | details, allowing you to simply execute a systemtap script on the remote |
1843 | target, with arguments if necessary. | 1842 | target, with arguments if necessary. |
1844 | 1843 | ||
1845 | In order to do this from a remote host, however, you need to have access | 1844 | In order to do this from a remote host, however, you need to have access |
1846 | to the build for the image you booted. The 'crosstap' script provides | 1845 | to the build for the image you booted. The ``crosstap`` script provides |
1847 | details on how to do this if you run the script on the host without | 1846 | details on how to do this if you run the script on the host without |
1848 | having done a build:: | 1847 | having done a build:: |
1849 | 1848 | ||
@@ -1852,29 +1851,35 @@ having done a build:: | |||
1852 | Error: No target kernel build found. | 1851 | Error: No target kernel build found. |
1853 | Did you forget to create a local build of your image? | 1852 | Did you forget to create a local build of your image? |
1854 | 1853 | ||
1855 | 'crosstap' requires a local sdk build of the target system | 1854 | 'crosstap' requires a local sdk build of the target system |
1856 | (or a build that includes 'tools-profile') in order to build | 1855 | (or a build that includes 'tools-profile') in order to build |
1857 | kernel modules that can probe the target system. | 1856 | kernel modules that can probe the target system. |
1858 | 1857 | ||
1859 | Practically speaking, that means you need to do the following: | 1858 | Practically speaking, that means you need to do the following: |
1860 | - If you're running a pre-built image, download the release | 1859 | |
1861 | and/or BSP tarballs used to build the image. | 1860 | - If you're running a pre-built image, download the release |
1862 | - If you're working from git sources, just clone the metadata | 1861 | and/or BSP tarballs used to build the image. |
1863 | and BSP layers needed to build the image you'll be booting. | 1862 | |
1864 | - Make sure you're properly set up to build a new image (see | 1863 | - If you're working from git sources, just clone the metadata |
1865 | the BSP README and/or the widely available basic documentation | 1864 | and BSP layers needed to build the image you'll be booting. |
1866 | that discusses how to build images). | 1865 | |
1867 | - Build an -sdk version of the image e.g.: | 1866 | - Make sure you're properly set up to build a new image (see |
1868 | $ bitbake core-image-sato-sdk | 1867 | the BSP README and/or the widely available basic documentation |
1869 | OR | 1868 | that discusses how to build images). |
1870 | - Build a non-sdk image but include the profiling tools: | 1869 | |
1871 | [ edit local.conf and add 'tools-profile' to the end of | 1870 | - Build an -sdk version of the image e.g.:: |
1872 | the EXTRA_IMAGE_FEATURES variable ] | 1871 | |
1873 | $ bitbake core-image-sato | 1872 | $ bitbake core-image-sato-sdk |
1873 | |||
1874 | - Or build a non-sdk image but include the profiling tools | ||
1875 | (edit ``local.conf`` and add ``tools-profile`` to the end of | ||
1876 | :term:``EXTRA_IMAGE_FEATURES`` variable):: | ||
1877 | |||
1878 | $ bitbake core-image-sato | ||
1874 | 1879 | ||
1875 | Once you've build the image on the host system, you're ready to | 1880 | Once you've build the image on the host system, you're ready to |
1876 | boot it (or the equivalent pre-built image) and use 'crosstap' | 1881 | boot it (or the equivalent pre-built image) and use ``crosstap`` |
1877 | to probe it (you need to source the environment as usual first): | 1882 | to probe it (you need to source the environment as usual first):: |
1878 | 1883 | ||
1879 | $ source oe-init-build-env | 1884 | $ source oe-init-build-env |
1880 | $ cd ~/my/systemtap/scripts | 1885 | $ cd ~/my/systemtap/scripts |
@@ -1882,23 +1887,22 @@ having done a build:: | |||
1882 | 1887 | ||
1883 | .. note:: | 1888 | .. note:: |
1884 | 1889 | ||
1885 | SystemTap, which uses 'crosstap', assumes you can establish an ssh | 1890 | SystemTap, which uses ``crosstap``, assumes you can establish an ssh |
1886 | connection to the remote target. Please refer to the crosstap wiki | 1891 | connection to the remote target. Please refer to the crosstap wiki |
1887 | page for details on verifying ssh connections at | 1892 | page for details on verifying ssh connections. Also, the ability to ssh |
1888 | . Also, the ability to ssh into the target system is not enabled by | 1893 | into the target system is not enabled by default in ``*-minimal`` images. |
1889 | default in \*-minimal images. | ||
1890 | 1894 | ||
1891 | So essentially what you need to | 1895 | So essentially what you need to |
1892 | do is build an SDK image or image with 'tools-profile' as detailed in | 1896 | do is build an SDK image or image with ``tools-profile`` as detailed in |
1893 | the ":ref:`profile-manual/intro:General Setup`" section of this | 1897 | the ":ref:`profile-manual/intro:General Setup`" section of this |
1894 | manual, and boot the resulting target image. | 1898 | manual, and boot the resulting target image. |
1895 | 1899 | ||
1896 | .. note:: | 1900 | .. note:: |
1897 | 1901 | ||
1898 | If you have a :term:`Build Directory` containing multiple machines, you need | 1902 | If you have a :term:`Build Directory` containing multiple machines, you need |
1899 | to have the :term:`MACHINE` you're connecting to selected in local.conf, and | 1903 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and |
1900 | the kernel in that machine's :term:`Build Directory` must match the kernel on | 1904 | the kernel in that machine's :term:`Build Directory` must match the kernel on |
1901 | the booted system exactly, or you'll get the above 'crosstap' message | 1905 | the booted system exactly, or you'll get the above ``crosstap`` message |
1902 | when you try to invoke a script. | 1906 | when you try to invoke a script. |
1903 | 1907 | ||
1904 | Running a Script on a Target | 1908 | Running a Script on a Target |
@@ -1922,8 +1926,8 @@ the target:: | |||
1922 | 1926 | ||
1923 | You can also run generated QEMU images with a command like 'runqemu qemux86-64' | 1927 | You can also run generated QEMU images with a command like 'runqemu qemux86-64' |
1924 | 1928 | ||
1925 | Once you've done that, you can cd to whatever | 1929 | Once you've done that, you can ``cd`` to whatever |
1926 | directory contains your scripts and use 'crosstap' to run the script:: | 1930 | directory contains your scripts and use ``crosstap`` to run the script:: |
1927 | 1931 | ||
1928 | $ cd /path/to/my/systemap/script | 1932 | $ cd /path/to/my/systemap/script |
1929 | $ crosstap root@192.168.7.2 trace_open.stp | 1933 | $ crosstap root@192.168.7.2 trace_open.stp |
@@ -1937,9 +1941,8 @@ Try ssh'ing to the target and see what happens:: | |||
1937 | 1941 | ||
1938 | $ ssh root@192.168.7.2 | 1942 | $ ssh root@192.168.7.2 |
1939 | 1943 | ||
1940 | A lot of the time, connection | 1944 | Connection problems are often due specifying a wrong IP address or having a ``host key |
1941 | problems are due specifying a wrong IP address or having a 'host key | 1945 | verification error``. |
1942 | verification error'. | ||
1943 | 1946 | ||
1944 | If everything worked as planned, you should see something like this | 1947 | If everything worked as planned, you should see something like this |
1945 | (enter the password when prompted, or press enter if it's set up to use | 1948 | (enter the password when prompted, or press enter if it's set up to use |
@@ -1977,16 +1980,16 @@ outlined in the ":ref:`profile-manual/intro:General Setup`" section. | |||
1977 | 1980 | ||
1978 | Sysprof is a GUI-based application that runs on the target system. For | 1981 | Sysprof is a GUI-based application that runs on the target system. For |
1979 | the rest of this document we assume you've ssh'ed to the host and will | 1982 | the rest of this document we assume you've ssh'ed to the host and will |
1980 | be running Sysprof on the target (you can use the '-X' option to ssh and | 1983 | be running Sysprof on the target (you can use the ``-X`` option to ssh and |
1981 | have the Sysprof GUI run on the target but display remotely on the host | 1984 | have the Sysprof GUI run on the target but display remotely on the host |
1982 | if you want). | 1985 | if you want). |
1983 | 1986 | ||
1984 | Basic Sysprof Usage | 1987 | Basic Sysprof Usage |
1985 | ------------------- | 1988 | ------------------- |
1986 | 1989 | ||
1987 | To start profiling the system, you simply press the 'Start' button. To | 1990 | To start profiling the system, you simply press the ``Start`` button. To |
1988 | stop profiling and to start viewing the profile data in one easy step, | 1991 | stop profiling and to start viewing the profile data in one easy step, |
1989 | press the 'Profile' button. | 1992 | press the ``Profile`` button. |
1990 | 1993 | ||
1991 | Once you've pressed the profile button, the three panes will fill up | 1994 | Once you've pressed the profile button, the three panes will fill up |
1992 | with profiling data: | 1995 | with profiling data: |
@@ -2002,7 +2005,7 @@ perf's default callee-oriented callchain display. | |||
2002 | 2005 | ||
2003 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and | 2006 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and |
2004 | looking up the callchain we can see that one of the callers of | 2007 | looking up the callchain we can see that one of the callers of |
2005 | ``__copy_to_user_ll`` is sys_read() and the complete callpath between them. | 2008 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete callpath between them. |
2006 | Notice that this is essentially a portion of the same information we saw | 2009 | Notice that this is essentially a portion of the same information we saw |
2007 | in the perf display shown in the perf section of this page. | 2010 | in the perf display shown in the perf section of this page. |
2008 | 2011 | ||
@@ -2011,7 +2014,7 @@ in the perf display shown in the perf section of this page. | |||
2011 | :width: 70% | 2014 | :width: 70% |
2012 | 2015 | ||
2013 | Similarly, the above is a snapshot of the Sysprof display of a | 2016 | Similarly, the above is a snapshot of the Sysprof display of a |
2014 | copy-from-user callchain. | 2017 | ``copy-from-user`` callchain. |
2015 | 2018 | ||
2016 | Finally, looking at the third Sysprof pane in the lower left, we can see | 2019 | Finally, looking at the third Sysprof pane in the lower left, we can see |
2017 | a list of all the callers of a particular function selected in the top | 2020 | a list of all the callers of a particular function selected in the top |
@@ -2027,10 +2030,10 @@ to the selected function, and so on. | |||
2027 | 2030 | ||
2028 | .. admonition:: Tying it Together | 2031 | .. admonition:: Tying it Together |
2029 | 2032 | ||
2030 | If you like sysprof's 'caller-oriented' display, you may be able to | 2033 | If you like sysprof's ``caller-oriented`` display, you may be able to |
2031 | approximate it in other tools as well. For example, 'perf report' has | 2034 | approximate it in other tools as well. For example, ``perf report`` has |
2032 | the -g (--call-graph) option that you can experiment with; one of the | 2035 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the |
2033 | options is 'caller' for an inverted caller-based callgraph display. | 2036 | options is ``caller`` for an inverted caller-based callgraph display. |
2034 | 2037 | ||
2035 | Sysprof Documentation | 2038 | Sysprof Documentation |
2036 | --------------------- | 2039 | --------------------- |
@@ -2054,7 +2057,7 @@ Collecting and Viewing Traces | |||
2054 | ----------------------------- | 2057 | ----------------------------- |
2055 | 2058 | ||
2056 | Once you've applied the above commits and built and booted your image | 2059 | Once you've applied the above commits and built and booted your image |
2057 | (you need to build the core-image-sato-sdk image or use one of the other | 2060 | (you need to build the ``core-image-sato-sdk`` image or use one of the other |
2058 | methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start | 2061 | methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start |
2059 | tracing. | 2062 | tracing. |
2060 | 2063 | ||
@@ -2139,14 +2142,14 @@ You can now view the trace in text form on the target:: | |||
2139 | 2142 | ||
2140 | You can now safely destroy the trace | 2143 | You can now safely destroy the trace |
2141 | session (note that this doesn't delete the trace --- it's still there in | 2144 | session (note that this doesn't delete the trace --- it's still there in |
2142 | ~/lttng-traces):: | 2145 | ``~/lttng-traces``):: |
2143 | 2146 | ||
2144 | root@crownbay:~# lttng destroy | 2147 | root@crownbay:~# lttng destroy |
2145 | Session auto-20121015-232120 destroyed at /home/root | 2148 | Session auto-20121015-232120 destroyed at /home/root |
2146 | 2149 | ||
2147 | Note that the trace is saved in a directory of the same name as returned by | 2150 | Note that the trace is saved in a directory of the same name as returned by |
2148 | 'lttng create', under the ~/lttng-traces directory (note that you can change this by | 2151 | ``lttng create``, under the ``~/lttng-traces`` directory (note that you can change this by |
2149 | supplying your own name to 'lttng create'):: | 2152 | supplying your own name to ``lttng create``):: |
2150 | 2153 | ||
2151 | root@crownbay:~# ls -al ~/lttng-traces | 2154 | root@crownbay:~# ls -al ~/lttng-traces |
2152 | drwxrwx--- 3 root root 1024 Oct 15 23:21 . | 2155 | drwxrwx--- 3 root root 1024 Oct 15 23:21 . |
@@ -2157,12 +2160,12 @@ Collecting and viewing a userspace trace on the target (inside a shell) | |||
2157 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2160 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2158 | 2161 | ||
2159 | For LTTng userspace tracing, you need to have a properly instrumented | 2162 | For LTTng userspace tracing, you need to have a properly instrumented |
2160 | userspace program. For this example, we'll use the 'hello' test program | 2163 | userspace program. For this example, we'll use the ``hello`` test program |
2161 | generated by the lttng-ust build. | 2164 | generated by the ``lttng-ust`` build. |
2162 | 2165 | ||
2163 | The 'hello' test program isn't installed on the root filesystem by the lttng-ust | 2166 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` |
2164 | build, so we need to copy it over manually. First cd into the build | 2167 | build, so we need to copy it over manually. First ``cd`` into the build |
2165 | directory that contains the hello executable:: | 2168 | directory that contains the ``hello`` executable:: |
2166 | 2169 | ||
2167 | $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs | 2170 | $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs |
2168 | 2171 | ||
@@ -2170,7 +2173,7 @@ Copy that over to the target machine:: | |||
2170 | 2173 | ||
2171 | $ scp hello root@192.168.1.20: | 2174 | $ scp hello root@192.168.1.20: |
2172 | 2175 | ||
2173 | You now have the instrumented lttng 'hello world' test program on the | 2176 | You now have the instrumented LTTng "hello world" test program on the |
2174 | target, ready to test. | 2177 | target, ready to test. |
2175 | 2178 | ||
2176 | First, from the host, ssh to the target:: | 2179 | First, from the host, ssh to the target:: |
@@ -2198,7 +2201,7 @@ Start the trace:: | |||
2198 | root@crownbay:~# lttng start | 2201 | root@crownbay:~# lttng start |
2199 | Tracing started for session auto-20190303-021943 | 2202 | Tracing started for session auto-20190303-021943 |
2200 | 2203 | ||
2201 | Run the instrumented hello world program:: | 2204 | Run the instrumented "hello world" program:: |
2202 | 2205 | ||
2203 | root@crownbay:~# ./hello | 2206 | root@crownbay:~# ./hello |
2204 | Hello, World! | 2207 | Hello, World! |
@@ -2222,7 +2225,7 @@ You can now view the trace in text form on the target:: | |||
2222 | . | 2225 | . |
2223 | 2226 | ||
2224 | You can now safely destroy the trace session (note that this doesn't delete the | 2227 | You can now safely destroy the trace session (note that this doesn't delete the |
2225 | trace --- it's still there in ~/lttng-traces):: | 2228 | trace --- it's still there in ``~/lttng-traces``):: |
2226 | 2229 | ||
2227 | root@crownbay:~# lttng destroy | 2230 | root@crownbay:~# lttng destroy |
2228 | Session auto-20190303-021943 destroyed at /home/root | 2231 | Session auto-20190303-021943 destroyed at /home/root |
@@ -2266,12 +2269,12 @@ will be running blkrace on the target. | |||
2266 | Basic blktrace Usage | 2269 | Basic blktrace Usage |
2267 | -------------------- | 2270 | -------------------- |
2268 | 2271 | ||
2269 | To record a trace, simply run the 'blktrace' command, giving it the name | 2272 | To record a trace, simply run the ``blktrace`` command, giving it the name |
2270 | of the block device you want to trace activity on:: | 2273 | of the block device you want to trace activity on:: |
2271 | 2274 | ||
2272 | root@crownbay:~# blktrace /dev/sdc | 2275 | root@crownbay:~# blktrace /dev/sdc |
2273 | 2276 | ||
2274 | In another shell, execute a workload you want to trace. :: | 2277 | In another shell, execute a workload you want to trace:: |
2275 | 2278 | ||
2276 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync | 2279 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync |
2277 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2280 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
@@ -2280,7 +2283,7 @@ In another shell, execute a workload you want to trace. :: | |||
2280 | Press Ctrl-C in the blktrace shell to stop the trace. It | 2283 | Press Ctrl-C in the blktrace shell to stop the trace. It |
2281 | will display how many events were logged, along with the per-cpu file | 2284 | will display how many events were logged, along with the per-cpu file |
2282 | sizes (blktrace records traces in per-cpu kernel buffers and simply | 2285 | sizes (blktrace records traces in per-cpu kernel buffers and simply |
2283 | dumps them to userspace for blkparse to merge and sort later). :: | 2286 | dumps them to userspace for blkparse to merge and sort later):: |
2284 | 2287 | ||
2285 | ^C=== sdc === | 2288 | ^C=== sdc === |
2286 | CPU 0: 7082 events, 332 KiB data | 2289 | CPU 0: 7082 events, 332 KiB data |
@@ -2296,7 +2299,7 @@ with the device name as the first part of the filename:: | |||
2296 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 | 2299 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 |
2297 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 | 2300 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 |
2298 | 2301 | ||
2299 | To view the trace events, simply invoke 'blkparse' in the directory | 2302 | To view the trace events, simply invoke ``blkparse`` in the directory |
2300 | containing the trace files, giving it the device name that forms the | 2303 | containing the trace files, giving it the device name that forms the |
2301 | first part of the filenames:: | 2304 | first part of the filenames:: |
2302 | 2305 | ||
@@ -2395,7 +2398,7 @@ Live Mode | |||
2395 | ~~~~~~~~~ | 2398 | ~~~~~~~~~ |
2396 | 2399 | ||
2397 | blktrace and blkparse are designed from the ground up to be able to | 2400 | blktrace and blkparse are designed from the ground up to be able to |
2398 | operate together in a 'pipe mode' where the stdout of blktrace can be | 2401 | operate together in a "pipe mode" where the stdout of blktrace can be |
2399 | fed directly into the stdin of blkparse:: | 2402 | fed directly into the stdin of blkparse:: |
2400 | 2403 | ||
2401 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - | 2404 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - |
@@ -2441,13 +2444,13 @@ On the host system, you should see this:: | |||
2441 | 2444 | ||
2442 | server: connection from 192.168.1.43 | 2445 | server: connection from 192.168.1.43 |
2443 | 2446 | ||
2444 | In another shell, execute a workload you want to trace. :: | 2447 | In another shell, execute a workload you want to trace:: |
2445 | 2448 | ||
2446 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync | 2449 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync |
2447 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2450 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
2448 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 2451 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
2449 | 2452 | ||
2450 | When it's done, do a Ctrl-C on the target system to stop the | 2453 | When it's done, do a ``Ctrl-C`` on the target system to stop the |
2451 | trace:: | 2454 | trace:: |
2452 | 2455 | ||
2453 | ^C=== sdc === | 2456 | ^C=== sdc === |
@@ -2472,7 +2475,7 @@ save the target output inside a hostname-timestamp directory:: | |||
2472 | drwxr-sr-x 4 root root 1024 Oct 26 18:24 .. | 2475 | drwxr-sr-x 4 root root 1024 Oct 26 18:24 .. |
2473 | drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56 | 2476 | drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56 |
2474 | 2477 | ||
2475 | cd into that directory to see the output files:: | 2478 | ``cd`` into that directory to see the output files:: |
2476 | 2479 | ||
2477 | $ ls -l | 2480 | $ ls -l |
2478 | -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0 | 2481 | -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0 |
@@ -2539,14 +2542,14 @@ It's also possible to trace block I/O using only | |||
2539 | can be useful for casual tracing if you don't want to bother dealing with the | 2542 | can be useful for casual tracing if you don't want to bother dealing with the |
2540 | userspace tools. | 2543 | userspace tools. |
2541 | 2544 | ||
2542 | To enable tracing for a given device, use /sys/block/xxx/trace/enable, | 2545 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, |
2543 | where xxx is the device name. This for example enables tracing for | 2546 | where ``xxx`` is the device name. This for example enables tracing for |
2544 | /dev/sdc:: | 2547 | ``/dev/sdc``:: |
2545 | 2548 | ||
2546 | root@crownbay:/sys/kernel/debug/tracing# echo 1 > /sys/block/sdc/trace/enable | 2549 | root@crownbay:/sys/kernel/debug/tracing# echo 1 > /sys/block/sdc/trace/enable |
2547 | 2550 | ||
2548 | Once you've selected the device(s) you want | 2551 | Once you've selected the device(s) you want |
2549 | to trace, selecting the 'blk' tracer will turn the blk tracer on:: | 2552 | to trace, selecting the ``blk`` tracer will turn the blk tracer on:: |
2550 | 2553 | ||
2551 | root@crownbay:/sys/kernel/debug/tracing# cat available_tracers | 2554 | root@crownbay:/sys/kernel/debug/tracing# cat available_tracers |
2552 | blk function_graph function nop | 2555 | blk function_graph function nop |
@@ -2557,7 +2560,7 @@ Execute the workload you're interested in:: | |||
2557 | 2560 | ||
2558 | root@crownbay:/sys/kernel/debug/tracing# cat /media/sdc/testfile.txt | 2561 | root@crownbay:/sys/kernel/debug/tracing# cat /media/sdc/testfile.txt |
2559 | 2562 | ||
2560 | And look at the output (note here that we're using 'trace_pipe' instead of | 2563 | And look at the output (note here that we're using ``trace_pipe`` instead of |
2561 | trace to capture this trace --- this allows us to wait around on the pipe | 2564 | trace to capture this trace --- this allows us to wait around on the pipe |
2562 | for data to appear):: | 2565 | for data to appear):: |
2563 | 2566 | ||
@@ -2594,8 +2597,8 @@ section can be found here: | |||
2594 | 2597 | ||
2595 | - https://linux.die.net/man/8/btrace | 2598 | - https://linux.die.net/man/8/btrace |
2596 | 2599 | ||
2597 | The above manpages, along with manpages for the other blktrace utilities | 2600 | The above manpages, along with manuals for the other blktrace utilities |
2598 | (btt, blkiomon, etc) can be found in the /doc directory of the blktrace | 2601 | (btt, blkiomon, etc) can be found in the ``/doc`` directory of the blktrace |
2599 | tools git repo:: | 2602 | tools git repo:: |
2600 | 2603 | ||
2601 | $ git clone git://git.kernel.dk/blktrace.git | 2604 | $ git clone git://git.kernel.dk/blktrace.git |