author    Michael Opdenacker <michael.opdenacker@bootlin.com>  2024-03-13 13:47:11 +0100
committer Steve Sakoman <steve@sakoman.com>                    2024-04-05 07:23:59 -0700
commit    8b7e9c67e3f75b9154a4b09f402ccb4a95ed5cdd (patch)
tree      e1541aff5147a8da0a7ae0a2a14df6d13410d098
parent    15034b32a5c81b963f1eb49b9bbaeb629b8a1f1e (diff)
profile-manual: usage.rst: formatting fixes
Plus a few text styling improvements, some reported by "make stylecheck"
(From yocto-docs rev: ce0e83716197773d8eae0c2f0edc1cf290ebd60f)
Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com>
Signed-off-by: Steve Sakoman <steve@sakoman.com>
 documentation/profile-manual/usage.rst | 473 (-rw-r--r--)
 1 file changed, 238 insertions(+), 235 deletions(-)
diff --git a/documentation/profile-manual/usage.rst b/documentation/profile-manual/usage.rst
index a190e743cd..912e4c9602 100644
--- a/documentation/profile-manual/usage.rst
+++ b/documentation/profile-manual/usage.rst
@@ -10,10 +10,10 @@ Basic Usage (with examples) for each of the Yocto Tracing Tools
 This chapter presents basic usage examples for each of the tracing
 tools.
 
-perf
+Perf
 ====
 
-The 'perf' tool is the profiling and tracing tool that comes bundled
+The perf tool is the profiling and tracing tool that comes bundled
 with the Linux kernel.
 
 Don't let the fact that it's part of the kernel fool you into thinking
@@ -61,7 +61,7 @@ Basic Perf Usage
 ----------------
 
 The perf tool is pretty much self-documenting. To remind yourself of the
-available commands, simply type 'perf', which will show you basic usage
+available commands, simply type ``perf``, which will show you basic usage
 along with the available perf subcommands::
 
    root@crownbay:~# perf
@@ -97,19 +97,19 @@ along with the available perf subcommands::
 Using perf to do Basic Profiling
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
-As a simple test case, we'll profile the 'wget' of a fairly large file,
+As a simple test case, we'll profile the ``wget`` of a fairly large file,
 which is a minimally interesting case because it has both file and
 network I/O aspects, and at least in the case of standard Yocto images,
 it's implemented as part of BusyBox, so the methods we use to analyze it
 can be used in a very similar way to the whole host of supported BusyBox
-applets in Yocto. ::
+applets in Yocto::
 
    root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \
    wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
 
 The quickest and easiest way to get some basic overall data about what's
-going on for a particular workload is to profile it using 'perf stat'.
-'perf stat' basically profiles using a few default counters and displays
+going on for a particular workload is to profile it using ``perf stat``.
+This command basically profiles using a few default counters and displays
 the summed counts at the end of the run::
 
    root@crownbay:~# perf stat wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
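The ``perf stat`` summary shown in this section lends itself to scripted post-processing. As a hedged sketch (assuming perf's ``-x SEP`` machine-readable CSV mode; the sample lines and event counts below are made up for illustration, not captured from a real run):

```python
# Sketch: summarizing ``perf stat -x,`` (CSV) output into a dict.
# Assumption: fields are roughly value,unit,event,... which is the
# general shape of perf's -x output; exact columns vary by version.

def parse_perf_stat_csv(lines):
    """Map event name -> counted value from 'perf stat -x,' output."""
    counts = {}
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) < 3 or not fields[0]:
            continue  # skip blank or malformed lines
        value, _unit, event = fields[0], fields[1], fields[2]
        if value in ("<not supported>", "<not counted>"):
            continue  # perf prints these for unavailable counters
        counts[event] = float(value)
    return counts

# Hypothetical sample, not real output:
sample = [
    "1023,,kmem:kmalloc",
    "768,,kmem:kfree",
    "<not supported>,,cache-misses",
]
print(parse_perf_stat_csv(sample))
```

This kind of parser is how the summed counts from repeated runs could be diffed or trended over time.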
@@ -135,9 +135,9 @@ Many times such a simple-minded test doesn't yield much of
 interest, but sometimes it does (see Real-world Yocto bug (slow
 loop-mounted write speed)).
 
-Also, note that 'perf stat' isn't restricted to a fixed set of counters
-- basically any event listed in the output of 'perf list' can be tallied
-by 'perf stat'. For example, suppose we wanted to see a summary of all
+Also, note that ``perf stat`` isn't restricted to a fixed set of counters
+--- basically any event listed in the output of ``perf list`` can be tallied
+by ``perf stat``. For example, suppose we wanted to see a summary of all
 the events related to kernel memory allocation/freeing along with cache
 hits and misses::
 
@@ -164,22 +164,22 @@ hits and misses::
 
    44.831023415 seconds time elapsed
 
-So 'perf stat' gives us a nice easy
+So ``perf stat`` gives us a nice easy
 way to get a quick overview of what might be happening for a set of
 events, but normally we'd need a little more detail in order to
 understand what's going on in a way that we can act on in a useful way.
 
-To dive down into a next level of detail, we can use 'perf record'/'perf
-report' which will collect profiling data and present it to use using an
+To dive down into a next level of detail, we can use ``perf record`` /
+``perf report`` which will collect profiling data and present it to us using an
 interactive text-based UI (or simply as text if we specify ``--stdio`` to
-'perf report').
+``perf report``).
 
-As our first attempt at profiling this workload, we'll simply run 'perf
-record', handing it the workload we want to profile (everything after
-'perf record' and any perf options we hand it --- here none, will be
+As our first attempt at profiling this workload, we'll simply run ``perf
+record``, handing it the workload we want to profile (everything after
+``perf record`` and any perf options we hand it --- here none, will be
 executed in a new shell). perf collects samples until the process exits
-and records them in a file named 'perf.data' in the current working
-directory. ::
+and records them in a file named ``perf.data`` in the current working
+directory::
 
    root@crownbay:~# perf record wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
 
@@ -189,7 +189,7 @@ directory. ::
    [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ]
 
 To see the results in a
-'text-based UI' (tui), simply run 'perf report', which will read the
+"text-based UI" (tui), simply run ``perf report``, which will read the
 perf.data file in the current working directory and display the results
 in an interactive UI::
 
@@ -198,24 +198,24 @@ in an interactive UI::
 .. image:: figures/perf-wget-flat-stripped.png
    :align: center
 
-The above screenshot displays a 'flat' profile, one entry for each
-'bucket' corresponding to the functions that were profiled during the
+The above screenshot displays a "flat" profile, one entry for each
+"bucket" corresponding to the functions that were profiled during the
 profiling run, ordered from the most popular to the least (perf has
 options to sort in various orders and keys as well as display entries
 only above a certain threshold and so on --- see the perf documentation
 for details). Note that this includes both userspace functions (entries
-containing a [.]) and kernel functions accounted to the process (entries
-containing a [k]). (perf has command-line modifiers that can be used to
-restrict the profiling to kernel or userspace, among others).
+containing a ``[.]``) and kernel functions accounted to the process (entries
+containing a ``[k]``). perf has command-line modifiers that can be used to
+restrict the profiling to kernel or userspace, among others.
 
-Notice also that the above report shows an entry for 'busybox', which is
-the executable that implements 'wget' in Yocto, but that instead of a
+Notice also that the above report shows an entry for ``busybox``, which is
+the executable that implements ``wget`` in Yocto, but that instead of a
 useful function name in that entry, it displays a not-so-friendly hex
 value instead. The steps below will show how to fix that problem.
 
 Before we do that, however, let's try running a different profile, one
 which shows something a little more interesting. The only difference
-between the new profile and the previous one is that we'll add the -g
+between the new profile and the previous one is that we'll add the ``-g``
 option, which will record not just the address of a sampled function,
 but the entire callchain to the sampled function as well::
 
@@ -236,11 +236,11 @@ took the most time, but we can also see a summary of how those functions
 were called and learn something about how the program interacts with the
 kernel in the process.
 
-Notice that each entry in the above screenshot now contains a '+' on the
+Notice that each entry in the above screenshot now contains a ``+`` on the
 left-hand side. This means that we can expand the entry and drill down
-into the callchains that feed into that entry. Pressing 'enter' on any
-one of them will expand the callchain (you can also press 'E' to expand
-them all at the same time or 'C' to collapse them all).
+into the callchains that feed into that entry. Pressing ``Enter`` on any
+one of them will expand the callchain (you can also press ``E`` to expand
+them all at the same time or ``C`` to collapse them all).
 
 In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry
 and several subnodes all the way down. This lets us see which callchains
@@ -251,13 +251,13 @@ As a bit of background explanation for these callchains, think about
 what happens at a high level when you run wget to get a file out on the
 network. Basically what happens is that the data comes into the kernel
 via the network connection (socket) and is passed to the userspace
-program 'wget' (which is actually a part of BusyBox, but that's not
+program ``wget`` (which is actually a part of BusyBox, but that's not
 important for now), which takes the buffers the kernel passes to it and
 writes it to a disk file to save it.
 
 The part of this process that we're looking at in the above call stacks
 is the part where the kernel passes the data it has read from the socket
-down to wget i.e. a copy-to-user.
+down to wget i.e. a ``copy-to-user``.
 
 Notice also that here there's also a case where the hex value is
 displayed in the callstack, here in the expanded ``sys_clock_gettime()``
@@ -267,8 +267,8 @@ busybox.
 .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png
    :align: center
 
-The above screenshot shows the other half of the journey for the data -
-from the wget program's userspace buffers to disk. To get the buffers to
+The above screenshot shows the other half of the journey for the data ---
+from the ``wget`` program's userspace buffers to disk. To get the buffers to
 disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to
 the kernel, which then takes care via some circuitous path (probably
 also present somewhere in the profile data), to get it safely to disk.
@@ -278,7 +278,7 @@ of how to extract useful information out of it, let's get back to the
 task at hand and see if we can get some basic idea about where the time
 is spent in the program we're profiling, wget. Remember that wget is
 actually implemented as an applet in BusyBox, so while the process name
-is 'wget', the executable we're actually interested in is BusyBox. So
+is ``wget``, the executable we're actually interested in is BusyBox. So
 let's expand the first entry containing BusyBox:
 
 .. image:: figures/perf-wget-busybox-expanded-stripped.png
@@ -307,16 +307,15 @@ To generate the debug info for the packages in the image, we can add
    EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs"
 
 Additionally, in order to generate the type of debuginfo that perf
-understands, we also need to set
-:term:`PACKAGE_DEBUG_SPLIT_STYLE`
+understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE`
 in the ``local.conf`` file::
 
    PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory'
 
-Once we've done that, we can install the
-debuginfo for BusyBox. The debug packages once built can be found in
-``build/tmp/deploy/rpm/*`` on the host system. Find the busybox-dbg-...rpm
-file and copy it to the target. For example::
+Once we've done that, we can install the debuginfo for BusyBox. The
+debug packages once built can be found in ``build/tmp/deploy/rpm/*``
+on the host system. Find the ``busybox-dbg-...rpm`` file and copy it
+to the target. For example::
 
    [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31:
    busybox-dbg-1.20.2-r2.core2_32.rpm                 100% 1826KB   1.8MB/s   00:01
@@ -331,7 +330,7 @@ their functions symbolically:
 .. image:: figures/perf-wget-busybox-debuginfo.png
    :align: center
 
-If we expand one of the entries and press 'enter' on a leaf node, we're
+If we expand one of the entries and press ``Enter`` on a leaf node, we're
 presented with a menu of actions we can take to get more information
 related to that entry:
 
@@ -340,7 +339,7 @@ related to that entry:
 
 One of these actions allows us to show a view that displays a
 busybox-centric view of the profiled functions (in this case we've also
-expanded all the nodes using the 'E' key):
+expanded all the nodes using the ``E`` key):
 
 .. image:: figures/perf-wget-busybox-dso-zoom.png
    :align: center
@@ -348,8 +347,8 @@ expanded all the nodes using the 'E' key):
 Finally, we can see that now that the BusyBox debuginfo is installed,
 the previously unresolved symbol in the ``sys_clock_gettime()`` entry
 mentioned previously is now resolved, and shows that the
-sys_clock_gettime system call that was the source of 6.75% of the
-copy-to-user overhead was initiated by the ``handle_input()`` BusyBox
+``sys_clock_gettime`` system call that was the source of 6.75% of the
+``copy-to-user`` overhead was initiated by the ``handle_input()`` BusyBox
 function:
 
 .. image:: figures/perf-wget-g-copy-to-user-expanded-debuginfo.png
@@ -357,14 +356,14 @@ function:
 
 At the lowest level of detail, we can dive down to the assembly level
 and see which instructions caused the most overhead in a function.
-Pressing 'enter' on the 'udhcpc_main' function, we're again presented
+Pressing ``Enter`` on the ``udhcpc_main`` function, we're again presented
 with a menu:
 
 .. image:: figures/perf-wget-busybox-annotate-menu.png
    :align: center
 
-Selecting 'Annotate udhcpc_main', we get a detailed listing of
-percentages by instruction for the udhcpc_main function. From the
+Selecting ``Annotate udhcpc_main``, we get a detailed listing of
+percentages by instruction for the ``udhcpc_main`` function. From the
 display, we can see that over 50% of the time spent in this function is
 taken up by a couple tests and the move of a constant (1) to a register:
 
@@ -372,7 +371,7 @@ taken up by a couple tests and the move of a constant (1) to a register:
    :align: center
 
 As a segue into tracing, let's try another profile using a different
-counter, something other than the default 'cycles'.
+counter, something other than the default ``cycles``.
 
 The tracing and profiling infrastructure in Linux has become unified in
 a way that allows us to use the same tool with a completely different
@@ -382,7 +381,7 @@ traditional tools can also make use of the expanded possibilities now
 available to them, and in some cases have, as mentioned previously).
 
 We can get a list of the available events that can be used to profile a
-workload via 'perf list'::
+workload via ``perf list``::
 
    root@crownbay:~# perf list
 
@@ -518,13 +517,13 @@ workload via 'perf list'::
 .. admonition:: Tying it Together
 
    These are exactly the same set of events defined by the trace event
-   subsystem and exposed by ftrace/tracecmd/kernelshark as files in
-   /sys/kernel/debug/tracing/events, by SystemTap as
-   kernel.trace("tracepoint_name") and (partially) accessed by LTTng.
+   subsystem and exposed by ftrace / tracecmd / kernelshark as files in
+   ``/sys/kernel/debug/tracing/events``, by SystemTap as
+   kernel.trace("tracepoint_name") and (partially) accessed by LTTng.
 
 Only a subset of these would be of interest to us when looking at this
 workload, so let's choose the most likely subsystems (identified by the
-string before the colon in the Tracepoint events) and do a 'perf stat'
+string before the colon in the Tracepoint events) and do a ``perf stat``
 run using only those wildcarded subsystems::
 
    root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
@@ -596,8 +595,8 @@ and tell perf to do a profile using it as the sampling event::
 
 The screenshot above shows the results of running a profile using
 sched:sched_switch tracepoint, which shows the relative costs of various
-paths to sched_wakeup (note that sched_wakeup is the name of the
-tracepoint --- it's actually defined just inside ttwu_do_wakeup(), which
+paths to ``sched_wakeup`` (note that ``sched_wakeup`` is the name of the
+tracepoint --- it's actually defined just inside ``ttwu_do_wakeup()``, which
 accounts for the function name actually displayed in the profile:
 
 .. code-block:: c
@@ -620,10 +619,10 @@ receive paths that presumably end up waking up wget (busybox) when
 network data is ready.
 
 Note that because tracepoints are normally used for tracing, the default
-sampling period for tracepoints is 1 i.e. for tracepoints perf will
-sample on every event occurrence (this can be changed using the -c
+sampling period for tracepoints is ``1`` i.e. for tracepoints perf will
+sample on every event occurrence (this can be changed using the ``-c``
 option). This is in contrast to hardware counters such as for example
-the default 'cycles' hardware counter used for normal profiling, where
+the default ``cycles`` hardware counter used for normal profiling, where
 sampling periods are much higher (in the thousands) because profiling
 should have as low an overhead as possible and sampling on every cycle
 would be prohibitively expensive.
@@ -634,10 +633,10 @@ Using perf to do Basic Tracing
 Profiling is a great tool for solving many problems or for getting a
 high-level view of what's going on with a workload or across the system.
 It is however by definition an approximation, as suggested by the most
-prominent word associated with it, 'sampling'. On the one hand, it
+prominent word associated with it, ``sampling``. On the one hand, it
 allows a representative picture of what's going on in the system to be
 cheaply taken, but on the other hand, that cheapness limits its utility
-when that data suggests a need to 'dive down' more deeply to discover
+when that data suggests a need to "dive down" more deeply to discover
 what's really going on. In such cases, the only way to see what's really
 going on is to be able to look at (or summarize more intelligently) the
 individual steps that go into the higher-level behavior exposed by the
@@ -650,7 +649,7 @@ applicable to our workload::
       -e syscalls:sys_enter_read -e syscalls:sys_exit_read -e syscalls:sys_enter_write -e syscalls:sys_exit_write
       wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2
 
-We can look at the raw trace output using 'perf script' with no
+We can look at the raw trace output using ``perf script`` with no
 arguments::
 
    root@crownbay:~# perf script
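The raw ``perf script`` text output described here can be post-processed line by line. As a hedged sketch (the exact column layout varies between perf versions, and the sample line below is modeled on the trace excerpt rather than copied verbatim):

```python
import re

# Approximate pattern for one line of ``perf script`` text output:
#   comm  pid [cpu]  timestamp: subsystem:event: args...
# Treat this as an assumption, not a stable format guarantee.
LINE_RE = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>[\d.]+):\s+(?P<event>[\w:]+):"
)

def parse_line(line):
    """Return (comm, pid, cpu, timestamp, event) or None if unparseable."""
    m = LINE_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    return (d["comm"], int(d["pid"]), int(d["cpu"]), float(d["ts"]), d["event"])

# Illustrative line, not real captured output:
sample = "wget  1262 [000]   114.281860: syscalls:sys_enter_read: fd: 0x0003"
print(parse_line(sample))
```

Feeding every line of a trace through a parser like this is the manual equivalent of what the perf script language bindings, discussed below in this chapter, do for you automatically.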
@@ -681,7 +680,7 @@ arguments::
 This gives us a detailed timestamped sequence of events that occurred within the
 workload with respect to those events.
 
-In many ways, profiling can be viewed as a subset of tracing -
+In many ways, profiling can be viewed as a subset of tracing ---
 theoretically, if you have a set of trace events that's sufficient to
 capture all the important aspects of a workload, you can derive any of
 the results or views that a profiling run can.
@@ -706,7 +705,7 @@ problem at hand. For example, if we wanted to make use of a 'counter'
 that maps to the value of the time difference between when a process was
 scheduled to run on a processor and the time it actually ran, we
 wouldn't expect such a counter to exist on its own, but we could derive
-one called say 'wakeup_latency' and use it to extract a useful view of
+one called say ``wakeup_latency`` and use it to extract a useful view of
 that metric from trace data. Likewise, we really can't figure out from
 standard profiling tools how much data every process on the system reads
 and writes, along with how many of those reads and writes fail
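The ``wakeup_latency`` idea in this hunk can be sketched concretely: pair each wakeup with the following switch-in for the same pid and record the difference. This is a minimal illustration with made-up event tuples, not the actual derivation any perf script performs:

```python
# Hedged sketch of a derived "wakeup_latency" counter, computed from
# (timestamp, event, pid) tuples distilled from sched trace events.

def wakeup_latencies(events):
    """Map pid -> list of (switch-in time - wakeup time) latencies."""
    pending = {}    # pid -> timestamp of the last unmatched wakeup
    latencies = {}  # pid -> accumulated latencies
    for ts, event, pid in events:
        if event == "sched:sched_wakeup":
            pending[pid] = ts
        elif event == "sched:sched_switch" and pid in pending:
            latencies.setdefault(pid, []).append(ts - pending.pop(pid))
    return latencies

# Hypothetical trace fragment (timestamps in seconds):
trace = [
    (100.000, "sched:sched_wakeup", 1262),
    (100.004, "sched:sched_switch", 1262),
    (100.010, "sched:sched_wakeup", 1262),
    (100.011, "sched:sched_switch", 1262),
]
print(wakeup_latencies(trace))
```

A real version would also have to handle which task a ``sched_switch`` is switching *to*, but the pairing logic is the essence of deriving a counter that doesn't exist on its own.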
@@ -715,7 +714,7 @@ right tools easily extract and present that information, but we'd need | |||
715 | something other than pre-canned profiling tools to do that. | 714 | something other than pre-canned profiling tools to do that. |
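The ``wakeup_latency`` idea above can be made concrete with a short sketch. This is a hypothetical illustration, not perf's implementation: the event stream is hand-written, and the field names only loosely follow the ``sched_wakeup`` / ``sched_switch`` tracepoints.

```python
# Sketch: deriving a "wakeup_latency" metric from paired trace events.
# The event stream below is hypothetical; in practice these records
# would arrive via perf script event handlers.

def wakeup_latencies(events):
    """Pair each sched_wakeup with the next sched_switch that runs the
    woken pid, yielding (pid, latency) tuples."""
    pending = {}  # pid -> timestamp of its wakeup
    for ev in events:
        if ev["name"] == "sched_wakeup":
            pending[ev["pid"]] = ev["ts"]
        elif ev["name"] == "sched_switch" and ev["next_pid"] in pending:
            # Latency is time from wakeup to actually being scheduled in.
            yield ev["next_pid"], ev["ts"] - pending.pop(ev["next_pid"])

events = [
    {"name": "sched_wakeup", "pid": 42, "ts": 100.000},
    {"name": "sched_switch", "next_pid": 42, "ts": 100.007},
]

for pid, lat in wakeup_latencies(events):
    print("pid %d wakeup_latency %.3fs" % (pid, lat))
```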
716 | 715 | ||
717 | Luckily, there is a general-purpose way to handle such needs, called | 716 | Luckily, there is a general-purpose way to handle such needs, called |
718 | 'programming languages'. Making programming languages easily available | 717 | "programming languages". Making programming languages easily available |
719 | to apply to such problems given the specific format of data is called a | 718 | to apply to such problems given the specific format of data is called a |
720 | 'programming language binding' for that data and language. Perf supports | 719 | 'programming language binding' for that data and language. Perf supports |
721 | two programming language bindings, one for Python and one for Perl. | 720 | two programming language bindings, one for Python and one for Perl. |
@@ -733,15 +732,15 @@ two programming language bindings, one for Python and one for Perl. | |||
733 | created elaborate compiler-based machinery to translate its | 732 | created elaborate compiler-based machinery to translate its |
734 | language into kernel modules written in C. | 733 | language into kernel modules written in C. |
735 | 734 | ||
736 | Now that we have the trace data in perf.data, we can use 'perf script | 735 | Now that we have the trace data in ``perf.data``, we can use ``perf script |
737 | -g' to generate a skeleton script with handlers for the read/write | 736 | -g`` to generate a skeleton script with handlers for the read / write |
738 | entry/exit events we recorded:: | 737 | entry / exit events we recorded:: |
739 | 738 | ||
740 | root@crownbay:~# perf script -g python | 739 | root@crownbay:~# perf script -g python |
741 | generated Python script: perf-script.py | 740 | generated Python script: perf-script.py |
742 | 741 | ||
743 | The skeleton script simply creates a python function for each event type in the | 742 | The skeleton script simply creates a Python function for each event type in the |
744 | perf.data file. The body of each function prints the event name along | 743 | ``perf.data`` file. The body of each function prints the event name along |
745 | with its parameters. For example: | 744 | with its parameters. For example: |
746 | 745 | ||
747 | .. code-block:: python | 746 | .. code-block:: python |
@@ -755,7 +754,7 @@ with its parameters. For example: | |||
755 | print "skbaddr=%u, len=%u, name=%s\n" % (skbaddr, len, name), | 754 | print "skbaddr=%u, len=%u, name=%s\n" % (skbaddr, len, name), |
756 | 755 | ||
757 | We can run that script directly to print all of the events contained in the | 756 | We can run that script directly to print all of the events contained in the |
758 | perf.data file:: | 757 | ``perf.data`` file:: |
759 | 758 | ||
760 | root@crownbay:~# perf script -s perf-script.py | 759 | root@crownbay:~# perf script -s perf-script.py |
761 | 760 | ||
@@ -784,8 +783,8 @@ perf.data file:: | |||
784 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 | 783 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 |
785 | 784 | ||
786 | That in itself isn't very useful; after all, we can accomplish pretty much the | 785 | That in itself isn't very useful; after all, we can accomplish pretty much the |
787 | same thing by simply running 'perf script' without arguments in the same | 786 | same thing by simply running ``perf script`` without arguments in the same |
788 | directory as the perf.data file. | 787 | directory as the ``perf.data`` file. |
789 | 788 | ||
790 | We can however replace the print statements in the generated function | 789 | We can however replace the print statements in the generated function |
791 | bodies with whatever we want, and thereby make it infinitely more | 790 | bodies with whatever we want, and thereby make it infinitely more |
@@ -806,8 +805,8 @@ event. For example: | |||
806 | 805 | ||
807 | Each event handler function in the generated code | 806 | Each event handler function in the generated code |
808 | is modified to do this. For convenience, we define a common function | 807 | is modified to do this. For convenience, we define a common function |
809 | called inc_counts() that each handler calls; inc_counts() simply tallies | 808 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` simply tallies |
810 | a count for each event using the 'counts' hash, which is a specialized | 809 | a count for each event using the ``counts`` hash, which is a specialized |
811 | hash function that does Perl-like autovivification, a capability that's | 810 | hash function that does Perl-like autovivification, a capability that's |
812 | extremely useful for kinds of multi-level aggregation commonly used in | 811 | extremely useful for kinds of multi-level aggregation commonly used in |
813 | processing traces (see perf's documentation on the Python language | 812 | processing traces (see perf's documentation on the Python language |
@@ -825,7 +824,7 @@ binding for details): | |||
825 | 824 | ||
826 | Finally, at the end of the trace processing run, we want to print the | 825 | Finally, at the end of the trace processing run, we want to print the |
827 | result of all the per-event tallies. For that, we use the special | 826 | result of all the per-event tallies. For that, we use the special |
828 | 'trace_end()' function: | 827 | ``trace_end()`` function: |
829 | 828 | ||
830 | .. code-block:: python | 829 | .. code-block:: python |
831 | 830 | ||
@@ -854,7 +853,7 @@ The end result is a summary of all the events recorded in the trace:: | |||
854 | syscalls__sys_exit_write 8990 | 853 | syscalls__sys_exit_write 8990 |
855 | 854 | ||
856 | Note that this is | 855 | Note that this is |
857 | pretty much exactly the same information we get from 'perf stat', which | 856 | pretty much exactly the same information we get from ``perf stat``, which |
858 | goes a little way toward supporting the idea mentioned previously that given | 857 | goes a little way toward supporting the idea mentioned previously that given |
859 | the right kind of trace data, higher-level profiling-type summaries can | 858 | the right kind of trace data, higher-level profiling-type summaries can |
860 | be derived from it. | 859 | be derived from it. |
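A condensed, runnable sketch of the tallying scheme just described, using Python's ``defaultdict`` to stand in for perf's autovivifying hash (handler and event names here are illustrative, not from a real run):

```python
from collections import defaultdict

# counts behaves like an autovivifying two-level hash: missing keys
# spring into existence on first use, as described above.
counts = defaultdict(lambda: defaultdict(int))

def inc_counts(event_name, comm):
    """Common tally function each generated event handler calls."""
    counts[event_name][comm] += 1

def trace_end():
    """Print the per-event summary at the end of the processing run."""
    for event_name, by_comm in sorted(counts.items()):
        print("%-40s %10d" % (event_name, sum(by_comm.values())))

# Simulated handler invocations standing in for real trace events:
inc_counts("syscalls__sys_enter_read", "wget")
inc_counts("syscalls__sys_enter_read", "wget")
inc_counts("syscalls__sys_exit_read", "wget")
trace_end()
```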
@@ -867,29 +866,29 @@ System-Wide Tracing and Profiling | |||
867 | 866 | ||
868 | The examples so far have focused on tracing a particular program or | 867 | The examples so far have focused on tracing a particular program or |
869 | workload --- in other words, every profiling run has specified the program | 868 | workload --- in other words, every profiling run has specified the program |
870 | to profile in the command-line e.g. 'perf record wget ...'. | 869 | to profile in the command-line e.g. ``perf record wget ...``. |
871 | 870 | ||
872 | It's also possible, and more interesting in many cases, to run a | 871 | It's also possible, and more interesting in many cases, to run a |
873 | system-wide profile or trace while running the workload in a separate | 872 | system-wide profile or trace while running the workload in a separate |
874 | shell. | 873 | shell. |
875 | 874 | ||
876 | To do system-wide profiling or tracing, you typically use the -a flag to | 875 | To do system-wide profiling or tracing, you typically use the ``-a`` flag to |
877 | 'perf record'. | 876 | ``perf record``. |
878 | 877 | ||
879 | To demonstrate this, open up one window and start the profile using the | 878 | To demonstrate this, open up one window and start the profile using the |
880 | -a flag (press Ctrl-C to stop tracing):: | 879 | ``-a`` flag (press ``Ctrl-C`` to stop tracing):: |
881 | 880 | ||
882 | root@crownbay:~# perf record -g -a | 881 | root@crownbay:~# perf record -g -a |
883 | ^C[ perf record: Woken up 6 times to write data ] | 882 | ^C[ perf record: Woken up 6 times to write data ] |
884 | [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ] | 883 | [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ] |
885 | 884 | ||
886 | In another window, run the wget test:: | 885 | In another window, run the ``wget`` test:: |
887 | 886 | ||
888 | root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 887 | root@crownbay:~# wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
889 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 888 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
890 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 889 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
891 | 890 | ||
892 | Here we see entries not only for our wget load, but for | 891 | Here we see entries not only for our ``wget`` load, but for |
893 | other processes running on the system as well: | 892 | other processes running on the system as well: |
894 | 893 | ||
895 | .. image:: figures/perf-systemwide.png | 894 | .. image:: figures/perf-systemwide.png |
@@ -897,12 +896,12 @@ other processes running on the system as well: | |||
897 | 896 | ||
898 | In the snapshot above, we can see callchains that originate in libc, and | 897 | In the snapshot above, we can see callchains that originate in libc, and |
899 | a callchain from Xorg that demonstrates that we're using a proprietary X | 898 | a callchain from Xorg that demonstrates that we're using a proprietary X |
900 | driver in userspace (notice the presence of 'PVR' and some other | 899 | driver in userspace (notice the presence of ``PVR`` and some other |
901 | unresolvable symbols in the expanded Xorg callchain). | 900 | unresolvable symbols in the expanded Xorg callchain). |
902 | 901 | ||
903 | Note also that we have both kernel and userspace entries in the above | 902 | Note also that we have both kernel and userspace entries in the above |
904 | snapshot. We can also tell perf to focus on userspace by providing a | 903 | snapshot. We can also tell perf to focus on userspace by providing a |
905 | modifier, in this case 'u', to the 'cycles' hardware counter when we | 904 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we |
906 | record a profile:: | 905 | record a profile:: |
907 | 906 | ||
908 | root@crownbay:~# perf record -g -a -e cycles:u | 907 | root@crownbay:~# perf record -g -a -e cycles:u |
@@ -912,24 +911,24 @@ record a profile:: | |||
912 | .. image:: figures/perf-report-cycles-u.png | 911 | .. image:: figures/perf-report-cycles-u.png |
913 | :align: center | 912 | :align: center |
914 | 913 | ||
915 | Notice that in the screenshot above we see only userspace entries ([.]). | 914 | Notice that in the screenshot above we see only userspace entries (``[.]``). |
916 | 915 | ||
917 | Finally, we can press 'enter' on a leaf node and select the 'Zoom into | 916 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into |
918 | DSO' menu item to show only entries associated with a specific DSO. In | 917 | DSO`` menu item to show only entries associated with a specific DSO. In |
919 | the screenshot below, we've zoomed into the 'libc' DSO which shows all | 918 | the screenshot below, we've zoomed into the ``libc`` DSO which shows all |
920 | the entries associated with the libc-xxx.so DSO. | 919 | the entries associated with the ``libc-xxx.so`` DSO. |
921 | 920 | ||
922 | .. image:: figures/perf-systemwide-libc.png | 921 | .. image:: figures/perf-systemwide-libc.png |
923 | :align: center | 922 | :align: center |
924 | 923 | ||
925 | We can also use the system-wide -a switch to do system-wide tracing. | 924 | We can also use the system-wide ``-a`` switch to do system-wide tracing. |
926 | Here we'll trace a couple of scheduler events:: | 925 | Here we'll trace a couple of scheduler events:: |
927 | 926 | ||
928 | root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup | 927 | root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup |
929 | ^C[ perf record: Woken up 38 times to write data ] | 928 | ^C[ perf record: Woken up 38 times to write data ] |
930 | [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ] | 929 | [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ] |
931 | 930 | ||
932 | We can look at the raw output using 'perf script' with no arguments:: | 931 | We can look at the raw output using ``perf script`` with no arguments:: |
933 | 932 | ||
934 | root@crownbay:~# perf script | 933 | root@crownbay:~# perf script |
935 | 934 | ||
@@ -948,10 +947,10 @@ Filtering | |||
948 | ^^^^^^^^^ | 947 | ^^^^^^^^^ |
949 | 948 | ||
950 | Notice that there are a lot of events that don't really have anything to | 949 | Notice that there are a lot of events that don't really have anything to |
951 | do with what we're interested in, namely events that schedule 'perf' | 950 | do with what we're interested in, namely events that schedule ``perf`` |
952 | itself in and out or that wake perf up. We can get rid of those by using | 951 | itself in and out or that wake perf up. We can get rid of those by using |
953 | the '--filter' option --- for each event we specify using -e, we can add a | 952 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a |
954 | --filter after that to filter out trace events that contain fields with | 953 | ``--filter`` after that to filter out trace events that contain fields with |
955 | specific values:: | 954 | specific values:: |
956 | 955 | ||
957 | root@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != perf' -e sched:sched_wakeup --filter 'comm != perf' | 956 | root@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf && prev_comm != perf' -e sched:sched_wakeup --filter 'comm != perf' |
@@ -977,16 +976,16 @@ specific values:: | |||
977 | kworker/0:3 1209 [000] 7932.326214: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 | 976 | kworker/0:3 1209 [000] 7932.326214: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120 |
978 | 977 | ||
979 | In this case, we've filtered out all events that have | 978 | In this case, we've filtered out all events that have |
980 | 'perf' in their 'comm', 'prev_comm' or 'next_comm' fields. Notice that | 979 | ``perf`` in their ``comm``, ``prev_comm`` or ``next_comm`` fields. Notice that |
981 | there are still events recorded for perf, but those events | 980 | there are still events recorded for perf, but those events |
982 | don't have values of 'perf' for the filtered fields. To completely | 981 | don't have values of ``perf`` for the filtered fields. To completely |
983 | filter out anything from perf would require a bit more work, but for the | 982 | filter out anything from perf would require a bit more work, but for the |
984 | purpose of demonstrating how to use filters, it's close enough. | 983 | purpose of demonstrating how to use filters, it's close enough. |
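To make the filter semantics concrete, here is a toy sketch of how a conjunction of ``field != value`` clauses can be read as a predicate over event fields. This is an illustration only: the real kernel trace event filter language supports more operators, grouping, and numeric comparisons than this parser does.

```python
# Sketch: interpreting a filter string such as
#   'next_comm != perf && prev_comm != perf'
# as a predicate applied to each trace event's fields.

def make_filter(expr):
    clauses = []
    for clause in expr.split("&&"):
        field, op, value = clause.split()
        assert op in ("==", "!="), "toy parser: only ==/!= supported"
        clauses.append((field, op, value))

    def predicate(event):
        for field, op, value in clauses:
            matches = str(event.get(field)) == value
            # '==' requires a match; '!=' requires a non-match.
            if (op == "==") != matches:
                return False
        return True

    return predicate

keep = make_filter("next_comm != perf && prev_comm != perf")
print(keep({"prev_comm": "wget", "next_comm": "swapper/0"}))    # True
print(keep({"prev_comm": "perf", "next_comm": "kworker/0:3"}))  # False
```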
985 | 984 | ||
986 | .. admonition:: Tying it Together | 985 | .. admonition:: Tying it Together |
987 | 986 | ||
988 | These are exactly the same set of event filters defined by the trace | 987 | These are exactly the same set of event filters defined by the trace |
989 | event subsystem. See the ftrace/trace-cmd/kernelshark section for more | 988 | event subsystem. See the ftrace / trace-cmd / kernelshark section for more |
990 | discussion about these event filters. | 989 | discussion about these event filters. |
991 | 990 | ||
992 | .. admonition:: Tying it Together | 991 | .. admonition:: Tying it Together |
@@ -1014,9 +1013,9 @@ Using Dynamic Tracepoints | |||
1014 | ~~~~~~~~~~~~~~~~~~~~~~~~~ | 1013 | ~~~~~~~~~~~~~~~~~~~~~~~~~ |
1015 | 1014 | ||
1016 | perf isn't restricted to the fixed set of static tracepoints listed by | 1015 | perf isn't restricted to the fixed set of static tracepoints listed by |
1017 | 'perf list'. Users can also add their own 'dynamic' tracepoints anywhere | 1016 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere |
1018 | in the kernel. For instance, suppose we want to define our own | 1017 | in the kernel. For instance, suppose we want to define our own |
1019 | tracepoint on do_fork(). We can do that using the 'perf probe' | 1018 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` |
1020 | subcommand:: | 1019 | subcommand:: |
1021 | 1020 | ||
1022 | root@crownbay:~# perf probe do_fork | 1021 | root@crownbay:~# perf probe do_fork |
@@ -1028,8 +1027,8 @@ subcommand:: | |||
1028 | perf record -e probe:do_fork -aR sleep 1 | 1027 | perf record -e probe:do_fork -aR sleep 1 |
1029 | 1028 | ||
1030 | Adding a new tracepoint via | 1029 | Adding a new tracepoint via |
1031 | 'perf probe' results in an event with all the expected files and format | 1030 | ``perf probe`` results in an event with all the expected files and format |
1032 | in /sys/kernel/debug/tracing/events, just the same as for static | 1031 | in ``/sys/kernel/debug/tracing/events``, just the same as for static |
1033 | tracepoints (as discussed in more detail in the trace events subsystem | 1032 | tracepoints (as discussed in more detail in the trace events subsystem |
1034 | section):: | 1033 | section):: |
1035 | 1034 | ||
@@ -1062,7 +1061,7 @@ existence:: | |||
1062 | probe:do_fork (on do_fork) | 1061 | probe:do_fork (on do_fork) |
1063 | probe:schedule (on schedule) | 1062 | probe:schedule (on schedule) |
1064 | 1063 | ||
1065 | Let's record system-wide ('sleep 30' is a | 1064 | Let's record system-wide (``sleep 30`` is a |
1066 | trick for recording system-wide while doing basically nothing, then waking | 1065 | trick for recording system-wide while doing basically nothing, then waking |
1067 | up after 30 seconds):: | 1066 | up after 30 seconds):: |
1068 | 1067 | ||
@@ -1070,7 +1069,7 @@ up after 30 seconds):: | |||
1070 | [ perf record: Woken up 1 times to write data ] | 1069 | [ perf record: Woken up 1 times to write data ] |
1071 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] | 1070 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] |
1072 | 1071 | ||
1073 | Using 'perf script' we can see each do_fork event that fired:: | 1072 | Using ``perf script`` we can see each ``do_fork`` event that fired:: |
1074 | 1073 | ||
1075 | root@crownbay:~# perf script | 1074 | root@crownbay:~# perf script |
1076 | 1075 | ||
@@ -1111,7 +1110,7 @@ Using 'perf script' we can see each do_fork event that fired:: | |||
1111 | matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460) | 1110 | matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460) |
1112 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) | 1111 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) |
1113 | 1112 | ||
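The fixed-width ``perf script`` text output shown above lends itself to post-processing with ordinary scripting. The sketch below assumes the particular column layout seen in these examples (comm, pid, cpu, timestamp, event name) and is not a general-purpose parser:

```python
import re

# Sketch: pulling fields out of 'perf script' output lines such as
#   gaku  1312 [000] 34237.202388: do_fork: (c1028460)
# The regex is an assumption about this specific layout.
LINE_RE = re.compile(
    r"^\s*(?P<comm>\S+)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<ts>\d+\.\d+):\s+(?P<event>[\w:]+):"
)

def parse_line(line):
    """Return (comm, pid, cpu, timestamp, event) or None if unmatched."""
    m = LINE_RE.match(line)
    if not m:
        return None
    d = m.groupdict()
    return (d["comm"], int(d["pid"]), int(d["cpu"]),
            float(d["ts"]), d["event"])

print(parse_line("gaku  1312 [000] 34237.202388: do_fork: (c1028460)"))
```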
1114 | And using 'perf report' on the same file, we can see the | 1113 | And using ``perf report`` on the same file, we can see the |
1115 | callgraphs from starting a few programs during those 30 seconds: | 1114 | callgraphs from starting a few programs during those 30 seconds: |
1116 | 1115 | ||
1117 | .. image:: figures/perf-probe-do_fork-profile.png | 1116 | .. image:: figures/perf-probe-do_fork-profile.png |
@@ -1155,7 +1154,7 @@ section can be found here: | |||
1155 | - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__. | 1154 | - The top-level `perf(1) manpage <https://linux.die.net/man/1/perf>`__. |
1156 | 1155 | ||
1157 | Normally, you should be able to invoke the man pages via perf itself, | 1156 | Normally, you should be able to invoke the man pages via perf itself, |
1158 | e.g. 'perf help' or 'perf help record'. | 1157 | e.g. ``perf help`` or ``perf help record``. |
1159 | 1158 | ||
1160 | To have the perf manpages installed on your target, modify your | 1159 | To have the perf manpages installed on your target, modify your |
1161 | configuration as follows:: | 1160 | configuration as follows:: |
@@ -1164,7 +1163,7 @@ configuration as follows:: | |||
1164 | DISTRO_FEATURES:append = " api-documentation" | 1163 | DISTRO_FEATURES:append = " api-documentation" |
1165 | 1164 | ||
1166 | The man pages in text form, along with some other files, such as a set | 1165 | The man pages in text form, along with some other files, such as a set |
1167 | of examples, can also be found in the 'perf' directory of the kernel tree:: | 1166 | of examples, can also be found in the ``perf`` directory of the kernel tree:: |
1168 | 1167 | ||
1169 | tools/perf/Documentation | 1168 | tools/perf/Documentation |
1170 | 1169 | ||
@@ -1175,7 +1174,7 @@ Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ | |||
1175 | ftrace | 1174 | ftrace |
1176 | ====== | 1175 | ====== |
1177 | 1176 | ||
1178 | 'ftrace' literally refers to the 'ftrace function tracer' but in reality | 1177 | "ftrace" literally refers to the "ftrace function tracer" but in reality |
1179 | this encompasses a number of related tracers along with the | 1178 | this encompasses a number of related tracers along with the |
1180 | infrastructure that they all make use of. | 1179 | infrastructure that they all make use of. |
1181 | 1180 | ||
@@ -1189,16 +1188,16 @@ ftrace, trace-cmd, and kernelshark run on the target system, and are | |||
1189 | ready to go out-of-the-box --- no additional setup is necessary. For the | 1188 | ready to go out-of-the-box --- no additional setup is necessary. For the |
1190 | rest of this section we assume you've ssh'ed to the target and will be | 1189 | rest of this section we assume you've ssh'ed to the target and will be |
1191 | running ftrace on the target. kernelshark is a GUI application and if | 1190 | running ftrace on the target. kernelshark is a GUI application and if |
1192 | you use the '-X' option to ssh you can have the kernelshark GUI run on | 1191 | you use the ``-X`` option to ssh you can have the kernelshark GUI run on |
1193 | the target but display remotely on the host if you want. | 1192 | the target but display remotely on the host if you want. |
1194 | 1193 | ||
1195 | Basic ftrace usage | 1194 | Basic ftrace usage |
1196 | ------------------ | 1195 | ------------------ |
1197 | 1196 | ||
1198 | 'ftrace' essentially refers to everything included in the /tracing | 1197 | "ftrace" essentially refers to everything included in the ``/tracing`` |
1199 | directory of the mounted debugfs filesystem (Yocto follows the standard | 1198 | directory of the mounted debugfs filesystem (Yocto follows the standard |
1200 | convention and mounts it at /sys/kernel/debug). Here's a listing of all | 1199 | convention and mounts it at ``/sys/kernel/debug``). Here's a listing of all |
1201 | the files found in /sys/kernel/debug/tracing on a Yocto system:: | 1200 | the files found in ``/sys/kernel/debug/tracing`` on a Yocto system:: |
1202 | 1201 | ||
1203 | root@sugarbay:/sys/kernel/debug/tracing# ls | 1202 | root@sugarbay:/sys/kernel/debug/tracing# ls |
1204 | README kprobe_events trace | 1203 | README kprobe_events trace |
@@ -1214,7 +1213,7 @@ the files found in /sys/kernel/debug/tracing on a Yocto system:: | |||
1214 | free_buffer set_graph_function | 1213 | free_buffer set_graph_function |
1215 | 1214 | ||
1216 | The files listed above are used for various purposes | 1215 | The files listed above are used for various purposes |
1217 | - some relate directly to the tracers themselves, others are used to set | 1216 | --- some relate directly to the tracers themselves, others are used to set |
1218 | tracing options, and yet others actually contain the tracing output when | 1217 | tracing options, and yet others actually contain the tracing output when |
1219 | a tracer is in effect. Some of the functions can be guessed from their | 1218 | a tracer is in effect. Some of the functions can be guessed from their |
1220 | names, others need explanation; in any case, we'll cover some of the | 1219 | names, others need explanation; in any case, we'll cover some of the |
@@ -1223,30 +1222,30 @@ the ftrace documentation. | |||
1223 | 1222 | ||
1224 | We'll start by looking at some of the available built-in tracers. | 1223 | We'll start by looking at some of the available built-in tracers. |
1225 | 1224 | ||
1226 | cat'ing the 'available_tracers' file lists the set of available tracers:: | 1225 | cat'ing the ``available_tracers`` file lists the set of available tracers:: |
1227 | 1226 | ||
1228 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers | 1227 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers |
1229 | blk function_graph function nop | 1228 | blk function_graph function nop |
1230 | 1229 | ||
1231 | The 'current_tracer' file contains the tracer currently in effect:: | 1230 | The ``current_tracer`` file contains the tracer currently in effect:: |
1232 | 1231 | ||
1233 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1232 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1234 | nop | 1233 | nop |
1235 | 1234 | ||
1236 | The above listing of ``current_tracer`` shows that the | 1235 | The above listing of ``current_tracer`` shows that the |
1237 | 'nop' tracer is in effect, which is just another way of saying that | 1236 | ``nop`` tracer is in effect, which is just another way of saying that |
1238 | there's actually no tracer currently in effect. | 1237 | there's actually no tracer currently in effect. |
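Reading and setting these two files amounts to a check-then-write sequence. The following sketch wraps that sequence in a pair of helpers; the ``tracing_dir`` parameter is an assumption added here so the logic can be exercised against any directory, not just the debugfs mount:

```python
import os

# Sketch: validate a tracer name against available_tracers, then
# write it to current_tracer -- the same thing echo/cat do by hand.

def current_tracer(tracing_dir="/sys/kernel/debug/tracing"):
    with open(os.path.join(tracing_dir, "current_tracer")) as f:
        return f.read().strip()

def set_tracer(name, tracing_dir="/sys/kernel/debug/tracing"):
    with open(os.path.join(tracing_dir, "available_tracers")) as f:
        available = f.read().split()
    if name not in available:
        raise ValueError("tracer %r not in %s" % (name, available))
    with open(os.path.join(tracing_dir, "current_tracer"), "w") as f:
        f.write(name)
```

On a target this would be called as ``set_tracer("function")`` with root privileges and debugfs mounted; the parameterized path merely keeps the sketch testable.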
1239 | 1238 | ||
1240 | echo'ing one of the available_tracers into current_tracer makes the | 1239 | echo'ing one of the ``available_tracers`` into ``current_tracer`` makes the |
1241 | specified tracer the current tracer:: | 1240 | specified tracer the current tracer:: |
1242 | 1241 | ||
1243 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer | 1242 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer |
1244 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1243 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1245 | function | 1244 | function |
1246 | 1245 | ||
1247 | The above sets the current tracer to be the 'function tracer'. This tracer | 1246 | The above sets the current tracer to be the ``function`` tracer. This tracer |
1248 | traces every function call in the kernel and makes it available as the | 1247 | traces every function call in the kernel and makes it available as the |
1249 | contents of the 'trace' file. Reading the 'trace' file lists the | 1248 | contents of the ``trace`` file. Reading the ``trace`` file lists the |
1250 | currently buffered function calls that have been traced by the function | 1249 | currently buffered function calls that have been traced by the function |
1251 | tracer:: | 1250 | tracer:: |
1252 | 1251 | ||
@@ -1303,11 +1302,11 @@ great way to learn about how the kernel code works in a dynamic sense. | |||
1303 | .. admonition:: Tying it Together | 1302 | .. admonition:: Tying it Together |
1304 | 1303 | ||
1305 | The ftrace function tracer is also available from within perf, as the | 1304 | The ftrace function tracer is also available from within perf, as the |
1306 | ftrace:function tracepoint. | 1305 | ``ftrace:function`` tracepoint. |
1307 | 1306 | ||
1308 | It is a little more difficult to follow the call chains than it needs to | 1307 | It is a little more difficult to follow the call chains than it needs to |
1309 | be --- luckily there's a variant of the function tracer that displays the | 1308 | be --- luckily there's a variant of the function tracer that displays the |
1310 | callchains explicitly, called the 'function_graph' tracer:: | 1309 | callchains explicitly, called the ``function_graph`` tracer:: |
1311 | 1310 | ||
1312 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer | 1311 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer |
1313 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1312 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1422,7 +1421,7 @@ callchains explicitly, called the 'function_graph' tracer:: | |||
1422 | 3) + 13.784 us | } | 1421 | 3) + 13.784 us | } |
1423 | 3) | sys_ioctl() { | 1422 | 3) | sys_ioctl() { |
1424 | 1423 | ||
1425 | As you can see, the function_graph display is much easier | 1424 | As you can see, the ``function_graph`` display is much easier |
1426 | to follow. Also note that in addition to the function calls and | 1425 | to follow. Also note that in addition to the function calls and |
1427 | associated braces, other events such as scheduler events are displayed | 1426 | associated braces, other events such as scheduler events are displayed |
1428 | in context. In fact, you can freely include any tracepoint available in | 1427 | in context. In fact, you can freely include any tracepoint available in |
@@ -1440,9 +1439,9 @@ The 'trace events' Subsystem | |||
1440 | ---------------------------- | 1439 | ---------------------------- |
1441 | 1440 | ||
1442 | One especially important directory contained within the | 1441 | One especially important directory contained within the |
1443 | /sys/kernel/debug/tracing directory is the 'events' subdirectory, which | 1442 | ``/sys/kernel/debug/tracing`` directory is the ``events`` subdirectory, which |
1444 | contains representations of every tracepoint in the system. Listing out | 1443 | contains representations of every tracepoint in the system. Listing out |
1445 | the contents of the 'events' subdirectory, we see mainly another set of | 1444 | the contents of the ``events`` subdirectory, we see mainly another set of |
1446 | subdirectories:: | 1445 | subdirectories:: |
1447 | 1446 | ||
1448 | root@sugarbay:/sys/kernel/debug/tracing# cd events | 1447 | root@sugarbay:/sys/kernel/debug/tracing# cd events |
@@ -1490,9 +1489,9 @@ subdirectories:: | |||
1490 | drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback | 1489 | drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback |
1491 | 1490 | ||
1492 | Each one of these subdirectories | 1491 | Each one of these subdirectories |
1493 | corresponds to a 'subsystem' and contains yet again more subdirectories, | 1492 | corresponds to a "subsystem" and contains yet again more subdirectories, |
1494 | each one of those finally corresponding to a tracepoint. For example, | 1493 | each one of those finally corresponding to a tracepoint. For example, |
1495 | here are the contents of the 'kmem' subsystem:: | 1494 | here are the contents of the ``kmem`` subsystem:: |
1496 | 1495 | ||
1497 | root@sugarbay:/sys/kernel/debug/tracing/events# cd kmem | 1496 | root@sugarbay:/sys/kernel/debug/tracing/events# cd kmem |
1498 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al | 1497 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al |
@@ -1514,7 +1513,7 @@ here are the contents of the 'kmem' subsystem:: | |||
1514 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain | 1513 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain |
1515 | 1514 | ||
1516 | Let's see what's inside the subdirectory for a | 1515 | Let's see what's inside the subdirectory for a |
1517 | specific tracepoint, in this case the one for kmalloc:: | 1516 | specific tracepoint, in this case the one for ``kmalloc``:: |
1518 | 1517 | ||
1519 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# cd kmalloc | 1518 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# cd kmalloc |
1520 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al | 1519 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al |
@@ -1525,12 +1524,12 @@ specific tracepoint, in this case the one for kmalloc:: | |||
1525 | -r--r--r-- 1 root root 0 Nov 14 23:19 format | 1524 | -r--r--r-- 1 root root 0 Nov 14 23:19 format |
1526 | -r--r--r-- 1 root root 0 Nov 14 23:19 id | 1525 | -r--r--r-- 1 root root 0 Nov 14 23:19 id |
1527 | 1526 | ||
1528 | The 'format' file for the | 1527 | The ``format`` file for the |
1529 | tracepoint describes the event in memory, which is used by the various | 1528 | tracepoint describes the event in memory, which is used by the various |
1530 | tracing tools that now make use of these tracepoints to parse the event | 1529 | tracing tools that now make use of these tracepoints to parse the event |
1531 | and make sense of it, along with a 'print fmt' field that allows tools | 1530 | and make sense of it, along with a ``print fmt`` field that allows tools |
1532 | like ftrace to display the event as text. Here's what the format of the | 1531 | like ftrace to display the event as text. Here's what the format of the |
1533 | kmalloc event looks like:: | 1532 | ``kmalloc`` event looks like:: |
1534 | 1533 | ||
1535 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format | 1534 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format |
1536 | name: kmalloc | 1535 | name: kmalloc |
@@ -1565,11 +1564,11 @@ kmalloc event looks like:: | |||
1565 | long)(( gfp_t)0x08u), "GFP_MOVABLE"}, {(unsigned long)(( gfp_t)0), "GFP_NOTRACK"}, {(unsigned long)(( gfp_t)0x400000u), "GFP_NO_KSWAPD"}, | 1564 | long)(( gfp_t)0x08u), "GFP_MOVABLE"}, {(unsigned long)(( gfp_t)0), "GFP_NOTRACK"}, {(unsigned long)(( gfp_t)0x400000u), "GFP_NO_KSWAPD"}, |
1566 | {(unsigned long)(( gfp_t)0x800000u), "GFP_OTHER_NODE"} ) : "GFP_NOWAIT" | 1565 | {(unsigned long)(( gfp_t)0x800000u), "GFP_OTHER_NODE"} ) : "GFP_NOWAIT" |
1567 | 1566 | ||
1568 | The 'enable' file | 1567 | The ``enable`` file |
1569 | in the tracepoint directory is what allows the user (or tools such as | 1568 | in the tracepoint directory is what allows the user (or tools such as |
1570 | trace-cmd) to actually turn the tracepoint on and off. When enabled, the | 1569 | ``trace-cmd``) to actually turn the tracepoint on and off. When enabled, the |
1571 | corresponding tracepoint will start appearing in the ftrace 'trace' file | 1570 | corresponding tracepoint will start appearing in the ftrace ``trace`` file |
1572 | described previously. For example, this turns on the kmalloc tracepoint:: | 1571 | described previously. For example, this turns on the ``kmalloc`` tracepoint:: |
1573 | 1572 | ||
1574 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 1 > enable | 1573 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 1 > enable |
1575 | 1574 | ||
@@ -1581,7 +1580,7 @@ events in the output buffer:: | |||
1581 | root@sugarbay:/sys/kernel/debug/tracing# echo nop > current_tracer | 1580 | root@sugarbay:/sys/kernel/debug/tracing# echo nop > current_tracer |
1582 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on | 1581 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on |
1583 | 1582 | ||
1584 | Now, if we look at the 'trace' file, we see nothing | 1583 | Now, if we look at the ``trace`` file, we see nothing |
1585 | but the kmalloc events we just turned on:: | 1584 | but the ``kmalloc`` events we just turned on:: |
1586 | 1585 | ||
1587 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1586 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1628,17 +1627,17 @@ but the kmalloc events we just turned on:: | |||
1628 | <idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | 1627 | <idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC |
1629 | matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT | 1628 | matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT |
1630 | 1629 | ||
1631 | To again disable the kmalloc event, we need to send 0 to the enable file:: | 1630 | To again disable the ``kmalloc`` event, we need to send ``0`` to the ``enable`` file:: |
1632 | 1631 | ||
1633 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 0 > enable | 1632 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 0 > enable |
1634 | 1633 | ||
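Trace output like the above can also be post-processed offline once it has been saved to a file. Here is a minimal sketch using standard awk; the sample data is just the two ``kmalloc`` lines shown above, inlined so the snippet is self-contained, but any saved ``trace`` file would work the same way::

```shell
# Summarize kmalloc requested vs. actually allocated bytes from a
# saved ftrace 'trace' file (two sample lines inlined for illustration)
cat > /tmp/trace.sample <<'EOF'
<idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC
matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT
EOF
awk '/kmalloc:/ {
    for (i = 1; i <= NF; i++) {
        if ($i ~ /^bytes_req=/)   { split($i, a, "="); req   += a[2] }
        if ($i ~ /^bytes_alloc=/) { split($i, b, "="); alloc += b[2] }
    }
}
END { printf "requested=%d allocated=%d overhead=%d\n", req, alloc, alloc - req }' /tmp/trace.sample
```

For the two sample lines this prints ``requested=1088 allocated=1536 overhead=448``, showing the slack introduced by the allocator rounding requests up to slab sizes.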
1635 | You can enable any number of events or complete subsystems (by | 1634 | You can enable any number of events or complete subsystems (by |
1636 | using the 'enable' file in the subsystem directory) and get an | 1635 | using the ``enable`` file in the subsystem directory) and get an |
1637 | arbitrarily fine-grained idea of what's going on in the system by | 1636 | arbitrarily fine-grained idea of what's going on in the system by |
1638 | enabling as many of the appropriate tracepoints as applicable. | 1637 | enabling as many of the appropriate tracepoints as applicable. |
1639 | 1638 | ||
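As a sketch of what that looks like in practice (assuming the usual tracefs layout under ``/sys/kernel/debug/tracing``), enabling and then disabling the entire ``kmem`` subsystem at once would be::

    # Writing to the subsystem-level 'enable' file toggles every
    # tracepoint under that subsystem at once
    root@sugarbay:/sys/kernel/debug/tracing# echo 1 > events/kmem/enable

    # ... run the workload of interest ...

    # Inspect the buffer, then turn the subsystem back off and clear it
    root@sugarbay:/sys/kernel/debug/tracing# cat trace | head -20
    root@sugarbay:/sys/kernel/debug/tracing# echo 0 > events/kmem/enable
    root@sugarbay:/sys/kernel/debug/tracing# echo > trace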
1640 | A number of the tools described in this HOWTO do just that, including | 1639 | A number of the tools described in this HOWTO do just that, including |
1641 | trace-cmd and kernelshark in the next section. | 1640 | ``trace-cmd`` and kernelshark in the next section. |
1642 | 1641 | ||
1643 | .. admonition:: Tying it Together | 1642 | .. admonition:: Tying it Together |
1644 | 1643 | ||
@@ -1646,27 +1645,27 @@ trace-cmd and kernelshark in the next section. | |||
1646 | ftrace, but by many of the other tools covered in this document and | 1645 | ftrace, but by many of the other tools covered in this document and |
1647 | they form a central point of integration for the various tracers | 1646 | they form a central point of integration for the various tracers |
1648 | available in Linux. They form a central part of the instrumentation | 1647 | available in Linux. They form a central part of the instrumentation |
1649 | for the following tools: perf, lttng, ftrace, blktrace and SystemTap | 1648 | for the following tools: perf, LTTng, ftrace, blktrace and SystemTap |
1650 | 1649 | ||
1651 | .. admonition:: Tying it Together | 1650 | .. admonition:: Tying it Together |
1652 | 1651 | ||
1653 | Eventually all the special-purpose tracers currently available in | 1652 | Eventually all the special-purpose tracers currently available in |
1654 | /sys/kernel/debug/tracing will be removed and replaced with | 1653 | ``/sys/kernel/debug/tracing`` will be removed and replaced with |
1655 | equivalent tracers based on the 'trace events' subsystem. | 1654 | equivalent tracers based on the "trace events" subsystem. |
1656 | 1655 | ||
1657 | trace-cmd/kernelshark | 1656 | trace-cmd / kernelshark |
1658 | --------------------- | 1657 | ----------------------- |
1659 | 1658 | ||
1660 | trace-cmd is essentially an extensive command-line 'wrapper' interface | 1659 | trace-cmd is essentially an extensive command-line "wrapper" interface |
1661 | that hides the details of all the individual files in | 1660 | that hides the details of all the individual files in |
1662 | /sys/kernel/debug/tracing, allowing users to specify particular | 1661 | ``/sys/kernel/debug/tracing``, allowing users to specify particular |
1663 | events within the /sys/kernel/debug/tracing/events/ subdirectory and to | 1662 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to |
1664 | collect traces and avoid having to deal with those details directly. | 1663 | collect traces and avoid having to deal with those details directly. |
1665 | 1664 | ||
1666 | As yet another layer on top of that, kernelshark provides a GUI that | 1665 | As yet another layer on top of that, kernelshark provides a GUI that |
1667 | allows users to start and stop traces and specify sets of events using | 1666 | allows users to start and stop traces and specify sets of events using |
1668 | an intuitive interface, and view the output as both trace events and as | 1667 | an intuitive interface, and view the output as both trace events and as |
1669 | a per-CPU graphical display. It directly uses 'trace-cmd' as the | 1668 | a per-CPU graphical display. It directly uses trace-cmd as the |
1670 | plumbing that accomplishes all that underneath the covers (and actually | 1669 | plumbing that accomplishes all that underneath the covers (and actually |
1671 | displays the trace-cmd command it uses, as we'll see). | 1670 | displays the trace-cmd command it uses, as we'll see). |
1672 | 1671 | ||
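Before turning to the GUI, here is a sketch of driving the same kind of capture directly from the trace-cmd command line (the ``i915`` / ``drm`` event selection and the ``sleep`` workload are just examples)::

    # Record every event in the i915 and drm subsystems while the
    # workload (here, just 'sleep 10') runs; output goes to trace.dat
    root@sugarbay:~# trace-cmd record -e i915 -e drm sleep 10

    # Pretty-print the recorded binary trace as text
    root@sugarbay:~# trace-cmd report | head -20

This is essentially the command line kernelshark constructs for you, as the screenshots below show.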
@@ -1674,7 +1673,7 @@ To start a trace using kernelshark, first start kernelshark:: | |||
1674 | 1673 | ||
1675 | root@sugarbay:~# kernelshark | 1674 | root@sugarbay:~# kernelshark |
1676 | 1675 | ||
1677 | Then bring up the 'Capture' dialog by | 1676 | Then bring up the ``Capture`` dialog by |
1678 | choosing from the kernelshark menu:: | 1677 | choosing from the kernelshark menu:: |
1679 | 1678 | ||
1680 | Capture | Record | 1679 | Capture | Record |
@@ -1691,10 +1690,10 @@ gets them for kernelshark. | |||
1691 | 1690 | ||
1692 | In the above screenshot, we've decided to explore the graphics subsystem | 1691 | In the above screenshot, we've decided to explore the graphics subsystem |
1693 | a bit and so have chosen to trace all the tracepoints contained within | 1692 | a bit and so have chosen to trace all the tracepoints contained within |
1694 | the 'i915' and 'drm' subsystems. | 1693 | the ``i915`` and ``drm`` subsystems. |
1695 | 1694 | ||
1696 | After doing that, we can start and stop the trace using the 'Run' and | 1695 | After doing that, we can start and stop the trace using the ``Run`` and |
1697 | 'Stop' button on the lower right corner of the dialog (the same button | 1696 | ``Stop`` button on the lower right corner of the dialog (the same button |
1698 | will turn into the 'Stop' button after the trace has started): | 1697 | will turn into the ``Stop`` button after the trace has started): |
1699 | 1698 | ||
1700 | .. image:: figures/kernelshark-output-display.png | 1699 | .. image:: figures/kernelshark-output-display.png |
@@ -1704,15 +1703,15 @@ Notice that the right-hand pane shows the exact trace-cmd command-line | |||
1704 | that's used to run the trace, along with the results of the trace-cmd | 1703 | that's used to run the trace, along with the results of the trace-cmd |
1705 | run. | 1704 | run. |
1706 | 1705 | ||
1707 | Once the 'Stop' button is pressed, the graphical view magically fills up | 1706 | Once the ``Stop`` button is pressed, the graphical view magically fills up |
1708 | with a colorful per-cpu display of the trace data, along with the | 1707 | with a colorful per-CPU display of the trace data, along with the |
1709 | detailed event listing below that: | 1708 | detailed event listing below that: |
1710 | 1709 | ||
1711 | .. image:: figures/kernelshark-i915-display.png | 1710 | .. image:: figures/kernelshark-i915-display.png |
1712 | :align: center | 1711 | :align: center |
1713 | 1712 | ||
1714 | Here's another example, this time a display resulting from tracing 'all | 1713 | Here's another example, this time a display resulting from tracing ``all |
1715 | events': | 1714 | events``: |
1716 | 1715 | ||
1717 | .. image:: figures/kernelshark-all.png | 1716 | .. image:: figures/kernelshark-all.png |
1718 | :align: center | 1717 | :align: center |
@@ -1760,12 +1759,12 @@ systemtap | |||
1760 | SystemTap is a system-wide script-based tracing and profiling tool. | 1759 | SystemTap is a system-wide script-based tracing and profiling tool. |
1761 | 1760 | ||
1762 | SystemTap scripts are C-like programs that are executed in the kernel to | 1761 | SystemTap scripts are C-like programs that are executed in the kernel to |
1763 | gather/print/aggregate data extracted from the context they end up being | 1762 | gather / print / aggregate data extracted from the context they end up being |
1764 | invoked under. | 1763 | invoked under. |
1765 | 1764 | ||
1766 | For example, this probe from the `SystemTap | 1765 | For example, this probe from the `SystemTap |
1767 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a | 1766 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a |
1768 | line every time any process on the system open()s a file. For each line, | 1767 | line every time any process on the system runs ``open()`` on a file. For each line, |
1769 | it prints the executable name of the program that opened the file, along | 1768 | it prints the executable name of the program that opened the file, along |
1770 | with its PID, and the name of the file it opened (or tried to open), | 1769 | with its PID, and the name of the file it opened (or tried to open), |
1771 | which it extracts from the open syscall's argstr. | 1770 | which it extracts from the ``open()`` syscall's ``argstr``. |
@@ -1790,19 +1789,19 @@ file containing the above text is trace_open.stp:: | |||
1790 | # stap trace_open.stp | 1789 | # stap trace_open.stp |
1791 | 1790 | ||
1792 | What systemtap does under the covers to run this probe is 1) parse and | 1791 | What SystemTap does under the covers to run this probe is 1) parse and |
1793 | convert the probe to an equivalent 'C' form, 2) compile the 'C' form | 1792 | convert the probe to an equivalent "C" form, 2) compile the "C" form |
1794 | into a kernel module, 3) insert the module into the kernel, which arms | 1793 | into a kernel module, 3) insert the module into the kernel, which arms |
1795 | it, and 4) collect the data generated by the probe and display it to the | 1794 | it, and 4) collect the data generated by the probe and display it to the |
1796 | user. | 1795 | user. |
1797 | 1796 | ||
1798 | In order to accomplish steps 1 and 2, the 'stap' program needs access to | 1797 | In order to accomplish steps 1 and 2, the ``stap`` program needs access to |
1799 | the kernel build system that produced the kernel that the probed system | 1798 | the kernel build system that produced the kernel that the probed system |
1800 | is running. In the case of a typical embedded system (the 'target'), the | 1799 | is running. In the case of a typical embedded system (the "target"), the |
1801 | kernel build system unfortunately isn't typically part of the image | 1800 | kernel build system unfortunately isn't typically part of the image |
1802 | running on the target. It is normally available on the 'host' system | 1801 | running on the target. It is normally available on the "host" system |
1803 | that produced the target image however; in such cases, steps 1 and 2 are | 1802 | that produced the target image however; in such cases, steps 1 and 2 are |
1804 | executed on the host system, and steps 3 and 4 are executed on the | 1803 | executed on the host system, and steps 3 and 4 are executed on the |
1805 | target system, using only the systemtap 'runtime'. | 1804 | target system, using only the systemtap "runtime". |
1806 | 1805 | ||
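For reference, the host/target split described above can be sketched with ``stap`` itself; crosstap, described below, automates this, and the kernel build path and target address here are hypothetical::

    # On the host: run passes 1-4 only (-p4), i.e. parse the probe,
    # translate it to C and compile it into a module (trace_open.ko),
    # against the kernel build tree that produced the target kernel
    $ stap -p4 -r /path/to/target/kernel/build -m trace_open trace_open.stp

    # On the target: steps 3 and 4 need only the SystemTap runtime
    $ scp trace_open.ko root@192.168.7.2:
    $ ssh root@192.168.7.2 staprun trace_open.ko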
1807 | The systemtap support in Yocto assumes that only steps 3 and 4 are run | 1806 | The systemtap support in Yocto assumes that only steps 3 and 4 are run |
1808 | on the target; it is possible to do everything on the target, but this | 1807 | on the target; it is possible to do everything on the target, but this |
@@ -1819,12 +1818,12 @@ systemtap Setup | |||
1819 | --------------- | 1818 | --------------- |
1820 | 1819 | ||
1821 | Those are a lot of steps and a lot of details, but fortunately Yocto | 1820 | Those are a lot of steps and a lot of details, but fortunately Yocto |
1822 | includes a script called 'crosstap' that will take care of those | 1821 | includes a script called ``crosstap`` that will take care of those |
1823 | details, allowing you to simply execute a systemtap script on the remote | 1822 | details, allowing you to simply execute a systemtap script on the remote |
1824 | target, with arguments if necessary. | 1823 | target, with arguments if necessary. |
1825 | 1824 | ||
1826 | In order to do this from a remote host, however, you need to have access | 1825 | In order to do this from a remote host, however, you need to have access |
1827 | to the build for the image you booted. The 'crosstap' script provides | 1826 | to the build for the image you booted. The ``crosstap`` script provides |
1828 | details on how to do this if you run the script on the host without | 1827 | details on how to do this if you run the script on the host without |
1829 | having done a build:: | 1828 | having done a build:: |
1830 | 1829 | ||
@@ -1833,29 +1832,35 @@ having done a build:: | |||
1833 | Error: No target kernel build found. | 1832 | Error: No target kernel build found. |
1834 | Did you forget to create a local build of your image? | 1833 | Did you forget to create a local build of your image? |
1835 | 1834 | ||
1836 | 'crosstap' requires a local sdk build of the target system | 1835 | 'crosstap' requires a local sdk build of the target system |
1837 | (or a build that includes 'tools-profile') in order to build | 1836 | (or a build that includes 'tools-profile') in order to build |
1838 | kernel modules that can probe the target system. | 1837 | kernel modules that can probe the target system. |
1839 | 1838 | ||
1840 | Practically speaking, that means you need to do the following: | 1839 | Practically speaking, that means you need to do the following: |
1841 | - If you're running a pre-built image, download the release | 1840 | |
1842 | and/or BSP tarballs used to build the image. | 1841 | - If you're running a pre-built image, download the release |
1843 | - If you're working from git sources, just clone the metadata | 1842 | and/or BSP tarballs used to build the image. |
1844 | and BSP layers needed to build the image you'll be booting. | 1843 | |
1845 | - Make sure you're properly set up to build a new image (see | 1844 | - If you're working from git sources, just clone the metadata |
1846 | the BSP README and/or the widely available basic documentation | 1845 | and BSP layers needed to build the image you'll be booting. |
1847 | that discusses how to build images). | 1846 | |
1848 | - Build an -sdk version of the image e.g.: | 1847 | - Make sure you're properly set up to build a new image (see |
1849 | $ bitbake core-image-sato-sdk | 1848 | the BSP README and/or the widely available basic documentation |
1850 | OR | 1849 | that discusses how to build images). |
1851 | - Build a non-sdk image but include the profiling tools: | 1850 | |
1852 | [ edit local.conf and add 'tools-profile' to the end of | 1851 | - Build an -sdk version of the image e.g.:: |
1853 | the EXTRA_IMAGE_FEATURES variable ] | 1852 | |
1854 | $ bitbake core-image-sato | 1853 | $ bitbake core-image-sato-sdk |
1854 | |||
1855 | - Or build a non-sdk image but include the profiling tools | ||
1856 | (edit ``local.conf`` and add ``tools-profile`` to the end of | ||
1857 | the :term:`EXTRA_IMAGE_FEATURES` variable):: ||
1858 | |||
1859 | $ bitbake core-image-sato | ||
1855 | 1860 | ||
1856 | Once you've built the image on the host system, you're ready to | 1861 | Once you've built the image on the host system, you're ready to |
1857 | boot it (or the equivalent pre-built image) and use 'crosstap' | 1862 | boot it (or the equivalent pre-built image) and use ``crosstap`` |
1858 | to probe it (you need to source the environment as usual first): | 1863 | to probe it (you need to source the environment as usual first):: |
1859 | 1864 | ||
1860 | $ source oe-init-build-env | 1865 | $ source oe-init-build-env |
1861 | $ cd ~/my/systemtap/scripts | 1866 | $ cd ~/my/systemtap/scripts |
@@ -1863,23 +1868,22 @@ having done a build:: | |||
1863 | 1868 | ||
1864 | .. note:: | 1869 | .. note:: |
1865 | 1870 | ||
1866 | SystemTap, which uses 'crosstap', assumes you can establish an ssh | 1871 | SystemTap, which uses ``crosstap``, assumes you can establish an ssh |
1867 | connection to the remote target. Please refer to the crosstap wiki | 1872 | connection to the remote target. Please refer to the crosstap wiki |
1868 | page for details on verifying ssh connections at | 1873 | page for details on verifying ssh connections. Also, the ability to ssh |
1869 | . Also, the ability to ssh into the target system is not enabled by | 1874 | into the target system is not enabled by default in ``*-minimal`` images. |
1870 | default in \*-minimal images. | ||
1871 | 1875 | ||
1872 | So essentially what you need to | 1876 | So essentially what you need to |
1873 | do is build an SDK image or image with 'tools-profile' as detailed in | 1877 | do is build an SDK image or an image with ``tools-profile`` as detailed in |
1874 | the ":ref:`profile-manual/intro:General Setup`" section of this | 1878 | the ":ref:`profile-manual/intro:General Setup`" section of this |
1875 | manual, and boot the resulting target image. | 1879 | manual, and boot the resulting target image. |
1876 | 1880 | ||
1877 | .. note:: | 1881 | .. note:: |
1878 | 1882 | ||
1879 | If you have a build directory containing multiple machines, you need | 1883 | If you have a :term:`Build Directory` containing multiple machines, you need |
1880 | to have the MACHINE you're connecting to selected in local.conf, and | 1884 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and |
1881 | the kernel in that machine's build directory must match the kernel on | 1885 | the kernel in that machine's :term:`Build Directory` must match the kernel on |
1882 | the booted system exactly, or you'll get the above 'crosstap' message | 1886 | the booted system exactly, or you'll get the above ``crosstap`` message |
1883 | when you try to invoke a script. | 1887 | when you try to invoke a script. |
1884 | 1888 | ||
1885 | Running a Script on a Target | 1889 | Running a Script on a Target |
@@ -1903,8 +1907,8 @@ the target:: | |||
1903 | 1907 | ||
1904 | You can also run generated QEMU images with a command like 'runqemu qemux86-64' | 1908 | You can also run generated QEMU images with a command like 'runqemu qemux86-64' |
1905 | 1909 | ||
1906 | Once you've done that, you can cd to whatever | 1910 | Once you've done that, you can ``cd`` to whatever |
1907 | directory contains your scripts and use 'crosstap' to run the script:: | 1911 | directory contains your scripts and use ``crosstap`` to run the script:: |
1908 | 1912 | ||
1909 | $ cd /path/to/my/systemtap/script | 1913 | $ cd /path/to/my/systemtap/script |
1910 | $ crosstap root@192.168.7.2 trace_open.stp | 1914 | $ crosstap root@192.168.7.2 trace_open.stp |
@@ -1918,9 +1922,8 @@ Try ssh'ing to the target and see what happens:: | |||
1918 | 1922 | ||
1919 | $ ssh root@192.168.7.2 | 1923 | $ ssh root@192.168.7.2 |
1920 | 1924 | ||
1921 | A lot of the time, connection | 1925 | Connection problems are often due to specifying a wrong IP address or having a ``host key |
1922 | problems are due specifying a wrong IP address or having a 'host key | 1926 | verification error``. |
1923 | verification error'. ||
1924 | 1927 | ||
1925 | If everything worked as planned, you should see something like this | 1928 | If everything worked as planned, you should see something like this |
1926 | (enter the password when prompted, or press enter if it's set up to use | 1929 | (enter the password when prompted, or press enter if it's set up to use |
@@ -1958,16 +1961,16 @@ outlined in the ":ref:`profile-manual/intro:General Setup`" section. | |||
1958 | 1961 | ||
1959 | Sysprof is a GUI-based application that runs on the target system. For | 1962 | the rest of this document we assume you've ssh'ed to the target and will |
1960 | the rest of this document we assume you've ssh'ed to the host and will | 1963 | the rest of this document we assume you've ssh'ed to the host and will |
1961 | be running Sysprof on the target (you can use the '-X' option to ssh and | 1964 | be running Sysprof on the target (you can use the ``-X`` option to ssh and |
1962 | have the Sysprof GUI run on the target but display remotely on the host | 1965 | have the Sysprof GUI run on the target but display remotely on the host |
1963 | if you want). | 1966 | if you want). |
1964 | 1967 | ||
1965 | Basic Sysprof Usage | 1968 | Basic Sysprof Usage |
1966 | ------------------- | 1969 | ------------------- |
1967 | 1970 | ||
1968 | To start profiling the system, you simply press the 'Start' button. To | 1971 | To start profiling the system, you simply press the ``Start`` button. To |
1969 | stop profiling and to start viewing the profile data in one easy step, | 1972 | stop profiling and to start viewing the profile data in one easy step, |
1970 | press the 'Profile' button. | 1973 | press the ``Profile`` button. |
1971 | 1974 | ||
1972 | Once you've pressed the profile button, the three panes will fill up | 1975 | Once you've pressed the profile button, the three panes will fill up |
1973 | with profiling data: | 1976 | with profiling data: |
@@ -1982,7 +1985,7 @@ perf's default callee-oriented callchain display. | |||
1982 | 1985 | ||
1983 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and | 1986 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and |
1984 | looking up the callchain we can see that one of the callers of | 1987 | looking up the callchain we can see that one of the callers of |
1985 | ``__copy_to_user_ll`` is sys_read() and the complete callpath between them. | 1988 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete callpath between them. |
1986 | Notice that this is essentially a portion of the same information we saw | 1989 | Notice that this is essentially a portion of the same information we saw |
1987 | in the perf display shown in the perf section of this page. | 1990 | in the perf display shown in the perf section of this page. |
1988 | 1991 | ||
@@ -1990,7 +1993,7 @@ in the perf display shown in the perf section of this page. | |||
1990 | :align: center | 1993 | :align: center |
1991 | 1994 | ||
1992 | Similarly, the above is a snapshot of the Sysprof display of a | 1995 | Similarly, the above is a snapshot of the Sysprof display of a |
1993 | copy-from-user callchain. | 1996 | ``copy-from-user`` callchain. |
1994 | 1997 | ||
1995 | Finally, looking at the third Sysprof pane in the lower left, we can see | 1998 | Finally, looking at the third Sysprof pane in the lower left, we can see |
1996 | a list of all the callers of a particular function selected in the top | 1999 | a list of all the callers of a particular function selected in the top |
@@ -2005,10 +2008,10 @@ to the selected function, and so on. | |||
2005 | 2008 | ||
2006 | .. admonition:: Tying it Together | 2009 | .. admonition:: Tying it Together |
2007 | 2010 | ||
2008 | If you like sysprof's 'caller-oriented' display, you may be able to | 2011 | If you like Sysprof's "caller-oriented" display, you may be able to |
2009 | approximate it in other tools as well. For example, 'perf report' has | 2012 | approximate it in other tools as well. For example, ``perf report`` has |
2010 | the -g (--call-graph) option that you can experiment with; one of the | 2013 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the |
2011 | options is 'caller' for an inverted caller-based callgraph display. | 2014 | options is ``caller`` for an inverted caller-based callgraph display. |
2012 | 2015 | ||
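For example, assuming you have already recorded call-graph data into a ``perf.data`` file, an inverted, Sysprof-like view might look like this (a sketch; the ``-g`` argument syntax is described in the perf-report man page)::

    # Record with call-graph data, then display the call graph
    # inverted so that callers appear above their callees
    root@sugarbay:~# perf record -g -- sleep 5
    root@sugarbay:~# perf report -g graph,0.5,caller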
2013 | Sysprof Documentation | 2016 | Sysprof Documentation |
2014 | --------------------- | 2017 | --------------------- |
@@ -2032,7 +2035,7 @@ Collecting and Viewing Traces | |||
2032 | ----------------------------- | 2035 | ----------------------------- |
2033 | 2036 | ||
2034 | Once you've applied the above commits and built and booted your image | 2037 | Once you've applied the above commits and built and booted your image |
2035 | (you need to build the core-image-sato-sdk image or use one of the other | 2038 | (you need to build the ``core-image-sato-sdk`` image or use one of the other |
2036 | methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start | 2039 | methods described in the ":ref:`profile-manual/intro:General Setup`" section), you're ready to start |
2037 | tracing. | 2040 | tracing. |
2038 | 2041 | ||
@@ -2117,14 +2120,14 @@ You can now view the trace in text form on the target:: | |||
2117 | 2120 | ||
2118 | You can now safely destroy the trace | 2121 | You can now safely destroy the trace |
2119 | session (note that this doesn't delete the trace --- it's still there in | 2122 | session (note that this doesn't delete the trace --- it's still there in |
2120 | ~/lttng-traces):: | 2123 | ``~/lttng-traces``):: |
2121 | 2124 | ||
2122 | root@crownbay:~# lttng destroy | 2125 | root@crownbay:~# lttng destroy |
2123 | Session auto-20121015-232120 destroyed at /home/root | 2126 | Session auto-20121015-232120 destroyed at /home/root |
2124 | 2127 | ||
2125 | Note that the trace is saved in a directory of the same name as returned by | 2128 | Note that the trace is saved in a directory of the same name as returned by |
2126 | 'lttng create', under the ~/lttng-traces directory (note that you can change this by | 2129 | ``lttng create``, under the ``~/lttng-traces`` directory (note that you can change this by |
2127 | supplying your own name to 'lttng create'):: | 2130 | supplying your own name to ``lttng create``):: |
2128 | 2131 | ||
2129 | root@crownbay:~# ls -al ~/lttng-traces | 2132 | root@crownbay:~# ls -al ~/lttng-traces |
2130 | drwxrwx--- 3 root root 1024 Oct 15 23:21 . | 2133 | drwxrwx--- 3 root root 1024 Oct 15 23:21 . |
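As a sketch (the session name here is made up), supplying your own name to ``lttng create`` works like this::

    # Create a named session; the trace directory under ~/lttng-traces
    # is derived from the session name plus a timestamp
    root@crownbay:~# lttng create my-kernel-session
    root@crownbay:~# lttng enable-event --kernel sched_switch
    root@crownbay:~# lttng start
    root@crownbay:~# sleep 2
    root@crownbay:~# lttng stop
    root@crownbay:~# lttng destroy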
@@ -2135,12 +2138,12 @@ Collecting and viewing a userspace trace on the target (inside a shell) | |||
2135 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2138 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2136 | 2139 | ||
2137 | For LTTng userspace tracing, you need to have a properly instrumented | 2140 | For LTTng userspace tracing, you need to have a properly instrumented |
2138 | userspace program. For this example, we'll use the 'hello' test program | 2141 | userspace program. For this example, we'll use the ``hello`` test program |
2139 | generated by the lttng-ust build. | 2142 | generated by the ``lttng-ust`` build. |
2140 | 2143 | ||
2141 | The 'hello' test program isn't installed on the root filesystem by the lttng-ust | 2144 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` |
2142 | build, so we need to copy it over manually. First cd into the build | 2145 | build, so we need to copy it over manually. First ``cd`` into the build |
2143 | directory that contains the hello executable:: | 2146 | directory that contains the ``hello`` executable:: |
2144 | 2147 | ||
2145 | $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs | 2148 | $ cd build/tmp/work/core2_32-poky-linux/lttng-ust/2.0.5-r0/git/tests/hello/.libs |
2146 | 2149 | ||
@@ -2148,7 +2151,7 @@ Copy that over to the target machine:: | |||
2148 | 2151 | ||
2149 | $ scp hello root@192.168.1.20: | 2152 | $ scp hello root@192.168.1.20: |
2150 | 2153 | ||
2151 | You now have the instrumented lttng 'hello world' test program on the | 2154 | You now have the instrumented LTTng "hello world" test program on the |
2152 | target, ready to test. | 2155 | target, ready to test. |
2153 | 2156 | ||
2154 | First, from the host, ssh to the target:: | 2157 | First, from the host, ssh to the target:: |
@@ -2176,7 +2179,7 @@ Start the trace:: | |||
2176 | root@crownbay:~# lttng start | 2179 | root@crownbay:~# lttng start |
2177 | Tracing started for session auto-20190303-021943 | 2180 | Tracing started for session auto-20190303-021943 |
2178 | 2181 | ||
2179 | Run the instrumented hello world program:: | 2182 | Run the instrumented "hello world" program:: |
2180 | 2183 | ||
2181 | root@crownbay:~# ./hello | 2184 | root@crownbay:~# ./hello |
2182 | Hello, World! | 2185 | Hello, World! |
@@ -2200,7 +2203,7 @@ You can now view the trace in text form on the target:: | |||
2200 | . | 2203 | . |
2201 | 2204 | ||
2202 | You can now safely destroy the trace session (note that this doesn't delete the | 2205 | You can now safely destroy the trace session (note that this doesn't delete the |
2203 | trace --- it's still there in ~/lttng-traces):: | 2206 | trace --- it's still there in ``~/lttng-traces``):: |
2204 | 2207 | ||
2205 | root@crownbay:~# lttng destroy | 2208 | root@crownbay:~# lttng destroy |
2206 | Session auto-20190303-021943 destroyed at /home/root | 2209 | Session auto-20190303-021943 destroyed at /home/root |
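The hunks above walk through a complete LTTng user-space session one step at a time. As a condensed recap (a sketch only: the target IP and session name are the examples from the text, the real commands need a target board and are therefore shown as comments, and exact ``lttng`` sub-command behavior can vary by version):

```shell
#!/bin/sh
# Condensed recap of the LTTng "hello" session shown above.
# Real commands are commented out; the script only prints the order of steps.
# scp hello root@192.168.1.20:     # copy the instrumented test program over
# ssh root@192.168.1.20            # then, on the target:
#   lttng create                   # create a tracing session
#   lttng start                    # start tracing
#   ./hello                        # run the instrumented program
#   lttng stop                     # stop tracing
#   lttng view                     # view the trace as text
#   lttng destroy                  # tear down (trace stays in ~/lttng-traces)
for step in create start run stop view destroy; do
    echo "step: $step"
done
```

Running the script just echoes the six step names in order, which mirrors the sequence of the hunks above.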
@@ -2244,12 +2247,12 @@ will be running blktrace on the target. | |||
2244 | Basic blktrace Usage | 2247 | Basic blktrace Usage |
2245 | -------------------- | 2248 | -------------------- |
2246 | 2249 | ||
2247 | To record a trace, simply run the 'blktrace' command, giving it the name | 2250 | To record a trace, simply run the ``blktrace`` command, giving it the name |
2248 | of the block device you want to trace activity on:: | 2251 | of the block device you want to trace activity on:: |
2249 | 2252 | ||
2250 | root@crownbay:~# blktrace /dev/sdc | 2253 | root@crownbay:~# blktrace /dev/sdc |
2251 | 2254 | ||
2252 | In another shell, execute a workload you want to trace. :: | 2255 | In another shell, execute a workload you want to trace:: |
2253 | 2256 | ||
2254 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync | 2257 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync |
2255 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2258 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
@@ -2258,7 +2261,7 @@ In another shell, execute a workload you want to trace. :: | |||
2258 | Press Ctrl-C in the blktrace shell to stop the trace. It | 2261 | Press Ctrl-C in the blktrace shell to stop the trace. It |
2259 | will display how many events were logged, along with the per-cpu file | 2262 | will display how many events were logged, along with the per-cpu file |
2260 | sizes (blktrace records traces in per-cpu kernel buffers and simply | 2263 | sizes (blktrace records traces in per-cpu kernel buffers and simply |
2261 | dumps them to userspace for blkparse to merge and sort later). :: | 2264 | dumps them to userspace for blkparse to merge and sort later):: |
2262 | 2265 | ||
2263 | ^C=== sdc === | 2266 | ^C=== sdc === |
2264 | CPU 0: 7082 events, 332 KiB data | 2267 | CPU 0: 7082 events, 332 KiB data |
@@ -2274,7 +2277,7 @@ with the device name as the first part of the filename:: | |||
2274 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 | 2277 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 |
2275 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 | 2278 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 |
2276 | 2279 | ||
2277 | To view the trace events, simply invoke 'blkparse' in the directory | 2280 | To view the trace events, simply invoke ``blkparse`` in the directory |
2278 | containing the trace files, giving it the device name that forms the | 2281 | containing the trace files, giving it the device name that forms the |
2279 | first part of the filenames:: | 2282 | first part of the filenames:: |
2280 | 2283 | ||
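The record/parse cycle in the hunks above can be sketched end to end. This is a sketch under stated assumptions: ``blktrace`` itself needs root and a real block device, so those commands are commented out, and the live part only demonstrates how the per-cpu output file names are derived from the device node (``/dev/sdc`` is the example device from the text):

```shell
#!/bin/sh
# Sketch of the blktrace record/parse cycle described above.
DEV=/dev/sdc                      # example device from the text
# blktrace "$DEV" &               # record into per-cpu kernel buffers
# ...run the workload, then Ctrl-C (SIGINT) the blktrace shell...
# blkparse -i "$(basename "$DEV")"   # merge and sort the per-cpu files
BASE=$(basename "$DEV")           # blktrace names its files after the device
echo "per-cpu trace files: ${BASE}.blktrace.<cpu>"
```

This matches the listing above, where tracing ``/dev/sdc`` produced ``sdc.blktrace.0`` and ``sdc.blktrace.1`` (one file per CPU).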
@@ -2373,7 +2376,7 @@ Live Mode | |||
2373 | ~~~~~~~~~ | 2376 | ~~~~~~~~~ |
2374 | 2377 | ||
2375 | blktrace and blkparse are designed from the ground up to be able to | 2378 | blktrace and blkparse are designed from the ground up to be able to |
2376 | operate together in a 'pipe mode' where the stdout of blktrace can be | 2379 | operate together in a "pipe mode" where the stdout of blktrace can be |
2377 | fed directly into the stdin of blkparse:: | 2380 | fed directly into the stdin of blkparse:: |
2378 | 2381 | ||
2379 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - | 2382 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - |
@@ -2419,13 +2422,13 @@ On the host system, you should see this:: | |||
2419 | 2422 | ||
2420 | server: connection from 192.168.1.43 | 2423 | server: connection from 192.168.1.43 |
2421 | 2424 | ||
2422 | In another shell, execute a workload you want to trace. :: | 2425 | In another shell, execute a workload you want to trace:: |
2423 | 2426 | ||
2424 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync | 2427 | root@crownbay:/media/sdc# rm linux-2.6.19.2.tar.bz2; wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2; sync |
2425 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2428 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
2426 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 2429 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
2427 | 2430 | ||
2428 | When it's done, do a Ctrl-C on the target system to stop the | 2431 | When it's done, do a ``Ctrl-C`` on the target system to stop the |
2429 | trace:: | 2432 | trace:: |
2430 | 2433 | ||
2431 | ^C=== sdc === | 2434 | ^C=== sdc === |
@@ -2450,7 +2453,7 @@ save the target output inside a hostname-timestamp directory:: | |||
2450 | drwxr-sr-x 4 root root 1024 Oct 26 18:24 .. | 2453 | drwxr-sr-x 4 root root 1024 Oct 26 18:24 .. |
2451 | drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56 | 2454 | drwxr-xr-x 2 root root 1024 Oct 28 02:40 192.168.1.43-2012-10-28-02:40:56 |
2452 | 2455 | ||
2453 | cd into that directory to see the output files:: | 2456 | ``cd`` into that directory to see the output files:: |
2454 | 2457 | ||
2455 | $ ls -l | 2458 | $ ls -l |
2456 | -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0 | 2459 | -rw-r--r-- 1 root root 369193 Oct 28 02:44 sdc.blktrace.0 |
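The two "live" setups above can be summarized side by side. A hedged sketch: the ``-l`` (listen) and ``-h`` (host) flags are the blktrace network-mode options from its man page, the IP address is the example from the text, and the live part of the script only reconstructs the hostname-timestamp directory name shown in the listing above:

```shell
#!/bin/sh
# Pipe mode, on one machine (stdout of blktrace into stdin of blkparse):
#   blktrace /dev/sdc -o - | blkparse -i -
# Network mode, host (server) side:
#   blktrace -l
# Network mode, target (client) side, streaming to the host:
#   blktrace -d /dev/sdc -h 192.168.1.43
CLIENT=192.168.1.43               # client address, as in the server output above
STAMP=2012-10-28-02:40:56         # connection timestamp from the listing above
echo "server output directory: ${CLIENT}-${STAMP}"
```

In network mode the server side ends up holding the ``sdc.blktrace.*`` files under that per-client directory, so ``blkparse`` is then run on the host rather than the target.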
@@ -2517,14 +2520,14 @@ It's also possible to trace block I/O using only | |||
2517 | can be useful for casual tracing if you don't want to bother dealing with the | 2520 | can be useful for casual tracing if you don't want to bother dealing with the |
2518 | userspace tools. | 2521 | userspace tools. |
2519 | 2522 | ||
2520 | To enable tracing for a given device, use /sys/block/xxx/trace/enable, | 2523 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, |
2521 | where xxx is the device name. This for example enables tracing for | 2524 | where ``xxx`` is the device name. This for example enables tracing for |
2522 | /dev/sdc:: | 2525 | ``/dev/sdc``:: |
2523 | 2526 | ||
2524 | root@crownbay:/sys/kernel/debug/tracing# echo 1 > /sys/block/sdc/trace/enable | 2527 | root@crownbay:/sys/kernel/debug/tracing# echo 1 > /sys/block/sdc/trace/enable |
2525 | 2528 | ||
2526 | Once you've selected the device(s) you want | 2529 | Once you've selected the device(s) you want |
2527 | to trace, selecting the 'blk' tracer will turn the blk tracer on:: | 2530 | to trace, selecting the ``blk`` tracer will turn the blk tracer on:: |
2528 | 2531 | ||
2529 | root@crownbay:/sys/kernel/debug/tracing# cat available_tracers | 2532 | root@crownbay:/sys/kernel/debug/tracing# cat available_tracers |
2530 | blk function_graph function nop | 2533 | blk function_graph function nop |
@@ -2535,7 +2538,7 @@ Execute the workload you're interested in:: | |||
2535 | 2538 | ||
2536 | root@crownbay:/sys/kernel/debug/tracing# cat /media/sdc/testfile.txt | 2539 | root@crownbay:/sys/kernel/debug/tracing# cat /media/sdc/testfile.txt |
2537 | 2540 | ||
2538 | And look at the output (note here that we're using 'trace_pipe' instead of | 2541 | And look at the output (note here that we're using ``trace_pipe`` instead of |
2539 | trace to capture this trace --- this allows us to wait around on the pipe | 2542 | trace to capture this trace --- this allows us to wait around on the pipe |
2540 | for data to appear):: | 2543 | for data to appear):: |
2541 | 2544 | ||
@@ -2572,8 +2575,8 @@ section can be found here: | |||
2572 | 2575 | ||
2573 | - https://linux.die.net/man/8/btrace | 2576 | - https://linux.die.net/man/8/btrace |
2574 | 2577 | ||
2575 | The above manpages, along with manpages for the other blktrace utilities | 2578 | The above manpages, along with manuals for the other blktrace utilities |
2576 | (btt, blkiomon, etc) can be found in the /doc directory of the blktrace | 2579 | (btt, blkiomon, etc) can be found in the ``/doc`` directory of the blktrace |
2577 | tools git repo:: | 2580 | tools git repo:: |
2578 | 2581 | ||
2579 | $ git clone git://git.kernel.dk/blktrace.git | 2582 | $ git clone git://git.kernel.dk/blktrace.git |