field | value | date
---|---|---
author | Michael Opdenacker <michael.opdenacker@bootlin.com> | 2024-03-14 13:28:00 +0100
committer | Steve Sakoman <steve@sakoman.com> | 2024-03-28 07:08:31 -1000
commit | c0bc268a594f1742522ff36fc0d945796e9bb2de |
tree | a0daf3719861f4fbe6e7cfab2c8bc73a7b7690d2 |
parent | 664191d437ffc11fcbcb2b20c73140e5c40e394d |
download | poky-c0bc268a594f1742522ff36fc0d945796e9bb2de.tar.gz |
profile-manual: usage.rst: further style improvements
According to errors reported by "make stylecheck"
(From yocto-docs rev: 3d6b7aa4b848403a5dcde0cdf68c38060f4ab0af)
Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com>
Signed-off-by: Steve Sakoman <steve@sakoman.com>
-rw-r--r-- | documentation/profile-manual/usage.rst | 335
-rw-r--r-- | documentation/styles/config/vocabularies/OpenSource/accept.txt | 20
-rw-r--r-- | documentation/styles/config/vocabularies/Yocto/accept.txt | 5
3 files changed, 187 insertions, 173 deletions
diff --git a/documentation/profile-manual/usage.rst b/documentation/profile-manual/usage.rst
index 0d368792b5..2f82137538 100644
--- a/documentation/profile-manual/usage.rst
+++ b/documentation/profile-manual/usage.rst
@@ -10,7 +10,7 @@ Basic Usage (with examples) for each of the Yocto Tracing Tools | |||
10 | This chapter presents basic usage examples for each of the tracing | 10 | This chapter presents basic usage examples for each of the tracing |
11 | tools. | 11 | tools. |
12 | 12 | ||
13 | Perf | 13 | perf |
14 | ==== | 14 | ==== |
15 | 15 | ||
16 | The perf tool is the profiling and tracing tool that comes bundled | 16 | The perf tool is the profiling and tracing tool that comes bundled |
@@ -26,12 +26,12 @@ of what's going on. | |||
26 | 26 | ||
27 | In many ways, perf aims to be a superset of all the tracing and | 27 | In many ways, perf aims to be a superset of all the tracing and |
28 | profiling tools available in Linux today, including all the other tools | 28 | profiling tools available in Linux today, including all the other tools |
29 | covered in this HOWTO. The past couple of years have seen perf subsume a | 29 | covered in this How-to. The past couple of years have seen perf subsume a |
30 | lot of the functionality of those other tools and, at the same time, | 30 | lot of the functionality of those other tools and, at the same time, |
31 | those other tools have removed large portions of their previous | 31 | those other tools have removed large portions of their previous |
32 | functionality and replaced it with calls to the equivalent functionality | 32 | functionality and replaced it with calls to the equivalent functionality |
33 | now implemented by the perf subsystem. Extrapolation suggests that at | 33 | now implemented by the perf subsystem. Extrapolation suggests that at |
34 | some point those other tools will simply become completely redundant and | 34 | some point those other tools will become completely redundant and |
35 | go away; until then, we'll cover those other tools in these pages and in | 35 | go away; until then, we'll cover those other tools in these pages and in |
36 | many cases show how the same things can be accomplished in perf and the | 36 | many cases show how the same things can be accomplished in perf and the |
37 | other tools when it seems useful to do so. | 37 | other tools when it seems useful to do so. |
@@ -41,7 +41,7 @@ want to apply the tool; full documentation can be found either within | |||
41 | the tool itself or in the manual pages at | 41 | the tool itself or in the manual pages at |
42 | `perf(1) <https://linux.die.net/man/1/perf>`__. | 42 | `perf(1) <https://linux.die.net/man/1/perf>`__. |
43 | 43 | ||
44 | Perf Setup | 44 | perf Setup |
45 | ---------- | 45 | ---------- |
46 | 46 | ||
47 | For this section, we'll assume you've already performed the basic setup | 47 | For this section, we'll assume you've already performed the basic setup |
@@ -54,14 +54,14 @@ image built with the following in your ``local.conf`` file:: | |||
54 | 54 | ||
55 | perf runs on the target system for the most part. You can archive | 55 | perf runs on the target system for the most part. You can archive |
56 | profile data and copy it to the host for analysis, but for the rest of | 56 | profile data and copy it to the host for analysis, but for the rest of |
57 | this document we assume you've ssh'ed to the host and will be running | 57 | this document we assume you're connected to the host through SSH and will be |
58 | the perf commands on the target. | 58 | running the perf commands on the target. |
59 | 59 | ||
60 | Basic Perf Usage | 60 | Basic perf Usage |
61 | ---------------- | 61 | ---------------- |
62 | 62 | ||
63 | The perf tool is pretty much self-documenting. To remind yourself of the | 63 | The perf tool is pretty much self-documenting. To remind yourself of the |
64 | available commands, simply type ``perf``, which will show you basic usage | 64 | available commands, just type ``perf``, which will show you basic usage |
65 | along with the available perf subcommands:: | 65 | along with the available perf subcommands:: |
66 | 66 | ||
67 | root@crownbay:~# perf | 67 | root@crownbay:~# perf |
@@ -101,7 +101,7 @@ As a simple test case, we'll profile the ``wget`` of a fairly large file, | |||
101 | which is a minimally interesting case because it has both file and | 101 | which is a minimally interesting case because it has both file and |
102 | network I/O aspects, and at least in the case of standard Yocto images, | 102 | network I/O aspects, and at least in the case of standard Yocto images, |
103 | it's implemented as part of BusyBox, so the methods we use to analyze it | 103 | it's implemented as part of BusyBox, so the methods we use to analyze it |
104 | can be used in a very similar way to the whole host of supported BusyBox | 104 | can be used in a similar way to the whole host of supported BusyBox |
105 | applets in Yocto:: | 105 | applets in Yocto:: |
106 | 106 | ||
107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ | 107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ |
@@ -164,17 +164,17 @@ hits and misses:: | |||
164 | 164 | ||
165 | 44.831023415 seconds time elapsed | 165 | 44.831023415 seconds time elapsed |
166 | 166 | ||
167 | So ``perf stat`` gives us a nice easy | 167 | As you can see, ``perf stat`` gives us a nice easy |
168 | way to get a quick overview of what might be happening for a set of | 168 | way to get a quick overview of what might be happening for a set of |
169 | events, but normally we'd need a little more detail in order to | 169 | events, but normally we'd need a little more detail in order to |
170 | understand what's going on in a way that we can act on in a useful way. | 170 | understand what's going on in a way that we can act on in a useful way. |
171 | 171 | ||
172 | To dive down into a next level of detail, we can use ``perf record`` / | 172 | To dive down into a next level of detail, we can use ``perf record`` / |
173 | ``perf report`` which will collect profiling data and present it to use using an | 173 | ``perf report`` which will collect profiling data and present it to use using an |
174 | interactive text-based UI (or simply as text if we specify ``--stdio`` to | 174 | interactive text-based UI (or just as text if we specify ``--stdio`` to |
175 | ``perf report``). | 175 | ``perf report``). |
176 | 176 | ||
177 | As our first attempt at profiling this workload, we'll simply run ``perf | 177 | As our first attempt at profiling this workload, we'll just run ``perf |
178 | record``, handing it the workload we want to profile (everything after | 178 | record``, handing it the workload we want to profile (everything after |
179 | ``perf record`` and any perf options we hand it --- here none, will be | 179 | ``perf record`` and any perf options we hand it --- here none, will be |
180 | executed in a new shell). perf collects samples until the process exits | 180 | executed in a new shell). perf collects samples until the process exits |
@@ -189,7 +189,7 @@ directory:: | |||
189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] | 189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] |
190 | 190 | ||
191 | To see the results in a | 191 | To see the results in a |
192 | "text-based UI" (tui), simply run ``perf report``, which will read the | 192 | "text-based UI" (tui), just run ``perf report``, which will read the |
193 | perf.data file in the current working directory and display the results | 193 | perf.data file in the current working directory and display the results |
194 | in an interactive UI:: | 194 | in an interactive UI:: |
195 | 195 | ||
@@ -204,10 +204,10 @@ The above screenshot displays a "flat" profile, one entry for each | |||
204 | profiling run, ordered from the most popular to the least (perf has | 204 | profiling run, ordered from the most popular to the least (perf has |
205 | options to sort in various orders and keys as well as display entries | 205 | options to sort in various orders and keys as well as display entries |
206 | only above a certain threshold and so on --- see the perf documentation | 206 | only above a certain threshold and so on --- see the perf documentation |
207 | for details). Note that this includes both userspace functions (entries | 207 | for details). Note that this includes both user space functions (entries |
208 | containing a ``[.]``) and kernel functions accounted to the process (entries | 208 | containing a ``[.]``) and kernel functions accounted to the process (entries |
209 | containing a ``[k]``). perf has command-line modifiers that can be used to | 209 | containing a ``[k]``). perf has command-line modifiers that can be used to |
210 | restrict the profiling to kernel or userspace, among others. | 210 | restrict the profiling to kernel or user space, among others. |
211 | 211 | ||
212 | Notice also that the above report shows an entry for ``busybox``, which is | 212 | Notice also that the above report shows an entry for ``busybox``, which is |
213 | the executable that implements ``wget`` in Yocto, but that instead of a | 213 | the executable that implements ``wget`` in Yocto, but that instead of a |
@@ -218,7 +218,7 @@ Before we do that, however, let's try running a different profile, one | |||
218 | which shows something a little more interesting. The only difference | 218 | which shows something a little more interesting. The only difference |
219 | between the new profile and the previous one is that we'll add the ``-g`` | 219 | between the new profile and the previous one is that we'll add the ``-g`` |
220 | option, which will record not just the address of a sampled function, | 220 | option, which will record not just the address of a sampled function, |
221 | but the entire callchain to the sampled function as well:: | 221 | but the entire call chain to the sampled function as well:: |
222 | 222 | ||
223 | root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 223 | root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
224 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 224 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
@@ -233,26 +233,26 @@ but the entire callchain to the sampled function as well:: | |||
233 | :align: center | 233 | :align: center |
234 | :width: 70% | 234 | :width: 70% |
235 | 235 | ||
236 | Using the callgraph view, we can actually see not only which functions | 236 | Using the call graph view, we can actually see not only which functions |
237 | took the most time, but we can also see a summary of how those functions | 237 | took the most time, but we can also see a summary of how those functions |
238 | were called and learn something about how the program interacts with the | 238 | were called and learn something about how the program interacts with the |
239 | kernel in the process. | 239 | kernel in the process. |
240 | 240 | ||
241 | Notice that each entry in the above screenshot now contains a ``+`` on the | 241 | Notice that each entry in the above screenshot now contains a ``+`` on the |
242 | left-hand side. This means that we can expand the entry and drill down | 242 | left side. This means that we can expand the entry and drill down |
243 | into the callchains that feed into that entry. Pressing ``Enter`` on any | 243 | into the call chains that feed into that entry. Pressing ``Enter`` on any |
244 | one of them will expand the callchain (you can also press ``E`` to expand | 244 | one of them will expand the call chain (you can also press ``E`` to expand |
245 | them all at the same time or ``C`` to collapse them all). | 245 | them all at the same time or ``C`` to collapse them all). |
246 | 246 | ||
247 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry | 247 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry |
248 | and several subnodes all the way down. This lets us see which callchains | 248 | and several subnodes all the way down. This lets us see which call chains |
249 | contributed to the profiled ``__copy_to_user_ll()`` function which | 249 | contributed to the profiled ``__copy_to_user_ll()`` function which |
250 | contributed 1.77% to the total profile. | 250 | contributed 1.77% to the total profile. |
251 | 251 | ||
252 | As a bit of background explanation for these callchains, think about | 252 | As a bit of background explanation for these call chains, think about |
253 | what happens at a high level when you run wget to get a file out on the | 253 | what happens at a high level when you run ``wget`` to get a file out on the |
254 | network. Basically what happens is that the data comes into the kernel | 254 | network. Basically what happens is that the data comes into the kernel |
255 | via the network connection (socket) and is passed to the userspace | 255 | via the network connection (socket) and is passed to the user space |
256 | program ``wget`` (which is actually a part of BusyBox, but that's not | 256 | program ``wget`` (which is actually a part of BusyBox, but that's not |
257 | important for now), which takes the buffers the kernel passes to it and | 257 | important for now), which takes the buffers the kernel passes to it and |
258 | writes it to a disk file to save it. | 258 | writes it to a disk file to save it. |
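The data path described in the hunk above (socket to user space buffer to disk file) can be illustrated with a minimal Python sketch. This is not how BusyBox `wget` is implemented; the function name `download_to_file` and the buffer size are assumptions chosen for illustration. Each `recv()` corresponds to the kernel copying received data up to user space, and each `write()` hands the buffer back to the kernel on its way to disk.

```python
import socket


def download_to_file(sock: socket.socket, path: str, bufsize: int = 4096) -> int:
    """Copy everything readable from sock into the file at path.

    Returns the number of bytes copied. Hypothetical helper, for illustration only.
    """
    total = 0
    with open(path, "wb") as out:
        while True:
            # Kernel -> user space: the socket data is copied into our buffer.
            chunk = sock.recv(bufsize)
            if not chunk:
                break  # peer closed the connection, nothing left to read
            # User space -> kernel: Python buffers this and issues write(2)
            # when the buffer fills or the file is closed.
            out.write(chunk)
            total += len(chunk)
    return total
```

A profile of a loop like this would be expected to show both halves of the journey: copy-to-user work under the receive path and copy-from-user work under the write path.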
@@ -262,16 +262,16 @@ is the part where the kernel passes the data it has read from the socket | |||
262 | down to wget i.e. a ``copy-to-user``. | 262 | down to wget i.e. a ``copy-to-user``. |
263 | 263 | ||
264 | Notice also that here there's also a case where the hex value is | 264 | Notice also that here there's also a case where the hex value is |
265 | displayed in the callstack, here in the expanded ``sys_clock_gettime()`` | 265 | displayed in the call stack, here in the expanded ``sys_clock_gettime()`` |
266 | function. Later we'll see it resolve to a userspace function call in | 266 | function. Later we'll see it resolve to a user space function call in |
267 | busybox. | 267 | BusyBox. |
268 | 268 | ||
269 | .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png | 269 | .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png |
270 | :align: center | 270 | :align: center |
271 | :width: 70% | 271 | :width: 70% |
272 | 272 | ||
273 | The above screenshot shows the other half of the journey for the data --- | 273 | The above screenshot shows the other half of the journey for the data --- |
274 | from the ``wget`` program's userspace buffers to disk. To get the buffers to | 274 | from the ``wget`` program's user space buffers to disk. To get the buffers to |
275 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to | 275 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to |
276 | the kernel, which then takes care via some circuitous path (probably | 276 | the kernel, which then takes care via some circuitous path (probably |
277 | also present somewhere in the profile data), to get it safely to disk. | 277 | also present somewhere in the profile data), to get it safely to disk. |
@@ -281,8 +281,8 @@ of how to extract useful information out of it, let's get back to the | |||
281 | task at hand and see if we can get some basic idea about where the time | 281 | task at hand and see if we can get some basic idea about where the time |
282 | is spent in the program we're profiling, wget. Remember that wget is | 282 | is spent in the program we're profiling, wget. Remember that wget is |
283 | actually implemented as an applet in BusyBox, so while the process name | 283 | actually implemented as an applet in BusyBox, so while the process name |
284 | is ``wget``, the executable we're actually interested in is BusyBox. So | 284 | is ``wget``, the executable we're actually interested in is ``busybox``. |
285 | let's expand the first entry containing BusyBox: | 285 | Therefore, let's expand the first entry containing BusyBox: |
286 | 286 | ||
287 | .. image:: figures/perf-wget-busybox-expanded-stripped.png | 287 | .. image:: figures/perf-wget-busybox-expanded-stripped.png |
288 | :align: center | 288 | :align: center |
@@ -293,7 +293,7 @@ hex value instead of a symbol as with most of the kernel entries. | |||
293 | Expanding the BusyBox entry doesn't make it any better. | 293 | Expanding the BusyBox entry doesn't make it any better. |
294 | 294 | ||
295 | The problem is that perf can't find the symbol information for the | 295 | The problem is that perf can't find the symbol information for the |
296 | busybox binary, which is actually stripped out by the Yocto build | 296 | ``busybox`` binary, which is actually stripped out by the Yocto build |
297 | system. | 297 | system. |
298 | 298 | ||
299 | One way around that is to put the following in your ``local.conf`` file | 299 | One way around that is to put the following in your ``local.conf`` file |
@@ -303,20 +303,20 @@ when you build the image:: | |||
303 | 303 | ||
304 | However, we already have an image with the binaries stripped, so | 304 | However, we already have an image with the binaries stripped, so |
305 | what can we do to get perf to resolve the symbols? Basically we need to | 305 | what can we do to get perf to resolve the symbols? Basically we need to |
306 | install the debuginfo for the BusyBox package. | 306 | install the debugging information for the BusyBox package. |
307 | 307 | ||
308 | To generate the debug info for the packages in the image, we can add | 308 | To generate the debug info for the packages in the image, we can add |
309 | ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example:: | 309 | ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example:: |
310 | 310 | ||
311 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" | 311 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" |
312 | 312 | ||
313 | Additionally, in order to generate the type of debuginfo that perf | 313 | Additionally, in order to generate the type of debugging information that perf |
314 | understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE` | 314 | understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE` |
315 | in the ``local.conf`` file:: | 315 | in the ``local.conf`` file:: |
316 | 316 | ||
317 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' | 317 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' |
318 | 318 | ||
319 | Once we've done that, we can install the debuginfo for BusyBox. The | 319 | Once we've done that, we can install the debugging information for BusyBox. The |
320 | debug packages once built can be found in ``build/tmp/deploy/rpm/*`` | 320 | debug packages once built can be found in ``build/tmp/deploy/rpm/*`` |
321 | on the host system. Find the ``busybox-dbg-...rpm`` file and copy it | 321 | on the host system. Find the ``busybox-dbg-...rpm`` file and copy it |
322 | to the target. For example:: | 322 | to the target. For example:: |
@@ -324,11 +324,11 @@ to the target. For example:: | |||
324 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: | 324 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: |
325 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 | 325 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 |
326 | 326 | ||
327 | Now install the debug rpm on the target:: | 327 | Now install the debug RPM on the target:: |
328 | 328 | ||
329 | root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm | 329 | root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm |
330 | 330 | ||
331 | Now that the debuginfo is installed, we see that the BusyBox entries now display | 331 | Now that the debugging information is installed, we see that the BusyBox entries now display |
332 | their functions symbolically: | 332 | their functions symbolically: |
333 | 333 | ||
334 | .. image:: figures/perf-wget-busybox-debuginfo.png | 334 | .. image:: figures/perf-wget-busybox-debuginfo.png |
@@ -351,7 +351,7 @@ expanded all the nodes using the ``E`` key): | |||
351 | :align: center | 351 | :align: center |
352 | :width: 70% | 352 | :width: 70% |
353 | 353 | ||
354 | Finally, we can see that now that the BusyBox debuginfo is installed, | 354 | Finally, we can see that now that the BusyBox debugging information is installed, |
355 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry | 355 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry |
356 | mentioned previously is now resolved, and shows that the | 356 | mentioned previously is now resolved, and shows that the |
357 | ``sys_clock_gettime`` system call that was the source of 6.75% of the | 357 | ``sys_clock_gettime`` system call that was the source of 6.75% of the |
@@ -386,8 +386,8 @@ counter, something other than the default ``cycles``. | |||
386 | The tracing and profiling infrastructure in Linux has become unified in | 386 | The tracing and profiling infrastructure in Linux has become unified in |
387 | a way that allows us to use the same tool with a completely different | 387 | a way that allows us to use the same tool with a completely different |
388 | set of counters, not just the standard hardware counters that | 388 | set of counters, not just the standard hardware counters that |
389 | traditional tools have had to restrict themselves to (of course the | 389 | traditional tools have had to restrict themselves to (the |
390 | traditional tools can also make use of the expanded possibilities now | 390 | traditional tools can now actually make use of the expanded possibilities now |
391 | available to them, and in some cases have, as mentioned previously). | 391 | available to them, and in some cases have, as mentioned previously). |
392 | 392 | ||
393 | We can get a list of the available events that can be used to profile a | 393 | We can get a list of the available events that can be used to profile a |
@@ -527,14 +527,14 @@ workload via ``perf list``:: | |||
527 | .. admonition:: Tying it Together | 527 | .. admonition:: Tying it Together |
528 | 528 | ||
529 | These are exactly the same set of events defined by the trace event | 529 | These are exactly the same set of events defined by the trace event |
530 | subsystem and exposed by ftrace / tracecmd / kernelshark as files in | 530 | subsystem and exposed by ftrace / trace-cmd / KernelShark as files in |
531 | ``/sys/kernel/debug/tracing/events``, by SystemTap as | 531 | ``/sys/kernel/debug/tracing/events``, by SystemTap as |
532 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. | 532 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. |
533 | 533 | ||
534 | Only a subset of these would be of interest to us when looking at this | 534 | Only a subset of these would be of interest to us when looking at this |
535 | workload, so let's choose the most likely subsystems (identified by the | 535 | workload, so let's choose the most likely subsystems (identified by the |
536 | string before the colon in the Tracepoint events) and do a ``perf stat`` | 536 | string before the colon in the ``Tracepoint`` events) and do a ``perf stat`` |
537 | run using only those wildcarded subsystems:: | 537 | run using only those subsystem wildcards:: |
538 | 538 | ||
539 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 539 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
540 | Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2': | 540 | Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2': |
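The command line in the hunk above repeats `-e <subsystem>:*` once per tracepoint subsystem. When the subsystem list changes often, it may be convenient to generate that argument list. The following is a small sketch; the helper name `perf_stat_command` is an assumption and not part of perf, and `URL` stands in for the real download address.

```python
import shlex


def perf_stat_command(subsystems, workload):
    """Build a 'perf stat' argument list with one '-e <subsystem>:*' per subsystem."""
    cmd = ["perf", "stat"]
    for name in subsystems:
        cmd += ["-e", f"{name}:*"]
    # The workload (program plus its arguments) always goes last.
    return cmd + list(workload)


if __name__ == "__main__":
    subsystems = ["skb", "net", "napi", "sched", "workqueue", "irq", "syscalls"]
    # 'URL' is a placeholder, not a real address.
    print(shlex.join(perf_stat_command(subsystems, ["wget", "URL"])))
```

Returning a list rather than a string lets the result be passed directly to `subprocess.run()` without shell quoting concerns around the `*` wildcard.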
@@ -625,8 +625,8 @@ accounts for the function name actually displayed in the profile: | |||
625 | } | 625 | } |
626 | 626 | ||
627 | A couple of the more interesting | 627 | A couple of the more interesting |
628 | callchains are expanded and displayed above, basically some network | 628 | call chains are expanded and displayed above, basically some network |
629 | receive paths that presumably end up waking up wget (busybox) when | 629 | receive paths that presumably end up waking up wget (BusyBox) when |
630 | network data is ready. | 630 | network data is ready. |
631 | 631 | ||
632 | Note that because tracepoints are normally used for tracing, the default | 632 | Note that because tracepoints are normally used for tracing, the default |
@@ -646,7 +646,7 @@ high-level view of what's going on with a workload or across the system. | |||
646 | It is however by definition an approximation, as suggested by the most | 646 | It is however by definition an approximation, as suggested by the most |
647 | prominent word associated with it, ``sampling``. On the one hand, it | 647 | prominent word associated with it, ``sampling``. On the one hand, it |
648 | allows a representative picture of what's going on in the system to be | 648 | allows a representative picture of what's going on in the system to be |
649 | cheaply taken, but on the other hand, that cheapness limits its utility | 649 | cheaply taken, but alternatively, that cheapness limits its utility |
650 | when that data suggests a need to "dive down" more deeply to discover | 650 | when that data suggests a need to "dive down" more deeply to discover |
651 | what's really going on. In such cases, the only way to see what's really | 651 | what's really going on. In such cases, the only way to see what's really |
652 | going on is to be able to look at (or summarize more intelligently) the | 652 | going on is to be able to look at (or summarize more intelligently) the |
@@ -711,7 +711,7 @@ an infinite variety of ways. | |||
711 | Another way to look at it is that there are only so many ways that the | 711 | Another way to look at it is that there are only so many ways that the |
712 | 'primitive' counters can be used on their own to generate interesting | 712 | 'primitive' counters can be used on their own to generate interesting |
713 | output; to get anything more complicated than simple counts requires | 713 | output; to get anything more complicated than simple counts requires |
714 | some amount of additional logic, which is typically very specific to the | 714 | some amount of additional logic, which is typically specific to the |
715 | problem at hand. For example, if we wanted to make use of a 'counter' | 715 | problem at hand. For example, if we wanted to make use of a 'counter' |
716 | that maps to the value of the time difference between when a process was | 716 | that maps to the value of the time difference between when a process was |
717 | scheduled to run on a processor and the time it actually ran, we | 717 | scheduled to run on a processor and the time it actually ran, we |
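The derived "counter" mentioned in the hunk above, the delay between a process being made runnable and actually running, is a good example of logic that needs trace data rather than a plain count. The sketch below assumes the events have already been extracted into `(timestamp, event_name, pid)` tuples, where `sched_wakeup` carries the woken pid and `sched_switch` carries the pid being switched in. That input format and the function name are assumptions for illustration, not something perf produces directly.

```python
def wakeup_latencies(events):
    """Return {pid: [latency, ...]} from wakeup to the next switch-in.

    events: iterable of (timestamp, event_name, pid) tuples in time order.
    For 'sched_switch', pid is assumed to be the task being switched in.
    """
    pending = {}    # pid -> timestamp of the earliest unserved wakeup
    latencies = {}  # pid -> list of measured latencies
    for timestamp, name, pid in events:
        if name == "sched_wakeup":
            # Keep the first wakeup if several arrive before the task runs.
            pending.setdefault(pid, timestamp)
        elif name == "sched_switch" and pid in pending:
            latencies.setdefault(pid, []).append(timestamp - pending.pop(pid))
    return latencies
```

Timestamps can be in any consistent unit; the returned latencies are in the same unit.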
@@ -722,12 +722,12 @@ standard profiling tools how much data every process on the system reads | |||
722 | and writes, along with how many of those reads and writes fail | 722 | and writes, along with how many of those reads and writes fail |
723 | completely. If we have sufficient trace data, however, we could with the | 723 | completely. If we have sufficient trace data, however, we could with the |
724 | right tools easily extract and present that information, but we'd need | 724 | right tools easily extract and present that information, but we'd need |
725 | something other than pre-canned profiling tools to do that. | 725 | something other than ready-made profiling tools to do that. |
726 | 726 | ||
727 | Luckily, there is a general-purpose way to handle such needs, called | 727 | Luckily, there is a general-purpose way to handle such needs, called |
728 | "programming languages". Making programming languages easily available | 728 | "programming languages". Making programming languages easily available |
729 | to apply to such problems given the specific format of data is called a | 729 | to apply to such problems given the specific format of data is called a |
730 | 'programming language binding' for that data and language. Perf supports | 730 | 'programming language binding' for that data and language. perf supports |
731 | two programming language bindings, one for Python and one for Perl. | 731 | two programming language bindings, one for Python and one for Perl. |
732 | 732 | ||
733 | .. admonition:: Tying it Together | 733 | .. admonition:: Tying it Together |
@@ -737,7 +737,7 @@ two programming language bindings, one for Python and one for Perl. | |||
737 | DProbes dpcc compiler, an ANSI C compiler which targeted a low-level | 737 | DProbes dpcc compiler, an ANSI C compiler which targeted a low-level |
738 | assembly language running on an in-kernel interpreter on the target | 738 | assembly language running on an in-kernel interpreter on the target |
739 | system. This is exactly analogous to what Sun's DTrace did, except | 739 | system. This is exactly analogous to what Sun's DTrace did, except |
740 | that DTrace invented its own language for the purpose. Systemtap, | 740 | that DTrace invented its own language for the purpose. SystemTap, |
741 | heavily inspired by DTrace, also created its own one-off language, | 741 | heavily inspired by DTrace, also created its own one-off language, |
742 | but rather than running the product on an in-kernel interpreter, | 742 | but rather than running the product on an in-kernel interpreter, |
743 | created an elaborate compiler-based machinery to translate its | 743 | created an elaborate compiler-based machinery to translate its |
@@ -750,8 +750,8 @@ entry / exit events we recorded:: | |||
750 | root@crownbay:~# perf script -g python | 750 | root@crownbay:~# perf script -g python |
751 | generated Python script: perf-script.py | 751 | generated Python script: perf-script.py |
752 | 752 | ||
753 | The skeleton script simply creates a Python function for each event type in the | 753 | The skeleton script just creates a Python function for each event type in the |
754 | ``perf.data`` file. The body of each function simply prints the event name along | 754 | ``perf.data`` file. The body of each function just prints the event name along |
755 | with its parameters. For example: | 755 | with its parameters. For example: |
756 | 756 | ||
757 | .. code-block:: python | 757 | .. code-block:: python |
@@ -794,7 +794,7 @@ We can run that script directly to print all of the events contained in the | |||
794 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 | 794 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 |
795 | 795 | ||
796 | That in itself isn't very useful; after all, we can accomplish pretty much the | 796 | That in itself isn't very useful; after all, we can accomplish pretty much the |
797 | same thing by simply running ``perf script`` without arguments in the same | 797 | same thing by just running ``perf script`` without arguments in the same |
798 | directory as the ``perf.data`` file. | 798 | directory as the ``perf.data`` file. |
799 | 799 | ||
800 | We can however replace the print statements in the generated function | 800 | We can however replace the print statements in the generated function |
@@ -816,7 +816,7 @@ event. For example: | |||
816 | 816 | ||
817 | Each event handler function in the generated code | 817 | Each event handler function in the generated code |
818 | is modified to do this. For convenience, we define a common function | 818 | is modified to do this. For convenience, we define a common function |
819 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` simply tallies | 819 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` just tallies |
820 | a count for each event using the ``counts`` hash, which is a specialized | 820 | a count for each event using the ``counts`` hash, which is a specialized |
821 | hash function that does Perl-like autovivification, a capability that's | 821 | hash function that does Perl-like autovivification, a capability that's |
822 | extremely useful for kinds of multi-level aggregation commonly used in | 822 | extremely useful for kinds of multi-level aggregation commonly used in |
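The autovivifying ``counts`` hash described in this hunk can be sketched in plain Python. This is a minimal illustration, not the code generated by perf: ``autodict`` here is a local stand-in built on ``collections.defaultdict``, assumed to behave like the helper the generated script relies on, and the event names are only sample values.

```python
from collections import defaultdict


def autodict():
    """Return a dict whose missing keys spring into existence as nested autodicts."""
    return defaultdict(autodict)


# Stand-in for the ``counts`` hash used by the event handlers.
counts = autodict()


def inc_counts(event_name):
    """Tally one occurrence of event_name."""
    try:
        # First access autovivifies an empty nested dict, so "+= 1" raises
        # TypeError the first time an event is seen.
        counts[event_name] += 1
    except TypeError:
        counts[event_name] = 1


# Each event handler would call inc_counts() with its own event name.
for name in ("syscalls__sys_enter_read", "syscalls__sys_exit_read",
             "syscalls__sys_enter_read"):
    inc_counts(name)
```

The same structure extends to multi-level aggregation: an expression such as ``counts[comm][event_name]`` creates the intermediate level on first use, with no explicit initialization.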
@@ -876,7 +876,7 @@ System-Wide Tracing and Profiling | |||
876 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 876 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
877 | 877 | ||
878 | The examples so far have focused on tracing a particular program or | 878 | The examples so far have focused on tracing a particular program or |
879 | workload --- in other words, every profiling run has specified the program | 879 | workload --- that is, every profiling run has specified the program |
880 | to profile in the command-line e.g. ``perf record wget ...``. | 880 | to profile in the command-line e.g. ``perf record wget ...``. |
881 | 881 | ||
882 | It's also possible, and more interesting in many cases, to run a | 882 | It's also possible, and more interesting in many cases, to run a |
@@ -906,13 +906,13 @@ other processes running on the system as well: | |||
906 | :align: center | 906 | :align: center |
907 | :width: 70% | 907 | :width: 70% |
908 | 908 | ||
909 | In the snapshot above, we can see callchains that originate in libc, and | 909 | In the snapshot above, we can see call chains that originate in ``libc``, and |
910 | a callchain from Xorg that demonstrates that we're using a proprietary X | 910 | a call chain from ``Xorg`` that demonstrates that we're using a proprietary X |
911 | driver in userspace (notice the presence of ``PVR`` and some other | 911 | driver in user space (notice the presence of ``PVR`` and some other |
912 | unresolvable symbols in the expanded Xorg callchain). | 912 | unresolvable symbols in the expanded ``Xorg`` call chain). |
913 | 913 | ||
914 | Note also that we have both kernel and userspace entries in the above | 914 | Note also that we have both kernel and user space entries in the above |
915 | snapshot. We can also tell perf to focus on userspace by providing a | 915 | snapshot. We can also tell perf to focus on user space by providing a |
916 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we | 916 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we |
917 | record a profile:: | 917 | record a profile:: |
918 | 918 | ||
@@ -924,7 +924,7 @@ record a profile:: | |||
924 | :align: center | 924 | :align: center |
925 | :width: 70% | 925 | :width: 70% |
926 | 926 | ||
927 | Notice in the screenshot above, we see only userspace entries (``[.]``) | 927 | Notice in the screenshot above, we see only user space entries (``[.]``) |
928 | 928 | ||
929 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into | 929 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into |
930 | DSO`` menu item to show only entries associated with a specific DSO. In | 930 | DSO`` menu item to show only entries associated with a specific DSO. In |
@@ -960,7 +960,7 @@ We can look at the raw output using ``perf script`` with no arguments:: | |||
960 | Filtering | 960 | Filtering |
961 | ^^^^^^^^^ | 961 | ^^^^^^^^^ |
962 | 962 | ||
963 | Notice that there are a lot of events that don't really have anything to | 963 | Notice that there are many events that don't really have anything to |
964 | do with what we're interested in, namely events that schedule ``perf`` | 964 | do with what we're interested in, namely events that schedule ``perf`` |
965 | itself in and out or that wake perf up. We can get rid of those by using | 965 | itself in and out or that wake perf up. We can get rid of those by using |
966 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a | 966 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a |
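As a rough sketch of how ``-e`` and ``--filter`` pair up, the helper below assembles a ``perf record`` command line in which each filter directly follows the event it applies to. The function name ``perf_record_command`` is hypothetical, and the filter expressions are assumptions chosen to exclude ``perf`` itself; the field names must match those listed in each tracepoint's ``format`` file on your system.

```python
import shlex


def perf_record_command(event_filters, system_wide=True):
    """Build a ``perf record`` argument list with an optional filter per event.

    event_filters is a list of (event, filter_expression) pairs; use None as
    the expression to record an event unfiltered.
    """
    args = ["perf", "record"]
    if system_wide:
        args.append("-a")
    for event, expr in event_filters:
        args += ["-e", event]
        if expr is not None:
            # --filter applies to the event selector that precedes it.
            args += ["--filter", expr]
    return args


cmd = perf_record_command([
    ("sched:sched_switch", "next_comm != perf && prev_comm != perf"),
    ("sched:sched_wakeup", "comm != perf"),
])

# shlex.join() quotes the filter expressions so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Building the argument list first keeps each filter attached to its own event and avoids shell quoting mistakes in the filter expressions.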
@@ -999,7 +999,7 @@ purpose of demonstrating how to use filters, it's close enough. | |||
999 | .. admonition:: Tying it Together | 999 | .. admonition:: Tying it Together |
1000 | 1000 | ||
1001 | These are exactly the same set of event filters defined by the trace | 1001 | These are exactly the same set of event filters defined by the trace |
1002 | event subsystem. See the ftrace / tracecmd / kernelshark section for more | 1002 | event subsystem. See the ftrace / trace-cmd / KernelShark section for more |
1003 | discussion about these event filters. | 1003 | discussion about these event filters. |
1004 | 1004 | ||
1005 | .. admonition:: Tying it Together | 1005 | .. admonition:: Tying it Together |
@@ -1009,14 +1009,14 @@ purpose of demonstrating how to use filters, it's close enough. | |||
1009 | indispensable part of the perf design as it relates to tracing. | 1009 | indispensable part of the perf design as it relates to tracing. |
1010 | kernel-based event filters provide a mechanism to precisely throttle | 1010 | kernel-based event filters provide a mechanism to precisely throttle |
1011 | the event stream that appears in user space, where it makes sense to | 1011 | the event stream that appears in user space, where it makes sense to |
1012 | provide bindings to real programming languages for postprocessing the | 1012 | provide bindings to real programming languages for post-processing the |
1013 | event stream. This architecture allows for the intelligent and | 1013 | event stream. This architecture allows for the intelligent and |
1014 | flexible partitioning of processing between the kernel and user | 1014 | flexible partitioning of processing between the kernel and user |
1015 | space. Contrast this with other tools such as SystemTap, which does | 1015 | space. Contrast this with other tools such as SystemTap, which does |
1016 | all of its processing in the kernel and as such requires a special | 1016 | all of its processing in the kernel and as such requires a special |
1017 | project-defined language in order to accommodate that design, or | 1017 | project-defined language in order to accommodate that design, or |
1018 | LTTng, where everything is sent to userspace and as such requires a | 1018 | LTTng, where everything is sent to user space and as such requires a |
1019 | super-efficient kernel-to-userspace transport mechanism in order to | 1019 | super-efficient kernel-to-user space transport mechanism in order to |
1020 | function properly. While perf certainly can benefit from for instance | 1020 | function properly. While perf certainly can benefit from for instance |
1021 | advances in the design of the transport, it doesn't fundamentally | 1021 | advances in the design of the transport, it doesn't fundamentally |
1022 | depend on them. Basically, if you find that your perf tracing | 1022 | depend on them. Basically, if you find that your perf tracing |
@@ -1028,7 +1028,7 @@ Using Dynamic Tracepoints | |||
1028 | 1028 | ||
1029 | perf isn't restricted to the fixed set of static tracepoints listed by | 1029 | perf isn't restricted to the fixed set of static tracepoints listed by |
1030 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere | 1030 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere |
1031 | in the kernel. For instance, suppose we want to define our own | 1031 | in the kernel. For example, suppose we want to define our own |
1032 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf | 1032 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf |
1033 | subcommand:: | 1033 | subcommand:: |
1034 | 1034 | ||
@@ -1083,7 +1083,7 @@ up after 30 seconds):: | |||
1083 | [ perf record: Woken up 1 times to write data ] | 1083 | [ perf record: Woken up 1 times to write data ] |
1084 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] | 1084 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] |
1085 | 1085 | ||
1086 | Using ``perf script`` we can see each do_fork event that fired:: | 1086 | Using ``perf script`` we can see each ``do_fork`` event that fired:: |
1087 | 1087 | ||
1088 | root@crownbay:~# perf script | 1088 | root@crownbay:~# perf script |
1089 | 1089 | ||
@@ -1125,7 +1125,7 @@ Using ``perf script`` we can see each do_fork event that fired:: | |||
1125 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) | 1125 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) |
1126 | 1126 | ||
1127 | And using ``perf report`` on the same file, we can see the | 1127 | And using ``perf report`` on the same file, we can see the |
1128 | callgraphs from starting a few programs during those 30 seconds: | 1128 | call graphs from starting a few programs during those 30 seconds: |
1129 | 1129 | ||
1130 | .. image:: figures/perf-probe-do_fork-profile.png | 1130 | .. image:: figures/perf-probe-do_fork-profile.png |
1131 | :align: center | 1131 | :align: center |
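The ``perf script`` text output shown in this hunk is also easy to post-process. The sketch below counts ``do_fork`` events per command name. It assumes the line layout seen above (command, PID, CPU in brackets, timestamp, event name). Only the ``gaku`` line is taken from the listing; the other sample lines are invented for illustration.

```python
import re
from collections import Counter

# Matches lines such as: "gaku 1312 [000] 34237.202388: do_fork: (c1028460)"
LINE_RE = re.compile(
    r"^\s*(?P<comm>.+?)\s+(?P<pid>\d+)\s+\[(?P<cpu>\d+)\]\s+"
    r"(?P<time>\d+\.\d+):\s+(?P<event>\w+):"
)


def count_events_by_comm(lines, event="do_fork"):
    """Return a Counter mapping command name to the number of matching events."""
    tally = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("event") == event:
            tally[match.group("comm")] += 1
    return tally


sample = [
    "    gaku  1312 [000] 34237.202388: do_fork: (c1028460)",
    # The remaining lines are made-up examples in the same layout.
    "    bash  1300 [001] 34240.000001: do_fork: (c1028460)",
    "    bash  1300 [001] 34241.500000: do_fork: (c1028460)",
    "",
]

forks = count_events_by_comm(sample)
```

In practice the lines would come from the output of ``perf script`` run in the directory that holds ``perf.data``, for example captured through ``subprocess.run``.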
@@ -1140,11 +1140,11 @@ callgraphs from starting a few programs during those 30 seconds: | |||
1140 | 1140 | ||
1141 | .. admonition:: Tying it Together | 1141 | .. admonition:: Tying it Together |
1142 | 1142 | ||
1143 | Dynamic tracepoints are implemented under the covers by kprobes and | 1143 | Dynamic tracepoints are implemented under the covers by Kprobes and |
1144 | uprobes. kprobes and uprobes are also used by and in fact are the | 1144 | Uprobes. Kprobes and Uprobes are also used by and in fact are the |
1145 | main focus of SystemTap. | 1145 | main focus of SystemTap. |
1146 | 1146 | ||
1147 | Perf Documentation | 1147 | perf Documentation |
1148 | ------------------ | 1148 | ------------------ |
1149 | 1149 | ||
1150 | Online versions of the manual pages for the commands discussed in this | 1150 | Online versions of the manual pages for the commands discussed in this |
@@ -1168,7 +1168,7 @@ section can be found here: | |||
1168 | 1168 | ||
1169 | - The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__. | 1169 | - The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__. |
1170 | 1170 | ||
1171 | Normally, you should be able to invoke the manual pages via perf itself | 1171 | Normally, you should be able to open the manual pages via perf itself |
1172 | e.g. ``perf help`` or ``perf help record``. | 1172 | e.g. ``perf help`` or ``perf help record``. |
1173 | 1173 | ||
1174 | To have the perf manual pages installed on your target, modify your | 1174 | To have the perf manual pages installed on your target, modify your |
@@ -1183,14 +1183,14 @@ of examples, can also be found in the ``perf`` directory of the kernel tree:: | |||
1183 | tools/perf/Documentation | 1183 | tools/perf/Documentation |
1184 | 1184 | ||
1185 | There's also a nice perf tutorial on the perf | 1185 | There's also a nice perf tutorial on the perf |
1186 | wiki that goes into more detail than we do here in certain areas: `Perf | 1186 | wiki that goes into more detail than we do here in certain areas: `perf |
1187 | Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ | 1187 | Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ |
1188 | 1188 | ||
1189 | ftrace | 1189 | ftrace |
1190 | ====== | 1190 | ====== |
1191 | 1191 | ||
1192 | "ftrace" literally refers to the "ftrace function tracer" but in reality | 1192 | "ftrace" literally refers to the "ftrace function tracer" but in reality |
1193 | this encompasses a number of related tracers along with the | 1193 | this encompasses several related tracers along with the |
1194 | infrastructure that they all make use of. | 1194 | infrastructure that they all make use of. |
1195 | 1195 | ||
1196 | ftrace Setup | 1196 | ftrace Setup |
@@ -1199,11 +1199,11 @@ ftrace Setup | |||
1199 | For this section, we'll assume you've already performed the basic setup | 1199 | For this section, we'll assume you've already performed the basic setup |
1200 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 1200 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
1201 | 1201 | ||
1202 | ftrace, trace-cmd, and kernelshark run on the target system, and are | 1202 | ftrace, trace-cmd, and KernelShark run on the target system, and are |
1203 | ready to go out-of-the-box --- no additional setup is necessary. For the | 1203 | ready to go out-of-the-box --- no additional setup is necessary. For the |
1204 | rest of this section we assume you've ssh'ed to the host and will be | 1204 | rest of this section we assume you're connected to the host through SSH and |
1205 | running ftrace on the target. kernelshark is a GUI application and if | 1205 | will be running ftrace on the target. KernelShark is a GUI application and if |
1206 | you use the ``-X`` option to ssh you can have the kernelshark GUI run on | 1206 | you use the ``-X`` option to ``ssh`` you can have the KernelShark GUI run on |
1207 | the target but display remotely on the host if you want. | 1207 | the target but display remotely on the host if you want. |
1208 | 1208 | ||
1209 | Basic ftrace usage | 1209 | Basic ftrace usage |
@@ -1211,8 +1211,8 @@ Basic ftrace usage | |||
1211 | 1211 | ||
1212 | "ftrace" essentially refers to everything included in the ``/tracing`` | 1212 | "ftrace" essentially refers to everything included in the ``/tracing`` |
1213 | directory of the mounted debugfs filesystem (Yocto follows the standard | 1213 | directory of the mounted debugfs filesystem (Yocto follows the standard |
1214 | convention and mounts it at ``/sys/kernel/debug``). Here's a listing of all | 1214 | convention and mounts it at ``/sys/kernel/debug``). All the files found in |
1215 | the files found in ``/sys/kernel/debug/tracing`` on a Yocto system:: | 1215 | ``/sys/kernel/debug/tracing`` on a Yocto system are:: |
1216 | 1216 | ||
1217 | root@sugarbay:/sys/kernel/debug/tracing# ls | 1217 | root@sugarbay:/sys/kernel/debug/tracing# ls |
1218 | README kprobe_events trace | 1218 | README kprobe_events trace |
@@ -1237,7 +1237,7 @@ the ftrace documentation. | |||
1237 | 1237 | ||
1238 | We'll start by looking at some of the available built-in tracers. | 1238 | We'll start by looking at some of the available built-in tracers. |
1239 | 1239 | ||
1240 | cat'ing the ``available_tracers`` file lists the set of available tracers:: | 1240 | The ``available_tracers`` file lists the set of available tracers:: |
1241 | 1241 | ||
1242 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers | 1242 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers |
1243 | blk function_graph function nop | 1243 | blk function_graph function nop |
@@ -1247,11 +1247,11 @@ The ``current_tracer`` file contains the tracer currently in effect:: | |||
1247 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1247 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1248 | nop | 1248 | nop |
1249 | 1249 | ||
1250 | The above listing of current_tracer shows that the | 1250 | The above listing of ``current_tracer`` shows that the |
1251 | ``nop`` tracer is in effect, which is just another way of saying that | 1251 | ``nop`` tracer is in effect, which is just another way of saying that |
1252 | there's actually no tracer currently in effect. | 1252 | there's actually no tracer currently in effect. |
1253 | 1253 | ||
1254 | echo'ing one of the available_tracers into ``current_tracer`` makes the | 1254 | Writing one of the available tracers into ``current_tracer`` makes the |
1255 | specified tracer the current tracer:: | 1255 | specified tracer the current tracer:: |
1256 | 1256 | ||
1257 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer | 1257 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer |
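The same tracer selection can be scripted. The following sketch reads ``available_tracers`` and refuses to write an unknown name into ``current_tracer``. The function names are hypothetical, and the ``tracing_dir`` parameter is an assumption that lets the code be tried against an ordinary directory; on a real target it needs root privileges and a mounted debugfs.

```python
from pathlib import Path

# Location used in this manual; pass another directory to experiment safely.
TRACING_DIR = Path("/sys/kernel/debug/tracing")


def available_tracers(tracing_dir=TRACING_DIR):
    """Return the tracer names listed in the available_tracers file."""
    return (Path(tracing_dir) / "available_tracers").read_text().split()


def current_tracer(tracing_dir=TRACING_DIR):
    """Return the name of the tracer currently in effect."""
    return (Path(tracing_dir) / "current_tracer").read_text().strip()


def set_current_tracer(name, tracing_dir=TRACING_DIR):
    """Select a tracer, as "echo name > current_tracer" does in the shell."""
    choices = available_tracers(tracing_dir)
    if name not in choices:
        raise ValueError(f"unknown tracer {name!r}; available: {choices}")
    (Path(tracing_dir) / "current_tracer").write_text(name + "\n")
```

Calling ``set_current_tracer("function")`` mirrors the ``echo`` command above, and ``set_current_tracer("nop")`` turns tracing back off.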
@@ -1307,7 +1307,7 @@ tracer:: | |||
1307 | . | 1307 | . |
1308 | 1308 | ||
1309 | Each line in the trace above shows what was happening in the kernel on a given | 1309 | Each line in the trace above shows what was happening in the kernel on a given |
1310 | cpu, to the level of detail of function calls. Each entry shows the function | 1310 | CPU, to the level of detail of function calls. Each entry shows the function |
1311 | called, followed by its caller (after the arrow). | 1311 | called, followed by its caller (after the arrow). |
1312 | 1312 | ||
1313 | The function tracer gives you an extremely detailed idea of what the | 1313 | The function tracer gives you an extremely detailed idea of what the |
@@ -1321,7 +1321,7 @@ great way to learn about how the kernel code works in a dynamic sense. | |||
1321 | 1321 | ||
1322 | It is a little more difficult to follow the call chains than it needs to | 1322 | It is a little more difficult to follow the call chains than it needs to |
1323 | be --- luckily there's a variant of the function tracer that displays the | 1323 | be --- luckily there's a variant of the function tracer that displays the |
1324 | callchains explicitly, called the ``function_graph`` tracer:: | 1324 | call chains explicitly, called the ``function_graph`` tracer:: |
1325 | 1325 | ||
1326 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer | 1326 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer |
1327 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1327 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1440,7 +1440,7 @@ As you can see, the ``function_graph`` display is much easier | |||
1440 | to follow. Also note that in addition to the function calls and | 1440 | to follow. Also note that in addition to the function calls and |
1441 | associated braces, other events such as scheduler events are displayed | 1441 | associated braces, other events such as scheduler events are displayed |
1442 | in context. In fact, you can freely include any tracepoint available in | 1442 | in context. In fact, you can freely include any tracepoint available in |
1443 | the trace events subsystem described in the next section by simply | 1443 | the trace events subsystem described in the next section by just |
1444 | enabling those events, and they'll appear in context in the function | 1444 | enabling those events, and they'll appear in context in the function |
1445 | graph display. Quite a powerful tool for understanding kernel dynamics. | 1445 | graph display. Quite a powerful tool for understanding kernel dynamics. |
1446 | 1446 | ||
@@ -1543,7 +1543,7 @@ The ``format`` file for the | |||
1543 | tracepoint describes the event in memory, which is used by the various | 1543 | tracepoint describes the event in memory, which is used by the various |
1544 | tracing tools that now make use of these tracepoints to parse the event | 1544 | tracing tools that now make use of these tracepoints to parse the event |
1545 | and make sense of it, along with a ``print fmt`` field that allows tools | 1545 | and make sense of it, along with a ``print fmt`` field that allows tools |
1546 | like ftrace to display the event as text. Here's what the format of the | 1546 | like ftrace to display the event as text. The format of the |
1547 | ``kmalloc`` event looks like:: | 1547 | ``kmalloc`` event looks like:: |
1548 | 1548 | ||
1549 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format | 1549 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format |
@@ -1596,7 +1596,7 @@ events in the output buffer:: | |||
1596 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on | 1596 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on |
1597 | 1597 | ||
1598 | Now, if we look at the ``trace`` file, we see nothing | 1598 | Now, if we look at the ``trace`` file, we see nothing |
1599 | but the kmalloc events we just turned on:: | 1599 | but the ``kmalloc`` events we just turned on:: |
1600 | 1600 | ||
1601 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1601 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
1602 | # tracer: nop | 1602 | # tracer: nop |
@@ -1651,8 +1651,8 @@ using the ``enable`` file in the subsystem directory) and get an | |||
1651 | arbitrarily fine-grained idea of what's going on in the system by | 1651 | arbitrarily fine-grained idea of what's going on in the system by |
1652 | enabling as many of the appropriate tracepoints as applicable. | 1652 | enabling as many of the appropriate tracepoints as applicable. |
1653 | 1653 | ||
1654 | A number of the tools described in this HOWTO do just that, including | 1654 | Several tools described in this How-to do just that, including |
1655 | ``trace-cmd`` and kernelshark in the next section. | 1655 | ``trace-cmd`` and KernelShark in the next section. |
1656 | 1656 | ||
1657 | .. admonition:: Tying it Together | 1657 | .. admonition:: Tying it Together |
1658 | 1658 | ||
@@ -1668,7 +1668,7 @@ A number of the tools described in this HOWTO do just that, including | |||
1668 | ``/sys/kernel/debug/tracing`` will be removed and replaced with | 1668 | ``/sys/kernel/debug/tracing`` will be removed and replaced with |
1669 | equivalent tracers based on the "trace events" subsystem. | 1669 | equivalent tracers based on the "trace events" subsystem. |
1670 | 1670 | ||
1671 | trace-cmd / kernelshark | 1671 | trace-cmd / KernelShark |
1672 | ----------------------- | 1672 | ----------------------- |
1673 | 1673 | ||
1674 | trace-cmd is essentially an extensive command-line "wrapper" interface | 1674 | trace-cmd is essentially an extensive command-line "wrapper" interface |
@@ -1677,24 +1677,23 @@ that hides the details of all the individual files in | |||
1677 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to | 1677 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to |
1678 | collect traces and avoid having to deal with those details directly. | 1678 | collect traces and avoid having to deal with those details directly. |
1679 | 1679 | ||
1680 | As yet another layer on top of that, kernelshark provides a GUI that | 1680 | As yet another layer on top of that, KernelShark provides a GUI that |
1681 | allows users to start and stop traces and specify sets of events using | 1681 | allows users to start and stop traces and specify sets of events using |
1682 | an intuitive interface, and view the output as both trace events and as | 1682 | an intuitive interface, and view the output as both trace events and as |
1683 | a per-CPU graphical display. It directly uses trace-cmd as the | 1683 | a per-CPU graphical display. It directly uses trace-cmd as the |
1684 | plumbing that accomplishes all that underneath the covers (and actually | 1684 | plumbing that accomplishes all that underneath the covers (and actually |
1685 | displays the trace-cmd command it uses, as we'll see). | 1685 | displays the trace-cmd command it uses, as we'll see). |
1686 | 1686 | ||
1687 | To start a trace using kernelshark, first start kernelshark:: | 1687 | To start a trace using KernelShark, first start this tool:: |
1688 | 1688 | ||
1689 | root@sugarbay:~# kernelshark | 1689 | root@sugarbay:~# kernelshark |
1690 | 1690 | ||
1691 | Then bring up the ``Capture`` dialog by | 1691 | Then open up the ``Capture`` dialog by choosing from the KernelShark menu:: |
1692 | choosing from the kernelshark menu:: | ||
1693 | 1692 | ||
1694 | Capture | Record | 1693 | Capture | Record |
1695 | 1694 | ||
1696 | That will display the following dialog, which allows you to choose one or more | 1695 | That will display the following dialog, which allows you to choose one or more |
1697 | events (or even one or more complete subsystems) to trace: | 1696 | events (or even entire subsystems) to trace: |
1698 | 1697 | ||
1699 | .. image:: figures/kernelshark-choose-events.png | 1698 | .. image:: figures/kernelshark-choose-events.png |
1700 | :align: center | 1699 | :align: center |
@@ -1702,7 +1701,7 @@ events (or even one or more complete subsystems) to trace: | |||
1702 | 1701 | ||
1703 | Note that these are exactly the same sets of events described in the | 1702 | Note that these are exactly the same sets of events described in the |
1704 | previous trace events subsystem section, and in fact is where trace-cmd | 1703 | previous trace events subsystem section, and in fact is where trace-cmd |
1705 | gets them for kernelshark. | 1704 | gets them for KernelShark. |
1706 | 1705 | ||
1707 | In the above screenshot, we've decided to explore the graphics subsystem | 1706 | In the above screenshot, we've decided to explore the graphics subsystem |
1708 | a bit and so have chosen to trace all the tracepoints contained within | 1707 | a bit and so have chosen to trace all the tracepoints contained within |
@@ -1716,12 +1715,12 @@ will turn into the 'Stop' button after the trace has started): | |||
1716 | :align: center | 1715 | :align: center |
1717 | :width: 70% | 1716 | :width: 70% |
1718 | 1717 | ||
1719 | Notice that the right-hand pane shows the exact trace-cmd command-line | 1718 | Notice that the right pane shows the exact trace-cmd command-line |
1720 | that's used to run the trace, along with the results of the trace-cmd | 1719 | that's used to run the trace, along with the results of the trace-cmd |
1721 | run. | 1720 | run. |
1722 | 1721 | ||
1723 | Once the ``Stop`` button is pressed, the graphical view magically fills up | 1722 | Once the ``Stop`` button is pressed, the graphical view magically fills up |
1724 | with a colorful per-cpu display of the trace data, along with the | 1723 | with a colorful per-CPU display of the trace data, along with the |
1725 | detailed event listing below that: | 1724 | detailed event listing below that: |
1726 | 1725 | ||
1727 | .. image:: figures/kernelshark-i915-display.png | 1726 | .. image:: figures/kernelshark-i915-display.png |
@@ -1736,7 +1735,7 @@ events``: | |||
1736 | :width: 70% | 1735 | :width: 70% |
1737 | 1736 | ||
1738 | The tool is pretty self-explanatory, but for more detailed information | 1737 | The tool is pretty self-explanatory, but for more detailed information |
1739 | on navigating through the data, see the `kernelshark | 1738 | on navigating through the data, see the `KernelShark |
1740 | website <https://kernelshark.org/Documentation.html>`__. | 1739 | website <https://kernelshark.org/Documentation.html>`__. |
1741 | 1740 | ||
1742 | ftrace Documentation | 1741 | ftrace Documentation |
@@ -1752,41 +1751,41 @@ Documentation directory:: | |||
1752 | 1751 | ||
1753 | Documentation/trace/events.txt | 1752 | Documentation/trace/events.txt |
1754 | 1753 | ||
1755 | There is a nice series of articles on using ftrace and trace-cmd at LWN: | 1754 | A nice series of articles on using ftrace and trace-cmd is available at LWN: |
1756 | 1755 | ||
1757 | - `Debugging the kernel using Ftrace - part | 1756 | - `Debugging the kernel using ftrace - part |
1758 | 1 <https://lwn.net/Articles/365835/>`__ | 1757 | 1 <https://lwn.net/Articles/365835/>`__ |
1759 | 1758 | ||
1760 | - `Debugging the kernel using Ftrace - part | 1759 | - `Debugging the kernel using ftrace - part |
1761 | 2 <https://lwn.net/Articles/366796/>`__ | 1760 | 2 <https://lwn.net/Articles/366796/>`__ |
1762 | 1761 | ||
1763 | - `Secrets of the Ftrace function | 1762 | - `Secrets of the ftrace function |
1764 | tracer <https://lwn.net/Articles/370423/>`__ | 1763 | tracer <https://lwn.net/Articles/370423/>`__ |
1765 | 1764 | ||
1766 | - `trace-cmd: A front-end for | 1765 | - `trace-cmd: A front-end for |
1767 | Ftrace <https://lwn.net/Articles/410200/>`__ | 1766 | ftrace <https://lwn.net/Articles/410200/>`__ |
1768 | 1767 | ||
1769 | See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__ | 1768 | See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__ |
1770 | for further usage details. | 1769 | for further usage details. |
1771 | 1770 | ||
1772 | An amusing yet useful README (a tracing mini-HOWTO) can be found in | 1771 | An amusing yet useful README (a tracing mini-How-to) can be found in |
1773 | ``/sys/kernel/debug/tracing/README``. | 1772 | ``/sys/kernel/debug/tracing/README``. |
1774 | 1773 | ||
1775 | systemtap | 1774 | SystemTap |
1776 | ========= | 1775 | ========= |
1777 | 1776 | ||
1778 | SystemTap is a system-wide script-based tracing and profiling tool. | 1777 | SystemTap is a system-wide script-based tracing and profiling tool. |
1779 | 1778 | ||
1780 | SystemTap scripts are C-like programs that are executed in the kernel to | 1779 | SystemTap scripts are C-like programs that are executed in the kernel to |
1781 | gather / print / aggregate data extracted from the context they end up being | 1780 | gather / print / aggregate data extracted from the context they end up being |
1782 | invoked under. | 1781 | called under. |
1783 | 1782 | ||
1784 | For example, this probe from the `SystemTap | 1783 | For example, this probe from the `SystemTap |
1785 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a | 1784 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ just prints a |
1786 | line every time any process on the system runs ``open()`` on a file. For each line, | 1785 | line every time any process on the system runs ``open()`` on a file. For each line, |
1787 | it prints the executable name of the program that opened the file, along | 1786 | it prints the executable name of the program that opened the file, along |
1788 | with its PID, and the name of the file it opened (or tried to open), | 1787 | with its PID, and the name of the file it opened (or tried to open), which it |
1789 | which it extracts from the open syscall's argstr. | 1788 | extracts from the argument string (``argstr``) of the ``open`` system call. |
1790 | 1789 | ||
1791 | .. code-block:: none | 1790 | .. code-block:: none |
1792 | 1791 | ||
@@ -1801,13 +1800,13 @@ which it extracts from the open syscall's argstr. | |||
1801 | } | 1800 | } |
1802 | 1801 | ||
1803 | Normally, to execute this | 1802 | Normally, to execute this |
1804 | probe, you'd simply install systemtap on the system you want to probe, | 1803 | probe, you'd just install SystemTap on the system you want to probe, |
1805 | and directly run the probe on that system e.g. assuming the name of the | 1804 | and directly run the probe on that system e.g. assuming the name of the |
1806 | file containing the above text is trace_open.stp:: | 1805 | file containing the above text is ``trace_open.stp``:: |
1807 | 1806 | ||
1808 | # stap trace_open.stp | 1807 | # stap trace_open.stp |
1809 | 1808 | ||
1810 | What systemtap does under the covers to run this probe is 1) parse and | 1809 | What SystemTap does under the covers to run this probe is 1) parse and |
1811 | convert the probe to an equivalent "C" form, 2) compile the "C" form | 1810 | convert the probe to an equivalent "C" form, 2) compile the "C" form |
1812 | into a kernel module, 3) insert the module into the kernel, which arms | 1811 | into a kernel module, 3) insert the module into the kernel, which arms |
1813 | it, and 4) collect the data generated by the probe and display it to the | 1812 | it, and 4) collect the data generated by the probe and display it to the |
@@ -1820,25 +1819,25 @@ kernel build system unfortunately isn't typically part of the image | |||
1820 | running on the target. It is normally available on the "host" system | 1819 | running on the target. It is normally available on the "host" system |
1821 | that produced the target image however; in such cases, steps 1 and 2 are | 1820 | that produced the target image however; in such cases, steps 1 and 2 are |
1822 | executed on the host system, and steps 3 and 4 are executed on the | 1821 | executed on the host system, and steps 3 and 4 are executed on the |
1823 | target system, using only the systemtap "runtime". | 1822 | target system, using only the SystemTap "runtime". |
1824 | 1823 | ||
1825 | The systemtap support in Yocto assumes that only steps 3 and 4 are run | 1824 | The SystemTap support in Yocto assumes that only steps 3 and 4 are run |
1826 | on the target; it is possible to do everything on the target, but this | 1825 | on the target; it is possible to do everything on the target, but this |
1827 | section assumes only the typical embedded use-case. | 1826 | section assumes only the typical embedded use-case. |
1828 | 1827 | ||
1829 | So basically what you need to do in order to run a systemtap script on | 1828 | Therefore, what you need to do in order to run a SystemTap script on |
1830 | the target is to 1) on the host system, compile the probe into a kernel | 1829 | the target is to 1) on the host system, compile the probe into a kernel |
1831 | module that makes sense to the target, 2) copy the module onto the | 1830 | module that makes sense to the target, 2) copy the module onto the |
1832 | target system and 3) insert the module into the target kernel, which | 1831 | target system and 3) insert the module into the target kernel, which |
1833 | arms it, and 4) collect the data generated by the probe and display it | 1832 | arms it, and 4) collect the data generated by the probe and display it |
1834 | to the user. | 1833 | to the user. |
1835 | 1834 | ||
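The four cross-probing steps above can be sketched as a small Python helper that only assembles the commands and does not run them. This is an illustration under stated assumptions, not how ``crosstap`` is implemented: the function name, the module name ``probe_mod``, the ``/tmp`` destination, and the kernel build path and architecture used in the usage lines are all hypothetical. It relies on the ``stap`` options ``-r`` (kernel build tree), ``-a`` (target architecture), ``-p4`` (stop once the module is built) and ``-m`` (module name), and on ``staprun`` to load a prebuilt module. A real cross build also needs the cross-toolchain environment, which ``crosstap`` sets up for you.

```python
import shlex


def cross_probe_commands(script, target, kernel_build_dir, arch, module="probe_mod"):
    """Return the host-side commands for the four steps, without running them.

    All argument values are supplied by the caller; nothing here is taken
    from crosstap itself (hypothetical helper for illustration only).
    """
    ko = f"{module}.ko"
    return [
        # Step 1: on the host, compile the probe into a kernel module and
        # stop after pass 4, so that nothing is loaded on the host.
        ["stap", "-r", kernel_build_dir, "-a", arch, "-p4", "-m", module, script],
        # Step 2: copy the module onto the target system.
        ["scp", ko, f"{target}:/tmp/{ko}"],
        # Steps 3 and 4: staprun inserts the module into the target kernel
        # and relays the probe output back over the SSH connection.
        ["ssh", target, "staprun", f"/tmp/{ko}"],
    ]


if __name__ == "__main__":
    # Hypothetical kernel build path and architecture.
    for cmd in cross_probe_commands(
        "trace_open.stp", "root@192.168.7.2", "/path/to/kernel/build", "arm"
    ):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without SystemTap installed.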
1836 | systemtap Setup | 1835 | SystemTap Setup |
1837 | --------------- | 1836 | --------------- |
1838 | 1837 | ||
1839 | Those are a lot of steps and a lot of details, but fortunately Yocto | 1838 | Those are many steps and details, but fortunately Yocto |
1840 | includes a script called ``crosstap`` that will take care of those | 1839 | includes a script called ``crosstap`` that will take care of those |
1841 | details, allowing you to simply execute a systemtap script on the remote | 1840 | details, allowing you to just execute a SystemTap script on the remote |
1842 | target, with arguments if necessary. | 1841 | target, with arguments if necessary. |
1843 | 1842 | ||
1844 | In order to do this from a remote host, however, you need to have access | 1843 | In order to do this from a remote host, however, you need to have access |
@@ -1851,7 +1850,7 @@ having done a build:: | |||
1851 | Error: No target kernel build found. | 1850 | Error: No target kernel build found. |
1852 | Did you forget to create a local build of your image? | 1851 | Did you forget to create a local build of your image? |
1853 | 1852 | ||
1854 | 'crosstap' requires a local sdk build of the target system | 1853 | 'crosstap' requires a local SDK build of the target system |
1855 | (or a build that includes 'tools-profile') in order to build | 1854 | (or a build that includes 'tools-profile') in order to build |
1856 | kernel modules that can probe the target system. | 1855 | kernel modules that can probe the target system. |
1857 | 1856 | ||
@@ -1867,11 +1866,11 @@ Practically speaking, that means you need to do the following: | |||
1867 | the BSP README and/or the widely available basic documentation | 1866 | the BSP README and/or the widely available basic documentation |
1868 | that discusses how to build images). | 1867 | that discusses how to build images). |
1869 | 1868 | ||
1870 | - Build an -sdk version of the image e.g.:: | 1869 | - Build an ``-sdk`` version of the image e.g.:: |
1871 | 1870 | ||
1872 | $ bitbake core-image-sato-sdk | 1871 | $ bitbake core-image-sato-sdk |
1873 | 1872 | ||
1874 | - Or build a non-sdk image but include the profiling tools | 1873 | - Or build a non-SDK image but include the profiling tools |
1875 | (edit ``local.conf`` and add ``tools-profile`` to the end of the | 1874 | (edit ``local.conf`` and add ``tools-profile`` to the end of the |
1876 | :term:`EXTRA_IMAGE_FEATURES` variable):: | 1875 | :term:`EXTRA_IMAGE_FEATURES` variable):: |
1877 | 1876 | ||
@@ -1887,15 +1886,14 @@ Practically speaking, that means you need to do the following: | |||
1887 | 1886 | ||
1888 | .. note:: | 1887 | .. note:: |
1889 | 1888 | ||
1890 | SystemTap, which uses ``crosstap``, assumes you can establish an ssh | 1889 | SystemTap, which uses ``crosstap``, assumes you can establish an SSH |
1891 | connection to the remote target. Please refer to the crosstap wiki | 1890 | connection to the remote target. Please refer to the crosstap wiki |
1892 | page for details on verifying ssh connections. Also, the ability to ssh | 1891 | page for details on verifying SSH connections. Also, the ability to SSH |
1893 | into the target system is not enabled by default in ``*-minimal`` images. | 1892 | into the target system is not enabled by default in ``*-minimal`` images. |
1894 | 1893 | ||
1895 | So essentially what you need to | 1894 | Therefore, what you need to do is build an SDK image or image with |
1896 | do is build an SDK image or image with ``tools-profile`` as detailed in | 1895 | ``tools-profile`` as detailed in the ":ref:`profile-manual/intro:General Setup`" |
1897 | the ":ref:`profile-manual/intro:General Setup`" section of this | 1896 | section of this manual, and boot the resulting target image. |
1898 | manual, and boot the resulting target image. | ||
1899 | 1897 | ||
1900 | .. note:: | 1898 | .. note:: |
1901 | 1899 | ||
@@ -1903,12 +1901,12 @@ manual, and boot the resulting target image. | |||
1903 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and | 1901 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and |
1904 | the kernel in that machine's :term:`Build Directory` must match the kernel on | 1902 | the kernel in that machine's :term:`Build Directory` must match the kernel on |
1905 | the booted system exactly, or you'll get the above ``crosstap`` message | 1903 | the booted system exactly, or you'll get the above ``crosstap`` message |
1906 | when you try to invoke a script. | 1904 | when you try to call a script. |
1907 | 1905 | ||
1908 | Running a Script on a Target | 1906 | Running a Script on a Target |
1909 | ---------------------------- | 1907 | ---------------------------- |
1910 | 1908 | ||
1911 | Once you've done that, you should be able to run a systemtap script on | 1909 | Once you've done that, you should be able to run a SystemTap script on |
1912 | the target:: | 1910 | the target:: |
1913 | 1911 | ||
1914 | $ cd /path/to/yocto | 1912 | $ cd /path/to/yocto |
@@ -1937,7 +1935,7 @@ If you get an error connecting to the target e.g.:: | |||
1937 | $ crosstap root@192.168.7.2 trace_open.stp | 1935 | $ crosstap root@192.168.7.2 trace_open.stp |
1938 | error establishing ssh connection on remote 'root@192.168.7.2' | 1936 | error establishing ssh connection on remote 'root@192.168.7.2' |
1939 | 1937 | ||
1940 | Try ssh'ing to the target and see what happens:: | 1938 | Try connecting to the target through SSH and see what happens:: |
1941 | 1939 | ||
1942 | $ ssh root@192.168.7.2 | 1940 | $ ssh root@192.168.7.2 |
1943 | 1941 | ||
@@ -1955,7 +1953,7 @@ no password): | |||
1955 | matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) | 1953 | matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) |
1956 | matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) | 1954 | matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) |
1957 | 1955 | ||
1958 | systemtap Documentation | 1956 | SystemTap Documentation |
1959 | ----------------------- | 1957 | ----------------------- |
1960 | 1958 | ||
1961 | The SystemTap language reference can be found here: `SystemTap Language | 1959 | The SystemTap language reference can be found here: `SystemTap Language |
@@ -1968,7 +1966,7 @@ page <https://sourceware.org/systemtap/documentation.html>`__ | |||
1968 | Sysprof | 1966 | Sysprof |
1969 | ======= | 1967 | ======= |
1970 | 1968 | ||
1971 | Sysprof is a very easy to use system-wide profiler that consists of a | 1969 | Sysprof is an easy to use system-wide profiler that consists of a |
1972 | single window with three panes and a few buttons which allow you to | 1970 | single window with three panes and a few buttons which allow you to |
1973 | start, stop, and view the profile from one place. | 1971 | start, stop, and view the profile from one place. |
1974 | 1972 | ||
@@ -1978,16 +1976,16 @@ Sysprof Setup | |||
1978 | For this section, we'll assume you've already performed the basic setup | 1976 | For this section, we'll assume you've already performed the basic setup |
1979 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 1977 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
1980 | 1978 | ||
1981 | Sysprof is a GUI-based application that runs on the target system. For | 1979 | Sysprof is a GUI-based application that runs on the target system. For the rest |
1982 | the rest of this document we assume you've ssh'ed to the host and will | 1980 | of this document we assume you're connected to the host through SSH and will be |
1983 | be running Sysprof on the target (you can use the ``-X`` option to ssh and | 1981 | running Sysprof on the target (you can use the ``-X`` option to ``ssh`` and |
1984 | have the Sysprof GUI run on the target but display remotely on the host | 1982 | have the Sysprof GUI run on the target but display remotely on the host |
1985 | if you want). | 1983 | if you want). |
1986 | 1984 | ||
1987 | Basic Sysprof Usage | 1985 | Basic Sysprof Usage |
1988 | ------------------- | 1986 | ------------------- |
1989 | 1987 | ||
1990 | To start profiling the system, you simply press the ``Start`` button. To | 1988 | To start profiling the system, you just press the ``Start`` button. To |
1991 | stop profiling and to start viewing the profile data in one easy step, | 1989 | stop profiling and to start viewing the profile data in one easy step, |
1992 | press the ``Profile`` button. | 1990 | press the ``Profile`` button. |
1993 | 1991 | ||
@@ -2001,11 +1999,11 @@ with profiling data: | |||
2001 | The left pane shows a list of functions and processes. Selecting one of | 1999 | The left pane shows a list of functions and processes. Selecting one of |
2002 | those expands that function in the right pane, showing all its callees. | 2000 | those expands that function in the right pane, showing all its callees. |
2003 | Note that this caller-oriented display is essentially the inverse of | 2001 | Note that this caller-oriented display is essentially the inverse of |
2004 | perf's default callee-oriented callchain display. | 2002 | perf's default callee-oriented call chain display. |
2005 | 2003 | ||
2006 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and | 2004 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and |
2007 | looking up the callchain we can see that one of the callers of | 2005 | looking up the call chain we can see that one of the callers of |
2008 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete callpath between them. | 2006 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete call path between them. |
2009 | Notice that this is essentially a portion of the same information we saw | 2007 | Notice that this is essentially a portion of the same information we saw |
2010 | in the perf display shown in the perf section of this page. | 2008 | in the perf display shown in the perf section of this page. |
2011 | 2009 | ||
@@ -2014,7 +2012,7 @@ in the perf display shown in the perf section of this page. | |||
2014 | :width: 70% | 2012 | :width: 70% |
2015 | 2013 | ||
2016 | Similarly, the above is a snapshot of the Sysprof display of a | 2014 | Similarly, the above is a snapshot of the Sysprof display of a |
2017 | ``copy-from-user`` callchain. | 2015 | ``copy-from-user`` call chain. |
2018 | 2016 | ||
2019 | Finally, looking at the third Sysprof pane in the lower left, we can see | 2017 | Finally, looking at the third Sysprof pane in the lower left, we can see |
2020 | a list of all the callers of a particular function selected in the top | 2018 | a list of all the callers of a particular function selected in the top |
@@ -2030,18 +2028,17 @@ to the selected function, and so on. | |||
2030 | 2028 | ||
2031 | .. admonition:: Tying it Together | 2029 | .. admonition:: Tying it Together |
2032 | 2030 | ||
2033 | If you like sysprof's ``caller-oriented`` display, you may be able to | 2031 | If you like Sysprof's ``caller-oriented`` display, you may be able to |
2034 | approximate it in other tools as well. For example, ``perf report`` has | 2032 | approximate it in other tools as well. For example, ``perf report`` has |
2035 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the | 2033 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the |
2036 | options is ``caller`` for an inverted caller-based callgraph display. | 2034 | options is ``caller`` for an inverted caller-based call graph display. |
2037 | 2035 | ||
2038 | Sysprof Documentation | 2036 | Sysprof Documentation |
2039 | --------------------- | 2037 | --------------------- |
2040 | 2038 | ||
2041 | There doesn't seem to be any documentation for Sysprof, but maybe that's | 2039 | There doesn't seem to be any documentation for Sysprof, but maybe that's |
2042 | because it's pretty self-explanatory. The Sysprof website, however, is | 2040 | because it's pretty self-explanatory. The Sysprof website, however, is here: |
2043 | here: `Sysprof, System-wide Performance Profiler for | 2041 | `Sysprof, System-wide Performance Profiler for Linux <http://sysprof.com/>`__ |
2044 | Linux <http://sysprof.com/>`__ | ||
2045 | 2042 | ||
2046 | LTTng (Linux Trace Toolkit, next generation) | 2043 | LTTng (Linux Trace Toolkit, next generation) |
2047 | ============================================ | 2044 | ============================================ |
@@ -2051,7 +2048,7 @@ LTTng Setup | |||
2051 | 2048 | ||
2052 | For this section, we'll assume you've already performed the basic setup | 2049 | For this section, we'll assume you've already performed the basic setup |
2053 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 2050 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
2054 | LTTng is run on the target system by ssh'ing to it. | 2051 | LTTng is run on the target system by connecting to it through SSH. |
2055 | 2052 | ||
2056 | Collecting and Viewing Traces | 2053 | Collecting and Viewing Traces |
2057 | ----------------------------- | 2054 | ----------------------------- |
@@ -2064,7 +2061,7 @@ tracing. | |||
2064 | Collecting and viewing a trace on the target (inside a shell) | 2061 | Collecting and viewing a trace on the target (inside a shell) |
2065 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2062 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2066 | 2063 | ||
2067 | First, from the host, ssh to the target:: | 2064 | First, from the host, connect to the target through SSH:: |
2068 | 2065 | ||
2069 | $ ssh -l root 192.168.1.47 | 2066 | $ ssh -l root 192.168.1.47 |
2070 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. | 2067 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. |
@@ -2156,11 +2153,11 @@ supplying your own name to ``lttng create``):: | |||
2156 | drwxr-xr-x 5 root root 1024 Oct 15 23:57 .. | 2153 | drwxr-xr-x 5 root root 1024 Oct 15 23:57 .. |
2157 | drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120 | 2154 | drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120 |
2158 | 2155 | ||
2159 | Collecting and viewing a userspace trace on the target (inside a shell) | 2156 | Collecting and viewing a user space trace on the target (inside a shell) |
2160 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2157 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2161 | 2158 | ||
2162 | For LTTng userspace tracing, you need to have a properly instrumented | 2159 | For LTTng user space tracing, you need to have a properly instrumented |
2163 | userspace program. For this example, we'll use the ``hello`` test program | 2160 | user space program. For this example, we'll use the ``hello`` test program |
2164 | generated by the ``lttng-ust`` build. | 2161 | generated by the ``lttng-ust`` build. |
2165 | 2162 | ||
2166 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` | 2163 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` |
@@ -2176,7 +2173,7 @@ Copy that over to the target machine:: | |||
2176 | You now have the instrumented LTTng "hello world" test program on the | 2173 | You now have the instrumented LTTng "hello world" test program on the |
2177 | target, ready to test. | 2174 | target, ready to test. |
2178 | 2175 | ||
2179 | First, from the host, ssh to the target:: | 2176 | First, from the host, connect to the target through SSH:: |
2180 | 2177 | ||
2181 | $ ssh -l root 192.168.1.47 | 2178 | $ ssh -l root 192.168.1.47 |
2182 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. | 2179 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. |
@@ -2191,7 +2188,7 @@ Once on the target, use these steps to create a trace:: | |||
2191 | Session auto-20190303-021943 created. | 2188 | Session auto-20190303-021943 created. |
2192 | Traces will be written in /home/root/lttng-traces/auto-20190303-021943 | 2189 | Traces will be written in /home/root/lttng-traces/auto-20190303-021943 |
2193 | 2190 | ||
2194 | Enable the events you want to trace (in this case all userspace events):: | 2191 | Enable the events you want to trace (in this case all user space events):: |
2195 | 2192 | ||
2196 | root@crownbay:~# lttng enable-event --userspace --all | 2193 | root@crownbay:~# lttng enable-event --userspace --all |
2197 | All UST events are enabled in channel channel0 | 2194 | All UST events are enabled in channel channel0 |
@@ -2263,13 +2260,13 @@ the entire blktrace and blkparse pipeline on the target, or you can run | |||
2263 | blktrace in 'listen' mode on the target and have blktrace and blkparse | 2260 | blktrace in 'listen' mode on the target and have blktrace and blkparse |
2264 | collect and analyze the data on the host (see the | 2261 | collect and analyze the data on the host (see the |
2265 | ":ref:`profile-manual/usage:Using blktrace Remotely`" section | 2262 | ":ref:`profile-manual/usage:Using blktrace Remotely`" section |
2266 | below). For the rest of this section we assume you've ssh'ed to the host and | 2263 | below). For the rest of this section we assume you've connected to the host through SSH |
2267 | will be running blkrace on the target. | 2264 | and will be running blktrace on the target. |
2268 | 2265 | ||
2269 | Basic blktrace Usage | 2266 | Basic blktrace Usage |
2270 | -------------------- | 2267 | -------------------- |
2271 | 2268 | ||
2272 | To record a trace, simply run the ``blktrace`` command, giving it the name | 2269 | To record a trace, just run the ``blktrace`` command, giving it the name |
2273 | of the block device you want to trace activity on:: | 2270 | of the block device you want to trace activity on:: |
2274 | 2271 | ||
2275 | root@crownbay:~# blktrace /dev/sdc | 2272 | root@crownbay:~# blktrace /dev/sdc |
@@ -2280,10 +2277,10 @@ In another shell, execute a workload you want to trace:: | |||
2280 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2277 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
2281 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 2278 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
2282 | 2279 | ||
2283 | Press Ctrl-C in the blktrace shell to stop the trace. It | 2280 | Press ``Ctrl-C`` in the blktrace shell to stop the trace. It |
2284 | will display how many events were logged, along with the per-cpu file | 2281 | will display how many events were logged, along with the per-cpu file |
2285 | sizes (blktrace records traces in per-cpu kernel buffers and simply | 2282 | sizes (blktrace records traces in per-cpu kernel buffers and just |
2286 | dumps them to userspace for blkparse to merge and sort later):: | 2283 | dumps them to user space for blkparse to merge and sort later):: |
2287 | 2284 | ||
2288 | ^C=== sdc === | 2285 | ^C=== sdc === |
2289 | CPU 0: 7082 events, 332 KiB data | 2286 | CPU 0: 7082 events, 332 KiB data |
@@ -2299,7 +2296,7 @@ with the device name as the first part of the filename:: | |||
2299 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 | 2296 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 |
2300 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 | 2297 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 |
2301 | 2298 | ||
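The listing above shows one trace file per CPU, named ``<device>.blktrace.<cpu>``. As a minimal sketch of that naming scheme, the following Python helper (the function name is hypothetical) groups a directory listing by device, which gives the device names that can then be passed to ``blkparse``:

```python
import re
from collections import defaultdict

# One file per CPU: <device>.blktrace.<cpu>, for example sdc.blktrace.0
TRACE_FILE = re.compile(r"^(?P<device>.+)\.blktrace\.(?P<cpu>\d+)$")


def group_trace_files(names):
    """Map each traced device to the sorted list of CPU numbers found."""
    groups = defaultdict(list)
    for name in names:
        match = TRACE_FILE.match(name)
        if match:  # ignore files that are not blktrace output
            groups[match.group("device")].append(int(match.group("cpu")))
    return {device: sorted(cpus) for device, cpus in groups.items()}


if __name__ == "__main__":
    print(group_trace_files(["sdc.blktrace.1", "sdc.blktrace.0", "notes.txt"]))
    # prints {'sdc': [0, 1]}
```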
2302 | To view the trace events, simply invoke ``blkparse`` in the directory | 2299 | To view the trace events, just call ``blkparse`` in the directory |
2303 | containing the trace files, giving it the device name that forms the | 2300 | containing the trace files, giving it the device name that forms the |
2304 | first part of the filenames:: | 2301 | first part of the filenames:: |
2305 | 2302 | ||
@@ -2398,8 +2395,8 @@ Live Mode | |||
2398 | ~~~~~~~~~ | 2395 | ~~~~~~~~~ |
2399 | 2396 | ||
2400 | blktrace and blkparse are designed from the ground up to be able to | 2397 | blktrace and blkparse are designed from the ground up to be able to |
2401 | operate together in a "pipe mode" where the stdout of blktrace can be | 2398 | operate together in a "pipe mode" where the standard output of blktrace can be |
2402 | fed directly into the stdin of blkparse:: | 2399 | fed directly into the standard input of blkparse:: |
2403 | 2400 | ||
2404 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - | 2401 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - |
2405 | 2402 | ||
@@ -2468,7 +2465,7 @@ just ended:: | |||
2468 | Total: 11800 events (dropped 0), 554 KiB data | 2465 | Total: 11800 events (dropped 0), 554 KiB data |
2469 | 2466 | ||
2470 | The blktrace instance on the host will | 2467 | The blktrace instance on the host will |
2471 | save the target output inside a hostname-timestamp directory:: | 2468 | save the target output inside a ``<hostname>-<timestamp>`` directory:: |
2472 | 2469 | ||
2473 | $ ls -al | 2470 | $ ls -al |
2474 | drwxr-xr-x 10 root root 1024 Oct 28 02:40 . | 2471 | drwxr-xr-x 10 root root 1024 Oct 28 02:40 . |
@@ -2540,7 +2537,7 @@ Tracing Block I/O via 'ftrace' | |||
2540 | It's also possible to trace block I/O using only | 2537 | It's also possible to trace block I/O using only |
2541 | :ref:`profile-manual/usage:The 'trace events' Subsystem`, which | 2538 | :ref:`profile-manual/usage:The 'trace events' Subsystem`, which |
2542 | can be useful for casual tracing if you don't want to bother dealing with the | 2539 | can be useful for casual tracing if you don't want to bother dealing with the |
2543 | userspace tools. | 2540 | user space tools. |
2544 | 2541 | ||
2545 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, | 2542 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, |
2546 | where ``xxx`` is the device name. This for example enables tracing for | 2543 | where ``xxx`` is the device name. This for example enables tracing for |
@@ -2598,7 +2595,7 @@ section can be found here: | |||
2598 | - https://linux.die.net/man/8/btrace | 2595 | - https://linux.die.net/man/8/btrace |
2599 | 2596 | ||
2600 | The above manual pages, along with manuals for the other blktrace utilities | 2597 | The above manual pages, along with manuals for the other blktrace utilities |
2601 | (btt, blkiomon, etc) can be found in the ``/doc`` directory of the blktrace | 2598 | (``btt``, ``blkiomon``, etc.) can be found in the ``/doc`` directory of the blktrace |
2602 | tools git repo:: | 2599 | tools git repository:: |
2603 | 2600 | ||
2604 | $ git clone git://git.kernel.dk/blktrace.git | 2601 | $ git clone git://git.kernel.dk/blktrace.git |
diff --git a/documentation/styles/config/vocabularies/OpenSource/accept.txt b/documentation/styles/config/vocabularies/OpenSource/accept.txt index 98e76ae1f5..e378fbf79b 100644 --- a/documentation/styles/config/vocabularies/OpenSource/accept.txt +++ b/documentation/styles/config/vocabularies/OpenSource/accept.txt | |||
@@ -1,4 +1,20 @@ | |||
1 | autovivification | ||
2 | blkparse | ||
3 | blktrace | ||
4 | callee | ||
5 | debugfs | ||
1 | ftrace | 6 | ftrace |
2 | toolchain | 7 | KernelShark |
3 | systemd | 8 | Kprobe |
4 | LTTng | 9 | LTTng |
10 | perf | ||
11 | profiler | ||
12 | subcommand | ||
13 | subnode | ||
14 | superset | ||
15 | Sysprof | ||
16 | systemd | ||
17 | toolchain | ||
18 | tracepoint | ||
19 | Uprobe | ||
20 | wget | ||
diff --git a/documentation/styles/config/vocabularies/Yocto/accept.txt b/documentation/styles/config/vocabularies/Yocto/accept.txt index b725414014..ca622ba412 100644 --- a/documentation/styles/config/vocabularies/Yocto/accept.txt +++ b/documentation/styles/config/vocabularies/Yocto/accept.txt | |||
@@ -1,4 +1,5 @@ | |||
1 | Yocto | ||
2 | BSP | ||
3 | BitBake | 1 | BitBake |
2 | BSP | ||
3 | crosstap | ||
4 | OpenEmbedded | 4 | OpenEmbedded |
5 | Yocto | ||