author | Michael Opdenacker <michael.opdenacker@bootlin.com> | 2024-03-14 13:28:00 +0100 |
---|---|---|
committer | Steve Sakoman <steve@sakoman.com> | 2024-04-05 07:24:00 -0700 |
commit | 2b4a64396e423160ac65ae4c017b42e37e94e49d (patch) | |
tree | bef1d7557c8333cb1fa176412ece097f7c8c0fc5 | |
parent | 28cd826b579931b69bb2525a9a93fe478911a331 (diff) | |
download | poky-2b4a64396e423160ac65ae4c017b42e37e94e49d.tar.gz |
profile-manual: usage.rst: further style improvements
According to errors reported by "make stylecheck"
(From yocto-docs rev: b3aaf4523190f7528d49c29a9aea234bb1647eae)
Signed-off-by: Michael Opdenacker <michael.opdenacker@bootlin.com>
Signed-off-by: Steve Sakoman <steve@sakoman.com>
-rw-r--r-- | documentation/profile-manual/usage.rst | 335 |
-rw-r--r-- | documentation/styles/config/vocabularies/OpenSource/accept.txt | 20 |
-rw-r--r-- | documentation/styles/config/vocabularies/Yocto/accept.txt | 5 |
3 files changed, 187 insertions, 173 deletions
diff --git a/documentation/profile-manual/usage.rst b/documentation/profile-manual/usage.rst
index 28dcb969a0..542bd918b9 100644
--- a/documentation/profile-manual/usage.rst
+++ b/documentation/profile-manual/usage.rst
@@ -10,7 +10,7 @@ Basic Usage (with examples) for each of the Yocto Tracing Tools | |||
10 | This chapter presents basic usage examples for each of the tracing | 10 | This chapter presents basic usage examples for each of the tracing |
11 | tools. | 11 | tools. |
12 | 12 | ||
13 | Perf | 13 | perf |
14 | ==== | 14 | ==== |
15 | 15 | ||
16 | The perf tool is the profiling and tracing tool that comes bundled | 16 | The perf tool is the profiling and tracing tool that comes bundled |
@@ -26,12 +26,12 @@ of what's going on. | |||
26 | 26 | ||
27 | In many ways, perf aims to be a superset of all the tracing and | 27 | In many ways, perf aims to be a superset of all the tracing and |
28 | profiling tools available in Linux today, including all the other tools | 28 | profiling tools available in Linux today, including all the other tools |
29 | covered in this HOWTO. The past couple of years have seen perf subsume a | 29 | covered in this How-to. The past couple of years have seen perf subsume a |
30 | lot of the functionality of those other tools and, at the same time, | 30 | lot of the functionality of those other tools and, at the same time, |
31 | those other tools have removed large portions of their previous | 31 | those other tools have removed large portions of their previous |
32 | functionality and replaced it with calls to the equivalent functionality | 32 | functionality and replaced it with calls to the equivalent functionality |
33 | now implemented by the perf subsystem. Extrapolation suggests that at | 33 | now implemented by the perf subsystem. Extrapolation suggests that at |
34 | some point those other tools will simply become completely redundant and | 34 | some point those other tools will become completely redundant and |
35 | go away; until then, we'll cover those other tools in these pages and in | 35 | go away; until then, we'll cover those other tools in these pages and in |
36 | many cases show how the same things can be accomplished in perf and the | 36 | many cases show how the same things can be accomplished in perf and the |
37 | other tools when it seems useful to do so. | 37 | other tools when it seems useful to do so. |
@@ -41,7 +41,7 @@ want to apply the tool; full documentation can be found either within | |||
41 | the tool itself or in the manual pages at | 41 | the tool itself or in the manual pages at |
42 | `perf(1) <https://linux.die.net/man/1/perf>`__. | 42 | `perf(1) <https://linux.die.net/man/1/perf>`__. |
43 | 43 | ||
44 | Perf Setup | 44 | perf Setup |
45 | ---------- | 45 | ---------- |
46 | 46 | ||
47 | For this section, we'll assume you've already performed the basic setup | 47 | For this section, we'll assume you've already performed the basic setup |
@@ -54,14 +54,14 @@ image built with the following in your ``local.conf`` file:: | |||
54 | 54 | ||
55 | perf runs on the target system for the most part. You can archive | 55 | perf runs on the target system for the most part. You can archive |
56 | profile data and copy it to the host for analysis, but for the rest of | 56 | profile data and copy it to the host for analysis, but for the rest of |
57 | this document we assume you've ssh'ed to the host and will be running | 57 | this document we assume you're connected to the host through SSH and will be |
58 | the perf commands on the target. | 58 | running the perf commands on the target. |
59 | 59 | ||
60 | Basic Perf Usage | 60 | Basic perf Usage |
61 | ---------------- | 61 | ---------------- |
62 | 62 | ||
63 | The perf tool is pretty much self-documenting. To remind yourself of the | 63 | The perf tool is pretty much self-documenting. To remind yourself of the |
64 | available commands, simply type ``perf``, which will show you basic usage | 64 | available commands, just type ``perf``, which will show you basic usage |
65 | along with the available perf subcommands:: | 65 | along with the available perf subcommands:: |
66 | 66 | ||
67 | root@crownbay:~# perf | 67 | root@crownbay:~# perf |
@@ -101,7 +101,7 @@ As a simple test case, we'll profile the ``wget`` of a fairly large file, | |||
101 | which is a minimally interesting case because it has both file and | 101 | which is a minimally interesting case because it has both file and |
102 | network I/O aspects, and at least in the case of standard Yocto images, | 102 | network I/O aspects, and at least in the case of standard Yocto images, |
103 | it's implemented as part of BusyBox, so the methods we use to analyze it | 103 | it's implemented as part of BusyBox, so the methods we use to analyze it |
104 | can be used in a very similar way to the whole host of supported BusyBox | 104 | can be used in a similar way to the whole host of supported BusyBox |
105 | applets in Yocto:: | 105 | applets in Yocto:: |
106 | 106 | ||
107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ | 107 | root@crownbay:~# rm linux-2.6.19.2.tar.bz2; \ |
@@ -164,17 +164,17 @@ hits and misses:: | |||
164 | 164 | ||
165 | 44.831023415 seconds time elapsed | 165 | 44.831023415 seconds time elapsed |
166 | 166 | ||
167 | So ``perf stat`` gives us a nice easy | 167 | As you can see, ``perf stat`` gives us a nice easy |
168 | way to get a quick overview of what might be happening for a set of | 168 | way to get a quick overview of what might be happening for a set of |
169 | events, but normally we'd need a little more detail in order to | 169 | events, but normally we'd need a little more detail in order to |
170 | understand what's going on in a way that we can act on in a useful way. | 170 | understand what's going on in a way that we can act on in a useful way. |
171 | 171 | ||
172 | To dive down into a next level of detail, we can use ``perf record`` / | 172 | To dive down into a next level of detail, we can use ``perf record`` / |
173 | ``perf report`` which will collect profiling data and present it to use using an | 173 | ``perf report`` which will collect profiling data and present it to use using an |
174 | interactive text-based UI (or simply as text if we specify ``--stdio`` to | 174 | interactive text-based UI (or just as text if we specify ``--stdio`` to |
175 | ``perf report``). | 175 | ``perf report``). |
176 | 176 | ||
177 | As our first attempt at profiling this workload, we'll simply run ``perf | 177 | As our first attempt at profiling this workload, we'll just run ``perf |
178 | record``, handing it the workload we want to profile (everything after | 178 | record``, handing it the workload we want to profile (everything after |
179 | ``perf record`` and any perf options we hand it --- here none, will be | 179 | ``perf record`` and any perf options we hand it --- here none, will be |
180 | executed in a new shell). perf collects samples until the process exits | 180 | executed in a new shell). perf collects samples until the process exits |
@@ -189,7 +189,7 @@ directory:: | |||
189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] | 189 | [ perf record: Captured and wrote 0.176 MB perf.data (~7700 samples) ] |
190 | 190 | ||
191 | To see the results in a | 191 | To see the results in a |
192 | "text-based UI" (tui), simply run ``perf report``, which will read the | 192 | "text-based UI" (tui), just run ``perf report``, which will read the |
193 | perf.data file in the current working directory and display the results | 193 | perf.data file in the current working directory and display the results |
194 | in an interactive UI:: | 194 | in an interactive UI:: |
195 | 195 | ||
@@ -203,10 +203,10 @@ The above screenshot displays a "flat" profile, one entry for each | |||
203 | profiling run, ordered from the most popular to the least (perf has | 203 | profiling run, ordered from the most popular to the least (perf has |
204 | options to sort in various orders and keys as well as display entries | 204 | options to sort in various orders and keys as well as display entries |
205 | only above a certain threshold and so on --- see the perf documentation | 205 | only above a certain threshold and so on --- see the perf documentation |
206 | for details). Note that this includes both userspace functions (entries | 206 | for details). Note that this includes both user space functions (entries |
207 | containing a ``[.]``) and kernel functions accounted to the process (entries | 207 | containing a ``[.]``) and kernel functions accounted to the process (entries |
208 | containing a ``[k]``). perf has command-line modifiers that can be used to | 208 | containing a ``[k]``). perf has command-line modifiers that can be used to |
209 | restrict the profiling to kernel or userspace, among others. | 209 | restrict the profiling to kernel or user space, among others. |
210 | 210 | ||
211 | Notice also that the above report shows an entry for ``busybox``, which is | 211 | Notice also that the above report shows an entry for ``busybox``, which is |
212 | the executable that implements ``wget`` in Yocto, but that instead of a | 212 | the executable that implements ``wget`` in Yocto, but that instead of a |
@@ -217,7 +217,7 @@ Before we do that, however, let's try running a different profile, one | |||
217 | which shows something a little more interesting. The only difference | 217 | which shows something a little more interesting. The only difference |
218 | between the new profile and the previous one is that we'll add the ``-g`` | 218 | between the new profile and the previous one is that we'll add the ``-g`` |
219 | option, which will record not just the address of a sampled function, | 219 | option, which will record not just the address of a sampled function, |
220 | but the entire callchain to the sampled function as well:: | 220 | but the entire call chain to the sampled function as well:: |
221 | 221 | ||
222 | root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 222 | root@crownbay:~# perf record -g wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
223 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 223 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
@@ -231,26 +231,26 @@ but the entire callchain to the sampled function as well:: | |||
231 | .. image:: figures/perf-wget-g-copy-to-user-expanded-stripped.png | 231 | .. image:: figures/perf-wget-g-copy-to-user-expanded-stripped.png |
232 | :align: center | 232 | :align: center |
233 | 233 | ||
234 | Using the callgraph view, we can actually see not only which functions | 234 | Using the call graph view, we can actually see not only which functions |
235 | took the most time, but we can also see a summary of how those functions | 235 | took the most time, but we can also see a summary of how those functions |
236 | were called and learn something about how the program interacts with the | 236 | were called and learn something about how the program interacts with the |
237 | kernel in the process. | 237 | kernel in the process. |
238 | 238 | ||
239 | Notice that each entry in the above screenshot now contains a ``+`` on the | 239 | Notice that each entry in the above screenshot now contains a ``+`` on the |
240 | left-hand side. This means that we can expand the entry and drill down | 240 | left side. This means that we can expand the entry and drill down |
241 | into the callchains that feed into that entry. Pressing ``Enter`` on any | 241 | into the call chains that feed into that entry. Pressing ``Enter`` on any |
242 | one of them will expand the callchain (you can also press ``E`` to expand | 242 | one of them will expand the call chain (you can also press ``E`` to expand |
243 | them all at the same time or ``C`` to collapse them all). | 243 | them all at the same time or ``C`` to collapse them all). |
244 | 244 | ||
245 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry | 245 | In the screenshot above, we've toggled the ``__copy_to_user_ll()`` entry |
246 | and several subnodes all the way down. This lets us see which callchains | 246 | and several subnodes all the way down. This lets us see which call chains |
247 | contributed to the profiled ``__copy_to_user_ll()`` function which | 247 | contributed to the profiled ``__copy_to_user_ll()`` function which |
248 | contributed 1.77% to the total profile. | 248 | contributed 1.77% to the total profile. |
249 | 249 | ||
250 | As a bit of background explanation for these callchains, think about | 250 | As a bit of background explanation for these call chains, think about |
251 | what happens at a high level when you run wget to get a file out on the | 251 | what happens at a high level when you run ``wget`` to get a file out on the |
252 | network. Basically what happens is that the data comes into the kernel | 252 | network. Basically what happens is that the data comes into the kernel |
253 | via the network connection (socket) and is passed to the userspace | 253 | via the network connection (socket) and is passed to the user space |
254 | program ``wget`` (which is actually a part of BusyBox, but that's not | 254 | program ``wget`` (which is actually a part of BusyBox, but that's not |
255 | important for now), which takes the buffers the kernel passes to it and | 255 | important for now), which takes the buffers the kernel passes to it and |
256 | writes it to a disk file to save it. | 256 | writes it to a disk file to save it. |
@@ -260,15 +260,15 @@ is the part where the kernel passes the data it has read from the socket | |||
260 | down to wget i.e. a ``copy-to-user``. | 260 | down to wget i.e. a ``copy-to-user``. |
261 | 261 | ||
262 | Notice also that here there's also a case where the hex value is | 262 | Notice also that here there's also a case where the hex value is |
263 | displayed in the callstack, here in the expanded ``sys_clock_gettime()`` | 263 | displayed in the call stack, here in the expanded ``sys_clock_gettime()`` |
264 | function. Later we'll see it resolve to a userspace function call in | 264 | function. Later we'll see it resolve to a user space function call in |
265 | busybox. | 265 | BusyBox. |
266 | 266 | ||
267 | .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png | 267 | .. image:: figures/perf-wget-g-copy-from-user-expanded-stripped.png |
268 | :align: center | 268 | :align: center |
269 | 269 | ||
270 | The above screenshot shows the other half of the journey for the data --- | 270 | The above screenshot shows the other half of the journey for the data --- |
271 | from the ``wget`` program's userspace buffers to disk. To get the buffers to | 271 | from the ``wget`` program's user space buffers to disk. To get the buffers to |
272 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to | 272 | disk, the wget program issues a ``write(2)``, which does a ``copy-from-user`` to |
273 | the kernel, which then takes care via some circuitous path (probably | 273 | the kernel, which then takes care via some circuitous path (probably |
274 | also present somewhere in the profile data), to get it safely to disk. | 274 | also present somewhere in the profile data), to get it safely to disk. |
@@ -278,8 +278,8 @@ of how to extract useful information out of it, let's get back to the | |||
278 | task at hand and see if we can get some basic idea about where the time | 278 | task at hand and see if we can get some basic idea about where the time |
279 | is spent in the program we're profiling, wget. Remember that wget is | 279 | is spent in the program we're profiling, wget. Remember that wget is |
280 | actually implemented as an applet in BusyBox, so while the process name | 280 | actually implemented as an applet in BusyBox, so while the process name |
281 | is ``wget``, the executable we're actually interested in is BusyBox. So | 281 | is ``wget``, the executable we're actually interested in is ``busybox``. |
282 | let's expand the first entry containing BusyBox: | 282 | Therefore, let's expand the first entry containing BusyBox: |
283 | 283 | ||
284 | .. image:: figures/perf-wget-busybox-expanded-stripped.png | 284 | .. image:: figures/perf-wget-busybox-expanded-stripped.png |
285 | :align: center | 285 | :align: center |
@@ -289,7 +289,7 @@ hex value instead of a symbol as with most of the kernel entries. | |||
289 | Expanding the BusyBox entry doesn't make it any better. | 289 | Expanding the BusyBox entry doesn't make it any better. |
290 | 290 | ||
291 | The problem is that perf can't find the symbol information for the | 291 | The problem is that perf can't find the symbol information for the |
292 | busybox binary, which is actually stripped out by the Yocto build | 292 | ``busybox`` binary, which is actually stripped out by the Yocto build |
293 | system. | 293 | system. |
294 | 294 | ||
295 | One way around that is to put the following in your ``local.conf`` file | 295 | One way around that is to put the following in your ``local.conf`` file |
@@ -299,20 +299,20 @@ when you build the image:: | |||
299 | 299 | ||
300 | However, we already have an image with the binaries stripped, so | 300 | However, we already have an image with the binaries stripped, so |
301 | what can we do to get perf to resolve the symbols? Basically we need to | 301 | what can we do to get perf to resolve the symbols? Basically we need to |
302 | install the debuginfo for the BusyBox package. | 302 | install the debugging information for the BusyBox package. |
303 | 303 | ||
304 | To generate the debug info for the packages in the image, we can add | 304 | To generate the debug info for the packages in the image, we can add |
305 | ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example:: | 305 | ``dbg-pkgs`` to :term:`EXTRA_IMAGE_FEATURES` in ``local.conf``. For example:: |
306 | 306 | ||
307 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" | 307 | EXTRA_IMAGE_FEATURES = "debug-tweaks tools-profile dbg-pkgs" |
308 | 308 | ||
309 | Additionally, in order to generate the type of debuginfo that perf | 309 | Additionally, in order to generate the type of debugging information that perf |
310 | understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE` | 310 | understands, we also need to set :term:`PACKAGE_DEBUG_SPLIT_STYLE` |
311 | in the ``local.conf`` file:: | 311 | in the ``local.conf`` file:: |
312 | 312 | ||
313 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' | 313 | PACKAGE_DEBUG_SPLIT_STYLE = 'debug-file-directory' |
314 | 314 | ||
315 | Once we've done that, we can install the debuginfo for BusyBox. The | 315 | Once we've done that, we can install the debugging information for BusyBox. The |
316 | debug packages once built can be found in ``build/tmp/deploy/rpm/*`` | 316 | debug packages once built can be found in ``build/tmp/deploy/rpm/*`` |
317 | on the host system. Find the ``busybox-dbg-...rpm`` file and copy it | 317 | on the host system. Find the ``busybox-dbg-...rpm`` file and copy it |
318 | to the target. For example:: | 318 | to the target. For example:: |
@@ -320,11 +320,11 @@ to the target. For example:: | |||
320 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: | 320 | [trz@empanada core2]$ scp /home/trz/yocto/crownbay-tracing-dbg/build/tmp/deploy/rpm/core2_32/busybox-dbg-1.20.2-r2.core2_32.rpm root@192.168.1.31: |
321 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 | 321 | busybox-dbg-1.20.2-r2.core2_32.rpm 100% 1826KB 1.8MB/s 00:01 |
322 | 322 | ||
323 | Now install the debug rpm on the target:: | 323 | Now install the debug RPM on the target:: |
324 | 324 | ||
325 | root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm | 325 | root@crownbay:~# rpm -i busybox-dbg-1.20.2-r2.core2_32.rpm |
326 | 326 | ||
327 | Now that the debuginfo is installed, we see that the BusyBox entries now display | 327 | Now that the debugging information is installed, we see that the BusyBox entries now display |
328 | their functions symbolically: | 328 | their functions symbolically: |
329 | 329 | ||
330 | .. image:: figures/perf-wget-busybox-debuginfo.png | 330 | .. image:: figures/perf-wget-busybox-debuginfo.png |
@@ -344,7 +344,7 @@ expanded all the nodes using the ``E`` key): | |||
344 | .. image:: figures/perf-wget-busybox-dso-zoom.png | 344 | .. image:: figures/perf-wget-busybox-dso-zoom.png |
345 | :align: center | 345 | :align: center |
346 | 346 | ||
347 | Finally, we can see that now that the BusyBox debuginfo is installed, | 347 | Finally, we can see that now that the BusyBox debugging information is installed, |
348 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry | 348 | the previously unresolved symbol in the ``sys_clock_gettime()`` entry |
349 | mentioned previously is now resolved, and shows that the | 349 | mentioned previously is now resolved, and shows that the |
350 | ``sys_clock_gettime`` system call that was the source of 6.75% of the | 350 | ``sys_clock_gettime`` system call that was the source of 6.75% of the |
@@ -376,8 +376,8 @@ counter, something other than the default ``cycles``. | |||
376 | The tracing and profiling infrastructure in Linux has become unified in | 376 | The tracing and profiling infrastructure in Linux has become unified in |
377 | a way that allows us to use the same tool with a completely different | 377 | a way that allows us to use the same tool with a completely different |
378 | set of counters, not just the standard hardware counters that | 378 | set of counters, not just the standard hardware counters that |
379 | traditional tools have had to restrict themselves to (of course the | 379 | traditional tools have had to restrict themselves to (the |
380 | traditional tools can also make use of the expanded possibilities now | 380 | traditional tools can now actually make use of the expanded possibilities now |
381 | available to them, and in some cases have, as mentioned previously). | 381 | available to them, and in some cases have, as mentioned previously). |
382 | 382 | ||
383 | We can get a list of the available events that can be used to profile a | 383 | We can get a list of the available events that can be used to profile a |
@@ -517,14 +517,14 @@ workload via ``perf list``:: | |||
517 | .. admonition:: Tying it Together | 517 | .. admonition:: Tying it Together |
518 | 518 | ||
519 | These are exactly the same set of events defined by the trace event | 519 | These are exactly the same set of events defined by the trace event |
520 | subsystem and exposed by ftrace / tracecmd / kernelshark as files in | 520 | subsystem and exposed by ftrace / trace-cmd / KernelShark as files in |
521 | ``/sys/kernel/debug/tracing/events``, by SystemTap as | 521 | ``/sys/kernel/debug/tracing/events``, by SystemTap as |
522 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. | 522 | kernel.trace("tracepoint_name") and (partially) accessed by LTTng. |
523 | 523 | ||
524 | Only a subset of these would be of interest to us when looking at this | 524 | Only a subset of these would be of interest to us when looking at this |
525 | workload, so let's choose the most likely subsystems (identified by the | 525 | workload, so let's choose the most likely subsystems (identified by the |
526 | string before the colon in the Tracepoint events) and do a ``perf stat`` | 526 | string before the colon in the ``Tracepoint`` events) and do a ``perf stat`` |
527 | run using only those wildcarded subsystems:: | 527 | run using only those subsystem wildcards:: |
528 | 528 | ||
529 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 | 529 | root@crownbay:~# perf stat -e skb:* -e net:* -e napi:* -e sched:* -e workqueue:* -e irq:* -e syscalls:* wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2 |
530 | Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2': | 530 | Performance counter stats for 'wget &YOCTO_DL_URL;/mirror/sources/linux-2.6.19.2.tar.bz2': |
@@ -614,8 +614,8 @@ accounts for the function name actually displayed in the profile: | |||
614 | } | 614 | } |
615 | 615 | ||
616 | A couple of the more interesting | 616 | A couple of the more interesting |
617 | callchains are expanded and displayed above, basically some network | 617 | call chains are expanded and displayed above, basically some network |
618 | receive paths that presumably end up waking up wget (busybox) when | 618 | receive paths that presumably end up waking up wget (BusyBox) when |
619 | network data is ready. | 619 | network data is ready. |
620 | 620 | ||
621 | Note that because tracepoints are normally used for tracing, the default | 621 | Note that because tracepoints are normally used for tracing, the default |
@@ -635,7 +635,7 @@ high-level view of what's going on with a workload or across the system. | |||
635 | It is however by definition an approximation, as suggested by the most | 635 | It is however by definition an approximation, as suggested by the most |
636 | prominent word associated with it, ``sampling``. On the one hand, it | 636 | prominent word associated with it, ``sampling``. On the one hand, it |
637 | allows a representative picture of what's going on in the system to be | 637 | allows a representative picture of what's going on in the system to be |
638 | cheaply taken, but on the other hand, that cheapness limits its utility | 638 | cheaply taken, but alternatively, that cheapness limits its utility |
639 | when that data suggests a need to "dive down" more deeply to discover | 639 | when that data suggests a need to "dive down" more deeply to discover |
640 | what's really going on. In such cases, the only way to see what's really | 640 | what's really going on. In such cases, the only way to see what's really |
641 | going on is to be able to look at (or summarize more intelligently) the | 641 | going on is to be able to look at (or summarize more intelligently) the |
@@ -700,7 +700,7 @@ an infinite variety of ways. | |||
700 | Another way to look at it is that there are only so many ways that the | 700 | Another way to look at it is that there are only so many ways that the |
701 | 'primitive' counters can be used on their own to generate interesting | 701 | 'primitive' counters can be used on their own to generate interesting |
702 | output; to get anything more complicated than simple counts requires | 702 | output; to get anything more complicated than simple counts requires |
703 | some amount of additional logic, which is typically very specific to the | 703 | some amount of additional logic, which is typically specific to the |
704 | problem at hand. For example, if we wanted to make use of a 'counter' | 704 | problem at hand. For example, if we wanted to make use of a 'counter' |
705 | that maps to the value of the time difference between when a process was | 705 | that maps to the value of the time difference between when a process was |
706 | scheduled to run on a processor and the time it actually ran, we | 706 | scheduled to run on a processor and the time it actually ran, we |
@@ -711,12 +711,12 @@ standard profiling tools how much data every process on the system reads | |||
711 | and writes, along with how many of those reads and writes fail | 711 | and writes, along with how many of those reads and writes fail |
712 | completely. If we have sufficient trace data, however, we could with the | 712 | completely. If we have sufficient trace data, however, we could with the |
713 | right tools easily extract and present that information, but we'd need | 713 | right tools easily extract and present that information, but we'd need |
714 | something other than pre-canned profiling tools to do that. | 714 | something other than ready-made profiling tools to do that. |
715 | 715 | ||
716 | Luckily, there is a general-purpose way to handle such needs, called | 716 | Luckily, there is a general-purpose way to handle such needs, called |
717 | "programming languages". Making programming languages easily available | 717 | "programming languages". Making programming languages easily available |
718 | to apply to such problems given the specific format of data is called a | 718 | to apply to such problems given the specific format of data is called a |
719 | 'programming language binding' for that data and language. Perf supports | 719 | 'programming language binding' for that data and language. perf supports |
720 | two programming language bindings, one for Python and one for Perl. | 720 | two programming language bindings, one for Python and one for Perl. |
721 | 721 | ||
722 | .. admonition:: Tying it Together | 722 | .. admonition:: Tying it Together |
@@ -726,7 +726,7 @@ two programming language bindings, one for Python and one for Perl. | |||
726 | DProbes dpcc compiler, an ANSI C compiler which targeted a low-level | 726 | DProbes dpcc compiler, an ANSI C compiler which targeted a low-level |
727 | assembly language running on an in-kernel interpreter on the target | 727 | assembly language running on an in-kernel interpreter on the target |
728 | system. This is exactly analogous to what Sun's DTrace did, except | 728 | system. This is exactly analogous to what Sun's DTrace did, except |
729 | that DTrace invented its own language for the purpose. Systemtap, | 729 | that DTrace invented its own language for the purpose. SystemTap, |
730 | heavily inspired by DTrace, also created its own one-off language, | 730 | heavily inspired by DTrace, also created its own one-off language, |
731 | but rather than running the product on an in-kernel interpreter, | 731 | but rather than running the product on an in-kernel interpreter, |
732 | created an elaborate compiler-based machinery to translate its | 732 | created an elaborate compiler-based machinery to translate its |
@@ -739,8 +739,8 @@ entry / exit events we recorded:: | |||
739 | root@crownbay:~# perf script -g python | 739 | root@crownbay:~# perf script -g python |
740 | generated Python script: perf-script.py | 740 | generated Python script: perf-script.py |
741 | 741 | ||
742 | The skeleton script simply creates a Python function for each event type in the | 742 | The skeleton script just creates a Python function for each event type in the |
743 | ``perf.data`` file. The body of each function simply prints the event name along | 743 | ``perf.data`` file. The body of each function just prints the event name along |
744 | with its parameters. For example: | 744 | with its parameters. For example: |
745 | 745 | ||
746 | .. code-block:: python | 746 | .. code-block:: python |
@@ -783,7 +783,7 @@ We can run that script directly to print all of the events contained in the | |||
783 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 | 783 | syscalls__sys_exit_read 1 11624.859944032 1262 wget nr=3, ret=1024 |
784 | 784 | ||
785 | That in itself isn't very useful; after all, we can accomplish pretty much the | 785 | That in itself isn't very useful; after all, we can accomplish pretty much the |
786 | same thing by simply running ``perf script`` without arguments in the same | 786 | same thing by just running ``perf script`` without arguments in the same |
787 | directory as the ``perf.data`` file. | 787 | directory as the ``perf.data`` file. |
788 | 788 | ||
789 | We can however replace the print statements in the generated function | 789 | We can however replace the print statements in the generated function |
@@ -805,7 +805,7 @@ event. For example: | |||
805 | 805 | ||
806 | Each event handler function in the generated code | 806 | Each event handler function in the generated code |
807 | is modified to do this. For convenience, we define a common function | 807 | is modified to do this. For convenience, we define a common function |
808 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` simply tallies | 808 | called ``inc_counts()`` that each handler calls; ``inc_counts()`` just tallies |
809 | a count for each event using the ``counts`` hash, which is a specialized | 809 | a count for each event using the ``counts`` hash, which is a specialized |
810 | hash function that does Perl-like autovivification, a capability that's | 810 | hash function that does Perl-like autovivification, a capability that's |
811 | extremely useful for kinds of multi-level aggregation commonly used in | 811 | extremely useful for kinds of multi-level aggregation commonly used in |
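Note on the hunk above: the ``counts`` hash it describes can be approximated in a few lines of plain Python. This is a minimal sketch, assuming a ``collections.defaultdict`` stands in for whatever helper the generated script actually imports. The names ``autoviv`` and the two-argument ``inc_counts(comm, event_name)`` signature are illustrative and do not come from the manual.

```python
from collections import defaultdict


def autoviv():
    """Return a dict whose missing keys spring into existence as nested dicts."""
    return defaultdict(autoviv)


# Hypothetical two-level tally: process name -> event name -> count.
counts = autoviv()


def inc_counts(comm, event_name):
    """Tally one occurrence of event_name for the process named comm."""
    try:
        # Works once the leaf already holds an integer.
        counts[comm][event_name] += 1
    except TypeError:
        # First hit: the leaf autovivified to an empty dict, which cannot
        # be added to an int, so start the count at 1.
        counts[comm][event_name] = 1


# Illustrative events only; a real script would call inc_counts() from
# each generated event handler.
for comm, event in [
    ("wget", "syscalls__sys_enter_read"),
    ("wget", "syscalls__sys_exit_read"),
    ("wget", "syscalls__sys_enter_read"),
    ("Xorg", "syscalls__sys_enter_read"),
]:
    inc_counts(comm, event)
```

The point of autovivification is that no handler has to check whether ``counts[comm]`` exists before indexing into it, which keeps multi-level aggregation short.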
@@ -865,7 +865,7 @@ System-Wide Tracing and Profiling | |||
865 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 865 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
866 | 866 | ||
867 | The examples so far have focused on tracing a particular program or | 867 | The examples so far have focused on tracing a particular program or |
868 | workload --- in other words, every profiling run has specified the program | 868 | workload --- that is, every profiling run has specified the program |
869 | to profile in the command-line e.g. ``perf record wget ...``. | 869 | to profile in the command-line e.g. ``perf record wget ...``. |
870 | 870 | ||
871 | It's also possible, and more interesting in many cases, to run a | 871 | It's also possible, and more interesting in many cases, to run a |
@@ -894,13 +894,13 @@ other processes running on the system as well: | |||
894 | .. image:: figures/perf-systemwide.png | 894 | .. image:: figures/perf-systemwide.png |
895 | :align: center | 895 | :align: center |
896 | 896 | ||
897 | In the snapshot above, we can see callchains that originate in libc, and | 897 | In the snapshot above, we can see call chains that originate in ``libc``, and |
898 | a callchain from Xorg that demonstrates that we're using a proprietary X | 898 | a call chain from ``Xorg`` that demonstrates that we're using a proprietary X |
899 | driver in userspace (notice the presence of ``PVR`` and some other | 899 | driver in user space (notice the presence of ``PVR`` and some other |
900 | unresolvable symbols in the expanded Xorg callchain). | 900 | unresolvable symbols in the expanded ``Xorg`` call chain). |
901 | 901 | ||
902 | Note also that we have both kernel and userspace entries in the above | 902 | Note also that we have both kernel and user space entries in the above |
903 | snapshot. We can also tell perf to focus on userspace but providing a | 903 | snapshot. We can also tell perf to focus on user space but providing a |
904 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we | 904 | modifier, in this case ``u``, to the ``cycles`` hardware counter when we |
905 | record a profile:: | 905 | record a profile:: |
906 | 906 | ||
@@ -911,7 +911,7 @@ record a profile:: | |||
911 | .. image:: figures/perf-report-cycles-u.png | 911 | .. image:: figures/perf-report-cycles-u.png |
912 | :align: center | 912 | :align: center |
913 | 913 | ||
914 | Notice in the screenshot above, we see only userspace entries (``[.]``) | 914 | Notice in the screenshot above, we see only user space entries (``[.]``) |
915 | 915 | ||
916 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into | 916 | Finally, we can press ``Enter`` on a leaf node and select the ``Zoom into |
917 | DSO`` menu item to show only entries associated with a specific DSO. In | 917 | DSO`` menu item to show only entries associated with a specific DSO. In |
@@ -946,7 +946,7 @@ We can look at the raw output using ``perf script`` with no arguments:: | |||
946 | Filtering | 946 | Filtering |
947 | ^^^^^^^^^ | 947 | ^^^^^^^^^ |
948 | 948 | ||
949 | Notice that there are a lot of events that don't really have anything to | 949 | Notice that there are many events that don't really have anything to |
950 | do with what we're interested in, namely events that schedule ``perf`` | 950 | do with what we're interested in, namely events that schedule ``perf`` |
951 | itself in and out or that wake perf up. We can get rid of those by using | 951 | itself in and out or that wake perf up. We can get rid of those by using |
952 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a | 952 | the ``--filter`` option --- for each event we specify using ``-e``, we can add a |
@@ -985,7 +985,7 @@ purpose of demonstrating how to use filters, it's close enough. | |||
985 | .. admonition:: Tying it Together | 985 | .. admonition:: Tying it Together |
986 | 986 | ||
987 | These are exactly the same set of event filters defined by the trace | 987 | These are exactly the same set of event filters defined by the trace |
988 | event subsystem. See the ftrace / tracecmd / kernelshark section for more | 988 | event subsystem. See the ftrace / trace-cmd / KernelShark section for more |
989 | discussion about these event filters. | 989 | discussion about these event filters. |
990 | 990 | ||
991 | .. admonition:: Tying it Together | 991 | .. admonition:: Tying it Together |
@@ -995,14 +995,14 @@ purpose of demonstrating how to use filters, it's close enough. | |||
995 | indispensable part of the perf design as it relates to tracing. | 995 | indispensable part of the perf design as it relates to tracing. |
996 | kernel-based event filters provide a mechanism to precisely throttle | 996 | kernel-based event filters provide a mechanism to precisely throttle |
997 | the event stream that appears in user space, where it makes sense to | 997 | the event stream that appears in user space, where it makes sense to |
998 | provide bindings to real programming languages for postprocessing the | 998 | provide bindings to real programming languages for post-processing the |
999 | event stream. This architecture allows for the intelligent and | 999 | event stream. This architecture allows for the intelligent and |
1000 | flexible partitioning of processing between the kernel and user | 1000 | flexible partitioning of processing between the kernel and user |
1001 | space. Contrast this with other tools such as SystemTap, which does | 1001 | space. Contrast this with other tools such as SystemTap, which does |
1002 | all of its processing in the kernel and as such requires a special | 1002 | all of its processing in the kernel and as such requires a special |
1003 | project-defined language in order to accommodate that design, or | 1003 | project-defined language in order to accommodate that design, or |
1004 | LTTng, where everything is sent to userspace and as such requires a | 1004 | LTTng, where everything is sent to user space and as such requires a |
1005 | super-efficient kernel-to-userspace transport mechanism in order to | 1005 | super-efficient kernel-to-user space transport mechanism in order to |
1006 | function properly. While perf certainly can benefit from for instance | 1006 | function properly. While perf certainly can benefit from for instance |
1007 | advances in the design of the transport, it doesn't fundamentally | 1007 | advances in the design of the transport, it doesn't fundamentally |
1008 | depend on them. Basically, if you find that your perf tracing | 1008 | depend on them. Basically, if you find that your perf tracing |
@@ -1014,7 +1014,7 @@ Using Dynamic Tracepoints | |||
1014 | 1014 | ||
1015 | perf isn't restricted to the fixed set of static tracepoints listed by | 1015 | perf isn't restricted to the fixed set of static tracepoints listed by |
1016 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere | 1016 | ``perf list``. Users can also add their own "dynamic" tracepoints anywhere |
1017 | in the kernel. For instance, suppose we want to define our own | 1017 | in the kernel. For example, suppose we want to define our own |
1018 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf | 1018 | tracepoint on ``do_fork()``. We can do that using the ``perf probe`` perf |
1019 | subcommand:: | 1019 | subcommand:: |
1020 | 1020 | ||
@@ -1069,7 +1069,7 @@ up after 30 seconds):: | |||
1069 | [ perf record: Woken up 1 times to write data ] | 1069 | [ perf record: Woken up 1 times to write data ] |
1070 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] | 1070 | [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ] |
1071 | 1071 | ||
1072 | Using ``perf script`` we can see each do_fork event that fired:: | 1072 | Using ``perf script`` we can see each ``do_fork`` event that fired:: |
1073 | 1073 | ||
1074 | root@crownbay:~# perf script | 1074 | root@crownbay:~# perf script |
1075 | 1075 | ||
@@ -1111,7 +1111,7 @@ Using ``perf script`` we can see each do_fork event that fired:: | |||
1111 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) | 1111 | gaku 1312 [000] 34237.202388: do_fork: (c1028460) |
1112 | 1112 | ||
1113 | And using ``perf report`` on the same file, we can see the | 1113 | And using ``perf report`` on the same file, we can see the |
1114 | callgraphs from starting a few programs during those 30 seconds: | 1114 | call graphs from starting a few programs during those 30 seconds: |
1115 | 1115 | ||
1116 | .. image:: figures/perf-probe-do_fork-profile.png | 1116 | .. image:: figures/perf-probe-do_fork-profile.png |
1117 | :align: center | 1117 | :align: center |
@@ -1125,11 +1125,11 @@ callgraphs from starting a few programs during those 30 seconds: | |||
1125 | 1125 | ||
1126 | .. admonition:: Tying it Together | 1126 | .. admonition:: Tying it Together |
1127 | 1127 | ||
1128 | Dynamic tracepoints are implemented under the covers by kprobes and | 1128 | Dynamic tracepoints are implemented under the covers by Kprobes and |
1129 | uprobes. kprobes and uprobes are also used by and in fact are the | 1129 | Uprobes. Kprobes and Uprobes are also used by and in fact are the |
1130 | main focus of SystemTap. | 1130 | main focus of SystemTap. |
1131 | 1131 | ||
1132 | Perf Documentation | 1132 | perf Documentation |
1133 | ------------------ | 1133 | ------------------ |
1134 | 1134 | ||
1135 | Online versions of the manual pages for the commands discussed in this | 1135 | Online versions of the manual pages for the commands discussed in this |
@@ -1153,7 +1153,7 @@ section can be found here: | |||
1153 | 1153 | ||
1154 | - The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__. | 1154 | - The top-level `perf(1) manual page <https://linux.die.net/man/1/perf>`__. |
1155 | 1155 | ||
1156 | Normally, you should be able to invoke the manual pages via perf itself | 1156 | Normally, you should be able to open the manual pages via perf itself |
1157 | e.g. ``perf help`` or ``perf help record``. | 1157 | e.g. ``perf help`` or ``perf help record``. |
1158 | 1158 | ||
1159 | To have the perf manual pages installed on your target, modify your | 1159 | To have the perf manual pages installed on your target, modify your |
@@ -1168,14 +1168,14 @@ of examples, can also be found in the ``perf`` directory of the kernel tree:: | |||
1168 | tools/perf/Documentation | 1168 | tools/perf/Documentation |
1169 | 1169 | ||
1170 | There's also a nice perf tutorial on the perf | 1170 | There's also a nice perf tutorial on the perf |
1171 | wiki that goes into more detail than we do here in certain areas: `Perf | 1171 | wiki that goes into more detail than we do here in certain areas: `perf |
1172 | Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ | 1172 | Tutorial <https://perf.wiki.kernel.org/index.php/Tutorial>`__ |
1173 | 1173 | ||
1174 | ftrace | 1174 | ftrace |
1175 | ====== | 1175 | ====== |
1176 | 1176 | ||
1177 | "ftrace" literally refers to the "ftrace function tracer" but in reality | 1177 | "ftrace" literally refers to the "ftrace function tracer" but in reality |
1178 | this encompasses a number of related tracers along with the | 1178 | this encompasses several related tracers along with the |
1179 | infrastructure that they all make use of. | 1179 | infrastructure that they all make use of. |
1180 | 1180 | ||
1181 | ftrace Setup | 1181 | ftrace Setup |
@@ -1184,11 +1184,11 @@ ftrace Setup | |||
1184 | For this section, we'll assume you've already performed the basic setup | 1184 | For this section, we'll assume you've already performed the basic setup |
1185 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 1185 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
1186 | 1186 | ||
1187 | ftrace, trace-cmd, and kernelshark run on the target system, and are | 1187 | ftrace, trace-cmd, and KernelShark run on the target system, and are |
1188 | ready to go out-of-the-box --- no additional setup is necessary. For the | 1188 | ready to go out-of-the-box --- no additional setup is necessary. For the |
1189 | rest of this section we assume you've ssh'ed to the host and will be | 1189 | rest of this section we assume you're connected to the host through SSH and |
1190 | running ftrace on the target. kernelshark is a GUI application and if | 1190 | will be running ftrace on the target. KernelShark is a GUI application and if |
1191 | you use the ``-X`` option to ssh you can have the kernelshark GUI run on | 1191 | you use the ``-X`` option to ``ssh`` you can have the KernelShark GUI run on |
1192 | the target but display remotely on the host if you want. | 1192 | the target but display remotely on the host if you want. |
1193 | 1193 | ||
1194 | Basic ftrace usage | 1194 | Basic ftrace usage |
@@ -1196,8 +1196,8 @@ Basic ftrace usage | |||
1196 | 1196 | ||
1197 | "ftrace" essentially refers to everything included in the ``/tracing`` | 1197 | "ftrace" essentially refers to everything included in the ``/tracing`` |
1198 | directory of the mounted debugfs filesystem (Yocto follows the standard | 1198 | directory of the mounted debugfs filesystem (Yocto follows the standard |
1199 | convention and mounts it at ``/sys/kernel/debug``). Here's a listing of all | 1199 | convention and mounts it at ``/sys/kernel/debug``). All the files found in |
1200 | the files found in ``/sys/kernel/debug/tracing`` on a Yocto system:: | 1200 | ``/sys/kernel/debug/tracing`` on a Yocto system are:: |
1201 | 1201 | ||
1202 | root@sugarbay:/sys/kernel/debug/tracing# ls | 1202 | root@sugarbay:/sys/kernel/debug/tracing# ls |
1203 | README kprobe_events trace | 1203 | README kprobe_events trace |
@@ -1222,7 +1222,7 @@ the ftrace documentation. | |||
1222 | 1222 | ||
1223 | We'll start by looking at some of the available built-in tracers. | 1223 | We'll start by looking at some of the available built-in tracers. |
1224 | 1224 | ||
1225 | cat'ing the ``available_tracers`` file lists the set of available tracers:: | 1225 | The ``available_tracers`` file lists the set of available tracers:: |
1226 | 1226 | ||
1227 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers | 1227 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers |
1228 | blk function_graph function nop | 1228 | blk function_graph function nop |
@@ -1232,11 +1232,11 @@ The ``current_tracer`` file contains the tracer currently in effect:: | |||
1232 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | 1232 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer |
1233 | nop | 1233 | nop |
1234 | 1234 | ||
1235 | The above listing of current_tracer shows that the | 1235 | The above listing of ``current_tracer`` shows that the |
1236 | ``nop`` tracer is in effect, which is just another way of saying that | 1236 | ``nop`` tracer is in effect, which is just another way of saying that |
1237 | there's actually no tracer currently in effect. | 1237 | there's actually no tracer currently in effect. |
1238 | 1238 | ||
1239 | echo'ing one of the available_tracers into ``current_tracer`` makes the | 1239 | Writing one of the available tracers into ``current_tracer`` makes the |
1240 | specified tracer the current tracer:: | 1240 | specified tracer the current tracer:: |
1241 | 1241 | ||
1242 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer | 1242 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer |
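Note on the hunk above: the same read-then-write sequence on ``available_tracers`` and ``current_tracer`` can be scripted. The sketch below is illustrative only. The helper names are hypothetical, and the demonstration runs against a temporary directory that imitates the two files, so it needs neither root nor a mounted debugfs. On a target you would pass ``/sys/kernel/debug/tracing`` instead.

```python
import os
import tempfile


def available_tracers(tracing_dir):
    """Return the tracer names listed in the available_tracers file."""
    with open(os.path.join(tracing_dir, "available_tracers")) as f:
        return f.read().split()


def current_tracer(tracing_dir):
    """Return the name of the tracer currently in effect."""
    with open(os.path.join(tracing_dir, "current_tracer")) as f:
        return f.read().strip()


def set_tracer(tracing_dir, name):
    """Make name the current tracer, like `echo name > current_tracer`."""
    if name not in available_tracers(tracing_dir):
        raise ValueError("tracer not available: " + name)
    with open(os.path.join(tracing_dir, "current_tracer"), "w") as f:
        f.write(name + "\n")


# Stand-in for the tracing directory, seeded with the values shown in the
# manual's listings (assumption: a real system may list other tracers).
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "available_tracers"), "w") as f:
    f.write("blk function_graph function nop\n")
with open(os.path.join(demo_dir, "current_tracer"), "w") as f:
    f.write("nop\n")

set_tracer(demo_dir, "function")
```

Checking the name against ``available_tracers`` first is a design choice of this sketch, so that a typo fails with a clear message before anything is written.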
@@ -1292,7 +1292,7 @@ tracer:: | |||
1292 | . | 1292 | . |
1293 | 1293 | ||
1294 | Each line in the trace above shows what was happening in the kernel on a given | 1294 | Each line in the trace above shows what was happening in the kernel on a given |
1295 | cpu, to the level of detail of function calls. Each entry shows the function | 1295 | CPU, to the level of detail of function calls. Each entry shows the function |
1296 | called, followed by its caller (after the arrow). | 1296 | called, followed by its caller (after the arrow). |
1297 | 1297 | ||
1298 | The function tracer gives you an extremely detailed idea of what the | 1298 | The function tracer gives you an extremely detailed idea of what the |
@@ -1306,7 +1306,7 @@ great way to learn about how the kernel code works in a dynamic sense. | |||
1306 | 1306 | ||
1307 | It is a little more difficult to follow the call chains than it needs to | 1307 | It is a little more difficult to follow the call chains than it needs to |
1308 | be --- luckily there's a variant of the function tracer that displays the | 1308 | be --- luckily there's a variant of the function tracer that displays the |
1309 | callchains explicitly, called the ``function_graph`` tracer:: | 1309 | call chains explicitly, called the ``function_graph`` tracer:: |
1310 | 1310 | ||
1311 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer | 1311 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer |
1312 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1312 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
@@ -1425,7 +1425,7 @@ As you can see, the ``function_graph`` display is much easier | |||
1425 | to follow. Also note that in addition to the function calls and | 1425 | to follow. Also note that in addition to the function calls and |
1426 | associated braces, other events such as scheduler events are displayed | 1426 | associated braces, other events such as scheduler events are displayed |
1427 | in context. In fact, you can freely include any tracepoint available in | 1427 | in context. In fact, you can freely include any tracepoint available in |
1428 | the trace events subsystem described in the next section by simply | 1428 | the trace events subsystem described in the next section by just |
1429 | enabling those events, and they'll appear in context in the function | 1429 | enabling those events, and they'll appear in context in the function |
1430 | graph display. Quite a powerful tool for understanding kernel dynamics. | 1430 | graph display. Quite a powerful tool for understanding kernel dynamics. |
1431 | 1431 | ||
@@ -1528,7 +1528,7 @@ The ``format`` file for the | |||
1528 | tracepoint describes the event in memory, which is used by the various | 1528 | tracepoint describes the event in memory, which is used by the various |
1529 | tracing tools that now make use of these tracepoint to parse the event | 1529 | tracing tools that now make use of these tracepoint to parse the event |
1530 | and make sense of it, along with a ``print fmt`` field that allows tools | 1530 | and make sense of it, along with a ``print fmt`` field that allows tools |
1531 | like ftrace to display the event as text. Here's what the format of the | 1531 | like ftrace to display the event as text. The format of the |
1532 | ``kmalloc`` event looks like:: | 1532 | ``kmalloc`` event looks like:: |
1533 | 1533 | ||
1534 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format | 1534 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format |
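Note on the hunk above: tools use a tracepoint's ``format`` file by reading its ``field:`` lines to learn each member's type, offset, and size. The following is a rough sketch of that parsing step, not the code any particular tool uses. ``SAMPLE_FORMAT`` is an abbreviated, made-up excerpt in the general layout of such a file (its offsets and sizes are illustrative), and ``parse_format_fields`` is a hypothetical helper.

```python
# Abbreviated, illustrative excerpt; a real format file has more fields
# and the offsets and sizes depend on the kernel and architecture.
SAMPLE_FORMAT = """\
name: kmalloc
format:
\tfield:unsigned short common_type;\toffset:0;\tsize:2;\tsigned:0;
\tfield:int common_pid;\toffset:4;\tsize:4;\tsigned:1;

\tfield:size_t bytes_req;\toffset:16;\tsize:4;\tsigned:0;

print fmt: "bytes_req=%zu", REC->bytes_req
"""


def parse_format_fields(text):
    """Return one dict per `field:` line with name, type, offset, size, signed."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("field:"):
            continue  # skip name:, format:, blank, and print fmt: lines
        attrs = {}
        for part in line.split(";"):
            part = part.strip()
            if part:
                key, _, value = part.partition(":")
                attrs[key] = value
        # The declaration looks like "unsigned short common_type": the
        # last word is the member name, the rest is the C type.
        # (Array members such as "char comm[16]" keep their brackets.)
        ctype, _, name = attrs["field"].rpartition(" ")
        fields.append({
            "name": name,
            "type": ctype,
            "offset": int(attrs["offset"]),
            "size": int(attrs["size"]),
            "signed": attrs["signed"] == "1",
        })
    return fields


fields = parse_format_fields(SAMPLE_FORMAT)
```

With the offsets and sizes in hand, a consumer can slice each member out of the raw event record, which is what the text means by using the description to parse the event.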
@@ -1581,7 +1581,7 @@ events in the output buffer:: | |||
1581 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on | 1581 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on |
1582 | 1582 | ||
1583 | Now, if we look at the ``trace`` file, we see nothing | 1583 | Now, if we look at the ``trace`` file, we see nothing |
1584 | but the kmalloc events we just turned on:: | 1584 | but the ``kmalloc`` events we just turned on:: |
1585 | 1585 | ||
1586 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | 1586 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less |
1587 | # tracer: nop | 1587 | # tracer: nop |
@@ -1636,8 +1636,8 @@ using the ``enable`` file in the subsystem directory) and get an | |||
1636 | arbitrarily fine-grained idea of what's going on in the system by | 1636 | arbitrarily fine-grained idea of what's going on in the system by |
1637 | enabling as many of the appropriate tracepoints as applicable. | 1637 | enabling as many of the appropriate tracepoints as applicable. |
1638 | 1638 | ||
1639 | A number of the tools described in this HOWTO do just that, including | 1639 | Several tools described in this How-to do just that, including |
1640 | ``trace-cmd`` and kernelshark in the next section. | 1640 | ``trace-cmd`` and KernelShark in the next section. |
1641 | 1641 | ||
1642 | .. admonition:: Tying it Together | 1642 | .. admonition:: Tying it Together |
1643 | 1643 | ||
@@ -1653,7 +1653,7 @@ A number of the tools described in this HOWTO do just that, including | |||
1653 | ``/sys/kernel/debug/tracing`` will be removed and replaced with | 1653 | ``/sys/kernel/debug/tracing`` will be removed and replaced with |
1654 | equivalent tracers based on the "trace events" subsystem. | 1654 | equivalent tracers based on the "trace events" subsystem. |
1655 | 1655 | ||
1656 | trace-cmd / kernelshark | 1656 | trace-cmd / KernelShark |
1657 | ----------------------- | 1657 | ----------------------- |
1658 | 1658 | ||
1659 | trace-cmd is essentially an extensive command-line "wrapper" interface | 1659 | trace-cmd is essentially an extensive command-line "wrapper" interface |
@@ -1662,31 +1662,30 @@ that hides the details of all the individual files in | |||
1662 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to | 1662 | events within the ``/sys/kernel/debug/tracing/events/`` subdirectory and to |
1663 | collect traces and avoid having to deal with those details directly. | 1663 | collect traces and avoid having to deal with those details directly. |
1664 | 1664 | ||
1665 | As yet another layer on top of that, kernelshark provides a GUI that | 1665 | As yet another layer on top of that, KernelShark provides a GUI that |
1666 | allows users to start and stop traces and specify sets of events using | 1666 | allows users to start and stop traces and specify sets of events using |
1667 | an intuitive interface, and view the output as both trace events and as | 1667 | an intuitive interface, and view the output as both trace events and as |
1668 | a per-CPU graphical display. It directly uses trace-cmd as the | 1668 | a per-CPU graphical display. It directly uses trace-cmd as the |
1669 | plumbing that accomplishes all that underneath the covers (and actually | 1669 | plumbing that accomplishes all that underneath the covers (and actually |
1670 | displays the trace-cmd command it uses, as we'll see). | 1670 | displays the trace-cmd command it uses, as we'll see). |
1671 | 1671 | ||
1672 | To start a trace using kernelshark, first start kernelshark:: | 1672 | To start a trace using KernelShark, first start this tool:: |
1673 | 1673 | ||
1674 | root@sugarbay:~# kernelshark | 1674 | root@sugarbay:~# kernelshark |
1675 | 1675 | ||
1676 | Then bring up the ``Capture`` dialog by | 1676 | Then open up the ``Capture`` dialog by choosing from the KernelShark menu:: |
1677 | choosing from the kernelshark menu:: | ||
1678 | 1677 | ||
1679 | Capture | Record | 1678 | Capture | Record |
1680 | 1679 | ||
1681 | That will display the following dialog, which allows you to choose one or more | 1680 | That will display the following dialog, which allows you to choose one or more |
1682 | events (or even one or more complete subsystems) to trace: | 1681 | events (or even entire subsystems) to trace: |
1683 | 1682 | ||
1684 | .. image:: figures/kernelshark-choose-events.png | 1683 | .. image:: figures/kernelshark-choose-events.png |
1685 | :align: center | 1684 | :align: center |
1686 | 1685 | ||
1687 | Note that these are exactly the same sets of events described in the | 1686 | Note that these are exactly the same sets of events described in the |
1688 | previous trace events subsystem section, and in fact is where trace-cmd | 1687 | previous trace events subsystem section, and in fact is where trace-cmd |
1689 | gets them for kernelshark. | 1688 | gets them for KernelShark. |
1690 | 1689 | ||
1691 | In the above screenshot, we've decided to explore the graphics subsystem | 1690 | In the above screenshot, we've decided to explore the graphics subsystem |
1692 | a bit and so have chosen to trace all the tracepoints contained within | 1691 | a bit and so have chosen to trace all the tracepoints contained within |
@@ -1699,12 +1698,12 @@ will turn into the 'Stop' button after the trace has started): | |||
1699 | .. image:: figures/kernelshark-output-display.png | 1698 | .. image:: figures/kernelshark-output-display.png |
1700 | :align: center | 1699 | :align: center |
1701 | 1700 | ||
1702 | Notice that the right-hand pane shows the exact trace-cmd command-line | 1701 | Notice that the right pane shows the exact trace-cmd command-line |
1703 | that's used to run the trace, along with the results of the trace-cmd | 1702 | that's used to run the trace, along with the results of the trace-cmd |
1704 | run. | 1703 | run. |
1705 | 1704 | ||
1706 | Once the ``Stop`` button is pressed, the graphical view magically fills up | 1705 | Once the ``Stop`` button is pressed, the graphical view magically fills up |
1707 | with a colorful per-cpu display of the trace data, along with the | 1706 | with a colorful per-CPU display of the trace data, along with the |
1708 | detailed event listing below that: | 1707 | detailed event listing below that: |
1709 | 1708 | ||
1710 | .. image:: figures/kernelshark-i915-display.png | 1709 | .. image:: figures/kernelshark-i915-display.png |
@@ -1717,7 +1716,7 @@ events``: | |||
1717 | :align: center | 1716 | :align: center |
1718 | 1717 | ||
1719 | The tool is pretty self-explanatory, but for more detailed information | 1718 | The tool is pretty self-explanatory, but for more detailed information |
1720 | on navigating through the data, see the `kernelshark | 1719 | on navigating through the data, see the `KernelShark |
1721 | website <https://kernelshark.org/Documentation.html>`__. | 1720 | website <https://kernelshark.org/Documentation.html>`__. |
1722 | 1721 | ||
1723 | ftrace Documentation | 1722 | ftrace Documentation |
@@ -1733,41 +1732,41 @@ Documentation directory:: | |||
1733 | 1732 | ||
1734 | Documentation/trace/events.txt | 1733 | Documentation/trace/events.txt |
1735 | 1734 | ||
1736 | There is a nice series of articles on using ftrace and trace-cmd at LWN: | 1735 | A nice series of articles on using ftrace and trace-cmd are available at LWN: |
1737 | 1736 | ||
1738 | - `Debugging the kernel using Ftrace - part | 1737 | - `Debugging the kernel using ftrace - part |
1739 | 1 <https://lwn.net/Articles/365835/>`__ | 1738 | 1 <https://lwn.net/Articles/365835/>`__ |
1740 | 1739 | ||
1741 | - `Debugging the kernel using Ftrace - part | 1740 | - `Debugging the kernel using ftrace - part |
1742 | 2 <https://lwn.net/Articles/366796/>`__ | 1741 | 2 <https://lwn.net/Articles/366796/>`__ |
1743 | 1742 | ||
1744 | - `Secrets of the Ftrace function | 1743 | - `Secrets of the ftrace function |
1745 | tracer <https://lwn.net/Articles/370423/>`__ | 1744 | tracer <https://lwn.net/Articles/370423/>`__ |
1746 | 1745 | ||
1747 | - `trace-cmd: A front-end for | 1746 | - `trace-cmd: A front-end for |
1748 | Ftrace <https://lwn.net/Articles/410200/>`__ | 1747 | ftrace <https://lwn.net/Articles/410200/>`__ |
1749 | 1748 | ||
1750 | See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__ | 1749 | See also `KernelShark's documentation <https://kernelshark.org/Documentation.html>`__ |
1751 | for further usage details. | 1750 | for further usage details. |
1752 | 1751 | ||
1753 | An amusing yet useful README (a tracing mini-HOWTO) can be found in | 1752 | An amusing yet useful README (a tracing mini-How-to) can be found in |
1754 | ``/sys/kernel/debug/tracing/README``. | 1753 | ``/sys/kernel/debug/tracing/README``. |
1755 | 1754 | ||
1756 | systemtap | 1755 | SystemTap |
1757 | ========= | 1756 | ========= |
1758 | 1757 | ||
1759 | SystemTap is a system-wide script-based tracing and profiling tool. | 1758 | SystemTap is a system-wide script-based tracing and profiling tool. |
1760 | 1759 | ||
1761 | SystemTap scripts are C-like programs that are executed in the kernel to | 1760 | SystemTap scripts are C-like programs that are executed in the kernel to |
1762 | gather / print / aggregate data extracted from the context they end up being | 1761 | gather / print / aggregate data extracted from the context they end up being |
1763 | invoked under. | 1762 | called under. |
1764 | 1763 | ||
1765 | For example, this probe from the `SystemTap | 1764 | For example, this probe from the `SystemTap |
1766 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ simply prints a | 1765 | tutorial <https://sourceware.org/systemtap/tutorial/>`__ just prints a |
1767 | line every time any process on the system runs ``open()`` on a file. For each line, | 1766 | line every time any process on the system runs ``open()`` on a file. For each line, |
1768 | it prints the executable name of the program that opened the file, along | 1767 | it prints the executable name of the program that opened the file, along |
1769 | with its PID, and the name of the file it opened (or tried to open), | 1768 | with its PID, and the name of the file it opened (or tried to open), which it |
1770 | which it extracts from the open syscall's argstr. | 1769 | extracts from the argument string (``argstr``) of the ``open`` system call. |
1771 | 1770 | ||
1772 | .. code-block:: none | 1771 | .. code-block:: none |
1773 | 1772 | ||
@@ -1782,13 +1781,13 @@ which it extracts from the open syscall's argstr. | |||
1782 | } | 1781 | } |
1783 | 1782 | ||
1784 | Normally, to execute this | 1783 | Normally, to execute this |
1785 | probe, you'd simply install systemtap on the system you want to probe, | 1784 | probe, you'd just install SystemTap on the system you want to probe, |
1786 | and directly run the probe on that system e.g. assuming the name of the | 1785 | and directly run the probe on that system e.g. assuming the name of the |
1787 | file containing the above text is trace_open.stp:: | 1786 | file containing the above text is ``trace_open.stp``:: |
1788 | 1787 | ||
1789 | # stap trace_open.stp | 1788 | # stap trace_open.stp |
1790 | 1789 | ||
1791 | What systemtap does under the covers to run this probe is 1) parse and | 1790 | What SystemTap does under the covers to run this probe is 1) parse and |
1792 | convert the probe to an equivalent "C" form, 2) compile the "C" form | 1791 | convert the probe to an equivalent "C" form, 2) compile the "C" form |
1793 | into a kernel module, 3) insert the module into the kernel, which arms | 1792 | into a kernel module, 3) insert the module into the kernel, which arms |
1794 | it, and 4) collect the data generated by the probe and display it to the | 1793 | it, and 4) collect the data generated by the probe and display it to the |
@@ -1801,25 +1800,25 @@ kernel build system unfortunately isn't typically part of the image | |||
1801 | running on the target. It is normally available on the "host" system | 1800 | running on the target. It is normally available on the "host" system |
1802 | that produced the target image however; in such cases, steps 1 and 2 are | 1801 | that produced the target image however; in such cases, steps 1 and 2 are |
1803 | executed on the host system, and steps 3 and 4 are executed on the | 1802 | executed on the host system, and steps 3 and 4 are executed on the |
1804 | target system, using only the systemtap "runtime". | 1803 | target system, using only the SystemTap "runtime". |
1805 | 1804 | ||
1806 | The systemtap support in Yocto assumes that only steps 3 and 4 are run | 1805 | The SystemTap support in Yocto assumes that only steps 3 and 4 are run |
1807 | on the target; it is possible to do everything on the target, but this | 1806 | on the target; it is possible to do everything on the target, but this |
1808 | section assumes only the typical embedded use-case. | 1807 | section assumes only the typical embedded use-case. |
1809 | 1808 | ||
1810 | So basically what you need to do in order to run a systemtap script on | 1809 | Therefore, what you need to do in order to run a SystemTap script on |
1811 | the target is to 1) on the host system, compile the probe into a kernel | 1810 | the target is to 1) on the host system, compile the probe into a kernel |
1812 | module that makes sense to the target, 2) copy the module onto the | 1811 | module that makes sense to the target, 2) copy the module onto the |
1813 | target system and 3) insert the module into the target kernel, which | 1812 | target system and 3) insert the module into the target kernel, which |
1814 | arms it, and 4) collect the data generated by the probe and display it | 1813 | arms it, and 4) collect the data generated by the probe and display it |
1815 | to the user. | 1814 | to the user. |
1816 | 1815 | ||
1817 | systemtap Setup | 1816 | SystemTap Setup |
1818 | --------------- | 1817 | --------------- |
1819 | 1818 | ||
1820 | Those are a lot of steps and a lot of details, but fortunately Yocto | 1819 | Those are many steps and details, but fortunately Yocto |
1821 | includes a script called ``crosstap`` that will take care of those | 1820 | includes a script called ``crosstap`` that will take care of those |
1822 | details, allowing you to simply execute a systemtap script on the remote | 1821 | details, allowing you to just execute a SystemTap script on the remote |
1823 | target, with arguments if necessary. | 1822 | target, with arguments if necessary. |
1824 | 1823 | ||
1825 | In order to do this from a remote host, however, you need to have access | 1824 | In order to do this from a remote host, however, you need to have access |
@@ -1832,7 +1831,7 @@ having done a build:: | |||
1832 | Error: No target kernel build found. | 1831 | Error: No target kernel build found. |
1833 | Did you forget to create a local build of your image? | 1832 | Did you forget to create a local build of your image? |
1834 | 1833 | ||
1835 | 'crosstap' requires a local sdk build of the target system | 1834 | 'crosstap' requires a local SDK build of the target system |
1836 | (or a build that includes 'tools-profile') in order to build | 1835 | (or a build that includes 'tools-profile') in order to build |
1837 | kernel modules that can probe the target system. | 1836 | kernel modules that can probe the target system. |
1838 | 1837 | ||
@@ -1848,11 +1847,11 @@ Practically speaking, that means you need to do the following: | |||
1848 | the BSP README and/or the widely available basic documentation | 1847 | the BSP README and/or the widely available basic documentation |
1849 | that discusses how to build images). | 1848 | that discusses how to build images). |
1850 | 1849 | ||
1851 | - Build an -sdk version of the image e.g.:: | 1850 | - Build an ``-sdk`` version of the image e.g.:: |
1852 | 1851 | ||
1853 | $ bitbake core-image-sato-sdk | 1852 | $ bitbake core-image-sato-sdk |
1854 | 1853 | ||
1855 | - Or build a non-sdk image but include the profiling tools | 1854 | - Or build a non-SDK image but include the profiling tools |
1856 | (edit ``local.conf`` and add ``tools-profile`` to the end of | 1855 | (edit ``local.conf`` and add ``tools-profile`` to the end of |
1857 | :term:`EXTRA_IMAGE_FEATURES` variable):: | 1856 | :term:`EXTRA_IMAGE_FEATURES` variable):: |
1858 | 1857 | ||
@@ -1868,15 +1867,14 @@ Practically speaking, that means you need to do the following: | |||
1868 | 1867 | ||
1869 | .. note:: | 1868 | .. note:: |
1870 | 1869 | ||
1871 | SystemTap, which uses ``crosstap``, assumes you can establish an ssh | 1870 | SystemTap, which uses ``crosstap``, assumes you can establish an SSH |
1872 | connection to the remote target. Please refer to the crosstap wiki | 1871 | connection to the remote target. Please refer to the crosstap wiki |
1873 | page for details on verifying ssh connections. Also, the ability to ssh | 1872 | page for details on verifying SSH connections. Also, the ability to SSH |
1874 | into the target system is not enabled by default in ``*-minimal`` images. | 1873 | into the target system is not enabled by default in ``*-minimal`` images. |
1875 | 1874 | ||
1876 | So essentially what you need to | 1875 | Therefore, what you need to do is build an SDK image or image with |
1877 | do is build an SDK image or image with ``tools-profile`` as detailed in | 1876 | ``tools-profile`` as detailed in the ":ref:`profile-manual/intro:General Setup`" |
1878 | the ":ref:`profile-manual/intro:General Setup`" section of this | 1877 | section of this manual, and boot the resulting target image. |
1879 | manual, and boot the resulting target image. | ||
1880 | 1878 | ||
1881 | .. note:: | 1879 | .. note:: |
1882 | 1880 | ||
@@ -1884,12 +1882,12 @@ manual, and boot the resulting target image. | |||
1884 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and | 1882 | to have the :term:`MACHINE` you're connecting to selected in ``local.conf``, and |
1885 | the kernel in that machine's :term:`Build Directory` must match the kernel on | 1883 | the kernel in that machine's :term:`Build Directory` must match the kernel on |
1886 | the booted system exactly, or you'll get the above ``crosstap`` message | 1884 | the booted system exactly, or you'll get the above ``crosstap`` message |
1887 | when you try to invoke a script. | 1885 | when you try to call a script. |
1888 | 1886 | ||
1889 | Running a Script on a Target | 1887 | Running a Script on a Target |
1890 | ---------------------------- | 1888 | ---------------------------- |
1891 | 1889 | ||
1892 | Once you've done that, you should be able to run a systemtap script on | 1890 | Once you've done that, you should be able to run a SystemTap script on |
1893 | the target:: | 1891 | the target:: |
1894 | 1892 | ||
1895 | $ cd /path/to/yocto | 1893 | $ cd /path/to/yocto |
@@ -1918,7 +1916,7 @@ If you get an error connecting to the target e.g.:: | |||
1918 | $ crosstap root@192.168.7.2 trace_open.stp | 1916 | $ crosstap root@192.168.7.2 trace_open.stp |
1919 | error establishing ssh connection on remote 'root@192.168.7.2' | 1917 | error establishing ssh connection on remote 'root@192.168.7.2' |
1920 | 1918 | ||
1921 | Try ssh'ing to the target and see what happens:: | 1919 | Try connecting to the target through SSH and see what happens:: |
1922 | 1920 | ||
1923 | $ ssh root@192.168.7.2 | 1921 | $ ssh root@192.168.7.2 |
1924 | 1922 | ||
@@ -1936,7 +1934,7 @@ no password): | |||
1936 | matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) | 1934 | matchbox-termin(1036) open ("/tmp/vte3FS2LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) |
1937 | matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) | 1935 | matchbox-termin(1036) open ("/tmp/vteJMC7LW", O_RDWR|O_CREAT|O_EXCL|O_LARGEFILE, 0600) |
1938 | 1936 | ||
1939 | systemtap Documentation | 1937 | SystemTap Documentation |
1940 | ----------------------- | 1938 | ----------------------- |
1941 | 1939 | ||
1942 | The SystemTap language reference can be found here: `SystemTap Language | 1940 | The SystemTap language reference can be found here: `SystemTap Language |
@@ -1949,7 +1947,7 @@ page <https://sourceware.org/systemtap/documentation.html>`__ | |||
1949 | Sysprof | 1947 | Sysprof |
1950 | ======= | 1948 | ======= |
1951 | 1949 | ||
1952 | Sysprof is a very easy to use system-wide profiler that consists of a | 1950 | Sysprof is an easy to use system-wide profiler that consists of a |
1953 | single window with three panes and a few buttons which allow you to | 1951 | single window with three panes and a few buttons which allow you to |
1954 | start, stop, and view the profile from one place. | 1952 | start, stop, and view the profile from one place. |
1955 | 1953 | ||
@@ -1959,16 +1957,16 @@ Sysprof Setup | |||
1959 | For this section, we'll assume you've already performed the basic setup | 1957 | For this section, we'll assume you've already performed the basic setup |
1960 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 1958 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
1961 | 1959 | ||
1962 | Sysprof is a GUI-based application that runs on the target system. For | 1960 | Sysprof is a GUI-based application that runs on the target system. For the rest |
1963 | the rest of this document we assume you've ssh'ed to the host and will | 1961 | of this document we assume you're connected to the host through SSH and will be |
1964 | be running Sysprof on the target (you can use the ``-X`` option to ssh and | 1962 | running Sysprof on the target (you can use the ``-X`` option to ``ssh`` and |
1965 | have the Sysprof GUI run on the target but display remotely on the host | 1963 | have the Sysprof GUI run on the target but display remotely on the host |
1966 | if you want). | 1964 | if you want). |
1967 | 1965 | ||
1968 | Basic Sysprof Usage | 1966 | Basic Sysprof Usage |
1969 | ------------------- | 1967 | ------------------- |
1970 | 1968 | ||
1971 | To start profiling the system, you simply press the ``Start`` button. To | 1969 | To start profiling the system, you just press the ``Start`` button. To |
1972 | stop profiling and to start viewing the profile data in one easy step, | 1970 | stop profiling and to start viewing the profile data in one easy step, |
1973 | press the ``Profile`` button. | 1971 | press the ``Profile`` button. |
1974 | 1972 | ||
@@ -1981,11 +1979,11 @@ with profiling data: | |||
1981 | The left pane shows a list of functions and processes. Selecting one of | 1979 | The left pane shows a list of functions and processes. Selecting one of |
1982 | those expands that function in the right pane, showing all its callees. | 1980 | those expands that function in the right pane, showing all its callees. |
1983 | Note that this caller-oriented display is essentially the inverse of | 1981 | Note that this caller-oriented display is essentially the inverse of |
1984 | perf's default callee-oriented callchain display. | 1982 | perf's default callee-oriented call chain display. |
1985 | 1983 | ||
1986 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and | 1984 | In the screenshot above, we're focusing on ``__copy_to_user_ll()`` and |
1987 | looking up the callchain we can see that one of the callers of | 1985 | looking up the call chain we can see that one of the callers of |
1988 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete callpath between them. | 1986 | ``__copy_to_user_ll`` is ``sys_read()`` and the complete call path between them. |
1989 | Notice that this is essentially a portion of the same information we saw | 1987 | Notice that this is essentially a portion of the same information we saw |
1990 | in the perf display shown in the perf section of this page. | 1988 | in the perf display shown in the perf section of this page. |
1991 | 1989 | ||
@@ -1993,7 +1991,7 @@ in the perf display shown in the perf section of this page. | |||
1993 | :align: center | 1991 | :align: center |
1994 | 1992 | ||
1995 | Similarly, the above is a snapshot of the Sysprof display of a | 1993 | Similarly, the above is a snapshot of the Sysprof display of a |
1996 | ``copy-from-user`` callchain. | 1994 | ``copy-from-user`` call chain. |
1997 | 1995 | ||
1998 | Finally, looking at the third Sysprof pane in the lower left, we can see | 1996 | Finally, looking at the third Sysprof pane in the lower left, we can see |
1999 | a list of all the callers of a particular function selected in the top | 1997 | a list of all the callers of a particular function selected in the top |
@@ -2008,18 +2006,17 @@ to the selected function, and so on. | |||
2008 | 2006 | ||
2009 | .. admonition:: Tying it Together | 2007 | .. admonition:: Tying it Together |
2010 | 2008 | ||
2011 | If you like sysprof's ``caller-oriented`` display, you may be able to | 2009 | If you like Sysprof's ``caller-oriented`` display, you may be able to |
2012 | approximate it in other tools as well. For example, ``perf report`` has | 2010 | approximate it in other tools as well. For example, ``perf report`` has |
2013 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the | 2011 | the ``-g`` (``--call-graph``) option that you can experiment with; one of the |
2014 | options is ``caller`` for an inverted caller-based callgraph display. | 2012 | options is ``caller`` for an inverted caller-based call graph display. |
2015 | 2013 | ||
2016 | Sysprof Documentation | 2014 | Sysprof Documentation |
2017 | --------------------- | 2015 | --------------------- |
2018 | 2016 | ||
2019 | There doesn't seem to be any documentation for Sysprof, but maybe that's | 2017 | There doesn't seem to be any documentation for Sysprof, but maybe that's |
2020 | because it's pretty self-explanatory. The Sysprof website, however, is | 2018 | because it's pretty self-explanatory. The Sysprof website, however, is here: |
2021 | here: `Sysprof, System-wide Performance Profiler for | 2019 | `Sysprof, System-wide Performance Profiler for Linux <http://sysprof.com/>`__ |
2022 | Linux <http://sysprof.com/>`__ | ||
2023 | 2020 | ||
2024 | LTTng (Linux Trace Toolkit, next generation) | 2021 | LTTng (Linux Trace Toolkit, next generation) |
2025 | ============================================ | 2022 | ============================================ |
@@ -2029,7 +2026,7 @@ LTTng Setup | |||
2029 | 2026 | ||
2030 | For this section, we'll assume you've already performed the basic setup | 2027 | For this section, we'll assume you've already performed the basic setup |
2031 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. | 2028 | outlined in the ":ref:`profile-manual/intro:General Setup`" section. |
2032 | LTTng is run on the target system by ssh'ing to it. | 2029 | LTTng is run on the target system by connecting to it through SSH. |
2033 | 2030 | ||
2034 | Collecting and Viewing Traces | 2031 | Collecting and Viewing Traces |
2035 | ----------------------------- | 2032 | ----------------------------- |
@@ -2042,7 +2039,7 @@ tracing. | |||
2042 | Collecting and viewing a trace on the target (inside a shell) | 2039 | Collecting and viewing a trace on the target (inside a shell) |
2043 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2040 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2044 | 2041 | ||
2045 | First, from the host, ssh to the target:: | 2042 | First, from the host, connect to the target through SSH:: |
2046 | 2043 | ||
2047 | $ ssh -l root 192.168.1.47 | 2044 | $ ssh -l root 192.168.1.47 |
2048 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. | 2045 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. |
@@ -2134,11 +2131,11 @@ supplying your own name to ``lttng create``):: | |||
2134 | drwxr-xr-x 5 root root 1024 Oct 15 23:57 .. | 2131 | drwxr-xr-x 5 root root 1024 Oct 15 23:57 .. |
2135 | drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120 | 2132 | drwxrwx--- 3 root root 1024 Oct 15 23:21 auto-20121015-232120 |
2136 | 2133 | ||
2137 | Collecting and viewing a userspace trace on the target (inside a shell) | 2134 | Collecting and viewing a user space trace on the target (inside a shell) |
2138 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ | 2135 | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
2139 | 2136 | ||
2140 | For LTTng userspace tracing, you need to have a properly instrumented | 2137 | For LTTng user space tracing, you need to have a properly instrumented |
2141 | userspace program. For this example, we'll use the ``hello`` test program | 2138 | user space program. For this example, we'll use the ``hello`` test program |
2142 | generated by the ``lttng-ust`` build. | 2139 | generated by the ``lttng-ust`` build. |
2143 | 2140 | ||
2144 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` | 2141 | The ``hello`` test program isn't installed on the root filesystem by the ``lttng-ust`` |
@@ -2154,7 +2151,7 @@ Copy that over to the target machine:: | |||
2154 | You now have the instrumented LTTng "hello world" test program on the | 2151 | You now have the instrumented LTTng "hello world" test program on the |
2155 | target, ready to test. | 2152 | target, ready to test. |
2156 | 2153 | ||
2157 | First, from the host, ssh to the target:: | 2154 | First, from the host, connect to the target through SSH:: |
2158 | 2155 | ||
2159 | $ ssh -l root 192.168.1.47 | 2156 | $ ssh -l root 192.168.1.47 |
2160 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. | 2157 | The authenticity of host '192.168.1.47 (192.168.1.47)' can't be established. |
@@ -2169,7 +2166,7 @@ Once on the target, use these steps to create a trace:: | |||
2169 | Session auto-20190303-021943 created. | 2166 | Session auto-20190303-021943 created. |
2170 | Traces will be written in /home/root/lttng-traces/auto-20190303-021943 | 2167 | Traces will be written in /home/root/lttng-traces/auto-20190303-021943 |
2171 | 2168 | ||
2172 | Enable the events you want to trace (in this case all userspace events):: | 2169 | Enable the events you want to trace (in this case all user space events):: |
2173 | 2170 | ||
2174 | root@crownbay:~# lttng enable-event --userspace --all | 2171 | root@crownbay:~# lttng enable-event --userspace --all |
2175 | All UST events are enabled in channel channel0 | 2172 | All UST events are enabled in channel channel0 |
@@ -2241,13 +2238,13 @@ the entire blktrace and blkparse pipeline on the target, or you can run | |||
2241 | blktrace in 'listen' mode on the target and have blktrace and blkparse | 2238 | blktrace in 'listen' mode on the target and have blktrace and blkparse |
2242 | collect and analyze the data on the host (see the | 2239 | collect and analyze the data on the host (see the |
2243 | ":ref:`profile-manual/usage:Using blktrace Remotely`" section | 2240 | ":ref:`profile-manual/usage:Using blktrace Remotely`" section |
2244 | below). For the rest of this section we assume you've ssh'ed to the host and | 2241 | below). For the rest of this section we assume you've connected to the host through SSH |
2245 | will be running blkrace on the target. | 2242 | and will be running blktrace on the target. |
2246 | 2243 | ||
2247 | Basic blktrace Usage | 2244 | Basic blktrace Usage |
2248 | -------------------- | 2245 | -------------------- |
2249 | 2246 | ||
2250 | To record a trace, simply run the ``blktrace`` command, giving it the name | 2247 | To record a trace, just run the ``blktrace`` command, giving it the name |
2251 | of the block device you want to trace activity on:: | 2248 | of the block device you want to trace activity on:: |
2252 | 2249 | ||
2253 | root@crownbay:~# blktrace /dev/sdc | 2250 | root@crownbay:~# blktrace /dev/sdc |
@@ -2258,10 +2255,10 @@ In another shell, execute a workload you want to trace:: | |||
2258 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) | 2255 | Connecting to downloads.yoctoproject.org (140.211.169.59:80) |
2259 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA | 2256 | linux-2.6.19.2.tar.b 100% \|*******************************\| 41727k 0:00:00 ETA |
2260 | 2257 | ||
2261 | Press Ctrl-C in the blktrace shell to stop the trace. It | 2258 | Press ``Ctrl-C`` in the blktrace shell to stop the trace. It |
2262 | will display how many events were logged, along with the per-cpu file | 2259 | will display how many events were logged, along with the per-cpu file |
2263 | sizes (blktrace records traces in per-cpu kernel buffers and simply | 2260 | sizes (blktrace records traces in per-cpu kernel buffers and just |
2264 | dumps them to userspace for blkparse to merge and sort later):: | 2261 | dumps them to user space for blkparse to merge and sort later):: |
2265 | 2262 | ||
2266 | ^C=== sdc === | 2263 | ^C=== sdc === |
2267 | CPU 0: 7082 events, 332 KiB data | 2264 | CPU 0: 7082 events, 332 KiB data |
@@ -2277,7 +2274,7 @@ with the device name as the first part of the filename:: | |||
2277 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 | 2274 | -rw-r--r-- 1 root root 339938 Oct 27 22:40 sdc.blktrace.0 |
2278 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 | 2275 | -rw-r--r-- 1 root root 75753 Oct 27 22:40 sdc.blktrace.1 |
2279 | 2276 | ||
2280 | To view the trace events, simply invoke ``blkparse`` in the directory | 2277 | To view the trace events, just call ``blkparse`` in the directory |
2281 | containing the trace files, giving it the device name that forms the | 2278 | containing the trace files, giving it the device name that forms the |
2282 | first part of the filenames:: | 2279 | first part of the filenames:: |
2283 | 2280 | ||
@@ -2376,8 +2373,8 @@ Live Mode | |||
2376 | ~~~~~~~~~ | 2373 | ~~~~~~~~~ |
2377 | 2374 | ||
2378 | blktrace and blkparse are designed from the ground up to be able to | 2375 | blktrace and blkparse are designed from the ground up to be able to |
2379 | operate together in a "pipe mode" where the stdout of blktrace can be | 2376 | operate together in a "pipe mode" where the standard output of blktrace can be |
2380 | fed directly into the stdin of blkparse:: | 2377 | fed directly into the standard input of blkparse:: |
2381 | 2378 | ||
2382 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - | 2379 | root@crownbay:~# blktrace /dev/sdc -o - | blkparse -i - |
2383 | 2380 | ||
@@ -2446,7 +2443,7 @@ just ended:: | |||
2446 | Total: 11800 events (dropped 0), 554 KiB data | 2443 | Total: 11800 events (dropped 0), 554 KiB data |
2447 | 2444 | ||
2448 | The blktrace instance on the host will | 2445 | The blktrace instance on the host will |
2449 | save the target output inside a hostname-timestamp directory:: | 2446 | save the target output inside a ``<hostname>-<timestamp>`` directory:: |
2450 | 2447 | ||
2451 | $ ls -al | 2448 | $ ls -al |
2452 | drwxr-xr-x 10 root root 1024 Oct 28 02:40 . | 2449 | drwxr-xr-x 10 root root 1024 Oct 28 02:40 . |
@@ -2518,7 +2515,7 @@ Tracing Block I/O via 'ftrace' | |||
2518 | It's also possible to trace block I/O using only | 2515 | It's also possible to trace block I/O using only |
2519 | :ref:`profile-manual/usage:The 'trace events' Subsystem`, which | 2516 | :ref:`profile-manual/usage:The 'trace events' Subsystem`, which |
2520 | can be useful for casual tracing if you don't want to bother dealing with the | 2517 | can be useful for casual tracing if you don't want to bother dealing with the |
2521 | userspace tools. | 2518 | user space tools. |
2522 | 2519 | ||
2523 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, | 2520 | To enable tracing for a given device, use ``/sys/block/xxx/trace/enable``, |
2524 | where ``xxx`` is the device name. This for example enables tracing for | 2521 | where ``xxx`` is the device name. This for example enables tracing for |
@@ -2576,7 +2573,7 @@ section can be found here: | |||
2576 | - https://linux.die.net/man/8/btrace | 2573 | - https://linux.die.net/man/8/btrace |
2577 | 2574 | ||
2578 | The above manual pages, along with manuals for the other blktrace utilities | 2575 | The above manual pages, along with manuals for the other blktrace utilities |
2579 | (btt, blkiomon, etc) can be found in the ``/doc`` directory of the blktrace | 2576 | (``btt``, ``blkiomon``, etc) can be found in the ``/doc`` directory of the blktrace |
2580 | tools git repo:: | 2577 | tools git repository:: |
2581 | 2578 | ||
2582 | $ git clone git://git.kernel.dk/blktrace.git | 2579 | $ git clone git://git.kernel.dk/blktrace.git |
diff --git a/documentation/styles/config/vocabularies/OpenSource/accept.txt b/documentation/styles/config/vocabularies/OpenSource/accept.txt
index 98e76ae1f5..e378fbf79b 100644
--- a/documentation/styles/config/vocabularies/OpenSource/accept.txt
+++ b/documentation/styles/config/vocabularies/OpenSource/accept.txt
@@ -1,4 +1,20 @@ | |||
1 | autovivification | ||
2 | blkparse | ||
3 | blktrace | ||
4 | callee | ||
5 | debugfs | ||
1 | ftrace | 6 | ftrace |
2 | toolchain | 7 | KernelShark |
3 | systemd | 8 | Kprobe |
4 | LTTng | 9 | LTTng |
10 | perf | ||
11 | profiler | ||
12 | subcommand | ||
13 | subnode | ||
14 | superset | ||
15 | Sysprof | ||
16 | systemd | ||
17 | toolchain | ||
18 | tracepoint | ||
19 | Uprobe | ||
20 | wget | ||
diff --git a/documentation/styles/config/vocabularies/Yocto/accept.txt b/documentation/styles/config/vocabularies/Yocto/accept.txt
index b725414014..ca622ba412 100644
--- a/documentation/styles/config/vocabularies/Yocto/accept.txt
+++ b/documentation/styles/config/vocabularies/Yocto/accept.txt
@@ -1,4 +1,5 @@ | |||
1 | Yocto | ||
2 | BSP | ||
3 | BitBake | 1 | BitBake |
2 | BSP | ||
3 | crosstap | ||
4 | OpenEmbedded | 4 | OpenEmbedded |
5 | Yocto | ||