summaryrefslogtreecommitdiffstats
path: root/documentation/profile-manual
diff options
context:
space:
mode:
Diffstat (limited to 'documentation/profile-manual')
-rw-r--r--documentation/profile-manual/profile-manual-usage.xml353
1 files changed, 353 insertions, 0 deletions
diff --git a/documentation/profile-manual/profile-manual-usage.xml b/documentation/profile-manual/profile-manual-usage.xml
index 71feb418fd..cfe916a795 100644
--- a/documentation/profile-manual/profile-manual-usage.xml
+++ b/documentation/profile-manual/profile-manual-usage.xml
@@ -981,10 +981,363 @@
981 </para> 981 </para>
982 </section> 982 </section>
983 983
984 <section id='system-wide-tracing-and-profiling'>
985 <title>System-Wide Tracing and Profiling</title>
984 986
987 <para>
988 The examples so far have focused on tracing a particular program or
989 workload - in other words, every profiling run has specified the
990 program to profile in the command-line e.g. 'perf record wget ...'.
991 </para>
992
993 <para>
994 It's also possible, and more interesting in many cases, to run a
995 system-wide profile or trace while running the workload in a
996 separate shell.
997 </para>
998
999 <para>
1000 To do system-wide profiling or tracing, you typically use
1001 the -a flag to 'perf record'.
1002 </para>
1003
1004 <para>
1005 To demonstrate this, open up one window and start the profile
1006 using the -a flag (press Ctrl-C to stop tracing):
1007 <literallayout class='monospaced'>
1008 root@crownbay:~# perf record -g -a
1009 ^C[ perf record: Woken up 6 times to write data ]
1010 [ perf record: Captured and wrote 1.400 MB perf.data (~61172 samples) ]
1011 </literallayout>
1012 In another window, run the wget test:
1013 <literallayout class='monospaced'>
1014 root@crownbay:~# wget <ulink url='http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2'>http://downloads.yoctoproject.org/mirror/sources/linux-2.6.19.2.tar.bz2</ulink>
1015 Connecting to downloads.yoctoproject.org (140.211.169.59:80)
1016 linux-2.6.19.2.tar.b 100% |*******************************| 41727k 0:00:00 ETA
1017 </literallayout>
1018 Here we see entries not only for our wget load, but for other
1019 processes running on the system as well:
1020 </para>
1021
1022 <para>
1023 <imagedata fileref="figures/perf-systemwide.png" width="6in" depth="7in" align="center" scalefit="1" />
1024 </para>
1025
1026 <para>
1027 In the snapshot above, we can see callchains that originate in
1028 libc, and a callchain from Xorg that demonstrates that we're
1029 using a proprietary X driver in userspace (notice the presence
1030 of 'PVR' and some other unresolvable symbols in the expanded
1031 Xorg callchain).
1032 </para>
1033
1034 <para>
1035 Note also that we have both kernel and userspace entries in the
1036 above snapshot. We can also tell perf to focus on userspace but
1037 providing a modifier, in this case 'u', to the 'cycles' hardware
1038 counter when we record a profile:
1039 <literallayout class='monospaced'>
1040 root@crownbay:~# perf record -g -a -e cycles:u
1041 ^C[ perf record: Woken up 2 times to write data ]
1042 [ perf record: Captured and wrote 0.376 MB perf.data (~16443 samples) ]
1043 </literallayout>
1044 </para>
1045
1046 <para>
1047 <imagedata fileref="figures/perf-report-cycles-u.png" width="6in" depth="7in" align="center" scalefit="1" />
1048 </para>
1049
1050 <para>
1051 Notice in the screenshot above, we see only userspace entries ([.])
1052 </para>
1053
1054 <para>
1055 Finally, we can press 'enter' on a leaf node and select the 'Zoom
1056 into DSO' menu item to show only entries associated with a
1057 specific DSO. In the screenshot below, we've zoomed into the
1058 'libc' DSO which shows all the entries associated with the
1059 libc-xxx.so DSO.
1060 </para>
1061
1062 <para>
1063 <imagedata fileref="figures/perf-systemwide-libc.png" width="6in" depth="7in" align="center" scalefit="1" />
1064 </para>
1065
1066 <para>
1067 We can also use the system-wide -a switch to do system-wide
1068 tracing. Here we'll trace a couple of scheduler events:
1069 <literallayout class='monospaced'>
1070 root@crownbay:~# perf record -a -e sched:sched_switch -e sched:sched_wakeup
1071 ^C[ perf record: Woken up 38 times to write data ]
1072 [ perf record: Captured and wrote 9.780 MB perf.data (~427299 samples) ]
1073 </literallayout>
1074 We can look at the raw output using 'perf script' with no
1075 arguments:
1076 <literallayout class='monospaced'>
1077 root@crownbay:~# perf script
1078
1079 perf 1383 [001] 6171.460045: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1080 perf 1383 [001] 6171.460066: sched_switch: prev_comm=perf prev_pid=1383 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=21 next_prio=120
1081 kworker/1:1 21 [001] 6171.460093: sched_switch: prev_comm=kworker/1:1 prev_pid=21 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=1383 next_prio=120
1082 swapper 0 [000] 6171.468063: sched_wakeup: comm=kworker/0:3 pid=1209 prio=120 success=1 target_cpu=000
1083 swapper 0 [000] 6171.468107: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:3 next_pid=1209 next_prio=120
1084 kworker/0:3 1209 [000] 6171.468143: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
1085 perf 1383 [001] 6171.470039: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1086 perf 1383 [001] 6171.470058: sched_switch: prev_comm=perf prev_pid=1383 prev_prio=120 prev_state=R+ ==> next_comm=kworker/1:1 next_pid=21 next_prio=120
1087 kworker/1:1 21 [001] 6171.470082: sched_switch: prev_comm=kworker/1:1 prev_pid=21 prev_prio=120 prev_state=S ==> next_comm=perf next_pid=1383 next_prio=120
1088 perf 1383 [001] 6171.480035: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1089 </literallayout>
1090 </para>
985 1091
1092 <section id='perf-filtering'>
1093 <title>Filtering</title>
986 1094
1095 <para>
1096 Notice that there are a lot of events that don't really have
1097 anything to do with what we're interested in, namely events
1098 that schedule 'perf' itself in and out or that wake perf up.
1099 We can get rid of those by using the '--filter' option -
1100 for each event we specify using -e, we can add a --filter
1101 after that to filter out trace events that contain fields
1102 with specific values:
1103 <literallayout class='monospaced'>
1104 root@crownbay:~# perf record -a -e sched:sched_switch --filter 'next_comm != perf &amp;&amp; prev_comm != perf' -e sched:sched_wakeup --filter 'comm != perf'
1105 ^C[ perf record: Woken up 38 times to write data ]
1106 [ perf record: Captured and wrote 9.688 MB perf.data (~423279 samples) ]
987 1107
1108
1109 root@crownbay:~# perf script
1110
1111 swapper 0 [000] 7932.162180: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:3 next_pid=1209 next_prio=120
1112 kworker/0:3 1209 [000] 7932.162236: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
1113 perf 1407 [001] 7932.170048: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1114 perf 1407 [001] 7932.180044: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1115 perf 1407 [001] 7932.190038: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1116 perf 1407 [001] 7932.200044: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1117 perf 1407 [001] 7932.210044: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1118 perf 1407 [001] 7932.220044: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1119 swapper 0 [001] 7932.230111: sched_wakeup: comm=kworker/1:1 pid=21 prio=120 success=1 target_cpu=001
1120 swapper 0 [001] 7932.230146: sched_switch: prev_comm=swapper/1 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/1:1 next_pid=21 next_prio=120
1121 kworker/1:1 21 [001] 7932.230205: sched_switch: prev_comm=kworker/1:1 prev_pid=21 prev_prio=120 prev_state=S ==> next_comm=swapper/1 next_pid=0 next_prio=120
1122 swapper 0 [000] 7932.326109: sched_wakeup: comm=kworker/0:3 pid=1209 prio=120 success=1 target_cpu=000
1123 swapper 0 [000] 7932.326171: sched_switch: prev_comm=swapper/0 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=kworker/0:3 next_pid=1209 next_prio=120
1124 kworker/0:3 1209 [000] 7932.326214: sched_switch: prev_comm=kworker/0:3 prev_pid=1209 prev_prio=120 prev_state=S ==> next_comm=swapper/0 next_pid=0 next_prio=120
1125 </literallayout>
1126 In this case, we've filtered out all events that have 'perf'
1127 in their 'comm' or 'comm_prev' or 'comm_next' fields. Notice
1128 that there are still events recorded for perf, but notice
1129 that those events don't have values of 'perf' for the filtered
1130 fields. To completely filter out anything from perf will
1131 require a bit more work, but for the purpose of demonstrating
1132 how to use filters, it's close enough.
1133 </para>
1134
1135 <note>
1136 Tying It Together: These are exactly the same set of event
1137 filters defined by the trace event subsystem. See the
1138 ftrace/tracecmd/kernelshark section for more discussion about
1139 these event filters.
1140 </note>
1141
1142 <note>
1143 Tying It Together: These event filters are implemented by a
1144 special-purpose pseudo-interpreter in the kernel and are an
1145 integral and indispensable part of the perf design as it
1146 relates to tracing. kernel-based event filters provide a
1147 mechanism to precisely throttle the event stream that appears
1148 in user space, where it makes sense to provide bindings to real
1149 programming languages for postprocessing the event stream.
1150 This architecture allows for the intelligent and flexible
1151 partitioning of processing between the kernel and user space.
1152 Contrast this with other tools such as SystemTap, which does
1153 all of its processing in the kernel and as such requires a
1154 special project-defined language in order to accommodate that
1155 design, or LTTng, where everything is sent to userspace and
1156 as such requires a super-efficient kernel-to-userspace
1157 transport mechanism in order to function properly. While
1158 perf certainly can benefit from for instance advances in
1159 the design of the transport, it doesn't fundamentally depend
1160 on them. Basically, if you find that your perf tracing
1161 application is causing buffer I/O overruns, it probably
1162 means that you aren't taking enough advantage of the
1163 kernel filtering engine.
1164 </note>
1165 </section>
1166 </section>
1167
1168 <section id='using-dynamic-tracepoints'>
1169 <title>Using Dynamic Tracepoints</title>
1170
1171 <para>
1172 perf isn't restricted to the fixed set of static tracepoints
1173 listed by 'perf list'. Users can also add their own 'dynamic'
1174 tracepoints anywhere in the kernel. For instance, suppose we
1175 want to define our own tracepoint on do_fork(). We can do that
1176 using the 'perf probe' perf subcommand:
1177 <literallayout class='monospaced'>
1178 root@crownbay:~# perf probe do_fork
1179 Added new event:
1180 probe:do_fork (on do_fork)
1181
1182 You can now use it in all perf tools, such as:
1183
1184 perf record -e probe:do_fork -aR sleep 1
1185 </literallayout>
1186 Adding a new tracepoint via 'perf probe' results in an event
1187 with all the expected files and format in
1188 /sys/kernel/debug/tracing/events, just the same as for static
1189 tracepoints (as discussed in more detail in the trace events
1190 subsystem section:
1191 <literallayout class='monospaced'>
1192 root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# ls -al
1193 drwxr-xr-x 2 root root 0 Oct 28 11:42 .
1194 drwxr-xr-x 3 root root 0 Oct 28 11:42 ..
1195 -rw-r--r-- 1 root root 0 Oct 28 11:42 enable
1196 -rw-r--r-- 1 root root 0 Oct 28 11:42 filter
1197 -r--r--r-- 1 root root 0 Oct 28 11:42 format
1198 -r--r--r-- 1 root root 0 Oct 28 11:42 id
1199
1200 root@crownbay:/sys/kernel/debug/tracing/events/probe/do_fork# cat format
1201 name: do_fork
1202 ID: 944
1203 format:
1204 field:unsigned short common_type; offset:0; size:2; signed:0;
1205 field:unsigned char common_flags; offset:2; size:1; signed:0;
1206 field:unsigned char common_preempt_count; offset:3; size:1; signed:0;
1207 field:int common_pid; offset:4; size:4; signed:1;
1208 field:int common_padding; offset:8; size:4; signed:1;
1209
1210 field:unsigned long __probe_ip; offset:12; size:4; signed:0;
1211
1212 print fmt: "(%lx)", REC->__probe_ip
1213 </literallayout>
1214 We can list all dynamic tracepoints currently in existence:
1215 <literallayout class='monospaced'>
1216 root@crownbay:~# perf probe -l
1217 probe:do_fork (on do_fork)
1218 probe:schedule (on schedule)
1219 </literallayout>
1220 Let's record system-wide ('sleep 30' is a trick for recording
1221 system-wide but basically do nothing and then wake up after
1222 30 seconds):
1223 <literallayout class='monospaced'>
1224 root@crownbay:~# perf record -g -a -e probe:do_fork sleep 30
1225 [ perf record: Woken up 1 times to write data ]
1226 [ perf record: Captured and wrote 0.087 MB perf.data (~3812 samples) ]
1227 </literallayout>
1228 Using 'perf script' we can see each do_fork event that fired:
1229 <literallayout class='monospaced'>
1230 root@crownbay:~# perf script
1231
1232 # ========
1233 # captured on: Sun Oct 28 11:55:18 2012
1234 # hostname : crownbay
1235 # os release : 3.4.11-yocto-standard
1236 # perf version : 3.4.11
1237 # arch : i686
1238 # nrcpus online : 2
1239 # nrcpus avail : 2
1240 # cpudesc : Intel(R) Atom(TM) CPU E660 @ 1.30GHz
1241 # cpuid : GenuineIntel,6,38,1
1242 # total memory : 1017184 kB
1243 # cmdline : /usr/bin/perf record -g -a -e probe:do_fork sleep 30
1244 # event : name = probe:do_fork, type = 2, config = 0x3b0, config1 = 0x0, config2 = 0x0, excl_usr = 0, excl_kern
1245 = 0, id = { 5, 6 }
1246 # HEADER_CPU_TOPOLOGY info available, use -I to display
1247 # ========
1248 #
1249 matchbox-deskto 1197 [001] 34211.378318: do_fork: (c1028460)
1250 matchbox-deskto 1295 [001] 34211.380388: do_fork: (c1028460)
1251 pcmanfm 1296 [000] 34211.632350: do_fork: (c1028460)
1252 pcmanfm 1296 [000] 34211.639917: do_fork: (c1028460)
1253 matchbox-deskto 1197 [001] 34217.541603: do_fork: (c1028460)
1254 matchbox-deskto 1299 [001] 34217.543584: do_fork: (c1028460)
1255 gthumb 1300 [001] 34217.697451: do_fork: (c1028460)
1256 gthumb 1300 [001] 34219.085734: do_fork: (c1028460)
1257 gthumb 1300 [000] 34219.121351: do_fork: (c1028460)
1258 gthumb 1300 [001] 34219.264551: do_fork: (c1028460)
1259 pcmanfm 1296 [000] 34219.590380: do_fork: (c1028460)
1260 matchbox-deskto 1197 [001] 34224.955965: do_fork: (c1028460)
1261 matchbox-deskto 1306 [001] 34224.957972: do_fork: (c1028460)
1262 matchbox-termin 1307 [000] 34225.038214: do_fork: (c1028460)
1263 matchbox-termin 1307 [001] 34225.044218: do_fork: (c1028460)
1264 matchbox-termin 1307 [000] 34225.046442: do_fork: (c1028460)
1265 matchbox-deskto 1197 [001] 34237.112138: do_fork: (c1028460)
1266 matchbox-deskto 1311 [001] 34237.114106: do_fork: (c1028460)
1267 gaku 1312 [000] 34237.202388: do_fork: (c1028460)
1268 </literallayout>
1269 And using 'perf report' on the same file, we can see the
1270 callgraphs from starting a few programs during those 30 seconds:
1271 </para>
1272
1273 <para>
1274 <imagedata fileref="figures/perf-probe-do_fork-profile.png" width="6in" depth="7in" align="center" scalefit="1" />
1275 </para>
1276
1277 <note>
1278 Tying It Together: The trace events subsystem accomodate static
1279 and dynamic tracepoints in exactly the same way - there's no
1280 difference as far as the infrastructure is concerned. See the
1281 ftrace section for more details on the trace event subsystem.
1282 </note>
1283
1284 <note>
1285 Tying It Together: Dynamic tracepoints are implemented under the
1286 covers by kprobes and uprobes. kprobes and uprobes are also used
1287 by and in fact are the main focus of SystemTap.
1288 </note>
1289 </section>
1290
1291 <section id='perf-documentation'>
1292 <title>Documentation</title>
1293
1294 <para>
1295 Online versions of the man pages for the commands discussed in this
1296 section can be found here:
1297 <itemizedlist>
1298 <listitem><para>The <ulink url='http://linux.die.net/man/1/perf-stat'>'perf stat' manpage</ulink>.
1299 </para></listitem>
1300 <listitem><para>The <ulink url='http://linux.die.net/man/1/perf-record'>'perf record' manpage</ulink>.
1301 </para></listitem>
1302 <listitem><para>The <ulink url='http://linux.die.net/man/1/perf-report'>'perf report' manpage</ulink>.
1303 </para></listitem>
1304 <listitem><para>The <ulink url='http://linux.die.net/man/1/perf-probe'>'perf probe' manpage</ulink>.
1305 </para></listitem>
1306 <listitem><para>The <ulink url='http://linux.die.net/man/1/perf-script'>'perf script' manpage</ulink>.
1307 </para></listitem>
1308 <listitem><para>Documentation on using the
1309 <ulink url='http://linux.die.net/man/1/perf-script-python'>'perf script' python binding</ulink>.
1310 </para></listitem>
1311 <listitem><para>The top-level
1312 <ulink url='http://linux.die.net/man/1/perf'>perf(1) manpage</ulink>.
1313 </para></listitem>
1314 </itemizedlist>
1315 </para>
1316
1317 <para>
1318 Normally, you should be able to invoke the man pages via perf
1319 itself e.g. 'perf help' or 'perf help record'.
1320 </para>
1321
1322 <para>
1323 However, by default Yocto doesn't install man pages, but perf
1324 invokes the man pages for most help functionality. This is a bug
1325 and is being addressed by a Yocto bug:
1326 <ulink url='https://bugzilla.yoctoproject.org/show_bug.cgi?id=3388'>Bug 3388 - perf: enable man pages for basic 'help' functionality</ulink>.
1327 </para>
1328
1329 <para>
1330 The man pages in text form, along with some other files, such as
1331 a set of examples, can be found in the 'perf' directory of the
1332 kernel tree:
1333 <literallayout class='monospaced'>
1334 tools/perf/Documentation
1335 </literallayout>
1336 There's also a nice perf tutorial on the perf wiki that goes
1337 into more detail than we do here in certain areas:
1338 <ulink url='https://perf.wiki.kernel.org/index.php/Tutorial'>Perf Tutorial</ulink>
1339 </para>
1340 </section>
988</section> 1341</section>
989</chapter> 1342</chapter>
990<!-- 1343<!--