diff options
Diffstat (limited to 'documentation/profile-manual')
-rw-r--r-- | documentation/profile-manual/profile-manual-usage.xml | 664 |
1 files changed, 664 insertions, 0 deletions
diff --git a/documentation/profile-manual/profile-manual-usage.xml b/documentation/profile-manual/profile-manual-usage.xml index 12323ed7e7..a35a112e55 100644 --- a/documentation/profile-manual/profile-manual-usage.xml +++ b/documentation/profile-manual/profile-manual-usage.xml | |||
@@ -1340,6 +1340,670 @@ | |||
1340 | </para> | 1340 | </para> |
1341 | </section> | 1341 | </section> |
1342 | </section> | 1342 | </section> |
1343 | |||
1344 | <section id='profile-manual-ftrace'> | ||
1345 | <title>ftrace</title> | ||
1346 | |||
1347 | <para> | ||
1348 | 'ftrace' literally refers to the 'ftrace function tracer' but in | ||
1349 | reality this encompasses a number of related tracers along with | ||
1350 | the infrastructure that they all make use of. | ||
1351 | </para> | ||
1352 | |||
1353 | <section id='ftrace-setup'> | ||
1354 | <title>Setup</title> | ||
1355 | |||
1356 | <para> | ||
1357 | For this section, we'll assume you've already performed the basic | ||
1358 | setup outlined in the General Setup section. | ||
1359 | </para> | ||
1360 | |||
1361 | <para> | ||
1362 | ftrace, trace-cmd, and kernelshark run on the target system, | ||
1363 | and are ready to go out-of-the-box - no additional setup is | ||
1364 | necessary. For the rest of this section we assume you've ssh'ed | ||
1365 | to the host and will be running ftrace on the target. kernelshark | ||
1366 | is a GUI application and if you use the '-X' option to ssh you | ||
1367 | can have the kernelshark GUI run on the target but display | ||
1368 | remotely on the host if you want. | ||
1369 | </para> | ||
1370 | </section> | ||
1371 | |||
1372 | <section id='basic-ftrace-usage'> | ||
1373 | <title>Basic ftrace usage</title> | ||
1374 | |||
1375 | <para> | ||
1376 | 'ftrace' essentially refers to everything included in | ||
1377 | the /tracing directory of the mounted debugfs filesystem | ||
1378 | (Yocto follows the standard convention and mounts it | ||
1379 | at /sys/kernel/debug). Here's a listing of all the files | ||
1380 | found in /sys/kernel/debug/tracing on a Yocto system.: | ||
1381 | <literallayout class='monospaced'> | ||
1382 | root@sugarbay:/sys/kernel/debug/tracing# ls | ||
1383 | README kprobe_events trace | ||
1384 | available_events kprobe_profile trace_clock | ||
1385 | available_filter_functions options trace_marker | ||
1386 | available_tracers per_cpu trace_options | ||
1387 | buffer_size_kb printk_formats trace_pipe | ||
1388 | buffer_total_size_kb saved_cmdlines tracing_cpumask | ||
1389 | current_tracer set_event tracing_enabled | ||
1390 | dyn_ftrace_total_info set_ftrace_filter tracing_on | ||
1391 | enabled_functions set_ftrace_notrace tracing_thresh | ||
1392 | events set_ftrace_pid | ||
1393 | free_buffer set_graph_function | ||
1394 | </literallayout> | ||
1395 | The files listed above are used for various purposes - | ||
1396 | some relate directly to the tracers themselves, others are | ||
1397 | used to set tracing options, and yet others actually contain | ||
1398 | the tracing output when a tracer is in effect. Some of the | ||
1399 | functions can be guessed from their names, others need | ||
1400 | explanation; in any case, we'll cover some of the files we | ||
1401 | see here below but for an explanation of the others, please | ||
1402 | see the ftrace documentation. | ||
1403 | </para> | ||
1404 | |||
1405 | <para> | ||
1406 | We'll start by looking at some of the available built-in | ||
1407 | tracers. | ||
1408 | </para> | ||
1409 | |||
1410 | <para> | ||
1411 | cat'ing the 'available_tracers' file lists the set of | ||
1412 | available tracers: | ||
1413 | <literallayout class='monospaced'> | ||
1414 | root@sugarbay:/sys/kernel/debug/tracing# cat available_tracers | ||
1415 | blk function_graph function nop | ||
1416 | </literallayout> | ||
1417 | The 'current_tracer' file contains the tracer currently in | ||
1418 | effect: | ||
1419 | <literallayout class='monospaced'> | ||
1420 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | ||
1421 | nop | ||
1422 | </literallayout> | ||
1423 | The above listing of current_tracer shows that | ||
1424 | the 'nop' tracer is in effect, which is just another | ||
1425 | way of saying that there's actually no tracer | ||
1426 | currently in effect. | ||
1427 | </para> | ||
1428 | |||
1429 | <para> | ||
1430 | echo'ing one of the available_tracers into current_tracer | ||
1431 | makes the specified tracer the current tracer: | ||
1432 | <literallayout class='monospaced'> | ||
1433 | root@sugarbay:/sys/kernel/debug/tracing# echo function > current_tracer | ||
1434 | root@sugarbay:/sys/kernel/debug/tracing# cat current_tracer | ||
1435 | function | ||
1436 | </literallayout> | ||
1437 | The above sets the current tracer to be the | ||
1438 | 'function tracer'. This tracer traces every function | ||
1439 | call in the kernel and makes it available as the | ||
1440 | contents of the 'trace' file. Reading the 'trace' file | ||
1441 | lists the currently buffered function calls that have been | ||
1442 | traced by the function tracer: | ||
1443 | <literallayout class='monospaced'> | ||
1444 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | ||
1445 | |||
1446 | # tracer: function | ||
1447 | # | ||
1448 | # entries-in-buffer/entries-written: 310629/766471 #P:8 | ||
1449 | # | ||
1450 | # _-----=> irqs-off | ||
1451 | # / _----=> need-resched | ||
1452 | # | / _---=> hardirq/softirq | ||
1453 | # || / _--=> preempt-depth | ||
1454 | # ||| / delay | ||
1455 | # TASK-PID CPU# |||| TIMESTAMP FUNCTION | ||
1456 | # | | | |||| | | | ||
1457 | <idle>-0 [004] d..1 470.867169: ktime_get_real <-intel_idle | ||
1458 | <idle>-0 [004] d..1 470.867170: getnstimeofday <-ktime_get_real | ||
1459 | <idle>-0 [004] d..1 470.867171: ns_to_timeval <-intel_idle | ||
1460 | <idle>-0 [004] d..1 470.867171: ns_to_timespec <-ns_to_timeval | ||
1461 | <idle>-0 [004] d..1 470.867172: smp_apic_timer_interrupt <-apic_timer_interrupt | ||
1462 | <idle>-0 [004] d..1 470.867172: native_apic_mem_write <-smp_apic_timer_interrupt | ||
1463 | <idle>-0 [004] d..1 470.867172: irq_enter <-smp_apic_timer_interrupt | ||
1464 | <idle>-0 [004] d..1 470.867172: rcu_irq_enter <-irq_enter | ||
1465 | <idle>-0 [004] d..1 470.867173: rcu_idle_exit_common.isra.33 <-rcu_irq_enter | ||
1466 | <idle>-0 [004] d..1 470.867173: local_bh_disable <-irq_enter | ||
1467 | <idle>-0 [004] d..1 470.867173: add_preempt_count <-local_bh_disable | ||
1468 | <idle>-0 [004] d.s1 470.867174: tick_check_idle <-irq_enter | ||
1469 | <idle>-0 [004] d.s1 470.867174: tick_check_oneshot_broadcast <-tick_check_idle | ||
1470 | <idle>-0 [004] d.s1 470.867174: ktime_get <-tick_check_idle | ||
1471 | <idle>-0 [004] d.s1 470.867174: tick_nohz_stop_idle <-tick_check_idle | ||
1472 | <idle>-0 [004] d.s1 470.867175: update_ts_time_stats <-tick_nohz_stop_idle | ||
1473 | <idle>-0 [004] d.s1 470.867175: nr_iowait_cpu <-update_ts_time_stats | ||
1474 | <idle>-0 [004] d.s1 470.867175: tick_do_update_jiffies64 <-tick_check_idle | ||
1475 | <idle>-0 [004] d.s1 470.867175: _raw_spin_lock <-tick_do_update_jiffies64 | ||
1476 | <idle>-0 [004] d.s1 470.867176: add_preempt_count <-_raw_spin_lock | ||
1477 | <idle>-0 [004] d.s2 470.867176: do_timer <-tick_do_update_jiffies64 | ||
1478 | <idle>-0 [004] d.s2 470.867176: _raw_spin_lock <-do_timer | ||
1479 | <idle>-0 [004] d.s2 470.867176: add_preempt_count <-_raw_spin_lock | ||
1480 | <idle>-0 [004] d.s3 470.867177: ntp_tick_length <-do_timer | ||
1481 | <idle>-0 [004] d.s3 470.867177: _raw_spin_lock_irqsave <-ntp_tick_length | ||
1482 | . | ||
1483 | . | ||
1484 | . | ||
1485 | </literallayout> | ||
1486 | Each line in the trace above shows what was happening in | ||
1487 | the kernel on a given cpu, to the level of detail of | ||
1488 | function calls. Each entry shows the function called, | ||
1489 | followed by its caller (after the arrow). | ||
1490 | </para> | ||
1491 | |||
1492 | <para> | ||
1493 | The function tracer gives you an extremely detailed idea | ||
1494 | of what the kernel was doing at the point in time the trace | ||
1495 | was taken, and is a great way to learn about how the kernel | ||
1496 | code works in a dynamic sense. | ||
1497 | </para> | ||
1498 | |||
1499 | <note> | ||
1500 | Tying It Together: The ftrace function tracer is also | ||
1501 | available from within perf, as the ftrace:function tracepoint. | ||
1502 | </note> | ||
1503 | |||
1504 | <para> | ||
1505 | It is a little more difficult to follow the call chains than | ||
1506 | it needs to be - luckily there's a variant of the function | ||
1507 | tracer that displays the callchains explicitly, called the | ||
1508 | 'function_graph' tracer: | ||
1509 | <literallayout class='monospaced'> | ||
1510 | root@sugarbay:/sys/kernel/debug/tracing# echo function_graph > current_tracer | ||
1511 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | ||
1512 | |||
1513 | tracer: function_graph | ||
1514 | |||
1515 | CPU DURATION FUNCTION CALLS | ||
1516 | | | | | | | | | ||
1517 | 7) 0.046 us | pick_next_task_fair(); | ||
1518 | 7) 0.043 us | pick_next_task_stop(); | ||
1519 | 7) 0.042 us | pick_next_task_rt(); | ||
1520 | 7) 0.032 us | pick_next_task_fair(); | ||
1521 | 7) 0.030 us | pick_next_task_idle(); | ||
1522 | 7) | _raw_spin_unlock_irq() { | ||
1523 | 7) 0.033 us | sub_preempt_count(); | ||
1524 | 7) 0.258 us | } | ||
1525 | 7) 0.032 us | sub_preempt_count(); | ||
1526 | 7) + 13.341 us | } /* __schedule */ | ||
1527 | 7) 0.095 us | } /* sub_preempt_count */ | ||
1528 | 7) | schedule() { | ||
1529 | 7) | __schedule() { | ||
1530 | 7) 0.060 us | add_preempt_count(); | ||
1531 | 7) 0.044 us | rcu_note_context_switch(); | ||
1532 | 7) | _raw_spin_lock_irq() { | ||
1533 | 7) 0.033 us | add_preempt_count(); | ||
1534 | 7) 0.247 us | } | ||
1535 | 7) | idle_balance() { | ||
1536 | 7) | _raw_spin_unlock() { | ||
1537 | 7) 0.031 us | sub_preempt_count(); | ||
1538 | 7) 0.246 us | } | ||
1539 | 7) | update_shares() { | ||
1540 | 7) 0.030 us | __rcu_read_lock(); | ||
1541 | 7) 0.029 us | __rcu_read_unlock(); | ||
1542 | 7) 0.484 us | } | ||
1543 | 7) 0.030 us | __rcu_read_lock(); | ||
1544 | 7) | load_balance() { | ||
1545 | 7) | find_busiest_group() { | ||
1546 | 7) 0.031 us | idle_cpu(); | ||
1547 | 7) 0.029 us | idle_cpu(); | ||
1548 | 7) 0.035 us | idle_cpu(); | ||
1549 | 7) 0.906 us | } | ||
1550 | 7) 1.141 us | } | ||
1551 | 7) 0.022 us | msecs_to_jiffies(); | ||
1552 | 7) | load_balance() { | ||
1553 | 7) | find_busiest_group() { | ||
1554 | 7) 0.031 us | idle_cpu(); | ||
1555 | . | ||
1556 | . | ||
1557 | . | ||
1558 | 4) 0.062 us | msecs_to_jiffies(); | ||
1559 | 4) 0.062 us | __rcu_read_unlock(); | ||
1560 | 4) | _raw_spin_lock() { | ||
1561 | 4) 0.073 us | add_preempt_count(); | ||
1562 | 4) 0.562 us | } | ||
1563 | 4) + 17.452 us | } | ||
1564 | 4) 0.108 us | put_prev_task_fair(); | ||
1565 | 4) 0.102 us | pick_next_task_fair(); | ||
1566 | 4) 0.084 us | pick_next_task_stop(); | ||
1567 | 4) 0.075 us | pick_next_task_rt(); | ||
1568 | 4) 0.062 us | pick_next_task_fair(); | ||
1569 | 4) 0.066 us | pick_next_task_idle(); | ||
1570 | ------------------------------------------ | ||
1571 | 4) kworker-74 => <idle>-0 | ||
1572 | ------------------------------------------ | ||
1573 | |||
1574 | 4) | finish_task_switch() { | ||
1575 | 4) | _raw_spin_unlock_irq() { | ||
1576 | 4) 0.100 us | sub_preempt_count(); | ||
1577 | 4) 0.582 us | } | ||
1578 | 4) 1.105 us | } | ||
1579 | 4) 0.088 us | sub_preempt_count(); | ||
1580 | 4) ! 100.066 us | } | ||
1581 | . | ||
1582 | . | ||
1583 | . | ||
1584 | 3) | sys_ioctl() { | ||
1585 | 3) 0.083 us | fget_light(); | ||
1586 | 3) | security_file_ioctl() { | ||
1587 | 3) 0.066 us | cap_file_ioctl(); | ||
1588 | 3) 0.562 us | } | ||
1589 | 3) | do_vfs_ioctl() { | ||
1590 | 3) | drm_ioctl() { | ||
1591 | 3) 0.075 us | drm_ut_debug_printk(); | ||
1592 | 3) | i915_gem_pwrite_ioctl() { | ||
1593 | 3) | i915_mutex_lock_interruptible() { | ||
1594 | 3) 0.070 us | mutex_lock_interruptible(); | ||
1595 | 3) 0.570 us | } | ||
1596 | 3) | drm_gem_object_lookup() { | ||
1597 | 3) | _raw_spin_lock() { | ||
1598 | 3) 0.080 us | add_preempt_count(); | ||
1599 | 3) 0.620 us | } | ||
1600 | 3) | _raw_spin_unlock() { | ||
1601 | 3) 0.085 us | sub_preempt_count(); | ||
1602 | 3) 0.562 us | } | ||
1603 | 3) 2.149 us | } | ||
1604 | 3) 0.133 us | i915_gem_object_pin(); | ||
1605 | 3) | i915_gem_object_set_to_gtt_domain() { | ||
1606 | 3) 0.065 us | i915_gem_object_flush_gpu_write_domain(); | ||
1607 | 3) 0.065 us | i915_gem_object_wait_rendering(); | ||
1608 | 3) 0.062 us | i915_gem_object_flush_cpu_write_domain(); | ||
1609 | 3) 1.612 us | } | ||
1610 | 3) | i915_gem_object_put_fence() { | ||
1611 | 3) 0.097 us | i915_gem_object_flush_fence.constprop.36(); | ||
1612 | 3) 0.645 us | } | ||
1613 | 3) 0.070 us | add_preempt_count(); | ||
1614 | 3) 0.070 us | sub_preempt_count(); | ||
1615 | 3) 0.073 us | i915_gem_object_unpin(); | ||
1616 | 3) 0.068 us | mutex_unlock(); | ||
1617 | 3) 9.924 us | } | ||
1618 | 3) + 11.236 us | } | ||
1619 | 3) + 11.770 us | } | ||
1620 | 3) + 13.784 us | } | ||
1621 | 3) | sys_ioctl() { | ||
1622 | </literallayout> | ||
1623 | As you can see, the function_graph display is much easier to | ||
1624 | follow. Also note that in addition to the function calls and | ||
1625 | associated braces, other events such as scheduler events | ||
1626 | are displayed in context. In fact, you can freely include | ||
1627 | any tracepoint available in the trace events subsystem described | ||
1628 | in the next section by simply enabling those events, and they'll | ||
1629 | appear in context in the function graph display. Quite a | ||
1630 | powerful tool for understanding kernel dynamics. | ||
1631 | </para> | ||
1632 | |||
1633 | <para> | ||
1634 | Also notice that there are various annotations on the left | ||
1635 | hand side of the display. For example if the total time it | ||
1636 | took for a given function to execute is above a certain | ||
1637 | threshold, and exclamation point or plus sign appears on the | ||
1638 | left hand side. Please see the ftrace documentation for | ||
1639 | details on all these fields. | ||
1640 | </para> | ||
1641 | </section> | ||
1642 | |||
1643 | <section id='the-trace-events-subsystem'> | ||
1644 | <title>The 'trace events' Subsystem</title> | ||
1645 | |||
1646 | <para> | ||
1647 | One especially important directory contained within | ||
1648 | the /sys/kernel/debug/tracing directory is the 'events' | ||
1649 | subdirectory, which contains representations of every | ||
1650 | tracepoint in the system. Listing out the contents of | ||
1651 | the 'events' subdirectory, we see mainly another set of | ||
1652 | subdirectories: | ||
1653 | <literallayout class='monospaced'> | ||
1654 | root@sugarbay:/sys/kernel/debug/tracing# cd events | ||
1655 | root@sugarbay:/sys/kernel/debug/tracing/events# ls -al | ||
1656 | drwxr-xr-x 38 root root 0 Nov 14 23:19 . | ||
1657 | drwxr-xr-x 5 root root 0 Nov 14 23:19 .. | ||
1658 | drwxr-xr-x 19 root root 0 Nov 14 23:19 block | ||
1659 | drwxr-xr-x 32 root root 0 Nov 14 23:19 btrfs | ||
1660 | drwxr-xr-x 5 root root 0 Nov 14 23:19 drm | ||
1661 | -rw-r--r-- 1 root root 0 Nov 14 23:19 enable | ||
1662 | drwxr-xr-x 40 root root 0 Nov 14 23:19 ext3 | ||
1663 | drwxr-xr-x 79 root root 0 Nov 14 23:19 ext4 | ||
1664 | drwxr-xr-x 14 root root 0 Nov 14 23:19 ftrace | ||
1665 | drwxr-xr-x 8 root root 0 Nov 14 23:19 hda | ||
1666 | -r--r--r-- 1 root root 0 Nov 14 23:19 header_event | ||
1667 | -r--r--r-- 1 root root 0 Nov 14 23:19 header_page | ||
1668 | drwxr-xr-x 25 root root 0 Nov 14 23:19 i915 | ||
1669 | drwxr-xr-x 7 root root 0 Nov 14 23:19 irq | ||
1670 | drwxr-xr-x 12 root root 0 Nov 14 23:19 jbd | ||
1671 | drwxr-xr-x 14 root root 0 Nov 14 23:19 jbd2 | ||
1672 | drwxr-xr-x 14 root root 0 Nov 14 23:19 kmem | ||
1673 | drwxr-xr-x 7 root root 0 Nov 14 23:19 module | ||
1674 | drwxr-xr-x 3 root root 0 Nov 14 23:19 napi | ||
1675 | drwxr-xr-x 6 root root 0 Nov 14 23:19 net | ||
1676 | drwxr-xr-x 3 root root 0 Nov 14 23:19 oom | ||
1677 | drwxr-xr-x 12 root root 0 Nov 14 23:19 power | ||
1678 | drwxr-xr-x 3 root root 0 Nov 14 23:19 printk | ||
1679 | drwxr-xr-x 8 root root 0 Nov 14 23:19 random | ||
1680 | drwxr-xr-x 4 root root 0 Nov 14 23:19 raw_syscalls | ||
1681 | drwxr-xr-x 3 root root 0 Nov 14 23:19 rcu | ||
1682 | drwxr-xr-x 6 root root 0 Nov 14 23:19 rpm | ||
1683 | drwxr-xr-x 20 root root 0 Nov 14 23:19 sched | ||
1684 | drwxr-xr-x 7 root root 0 Nov 14 23:19 scsi | ||
1685 | drwxr-xr-x 4 root root 0 Nov 14 23:19 signal | ||
1686 | drwxr-xr-x 5 root root 0 Nov 14 23:19 skb | ||
1687 | drwxr-xr-x 4 root root 0 Nov 14 23:19 sock | ||
1688 | drwxr-xr-x 10 root root 0 Nov 14 23:19 sunrpc | ||
1689 | drwxr-xr-x 538 root root 0 Nov 14 23:19 syscalls | ||
1690 | drwxr-xr-x 4 root root 0 Nov 14 23:19 task | ||
1691 | drwxr-xr-x 14 root root 0 Nov 14 23:19 timer | ||
1692 | drwxr-xr-x 3 root root 0 Nov 14 23:19 udp | ||
1693 | drwxr-xr-x 21 root root 0 Nov 14 23:19 vmscan | ||
1694 | drwxr-xr-x 3 root root 0 Nov 14 23:19 vsyscall | ||
1695 | drwxr-xr-x 6 root root 0 Nov 14 23:19 workqueue | ||
1696 | drwxr-xr-x 26 root root 0 Nov 14 23:19 writeback | ||
1697 | </literallayout> | ||
1698 | Each one of these subdirectories corresponds to a | ||
1699 | 'subsystem' and contains yet again more subdirectories, | ||
1700 | each one of those finally corresponding to a tracepoint. | ||
1701 | For example, here are the contents of the 'kmem' subsystem: | ||
1702 | <literallayout class='monospaced'> | ||
1703 | root@sugarbay:/sys/kernel/debug/tracing/events# cd kmem | ||
1704 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# ls -al | ||
1705 | drwxr-xr-x 14 root root 0 Nov 14 23:19 . | ||
1706 | drwxr-xr-x 38 root root 0 Nov 14 23:19 .. | ||
1707 | -rw-r--r-- 1 root root 0 Nov 14 23:19 enable | ||
1708 | -rw-r--r-- 1 root root 0 Nov 14 23:19 filter | ||
1709 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kfree | ||
1710 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc | ||
1711 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kmalloc_node | ||
1712 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc | ||
1713 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_alloc_node | ||
1714 | drwxr-xr-x 2 root root 0 Nov 14 23:19 kmem_cache_free | ||
1715 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc | ||
1716 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_extfrag | ||
1717 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_alloc_zone_locked | ||
1718 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free | ||
1719 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_free_batched | ||
1720 | drwxr-xr-x 2 root root 0 Nov 14 23:19 mm_page_pcpu_drain | ||
1721 | </literallayout> | ||
1722 | Let's see what's inside the subdirectory for a specific | ||
1723 | tracepoint, in this case the one for kmalloc: | ||
1724 | <literallayout class='monospaced'> | ||
1725 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem# cd kmalloc | ||
1726 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# ls -al | ||
1727 | drwxr-xr-x 2 root root 0 Nov 14 23:19 . | ||
1728 | drwxr-xr-x 14 root root 0 Nov 14 23:19 .. | ||
1729 | -rw-r--r-- 1 root root 0 Nov 14 23:19 enable | ||
1730 | -rw-r--r-- 1 root root 0 Nov 14 23:19 filter | ||
1731 | -r--r--r-- 1 root root 0 Nov 14 23:19 format | ||
1732 | -r--r--r-- 1 root root 0 Nov 14 23:19 id | ||
1733 | </literallayout> | ||
1734 | The 'format' file for the tracepoint describes the event | ||
1735 | in memory, which is used by the various tracing tools | ||
1736 | that now make use of these tracepoint to parse the event | ||
1737 | and make sense of it, along with a 'print fmt' field that | ||
1738 | allows tools like ftrace to display the event as text. | ||
1739 | Here's what the format of the kmalloc event looks like: | ||
1740 | <literallayout class='monospaced'> | ||
1741 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# cat format | ||
1742 | name: kmalloc | ||
1743 | ID: 313 | ||
1744 | format: | ||
1745 | field:unsigned short common_type; offset:0; size:2; signed:0; | ||
1746 | field:unsigned char common_flags; offset:2; size:1; signed:0; | ||
1747 | field:unsigned char common_preempt_count; offset:3; size:1; signed:0; | ||
1748 | field:int common_pid; offset:4; size:4; signed:1; | ||
1749 | field:int common_padding; offset:8; size:4; signed:1; | ||
1750 | |||
1751 | field:unsigned long call_site; offset:16; size:8; signed:0; | ||
1752 | field:const void * ptr; offset:24; size:8; signed:0; | ||
1753 | field:size_t bytes_req; offset:32; size:8; signed:0; | ||
1754 | field:size_t bytes_alloc; offset:40; size:8; signed:0; | ||
1755 | field:gfp_t gfp_flags; offset:48; size:4; signed:0; | ||
1756 | |||
1757 | print fmt: "call_site=%lx ptr=%p bytes_req=%zu bytes_alloc=%zu gfp_flags=%s", REC->call_site, REC->ptr, REC->bytes_req, REC->bytes_alloc, | ||
1758 | (REC->gfp_flags) ? __print_flags(REC->gfp_flags, "|", {(unsigned long)(((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( | ||
1759 | gfp_t)0x20000u) | (( gfp_t)0x02u) | (( gfp_t)0x08u)) | (( gfp_t)0x4000u) | (( gfp_t)0x10000u) | (( gfp_t)0x1000u) | (( gfp_t)0x200u) | (( | ||
1760 | gfp_t)0x400000u)), "GFP_TRANSHUGE"}, {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( gfp_t)0x20000u) | (( | ||
1761 | gfp_t)0x02u) | (( gfp_t)0x08u)), "GFP_HIGHUSER_MOVABLE"}, {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( | ||
1762 | gfp_t)0x20000u) | (( gfp_t)0x02u)), "GFP_HIGHUSER"}, {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( | ||
1763 | gfp_t)0x20000u)), "GFP_USER"}, {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u) | (( gfp_t)0x80000u)), GFP_TEMPORARY"}, | ||
1764 | {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u) | (( gfp_t)0x80u)), "GFP_KERNEL"}, {(unsigned long)((( gfp_t)0x10u) | (( gfp_t)0x40u)), | ||
1765 | "GFP_NOFS"}, {(unsigned long)((( gfp_t)0x20u)), "GFP_ATOMIC"}, {(unsigned long)((( gfp_t)0x10u)), "GFP_NOIO"}, {(unsigned long)(( | ||
1766 | gfp_t)0x20u), "GFP_HIGH"}, {(unsigned long)(( gfp_t)0x10u), "GFP_WAIT"}, {(unsigned long)(( gfp_t)0x40u), "GFP_IO"}, {(unsigned long)(( | ||
1767 | gfp_t)0x100u), "GFP_COLD"}, {(unsigned long)(( gfp_t)0x200u), "GFP_NOWARN"}, {(unsigned long)(( gfp_t)0x400u), "GFP_REPEAT"}, {(unsigned | ||
1768 | long)(( gfp_t)0x800u), "GFP_NOFAIL"}, {(unsigned long)(( gfp_t)0x1000u), "GFP_NORETRY"}, {(unsigned long)(( gfp_t)0x4000u), "GFP_COMP"}, | ||
1769 | {(unsigned long)(( gfp_t)0x8000u), "GFP_ZERO"}, {(unsigned long)(( gfp_t)0x10000u), "GFP_NOMEMALLOC"}, {(unsigned long)(( gfp_t)0x20000u), | ||
1770 | "GFP_HARDWALL"}, {(unsigned long)(( gfp_t)0x40000u), "GFP_THISNODE"}, {(unsigned long)(( gfp_t)0x80000u), "GFP_RECLAIMABLE"}, {(unsigned | ||
1771 | long)(( gfp_t)0x08u), "GFP_MOVABLE"}, {(unsigned long)(( gfp_t)0), "GFP_NOTRACK"}, {(unsigned long)(( gfp_t)0x400000u), "GFP_NO_KSWAPD"}, | ||
1772 | {(unsigned long)(( gfp_t)0x800000u), "GFP_OTHER_NODE"} ) : "GFP_NOWAIT" | ||
1773 | </literallayout> | ||
1774 | The 'enable' file in the tracepoint directory is what allows | ||
1775 | the user (or tools such as trace-cmd) to actually turn the | ||
1776 | tracepoint on and off. When enabled, the corresponding | ||
1777 | tracepoint will start appearing in the ftrace 'trace' | ||
1778 | file described previously. For example, this turns on the | ||
1779 | kmalloc tracepoint: | ||
1780 | <literallayout class='monospaced'> | ||
1781 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 1 > enable | ||
1782 | </literallayout> | ||
1783 | At the moment, we're not interested in the function tracer or | ||
1784 | some other tracer that might be in effect, so we first turn | ||
1785 | it off, but if we do that, we still need to turn tracing on in | ||
1786 | order to see the events in the output buffer: | ||
1787 | <literallayout class='monospaced'> | ||
1788 | root@sugarbay:/sys/kernel/debug/tracing# echo nop > current_tracer | ||
1789 | root@sugarbay:/sys/kernel/debug/tracing# echo 1 > tracing_on | ||
1790 | </literallayout> | ||
1791 | Now, if we look at the the 'trace' file, we see nothing | ||
1792 | but the kmalloc events we just turned on: | ||
1793 | <literallayout class='monospaced'> | ||
1794 | root@sugarbay:/sys/kernel/debug/tracing# cat trace | less | ||
1795 | # tracer: nop | ||
1796 | # | ||
1797 | # entries-in-buffer/entries-written: 1897/1897 #P:8 | ||
1798 | # | ||
1799 | # _-----=> irqs-off | ||
1800 | # / _----=> need-resched | ||
1801 | # | / _---=> hardirq/softirq | ||
1802 | # || / _--=> preempt-depth | ||
1803 | # ||| / delay | ||
1804 | # TASK-PID CPU# |||| TIMESTAMP FUNCTION | ||
1805 | # | | | |||| | | | ||
1806 | dropbear-1465 [000] ...1 18154.620753: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1807 | <idle>-0 [000] ..s3 18154.621640: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1808 | <idle>-0 [000] ..s3 18154.621656: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1809 | matchbox-termin-1361 [001] ...1 18154.755472: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5f0e00 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_KERNEL|GFP_REPEAT | ||
1810 | Xorg-1264 [002] ...1 18154.755581: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes_req=168 bytes_alloc=192 gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY | ||
1811 | Xorg-1264 [002] ...1 18154.755583: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1812 | Xorg-1264 [002] ...1 18154.755589: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1813 | matchbox-termin-1361 [001] ...1 18155.354594: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db35400 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT | ||
1814 | Xorg-1264 [002] ...1 18155.354703: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes_req=168 bytes_alloc=192 gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY | ||
1815 | Xorg-1264 [002] ...1 18155.354705: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1816 | Xorg-1264 [002] ...1 18155.354711: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1817 | <idle>-0 [000] ..s3 18155.673319: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1818 | dropbear-1465 [000] ...1 18155.673525: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1819 | <idle>-0 [000] ..s3 18155.674821: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1820 | <idle>-0 [000] ..s3 18155.793014: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1821 | dropbear-1465 [000] ...1 18155.793219: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1822 | <idle>-0 [000] ..s3 18155.794147: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1823 | <idle>-0 [000] ..s3 18155.936705: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1824 | dropbear-1465 [000] ...1 18155.936910: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1825 | <idle>-0 [000] ..s3 18155.937869: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1826 | matchbox-termin-1361 [001] ...1 18155.953667: kmalloc: call_site=ffffffff81614050 ptr=ffff88006d5f2000 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_KERNEL|GFP_REPEAT | ||
1827 | Xorg-1264 [002] ...1 18155.953775: kmalloc: call_site=ffffffff8141abe8 ptr=ffff8800734f4cc0 bytes_req=168 bytes_alloc=192 gfp_flags=GFP_KERNEL|GFP_NOWARN|GFP_NORETRY | ||
1828 | Xorg-1264 [002] ...1 18155.953777: kmalloc: call_site=ffffffff814192a3 ptr=ffff88001f822520 bytes_req=24 bytes_alloc=32 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1829 | Xorg-1264 [002] ...1 18155.953783: kmalloc: call_site=ffffffff81419edb ptr=ffff8800721a2f00 bytes_req=64 bytes_alloc=64 gfp_flags=GFP_KERNEL|GFP_ZERO | ||
1830 | <idle>-0 [000] ..s3 18156.176053: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1831 | dropbear-1465 [000] ...1 18156.176257: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1832 | <idle>-0 [000] ..s3 18156.177717: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1833 | <idle>-0 [000] ..s3 18156.399229: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d555800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1834 | dropbear-1465 [000] ...1 18156.399434: kmalloc: call_site=ffffffff816650d4 ptr=ffff8800729c3000 bytes_http://rostedt.homelinux.com/kernelshark/req=2048 bytes_alloc=2048 gfp_flags=GFP_KERNEL | ||
1835 | <idle>-0 [000] ..s3 18156.400660: kmalloc: call_site=ffffffff81619b36 ptr=ffff88006d554800 bytes_req=512 bytes_alloc=512 gfp_flags=GFP_ATOMIC | ||
1836 | matchbox-termin-1361 [001] ...1 18156.552800: kmalloc: call_site=ffffffff81614050 ptr=ffff88006db34800 bytes_req=576 bytes_alloc=1024 gfp_flags=GFP_KERNEL|GFP_REPEAT | ||
1837 | </literallayout> | ||
1838 | To again disable the kmalloc event, we need to send 0 to the | ||
1839 | enable file: | ||
1840 | <literallayout class='monospaced'> | ||
1841 | root@sugarbay:/sys/kernel/debug/tracing/events/kmem/kmalloc# echo 0 > enable | ||
1842 | </literallayout> | ||
1843 | You can enable any number of events or complete subsystems | ||
1844 | (by using the 'enable' file in the subsystem directory) and | ||
1845 | get am arbitrarily fine-grained idea of what's going on in the | ||
1846 | system by enabling as many of the appropriate tracepoints | ||
1847 | as applicable. | ||
1848 | </para> | ||
1849 | |||
1850 | <para> | ||
1851 | A number of the tools described in this HOWTO do just that, | ||
1852 | including trace-cmd and kernelshark in the next section. | ||
1853 | </para> | ||
1854 | |||
1855 | <note> | ||
1856 | Tying It Together: These tracepoints and their representation | ||
1857 | are used not only by ftrace, but by many of the other tools | ||
1858 | covered in this document and they form a central point of | ||
1859 | integration for the various tracers available in Linux. | ||
1860 | They form a central part of the instrumentation for the | ||
1861 | following tools: perf, lttng, ftrace, blktrace and SystemTap | ||
1862 | </note> | ||
1863 | |||
1864 | <note> | ||
1865 | Tying It Together: Eventually all the special-purpose tracers | ||
1866 | currently available in /sys/kernel/debug/tracing will be | ||
1867 | removed and replaced with equivalent tracers based on the | ||
1868 | 'trace events' subsystem. | ||
1869 | </note> | ||
1870 | </section> | ||
1871 | |||
1872 | <section id='trace-cmd-kernelshark'> | ||
1873 | <title>trace-cmd/kernelshark</title> | ||
1874 | |||
1875 | <para> | ||
1876 | trace-cmd is essentially an extensive command-line 'wrapper' | ||
1877 | interface that hides the details of all the individual files | ||
1878 | in /sys/kernel/debug/tracing, allowing users to specify | ||
1879 | specific particular events within the | ||
1880 | /sys/kernel/debug/tracing/events/ subdirectory and to collect | ||
1881 | traces and avoiding having to deal with those details directly. | ||
1882 | </para> | ||
1883 | |||
1884 | <para> | ||
1885 | As yet another layer on top of that, kernelshark provides a GUI | ||
1886 | that allows users to start and stop traces and specify sets | ||
1887 | of events using an intuitive interface, and view the | ||
1888 | output as both trace events and as a per-cpu graphical | ||
1889 | display. It directly uses 'trace-cmd' as the plumbing | ||
1890 | that accomplishes all that underneath the covers (and | ||
1891 | actually displays the trace-cmd command it uses, as we'll see). | ||
1892 | </para> | ||
1893 | |||
1894 | <para> | ||
1895 | To start a trace using kernelshark, first start kernelshark: | ||
1896 | <literallayout class='monospaced'> | ||
1897 | root@sugarbay:~# kernelshark | ||
1898 | </literallayout> | ||
1899 | The bring up the 'Capture' dialog by choosing from the | ||
1900 | kernelshark menu: | ||
1901 | <literallayout class='monospaced'> | ||
1902 | Capture | Record | ||
1903 | </literallayout> | ||
1904 | That will display the following dialog, which allows you to | ||
1905 | choose on or more events (or even one or more complete | ||
1906 | subsystems) to trace: | ||
1907 | </para> | ||
1908 | |||
1909 | <para> | ||
1910 | <imagedata fileref="figures/kernelshark-choose-events.png" width="6in" depth="7in" align="center" scalefit="1" /> | ||
1911 | </para> | ||
1912 | |||
1913 | <para> | ||
1914 | Note that these are exactly the same set of events described | ||
1915 | in the previous trace events subsystem section, and in fact | ||
1916 | is where trace-cmd gets them for kernelshark. | ||
1917 | </para> | ||
1918 | |||
1919 | <para> | ||
1920 | In the above screenshot, we've decided to explore the | ||
1921 | graphics subsystem a bit and so have chosen to trace all | ||
1922 | the tracepoints contained within the 'i915' and 'drm' | ||
1923 | subsystems. | ||
1924 | </para> | ||
1925 | |||
1926 | <para> | ||
1927 | After doing that, we can start and stop the trace using | ||
1928 | the 'Run' and 'Stop' button on the lower right corner of | ||
1929 | the dialog (the same button will turn into the 'Stop' | ||
1930 | button after the trace has started): | ||
1931 | </para> | ||
1932 | |||
1933 | <para> | ||
1934 | <imagedata fileref="figures/kernelshark-output-display.png" width="6in" depth="7in" align="center" scalefit="1" /> | ||
1935 | </para> | ||
1936 | |||
1937 | <para> | ||
1938 | Notice that the right-hand pane shows the exact trace-cmd | ||
1939 | command-line that's used to run the trace, along with the | ||
1940 | results of the trace-cmd run. | ||
1941 | </para> | ||
1942 | |||
1943 | <para> | ||
1944 | Once the 'Stop' button is pressed, the graphical view magically | ||
1945 | fills up with a colorful per-cpu display of the trace data, | ||
1946 | along with the detailed event listing below that: | ||
1947 | </para> | ||
1948 | |||
1949 | <para> | ||
1950 | <imagedata fileref="figures/kernelshark-i915-display.png" width="6in" depth="7in" align="center" scalefit="1" /> | ||
1951 | </para> | ||
1952 | |||
1953 | <para> | ||
1954 | Here's another example, this time a display resulting | ||
1955 | from tracing 'all events': | ||
1956 | </para> | ||
1957 | |||
1958 | <para> | ||
1959 | <imagedata fileref="figures/kernelshark-all.png" width="6in" depth="7in" align="center" scalefit="1" /> | ||
1960 | </para> | ||
1961 | |||
1962 | <para> | ||
1963 | The tool is pretty self-explanatory, but for more detailed | ||
1964 | information on navigating through the data, see the | ||
1965 | <ulink url='http://rostedt.homelinux.com/kernelshark/'>kernelshark website</ulink>. | ||
1966 | </para> | ||
1967 | </section> | ||
1968 | |||
1969 | <section id='ftrace-documentation'> | ||
1970 | <title>Documentation</title> | ||
1971 | |||
1972 | <para> | ||
1973 | The documentation for ftrace can be found in the kernel | ||
1974 | Documentation directory: | ||
1975 | <literallayout class='monospaced'> | ||
1976 | Documentation/trace/ftrace.txt | ||
1977 | </literallayout> | ||
1978 | The documentation for the trace event subsystem can also | ||
1979 | be found in the kernel Documentation directory: | ||
1980 | <literallayout class='monospaced'> | ||
1981 | Documentation/trace/events.txt | ||
1982 | </literallayout> | ||
1983 | There are a nice series of articles on using | ||
1984 | ftrace and trace-cmd at LWN: | ||
1985 | <itemizedlist> | ||
1986 | <listitem><para><ulink url='http://lwn.net/Articles/365835/'>Debugging the kernel using Ftrace - part 1</ulink> | ||
1987 | </para></listitem> | ||
1988 | <listitem><para><ulink url='http://lwn.net/Articles/366796/'>Debugging the kernel using Ftrace - part 2</ulink> | ||
1989 | </para></listitem> | ||
1990 | <listitem><para><ulink url='https://lwn.net/Articles/410200/'>trace-cmd: A front-end for Ftrace</ulink> | ||
1991 | </para></listitem> | ||
1992 | </itemizedlist> | ||
1993 | </para> | ||
1994 | |||
1995 | <para> | ||
1996 | There's more detailed documentation kernelshark usage here: | ||
1997 | <ulink url='http://rostedt.homelinux.com/kernelshark/'>KernelShark</ulink> | ||
1998 | </para> | ||
1999 | |||
2000 | <para> | ||
2001 | An amusing yet useful README (a tracing mini-HOWTO) can be | ||
2002 | found in /sys/kernel/debug/tracing/README. | ||
2003 | </para> | ||
2004 | </section> | ||
2005 | </section> | ||
2006 | |||
1343 | </chapter> | 2007 | </chapter> |
1344 | <!-- | 2008 | <!-- |
1345 | vim: expandtab tw=80 ts=4 | 2009 | vim: expandtab tw=80 ts=4 |