This is the mail archive of the systemtap@sourceware.org mailing list for the systemtap project.


Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]
Other format: [Raw text]

Re: stptracer-20060828 has released.


Li Guanglei wrote:
Li Guanglei wrote:

[...]
Here you can see that gBTI is around 23% faster than _stp_printf(). In
fact a long time ago I used oprofile to find out where _stp_printf()
cost cpu time, and it showed that a lot time is spent on _stp_vsnprintf().



Attached is the profiling data I got a minute ago by running
the tests I mentioned in my mail. I only kept the leading few lines of the generated profiling file.


oprofile.all is generated by:
  opreport -p STP_MODULE_DIR -lg  > oprofile.all
oprofile.kernel is generated by:
  opreport /boot/vmlinux-`uname -r`.debug -lg  > oprofile.kernel

- Guanglei



As a comparison, I also paste the profiling data when probing using lkst_getsid.stp. You can see that _stp_vsnprintf used by lket_getsid.stp will cost ~51% total time while _stp_binary_write used by lkst_getsid.stp will only cost ~4.7% total time.


- Guanglei


CPU: ppc64 POWER5, speed 1502.5 MHz (estimated)
Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info                 symbol name
125867   11.6242  (no location information)   restore
93078     8.5960  op_model_power4.c:183       .hypervisor_bucket
91577     8.4574  kprobes.c:442               .kprobe_exceptions_notify
88532     8.1762  sys.c:1351                  .sys_getsid
64783     5.9829  (no location information)   syscall_error_cont
49629     4.5834  (no location information)   program_check_pSeries
43473     4.0149  (no location information)   single_step_pSeries
41871     3.8669  (no location information)   system_call_common
37849     3.4955  (no location information)   program_check_common
37300     3.4448  (no location information)   system_call_pSeries
32540     3.0052  spinlock.c:66               ._read_lock
28474     2.6297  sys.c:133                   .notifier_call_chain
22203     2.0505  (no location information)   .ret_from_except_lite
21443     1.9803  (no location information)   single_step_common
17651     1.6301  (no location information)   .save_nvgprs
16164     1.4928  (no location information)   system_call
15142     1.3984  sstep.c:56                  .emulate_step
13674     1.2628  (no location information)   syscall_exit
13070     1.2071  spinlock.c:154              ._spin_lock
12607     1.1643  traps.c:747                 .program_check_exception
12568     1.1607  sys.c:220                   .atomic_notifier_call_chain
11305     1.0440  pid.c:209                   .find_pid
11031     1.0187  hash.h:33                   .get_kprobe
10543     0.9737  (no location information)   .ret_from_except
9915      0.9157  (no location information)   .__copy_tofrom_user
8693      0.8028  traps.c:502                 .single_step_exception
8685      0.8021  op_model_power4.c:191       .kernel_unknown_bucket
7412      0.6845  buffer.c:1433               .__find_get_block
6475      0.5980  pid.c:258                   .pid_task
5260      0.4858  buffer.c:1269               .(bool, long double,...)(long, short,...)
4351      0.4018  kprobes.c:121               .set_current_kprobe
4336      0.4004  buffer.c:80                 .unlock_buffer
4220      0.3897  relay.c:535                 .relay_switch_subbuf
3945      0.3643  locks.c:29                  .__spin_yield
3349      0.3093  slab.c:3371                 .kmem_cache_free
2929      0.2705  pid.c:271                   .find_task_by_pid_type
2853      0.2635  filemap.c:1978              .generic_file_buffered_write
2280      0.2106  slab.c:3145                 .kmem_cache_alloc
2029      0.1874  wait.c:241                  .bit_waitqueue
1790      0.1653  bitops.c:19                 .find_next_bit
1790      0.1653  (no location information)   .memset
1747      0.1613  filemap.c:507               .unlock_page
1708      0.1577  fs-writeback.c:56           .__mark_inode_dirty
1692      0.1563  spinlock.c:73               ._spin_lock_irqsave
1660      0.1533  mutex.c:193                 .__mutex_lock_slowpath
1649      0.1523  slab.c:2992                 .free_block
1546      0.1428  wait.c:208                  .__wake_up_bit
CPU: ppc64 POWER5, speed 1502.5 MHz (estimated)
Counted CYCLES events (Processor Cycles using continuous sampling) with a unit mask of 0x00 (No unit mask) count 100000
samples  %        linenr info                 image name               app name                 symbol name
125867    8.2421  (no location information)   vmlinux   vmlinux   restore
117896    7.7202  lglzy.c:918                 lglzy.ko                 lglzy                    .enter_probe_1832
93078     6.0950  op_model_power4.c:183       vmlinux   vmlinux   .hypervisor_bucket
91577     5.9967  kprobes.c:442               vmlinux   vmlinux   .kprobe_exceptions_notify
88532     5.7973  sys.c:1351                  vmlinux   vmlinux   .sys_getsid
72270     4.7324  lglzy.c:269                 lglzy.ko                 lglzy                    ._stp_binary_write
65678     4.3008  (no location information)   jbd                      jbd                      (no symbols)
64783     4.2422  (no location information)   vmlinux   vmlinux   syscall_error_cont
49629     3.2498  (no location information)   vmlinux   vmlinux   program_check_pSeries
43473     2.8467  (no location information)   vmlinux   vmlinux   single_step_pSeries
42525     2.7847  lglzy.c:165                 lglzy.ko                 lglzy                    .common_probe_prologue
41871     2.7418  (no location information)   vmlinux   vmlinux   system_call_common
39245     2.5699  lglzy.c:187                 lglzy.ko                 lglzy                    .common_probe_epilogue
37849     2.4785  (no location information)   vmlinux   vmlinux   program_check_common
37300     2.4425  (no location information)   vmlinux   vmlinux   system_call_pSeries
34824     2.2804  (no location information)   ext3                     ext3                     (no symbols)
32540     2.1308  spinlock.c:66               vmlinux   vmlinux   ._read_lock
28474     1.8646  sys.c:133                   vmlinux   vmlinux   .notifier_call_chain
23742     1.5547  (no location information)   libc-2.3.4.so            libc-2.3.4.so            .__GI_getsid
22203     1.4539  (no location information)   vmlinux   vmlinux   .ret_from_except_lite
21443     1.4041  (no location information)   vmlinux   vmlinux   single_step_common
17651     1.1558  (no location information)   vmlinux   vmlinux   .save_nvgprs
16164     1.0585  (no location information)   vmlinux   vmlinux   system_call
15142     0.9915  sstep.c:56                  vmlinux   vmlinux   .emulate_step
13674     0.8954  (no location information)   vmlinux   vmlinux   syscall_exit
13070     0.8559  spinlock.c:154              vmlinux   vmlinux   ._spin_lock
12607     0.8255  traps.c:747                 vmlinux   vmlinux   .program_check_exception
12568     0.8230  sys.c:220                   vmlinux   vmlinux   .atomic_notifier_call_chain
11305     0.7403  pid.c:209                   vmlinux   vmlinux   .find_pid
11031     0.7223  hash.h:33                   vmlinux   vmlinux   .get_kprobe
10543     0.6904  (no location information)   vmlinux   vmlinux   .ret_from_except
10235     0.6702  lglzy.c:326                 lglzy.ko                 lglzy                    .function_lkst_trace_n
9988      0.6540  print.c:91                  lglzy.ko                 lglzy                    ._stp_print_flush
9915      0.6493  (no location information)   vmlinux   vmlinux   .__copy_tofrom_user
9830      0.6437  (no location information)   app                      app                      .thread_func
8693      0.5692  traps.c:502                 vmlinux   vmlinux   .single_step_exception
8685      0.5687  op_model_power4.c:191       vmlinux   vmlinux   .kernel_unknown_bucket
8635      0.5654  (no location information)   oprofiled                oprofiled                (no symbols)
7412      0.4854  buffer.c:1433               vmlinux   vmlinux   .__find_get_block

Index Nav: [Date Index] [Subject Index] [Author Index] [Thread Index]
Message Nav: [Date Prev] [Date Next] [Thread Prev] [Thread Next]