Interesting discovery about perf

In this post I want to discuss something interesting I came across while using perf.

For Stage 2 of the project I used sample.txt, which is a 1GB file of random text.
On aarchie I used the command for ((x=1; x < 8; x++)); do time perf record --call-graph=dwarf -o test$x.data ./bzip2 -c sample.txt > test$x.txt.bz2; done
Interestingly, this command ran without problems on aarchie, but on xerxes and on the RPi3 perf showed the message "Check IO/CPU Overload" and lost some chunks during compression.

I did some Googling about this message, and it seems to be caused by perf sampling too frequently.

An answer on Stack Overflow showed a workaround: lowering the sampling rate with -F 97.
The perf tutorial also describes the -c option, which collects a sample every specified number of occurrences of the event.
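
As a rough sketch of that workaround, the loop from aarchie could be rerun with -F added. The function name profile_runs and the DRY_RUN switch are my own additions for illustration (they let the commands be printed without running perf); the file names are the ones used above.

```shell
# Sketch: rerun the profiling loop at a lower sampling frequency.
# profile_runs and DRY_RUN are hypothetical names, not part of perf.
profile_runs() {
    freq=$1     # samples per second, passed to perf record -F
    runs=$2     # number of repetitions
    x=1
    while [ "$x" -le "$runs" ]; do
        cmd="perf record -F $freq --call-graph=dwarf -o test$x.data ./bzip2 -c sample.txt"
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print what would be executed
            echo "$cmd > test$x.txt.bz2"
        else
            # Word splitting of $cmd is intentional here
            time $cmd > "test$x.txt.bz2"
        fi
        x=$((x + 1))
    done
}
```

Running DRY_RUN=1 profile_runs 97 7 prints the seven commands without executing them. To sample by event count instead of by frequency, the -F part would be swapped for -c with a count.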

On xerxes, -F 97 seemed to work: the "Check IO/CPU Overload" message no longer appeared, and the run time was a lot shorter than with the command used on aarchie.
On the RPi3 I had less luck. With -F 97 it shows the message "perf: interrupt took too long".
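
As far as I understand, the "interrupt took too long" message comes from the kernel, which reacts by lowering the kernel.perf_event_max_sample_rate limit. A small sketch for checking the current limit is below; the helper name max_sample_rate is made up for this post, and the fallback value "unknown" is just my choice for systems where the file is missing.

```shell
# Sketch: print the kernel's current cap on perf sampling frequency.
# max_sample_rate is a hypothetical helper name.
max_sample_rate() {
    f=/proc/sys/kernel/perf_event_max_sample_rate
    if [ -r "$f" ]; then
        cat "$f" 2>/dev/null || echo "unknown"
    else
        # Not Linux, or perf events are not available
        echo "unknown"
    fi
}

max_sample_rate
```

Comparing this value before and after a perf run would show whether the kernel has throttled the rate.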


It is possible that the 3 minute time on aarchie was a result of perf's sampling rate. Since less sampling means fewer interruptions, it makes sense that a lower sampling rate would give a faster time.
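
To get a feel for the difference, here is a back-of-the-envelope calculation of how much stack data perf copies per second with --call-graph=dwarf. It assumes the default stack dump size of 8192 bytes per sample and, for comparison, a default sampling frequency of 4000 Hz; the default may be different on the perf versions I used, so treat the second number as an estimate. The function name is made up for this post.

```shell
# Sketch: estimate bytes of user stack copied per second while sampling.
# stack_bytes_per_sec is a hypothetical helper; 8192 is the assumed
# dwarf stack dump size per sample.
stack_bytes_per_sec() {
    freq=$1
    dump_size=${2:-8192}
    echo $(( $freq * $dump_size ))
}

stack_bytes_per_sec 97      # 794624
stack_bytes_per_sec 4000    # 32768000 (assumed default frequency)
```

Under these assumptions, -F 97 cuts the stack data from roughly 32 MB per second to under 1 MB per second, which fits with both the disappearing overload message and the shorter run time.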

The call graph and the function hot spots look similar to the ones recorded without the -F option.
