Using wallClockProfiler
This profiler is a wrapper around GDB that samples the main thread by taking backtraces and reports statistics based on wall-clock time, so time spent waiting for I/O is included.
Keep in mind that the profiled program is paused while GDB gets a backtrace, so don't use very high sampling rates. Also keep an eye on GDB's CPU usage.
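To get a feel for what a given sampling rate means in practice, the sketch below computes the approximate time between samples and lists the CPU usage of any running GDB processes. The helper names `sample_interval_ms` and `gdb_cpu_usage` are made up for this example, and `ps -C` assumes the Linux procps version of `ps`.

```shell
# Hypothetical helpers, not part of wallClockProfiler.

# Approximate time between samples, in whole milliseconds, for a given
# sampling rate in samples per second (integer division).
sample_interval_ms() {
    echo $((1000 / $1))
}

# Show PID, CPU percentage and command line of every running gdb process.
# Assumes procps "ps" (Linux); shows no process rows if gdb is not running.
gdb_cpu_usage() {
    ps -C gdb -o pid,pcpu,args
}
```

At 19 samples per second, `sample_interval_ms 19` prints 52. If a single backtrace takes a sizeable fraction of that interval, the program spends much of its time paused, which is a hint to lower the rate.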
Get the software
In the same parent directory as nim-beacon-chain:
$ ls
nim-beacon-chain
$ git clone https://github.com/jasonrohrer/wallClockProfiler.git
$ cd wallClockProfiler
$ g++ $CXXFLAGS -o wallClockProfiler wallClockProfiler.cpp
$ cd ..
Flame graph support for wallClockProfiler output lives in a pull request that upstream is not expected to merge any time soon, so clone the fork and check out its branch:
$ git clone https://github.com/stefantalpalaru/FlameGraph.git
$ cd FlameGraph
$ git checkout wallClockProfiler
$ cd ..
Profile beacon_node
Start a Medalla node and get the beacon_node PID:
$ cd nim-beacon-chain
$ make medalla
# wait until beacon_node starts running, then run from another terminal tab:
$ ps -e -o pid,command | grep '[b]eacon_node'
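If you would rather capture the PID in a variable than read it off the `ps` output, something like the following may work. `first_pid` is a hypothetical helper name, and the sketch assumes `pidof` is installed and that the process is named `beacon_node`.

```shell
# Hypothetical helper: print the first PID that pidof reports for a program
# name, or nothing if no such process is running.
first_pid() {
    pidof "$1" | awk '{ print $1 }'
}
```

For example, `BEACON_PID=$(first_pid beacon_node)` stores the PID for use in the profiler command line.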
Assuming the PID you got is 1234, run the profiler for 5 minutes, at 19 samples per second:
$ ../wallClockProfiler/wallClockProfiler 19 build/beacon_node 1234 300 > profile_19_300.txt
# make the flame graph
$ ../FlameGraph/stackcollapse-wcp.pl profile_19_300.txt > profile_19_300_collapsed.txt
$ ../FlameGraph/flamegraph.pl profile_19_300_collapsed.txt > profile_19_300.svg
Now you can open "profile_19_300.svg" in a browser and click on individual functions to zoom in.
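The three commands above can be wrapped in a small shell function so that the rate and duration only have to be typed once. This is a minimal sketch: `profile_prefix` and `profile_beacon_node` are hypothetical names, and the relative paths assume you run it from the nim-beacon-chain directory with the two repositories cloned as siblings, as described earlier.

```shell
# Hypothetical wrapper around the profiling and flame graph commands.
# Usage: profile_beacon_node PID [RATE] [SECONDS]

# Build the common file name prefix, e.g. "profile_19_300".
profile_prefix() {
    printf 'profile_%s_%s' "$1" "$2"
}

profile_beacon_node() {
    pid=$1
    rate=${2:-19}    # samples per second, defaults to 19
    secs=${3:-300}   # profiling duration in seconds, defaults to 300
    prefix=$(profile_prefix "$rate" "$secs")

    # Sample the running process, then collapse the stacks and render the SVG.
    # Each step only runs if the previous one succeeded.
    ../wallClockProfiler/wallClockProfiler "$rate" build/beacon_node "$pid" "$secs" > "$prefix.txt" &&
    ../FlameGraph/stackcollapse-wcp.pl "$prefix.txt" > "${prefix}_collapsed.txt" &&
    ../FlameGraph/flamegraph.pl "${prefix}_collapsed.txt" > "$prefix.svg"
}
```

Running `profile_beacon_node 1234` should then produce the same three files as the individual commands.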
Using perf (also a wall-clock-based profiler)
$ perf record -F 99 --call-graph dwarf,16000 -e task-clock -p $(pidof nimbus_beacon_node)
$ perf script > out.perf
$ ../FlameGraph/stackcollapse-perf.pl out.perf > perf_collapsed.txt
$ ../FlameGraph/flamegraph.pl perf_collapsed.txt > perf.svg
For a way to get file names and line numbers, see: https://github.com/brendangregg/FlameGraph/issues/205#issuecomment-698008393
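When attached to a PID like this, `perf record` keeps sampling until it is interrupted. One common way to bound the session is to append a `-- sleep N` command, so that perf stops when `sleep` exits. The sketch below only prints the resulting command line rather than running it; `perf_record_cmd` is a hypothetical helper name, and the 300-second duration is an arbitrary choice that mirrors the wallClockProfiler run.

```shell
# Hypothetical helper: print a perf record command line that samples PID at
# RATE Hz and stops after SECONDS, when the trailing "sleep" command exits.
# Usage: perf_record_cmd RATE PID SECONDS
perf_record_cmd() {
    printf 'perf record -F %s --call-graph dwarf,16000 -e task-clock -p %s -- sleep %s\n' "$1" "$2" "$3"
}
```

For example, `perf_record_cmd 99 1234 300` prints a command that records PID 1234 for five minutes. The remaining `perf script` and FlameGraph steps stay the same.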