# Profiling

We use [`gperftools`](https://github.com/gperftools/gperftools) (Google
Performance Tools) for profiling. We also considered
[`llvm-xray`](https://llvm.org/docs/XRay.html) but found it lacking in
comparison. gperftools will not tell you how long (in wall-clock time) each
function took, but it will help you determine which functions are the most
expensive.

## Prerequisites

On Linux (Debian), you need to install:
```
sudo apt install google-perftools libgoogle-perftools-dev graphviz
```

On macOS, you need to install (via [homebrew](https://brew.sh)):
```
brew install gperftools ghostscript graphviz
```

## How to run

### Generating profiling graphs

There is a Makefile rule that should just auto-magically work:
```
make profile
```

For each profiled function, this will produce two files: a PROF file and a PDF
file. The PROF file is the raw profiling data, and the PDF is the
human-friendly graph generated from that profiling data.

#### Errors on macOS

Note, on macOS there may be a lot of "errors" like:
```
otool-classic: can't open file: /usr/lib/libc++.1.dylib
```

In my experience, you can ignore these. It's a somewhat known issue and may be
resolved later. The PDFs should still generate successfully. I think it's the
reason some function names appear as hexadecimal addresses, though.

### Viewing profiling graphs

On Linux, you can open an individual PDF file like:
```
xdg-open blob_to_kzg_commitment.pdf
```

On macOS, you can open an individual PDF file like:
```
open blob_to_kzg_commitment.pdf
```

Or, you can open all the PDF files like:
```
open *.pdf
```

### Interpreting the profiling graphs

These might not make much sense without guidance. At a high level, the profiler
works by sampling the instruction pointer (what's being executed) at a fixed
rate (gperftools defaults to 100 samples per second unless configured
otherwise) and recording each sample. From this, you can infer the relative
time each function uses by counting the number of samples that land in each
function.
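
To make the sampling idea concrete, here is a toy simulation in Python. It is only a sketch: the timeline, the function names (`setup`, `compute`, `cleanup`), and the sampling interval are made up for illustration, and a real profiler samples with timer interrupts rather than walking a known timeline.

```python
from collections import Counter

# Hypothetical timeline: (function name, duration in microseconds), in
# execution order. These values are invented for illustration.
TIMELINE = [("setup", 2_000), ("compute", 15_000), ("cleanup", 3_000)]


def sample_timeline(timeline, interval_us):
    """Record which function is running at every `interval_us` tick."""
    total = sum(duration for _, duration in timeline)
    counts = Counter()
    for tick in range(0, total, interval_us):
        elapsed = 0
        for name, duration in timeline:
            elapsed += duration
            if tick < elapsed:
                counts[name] += 1
                break
    return counts


counts = sample_timeline(TIMELINE, interval_us=100)
total_samples = sum(counts.values())
for name, n in counts.most_common():
    # The share of samples approximates the share of time spent.
    print(f"{name}: {n} samples ({100 * n / total_samples:.1f}%)")
```

Here `compute` receives 150 of the 200 samples (75.0%), matching its share of the simulated run time. The sample counts say nothing about absolute time on their own; they only rank functions relative to each other.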

Given a box containing:
```
my_func 189 (0.6%) of 28758 (96.8%)
```

* Each box is a unique function.
* Bigger boxes are more expensive.
* Lines between boxes are function calls.
* 189 is the number of profiling samples in this function itself.
* 0.6% is the percentage of all profiling samples in this function itself.
* 28758 is the number of profiling samples in this function and its callees.
* 96.8% is the percentage of profiling samples in this function and its callees.
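
The arithmetic behind these numbers can be checked with a short Python sketch. The `parse_box` helper is hypothetical (it is not part of gperftools) and assumes the label appears on a single line exactly as in the example above.

```python
import re


def parse_box(label):
    """Split a box label into its name, sample counts, and percentages."""
    match = re.fullmatch(
        r"(\S+) (\d+) \(([\d.]+)%\) of (\d+) \(([\d.]+)%\)", label.strip()
    )
    if match is None:
        raise ValueError(f"unrecognized box label: {label!r}")
    name, self_n, self_pct, cum_n, cum_pct = match.groups()
    return {
        "name": name,
        "self_samples": int(self_n),
        "self_percent": float(self_pct),
        "cumulative_samples": int(cum_n),
        "cumulative_percent": float(cum_pct),
    }


box = parse_box("my_func 189 (0.6%) of 28758 (96.8%)")

# Samples that landed in callees only, not in my_func's own code.
callee_samples = box["cumulative_samples"] - box["self_samples"]

# Approximate total samples in the profile, recovered from the rounded
# cumulative percentage, so treat it as an estimate.
estimated_total = round(
    box["cumulative_samples"] / (box["cumulative_percent"] / 100)
)

print(callee_samples, estimated_total)
```

For this box, 28569 of the samples fall in callees and the whole profile contains roughly 29709 samples. A function with a small first number and a large second number, like this one, is cheap itself but sits above expensive callees in the call graph.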