[wg-camlp4] benchmarks

Alain Frisch alain.frisch at lexifi.com
Mon Feb 11 10:06:48 GMT 2013


On 02/08/2013 07:13 AM, Alain Frisch wrote:
> It is certainly a good idea to do some benchmarks with the -ppx, I'll
> try to find some time to do so.

Here we go.

I've done some timings for compiling typing/typecore.ml from the OCaml 
distribution with ocamlopt.opt, applying the sedlex AST mapper several 
times.

0.43s  ocamlopt.opt, no ppx
0.73s  ocamlopt.opt, 10 -ppx sedlex rewriters
0.98s  ocamlopt.opt, 20 -ppx sedlex rewriters
1.70s  ocamlopt.opt, 40 -ppx sedlex rewriters
0.54s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 10 times
0.58s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 20 times
0.68s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 40 times
0.84s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times

0.57s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times, 
with OCAMLRUNPARAM=s=32M (for ppx_driver only, not ocamlopt.opt itself)
0.45s  ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times, 
with OCAMLRUNPARAM=s=32M for both ppx_driver and ocamlopt.opt itself

(Timings are user time, averaged over 5 runs, on a Linux 64-bit system.
Variance is quite high, though, so these numbers should be taken with a
grain of salt.)
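
For anyone wanting to collect comparable numbers, here is a minimal sketch of one way to average user time over several runs. It is not the script used for the timings above; the helper name mean_user_time and the command passed to it are illustrative assumptions. It relies on RUSAGE_CHILDREN, which on Linux also accounts for waited-for grandchildren, so separate ppx processes spawned by the compiler should be included.

```python
import resource
import subprocess
import sys


def mean_user_time(cmd, runs=5):
    """Average user CPU time (in seconds) spent by `cmd` over `runs` runs.

    The figure covers the child process and any descendants it waited for.
    """
    total = 0.0
    for _ in range(runs):
        before = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        after = resource.getrusage(resource.RUSAGE_CHILDREN).ru_utime
        total += after - before
    return total / runs


if __name__ == "__main__":
    # Placeholder command: replace with the actual compiler invocation.
    cmd = sys.argv[1:] or [sys.executable, "-c", "pass"]
    print(f"{mean_user_time(cmd):.2f}s user time (mean of 5 runs)")
```

Given the high variance mentioned above, more runs or reporting the minimum instead of the mean may give steadier figures.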

Note: ppx_driver dynlinks sedlex.cmxs only once and runs it N times.
For big values of N, this gives a performance profile similar to static 
linking.

Interpretation:

- Running AST rewriters as independent processes adds about 7% of 
compilation time per rewriter with ocamlopt.opt.
- If we run a single process without marshaling the AST between 
rewriters, the marginal cost goes below 1% per rewriter.  (This is, 
basically, the cost of iterating over the AST with objects, matching 
over each expression to find extension "markers", and rebuilding a deep 
copy of the AST.)
- This can be reduced a lot with proper configuration of the GC (down to 
about 0.3% of compilation time with ocamlopt.opt, with the same GC config).
- It would be interesting to do some timings when dynlinking a lot of 
different .cmxs plugins in ppx_driver.
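
As a rough cross-check, the first two percentages can be re-derived from the timing table by dividing the extra time per additional rewriter by the no-ppx baseline. The sketch below does only that arithmetic; the function name marginal_pct is an illustrative assumption. The GC-tuned figure is left out because the table does not list a no-ppx baseline under the same GC settings.

```python
BASELINE = 0.43  # seconds: ocamlopt.opt, no ppx

# Number of rewriter runs -> user time in seconds, copied from the table.
separate_processes = {0: BASELINE, 10: 0.73, 20: 0.98, 40: 1.70}
single_driver = {10: 0.54, 20: 0.58, 40: 0.68, 80: 0.84}


def marginal_pct(timings, n_lo, n_hi, baseline=BASELINE):
    """Extra time per additional rewriter, as a percentage of the baseline."""
    extra = timings[n_hi] - timings[n_lo]
    return 100.0 * extra / (n_hi - n_lo) / baseline


# Independent -ppx processes, going from 0 to 10 rewriters.
print(f"{marginal_pct(separate_processes, 0, 10):.1f}%")  # prints 7.0%

# Single ppx_driver process, going from 40 to 80 runs.
print(f"{marginal_pct(single_driver, 40, 80):.1f}%")  # prints 0.9%
```

Using the 40-to-80 interval for the driver case keeps the one-off cost of starting the driver and dynlinking the plugin out of the per-rewriter figure.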

(If we link the rewriters statically with a custom version of the 
compiler itself, assuming a new hook in the compiler allows AST 
rewriters to be plugged in, we would only pay the marginal cost < 1%.)

For a code base like Jane Street's where many "extensions" have to be 
used everywhere, I suspect that the "independent" ppx processes might 
become an issue (or maybe not, if compared to camlp4), but the following 
solutions would work:

  - Statically linking a "big rewriter" called with -ppx.
  - Statically linking a version of ocamlopt.opt + all rewriters.
  - Dynamically linking all rewriters with a single ppx driver (to be 
confirmed; there would be a small overhead for the dynamic linking 
itself in addition to the 0.3% marginal cost above).




Alain

