alain.frisch at lexifi.com
Mon Feb 11 10:06:48 GMT 2013
On 02/08/2013 07:13 AM, Alain Frisch wrote:
> It is certainly a good idea to do some benchmarks with the -ppx, I'll
> try to find some time to do so.
Here we go.
I've done some timings for compiling typing/typecore.ml from the OCaml
distribution with ocamlopt.opt, applying the sedlex AST mapper several
0.43s ocamlopt.opt, no ppx
0.73s ocamlopt.opt, 10 -ppx sedlex rewriters
0.98s ocamlopt.opt, 20 -ppx sedlex rewriters
1.70s ocamlopt.opt, 40 -ppx sedlex rewriters
0.54s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 10 times
0.58s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 20 times
0.68s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 40 times
0.84s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times
0.57s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times,
with OCAMLRUNPARAM=s=32M (for ppx_driver only, not ocamlopt.opt itself)
0.45s ocamlopt.opt, sedlex dynlinked by ppx_driver and run 80 times,
with OCAMLRUNPARAM=s=32M for both ppx_driver and ocamlopt.opt itself
(Timings are user time, averaged over 5 runs, on a Linux 64-bit system.
Variance is quite high, though, so this should be taken with a grain
Note: ppx_driver dynlinks sedlex.cmxs only once and runs it N times.
For big values of N, this gives a performance profile similar to static
- Running AST rewriters as independent processes adds about 7% of
compilation time with ocamlopt.opt
- If we run a single process without marshaling the AST between
rewriters, the marginal cost goes below 1% per rewriter. (This is,
basically, the cost of iterating over the AST with objects, matching
over each expression to find extension "markers", and rebuilding a deep
copy of the AST.)
- This can be reduced a lot with proper configuration of the GC (down to
about 0.3% of compilation time with ocamlopt.opt, with the same GC config).
- It would be interesting to do some timings when dynlinking a lot of
different .cmxs plugins in ppx_drivers.
(If we link the rewriters statically with a custom version of the
compiler itself, assuming a new hook in the compiler allows to plug AST
rewriter, we would only pay the marginal cost < 1%.)
For a code base like Jane Street's where many "extensions" have to be
used everywhere, I suspect that the "independent" ppx processes might
become an issue (or maybe not, if compared to camlp4), but the following
solutions would work:
- Statically linking a "big rewriter" called with -ppx.
- Statically linking a version of ocamlopt.opt + all rewriters.
- Dynamically linking all rewriters with a single ppx driver (to be
confirmed, there would be a small overhead for the dynamic linking
itself in addition to the 0.3% marginal cost above).
More information about the wg-camlp4