[ocaml-platform] Benchmarking in OPAM

Christophe TROESTLER Christophe.Troestler at umons.ac.be
Tue Mar 19 17:07:41 GMT 2013


On Thu, 14 Mar 2013 10:48:57 +0000, Anil Madhavapeddy wrote:
> 
> > Finally, we realized that we really need two distinct kinds of
> > benchmarking software:
> > - one "benchmark library" that is solely meant to run performance
> > tests and return the results (will be used by and linked with the
> > benchmark programs, so recompiled at each compiler change, so should
> > be rather light if possible)
> > - one "benchmark manager" that compares results between different
> > runs, plots nice graphics, stores results over time or send them to a
> > CI server, format them in XML or what not. This one is run from the
> > system compiler and can have arbitrarily large feature sets and
> > dependencies.
> > 
> > I believe a similar split would be meaningful for unit testing as
> > well. Of course, if you're considering daily automated large-scale
> > package building and checking, instead of tight feedback loops, it is
> > much less compelling to force a split, you can just bundle the two
> > kind of features under the same package.
> 
> The split you describe is generally good discipline, as it encourages
> library authors to encode more small benchmarks that can be called from
> larger tools.
> 
> The benchmark manager is definitely something we want to have in the
> OPAM hosted test system.  It's very difficult to get representative
> benchmark results without a good mix of architectures and operating
> systems, and we're going to pepper lots of odd setups into OPAM (and
> eventually have the facility to crowdsource other people's machines
> into the build pool, to make it easier to contribute resources).
> 
> So for the moment, focussing on the benchmark library would seem to
> be the best thing to do: I've not really used any of them, and would
> be interested in knowing what I ought to adopt for (e.g.) the Mirage
> protocol libraries.  Once we have that in place, the OPAM test integration
> should be much more straightforward.

Maybe this is also a good time to promote a single benchmarking
framework.  Like the other two libraries mentioned by Török Edwin,
Benchmark computes the mean and standard deviation; it just does not
expose them to the user.  All these libraries have a lot in common and,
IMHO, it would be best to merge the features of the three.  I also
agree with the proposed split.  As I understand it:

- Benchmark: a type defining what a benchmarking sample is, functions
  to write and read it in a simple format [more complex formats,
  e.g. XML, can be supported by Benchmark_manager], and functions to
  run the tests (minimum number of samples, minimum running time).
  A rough sketch of such an interface is given after this list.

- Benchmark_manager: all the rest: statistical tests,... as already
  described by Gabriel.

If you need the name, I am happy to deprecate the Benchmark module once
a replacement along those lines has seen the light of day.  Also, if I
can be of some help, just let me know (I just do not want to take the
lead, for lack of time).

Best,
C.

