[ocaml-platform] Benchmarking in OPAM

Török Edwin edwin at etorok.net
Tue Mar 19 20:31:42 GMT 2013


On 03/19/2013 07:43 PM, Thomas Gazagnaire wrote:
>>> So for the moment, focussing on the benchmark library would seem to
>>> be the best thing to do: I've not really used any of them, and would
>>> be interested in knowing what I ought to adopt for (e.g.) the Mirage
>>> protocol libraries.  Once we have that in place, the OPAM test integration
>>> should be much more straightforward.
>>
>> Maybe this is also a good time to promote a single benchmarking
>> framework.  Like the other two libraries mentioned by Török Edwin,
>> Benchmark computes the mean and std deviation; it just does not
>> expose them to the user.  All these libs have a lot in common and,
>> IMHO, it would be best to merge the features of the 3 libraries.

FWIW, edobench is not a benchmarking library per se; sorry for the bad name choice.
It is the name I've given to my set of OCaml benchmarks, which includes a wrapper on top of bench,
a benchmark runner, the benchmarks themselves, and a simple (text) post-processor.

It is not my intention to fork or merge bench and benchmark, but since both you and Edgar have expressed interest in merging the libraries, I can spend some time
helping with that.

As a start I was thinking:
 - prepare a patch on top of bench that includes the minimum needed to run/store/read the benchmark data (just the raw timings, not the stats themselves; see the sketch after this list)
 - prepare a patch on top of benchmark to add/move some bench features related to running/timing the benchmarks (mostly the increased timer resolution)
 - prepare a patch on top of bench to expose its statistical computations
 - update my code to use the above, and move the statistical tests to the benchmark manager and to the layout proposed by Gabriel
 - finish the ocaml-re benchmarks that I promised
 - ... merge bench into benchmark eventually, but I'm probably not the right person to do that
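To make the first item concrete, this is roughly the interface I have in mind; all the names below are placeholders of my own, not bench's or benchmark's actual API:

  module type Bench_data = sig
    type sample = {
      name    : string;       (* benchmark identifier *)
      timings : float array;  (* raw per-iteration timings, in seconds *)
    }

    val run  : name:string -> iters:int -> (unit -> unit) -> sample
    val save : out_channel -> sample -> unit  (* append one line of data *)
    val load : in_channel -> sample list      (* read back for post-processing *)
  end

Keeping the stored data down to raw timings is what lets the statistics move out of the library.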

When I have a draft of the above, should I post the git URL to the platform list, or Cc everyone individually as well?
(i.e. is everyone subscribed there?)

>> I agree with the proposed split.  As I understand it:
>>
>> - Benchmark: type defining what is a benchmarking sample, functions to
>>  write and read it to a simple format [more complex formats,
>>  e.g. XML, can be supported by Benchmark_manager], functions to
>>  perform tests (min number of samples, min running time).

I was using a simple CSV-like format as it is quite simple to deal with without additional dependencies
(e.g. Printf/Scanf with %S, etc.).
Currently I'm storing the statistics, but perhaps storing the raw timings is a better idea (they don't take up that much space), as that allows
various independent post-processing to happen in the benchmark manager (like plotting raw timings, CDFs, boxplots with the median as well as the mean, etc.),
without having to bother the benchmark library with all those details.
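Concretely, something like this minimal sketch is what I mean (illustrative names, plain stdlib only; not code from either library):

  (* One line per sample: a %S-quoted benchmark name followed by the
     raw timings.  %S takes care of escaping, so names may contain
     commas or quotes. *)
  let save_sample oc name timings =
    Printf.fprintf oc "%S" name;
    Array.iter (fun t -> Printf.fprintf oc ",%.9f" t) timings;
    output_char oc '\n'

  let load_sample line =
    let ib = Scanf.Scanning.from_string line in
    let name = Scanf.bscanf ib "%S" (fun n -> n) in
    let rec timings acc =
      if Scanf.Scanning.end_of_input ib then List.rev acc
      else timings (Scanf.bscanf ib ",%f" (fun t -> t) :: acc)
    in
    (name, Array.of_list (timings []))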

>>
>> - Benchmark_manager: all the rest. Statistical tests,... as already
>>  described by Gabriel.

For a general-purpose benchmark library I think the statistics belong in the benchmark library (as is the case now with both libs),
at least as a module, but I tend to agree that for the purpose of compiler benchmarking here a separate (or at least separable) statistical module is a good idea.
That way the choice of base benchmarking library doesn't dictate which statistics to use, and it'd be possible
to plot the results at a later time without being limited by the statistics computed in the benchmark library.
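By a separable statistical module I mean something that only consumes raw timings, so the manager can pick its statistics independently of the runner.  A minimal sketch (my own illustrative code, taken from neither library):

  module Stats = struct
    let mean t =
      Array.fold_left (+.) 0. t /. float_of_int (Array.length t)

    (* Sample standard deviation; assumes at least two timings. *)
    let std_dev t =
      let m = mean t in
      let sq = Array.fold_left (fun a x -> a +. (x -. m) ** 2.) 0. t in
      sqrt (sq /. float_of_int (Array.length t - 1))

    (* Median is more robust to outliers than the mean, hence the
       boxplots mentioned above. *)
    let median t =
      let s = Array.copy t in
      Array.sort compare s;
      let n = Array.length s in
      if n mod 2 = 1 then s.(n / 2)
      else (s.(n / 2 - 1) +. s.(n / 2)) /. 2.
  end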

>>
> 
> FWIW, a few months ago Pierre also started working on a bench framework (which uses OPAM to set up some kind of environment for the benchmarks):
> 
> https://github.com/chambart/ocp-bench

I was not aware of that until recently, but it appears to have a hard dependency on lwt.
I do have quite a few lwt benchmarks myself, but not all benchmarks need lwt, and I think it is quite important that the benchmark library have minimal dependencies
so it can be rebuilt quickly (lwt also has some optional dependencies, and enabling/disabling those would then trigger a rebuild of every benchmark).

Can the lwt dependency be turned into something optional? In that case I would definitely be interested in seeing if we can share some code
between ocp-bench and what I have currently.
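One way it could be made optional (purely a sketch of the idea; none of these names are ocp-bench's actual API) is to functorize the runner over a minimal monad signature, with an identity instance for plain benchmarks and the Lwt instance in an optional sub-package:

  module type Monad = sig
    type 'a t
    val return : 'a -> 'a t
    val bind   : 'a t -> ('a -> 'b t) -> 'b t
  end

  (* Identity instance: plain benchmarks, no lwt anywhere. *)
  module Id : Monad with type 'a t = 'a = struct
    type 'a t = 'a
    let return x = x
    let bind x f = f x
  end

  module Make_runner (M : Monad) = struct
    (* Time one run; a real runner would repeat and collect all
       timings.  Uses Unix.gettimeofday for the sketch. *)
    let time (f : unit -> unit M.t) : float M.t =
      let t0 = Unix.gettimeofday () in
      M.bind (f ()) (fun () -> M.return (Unix.gettimeofday () -. t0))
  end

  (* The Lwt instance would live in the optional sub-package:
     module Lwt_monad = struct
       type 'a t = 'a Lwt.t
       let return = Lwt.return
       let bind = Lwt.bind
     end *)

That way only the Lwt instance depends on lwt, and toggling lwt's optional dependencies would only rebuild the lwt benchmarks, not all of them.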

Best regards,
--Edwin

