[wg-parallel] About Lwt and Async

Anil Madhavapeddy anil at recoil.org
Mon Apr 29 16:34:56 BST 2013


On 29 Apr 2013, at 16:12, Jeremie Dimino <jdimino at janestreet.com> wrote:

> On Mon, Apr 29, 2013 at 3:57 PM, Anil Madhavapeddy <anil at recoil.org> wrote:
>> Got it. I don't have a feel for the performance impact of such deferred
>> scheduling, except that the difference between busy-spinning if there are
>> outstanding requests, vs dropping into select/kqueue/epoll more frequently
>> is very significant.
> 
> As Stephen said, there shouldn't be more select/kqueue/epoll calls.
> The idea is to run all jobs until there are none left (with a limit)
> before doing the blocking select/kqueue/epoll call.
> 
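The batching described above (run queued jobs up to a limit, then make
one select call) could be sketched in C roughly as follows. This is not
Async's or Lwt's actual scheduler; the queue layout and the names
job_queue, queue_push, run_jobs, run_cycle and count_job are all
hypothetical, chosen only for illustration.

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical fixed-size FIFO of pending jobs (illustrative only). */
#define QUEUE_CAP 64

typedef void (*job_fn)(void *arg);

struct job { job_fn fn; void *arg; };

struct job_queue {
    struct job jobs[QUEUE_CAP];
    size_t head;   /* index of the next job to run */
    size_t count;  /* number of queued jobs */
};

/* Append a job; returns 0 on success, -1 if the queue is full. */
int queue_push(struct job_queue *q, job_fn fn, void *arg)
{
    if (q->count == QUEUE_CAP)
        return -1;
    size_t tail = (q->head + q->count) % QUEUE_CAP;
    q->jobs[tail].fn = fn;
    q->jobs[tail].arg = arg;
    q->count++;
    return 0;
}

/* Run queued jobs until none are left or `limit` jobs have run.
 * Jobs may push further jobs; those count against the same limit.
 * Returns the number of jobs executed. */
size_t run_jobs(struct job_queue *q, size_t limit)
{
    size_t ran = 0;
    while (q->count > 0 && ran < limit) {
        struct job j = q->jobs[q->head];
        q->head = (q->head + 1) % QUEUE_CAP;
        q->count--;
        j.fn(j.arg);
        ran++;
    }
    return ran;
}

/* One scheduler cycle: drain jobs first, then make a single select() call.
 * If the limit was hit and jobs remain, poll with a zero timeout so the
 * remaining jobs are not delayed; otherwise block until an fd is ready. */
int run_cycle(struct job_queue *q, size_t limit, int nfds, fd_set *readfds)
{
    run_jobs(q, limit);
    struct timeval zero = { 0, 0 };
    struct timeval *timeout = (q->count > 0) ? &zero : NULL;
    return select(nfds, readfds, NULL, NULL, timeout);
}

/* Example job: increments the int counter passed as its argument. */
void count_job(void *arg)
{
    ++*(int *)arg;
}
```

With this shape the number of select calls per cycle stays at one no
matter how many jobs are queued; the limit only bounds how long IO
readiness checks can be postponed by a long chain of jobs.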
>> For example, the Arakoon folks anecdotally reported a 20x performance loss
>> between a direct Unix implementation of their database layer vs an Lwt
>> one.  A loss that large can only be explained by context switching or
>> pathological scheduling somewhere (given we know that Lwt doesn't result
>> in a lot more allocation on the major heap).
>> http://www.slideshare.net/eikke/arakoon (slide 14/15)
>> (I'm CCing Romain, who might have details).
> 
> I believe this was about the disk IO.  Disk IO is done using
> preemptive threads since Unix doesn't support asynchronous disk IO.

Right... and at the block level, not the filesystem level.  Many
modern operating systems do expose a non-POSIX async block interface
that would be better to bind against than POSIX AIO, such as libaio
on Linux.

However, the choice of block scheduler depends on whether your writes
are page-aligned (so that O_DIRECT is suitable), and on whether you
need kernel-level elevator scheduling (which often does more harm than
good on VMs or large RAID arrays with many spindles).
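As a rough illustration of the page-alignment condition, here is a
minimal C sketch that checks whether a write would qualify and
allocates a suitably aligned buffer.  The helper names
direct_io_aligned and alloc_page_aligned are hypothetical, and using
the page size as the alignment is an assumption taken from the text
above; the exact requirement depends on the kernel, filesystem and
device.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* True when the buffer address, the length and the file offset are all
 * multiples of `align` (assumed here to be the page size). */
bool direct_io_aligned(const void *buf, size_t len, off_t offset, size_t align)
{
    return align != 0
        && (uintptr_t)buf % align == 0
        && len % align == 0
        && offset >= 0
        && (uintmax_t)offset % align == 0;
}

/* Allocate a page-aligned buffer whose size is `len` rounded up to a
 * whole number of pages.  Stores the rounded size in *rounded_len.
 * Returns NULL on failure; release the buffer with free(). */
void *alloc_page_aligned(size_t len, size_t *rounded_len)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return NULL;
    size_t p = (size_t)page;
    size_t rounded = (len + p - 1) / p * p;
    void *buf = NULL;
    if (posix_memalign(&buf, p, rounded) != 0)
        return NULL;
    *rounded_len = rounded;
    return buf;
}
```

A caller could run the check before choosing between an O_DIRECT file
descriptor and a buffered one, falling back to the buffered path for
any write that fails it.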

> When data are cached it is indeed much slower.  You can get about the
> same as direct IO after some tweaking: set the async_method to
> 'switch' and force the process to run only on one cpu. But the switch
> method doesn't work with the threaded runtime.

Yeah, I'd say that most high-performance databases gave up trying to do
anything useful with POSIX AIO a long time ago, and are mostly O_DIRECT
based.  At the filesystem layer, there are similar extensions in Linux
to query the internal details of filesystem layouts via fiemap [1,2].

One of our PhD students has also been noticing similar performance
artefacts for *network* traffic in the matrix of operating system,
destination (localhost vs remote) and protocol (TCP/UDP/pipe/shmem).

All of this makes me worry that we're entering a mire of cross-platform
issues as Async goes more open.  libuv [3], for example, takes care of
many of these async-IO issues, and has a large userbase thanks to our
JavaScript-loving friends at Node.js.  I wonder if a good intern project
would be to bind libuv to Async and test its performance profile vs
the current tree...


[1] https://github.com/Incubaid/baardskeerder/blob/master/src/posix.c
[2] http://lwn.net/Articles/297696/

