doc/threads.txt @ 0:30782bb1fc04 MEMCACHED_1_2_3

memcached-1.2.3
author Maxim Dounin <mdounin@mdounin.ru>
date Sun, 23 Sep 2007 03:58:34 +0400
Multithreading support in memcached

OVERVIEW

By default, memcached is compiled as a single-threaded application. This is
the most CPU-efficient mode of operation, and it is appropriate for memcached
instances that run on single-processor servers or whose request volume is
low enough that available CPU power is not a bottleneck.

More heavily used memcached instances can benefit from multithreaded mode.
To enable it, use the "--enable-threads" option to the configure script:

./configure --enable-threads

You must have the POSIX thread functions (pthread_*) on your system in order
to use memcached's multithreaded mode.

Once you have a thread-capable memcached executable, you can control the
number of threads using the "-t" option; the default is 4. On a machine
that's dedicated to memcached, you will typically want one thread per
processor core. Due to memcached's nonblocking architecture, there is no
real advantage to using more threads than the number of CPUs on the machine;
doing so will increase lock contention and is likely to degrade performance.
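
For example, on a four-core machine you might build and start a threaded
memcached like this (the -m, -p, and -u values are illustrative settings,
not recommendations; only --enable-threads and -t relate to threading):

```shell
# Build with threading support, then run one worker thread per core.
./configure --enable-threads
make
# -t 4: four threads; -m 64: 64 MB cache; -p 11211: TCP port;
# -u nobody: user to run as (needed only when starting as root).
./memcached -t 4 -m 64 -p 11211 -u nobody
```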


INTERNALS

The threading support is mostly implemented as a series of wrapper functions
that protect calls to underlying code with one of a small number of locks.
In single-threaded mode, the wrappers are replaced with direct invocations
of the target code using #define; that is done in memcached.h. This approach
allows memcached to be compiled in either single- or multi-threaded mode.

Each thread has its own instance of libevent ("base" in libevent terminology).
The only direct interaction between threads is for new connections. One of
the threads handles the TCP listen socket; each new connection is passed to
a different thread on a round-robin basis. After that, each thread operates
on its set of connections as if it were running in single-threaded mode,
using libevent to manage nonblocking I/O as usual.

UDP requests are a bit different, since there is only one UDP socket that's
shared by all clients. The UDP socket is monitored by all of the threads.
When a datagram comes in, all the threads that aren't already processing
another request will receive "socket readable" callbacks from libevent.
Only one thread will successfully read the request; the others will go back
to sleep or, in the case of a very busy server, will read whatever other
UDP requests are waiting in the socket buffer. Note that on moderately busy
servers, this results in increased CPU consumption, since threads will
constantly wake up and find no input waiting for them. But short of major
surgery on the I/O code, this is not easy to avoid.


TO DO

The locking is currently very coarse-grained. There is, for example, one
lock that protects all the calls to the hashtable-related functions. Since
memcached spends much of its CPU time on command parsing and response
assembly, rather than managing the hashtable per se, this is not a huge
bottleneck for small numbers of processors. However, the locking will likely
have to be refined if memcached needs to run well on massively parallel
machines.

One cheap optimization to reduce contention on that lock: move the hash value
computation so it occurs before the lock is obtained whenever possible.
Right now the hash is performed at the lowest levels of the functions in
assoc.c. If it were instead computed in memcached.c, then passed along with
the key and length into the items.c code and down into assoc.c, that would
reduce the amount of time each thread needs to hold the hashtable lock.