doc/threads.txt @ 0:30782bb1fc04 (tag MEMCACHED_1_2_3, memcached-1.2.3)
author: Maxim Dounin <mdounin@mdounin.ru>
date:   Sun, 23 Sep 2007 03:58:34 +0400

Multithreading support in memcached

OVERVIEW

By default, memcached is compiled as a single-threaded application. This is
the most CPU-efficient mode of operation, and it is appropriate for memcached
instances running on single-processor servers or whose request volume is
low enough that available CPU power is not a bottleneck.

More heavily used memcached instances can benefit from multithreaded mode.
To enable it, use the "--enable-threads" option to the configure script:

./configure --enable-threads

You must have the POSIX thread functions (pthread_*) on your system in order
to use memcached's multithreaded mode.

Once you have a thread-capable memcached executable, you can control the
number of threads using the "-t" option; the default is 4. On a machine
dedicated to memcached, you will typically want one thread per processor
core. Due to memcached's nonblocking architecture, there is no real
advantage to using more threads than the number of CPUs on the machine;
doing so will increase lock contention and is likely to degrade performance.
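
As a concrete illustration, a typical startup line might look like the
following (the -t value matches the core count; the other flags are
illustrative, not prescribed by this document):

```shell
# Run with 4 worker threads on a 4-core box; 64 MB of cache memory,
# listening on the default port.
memcached -t 4 -m 64 -p 11211
```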


INTERNALS

The threading support is mostly implemented as a series of wrapper functions
that protect calls to underlying code with one of a small number of locks.
In single-threaded mode, the wrappers are replaced with direct invocations
of the target code using #define; that is done in memcached.h. This approach
allows memcached to be compiled in either single- or multi-threaded mode.
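
A minimal sketch of that wrapper pattern (the names here are hypothetical,
not memcached's actual symbols): in a threaded build the wrapper takes a
lock around the real call, while a single-threaded build #defines the
wrapper straight to the target function, so it costs nothing.

```c
#include <pthread.h>

/* Hypothetical stand-in for an underlying cache operation
 * (memcached's real functions live in items.c, assoc.c, etc.). */
int do_item_get(int key) { return key * 2; }

#ifdef USE_THREADS
static pthread_mutex_t cache_lock = PTHREAD_MUTEX_INITIALIZER;

/* Threaded build: the wrapper serializes access with a lock. */
int mt_item_get(int key) {
    pthread_mutex_lock(&cache_lock);
    int r = do_item_get(key);
    pthread_mutex_unlock(&cache_lock);
    return r;
}
#else
/* Single-threaded build: the wrapper compiles away to a direct call. */
#define mt_item_get(key) do_item_get(key)
#endif
```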

Each thread has its own instance of libevent ("base" in libevent terminology).
The only direct interaction between threads is for new connections. One of
the threads handles the TCP listen socket; each new connection is passed to
a different thread on a round-robin basis. After that, each thread operates
on its set of connections as if it were running in single-threaded mode,
using libevent to manage nonblocking I/O as usual.
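
The round-robin handoff can be sketched as follows (names and the thread
count are hypothetical; this is just the selection logic, not the actual
pipe-based notification memcached uses to wake the chosen worker):

```c
/* Hypothetical round-robin dispatch: the listener thread picks the
 * next worker index for each accepted connection. */
enum { NUM_WORKER_THREADS = 4 };

static int last_thread = -1;

/* Returns the worker that should take the next new connection. */
int pick_worker(void) {
    last_thread = (last_thread + 1) % NUM_WORKER_THREADS;
    return last_thread;
}
```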

UDP requests are a bit different, since there is only one UDP socket that's
shared by all clients. The UDP socket is monitored by all of the threads.
When a datagram comes in, all the threads that aren't already processing
another request will receive "socket readable" callbacks from libevent.
Only one thread will successfully read the request; the others will go back
to sleep or, on a very busy server, will read whatever other UDP requests
are waiting in the socket buffer. Note that on a moderately busy server
this results in increased CPU consumption, since threads will constantly
wake up and find no input waiting for them. But short of major surgery on
the I/O code, this is not easy to avoid.


TO DO

The locking is currently very coarse-grained. There is, for example, one
lock that protects all the calls to the hashtable-related functions. Since
memcached spends much of its CPU time on command parsing and response
assembly, rather than on managing the hashtable per se, this is not a huge
bottleneck for small numbers of processors. However, the locking will likely
have to be refined if memcached needs to run well on massively parallel
machines.

One cheap optimization to reduce contention on that lock: move the hash value
computation so it occurs before the lock is obtained whenever possible.
Right now the hash is computed at the lowest levels of the functions in
assoc.c. If it were instead computed in memcached.c, then passed along with
the key and length into the items.c code and down into assoc.c, each thread
would hold the hashtable lock for less time.
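
A sketch of the proposed shape, under stated assumptions: the hash function
shown is FNV-1a for illustration (not necessarily the hash memcached uses),
and the names and table size are hypothetical. The point is only that the
hash runs outside the critical section:

```c
#include <stdint.h>
#include <stddef.h>
#include <pthread.h>

/* Illustrative 32-bit FNV-1a hash, computed BEFORE the lock is taken. */
uint32_t hash_key(const char *key, size_t len) {
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 16777619u;
    }
    return h;
}

static pthread_mutex_t hash_lock = PTHREAD_MUTEX_INITIALIZER;

/* The critical section now covers only the table lookup itself. */
int find_bucket(const char *key, size_t len) {
    uint32_t hv = hash_key(key, len);   /* outside the lock */
    pthread_mutex_lock(&hash_lock);
    int bucket = (int)(hv % 1024);      /* stand-in for the chain walk */
    pthread_mutex_unlock(&hash_lock);
    return bucket;
}
```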