root/trunk/server/doc/protocol.txt @ 615

Revision 615, 18.6 kB (checked in by plindner, 2 years ago)

Incorporate "cas" operation developed by Dustin Sallings
<dustin@…> and implemented by Chris Goffinet
<goffinet@…>. This change allows you to do
atomic changes to an existing key.

Protocol
--------

Clients of memcached communicate with the server through TCP connections.
(A UDP interface is also available; details are below under "UDP
protocol.") A given running memcached server listens on some
(configurable) port; clients connect to that port, send commands to
the server, read responses, and eventually close the connection.

There is no need to send any command to end the session. A client may
just close the connection at any moment it no longer needs it. Note,
however, that clients are encouraged to cache their connections rather
than reopen them every time they need to store or retrieve data.  This
is because memcached is specifically designed to work very efficiently
with a very large number (many hundreds, more than a thousand if
necessary) of open connections. Caching connections will eliminate the
overhead associated with establishing a TCP connection (the overhead
of preparing for a new connection on the server side is insignificant
compared to this).

There are two kinds of data sent in the memcache protocol: text lines
and unstructured data.  Text lines are used for commands from clients
and responses from servers. Unstructured data is sent when a client
wants to store or retrieve data. The server will transmit back
unstructured data in exactly the same way it received it, as a byte
stream. The server doesn't care about byte order issues in
unstructured data and isn't aware of them. There are no limitations on
characters that may appear in unstructured data; however, the reader
of such data (either a client or a server) will always know, from a
preceding text line, the exact length of the data block being
transmitted.

Text lines are always terminated by \r\n. Unstructured data is _also_
terminated by \r\n, even though \r, \n or any other 8-bit characters
may also appear inside the data. Therefore, when a client retrieves
data from a server, it must use the length of the data block (which it
will be provided with) to determine where the data block ends, and not
the fact that \r\n follows the end of the data block, even though it
does.
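
As an illustration of the framing rule above, here is a minimal Python
sketch of reading one data block by its announced length. The function
name read_data_block and the in-memory buffer are assumptions made for
this example; they are not part of memcached or of any client library.

```python
def read_data_block(buf: bytes, start: int, length: int):
    """Return (data, next_offset) for a block of `length` bytes at `start`.

    The end of the block is found from `length` alone; the trailing \r\n
    is only checked afterwards, never searched for.
    """
    end = start + length
    if len(buf) < end + 2:
        raise ValueError("incomplete data block")
    if buf[end:end + 2] != b"\r\n":
        raise ValueError("data block is not followed by \\r\\n")
    return buf[start:end], end + 2


# The payload itself contains \r\n, so scanning for the terminator
# would cut it short; using the length does not.
payload = b"a\r\nb"
wire = payload + b"\r\nEND\r\n"
data, offset = read_data_block(wire, 0, len(payload))
```

Here data is the full four-byte payload and offset points at the text
line that follows the block.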

Keys
----

Data stored by memcached is identified with the help of a key. A key
is a text string which should uniquely identify the data for clients
that are interested in storing and retrieving it.  Currently the
length limit of a key is set at 250 characters (of course, normally
clients wouldn't need to use such long keys); the key must not include
control characters or whitespace.
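
A client may want to check keys before sending them. The following
Python sketch shows one way to do that. It assumes that the length
limit is counted in bytes, that an empty key is invalid, and that bytes
above 127 are acceptable; none of these three points is stated above,
and the name is_valid_key is hypothetical.

```python
MAX_KEY_LENGTH = 250  # limit given in the text above


def is_valid_key(key: bytes) -> bool:
    """Check a key against the length and character rules.

    Assumptions of this sketch: length is measured in bytes, the empty
    key is rejected, and bytes above 127 are allowed.
    """
    if not 0 < len(key) <= MAX_KEY_LENGTH:
        return False
    # Reject ASCII control characters (0-31 and 127) and the space
    # character (32); tab, \r and \n fall in the 0-31 range.
    return all(32 < b < 127 or b > 127 for b in key)
```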

Commands
--------

There are three types of commands.

Storage commands (there are four: "set", "add", "replace", and "cas")
ask the server to store some data identified by a key. The client
sends a command line, and then a data block; after that the client
expects one line of response, which will indicate success or failure.

Retrieval commands (there are two: "get" and "gets") ask the server to
retrieve data corresponding to a set of keys (one or more keys in one
request). The client sends a command line, which includes all the
requested keys; after that for each item the server finds it sends to
the client one response line with information about the item, and one
data block with the item's data; this continues until the server
finishes with the "END" response line.

All other commands don't involve unstructured data. In all of them,
the client sends one command line, and expects (depending on the
command) either one line of response, or several lines of response
ending with "END" on the last line.

A command line always starts with the name of the command, followed by
parameters (if any) delimited by whitespace. Command names are
lower-case and are case-sensitive.

Expiration times
----------------

Some commands involve a client sending some kind of expiration time
(relative to an item or to an operation requested by the client) to
the server. In all such cases, the actual value sent may either be
Unix time (number of seconds since January 1, 1970, as a 32-bit
value), or a number of seconds starting from the current time. In the
latter case, this number of seconds may not exceed 60*60*24*30 (number
of seconds in 30 days); if the number sent by a client is larger than
that, the server will consider it to be a real Unix time value rather
than an offset from the current time.
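
The rule above can be sketched in Python as follows. This models only
the text of this section and is not the server's code; the name
resolve_exptime is hypothetical, and the special meaning that some
commands give to the value 0 is deliberately not handled here.

```python
RELATIVE_LIMIT = 60 * 60 * 24 * 30  # number of seconds in 30 days


def resolve_exptime(exptime: int, now: int) -> int:
    """Return the absolute Unix time that `exptime` refers to.

    `now` is the server's current Unix time. Values up to 30 days are
    offsets from `now`; anything larger is taken as Unix time already.
    """
    if exptime > RELATIVE_LIMIT:
        return exptime
    return now + exptime
```

For example, with now = 1000000000 a value of 60 resolves to
1000000060, while a value of 2592001 (one second over 30 days) is
returned unchanged.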


Error strings
-------------

Each command sent by a client may be answered with an error string
from the server. These error strings come in three types:

- "ERROR\r\n"

  means the client sent a nonexistent command name.

- "CLIENT_ERROR <error>\r\n"

  means some sort of client error in the input line, i.e. the input
  doesn't conform to the protocol in some way. <error> is a
  human-readable error string.

- "SERVER_ERROR <error>\r\n"

  means some sort of server error prevents the server from carrying
  out the command. <error> is a human-readable error string. In cases
  of severe server errors, which make it impossible to continue
  serving the client (this shouldn't normally happen), the server will
  close the connection after sending the error line. This is the only
  case in which the server closes a connection to a client.


In the descriptions of individual commands below, these error lines
are not specifically mentioned again, but clients must allow for their
possibility.
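
Since any command may be answered this way, a client will typically
test each response line for the three forms first. A minimal Python
sketch, with the hypothetical name classify_error and an illustrative
error message, might look like this:

```python
def classify_error(line: bytes):
    """Return (kind, message) if `line` is an error line, else None."""
    text = line.rstrip(b"\r\n").decode("ascii", errors="replace")
    if text == "ERROR":
        return ("ERROR", "")
    for kind in ("CLIENT_ERROR", "SERVER_ERROR"):
        if text.startswith(kind + " "):
            # Everything after the first space is the readable message.
            return (kind, text[len(kind) + 1:])
    return None
```

A line such as "STORED\r\n" yields None, so the caller can go on to
interpret it as a normal response.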


Storage commands
----------------

First, the client sends a command line which looks like this:

<command name> <key> <flags> <exptime> <bytes> [<cas unique>]\r\n

- <command name> is "set", "add", "replace", or "cas"

  "set" means "store this data".

  "add" means "store this data, but only if the server *doesn't* already
  hold data for this key".

  "replace" means "store this data, but only if the server *does*
  already hold data for this key".

  "cas" is a check and set operation which means "store this data but
  only if no one else has updated since I last fetched it."

- <key> is the key under which the client asks to store the data

- <flags> is an arbitrary 16-bit unsigned integer (written out in
  decimal) that the server stores along with the data and sends back
  when the item is retrieved. Clients may use this as a bit field to
  store data-specific information; this field is opaque to the server.
  Note that in memcached 1.2.1 and higher, flags may be 32 bits, instead
  of 16, but you might want to restrict yourself to 16 bits for
  compatibility with older versions.

- <exptime> is expiration time. If it's 0, the item never expires
  (although it may be deleted from the cache to make place for other
  items). If it's non-zero (either Unix time or offset in seconds from
  current time), it is guaranteed that clients will not be able to
  retrieve this item after the expiration time arrives (measured by
  server time).

- <bytes> is the number of bytes in the data block to follow, *not*
  including the delimiting \r\n. <bytes> may be zero (in which case
  it's followed by an empty data block).

- <cas unique> is a unique 64-bit value of an existing entry. It is
  used with the "cas" command.

After this line, the client sends the data block:

<data block>\r\n

- <data block> is a chunk of arbitrary 8-bit data of length <bytes>
  from the previous line.

After sending the command line and the data block, the client awaits
the reply, which may be:

- "STORED\r\n", to indicate success.

- "NOT_STORED\r\n" to indicate the data was not stored, but not
  because of an error. This normally means either that the
  condition for an "add" or a "replace" command wasn't met, or that the
  item is in a delete queue (see the "delete" command below).

- "EXISTS\r\n" to indicate that the item you are trying to store with
  a "cas" command has been modified since you last fetched it.
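
To make the layout concrete, the following Python sketch assembles the
bytes a client would send for a storage command. The function name
build_storage_command is hypothetical, and the check that <cas unique>
accompanies "cas" and no other command is this sketch's reading of the
optional field, not something the text above spells out.

```python
def build_storage_command(command, key, flags, exptime, data,
                          cas_unique=None):
    """Return the command line plus data block as one byte string."""
    if command not in ("set", "add", "replace", "cas"):
        raise ValueError("unknown storage command")
    # Assumption of this sketch: the value is sent with "cas" only.
    if (command == "cas") != (cas_unique is not None):
        raise ValueError("cas_unique goes with the cas command only")
    # <bytes> counts the data only, not the delimiting \r\n.
    fields = [command, key, str(flags), str(exptime), str(len(data))]
    if cas_unique is not None:
        fields.append(str(cas_unique))
    return " ".join(fields).encode("ascii") + b"\r\n" + data + b"\r\n"


request = build_storage_command("set", "greeting", 0, 0, b"hello")
```

Here request holds "set greeting 0 0 5\r\nhello\r\n"; the client would
then read a single reply line such as "STORED\r\n".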

Retrieval commands
------------------

The retrieval commands "get" and "gets" operate like this:

get <key>*\r\n
gets <key>*\r\n

- <key>* means one or more key strings separated by whitespace.

After this command, the client expects zero or more items, each of
which is received as a text line followed by a data block. After all
the items have been transmitted, the server sends the string

"END\r\n"

to indicate the end of response.

Each item sent by the server looks like this:

VALUE <key> <flags> <bytes> [<cas unique>]\r\n
<data block>\r\n

- <key> is the key for the item being sent

- <flags> is the flags value set by the storage command

- <bytes> is the length of the data block to follow, *not* including
  its delimiting \r\n

- <cas unique> is a 64-bit integer that uniquely identifies
  this specific item.

- <data block> is the data for this item.

If some of the keys appearing in a retrieval request are not sent back
by the server in the item list this means that the server does not
hold items with such keys (because they were never stored, or stored
but deleted to make space for more items, or expired, or explicitly
deleted by a client).
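
The response format can be parsed along the following lines. This is a
Python sketch over a complete in-memory response, not a full client;
the name parse_get_response and the returned tuple layout are choices
made for this example.

```python
def parse_get_response(buf: bytes) -> dict:
    """Parse a complete get/gets response.

    Returns {key: (flags, data, cas_unique)}; cas_unique is None when
    the VALUE line carries no such field.
    """
    items = {}
    pos = 0
    while True:
        eol = buf.index(b"\r\n", pos)  # ValueError if the line is cut off
        line = buf[pos:eol].decode("ascii")
        pos = eol + 2
        if line == "END":
            return items
        parts = line.split()
        if len(parts) not in (4, 5) or parts[0] != "VALUE":
            raise ValueError("unexpected line: " + line)
        key, flags, length = parts[1], int(parts[2]), int(parts[3])
        cas_unique = int(parts[4]) if len(parts) == 5 else None
        # The block boundary comes from <bytes>, not from scanning.
        data = buf[pos:pos + length]
        if buf[pos + length:pos + length + 2] != b"\r\n":
            raise ValueError("data block is not followed by \\r\\n")
        pos += length + 2
        items[key] = (flags, data, cas_unique)


sample = b"VALUE a 0 3\r\nfoo\r\nVALUE b 5 4 77\r\nx\r\ny\r\nEND\r\n"
items = parse_get_response(sample)
```

Keys that were requested but are absent from the returned dictionary
are the ones the server does not hold.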


Deletion
--------

The command "delete" allows for explicit deletion of items:

delete <key> <time>\r\n

- <key> is the key of the item the client wishes the server to delete

- <time> is the amount of time in seconds (or Unix time until which)
  the client wishes the server to refuse "add" and "replace" commands
  with this key. For this amount of time, the item is put into a
  delete queue, which means that it won't be possible to retrieve it by
  the "get" command, but "add" and "replace" commands with this key
  will also fail (the "set" command will succeed, however). After the
  time passes, the item is finally deleted from server memory.

  The parameter <time> is optional, and, if absent, defaults to 0
  (which means that the item will be deleted immediately and further
  storage commands with this key will succeed).

The response line to this command can be one of:

- "DELETED\r\n" to indicate success

- "NOT_FOUND\r\n" to indicate that the item with this key was not
  found.

See the "flush_all" command below for immediate invalidation
of all existing items.


Increment/Decrement
-------------------

Commands "incr" and "decr" are used to change data for some item
in-place, incrementing or decrementing it. The data for the item is
treated as the decimal representation of a 32-bit unsigned integer. If
the current data value does not conform to such a representation, the
commands behave as if the value were 0. Also, the item must already
exist for incr/decr to work; these commands won't pretend that a
non-existent key exists with value 0; instead, they will fail.

The client sends the command line:

incr <key> <value>\r\n

or

decr <key> <value>\r\n

- <key> is the key of the item the client wishes to change

- <value> is the amount by which the client wants to increase/decrease
  the item. It is a decimal representation of a 32-bit unsigned integer.

The response will be one of:

- "NOT_FOUND\r\n" to indicate the item with this key was not found

- <value>\r\n , where <value> is the new value of the item's data,
  after the increment/decrement operation was carried out.

Note that underflow in the "decr" command is caught: if a client tries
to decrease the value below 0, the new value will be 0.  Overflow in the
"incr" command will wrap around the 32-bit mark.

Note also that decrementing a number such that it loses length isn't
guaranteed to decrement its returned length.  The number MAY be
space-padded at the end, but this is purely an implementation
optimization, so you also shouldn't rely on that.
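
The arithmetic described above can be modelled in a few lines of
Python. This is a sketch of the documented behaviour only, not the
server's implementation: the function names are hypothetical, the
NOT_FOUND case is left out, and treating a stored number too large for
32 bits as non-conforming (and hence as 0) is an assumption.

```python
MASK_32 = 0xFFFFFFFF


def _current(data: bytes) -> int:
    """Read stored data as a decimal 32-bit unsigned integer, else 0."""
    # Trailing space padding is tolerated, as the note above allows.
    text = data.decode("ascii", errors="replace").strip()
    if text.isdigit() and int(text) <= MASK_32:
        return int(text)
    return 0


def incr(data: bytes, delta: int) -> int:
    """New value after "incr": overflow wraps around the 32-bit mark."""
    return (_current(data) + delta) & MASK_32


def decr(data: bytes, delta: int) -> int:
    """New value after "decr": underflow is caught and yields 0."""
    return max(_current(data) - delta, 0)
```

For instance, incr(b"4294967295", 1) gives 0, and decr(b"5", 10) gives
0 as well.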

Statistics
----------

The command "stats" is used to query the server about statistics it
maintains and other internal data. It has two forms. Without
arguments:

stats\r\n

it causes the server to output general-purpose statistics and
settings, documented below.  In the other form it has some arguments:

stats <args>\r\n

Depending on <args>, various internal data is sent by the server. The
kinds of arguments and the data sent are not documented in this version
of the protocol, and are subject to change for the convenience of
memcache developers.


General-purpose statistics
--------------------------

Upon receiving the "stats" command without arguments, the server sends
a number of lines which look like this:

STAT <name> <value>\r\n

The server terminates this list with the line

END\r\n

In each line of statistics, <name> is the name of this statistic, and
<value> is the data.  The following is the list of all names sent in
response to the "stats" command, together with the type of the value
sent for this name, and the meaning of the value.

In the type column below, "32u" means a 32-bit unsigned integer, "64u"
means a 64-bit unsigned integer. '32u:32u' means two 32-bit unsigned
integers separated by a colon.


Name                   Type     Meaning
---------------------------------------
pid                    32u      Process id of this server process
uptime                 32u      Number of seconds this server has been running
time                   32u      Current UNIX time according to the server
version                string   Version string of this server
pointer_size           32u      Default size of pointers on the host OS
                                (generally 32 or 64)
rusage_user            32u:32u  Accumulated user time for this process
                                (seconds:microseconds)
rusage_system          32u:32u  Accumulated system time for this process
                                (seconds:microseconds)
curr_items             32u      Current number of items stored by the server
total_items            32u      Total number of items stored by this server
                                ever since it started
bytes                  64u      Current number of bytes used by this server
                                to store items
curr_connections       32u      Number of open connections
total_connections      32u      Total number of connections opened since
                                the server started running
connection_structures  32u      Number of connection structures allocated
                                by the server
cmd_get                64u      Cumulative number of retrieval requests
cmd_set                64u      Cumulative number of storage requests
get_hits               64u      Number of keys that have been requested and
                                found present
get_misses             64u      Number of items that have been requested
                                and not found
evictions              64u      Number of valid items removed from cache
                                to free memory for new items
bytes_read             64u      Total number of bytes read by this server
                                from network
bytes_written          64u      Total number of bytes sent by this server to
                                network
limit_maxbytes         32u      Number of bytes this server is allowed to
                                use for storage.
threads                32u      Number of worker threads requested.
                                (see doc/threads.txt)
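
A client can collect the STAT lines into a dictionary, for example as
in the Python sketch below. The names parse_stats and rusage_seconds
are hypothetical, the sample values are made up, and the colon-separated
rusage form follows the type column above.

```python
def parse_stats(response: bytes) -> dict:
    """Parse a complete "stats" response into {name: value} strings."""
    stats = {}
    for raw in response.split(b"\r\n"):
        line = raw.decode("ascii")
        if line == "END":
            return stats
        # Split into exactly three fields; the value keeps any spaces.
        tag, name, value = line.split(" ", 2)
        if tag != "STAT":
            raise ValueError("unexpected line: " + line)
        stats[name] = value
    raise ValueError("response did not end with END")


def rusage_seconds(value: str) -> float:
    """Convert a '32u:32u' seconds:microseconds value to seconds."""
    seconds, microseconds = value.split(":")
    return int(seconds) + int(microseconds) / 1000000


sample = (b"STAT pid 1234\r\n"
          b"STAT rusage_user 0:250000\r\n"
          b"END\r\n")
stats = parse_stats(sample)
```

Values are kept as strings here; converting each one according to its
type in the table is left to the caller.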


Other commands
--------------

"flush_all" is a command with an optional numeric argument. It always
succeeds, and the server sends "OK\r\n" in response. Its effect is to
invalidate all existing items immediately (by default) or after the
expiration specified.  After invalidation, none of the items will be
returned in response to a retrieval command (unless an item is stored
again under the same key *after* flush_all has invalidated the items).
flush_all doesn't actually free all the memory taken up by existing
items; that will happen gradually as new items are stored. The most
precise definition of what flush_all does is the following: it causes
all items whose update time is earlier than the time at which flush_all
was set to be executed to be ignored for retrieval purposes.

The intent of flush_all with a delay is that in a setting where you
have a pool of memcached servers, and you need to flush all content,
you have the option of not resetting all memcached servers at the
same time (which could e.g. cause a spike in database load with all
clients suddenly needing to recreate content that would otherwise
have been found in the memcached daemon).

The delay option allows you to have them reset in e.g. 10 second
intervals (by passing 0 to the first, 10 to the second, 20 to the
third, etc.).
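
The staggered reset described above could be prepared as in the
following Python sketch. The function name staggered_flush_commands and
the server labels are hypothetical; only the construction of the
command lines is shown, not the sending.

```python
def staggered_flush_commands(servers, interval=10):
    """Pair each server with a flush_all line delayed by i * interval."""
    return [(server, ("flush_all %d\r\n" % (i * interval)).encode("ascii"))
            for i, server in enumerate(servers)]


plan = staggered_flush_commands(["cache1", "cache2", "cache3"])
```

With the default interval, the three servers receive delays of 0, 10
and 20 seconds respectively.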


"version" is a command with no arguments:

version\r\n

In response, the server sends

"VERSION <version>\r\n", where <version> is the version string for the
server.

"verbosity" is a command with a numeric argument. It always
succeeds, and the server sends "OK\r\n" in response. Its effect is to
set the verbosity level of the logging output.

"quit" is a command with no arguments:

quit\r\n

Upon receiving this command, the server closes the
connection. However, the client may also simply close the connection
when it no longer needs it, without issuing this command.


UDP protocol
------------

For very large installations where the number of clients is high enough
that the number of TCP connections causes scaling difficulties, there is
also a UDP-based interface. The UDP interface does not provide guaranteed
delivery, so it should only be used for operations that aren't required to
succeed; typically it is used for "get" requests where a missing or
incomplete response can simply be treated as a cache miss.

Each UDP datagram contains a simple frame header, followed by data in the
same format as the TCP protocol described above. In the current
implementation, requests must be contained in a single UDP datagram, but
responses may span several datagrams. (The only common requests that would
span multiple datagrams are huge multi-key "get" requests and "set"
requests, both of which are more suitable to TCP transport for reliability
reasons anyway.)

The frame header is 8 bytes long, as follows (all values are 16-bit integers
in network byte order, high byte first):

0-1 Request ID
2-3 Sequence number
4-5 Total number of datagrams in this message
6-7 Reserved for future use; must be 0

The request ID is supplied by the client. Typically it will be a
monotonically increasing value starting from a random seed, but the client
is free to use whatever request IDs it likes. The server's response will
contain the same ID as the incoming request. The client uses the request ID
to differentiate between responses to outstanding requests if there are
several pending from the same server; any datagrams with an unknown request
ID are probably delayed responses to an earlier request and should be
discarded.

The sequence number ranges from 0 to n-1, where n is the total number of
datagrams in the message. The client should concatenate the payloads of the
datagrams for a given response in sequence number order; the resulting byte
stream will contain a complete response in the same format as the TCP
protocol (including terminating \r\n sequences).
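
The frame header and the reassembly rule can be sketched in Python with
the standard struct module. The function names pack_datagram and
reassemble are hypothetical, and returning None for an incomplete
response is just one way of signalling a cache miss.

```python
import struct

# Four 16-bit unsigned integers in network byte order (8 bytes).
HEADER = struct.Struct("!HHHH")


def pack_datagram(request_id, seq, total, payload):
    """Prefix `payload` with the frame header; reserved field is 0."""
    return HEADER.pack(request_id, seq, total, 0) + payload


def reassemble(datagrams, request_id):
    """Join the payloads for `request_id` in sequence number order.

    Returns None if any datagram of the response is missing.
    """
    parts = {}
    total = None
    for datagram in datagrams:
        rid, seq, count, _reserved = HEADER.unpack_from(datagram)
        if rid != request_id:
            continue  # unknown request ID: discard
        total = count
        parts[seq] = datagram[HEADER.size:]
    if total is None or set(parts) != set(range(total)):
        return None
    return b"".join(parts[i] for i in range(total))


first = pack_datagram(7, 0, 2, b"VALUE a 0 3\r\nfo")
second = pack_datagram(7, 1, 2, b"o\r\nEND\r\n")
stray = pack_datagram(6, 0, 1, b"END\r\n")
response = reassemble([second, stray, first], 7)
```

The datagrams are given out of order and mixed with a stray response to
show that the result depends only on the request ID and sequence
numbers.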