root/trunk/server/doc/protocol.txt @ 627

Revision 627, 18.9 kB (checked in by plindner, 2 years ago)

update for prepend operation, thread safe version from Maxim replacing Filipe's implementation

Protocol
--------

Clients of memcached communicate with the server through TCP connections.
(A UDP interface is also available; details are below under "UDP
protocol.") A given running memcached server listens on some
(configurable) port; clients connect to that port, send commands to
the server, read responses, and eventually close the connection.

There is no need to send any command to end the session. A client may
just close the connection at any moment it no longer needs it. Note,
however, that clients are encouraged to cache their connections rather
than reopen them every time they need to store or retrieve data.  This
is because memcached is especially designed to work very efficiently
with a very large number (many hundreds, more than a thousand if
necessary) of open connections. Caching connections will eliminate the
overhead associated with establishing a TCP connection (the overhead
of preparing for a new connection on the server side is insignificant
compared to this).

There are two kinds of data sent in the memcache protocol: text lines
and unstructured data.  Text lines are used for commands from clients
and responses from servers. Unstructured data is sent when a client
wants to store or retrieve data. The server will transmit back
unstructured data in exactly the same way it received it, as a byte
stream. The server doesn't care about byte order issues in
unstructured data and isn't aware of them. There are no limitations on
characters that may appear in unstructured data; however, the reader
of such data (either a client or a server) will always know, from a
preceding text line, the exact length of the data block being
transmitted.

Text lines are always terminated by \r\n. Unstructured data is _also_
terminated by \r\n, even though \r, \n or any other 8-bit characters
may also appear inside the data. Therefore, when a client retrieves
data from a server, it must use the length of the data block (which it
will be provided with) to determine where the data block ends, and not
the fact that \r\n follows the end of the data block, even though it
does.

Keys
----

Data stored by memcached is identified with the help of a key. A key
is a text string which should uniquely identify the data for clients
that are interested in storing and retrieving it.  Currently the
length limit of a key is set at 250 characters (of course, normally
clients wouldn't need to use such long keys); the key must not include
control characters or whitespace.

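The key rules above can be checked on the client before a command is
sent. The following is a minimal sketch, not part of memcached: the
name is_valid_key is hypothetical, keys are treated as byte strings,
and rejecting the empty key is an assumption (the text only gives an
upper limit).

```python
MAX_KEY_LENGTH = 250  # limit stated in the text above


def is_valid_key(key: bytes) -> bool:
    """Return True if key meets the length and character rules above."""
    # Assumption: an empty key is rejected; the text only states a maximum.
    if not 0 < len(key) <= MAX_KEY_LENGTH:
        return False
    # Bytes 0-32 are the ASCII control characters and the space; 127 is DEL.
    return all(b > 32 and b != 127 for b in key)
```

For example, is_valid_key(b"user:42") is accepted, while a key that
contains a space or a tab is rejected.
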
Commands
--------

There are three types of commands.

Storage commands (there are six: "set", "add", "replace", "append",
"prepend" and "cas") ask the server to store some data identified by a
key. The client sends a command line, and then a data block; after
that the client expects one line of response, which will indicate
success or failure.

Retrieval commands (there are two: "get" and "gets") ask the server to
retrieve data corresponding to a set of keys (one or more keys in one
request). The client sends a command line, which includes all the
requested keys; after that for each item the server finds it sends to
the client one response line with information about the item, and one
data block with the item's data; this continues until the server
finishes with the "END" response line.

All other commands don't involve unstructured data. In all of them,
the client sends one command line, and expects (depending on the
command) either one line of response, or several lines of response
ending with "END" on the last line.

A command line always starts with the name of the command, followed by
parameters (if any) delimited by whitespace. Command names are
lower-case and are case-sensitive.

Expiration times
----------------

Some commands involve a client sending some kind of expiration time
(relative to an item or to an operation requested by the client) to
the server. In all such cases, the actual value sent may either be
Unix time (number of seconds since January 1, 1970, as a 32-bit
value), or a number of seconds starting from current time. In the
latter case, this number of seconds may not exceed 60*60*24*30 (number
of seconds in 30 days); if the number sent by a client is larger than
that, the server will consider it to be real Unix time value rather
than an offset from current time.

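The 30-day rule can be illustrated with a short sketch of how a value
would be interpreted. This models the rule as described above and is
not the server's code; the name resolve_exptime and the now parameter
are hypothetical. The special meaning that individual commands give to
0 is outside this sketch.

```python
import time
from typing import Optional

THIRTY_DAYS = 60 * 60 * 24 * 30  # 2592000 seconds


def resolve_exptime(exptime: int, now: Optional[int] = None) -> int:
    """Return the absolute Unix time that exptime refers to."""
    if now is None:
        now = int(time.time())
    if exptime > THIRTY_DAYS:
        return exptime       # larger than 30 days: already a Unix time
    return now + exptime     # otherwise an offset from the current time
```

With now=1000, an exptime of 60 resolves to 1060, while an exptime of
2592001 is taken as the Unix time 2592001.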

Error strings
-------------

Each command sent by a client may be answered with an error string
from the server. These error strings come in three types:

- "ERROR\r\n"

  means the client sent a nonexistent command name.

- "CLIENT_ERROR <error>\r\n"

  means some sort of client error in the input line, i.e. the input
  doesn't conform to the protocol in some way. <error> is a
  human-readable error string.

- "SERVER_ERROR <error>\r\n"

  means some sort of server error prevents the server from carrying
  out the command. <error> is a human-readable error string. In cases
  of severe server errors, which make it impossible to continue
  serving the client (this shouldn't normally happen), the server will
  close the connection after sending the error line. This is the only
  case in which the server closes a connection to a client.


In the descriptions of individual commands below, these error lines
are not again specifically mentioned, but clients must allow for their
possibility.

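Since any command may be answered with one of these lines, a client
will usually test each response line for them first. A minimal sketch,
assuming the whole line (including \r\n) has already been read; the
name parse_error is hypothetical.

```python
from typing import Optional, Tuple


def parse_error(line: bytes) -> Optional[Tuple[str, str]]:
    """Return (kind, message) if line is an error string, else None."""
    text = line.rstrip(b"\r\n").decode("ascii", errors="replace")
    if text == "ERROR":
        return ("ERROR", "")
    for kind in ("CLIENT_ERROR", "SERVER_ERROR"):
        if text.startswith(kind + " "):
            # Everything after the first space is the human-readable <error>.
            return (kind, text[len(kind) + 1:])
    return None
```

A result of None means the line is an ordinary response and should be
handled by the command-specific code.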

Storage commands
----------------

First, the client sends a command line which looks like this:

<command name> <key> <flags> <exptime> <bytes> [<cas unique>]\r\n

- <command name> is "set", "add", "replace", "append", "prepend", or "cas"

  "set" means "store this data".

  "add" means "store this data, but only if the server *doesn't* already
  hold data for this key".

  "replace" means "store this data, but only if the server *does*
  already hold data for this key".

  "append" means "add this data to an existing key after existing data".

  "prepend" means "add this data to an existing key before existing data".

  "cas" is a check and set operation which means "store this data but
  only if no one else has updated since I last fetched it."

- <key> is the key under which the client asks to store the data

- <flags> is an arbitrary 16-bit unsigned integer (written out in
  decimal) that the server stores along with the data and sends back
  when the item is retrieved. Clients may use this as a bit field to
  store data-specific information; this field is opaque to the server.
  Note that in memcached 1.2.1 and higher, flags may be 32 bits instead
  of 16, but you might want to restrict yourself to 16 bits for
  compatibility with older versions.

- <exptime> is expiration time. If it's 0, the item never expires
  (although it may be deleted from the cache to make room for other
  items). If it's non-zero (either Unix time or offset in seconds from
  current time), it is guaranteed that clients will not be able to
  retrieve this item after the expiration time arrives (measured by
  server time).

- <bytes> is the number of bytes in the data block to follow, *not*
  including the delimiting \r\n. <bytes> may be zero (in which case
  it's followed by an empty data block).

- <cas unique> is a unique 64-bit value of an existing entry.
  Clients should use the value returned from the "gets" command
  when issuing "cas" updates.

After this line, the client sends the data block:

<data block>\r\n

- <data block> is a chunk of arbitrary 8-bit data of length <bytes>
  from the previous line.

After sending the command line and the data block, the client awaits
the reply, which may be:

- "STORED\r\n", to indicate success.

- "NOT_STORED\r\n" to indicate the data was not stored, but not
  because of an error. This normally means that either the condition
  for an "add" or a "replace" command wasn't met, or that the item is
  in a delete queue (see the "delete" command below).

- "EXISTS\r\n" to indicate that the item you are trying to store with
  a "cas" command has been modified since you last fetched it.

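As an illustration of the format above, the following sketch builds
the bytes a client would send for a storage command. The function name
and its argument handling are hypothetical; the only protocol facts
used are the field order and the \r\n terminators described above.

```python
from typing import Optional

STORAGE_COMMANDS = ("set", "add", "replace", "append", "prepend", "cas")


def build_storage_command(command: str, key: str, data: bytes,
                          flags: int = 0, exptime: int = 0,
                          cas_unique: Optional[int] = None) -> bytes:
    """Return the command line followed by the data block."""
    if command not in STORAGE_COMMANDS:
        raise ValueError("unknown storage command: " + command)
    if (command == "cas") != (cas_unique is not None):
        raise ValueError("cas_unique must be given for 'cas' and only for 'cas'")
    # <bytes> is the length of the data alone, without the trailing \r\n.
    fields = [command, key, str(flags), str(exptime), str(len(data))]
    if cas_unique is not None:
        fields.append(str(cas_unique))
    line = " ".join(fields).encode("ascii") + b"\r\n"
    return line + data + b"\r\n"
```

For instance, build_storage_command("set", "greeting", b"hello")
produces b"set greeting 0 0 5\r\nhello\r\n". The data may itself
contain \r\n; only its length matters.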

Retrieval commands
------------------

The retrieval commands "get" and "gets" operate like this:

get <key>*\r\n
gets <key>*\r\n

- <key>* means one or more key strings separated by whitespace.

After this command, the client expects zero or more items, each of
which is received as a text line followed by a data block. After all
the items have been transmitted, the server sends the string

"END\r\n"

to indicate the end of response.

Each item sent by the server looks like this:

VALUE <key> <flags> <bytes> [<cas unique>]\r\n
<data block>\r\n

- <key> is the key for the item being sent

- <flags> is the flags value set by the storage command

- <bytes> is the length of the data block to follow, *not* including
  its delimiting \r\n

- <cas unique> is a unique 64-bit integer that uniquely identifies
  this specific item.

- <data block> is the data for this item.

If some of the keys appearing in a retrieval request are not sent back
by the server in the item list, this means that the server does not
hold items with such keys (because they were never stored, or stored
but deleted to make space for more items, or expired, or explicitly
deleted by a client).

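The response format can be parsed by alternating between text lines
and length-delimited data blocks. The sketch below assumes the whole
response is already in memory as one byte string, which a real client
reading from a socket would not normally have; the name
parse_get_response and the tuple layout are hypothetical.

```python
from typing import Dict, Optional, Tuple

Item = Tuple[int, bytes, Optional[int]]  # (flags, data, cas unique or None)


def parse_get_response(buf: bytes) -> Dict[str, Item]:
    """Parse a complete "get" or "gets" response held in memory."""
    items: Dict[str, Item] = {}
    pos = 0
    while True:
        eol = buf.index(b"\r\n", pos)        # end of the current text line
        fields = buf[pos:eol].decode("ascii").split()
        pos = eol + 2
        if fields == ["END"]:
            return items
        if len(fields) not in (4, 5) or fields[0] != "VALUE":
            raise ValueError("unexpected response line: %r" % fields)
        key, flags, length = fields[1], int(fields[2]), int(fields[3])
        cas = int(fields[4]) if len(fields) == 5 else None
        # Use <bytes> to find the end of the block; never scan for \r\n.
        data = buf[pos:pos + length]
        if len(data) != length or buf[pos + length:pos + length + 2] != b"\r\n":
            raise ValueError("truncated or malformed data block")
        items[key] = (flags, data, cas)
        pos += length + 2
```

Keys that were requested but are missing from the returned dictionary
are the ones the server does not hold. Error strings are not handled
specially here; they surface as a ValueError.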

Deletion
--------

The command "delete" allows for explicit deletion of items:

delete <key> <time>\r\n

- <key> is the key of the item the client wishes the server to delete

- <time> is the amount of time in seconds (or Unix time until which)
  the client wishes the server to refuse "add" and "replace" commands
  with this key. For this amount of time, the item is put into a
  delete queue, which means that it won't be possible to retrieve it
  by the "get" command, but "add" and "replace" commands with this key
  will also fail (the "set" command will succeed, however). After the
  time passes, the item is finally deleted from server memory.

  The parameter <time> is optional, and, if absent, defaults to 0
  (which means that the item will be deleted immediately and further
  storage commands with this key will succeed).

The response line to this command can be one of:

- "DELETED\r\n" to indicate success

- "NOT_FOUND\r\n" to indicate that the item with this key was not
  found.

See the "flush_all" command below for immediate invalidation
of all existing items.


Increment/Decrement
-------------------

Commands "incr" and "decr" are used to change data for some item
in-place, incrementing or decrementing it. The data for the item is
treated as decimal representation of a 64-bit unsigned integer. If the
current data value does not conform to such a representation, the
commands behave as if the value were 0. Also, the item must already
exist for incr/decr to work; these commands won't pretend that a
non-existent key exists with value 0; instead, they will fail.

The client sends the command line:

incr <key> <value>\r\n

or

decr <key> <value>\r\n

- <key> is the key of the item the client wishes to change

- <value> is the amount by which the client wants to increase/decrease
  the item. It is a decimal representation of a 64-bit unsigned integer.

The response will be one of:

- "NOT_FOUND\r\n" to indicate the item with this key was not found

- <value>\r\n , where <value> is the new value of the item's data,
  after the increment/decrement operation was carried out.

Note that underflow in the "decr" command is caught: if a client tries
to decrease the value below 0, the new value will be 0.  Overflow in
the "incr" command will wrap around the 64-bit mark.

Note also that decrementing a number such that it loses length isn't
guaranteed to decrement its returned length.  The number MAY be
space-padded at the end, but this is purely an implementation
optimization, so you also shouldn't rely on that.

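The arithmetic rules above (missing item, non-numeric data, clamping
on "decr", wrap-around on "incr") can be modelled in a few lines. This
is a client-side model of the described behaviour, not the server's
implementation; the name apply_incr_decr is hypothetical, and treating
a stored number larger than 64 bits as non-conforming is an assumption.

```python
from typing import Optional

MASK_64 = (1 << 64) - 1  # largest 64-bit unsigned integer


def apply_incr_decr(command: str, current: Optional[bytes],
                    delta: int) -> Optional[int]:
    """Return the new value, or None where the reply would be NOT_FOUND."""
    if current is None:
        return None                      # the item must already exist
    text = current.strip()               # tolerate optional space padding
    value = int(text) if text.isdigit() else 0  # non-numeric data counts as 0
    if value > MASK_64:
        value = 0                        # assumption: too large, so non-conforming
    if command == "incr":
        return (value + delta) & MASK_64  # overflow wraps around
    if command == "decr":
        return max(value - delta, 0)      # underflow is clamped to 0
    raise ValueError("unknown command: " + command)
```

For example, decrementing b"3" by 10 gives 0, and incrementing
b"18446744073709551615" by 1 wraps around to 0.
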
Statistics
----------

The command "stats" is used to query the server about statistics it
maintains and other internal data. It has two forms. Without
arguments:

stats\r\n

it causes the server to output general-purpose statistics and
settings, documented below.  In the other form it has some arguments:

stats <args>\r\n

Depending on <args>, various internal data is sent by the server. The
kinds of arguments and the data sent are not documented in this version
of the protocol, and are subject to change for the convenience of
memcache developers.


General-purpose statistics
--------------------------

Upon receiving the "stats" command without arguments, the server sends
a number of lines which look like this:

STAT <name> <value>\r\n

The server terminates this list with the line

END\r\n

In each line of statistics, <name> is the name of this statistic, and
<value> is the data.  The following is the list of all names sent in
response to the "stats" command, together with the type of the value
sent for this name, and the meaning of the value.

In the type column below, "32u" means a 32-bit unsigned integer, "64u"
means a 64-bit unsigned integer. "32u:32u" means two 32-bit unsigned
integers separated by a colon.


Name                   Type     Meaning
---------------------------------------
pid                    32u      Process id of this server process
uptime                 32u      Number of seconds this server has been running
time                   32u      Current UNIX time according to the server
version                string   Version string of this server
pointer_size           32       Default size of pointers on the host OS
                                (generally 32 or 64)
rusage_user            32u:32u  Accumulated user time for this process
                                (seconds:microseconds)
rusage_system          32u:32u  Accumulated system time for this process
                                (seconds:microseconds)
curr_items             32u      Current number of items stored by the server
total_items            32u      Total number of items stored by this server
                                ever since it started
bytes                  64u      Current number of bytes used by this server
                                to store items
curr_connections       32u      Number of open connections
total_connections      32u      Total number of connections opened since
                                the server started running
connection_structures  32u      Number of connection structures allocated
                                by the server
cmd_get                64u      Cumulative number of retrieval requests
cmd_set                64u      Cumulative number of storage requests
get_hits               64u      Number of keys that have been requested and
                                found present
get_misses             64u      Number of items that have been requested
                                and not found
evictions              64u      Number of valid items removed from cache
                                to free memory for new items
bytes_read             64u      Total number of bytes read by this server
                                from network
bytes_written          64u      Total number of bytes sent by this server to
                                network
limit_maxbytes         32u      Number of bytes this server is allowed to
                                use for storage.
threads                32u      Number of worker threads requested.
                                (see doc/threads.txt)

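A "stats" response can be collected into a dictionary by reading STAT
lines until END. The sketch below assumes the complete response is
already in memory; parse_stats and rusage_seconds are hypothetical
names, and rusage_seconds assumes the seconds:microseconds form given
in the table above.

```python
from typing import Dict


def parse_stats(buf: bytes) -> Dict[str, str]:
    """Parse a complete "stats" response into a name -> value mapping."""
    stats: Dict[str, str] = {}
    for raw in buf.split(b"\r\n"):
        line = raw.decode("ascii")
        if line == "END":
            return stats
        # Split into at most three parts so the value may contain spaces.
        tag, name, value = line.split(" ", 2)
        if tag != "STAT":
            raise ValueError("unexpected line: " + line)
        stats[name] = value
    raise ValueError("response did not end with END")


def rusage_seconds(value: str) -> float:
    """Convert a "32u:32u" seconds:microseconds value to seconds."""
    seconds, microseconds = value.split(":")
    return int(seconds) + int(microseconds) / 1_000_000
```

Values are kept as strings because their types differ per name;
callers convert the ones they need, for example
int(parse_stats(resp)["curr_items"]).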


Other commands
--------------

"flush_all" is a command with an optional numeric argument. It always
succeeds, and the server sends "OK\r\n" in response. Its effect is to
invalidate all existing items immediately (by default) or after the
expiration specified.  After invalidation none of the items will be
returned in response to a retrieval command (unless an item is stored
again under the same key *after* flush_all has invalidated the items).
flush_all doesn't actually free all the memory taken up by existing
items; that will happen gradually as new items are stored. The most
precise definition of what flush_all does is the following: it causes
all items whose update time is earlier than the time at which
flush_all was set to be executed to be ignored for retrieval purposes.

The intent of flush_all with a delay is that in a setting where you
have a pool of memcached servers, and you need to flush all content,
you have the option of not resetting all memcached servers at the
same time (which could e.g. cause a spike in database load with all
clients suddenly needing to recreate content that would otherwise
have been found in the memcached daemon).

The delay option allows you to have them reset in e.g. 10-second
intervals (by passing 0 to the first, 10 to the second, 20 to the
third, etc.).

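The staggered reset described above amounts to sending each server in
the pool a different delay. A minimal sketch, assuming the numeric
argument follows the command name separated by a space, as for other
commands; the name staggered_flush_commands is hypothetical.

```python
from typing import List


def staggered_flush_commands(server_count: int,
                             interval: int = 10) -> List[bytes]:
    """Return one "flush_all" command line per server, in pool order."""
    # Server i is told to invalidate its items i * interval seconds from now.
    return [b"flush_all %d\r\n" % (i * interval) for i in range(server_count)]
```

For a pool of three servers this yields the delays 0, 10 and 20, one
command line per server.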

"version" is a command with no arguments:

version\r\n

In response, the server sends

"VERSION <version>\r\n", where <version> is the version string for the
server.

"verbosity" is a command with a numeric argument. It always
succeeds, and the server sends "OK\r\n" in response. Its effect is to
set the verbosity level of the logging output.

"quit" is a command with no arguments:

quit\r\n

Upon receiving this command, the server closes the
connection. However, the client may also simply close the connection
when it no longer needs it, without issuing this command.


UDP protocol
------------

For very large installations where the number of clients is high enough
that the number of TCP connections causes scaling difficulties, there is
also a UDP-based interface. The UDP interface does not provide guaranteed
delivery, so should only be used for operations that aren't required to
succeed; typically it is used for "get" requests where a missing or
incomplete response can simply be treated as a cache miss.

Each UDP datagram contains a simple frame header, followed by data in the
same format as the TCP protocol described above. In the current
implementation, requests must be contained in a single UDP datagram, but
responses may span several datagrams. (The only common requests that would
span multiple datagrams are huge multi-key "get" requests and "set"
requests, both of which are more suitable to TCP transport for reliability
reasons anyway.)

The frame header is 8 bytes long, as follows (all values are 16-bit integers
in network byte order, high byte first):

0-1 Request ID
2-3 Sequence number
4-5 Total number of datagrams in this message
6-7 Reserved for future use; must be 0

The request ID is supplied by the client. Typically it will be a
monotonically increasing value starting from a random seed, but the client
is free to use whatever request IDs it likes. The server's response will
contain the same ID as the incoming request. The client uses the request ID
to differentiate between responses to outstanding requests if there are
several pending from the same server; any datagrams with an unknown request
ID are probably delayed responses to an earlier request and should be
discarded.

The sequence number ranges from 0 to n-1, where n is the total number of
datagrams in the message. The client should concatenate the payloads of the
datagrams for a given response in sequence number order; the resulting byte
stream will contain a complete response in the same format as the TCP
protocol (including terminating \r\n sequences).
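
The frame header and the reassembly rule can be sketched with
Python's struct module. This is an illustration under the layout given
above, not code from memcached; the names build_request,
parse_datagram and reassemble are hypothetical, and treating the
header fields as unsigned is an assumption.

```python
import struct
from typing import Dict, List, Optional, Tuple

# Four 16-bit integers in network byte order: ID, sequence, total, reserved.
HEADER = struct.Struct("!HHHH")


def build_request(request_id: int, payload: bytes) -> bytes:
    """Wrap a request in a single datagram (sequence 0 of 1, reserved 0)."""
    return HEADER.pack(request_id, 0, 1, 0) + payload


def parse_datagram(datagram: bytes) -> Tuple[int, int, int, bytes]:
    """Return (request_id, sequence, total, payload)."""
    request_id, sequence, total, _reserved = HEADER.unpack_from(datagram)
    return request_id, sequence, total, datagram[HEADER.size:]


def reassemble(datagrams: List[bytes], request_id: int) -> bytes:
    """Join the payloads for request_id in sequence number order.

    Raises ValueError if any datagram of the response is missing.
    """
    parts: Dict[int, bytes] = {}
    total: Optional[int] = None
    for datagram in datagrams:
        rid, sequence, count, payload = parse_datagram(datagram)
        if rid != request_id:
            continue  # unknown request ID: discard, as the text advises
        parts[sequence] = payload
        total = count
    if total is None or set(parts) != set(range(total)):
        raise ValueError("incomplete response")
    return b"".join(parts[i] for i in range(total))
```

A caller that gets ValueError from reassemble can treat the request as
a cache miss, in line with the intended use of the UDP interface. The
reassembled bytes have the same format as a TCP response.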