When using memcached_mget with a single key and CAS enabled, the client sends a GETKQ request, waits for the server to ACK it, and then follows it up with a NOP request. However, Linux delays sending ACKs until it has some data to reply with or until a timeout occurs. With this request pattern, the timeout occurs every time the client makes a request. The timeout may be on the order of tens of milliseconds, increasing the latency of gets commands by up to 1000x that of get commands.
In contrast, memcached_get sends a single GETK request, which receives an immediate response and so avoids the delayed ACK timeout. However, the CAS value cannot be retrieved with this API.
I have recorded an example PCAP showing the latency difference between memcached_get and memcached_gets in text/binary modes (with the CAS behavior enabled). gets in binary mode takes around 400 ms total, while all other operations take around 1 ms total. This example only performs 10 requests per mode; with thousands of requests an even larger difference may be observed.
I recommend the following improvements:
- Send a single GETK when key_length is 1 (i.e. as if mget_mode is false).
- Don't flush memcached_io_writev until sending the final NOP (or at least until sending the final GETKQ).
- Add an API to retrieve the CAS value without mget.
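
To illustrate the second recommendation, here is a minimal sketch in C of packing the GETKQ and the trailing NOP into one buffer, so that both requests can leave in a single write and the server always has a response to send along with its ACK. This is not libmemcached's code: the names mc_append_request and mc_build_batched_get are hypothetical, and the sketch assumes the standard memcached binary protocol layout (24-byte header, opcodes 0x0d for GETKQ and 0x0a for NOP).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Constants from the memcached binary protocol. */
enum {
    MC_MAGIC_REQUEST = 0x80,
    MC_OP_NOOP       = 0x0a,
    MC_OP_GETKQ      = 0x0d,
    MC_HEADER_LEN    = 24
};

/* Hypothetical helper: append one request (header plus optional key,
 * no extras, no value) to buf. Returns the number of bytes written,
 * or 0 if the request does not fit in cap bytes. */
static size_t mc_append_request(uint8_t *buf, size_t cap, uint8_t opcode,
                                const char *key, uint16_t key_len)
{
    size_t total = (size_t)MC_HEADER_LEN + key_len;

    assert(buf != NULL);
    if (total > cap) {
        return 0;
    }

    /* Zero the header: extras length, data type, vbucket, opaque and
     * CAS all stay 0 for these requests. */
    memset(buf, 0, MC_HEADER_LEN);
    buf[0] = MC_MAGIC_REQUEST;
    buf[1] = opcode;
    /* Key length: bytes 2-3, big-endian. */
    buf[2] = (uint8_t)(key_len >> 8);
    buf[3] = (uint8_t)(key_len & 0xff);
    /* Total body length: bytes 8-11, big-endian. The body is just the
     * key, so only the low two bytes can be non-zero. */
    buf[10] = (uint8_t)(key_len >> 8);
    buf[11] = (uint8_t)(key_len & 0xff);

    if (key_len > 0) {
        memcpy(buf + MC_HEADER_LEN, key, key_len);
    }
    return total;
}

/* Hypothetical helper: build GETKQ(key) immediately followed by NOP in
 * one contiguous buffer. Returns the total length, or 0 on overflow. */
static size_t mc_build_batched_get(uint8_t *buf, size_t cap,
                                   const char *key, uint16_t key_len)
{
    size_t n = mc_append_request(buf, cap, MC_OP_GETKQ, key, key_len);
    if (n == 0) {
        return 0;
    }
    size_t m = mc_append_request(buf + n, cap - n, MC_OP_NOOP, NULL, 0);
    if (m == 0) {
        return 0;
    }
    return n + m;
}
```

The resulting buffer could then be handed to a single send() or write() call. For a 3-byte key it is 51 bytes long (24 + 3 for the GETKQ, 24 for the NOP). Whether libmemcached can reach the same effect simply by deferring its flush depends on its internal buffering, which this sketch does not model.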