MEMCACHED_MAX_BUFFER of libmemcached


Recently, a cache_put_latency metric was added to our service, and the numbers gave us a fright: putting an item of roughly 10 KB into memcached showed a latency of about 7 ms, which was hard to understand, so it took some effort to find the cause. I wrote two test programs, one as a shell script and one in C++.

1. The shell script uses nc to send the set command.

#!/bin/env bash
let s=1
let i=0
let len=8*1024
while true
do
    if (( i >= $len ))
    then
        break
    fi
    str=${str}1
    let i++
done

let i=0
begin_time=`date +%s`
while true
do
    if (( i >= 1000 ))
    then
        break
    fi
    printf "set $i 0 0 $len\r\n${str}\r\n" | nc 10.234.4.24 11211
    if [[ $? -eq 0 ]]
    then
        echo "echo key: $i"
    fi
    let i++
done
end_time=`date +%s`
let use_time=end_time-begin_time
echo "set time consumed: $use_time"

let i=0
begin_time=`date +%s`
while true
do
    if (( i >= 1000 ))
    then
        break
    fi
    printf "get $i\r\n" | nc 10.234.4.22 11211 > /dev/null 2>&1
    let i++
done
end_time=`date +%s`
let use_time=end_time-begin_time
echo "get time consumed: $use_time"

2. The C++ program uses libmemcached's memcached_set.

#include <sys/time.h>
#include <sys/uio.h>
#include <stdint.h>
#include <string.h>
#include <stdio.h>
#include <stdlib.h>
#include <iostream>
#include <string>
#include <map>
#include "libmemcached/memcached.h"

using namespace std;

uint32_t item_size = 0;
uint32_t loop_num = 0;
bool single_server = false;
std::string local_ip;
std::map<std::string, int> servers;

int64_t getCurrentTime()
{
    struct timeval tval;
    gettimeofday(&tval, NULL);
    return (tval.tv_sec * 1000000LL + tval.tv_usec);
}

memcached_st* mc_init()
{
    memcached_st *mc = memcached_create(NULL);
    if (mc == NULL) {
        cout << "create mc error" << endl;
        return NULL;
    }
    std::map<std::string, int>::iterator iter;
    for (iter = servers.begin(); iter != servers.end(); ++iter) {
        if (single_server && iter->first != local_ip) {
            continue;
        }
        memcached_return rc = memcached_server_add(mc, iter->first.c_str(), iter->second);
        if (rc != MEMCACHED_SUCCESS) {
            cout << "add server " << iter->first << " error" << endl;
            return NULL;
        } else {
            cout << "add server " << iter->first << " success" << endl;
        }
    }
    return mc;
}

void test_put(memcached_st *mc)
{
    char *ptr = new char[item_size];
    memset(ptr, 'a', item_size);
    char buf[32];
    memset(buf, 0, 32);
    struct iovec curkey, curval;
    curval.iov_base = ptr;
    curval.iov_len = item_size;
    curkey.iov_base = buf;
    curkey.iov_len = 32;
    uint64_t begin_time = getCurrentTime();
    for (uint32_t i = 0; i < loop_num; ++i) {
        sprintf(buf, "%d", i);
        memcached_return rc = memcached_set(mc,
                (const char *)curkey.iov_base, curkey.iov_len,
                (const char *)curval.iov_base, curval.iov_len,
                600, (uint32_t)0);
        if (rc != MEMCACHED_SUCCESS) {
            cout << "set key error: " << buf << endl;
            continue;
        } else {
            cout << "set key: " << buf << endl;
        }
    }
    uint64_t end_time = getCurrentTime();
    cout << "put time consumed: " << end_time - begin_time << endl;
}

void test_get(memcached_st *mc)
{
}

int main(int argc, const char *argv[])
{
    // argv[1]: "s" means only add the local server
    if (strcmp(argv[1], "s") == 0) {
        single_server = true;
    } else {
        single_server = false;
    }
    item_size = atoi(argv[2]) * 1024;   // item size in KB
    loop_num = atoi(argv[3]);           // number of sets

    servers["10.232.42.91"] = 11211;
    /*servers["10.234.4.22"] = 11211;
    servers["10.234.4.23"] = 11211;
    servers["10.234.4.24"] = 11211;*/

    local_ip = "10.232.42.91";

    memcached_st *mc = mc_init();
    if (!mc) {
        cout << "mc_init error" << endl;
        return -1;
    }
    test_put(mc);
    test_get(mc);
}
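For reference, the program expects three arguments: "s" to use only the local server, the item size in KB, and the number of iterations. The article does not show how it was built; assuming the source is saved as test_put.cc, a sketch like the following should work (the file name is an assumption; the binary name t matches the process name seen in the netstat output later, and libmemcached is linked with -lmemcached):

g++ -o t test_put.cc -lmemcached
./t s 8 1000      # local server only, 8 KB items, 1000 sets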

The two tests give very different results. The shell script sets 1000 items of 8 KB each in about 3 seconds, roughly 3 ms per item. The C++ version needs about 39 s, an average of 39 ms per item. The shell script has to reconnect to the server and spawn an nc process for every request, so it ought to be the slower one. Tracing the C++ program with ltrace shows that the 8 KB of data is sent in two writes. Both writes are very fast, but a long time is then spent waiting for memcached's reply; that is where almost all of the latency goes.
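The article does not record the exact ltrace invocation; flags roughly like the following are one way to obtain output of the shape shown below, and are an assumption rather than the author's command (-tt prints microsecond timestamps, -i the instruction pointer of each call, -S the underlying system calls):

ltrace -tt -i -S -p <pid of the test program>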

23:32:37.069922 [0x401609]memcached_set(0x19076200, 0x7fffdad68560, 32, 0x1907a570, 8192
23:32:37.070034 [0x3f280c5f80]SYS_write(3, "set 29 0 600 8192\r\naaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"..., 8196) = 8196 <0.000022>
23:32:37.071657 [0x3f280c5f80]SYS_write(3, "aaaaaaaaaaaaaaa\r\n", 17) = 17 <0.000012>
23:32:37.071741 [0x3f280c5f00]SYS_read(3, "STORED\r\n", 8196) = 8 <0.039765>    (39 ms)

After discussing it with jianhao, he immediately grepped the libmemcached source and found a constant called MEMCACHED_MAX_BUFFER with a value of 8196, and there is no corresponding memcached_behavior_set option to change it at runtime. We changed it directly in memcached_constants.h to 81960 and were delighted to see cache_put_latency drop from 7 ms to 1 ms.
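The change itself is a one-line edit to the constant in libmemcached's header; a minimal sketch of the before and after, assuming the usual #define form (the exact file layout differs between libmemcached versions):

/* memcached_constants.h (original) */
#define MEMCACHED_MAX_BUFFER 8196

/* enlarged so an 8 KB item fits into a single internal buffer/write */
#define MEMCACHED_MAX_BUFFER 81960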

Although the problem was solved, it still felt unfinished, so I wanted to understand why this strange behaviour occurs. The bottleneck appeared to be on the server side, so I made a small modification to memcached: print a microsecond-resolution timestamp on every connection state transition.

static int64_t getCurrentTime()
{
    struct timeval tval;
    gettimeofday(&tval, NULL);
    return (tval.tv_sec * 1000000LL + tval.tv_usec);
}

static void conn_set_state(conn *c, enum conn_states state) {
    assert(c != NULL);
    assert(state >= conn_listening && state < conn_max_state);

    if (state != c->state) {
        if (settings.verbose > 2) {
            fprintf(stderr, "%d: going from %s to %s, time: %lu\n",
                    c->sfd, state_text(c->state),
                    state_text(state), getCurrentTime());
        }

        c->state = state;

        if (state == conn_write || state == conn_mwrite) {
            MEMCACHED_PROCESS_COMMAND_END(c->sfd, c->wbuf, c->wbytes);
        }
    }
}

The printed timestamps show that the time is spent mainly in the code that handles the conn_nread state; pinning it down further, it is the second read that consumes the time.

15: going from conn_waiting to conn_read, time: 1348466584440118
15: going from conn_read to conn_parse_cmd, time: 1348466584440155
<15 set 98 0 600 8192
15: going from conn_parse_cmd to conn_nread, time: 1348466584440177
conn_nread: 17
> NOT FOUND 98
>15 STORED
15: going from conn_nread to conn_write, time: 1348466584480099    (36 ms)
15: going from conn_write to conn_new_cmd, time: 1348466584480145
15: going from conn_new_cmd to conn_waiting, time: 1348466584480152

The value data may already have been read while the connection was in the conn_read state, in which case only a memmove is needed. If the data was not fully read during conn_read, the conn_nread handler has to read the rest itself (and since the socket is non-blocking, this may take several reads); the key point is that this read is what is so slow.

        case conn_nread:
            if (c->rlbytes == 0) {
                complete_nread(c);
                break;
            }

            /* first check if we have leftovers in the conn_read buffer */
            if (c->rbytes > 0) {
                int tocopy = c->rbytes > c->rlbytes ? c->rlbytes : c->rbytes;
                if (c->ritem != c->rcurr) {
                    memmove(c->ritem, c->rcurr, tocopy);
                }
                c->ritem += tocopy;
                c->rlbytes -= tocopy;
                c->rcurr += tocopy;
                c->rbytes -= tocopy;
                if (c->rlbytes == 0) {
                    break;
                }
            }

            /* now try reading from the socket */
            res = read(c->sfd, c->ritem, c->rlbytes);
            if (res > 0) {
                pthread_mutex_lock(&c->thread->stats.mutex);
                c->thread->stats.bytes_read += res;
                pthread_mutex_unlock(&c->thread->stats.mutex);
                if (c->rcurr == c->ritem) {
                    c->rcurr += res;
                }
                c->ritem += res;
                c->rlbytes -= res;
                break;
            }

After a long time spent adding timestamps before and after libmemcached's io_flush function, I found that libmemcached actually sends its data very quickly. Then I suddenly remembered the TCP_NODELAY option, so I added it in the set_socket_options function in libmemcached's memcached_connect.c (set_socket_options is in fact exactly the place meant for setting options like TCP_NODELAY; I simply had not looked at it carefully).

    int flag = 1;
    int error = setsockopt(ptr->fd, IPPROTO_TCP, TCP_NODELAY, (char *)&flag, sizeof(flag));
    if (error == -1)
    {
        printf("Couldn't setsockopt(TCP_NODELAY)\n");
        exit(-1);
    }
    else
    {
        printf("set setsockopt(TCP_NODELAY)\n");
    }

With TCP_NODELAY set, even without modifying MEMCACHED_MAX_BUFFER, setting an 8 KB item completes instantly as well. But this raises new questions: under what conditions does the Nagle algorithm take effect? Why is the first packet sent out immediately while the second packet is always buffered?

libmemcached sends a set command in three parts: the header ("set 0 0 600 8192\r\n", 18 bytes), the value (8192 bytes), and a trailing "\r\n" (2 bytes), 8212 bytes in total. In the conn_read state memcached can read 2048 + 2048 + 4096 + 8196 = 16 KB of data, so an 8 KB item should be readable entirely within conn_read. By adding the following print statement to the conn_read handling code, you can see that in some cases the last read in conn_read returns only 4 bytes (instead of the usual 2048 + 2048 + 4096 + 20), and the remaining 16 bytes are read in conn_nread.

        res = read(c->sfd, c->rbuf + c->rbytes, avail);
        if (res > 0) {
            char buf[10240] = {0};
            sprintf(buf, "%.*s", res, c->rbuf + c->rbytes);
            printf("avail=%d, read=%d, str=%s\n", avail, res, buf);

If the TCP_NODELAY option is not set, netstat shows that the Send-Q of the client socket stays between 8214 and 8215 bytes.
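The netstat invocation is not shown in the article; something along these lines (the port filter is an assumption) displays the Recv-Q and Send-Q columns for the connection:

netstat -antp | grep 11211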

tcp        0   8215 10.232.42.91:59836          10.232.42.91:11211          ESTABLISHED 25800/t

When the TCP_NODELAY option is set, the Send-Q value of the client socket is always 0.

tcp        0      0 10.232.42.91:59890          10.232.42.91:11211          ESTABLISHED 26554/t.quick

