Symptoms
Virtual machine creation recently began failing on the company's OpenStack deployment. The logs showed that the failure occurred when neutron-server tried to validate an authentication token with Keystone.
Cause of failure
The memcached instance that Keystone uses as its token backend was configured with only 64MB of memory. After a new cluster was added, the number of tokens grew and the data to be stored exceeded that limit. As a result, memcached frequently evicted unexpired tokens to make room for newly created ones, and tokens were lost.
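The eviction behaviour described above can be illustrated with a toy model. This is only a sketch: the `TinyLRUCache` class and `simulate` function are hypothetical names, the cache counts items rather than bytes, and real memcached manages memory per slab class. It only shows how a full cache discards older entries that have not yet expired.

```python
from collections import OrderedDict


class TinyLRUCache:
    """Toy stand-in for memcached: fixed capacity, evicts least recently used."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()
        self.evictions = 0  # loosely mirrors memcached's eviction counter

    def set(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        # When over capacity, drop the oldest entries regardless of expiry.
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)
            self.evictions += 1

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)
        return self.items[key]


def simulate(capacity, token_count):
    """Store token_count unexpired tokens in a cache of the given capacity."""
    cache = TinyLRUCache(capacity)
    for i in range(token_count):
        cache.set(f"token-{i}", {"expires_in": 3600})
    return cache


cache = simulate(capacity=3, token_count=5)
print(cache.get("token-0"))  # None: evicted although it had not expired
print(cache.evictions)       # 2
```

With a capacity of 3 and 5 tokens written, the two oldest tokens disappear even though each still had an hour of validity left, which matches the token loss seen by Keystone.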
Resolution
Allocate more memory to memcached; it is now configured with 4GB.
Troubleshooting process
When virtual machine creation failed, the logs showed that when nova-compute called neutron-server to create a port, neutron-server received an error while validating the token with Keystone.
Reading through the code, the authentication flow is as follows:
1. Nova-compute keeps a global token and uses it to access neutron-server. Before each request it checks whether the token is about to expire. The threshold is 120 seconds, hard-coded in the Nova code: if the token's remaining validity is less than 120 seconds, nova-compute requests a new token.
2. Neutron-server extracts the token from the request header and asks Keystone to verify that it is valid.
3. Keystone's token backend is configured as memcache, so Keystone looks the token up in memcached, does not find it, and returns an error.
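The expiry check in step 1 can be sketched as follows. This is a minimal illustration, not Nova's actual code: the function name `token_needs_refresh` and the constant `REFRESH_WINDOW_SECONDS` are hypothetical; only the 120-second threshold comes from the description above.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the description above; the name is an assumption.
REFRESH_WINDOW_SECONDS = 120


def token_needs_refresh(expires_at, now=None, window=REFRESH_WINDOW_SECONDS):
    """Return True if the token has less than `window` seconds of validity left."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires_at - now) < timedelta(seconds=window)


now = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
print(token_needs_refresh(now + timedelta(seconds=90), now))    # True
print(token_needs_refresh(now + timedelta(seconds=3600), now))  # False
```

Note that this check only looks at the expiry time held by the client. It cannot detect a token that the backend has already discarded, which is why nova-compute kept sending a token that Keystone could no longer find.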
Because tokens stored in memcached are wrapped by dogpile and cannot be read directly, it was not possible to inspect memcached to see the token state as Keystone sees it. Instead, the token details were printed in nova-compute, which gave the following results:
Tokens went missing before their expiry time, and even a newly created token could no longer be found one second later. This pointed at memcached, since the issue did not occur when the Keystone token backend was set to SQL. So the next step was to look at the memcached status, where two key parameters are:
limit_maxbytes 67108864
evictions 54635
These values show that memcached had only 64MB of memory available (67108864 bytes) and that objects had been evicted 54,635 times because memory ran out. This largely confirms that memcached was short of memory. The inference is that with only two regions the cached tokens stayed under the memcached memory limit, and after the third region joined, the additional tokens exhausted the memory and caused frequent token eviction.
After the memcached memory was set to 4GB and the service was restarted, the problem no longer occurred.
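The reading of these two values can be checked with a short script. This is a sketch under assumptions: it assumes the numbers come from memcached's `stats` output, where the `limit_maxbytes` and `evictions` fields are reported as `STAT name value` lines. The `SAMPLE` text is reconstructed from the two values quoted above rather than captured from the affected server, and `parse_memcached_stats` is a hypothetical helper.

```python
def parse_memcached_stats(text):
    """Parse 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split(None, 2)
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2].strip()
    return stats


# Reconstructed from the two values in the text, not real server output.
SAMPLE = """STAT limit_maxbytes 67108864
STAT evictions 54635
END
"""

stats = parse_memcached_stats(SAMPLE)
limit_mb = int(stats["limit_maxbytes"]) / (1024 * 1024)
evictions = int(stats["evictions"])

print(limit_mb)   # 64.0
print(evictions)  # 54635
if evictions > 0:
    print("memcached has evicted items to free memory")
```

A non-zero and growing eviction count alongside a small memory limit is the signal used here; a count that stays at zero would instead suggest looking elsewhere for the lost tokens.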
Keystone token loss caused by memcached configuration error