Build high-performance services


Building a high-performance Java memcached-style cache with ConcurrentSkipListMap and linked lists

Scenario

A cache server is a common piece of infrastructure in Internet backend services.

Scenario 1: a large number of images are stored on an image server; to improve the throughput of the image service, we want to load popular images into memory.

Scenario 2: a distributed storage service keeps a large amount of metadata in memory to increase access throughput.

Problem

However, developing a caching service in the Java language inevitably runs into GC problems. Ehcache, for example, is a map-based cache that generates a large number of objects which minor GC cannot reclaim, eventually leading to CMS or full GC collections that hurt system throughput. Frequent CMS cycles can be observed in the GC logs of such services. Here is a brief look at how CMS affects the system: CMS performs two brief stop-the-world (STW) marking pauses, while the rest of the cycle runs concurrently with the application, which keeps STW time short.


The CMS logs are as follows:

9.780: [GC [1 CMS-initial-mark: 507883K(507904K)] 521962K(521984K), 0.0029230 secs] [Times: user=0.00 sys=0.00, real=0.01 secs]

Total time for which application threads were stopped: 0.0029970 seconds

The initial mark is the first STW pause. The steps that follow do not stop the Java application threads; this is the concurrent mode.

9.783: [CMS-concurrent-mark-start]

9.913: [CMS-concurrent-mark: 0.130/0.130 secs] [Times: user=0.26 sys=0.00, real=0.13 secs]

9.913: [CMS-concurrent-preclean-start]

9.914: [CMS-concurrent-preclean: 0.001/0.001 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

9.914: [CMS-concurrent-abortable-preclean-start]

9.914: [CMS-concurrent-abortable-preclean: 0.000/0.000 secs] [Times: user=0.00 sys=0.00, real=0.00 secs]

Application time: 0.1317920 seconds

9.914: [GC[YG occupancy: 14079 K (14080 K)]9.914: [Rescan (parallel), 0.0023580 secs]9.917: [weak refs processing, 0.0000060 secs]

[1 CMS-remark: 507883K(507904K)] 521962K(521984K), 0.0024100 secs] [Times: user=0.01 sys=0.00, real=0.00 secs]

Total time for which application threads were stopped: 0.0025420 seconds

Rescan (the remark phase) is the second STW mark.

Solution

Construct a Java memory management scheme similar to memcached's slab/chunk allocator. Each cached object is assigned to a chunk, and chunks of the same size form a slab. The initial chunk size is 100B: if a cached object is smaller than 100B it goes into the 100B slab; if it is larger than 100B but smaller than 100B * growth factor (1.27) = 127B, it goes into the 127B slab, and so on. This calls for a data structure with fast ordered lookup to index the slabs. I implemented the slab index with ConcurrentSkipListMap; its lookup and insertion time complexity matches a binary tree, but the implementation is simpler. The code is as follows:
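To make the size-class progression concrete, here is a minimal sketch of the growth-factor calculation. The `ChunkSizes` class and `sizeClassFor` method are illustrative names, not part of the original code; the 100B base and 1.27 factor come from the text above.

```java
// Sketch of memcached-style chunk size classes: each class is the previous
// one multiplied by a growth factor, starting from a 100-byte base.
public class ChunkSizes {
    static final float BASE = 100f;   // initial chunk size in bytes
    static final float SCALE = 1.27f; // growth factor

    // Returns the smallest size class that can hold 'size' bytes.
    public static float sizeClassFor(int size) {
        float s = BASE;
        while (s < size) {
            s *= SCALE;
        }
        return s;
    }

    public static void main(String[] args) {
        System.out.println(sizeClassFor(80));  // 100.0 (fits the base class)
        System.out.println(sizeClassFor(120)); // 127.0 (100 * 1.27)
    }
}
```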

Java code
public boolean put(K key, byte[] value) {
    Map.Entry<Float, LocalMCSlab<K, byte[]>> entry = null;
    float theSize = Float.valueOf(value.length);
    stat.set("cacheSize=", (getCurrentTotalCacheSize() / 1024f) + "KB");
    // The slabs map uses the chunk size as key and the slab as value.
    // If no slab at least as large as this object exists, create one;
    // otherwise use the smallest existing slab that can hold it.
    if ((entry = slabs.tailMap(theSize).firstEntry()) == null) {
        Float floorKey = slabs.floorKey(theSize);
        float needSize = floorKey == null ? theSize : floorKey * SCALE;
        while (needSize < theSize) {
            needSize = needSize * SCALE;
        }
        LocalMCSlab<K, byte[]> slab = new LocalMCSlab<K, byte[]>((int) needSize);
        slab.put(key, value, false);
        slabs.put(needSize, slab);
        return true;
    } else {
        // When the current total cache size plus this object's size exceeds
        // the size allocated to the whole cache (initSize), apply the LRU policy.
        boolean isLRU = getCurrentTotalCacheSize() + theSize > initSize;
        entry.getValue().put(key, value, isLRU);
        return true;
    }
}
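The slab lookup above hinges on two ConcurrentSkipListMap operations, tailMap() and floorKey(). A minimal standalone illustration (the map values here are plain strings standing in for slabs, and the class name is made up for the example):

```java
import java.util.concurrent.ConcurrentSkipListMap;

// Sketch: how tailMap()/floorKey() pick a slab by size class.
public class SlabLookupDemo {
    public static void main(String[] args) {
        ConcurrentSkipListMap<Float, String> slabs =
                new ConcurrentSkipListMap<Float, String>();
        slabs.put(100f, "slab-100");
        slabs.put(127f, "slab-127");

        // Smallest existing slab that can hold a 110-byte object:
        Float fit = slabs.tailMap(110f).firstEntry().getKey();
        // Largest existing size class not bigger than 110 bytes, used as the
        // starting point for the growth loop when a new slab must be created:
        Float floor = slabs.floorKey(110f);

        System.out.println(fit + " " + floor); // 127.0 100.0
    }
}
```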

Each slab is implemented on top of a Map<K, V>. To implement LRU, I also wrote a linked list that inserts at the head and evicts from the tail, so that the tail of the list holds the least recently used object. The code is as follows:

Java code
private static class LinkedListNode {
    public LinkedListNode previous;
    public LinkedListNode next;
    public Object object;

    /**
     * Constructs a new linked list node.
     * @param object the object that the node represents.
     * @param next a reference to the next LinkedListNode in the list.
     * @param previous a reference to the previous LinkedListNode in the list.
     */
    public LinkedListNode(Object object, LinkedListNode next,
            LinkedListNode previous) {
        this.object = object;
        this.next = next;
        this.previous = previous;
    }
    ...
}

public static class LinkedList {
    /**
     * The root of the list keeps a reference to both the first and last
     * elements of the list.
     */
    private LinkedListNode head = new LinkedListNode("head", null, null);

    /**
     * Creates a new linked list.
     */
    public LinkedList() {
        head.next = head.previous = head;
    }

    /**
     * Returns the first linked list node in the list.
     *
     * @return the first element of the list.
     */
    public LinkedListNode getFirst() {
        LinkedListNode node = head.next;
        if (node == head) {
            return null;
        }
        return node;
    }

    /**
     * Returns the last linked list node in the list.
     *
     * @return the last element of the list.
     */
    public LinkedListNode getLast() {
        LinkedListNode node = head.previous;
        if (node == head) {
            return null;
        }
        return node;
    }

    public LinkedListNode removeLast() {
        LinkedListNode node = head.previous;
        if (node == head) {
            return null;
        }
        // Unlink the node in both directions, not just one.
        node.previous.next = head;
        head.previous = node.previous;
        return node;
    }

    /**
     * Adds a node to the beginning of the list.
     *
     * @param node the node to add to the beginning of the list.
     */
    public LinkedListNode addFirst(LinkedListNode node) {
        node.next = head.next;
        head.next = node;
        node.previous = head;
        node.next.previous = node;
        return node;
    }
    ...
}
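The head-insert / tail-evict discipline of the list above can be demonstrated with a standard Deque in a few lines (this is a stand-in for the hand-rolled list, just to show the access order):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: recently used keys are pushed at the head, so the eviction
// victim is always taken from the tail (the least recently used entry).
public class LruOrderDemo {
    public static void main(String[] args) {
        Deque<String> ageList = new ArrayDeque<String>();
        // Access order: a, b, c (c is the most recently used).
        ageList.addFirst("a");
        ageList.addFirst("b");
        ageList.addFirst("c");
        // The tail holds the least recently used entry.
        String victim = ageList.removeLast();
        System.out.println(victim); // a
    }
}
```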

When the LRU policy kicks in, no new byte[] is created; instead, the oldest byte[] is overwritten and its cache entry is moved to the head of the list:

Java code
if (removeLRU) {
    LinkedListNode lastNode = ageList.removeLast();
    Object lastHashKey = hashKeyMap.remove(lastNode.object);
    if (lastHashKey == null) {
        return false;
    }
    stat.inc("eviction[" + this.chunkSize + "]");
    CacheObject<byte[]> data = map.get(lastHashKey);
    System.arraycopy(value, 0, data.object, 0, value.length);
    data.length = value.length;
    // Update the key -> hashKey mapping.
    hashKeyMap.put(key, lastHashKey);
    lastNode.object = key;
    ageList.addFirst(lastNode);
}

Note the hashKeyMap: its key is the cache key of this put, and its value is the internal key under which the byte[] is stored; that byte[] is created only the first time. This, again, avoids recreating objects.
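The key-to-slot indirection can be sketched as follows (all names here are illustrative, and an int index stands in for the internal hash key): the byte[] slot is allocated once, and on eviction the new key is simply pointed at the old slot while the bytes are overwritten in place.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch: reuse an existing byte[] slot via an indirection map,
// instead of allocating a new byte[] on every put.
public class SlotReuseDemo {
    public static void main(String[] args) {
        Map<String, Integer> hashKeyMap = new HashMap<String, Integer>();
        byte[][] slots = new byte[1][];

        // First put: allocate the slot once.
        slots[0] = "hello".getBytes();
        hashKeyMap.put("k1", 0);

        // Eviction: point a new key at slot 0 and overwrite the bytes.
        byte[] newValue = "world".getBytes();
        Integer slot = hashKeyMap.remove("k1");
        System.arraycopy(newValue, 0, slots[slot], 0, newValue.length);
        hashKeyMap.put("k2", slot);

        System.out.println(new String(slots[hashKeyMap.get("k2")])); // world
    }
}
```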

See the Appendix for all code and test classes.

Test

Test parameters

java -Xms2g -Xmx2g -Xmn128m -XX:+UseConcMarkSweepGC -server -XX:SurvivorRatio=5 -XX:CMSInitiatingOccupancyFraction=80 -XX:+PrintTenuringDistribution -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintGCApplicationConcurrentTime -Xloggc:./gc.log test.TestMain

Test performance is stable, and memory is entirely reclaimed during the minor GC phase.

Allocated cache = 1G; actual cacheSize == 1048625.2KB.

Number of slab chunk:

chunk[100.0] count==5

chunk[209758.16] count==1231

chunk[165163.9] count==4938

Summary

I originally planned to write only pseudo-code, but then decided that Java already has many good data structures, such as ConcurrentSkipListMap and LRUMap, that I wanted to introduce. So I wrote this rough version, which basically reflects the way memcached manages memory with slabs and chunks. Testing also shows a real performance benefit, and an online service could be developed from this version. However, this implementation does not handle concurrency well, and its memory use has some pitfalls. If you run into problems using it, you are welcome to discuss them.

