Note-beam-20140503

Note-beam-20140503_MySQL

Last Update:2018-04-11 Source: Internet

Author: User

Tags mysql host


Note-beam-20140503: handling an Erlang beam process that occupies a large amount of memory
Step 1:
Check whether the number of processes is normal:
> erlang:system_info(process_count).
The number of processes is reasonable.

Step 2:
Check the node's memory consumption:
> erlang:memory().
The output shows that most of the memory is consumed by processes, so the problem is one or more processes holding a large amount of memory.
Step 3:
Check which processes use the most memory:
> spawn(fun() -> etop:start([{output, text}, {interval, 1}, {lines, 20}, {sort, memory}]) end).
(This starts etop in text output mode with a 1-second interval, 20 output rows, sorted by memory. spawn runs etop in a new process so that its output does not block input in the Erlang shell.)
The etop output is a bit messy, and values beyond a certain range are displayed as **, but it is enough to find the process with the highest memory usage.

Step 4:
View the status of the process with the highest memory usage:
> erlang:process_info(pid(..., 0)).
Step 5:
Run a manual garbage collection and hope it solves the problem:
> erlang:garbage_collect(pid(..., 0)).
true
Check the process memory again: there is no change. The GC did not reclaim anything, which means the memory is still referenced and in use, not garbage.
The cause is a loop that is not tail recursive. try...catch keeps its information on the stack, so exception handling must be placed inside the function body, not around the recursive call. Here the last thing send_msg calls is the try...catch, not send_msg itself, so it is not tail recursive and the stack keeps growing.
Summary:
1. In server programming, the loop must be tail recursive.
2. Make good use of OTP. If you use gen_server instead of a handwritten loop, this problem does not occur.

(1) back_log:
If the process list of your host shows a large number of entries like 264084 | unauthenticated user | xxx.xxx.xxx.xxx | NULL | Connect | NULL | login | NULL, that is, connections waiting to be accepted, increase the value of back_log. The default value is 50. I changed it to 500.
(2) interactive_timeout:
The number of seconds the server waits for activity on an interactive connection before closing it. An interactive client is one that passes the CLIENT_INTERACTIVE option to mysql_real_connect(). The default value is 28800. I changed it to 7200.
(3) key_buffer_size:
Index blocks are buffered and shared by all threads; key_buffer_size is the size of the buffer used for index blocks. Increase it to get better index handling (for all reads and multiple writes), up to as much as you can afford. If you make it too large, the system will start paging and become really slow. The default value is 8388600 (8 MB). My MySQL host has 2 GB of memory, so I changed it to 402649088 (about 384 MB).
(4) max_connections:
The number of simultaneous clients allowed. Increasing this value increases the number of file descriptors mysqld requires. Raise it if you have many clients; otherwise you will often see the "Too many connections" error. The default value is 100. I changed it to 1024.
(5) record_buffer:
Each thread that performs a sequential scan allocates a buffer of this size for each table it scans. If you run many sequential scans, you may want to increase this value. The default value is 131072 (128 KB). I changed it to 16773120 (16 MB).
(6) sort_buffer:
Each thread that needs to sort allocates a buffer of this size. Increase this value to speed up ORDER BY or GROUP BY operations. The default value is 2097144 (2 MB). I changed it to 16777208 (16 MB).
(7) table_cache:
The number of open tables for all threads. Increasing this value increases the number of file descriptors mysqld requires. MySQL needs two file descriptors for each unique open table. The default value is 64. I changed it to 512.
(8) thread_cache_size:
The number of threads kept in a cache for reuse. When a client disconnects, its thread is put in the cache if there is room; a new connection takes its thread from the cache if one is available. If there are many new connections, raising this variable can improve performance. Comparing the Connections and Threads_created status variables shows how effective the cache is. I set it to 80.
(10) wait_timeout:
The number of seconds the server waits for activity on a connection before closing it. The default value is 28800. I changed it to 7200.
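For thread_cache_size, the comparison of Connections and Threads_created can be reduced to a single number. A minimal sketch in Python, assuming the two counters were read by hand (for example with SHOW STATUS); the function name and the sample figures are hypothetical and not taken from these notes:

```python
def thread_cache_miss_rate(connections, threads_created):
    """Fraction of connections that needed a brand-new thread.

    connections     -- value of the Connections status variable
    threads_created -- value of the Threads_created status variable
    """
    if connections <= 0:
        raise ValueError("connections must be positive")
    return threads_created / connections


# Hypothetical counters, for illustration only.
rate = thread_cache_miss_rate(connections=10000, threads_created=250)
print(f"{rate:.1%} of connections created a new thread")
```

A rate close to zero suggests the cache is large enough; a rate that stays high under load suggests thread_cache_size could be raised.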
Note: To change these parameters, edit the /etc/my.cnf file and restart MySQL.
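Before restarting with the new values, it may be worth checking them against the host's memory. A rough sketch in Python, assuming the commonly quoted upper bound key_buffer_size + (record_buffer + sort_buffer) * max_connections. It ignores every other buffer and assumes all connections scan and sort at once, so treat it as a sanity check only. The function name is made up for illustration:

```python
MIB = 1024 * 1024


def worst_case_bytes(key_buffer_size, per_connection_buffers, max_connections):
    """Rough upper bound: shared key buffer plus the per-connection
    buffers allocated once for every allowed connection."""
    return key_buffer_size + max_connections * sum(per_connection_buffers)


# Values taken from the notes above.
key_buffer_size = 402649088
record_buffer = 16773120
sort_buffer = 16777208
max_connections = 1024

total = worst_case_bytes(key_buffer_size, [record_buffer, sort_buffer], max_connections)
host_memory = 2 * 1024 * MIB  # the 2 GB host mentioned in the notes

print(f"key buffer: {key_buffer_size / MIB:.1f} MiB")
print(f"worst case: {total / MIB:.0f} MiB, host: {host_memory / MIB:.0f} MiB")
print("over budget" if total > host_memory else "within budget")
```

With these settings the bound is far above 2 GB. That only matters if many connections really do run large scans and sorts at the same time, but it is a reason to watch memory use after the change.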

Challenges in designing a queue table:
1) The table is both read and written.
Because enqueue and dequeue operations affect each other, high load can lead to lock contention, transaction deadlocks, I/O timeouts, and so on.
2) When multiple receivers read from the same queue, they can pick up the same items, resulting in duplicate processing.
You need some form of high-performance row locking on the queue so that concurrent receivers never receive the same items.
3) The queue table needs to store rows in one order and read them in another, which makes designing indexes tricky.
The queue does not always follow first-in-first-out order. Sometimes certain messages, such as orders, have a higher priority and must be processed regardless of when they were queued.
4) The queue table needs to store serialized objects in XML or binary format, which makes storage and index rebuilds difficult.
You cannot rebuild the index on the queue table while the service is running because it contains text or binary fields. So the table gets slower every day and queries start timing out, until you have to stop the service and rebuild the index.
5) During dequeue, a batch of rows is selected, updated, and then returned. You need a "State" column that defines the state of each item, and a dequeue selects only items in a certain state. There are only a few states: PENDING, PROCESSING, PROCESSED, and ARCHIVED. An index on the state column does not help because it is not selective enough; thousands of rows share the same state. As a result, every dequeue operation causes a clustered index scan, which is CPU- and I/O-intensive and produces lock contention.
6) During dequeue, you cannot simply delete the rows from the queue table, because this easily fragments the table's storage. In addition, you need to retry orders, tasks, and notifications up to N times in case they fail the first time. This means rows stay in the table longer, the index keeps growing, and dequeue gets slower and slower.
7) You must archive processed items from the queue table to a different table or database to keep the queue table small. This means moving large numbers of rows with a specific state to another database. Such large data movements frequently cause storage fragmentation, which reduces enqueue and dequeue performance.
8) You run a 24x7 service. You cannot stop the service to archive a large number of rows. This means you must archive rows continuously without affecting enqueue and dequeue traffic.
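Challenges 2 and 5 (concurrent receivers and the state column) can be sketched in a few lines. This is an illustration only, using Python's built-in sqlite3 so that it runs anywhere. The table name, column names, claim token, and partial index are assumptions of this sketch, not the design from the notes. On a server database the claim step would normally rely on that engine's own row locking instead.

```python
import sqlite3
import uuid


def make_queue(conn):
    """Create a hypothetical queue table with a state column."""
    conn.execute(
        """CREATE TABLE queue_items (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               priority INTEGER NOT NULL DEFAULT 0,
               state TEXT NOT NULL DEFAULT 'PENDING',
               claimed_by TEXT,
               payload BLOB NOT NULL)"""
    )
    # Assumption: a partial index that only covers PENDING rows, so the
    # low selectivity of the state column matters less for dequeue.
    conn.execute(
        """CREATE INDEX idx_pending ON queue_items (priority DESC, id)
           WHERE state = 'PENDING'"""
    )


def enqueue(conn, payload, priority=0):
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO queue_items (priority, payload) VALUES (?, ?)",
            (priority, payload),
        )


def claim_batch(conn, batch_size):
    """Mark up to batch_size PENDING rows as PROCESSING in one statement
    and return them. The unique token identifies the rows this call claimed."""
    token = uuid.uuid4().hex
    with conn:
        conn.execute(
            """UPDATE queue_items
                  SET state = 'PROCESSING', claimed_by = ?
                WHERE id IN (SELECT id FROM queue_items
                              WHERE state = 'PENDING'
                              ORDER BY priority DESC, id
                              LIMIT ?)""",
            (token, batch_size),
        )
    return conn.execute(
        """SELECT id, payload FROM queue_items
            WHERE claimed_by = ?
            ORDER BY priority DESC, id""",
        (token,),
    ).fetchall()


conn = sqlite3.connect(":memory:")
make_queue(conn)
enqueue(conn, b"a")
enqueue(conn, b"b", priority=5)
enqueue(conn, b"c")
first = claim_batch(conn, 2)   # higher priority first, then oldest
second = claim_batch(conn, 2)  # only the remaining item
```

Because the claim is a single UPDATE, two receivers calling claim_batch cannot be handed the same row; a row leaves the PENDING state in the same statement that assigns it.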
OpenAMQ
Under good conditions, a short timeout plus a retry mechanism works well, and the system is more measurable.
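The short timeout plus retry idea can be sketched as follows. This is a generic Python illustration, not OpenAMQ's API; the function name, parameters, and defaults are assumptions, and the operation is expected to raise TimeoutError when it exceeds the timeout it was given.

```python
import time


def call_with_retry(operation, timeout_s=0.5, max_attempts=3, backoff_s=0.0):
    """Call operation(timeout_s); on TimeoutError, retry up to max_attempts
    times in total, optionally sleeping a little longer between attempts."""
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return operation(timeout_s)
        except TimeoutError as err:
            last_error = err
            if attempt < max_attempts and backoff_s > 0:
                time.sleep(backoff_s * attempt)
    raise last_error


# Hypothetical operation that times out twice and then succeeds.
calls = []


def flaky(timeout_s):
    calls.append(timeout_s)
    if len(calls) < 3:
        raise TimeoutError("no reply in time")
    return "ok"


result = call_with_retry(flaky, timeout_s=0.1)
```

Keeping each attempt short bounds how long a caller can be stuck on one try, and the number of attempts gives a simple figure to monitor.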

https://github.com/tiancaiamao/go-internals/blob/master/ebook/preface.md

http://blog.chinaunix.net/uid-301743-id-4144744.html

This article is an English version of an article which is originally in the Chinese language on aliyun.com and is provided for information purposes only. This website makes no representation or warranty of any kind, either expressed or implied, as to the accuracy, completeness ownership or reliability of the article or any translations thereof. If you have any concerns or complaints relating to the article, please send an email, providing a detailed description of the concern or complaint, to info-contact@alibabacloud.com. A staff member will contact you within 5 working days. Once verified, infringing content will be removed immediately.
