Scenario One
Conditions:
The database has a table A containing 1,000,000 rows (100W).
Operation:
Fetch all the rows from the table and process them.
Problem:
- Can all the rows be fetched at once, i.e. with a single query such as
select field1, field2 from A
(Besides avoiding a memory overflow from fetching too much data, what else needs to be considered?)
- What factors determine how many rows should be fetched per batch?
Scenario Two
Conditions:
Service A has a batch of data and needs to call an interface of service B to get more information. Service B provides a batch interface.
Operation:
Service A calls service B over RPC.
Problem:
- Can the information for the whole batch be fetched in a single RPC call to service B's batch interface? (If not, what is the main concern?)
- What factors determine how many items should be sent per query?
Reply content:
Question 1:
1. You can try fetching everything first and see how much data comes back. If it only takes a few MB of memory, fetching and processing it in one go should not be a problem.
2. If you fetch in batches, make sure the query has a stable sort order so that paging does not return the same row twice.
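To illustrate point 2, here is a minimal sketch of batch fetching with a stable order. It assumes table A has an auto-increment integer primary key named id (the question does not say so), and it uses an in-memory SQLite database only so the example runs on its own. Resuming after the last id seen, instead of using OFFSET, is one common way to avoid returning the same row twice.

```python
import sqlite3


def iter_batches(conn, batch_size=1000):
    """Yield lists of (id, field1, field2) rows, ordered by primary key."""
    last_id = 0  # assumption: ids are positive integers
    while True:
        rows = conn.execute(
            "SELECT id, field1, field2 FROM A WHERE id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not rows:
            return
        yield rows
        last_id = rows[-1][0]  # resume strictly after the last row seen


# Demo data: a small stand-in for the real table A.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (id INTEGER PRIMARY KEY, field1 TEXT, field2 TEXT)")
conn.executemany(
    "INSERT INTO A (id, field1, field2) VALUES (?, ?, ?)",
    [(i, f"a{i}", f"b{i}") for i in range(1, 2501)],
)

batch_lengths = []
seen_ids = []
for batch in iter_batches(conn, batch_size=1000):
    batch_lengths.append(len(batch))
    seen_ids.extend(row[0] for row in batch)
```

Because every batch starts strictly after the previous batch's last id, rows inserted while the loop is running cannot shift the pages and cause repeats.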
Question 2:
1. Look at the amount of data. If it is not large, a single call is enough.
2. Just build it first; any problems will surface in practice. From your brief description alone it is hard to tell what the problem would be.
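As a rough sketch of point 1: a small batch goes out in one call, and a larger one is split into chunks. The function fake_b_batch_get is a hypothetical stand-in for service B's batch interface, and the limit of 200 ids per call is an arbitrary assumption, not something service B is known to require.

```python
from typing import Callable, Dict, Iterator, List


def chunked(items: List[int], size: int) -> Iterator[List[int]]:
    """Split items into consecutive slices of at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_all(
    ids: List[int],
    batch_call: Callable[[List[int]], Dict[int, str]],
    max_per_call: int = 200,
) -> Dict[int, str]:
    """Call the batch interface once per chunk and merge the results."""
    result: Dict[int, str] = {}
    for chunk in chunked(ids, max_per_call):
        result.update(batch_call(chunk))
    return result


def fake_b_batch_get(ids: List[int]) -> Dict[int, str]:
    """Hypothetical stand-in for the RPC to service B."""
    return {i: f"info-{i}" for i in ids}


call_sizes = []


def recording_call(ids: List[int]) -> Dict[int, str]:
    """Wrap the stand-in so we can see how many ids each call carried."""
    call_sizes.append(len(ids))
    return fake_b_batch_get(ids)


info = fetch_all(list(range(450)), recording_call, max_per_call=200)
```

With 450 ids and a limit of 200, this makes three calls; with 200 ids or fewer it makes exactly one.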
Let me first take the two scenarios separately.
Scenario One:
1. Can all the rows be fetched at once, i.e. with a query such as select field1, field2 from A?
(Besides avoiding a memory overflow from fetching too much data, what else needs to be considered?)
2. What factors determine how many rows should be fetched per batch?
First, the point you put in parentheses is the biggest issue.
Second, leaving the parenthetical in the first question aside, the second question is the one that needs the most thought.
What are your requirements? Is there a time limit? Fetching everything in one go may be slow; do you want to split the work into batches across threads? Of course, if you fetch everything at once you need to consider DB query speed and data accuracy. With 1,000,000 rows (100W), it should be fine as long as the table has no large columns (TEXT, BLOB, etc.).
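The idea of splitting the work into batches across threads could look roughly like this. It is only a sketch: it assumes the rows can be partitioned by a contiguous integer id from 1 to 1,000,000, and handle_range is a hypothetical placeholder for the real query and processing. The batch size of 50,000 and the 4 workers are arbitrary values to be tuned.

```python
from concurrent.futures import ThreadPoolExecutor


def id_ranges(min_id, max_id, batch_size):
    """Split [min_id, max_id] into inclusive (start, end) ranges."""
    start = min_id
    while start <= max_id:
        end = min(start + batch_size - 1, max_id)
        yield (start, end)
        start = end + 1


def handle_range(bounds):
    """Hypothetical worker: process one id range and return the row count.

    In real code this would run something like
    SELECT field1, field2 FROM A WHERE id BETWEEN start AND end
    on its own connection and then process the rows.
    """
    start, end = bounds
    return end - start + 1


ranges = list(id_ranges(1, 1_000_000, 50_000))

# Each range is independent, so the workers never read the same row.
with ThreadPoolExecutor(max_workers=4) as pool:
    processed = sum(pool.map(handle_range, ranges))
```

Whether threads actually help depends on where the time goes; if the database is the bottleneck, more workers may just add load.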
Scenario Two:
1. Can the information for the whole batch be fetched in a single RPC call to service B's batch interface? (If not, what is the main concern?)
2. What factors determine how many items should be sent per query?
First, it depends on how B's interface is designed. If you are asking whether it can be done, the answer is yes. The main concerns are: if you send it too much at once, will it be slow? Will the response be too large to return?
Second, see below.
Now the two scenarios together.
Your real concern is performance: 1,000,000 rows have to be processed in batches, but you don't know what the best approach is. I suggest you first set up the processing environment, load the data, and test it yourself. This kind of problem is complex and cannot be explained in a sentence or two.
Use the test results to tune the processing speed and the program's performance. Performance depends on the implementation, and there are many ways to implement this, such as the multithreading I mentioned above; whether that works depends on whether the data you query can be processed in batches.
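For the "test it yourself" advice, a small timing harness along these lines may be enough to compare batch sizes. Everything here is an assumption for illustration: the rows are just integers in memory, and handle is a placeholder (sum) for the real per-batch work, so the numbers it prints say nothing about a real database or RPC.

```python
import time


def process_in_batches(rows, batch_size, handle):
    """Run handle() on consecutive slices of rows; return the batch count."""
    batches = 0
    for start in range(0, len(rows), batch_size):
        handle(rows[start:start + batch_size])
        batches += 1
    return batches


def time_batch_sizes(rows, sizes, handle):
    """Measure wall-clock seconds for each candidate batch size."""
    timings = {}
    for size in sizes:
        started = time.perf_counter()
        process_in_batches(rows, size, handle)
        timings[size] = time.perf_counter() - started
    return timings


rows = list(range(100_000))  # stand-in for real data
timings = time_batch_sizes(rows, [100, 1_000, 10_000], handle=sum)
for size, seconds in timings.items():
    print(f"batch_size={size}: {seconds:.4f}s")
```

Swapping handle for the real database query or RPC call, and running against realistic data, gives figures you can actually use to pick a batch size.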