Regarding the interaction between mysql and hadoop data, and the hadoop folder design, concerning the interaction between mysql and hadoop data, and hadoop folder design, mysql is currently distinguished by region and business district, assuming that the region where the mysql database is read is located, I communicated with the leaders yesterday according to the region division. the leaders said that the click-through rate is not a necessary condition, and the region division is the focus, followed by persuasion from various aspects, so they had to distinguish by region, the key is that if the town area distinguishes data and products, there are more than 6 K regions throughout the country. The number of such hdfs folders does not crash. I still feel that there are many latitude and condition queries, I shouted another sentence above, not necessarily using hadoop... design the mysqlhadoop database
About mysql and hadoop data interaction, and hadoop folder design
About mysql and hadoop data interaction, and hadoop folder design
Currently, mysql is differentiated by region and commercial district. assume that the region where the mysql database is read is divided by region.
I communicated with the leaders yesterday. The leaders said that the click-through rate is not a necessary condition, and the regional division is the focus, followed by persuasion from various aspects, so they had to distinguish the region. The key is that the town area distinguishes data from products, there are more than 6 K regions in China,
The number of hdfs folders is not very bad,
There are still a lot of latitude and condition queries in the end. I shouted at the top, not necessarily using hadoop. What advantages does hadoop give play to? mysql multi-condition query is convenient. please make good use of this solution, then I was confused, and my heart was tangled and depressed. Mahout has made some achievements recently, and found that the standalone version is also quite good (with a small amount of data). SouFun is searched, and so many houses in Beijing are for sale in 519,059, so there is no need to use hadoop, I feel that if I only analyze 519,059 data records separately,
If you have a good blog and resources, please provide url connection. thank you.