ASP.NET + SQL Server big data solution vs. Hadoop
Half a month ago I saw an article on Blog Park (cnblogs) claiming that .NET is no good. All I want to say is: if you have time to complain, you would do better to write something practical.
1. Advantages and disadvantages of SQL Server
Advantages: support for indexing, transactions, and security, plus high fault tolerance.
Disadvantages: optimization is required once the data volume reaches 1 million records or more. We usually resort to horizontal splitting, table sharding, partitioning, and job-based synchronization, which greatly increases logic complexity and makes the system hard to maintain. Only clustering provides fault tolerance, and there is no built-in multi-database load balancing or parallel computing.
2. Is SQL Server really unable to process big data?
Answer: of course it can. Operating on a single database can be called a one-dimensional operation. Operating on multiple same-structure databases distributed across multiple servers can be called a two-dimensional operation. We only need to encapsulate this two-dimensional operation so that it runs in parallel, which spreads the load across the servers. We do not need to write much ourselves, because SQL Server already encapsulates a great deal for us. It is like a giant, and by standing on its shoulders we can easily implement big data processing for the web.
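To make the one-dimensional versus two-dimensional idea concrete, here is a minimal sketch. It is written in Python purely as a language-neutral illustration and is not SqlSugar's C# API. The in-memory lists in `NODES` are hypothetical stand-ins for same-structure databases on different servers, and `query_node` and `query_all` are names I made up for this example.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for same-structure databases on different servers.
NODES = [
    [{"id": 1, "name": "apple"}, {"id": 2, "name": "banana"}],
    [{"id": 3, "name": "pineapple"}],
    [{"id": 4, "name": "cherry"}],
]

def query_node(rows, keyword):
    """The 'one-dimensional' operation: search a single database."""
    return [r for r in rows if keyword in r["name"]]

def query_all(nodes, keyword):
    """The 'two-dimensional' operation: run the same query on every node in parallel."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        parts = pool.map(query_node, nodes, [keyword] * len(nodes))
        # Flatten the per-node results into one list, in node order.
        return [row for part in parts for row in part]

result = query_all(NODES, "apple")
```

The caller only sees `query_all`; which server answered which part is hidden inside the wrapper, which is the kind of encapsulation described above.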
3. What are the disadvantages of Hadoop, and why is it a poor fit for .NET?
(1) Slow data synchronization
(2) Difficult transaction processing
(3) Exceptions are difficult to capture
(4) It is difficult to combine with ASP.NET, in terms of both learning cost and native support.
(5) It requires installation. It is suitable for offline big data processing, but not necessarily for web applications.
4. What is the SqlSugar framework?
SqlSugar is a lightweight, high-performance ORM framework for SQL Server. In addition to performance comparable to ADO.NET, SqlSugar now supports multi-database parallel computing.
Advantages:
(1) Suitable for low-latency queries over massive data
(2) Support for distributed transactions
(3) Lets JOIN fly, so you can say goodbye to the no-JOIN restriction of big data
(4) Native C#/.NET syntax and a large number of encapsulated functions
(5) Random storage: data can be stored in any node database, achieving real load balancing rather than the read/write separation of the traditional master/slave mode.
Disadvantages: SQL Server license fees are expensive. It suits companies that have the money, or small companies that do not pay the license fee.
SqlSugar learning directory
1. Basic SqlSugar usage
2. Use SqlSugar to process big data
3. Use SqlSugar to implement JOIN (to be updated)
4. Use SqlSugar to implement paging + grouping + multi-column sorting (to be updated)
5. How to perform master/slave switchover on node failure
This article covers part 2: Use SqlSugar to process big data
1. SqlSugar principles
Insert: the row is stored in a randomly chosen node database. Each node can be configured with a processing probability; if it is set to 0, no new data is added to that node.
Update and Delete: all database nodes are requested asynchronously, and the processing results are then collected and summarized.
Search: queries for the first X pages, the last X pages, and result sets with PageCount < 1000 (the value 1000 can be set in the program) are specially optimized. For other data, the nodes are queried asynchronously and the results are then merged. The performance gain shows fully in a multi-server architecture. In a single-server architecture, ensure sufficient IO and avoid full table scans; otherwise the optimization will have little effect.
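The Update/Delete rule (send the request to every node at once, then summarize the results) can be sketched as follows. Again this is Python used only for illustration, not SqlSugar code. The lists stand in for node databases, and `delete_where` and `delete_everywhere` are hypothetical helpers; a real node would execute a SQL DELETE and report its affected-row count.

```python
from concurrent.futures import ThreadPoolExecutor

def delete_where(rows, predicate):
    """Delete matching rows on one node and return how many rows were removed."""
    before = len(rows)
    rows[:] = [r for r in rows if not predicate(r)]
    return before - len(rows)

def delete_everywhere(nodes, predicate):
    """Send the delete to all nodes concurrently, then sum the affected-row counts."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        futures = [pool.submit(delete_where, rows, predicate) for rows in nodes]
        return sum(f.result() for f in futures)

# Three hypothetical nodes; the third one happens to be empty.
nodes = [[{"id": 1, "num": 5}, {"id": 2, "num": 9}], [{"id": 3, "num": 9}], []]
affected = delete_everywhere(nodes, lambda r: r["num"] == 9)
```

Because a row may live on any node, the request has to go to all of them; the caller still gets back a single affected-row total, just as with one database.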
1. Single server, single hard disk, multi-database architecture:
Suitable for low concurrency, a data volume of less than 100 million records, and high response-speed requirements. It is recommended that the data volume not exceed 10 million. Avoid full table scans in queries and make full use of IO performance so that the advantages of asynchronous execution show.
As a test, a fuzzy search was run against 10 same-structure databases deployed on the same PC.
The name column has a full-text index, and id and num share a composite index.
The ten databases hold 5.4 million records in total, and on an ordinary hard drive the query takes only 0.3 seconds.
2. Single server, multiple hard disks or disk arrays:
Full table scans with LIKE become feasible, and performance improves significantly.
3. Multi-server, multi-database architecture:
Because the load is spread across the servers hosting each node, storing many millions of records is easy. The more node servers there are, the more data can be processed and the faster the queries run. Even for terabyte-scale data, answering queries within seconds is not a problem; all it takes is N cheap PCs.
2. Usage
1. Reference SqlSugar.dll
2. Configure the connection string
Here, rate is the probability that a node receives the data during an Insert; 0 means no new data is added to that node. In my configuration every rate is set to 1, meaning no node is favored.
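How a per-node rate can drive the choice of Insert target is easy to show with a weighted random pick. This is a sketch in Python under my own assumptions, not SqlSugar's implementation: the node names in `NODE_RATES` and the function `pick_insert_node` are invented for the example, and I treat rate as a relative weight.

```python
import random
from collections import Counter

# Hypothetical node list; "rate" mirrors the setting described above.
NODE_RATES = {"node_a": 1, "node_b": 1, "node_c": 0}

def pick_insert_node(rates, rng=random):
    """Choose the node for an Insert, weighted by each node's rate."""
    # Nodes with rate 0 are excluded, so they never receive new data.
    names = [name for name, rate in rates.items() if rate > 0]
    weights = [rates[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Simulate 10,000 inserts with a seeded generator to see the distribution.
rng = random.Random(42)
counts = Counter(pick_insert_node(NODE_RATES, rng) for _ in range(10_000))
```

With equal rates the inserts split roughly evenly between node_a and node_b, while node_c, whose rate is 0, receives nothing. Raising one node's rate shifts new data toward it without touching existing rows.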
3. Insert, delete, and update usage
4. Enable distributed transactions
The servers must have services such as MSDTC enabled.
5. Taskable is the underlying core of all distributed computing
This is where complex queries, such as paging and grouping, are handled. DataTable, T : class, and value types are supported, so the results from multiple databases can be conveniently collected into one container.
When using Taskable, make sure that no node returns a large amount of data. With a small amount of in-memory processing on top, this model can handle complex data queries.
6. Use Taskable for grouped queries
For statistical report queries, the result set is not too large, so they can be processed with Taskable. The Merge method aggregates the query results of all databases into a new set.
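The idea of merging grouped results from every database into one new set can be sketched like this. It is Python for illustration only and does not reproduce the signature of SqlSugar's Merge method; `merge_group_counts` and the sample data are hypothetical. Each dictionary plays the role of one node's result for a query such as a COUNT grouped by a category column.

```python
from collections import Counter

def merge_group_counts(per_node_results):
    """Combine per-node grouped counts into one result set by adding them per key."""
    total = Counter()
    for node_result in per_node_results:
        total.update(node_result)  # adds counts key by key
    return dict(total)

# Hypothetical grouped results returned by three nodes.
per_node = [{"A": 3, "B": 1}, {"A": 2, "C": 4}, {"B": 5}]
merged = merge_group_counts(per_node)
```

Counts and sums can be merged by simple addition like this. An average cannot: each node would have to return its sum and its count, and the division happens only after merging.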
7. Taskable extension functions make multi-database operations easier.
8. Distributed paging
Given the database sharding mechanism, GUIDs are recommended for primary keys to ensure uniqueness. The paging function can be used only when primary keys are unique.
A preliminary index is calculated from the number of nodes, the number of rows per page, and the current page number. The row at that index is fetched, and its real index in the overall ordering is compared with the requested page position to work out a new index. This repeats until the exact position is found and the data is read. That is the principle.
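For comparison, here is the simplest correct way I know to page across nodes. Note that this is a different, more naive technique than the index-locating approach described above, not a reconstruction of it: every node returns its first page_index × page_size rows in sorted order, and the sorted lists are merged. It is written in Python for illustration; `fetch_sorted_prefix` and `get_page` are hypothetical names, and the unique id serves as a tie-breaker, which is why unique primary keys matter.

```python
import heapq
from itertools import islice

def sort_key(row):
    # The unique id breaks ties so the global order is fully determined.
    return (row["num"], row["id"])

def fetch_sorted_prefix(rows, limit):
    """Stand-in for a per-node 'top N ordered by num, id' query."""
    return sorted(rows, key=sort_key)[:limit]

def get_page(nodes, page_index, page_size):
    """Return one global page (page_index is 1-based) ordered by (num, id)."""
    limit = page_index * page_size
    prefixes = [fetch_sorted_prefix(rows, limit) for rows in nodes]
    merged = heapq.merge(*prefixes, key=sort_key)  # lazy k-way merge
    start = (page_index - 1) * page_size
    return list(islice(merged, start, start + page_size))

# Three hypothetical nodes holding five rows in total.
nodes = [
    [{"id": "a", "num": 5}, {"id": "b", "num": 1}],
    [{"id": "c", "num": 3}, {"id": "d", "num": 2}],
    [{"id": "e", "num": 4}],
]
page_two = get_page(nodes, page_index=2, page_size=2)
```

The weakness is obvious: the deeper the page, the more rows every node has to return. That cost is what locating the exact position first, as described above, is meant to avoid.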
I will not go further into the principle here; it is hard to explain it any more clearly in writing. Interested friends can join the discussion group: 225982985.
Source Code address: https://github.com/sunkaixuan/SqlSugar
Haha, I did my best. Whether it is good or bad, please give it a like.