SQL Server BI (1): Installation and Basic Concepts

Source: Internet
Author: User
Article Directory
    • Download and install
    • Basic Concepts
Getting started with SQL Server BI

Many of you, like me, have probably heard of BI. However, if your work does not involve statistical analysis or data mining, it is hard to become familiar with this area. I have long wanted some experience with it, and recently I was fortunate enough to be given a data statistics and analysis task. This article summarizes that recent work.

 

Download and install

At work I use SQL Server 2008 R2. Because I am writing this blog on my own computer, I tried 2012 instead, which also let me see what is different. The download address is http://www.microsoft.com/en-us/download/details.aspx?id=29066. If your system is Chinese, select the Chinese version.

During installation, select the following feature modules: [Matching Diagram 1]

Pay attention to the content under "Prerequisites for selected features". The prompt mentions 4.0, but in fact you still need 3.5.

Basic Concepts

Data analysis mainly includes the following content:

Original database

The original database mainly stores reported data. It holds the most primitive information, such as which page a user visited or which buttons they clicked. The data can be reported through JS, AS, or back-end code.

This kind of log data is usually reported in very large volumes and may amount to hundreds of millions of records a day. I used to work at an advertising company, where the volume was astonishing: the advertisements were displayed on various portals, so a portal's PV was effectively the number of ad impressions, and user operations generated further data. Data volumes measured in GB were nothing unusual, so the design of the original database tables should pay attention to the following points:

1. Do not create any index other than the primary key. None is needed, because the related analysis and statistics are carried out in the data warehouse.

2. The primary key must be sequential. Keys such as GUIDs have no guaranteed order, so inserting such rows forces the physical storage order of the data to be adjusted. This is costly and slows down inserts.

3. If the data volume is very large, consider partitioning the tables or splitting the data across several databases.

4. If bursts of inserts put too much pressure on the database, consider adding a cache layer to relieve it. This requires writing a service that collects the data in the cache layer and inserts it into the database. The drawback is that data may be lost if the cache service fails; you can choose a cache service with persistence. In short, these trade-offs need to be weighed.
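The cache-layer idea in point 4 can be sketched as follows. This is only a minimal illustration, not the author's implementation: it uses Python's built-in sqlite3 module with an in-memory database as a stand-in for SQL Server, and the names BufferedWriter, raw_action, user_id, and action are hypothetical. The buffer here lives in process memory, so it has exactly the weakness the text describes: rows not yet flushed are lost if the process dies.

```python
import sqlite3


class BufferedWriter:
    """Hypothetical cache layer: collects reported rows and inserts them in batches."""

    def __init__(self, conn, batch_size=1000):
        self.conn = conn
        self.batch_size = batch_size
        self.buffer = []

    def report(self, user_id, action):
        # Rows are only held in memory until the batch is full.
        self.buffer.append((user_id, action))
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if not self.buffer:
            return
        with self.conn:  # one transaction per batch
            self.conn.executemany(
                "INSERT INTO raw_action (user_id, action) VALUES (?, ?)",
                self.buffer,
            )
        self.buffer.clear()


conn = sqlite3.connect(":memory:")
# Sequential integer primary key and no other index (points 1 and 2).
conn.execute(
    "CREATE TABLE raw_action ("
    " id INTEGER PRIMARY KEY AUTOINCREMENT,"
    " user_id INTEGER NOT NULL,"
    " action TEXT NOT NULL)"
)

writer = BufferedWriter(conn, batch_size=3)
for i in range(7):
    writer.report(user_id=i, action="click")

pending = len(writer.buffer)  # rows still waiting in the cache layer
writer.flush()                # write whatever is left
total = conn.execute("SELECT COUNT(*) FROM raw_action").fetchone()[0]
```

With a batch size of 3, the database sees a few batched transactions instead of one insert per reported event. A persistent cache service would replace the in-memory list at the cost of extra infrastructure.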

Data warehouse database
A data warehouse database is required, and all statistical analysis is based on it. A data warehouse has two types of tables: dimension tables (Dimension) and fact tables (Fact).

1. Dimension tables
Dimensions are easy to understand. For example, if we want to know how many users use the product every day, "daily" is a dimension, because we query the number of users by day. Similarly, year, month, week, quarter, and region are among the most common dimensions.

2. Fact tables
Fact tables may seem vaguer. Roughly speaking, a fact table holds the records of whatever we want statistics on: the facts that the data describes. For example, each user operation is a fact, so a fact table of user actions is required when we compute statistics on user behavior.

3. Relationship between fact tables and dimension tables
If we use the time dimension for statistics on user behavior, the fact table must have a time field. What that field stores is actually the primary key ID of the time dimension table rather than the real time: [Matching Diagram 2]

Note: In my fact table factuseraction (user operation facts), Operatedate is of type int and is associated with the dimension table Dimdate. Dimdate splits the date into three fields: year, month, and day. This design is needed because statistics may have to be computed by year or by month. There is a hierarchical relationship between these fields, which we will talk about later.

[Figure: original table design]

Careful readers may notice that useraction in the fact table is also of type int. It is in fact another query dimension, but here we only use time as the example.
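The relationship described above can be sketched as a small runnable example. It is an illustration under stated assumptions, not the author's actual schema: SQLite (via Python's sqlite3 module) stands in for SQL Server, the table names factuseraction and Dimdate and the columns Operatedate and useraction come from the text, while the userid column, the id columns, and all sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Dimdate (
    id    INTEGER PRIMARY KEY,
    year  INTEGER NOT NULL,
    month INTEGER NOT NULL,
    day   INTEGER NOT NULL
);
CREATE TABLE factuseraction (
    id          INTEGER PRIMARY KEY,
    userid      INTEGER NOT NULL,   -- assumed column, not named in the text
    useraction  INTEGER NOT NULL,   -- int key of another (action) dimension
    Operatedate INTEGER NOT NULL REFERENCES Dimdate(id)  -- key, not a real time
);
""")

# Invented sample data: three dates and four user operations.
conn.executemany("INSERT INTO Dimdate VALUES (?, ?, ?, ?)", [
    (1, 2013, 1, 30),
    (2, 2013, 1, 31),
    (3, 2013, 2, 1),
])
conn.executemany(
    "INSERT INTO factuseraction (userid, useraction, Operatedate) VALUES (?, ?, ?)",
    [(100, 1, 1), (101, 1, 1), (100, 2, 2), (102, 1, 3)],
)

# Distinct users per month: join the fact table to the date dimension
# and group by the dimension's year and month fields.
monthly = conn.execute("""
    SELECT d.year, d.month, COUNT(DISTINCT f.userid)
    FROM factuseraction AS f
    JOIN Dimdate AS d ON d.id = f.Operatedate
    GROUP BY d.year, d.month
    ORDER BY d.year, d.month
""").fetchall()
```

Because the date is already split into year, month, and day, switching the statistic from monthly to yearly or daily only changes the GROUP BY columns.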

4. How to design fact tables and dimension tables
The design of fact tables and dimension tables is mainly determined by the needs of operations and product staff. Developers need to be able to push back on a requirement when it is too complex to develop, though of course not unreasonably. This series will work through requirements in the time dimension, which is generally a mandatory requirement.

5. Filling fact tables and dimension tables with data
This is the simplest part for us programmers, because we can develop a service that periodically reassembles data from the original database according to the warehouse design and inserts it. SQL Server BI also provides a ready-made tool for this. The process is called ETL (Extraction-Transformation-Loading: data extraction, transformation, and loading), and SQL Server BI calls its tool SSIS (SQL Server Integration Services).

The three boxes in the image read data from a table, convert the columns, and map the converted columns to a table in the target database. What I do here is convert the createtime field into year, month, and day and then insert them into the Dimdate table.
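For readers who prefer the "write your own service" route mentioned earlier, the same three steps can be sketched in code. This is not how SSIS works internally; it is a hand-written equivalent under assumptions: SQLite stands in for both databases, the source table name raw_action, the timestamp format, and the sample rows are invented, and only createtime and Dimdate come from the text.

```python
import sqlite3
from datetime import datetime

# Source: the original database with raw createtime values (sample data).
source = sqlite3.connect(":memory:")
source.execute("CREATE TABLE raw_action (id INTEGER PRIMARY KEY, createtime TEXT)")
source.executemany("INSERT INTO raw_action (createtime) VALUES (?)", [
    ("2013-01-30 08:15:00",),
    ("2013-01-30 21:40:12",),
    ("2013-02-01 09:00:00",),
])

# Target: the warehouse with the date dimension table.
target = sqlite3.connect(":memory:")
target.execute(
    "CREATE TABLE Dimdate ("
    " id INTEGER PRIMARY KEY,"
    " year INTEGER, month INTEGER, day INTEGER,"
    " UNIQUE (year, month, day))"
)


def extract(conn):
    """Extraction: read the createtime column from the source table."""
    for (createtime,) in conn.execute("SELECT createtime FROM raw_action"):
        yield createtime


def transform(createtime):
    """Transformation: split one timestamp into (year, month, day)."""
    dt = datetime.strptime(createtime, "%Y-%m-%d %H:%M:%S")
    return dt.year, dt.month, dt.day


def load(conn, rows):
    """Loading: insert the converted rows, skipping dates already present."""
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO Dimdate (year, month, day) VALUES (?, ?, ?)",
            rows,
        )


load(target, (transform(c) for c in extract(source)))
dim_rows = target.execute(
    "SELECT year, month, day FROM Dimdate ORDER BY year, month, day"
).fetchall()
```

The UNIQUE constraint together with INSERT OR IGNORE keeps one row per calendar date even though two source rows fall on the same day, so the job can be rerun on a schedule without duplicating dimension rows.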

Using ETL also makes it more convenient to interact with Analysis Services, for example when handling tables after an import, and it helps with tasks such as integrating various data sources. However, I have not researched this in depth, so I will not write about it for now. If you are interested, you can explore it yourself.

 

That is all for today, because installing SQL Server took a great deal of effort: the .NET Framework 3.5 download kept failing during installation. My system is Windows 8, and 3.5 has to be installed separately there.

Take some time to get familiar with these concepts. The next article will cover Analysis Services.
