Focusbi:ssis Development Case (original)

Source: Internet
Author: User
Tags ole management studio sql server management sql server management studio ssis

  

Follow the public number: Focusbi See more articles; Add QQ Group: 808774277 Get learning materials and explore issues together.

  

Business Intelligence Tutorial PDF

Link: https://pan.baidu.com/s/1f9VdZUXztwylkOdFLbcmWw Password: 2r4v

This picture is really handsome, I admit that I do not because of this picture and the BI, but there is no other choice to enter the BI.

Let's not see him drag the chart so cool, so natural, you do not know how much effort behind the data team to make his analysis of the report so easy natural. In the Business Intelligence project implementation of the process of data cleaning takes up 70 of the entire project time, if a project is expected to be completed four months, that is at least three months in the data cleaning is done, this does not consider the poor quality of data, if party B before the admission Party not to understand the data quality of party A, This could be a bottomless pit. Party B will be forced to fill in the pit for party A, and these pits are not to be filled by party B, the final result will lead to no survival of the project, sleeping on the street originally three or four months to fix the project last year, the project costs multiplied to lead to project losses; how painful the understanding.

Data cleaning is dirty live dirty, also in front of the boss can not show value. This leads some BI vendors to launch self-service BI concepts that do not rely on it to make analytics available to business people by themselves, which is a bit of a nonsense. Finally, it provides SQL statements, popping.

In the "Using Python crawler for BI to prepare the data source" this blog in the use of Crawling crawl Web page data to supplement the source of data, here I do not use an acre of farmland data, using crawler crawling rental data, review data, takeaway data, cell data to do a simple ETL process, In the reptile phase I have taken into account the data cleansing and structure problems, so that the data into the database through the simple processing, and the data related to the latitude and longitude of the use of crawler automation, the following will also talk about methods.

This data is for personal use only and not for commercial purposes.

Rental Data data requirements

From a real estate site, climbed the city of Guangzhou: Huangpu, Luogang, Zengcheng, Liwan, Baiyun, Panyu, Guangzhou surrounding,

Nansha, Conghua, Yuexiu, Tianhe, Haizhu, Huadou 13 areas of the rental data; the content of the crawl is: shorthand, specific location, location area, location road, landlord, monthly rent; There are: longitude, dimension to be converted. A total of 26 rental data sheets and latitude and longitude tables you need to merge these 26 tables into one table through SSIS data cleansing.

The contents are as follows:

  

  

  

Merging data

A total of 26 tables are combined into a single table, where hard-coded processing is used to see the city table in the table list

This table has a total of 13 records used to construct the table name, and the method of merging the data uses a cursor to iterate through each of the data and save it to the same table. This hard-coded approach is not recommended, and it is entirely possible to use the system to get a type of table name in a dynamic traversal using a cursor, in order to make it easier to understand the hard-coded way to construct the table name.

SQL stored Procedures
1  Use [51zxw_ Case Library _ods]2 GO3 /** * * * object:storedprocedure [dbo].    [Zf_dataall] Script DATE:2018/8/16 16:17:56 * * * * **/4 SETAnsi_nulls on5 GO6 SETQuoted_identifier on7 GO8 -- =============================================9 --Author:focusbiTen --Create date:20180816 One --Description: Aggregate rental Data A -- ============================================= - CREATE PROCEDURE [dbo].[Zf_dataall] -  as the BEGIN -  -     truncate Table [dbo].[Ods_zf_gz_dataall] -  +     DeclareZf_datatableCursorScroll for -     Select distinct  +      Case  whenCity Name= 'around Guangzhou'  Then '[51zxw_ Case Library _yw].dbo. Zf_ around Guangzhou $'     A          Else '[51zxw_ Case Library _yw].dbo. Zf_' +City Name+ 'District $' End  aszf_table, at      Case  whenCity Name= 'around Guangzhou'  Then '[51zxw_ Case Library _yw].dbo. Zf_ around Guangzhou _jw$' -          Else '[51zxw_ Case Library _yw].dbo. Zf_' +City Name+ 'District _jw$' End  asZF_TABLE_JW -      from [51zxw_ Case Library _yw].[dbo].[zf_ City $] -  -     Openzf_datatable -     Declare @ZF_Table nvarchar( -) in     Declare @ZF_Table_JW nvarchar( -) -     Fetch Next  fromZf_datatable into @ZF_Table,@ZF_Table_JW to      while(@ @fetch_status=0) +     begin   -  the       Declare @sql0 nvarchar( $) *       Set @sql0 = 'Update' + @ZF_Table + 'set [month rent] = 0 from' + @ZF_Table + 'where [monthly rent] ="'NULL"' ' $       exec(@sql0)Panax Notoginseng  -  the       Declare @sql1 nvarchar( -) +       Set @sql1 = ' A INSERT INTO [dbo]. [Ods_zf_gz_dataall] (Zfid, shorthand, exact location, location area, location road, landlord, monthly rent, latitude, longitude) the Select + A.zfid, - Cast (case abbreviated as nvarchar) as the case is abbreviated, $ cast (a. Specific position as nvarchar) as specific position, $ cast (position area as nvarchar) as location area, - cast (position road as nvarchar) as Location road, - cast (landlord as nvarchar) as landlord, the cast (monthly rent as nvarchar) as monthly rent, - LATB as Latitude,Wuyi lngb as Longitude the  from' + @ZF_Table + 'As a - Left join' + @ZF_Table_JW + 'As b on a.zfid = B.zfid Wu       ' -        About       exec(@sql1) $  -      Fetch Next  fromZf_datatable into @ZF_Table,@ZF_Table_JW -     End -  A     Closezf_datatable +     deallocatezf_datatable the      - END

SSIS Scenario 1. Creating a Project

  

2. Configure the data source

Right-click the Connection Manager and see the window select the type of data source you want to create. Here you select the Connection manager type to connect to SQL Server OLE DB or ADO. Select Ole DB and then select New, enter the server, user name, password, select Test Connection, and then modify the connection management name. Here to build two connection managers, one is the source one is the target.

  

  

  

  

3. New Package

Right-click the SSIS package, select New SSIS package, right-click package1 Select Rename Input ods_zf_dataall.

4. Select Control Flow

Drag a two Execute SQL task control from the right-hand SSIS Toolbox. Connect with arrows.

5. Process Data Flow

Double-click the first Execute SQL task control, select the drop-down box for the connection option, select the Target data source, click SQLStatement to enter the SQL statement, modify name, click OK, and the second Execute SQL task is similar. When finished, right-click the package to select Execute package.

  

  

  

  

6. Project Deployment

Right-click the scenario name, select Deploy, click Next, enter the server address select the path to start deployment.

  

  

 

7. Implementation plan

Enter SQL Server Management Studio to select SQL Server Agent, create a new job, create a new job step, populate each parameter, and configure the schedule.

  

8. Email monitoring

Configure mail monitoring requires a mail server to configure SMTP, we recommend that you use 126 mailbox to register SMTP.

Reviews data

The review data is the same as the rent data processing method, here is not repeated explanation; The review data also contains 26 tables, the fields are store name, star rating, per capita, big class, classification, city, County city, region, address, number of reviews, ID, longitude, latitude

Takeaway data

The Takeaway data field contains: dt_id, dm_id, shop name, grade, monthly sales, starting and delivery fee, delivery fee, delivery time, dm_addr, DM_LATB, DM_LNGB, area, metro station, DT_LATB, DT_LNGN

Community data

Cell data fields include: Xqid, City, County city, region, cell name, year built, road, unit price, latitude, longitude

Latitude and longitude of address

Whether it is rent, reviews, takeout, cell data are not the presence of address, if you want to show on the map they must get this address longitude and latitude, this latitude and longitude conversion Some map manufacturers have given the interface, only need to get the interface incoming address parameter can automatically return longitude and latitude, Here I provide a method: This is the interface provided by Baidu, using Python call Baidu interface, here must first register the certification for Baidu Developer General audit time is one weeks will be notified by email you whether the audit success, you will get a developer key, This key has the ability to convert only 6,000 addresses per day.

Focusbi:ssis Development Case (original)

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.