Note: to learn this article, you need to build on the basic understanding of integration services. If you do not have any knowledge, please refer to step by step to learn Bi (1)-Understanding integration services
Target: Import a text file to the execl file through the ETL project.
Steps:
1. Create a is project.
2. Double-click the package. dtsx file in the "SSIS packages" folder (this file is the package file) to go to the control flow working directory.
3. Drag the data flow task from control flow items in the toolbox on the left to the work area. As shown in:
4. Double-click the data flow task pulled from control flow items and enter the data flow work panel. Note that the table menu on the top of the work area is also switched.
5. In the data flow workspace, we used to export the source data and target data space from the toolbox. Since a text file is imported to the execl file, we first drag the flat file source from the data flow sources of the toolbox to the work zone, and then drag the Excel destination from the data flow destinations to the work zone. And direct the Green Arrow of flat file source to excel destination. In this way, the data flow conversion is established according to our ideas, as shown in.
6. the flowchart has been drawn, but the actual data is not available now. At the same time, we also find that there is a red cross on the flat file source and Excel destination (indicating that there is an error in this ETL, data conversion fails ). We must pay attention to these obvious error prompts during development.
7. Add a data source, double-click flat file source, and open the flat file source editor window, as shown in:
8. in the flat file source editor window, choose to create a new flat file Connection Manager (because this is the first time we operate the data source, so we need to create a new one. If it was previously created, the flat file Connection Manager editor is displayed. The Connection Manager name should be an appropriate name. In descriptions, enter a description for the connection. In file name, Click Browse. Here we want to select a file, but we don't have it. We didn't create a new one.
9.open notepadand create a new project file named etltest.txt. Save it to the project file directory in UTF-8 format, as shown in.
10. Now that we have the data source, we have created this text file from browse in step 1. The code page automatically recognizes text files in UTF-8 format. In header row delimiter, we choose to separate them with commas.
Select preview on the left to view the separated data. For example:
11. select "OK", confirm to return to the work area of the data flow design, and find that the Red Cross on the flat file source I mentioned earlier does not exist, that is, by correctly adding the data source, the data source control is successfully configured. Next, double-click Excel destination to edit the target data source. In ole db Connection Manager of Excel destination Editor, select create an Excel target data source. In the pop-up Excel Connection Manager window, we need to select an Excel file. If no file exists, browse and create a new file in the project directory and select, going back to the Excel Connection Manager window, Excel versions automatically recognizes the Excel version type. As shown in.
12. Select OK and return to the Excel destination editor window. in name of the Excel sheet, select execl workbook for the data import. Create a New workbook, as shown in.
After that, return to the workflow design window. The Red Cross we mentioned above on Excel destination is not removed. Move the mouse up to see what the prompt is ......
Discovery means that the data in the 0th column does not match the type during the type conversion process. Data Types cannot be converted between Unicode and non-Unicode. In this case, I need to perform an intermediate conversion.
13. Drag a data conversion from data flow transformations in the toolbox to the work zone and re-adjust the input and output streams. Use the flat file source output as the data conversion input, and the data conversion output as the Excel destination input. The purpose is to convert data into compliant data through data conversion (in actual engineering applications, we will have a lot of work to use tools in Data Flow transformations to convert the data and then send it to the target data ).
14. Double-click data conversion creation, as shown in. Select the name of the column to be processed in the input column. Output alias is the name of the output column, and the type to be converted is selected in the data type column.
15. After the conversion component configuration is complete, we need to re-adjust and edit our Excel destination. Adjust the Mappings in Excel destionation editor. We will use the output of data conversion as the input column. Here we change the positions of columns 1st and 2nd.
16. Return to the data flow design window and find that the Red Cross on the Excel destionation component is missing this time and ETL is run. The exciting time has finally come. All the Green passes and runs properly. Open the Excel file and check it out. Haha.
Summary:Through this experiment, we have mastered the basic principles and procedures of integration services. In actual Bi projects, the customer's current data may be various: text files, execl files, various database files ..., It can be extracted through integration services. After extraction, a lot of data has no value. What we want is to extract valuable data from massive data that meets our requirements. Then this process is conversion, A large number of Bi techniques, such as conversion tools and algorithms, are required to do these tasks, and finally generate valuable target data.
Download the project in this article: integration services project1
Next time, we will further discuss more complex ETL processes. I hope you will enjoy your learning and hope this article will help you!