When it comes to data warehouses and ETL, I am basically a layman. Everything has to start from scratch, so I am taking notes to keep track of what I learn.
First, let's take a look at the basic definition:
Some people simply call ETL "data extraction". At least, before I started studying it, my manager told me that I needed to build a data extraction tool.
Extraction is indeed a key part of ETL. As the name suggests, it means pulling (copying) data out of different data sources.
That sounds too easy.
The explanation above is incomplete, a little like claiming that it was the seventh sesame cake alone that made you full.
Thinking about it carefully, extraction cannot exist on its own; it has to be considered together with the other steps in the chain.
So we arrive at the definition of ETL:
The process of extracting (Extract), transforming (Transform), cleansing (Cleansing), and loading (Load) data.
Having come this far, let's widen the view and ask where this abstract process begins and ends:
Where does the extracted data come from?
Where is the data loaded to?
Extract source: In most cases it can be considered a relational database; the more technical term is an online transaction processing (OLTP) system. Of course, in a broader sense, it could also be some other database or a file system.
Destination: The target is a data warehouse. What is a data warehouse? Before I started studying, it was an abstract monster to me. After reading some introductory material, I realized this monster is not strange at all: it is a warehouse that accumulates data for analysis. Because it exists for analysis, it is different from the data store in an OLTP system.
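To make the source and the destination concrete for myself, here is a minimal sketch of the whole extract, transform/cleanse, load chain. It is only an illustration under my own assumptions: two in-memory SQLite databases stand in for the OLTP system and the data warehouse, and the table names (orders, fact_orders), the columns, and the cleansing rules are hypothetical, not taken from any real system.

```python
import sqlite3


def extract(source):
    """Extract: copy the raw rows out of the (hypothetical) OLTP table."""
    return source.execute("SELECT id, customer, amount FROM orders").fetchall()


def transform(rows):
    """Transform and cleanse: drop rows without an amount, normalize the rest."""
    cleaned = []
    for order_id, customer, amount in rows:
        if amount is None:
            continue  # cleansing rule (assumed): skip incomplete orders
        cleaned.append((order_id, customer.strip().lower(), round(float(amount), 2)))
    return cleaned


def load(warehouse, rows):
    """Load: write the cleaned rows into the (hypothetical) warehouse table."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS fact_orders "
        "(id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?, ?)", rows)
    warehouse.commit()


def run_etl(source, warehouse):
    """Run the three steps in order and return the number of rows loaded."""
    rows = transform(extract(source))
    load(warehouse, rows)
    return len(rows)


def make_demo_source():
    """Build a tiny stand-in for the OLTP database with some messy rows."""
    source = sqlite3.connect(":memory:")
    source.execute(
        "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount TEXT)"
    )
    source.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "  Alice ", "10.50"), (2, "BOB", None), (3, "alice", "4.25")],
    )
    source.commit()
    return source


if __name__ == "__main__":
    source = make_demo_source()
    warehouse = sqlite3.connect(":memory:")
    print(run_etl(source, warehouse))
    print(warehouse.execute("SELECT * FROM fact_orders ORDER BY id").fetchall())
```

The point of the sketch is only the shape: the source is read once, the messy rows are fixed or dropped in between, and the warehouse ends up with its own copy of the data. A real ETL tool would add incremental loads, error handling, and much more.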
Next, let's see why we need ETL.
In my opinion, there are two reasons.
First: performance. Pulling the data that needs to be analyzed out of the OLTP system keeps analysis and transaction processing from getting in each other's way. Wait, isn't that what the data warehouse is for? Yes,
and in most cases a data warehouse is populated by ETL tools.
Second: control. Users have complete control over the data extracted from the OLTP system, and once you have the data, everything else becomes possible:
OLAP analysis, data mining, and so on.
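As a small illustration of what "having the data" allows, here is a sketch of an OLAP-style roll-up run against the extracted copy instead of the live transaction system. The in-memory SQLite database, the fact_orders table, and its columns are hypothetical stand-ins that I made up for this note.

```python
import sqlite3


def total_by_customer(warehouse):
    """Roll up order count and total amount per customer."""
    return warehouse.execute(
        "SELECT customer, COUNT(*), SUM(amount) "
        "FROM fact_orders GROUP BY customer ORDER BY customer"
    ).fetchall()


def make_demo_warehouse():
    """Build a tiny stand-in warehouse that already holds loaded data."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE fact_orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    warehouse.executemany(
        "INSERT INTO fact_orders VALUES (?, ?, ?)",
        [(1, "alice", 10.5), (2, "bob", 7.0), (3, "alice", 4.25)],
    )
    warehouse.commit()
    return warehouse


if __name__ == "__main__":
    for row in total_by_customer(make_demo_warehouse()):
        print(row)
```

Because the query touches only the warehouse copy, a heavy aggregation like this would not slow down the transactions running on the source system, which is the performance reason given above.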
Finally, to sum up:
Judging from the material I have read, ETL is a whole discipline of its own. Something this big is honestly a little intimidating, so I think I should stop and consider what to do next.
There is no way I can work through all of it from the very beginning,
so I will start from the application side: look at my current work and find out what is needed most urgently.
A duck does not become a dish without some hands-on work at the frying pan.
OK, enough nonsense. To turn the raw rice into cooked rice and get the duck to the table, I first have to see what is in the kitchen.