PowerBI development Article 7: data set and data refresh, powerbi Article 7

Source: Internet
Author: User
Tags power bi columnar database

PowerBI development Article 7: data set and data refresh, powerbi Article 7

PowerBI reports are based on Data analysis engines. The real Source of Data is the Data storage media such as databases and files. PowerBI supports a variety of Data Source types. PowerBI Service (cloud) sometimes does not directly access Data Source, but directly obtains Data from the Dataset created by PowerBI Service. The content of centralized Data storage is divided into three parts: metadata such as Data of Data Source, Credentials for connecting to the Data Source, and Table Schema of the Data Source ). When the PowerBI Service analyzes data, it directly accesses Dataset to obtain data and performs aggregation computing to respond to Users' Query requests. The advantage of using Dataset is that PowerBI only needs to maintain a unified Data Store and does not need to read Data from many different DataSource. All required Data can be retrieved from a single Data structure (Dataset).

PowerBI Service automatically creates a Dataset for each published Report. The maximum size of each Dataset is 1 GB. In Import connection mode, PowerBI imports Data from multiple Data sources to Dataset. That is to say, Dataset stores snapshots of multiple Data sources. Whether to import Data sources to Dataset is determined by the Data Connection Model.

I. Data Connection Mode

When"Get Data"When connecting to the Data Source, PowerBI automatically creates a Dataset and loads Data from multiple Data sources to one Dataset. The Dataset also contains Credentials for connecting to the Data Source ), and data architecture and other metadata. PowerBI Service directly references Data from Dataset, rather than directly from Data Source. PowerBI supports two connection modes: Import and Live/DirectQuery. In Import mode, Data Source Data is imported to the Dataset of PowerBI Service, and direct query mode establishes a direct connection between Data Source and Dataset.

1,Import Mode

In Import mode, data stored in Dataset On the cloud comes from a copy of On-Premises data. Once the data source is loaded, all data defined in the query will be loaded into the Dataset. PowerBI queries data from a highly optimized Dataset and has high query performance. It can quickly respond to users' interactive queries. Because the import mode copies data source snapshots to Dataset, changes to the underlying data source are not updated to Dataset in real time, which makes the data stored in Dataset obsolete, you need to manually refresh or set Scheduling refresh. Otherwise, Dataset data will not be updated. Data refresh is a full update instead of an incremental update.

The maximum Size of Dataset is 1 GB.

2. Direct Query

In DirectQuery mode, PowerBI directly accesses the underlying data source, so the data is always up-to-date. Once Data is loaded, PowerBI Service does not load any Data to Dataset. This means that Dataset does not store any Data. However, dataset still stores the creden for connecting to the Data Source and the metadata of the Data Source to access the underlying Data Source. When a query request is executed, PowerBI Service directly sends the query request to the original Data Source to obtain the required Data. Direct query takes the initiative to obtain data, which means that any updates to the underlying data are not immediately reflected in the current report presentation. You need to Refresh the data. However, all new query requests use the latest data.

In direct query mode, you need to use the On-Premises Data Gateway. PowerBI Service can send query requests from the cloud to local data sources (on-premises Data sources. When Data interaction occurs, the query is directly sent to the database, Excel, Azure SQL DB/DW, etc. Because PowerBI and Data Source are directly connected, datasets of PowerBI Service are not required ).

Live/DirectQuery – This means there is a live connection between Power BI and the data source.

The advantages of DirectQuery connection mode are:

  • Ability to access Dataset of larger Size: because data does not need to be loaded into Dataset, DirectQuery mode can load data from massive data sources;
  • Direct access to the data source: In DirectQuery mode, PowerBI uses the latest data.

2. Data Refresh)

PowerBI Service obtains data from Dataset for data analysis and display. You can use schedule refresh and refresh now to REFRESH the data of Dataset and update the data of Dataset to the latest version. Before refreshing the Dataset, you must configure an intranet Data Gateway (On-Premises Data Gateway). For more information, see Chapter 6 PowerBI development: Data Network Management in my blog. PowerBI Service refreshes the data Dataset completely instead of incremental data.

When you refresh data, you are updating the data in the dataset that is stored in Power BI from your data source. This refresh is a full refresh and not incremental.

When the Import connection mode is used, all Data is imported from the Data Source to the cache of the PowerBI Service. The visualization control of PowerBI is to query Data from the cache. Once the PowerBI file is published to the PowerBI Service, PowerBI creates a Dataset to store the imported data. Set scheduling and regularly refresh Dataset so that PowerBI presents the latest analytical data, which is very important for making correct decisions.

Iii. Connection Mode Performance

Import connection mode is recommended.This is because PowerBI uses a memory-based columnar database.VertiPaqThis function is used to compress and quickly process released datasets. It enables offline access to PowerBI reports, column-oriented processing, and is highly optimized to 1: processing Performance of N relationships. The import mode is very suitable for aggregate queries, especially when there are a large number of relationships, PowerBI can quickly perform aggregation operations. The disadvantage of the import mode is that the maximum Size of Dataset is 1 GB, and the latest data can be accessed only after scheduling and refreshing.

In direct query mode, a direct connection is established between PowerBI and Data Source. The accessed Data is always up-to-date, and the Data Source size is unlimited.. In direct query mode, PowerBI directly sends the query to Datasource to obtain the required data. When Data Source is a relational database, PowerBI directly sends SQL query statements to the database. The biggest drawback of the direct query mode is the performance problem.

In direct query mode, all direct query requests are directly sent to the source database. The speed at which the backend data source responds to the query request determines the performance of direct query. Although PowerBI optimizes the generated SQL commands as much as possible, it is found through monitoring that the final SQL commands generated by PowerBI are very inefficient, especially when querying massive data sources, the Back-end data source takes a long time to return results. If the waiting time exceeds 30 s, the user's health check is not ideal. When the import mode cannot meet business requirements, consider the direct query mode.

 

Reference:

Data refresh in Power BI

Use DirectQuery in Power BI Desktop

Power BI and DirectQuery

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.