While working with index market data (IDXD), I ran into a very strange Kylin query performance problem. After some investigation I found the cause and fixed it:
Symptoms:
SELECT COUNT(*) FROM SENSITOP.IDXD WHERE ticker = '000300' AND tradedate BETWEEN '2016-01-01' AND '2016-07-01'
This is fast: it returns in less than a second.
SELECT * FROM SENSITOP.IDXD WHERE ticker = '000300' AND tradedate BETWEEN '2016-01-01' AND '2016-07-01'
This is slow: it takes more than 50 seconds and sometimes times out.
Analysis:
Since the count is very fast, locating the matching rows in the cube is itself fast. The problem is therefore more likely in fetching the data, possibly in how the data is read, so the cube's settings need to be checked.
Solution:
It turned out that, by default, the tradedate column in the cube uses dict encoding. Changing it to date encoding resolved the performance problem.
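For reference, the change amounts to something like the following fragment of the cube descriptor. This is only a sketch: it assumes a Kylin version whose cube descriptor JSON has a "rowkey" section with per-column "encoding" values, and the column order and the "isShardBy" values shown here are illustrative, not copied from the actual cube.

```json
{
  "rowkey": {
    "rowkey_columns": [
      { "column": "TICKER", "encoding": "dict", "isShardBy": false },
      { "column": "TRADEDATE", "encoding": "date", "isShardBy": false }
    ]
  }
}
```

The same setting can usually be changed in the cube designer's advanced settings, under the rowkey encoding for each column. Existing segments presumably have to be rebuilt before the new encoding takes effect.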
Conclusion:
This appears to be a deserialization problem. The cube lookup itself uses the index, but the matching data then has to be read from disk and deserialized into objects. For tradedate, there is a significant performance difference between dict encoding and date encoding. This deserves attention!
Analysis of the causes of poor Kylin query performance