2017.04.19: Toutiao (Today's Headlines) data analysis written test 01

Source: Internet
Author: User

1. How to identify knockoff (copycat) apps

2. Supervised learning vs. unsupervised learning

Whether learning is supervised depends on whether the input data carries labels. If the input data is labeled, the task is supervised learning; if it is not, the task is unsupervised learning. The simplest and most common class of machine learning algorithms is classification. In classification, each training example has features and a label. The essence of learning is to find the relationship (mapping) between the features and the label, so that when a new example arrives with features but no label, the learned relationship can be used to predict its label. In this process, if all the training data is labeled, it is supervised learning. If the data is not labeled, it is unsupervised learning, of which clustering is the typical example.
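The contrast can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the one-dimensional data, the label names, and the helper functions `nearest_centroid_fit`, `nearest_centroid_predict`, and `two_means` are all hypothetical and chosen for brevity. The first two functions use the labels (supervised); the last one groups the same points without ever seeing a label (unsupervised clustering).

```python
from statistics import mean


def nearest_centroid_fit(features, labels):
    """Supervised: use the labels to compute one centroid per class."""
    groups = {}
    for x, y in zip(features, labels):
        groups.setdefault(y, []).append(x)
    return {label: mean(xs) for label, xs in groups.items()}


def nearest_centroid_predict(centroids, x):
    """Predict the label whose centroid is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))


def two_means(features, iterations=10):
    """Unsupervised: split 1-D points into two clusters, no labels used."""
    lo, hi = min(features), max(features)  # initial cluster centers
    for _ in range(iterations):
        left = [x for x in features if abs(x - lo) <= abs(x - hi)]
        right = [x for x in features if abs(x - lo) > abs(x - hi)]
        lo, hi = mean(left), mean(right)  # move centers to cluster means
    return left, right


# Hypothetical toy data: one feature per example.
features = [1.0, 1.2, 0.8, 5.0, 5.3, 4.9]
labels = ["small", "small", "small", "large", "large", "large"]

centroids = nearest_centroid_fit(features, labels)     # needs labels
prediction = nearest_centroid_predict(centroids, 4.5)  # label for new data
clusters = two_means(features)                         # needs no labels
print(prediction, clusters)
```

The classifier returns a named label for unseen data, while the clustering step only returns groups; giving those groups a meaning is left to the analyst.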


3. P-value

The p-value is the probability of obtaining the observed sample result, or a more extreme one, when the null hypothesis is true.

In other words, the p-value is the probability that a result at least this extreme arises by chance under the null hypothesis. It is the measure used to judge statistical significance.

p > 0.05: the probability of such a result arising by chance is greater than 5%. The null hypothesis cannot be rejected; there is no significant difference between the two groups.

p < 0.05: the probability of such a result arising by chance is less than 5%. The null hypothesis can be rejected; the difference between the two groups is significant.

p < 0.01: the probability of such a result arising by chance is less than 1%. The null hypothesis can be rejected; the difference between the two groups is highly significant.
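As a small worked example, the p-value for a coin-toss experiment can be computed exactly from the binomial distribution. The scenario (9 heads in 10 tosses, null hypothesis "the coin is fair") and the function name `binomial_p_value` are assumptions made for illustration; the test is one-sided, counting results at least as extreme as the one observed.

```python
from math import comb


def binomial_p_value(successes, trials, p_null=0.5):
    """One-sided p-value: P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical observation: 9 heads in 10 tosses of a supposedly fair coin.
p = binomial_p_value(9, 10)
print(f"p = {p:.4f}")
print("reject H0 at 0.05" if p < 0.05 else "fail to reject H0 at 0.05")
print("reject H0 at 0.01" if p < 0.01 else "fail to reject H0 at 0.01")
```

Here the p-value is 11/1024, which falls between the two thresholds: significant at the 0.05 level but not at the 0.01 level.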

4. Law of large numbers

The law of large numbers states that in a random experiment the outcome of each individual trial may differ, but the average over a large number of repeated trials is almost always close to a fixed value. The reason is that over many observations the individual, accidental deviations cancel each other out, so the underlying regularity of the phenomenon emerges.
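A quick simulation makes this concrete. The sketch below rolls a fair six-sided die, whose expected value is 3.5, and prints the sample mean for increasing numbers of rolls. The function name `mean_of_die_rolls` and the fixed seed are assumptions used only to keep the example reproducible.

```python
import random


def mean_of_die_rolls(n_rolls, seed=0):
    """Average of n_rolls simulated rolls of a fair six-sided die."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    total = 0
    for _ in range(n_rolls):
        total += rng.randint(1, 6)
    return total / n_rolls


# The sample mean should drift toward the expected value 3.5 as n grows.
for n in (10, 1_000, 100_000):
    print(n, mean_of_die_rolls(n))
```

Individual rolls remain unpredictable, but the gap between the sample mean and 3.5 tends to shrink as the number of rolls increases.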

5. Gradient Descent

Gradient descent is an optimization algorithm, often also called the steepest descent method. It is one of the simplest and oldest methods for solving unconstrained optimization problems. Although the basic method is often too slow in practice, many effective algorithms are improvements or modifications of it. The steepest descent method uses the negative gradient direction as the search direction. The closer it gets to the target value, the smaller the step and the slower the progress.

Disadvantages: Convergence slows down near the minimum. The line search may cause problems. The iterates may descend in a zigzag pattern.
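The behaviour described above can be seen on a small example. The sketch below is a simplification under stated assumptions: it minimizes the hypothetical objective f(x, y) = x² + 10y² and uses a fixed learning rate of 0.09 instead of a line search. The names `gradient` and `gradient_descent` are illustrative only.

```python
def gradient(x, y):
    """Gradient of the example objective f(x, y) = x**2 + 10 * y**2."""
    return 2 * x, 20 * y


def gradient_descent(start, learning_rate=0.09, steps=100):
    """Repeatedly step in the negative gradient direction; return the path."""
    x, y = start
    path = [(x, y)]
    for _ in range(steps):
        gx, gy = gradient(x, y)
        x -= learning_rate * gx  # move against the gradient
        y -= learning_rate * gy
        path.append((x, y))
    return path


path = gradient_descent((10.0, 1.0))
for point in path[:4]:
    print(point)     # y flips sign on every step: the zigzag
print(path[-1])      # close to the minimum at (0, 0)
```

Because the objective is much steeper in y than in x, the y coordinate overshoots and changes sign at each step, while x shrinks slowly. The step length also shrinks as the gradient gets smaller near the minimum, which is why progress slows there.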

6. Warehouse Model

1. Star model

The star model is a modeling paradigm that radiates outward from a single point: one central object is connected to multiple objects along the radii. The star model reflects the end user's view of business queries: facts such as sales, compensation, payments, and shipments are described along one or more dimensions (for example by month, product, or geographic location). The object at the center of the star is called the "fact table", and the objects connected to it are called "dimension tables". A query against the fact table returns keys (pointers) into the dimension tables, and when the fact table query is joined with dimension table queries, a large amount of information can be retrieved. Through these joins, the dimension tables can be used to drill down and aggregate by the chosen criteria.
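A star layout can be sketched with Python's built-in `sqlite3` module. Everything in this example is hypothetical: the table names (`fact_sales`, `dim_product`, `dim_month`), the columns, and the rows are made up to show one fact table joined to two dimension tables and aggregated by dimension attributes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: descriptive attributes, one row per product or month.
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE dim_month   (month_id   INTEGER PRIMARY KEY, label TEXT);
-- Fact table at the center: measures plus keys pointing at the dimensions.
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    month_id   INTEGER REFERENCES dim_month(month_id),
    amount     REAL
);
INSERT INTO dim_product VALUES (1, 'Tea', 'Drinks'), (2, 'Cup', 'Kitchen');
INSERT INTO dim_month   VALUES (1, '2017-03'), (2, '2017-04');
INSERT INTO fact_sales  VALUES (1, 1, 10.0), (1, 2, 15.0), (2, 2, 7.5);
""")

# Join the fact table to its dimensions and aggregate by category and month.
rows = conn.execute("""
    SELECT p.category, m.label, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_month   AS m ON m.month_id   = f.month_id
    GROUP BY p.category, m.label
    ORDER BY p.category, m.label
""").fetchall()

for row in rows:
    print(row)
```

Each dimension is exactly one join away from the fact table, which is what gives the schema its star shape.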

2. Snowflake model

The snowflake model is an extension of the star model in which each point of the star is itself connected outward to further points. The snowflake model further normalizes the star model's dimension tables. Its advantage is that it can improve query performance by minimizing data storage and by joining smaller normalized tables rather than large denormalized ones. Because the dimensions are held at a finer granularity, the snowflake model also increases the flexibility of the application.
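The effect of further normalizing (standardizing) a dimension can be sketched with Python's built-in `sqlite3` module. The schema below is hypothetical: the product dimension no longer stores the category name on every row, but points to a separate `dim_category` table, so a category query needs one extra join.

```python
import sqlite3

snow = sqlite3.connect(":memory:")
snow.executescript("""
-- The category is split out of the product dimension into its own table.
CREATE TABLE dim_category (category_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_product (
    product_id  INTEGER PRIMARY KEY,
    name        TEXT,
    category_id INTEGER REFERENCES dim_category(category_id)
);
CREATE TABLE fact_sales (
    product_id INTEGER REFERENCES dim_product(product_id),
    amount     REAL
);
INSERT INTO dim_category VALUES (1, 'Drinks'), (2, 'Kitchen');
INSERT INTO dim_product  VALUES (1, 'Tea', 1), (2, 'Coffee', 1), (3, 'Cup', 2);
INSERT INTO fact_sales   VALUES (1, 10.0), (2, 4.0), (3, 7.5);
""")

# Fact -> product dimension -> category sub-dimension: two joins, not one.
totals = snow.execute("""
    SELECT c.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product  AS p ON p.product_id  = f.product_id
    JOIN dim_category AS c ON c.category_id = p.category_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

for row in totals:
    print(row)
```

The category name is stored once instead of once per product, at the cost of a longer join path from the fact table.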

3. Hybrid model

The hybrid model is a compromise between the star model and the snowflake model. The star model consists of a fact table and denormalized dimension tables, whereas in the snowflake model all dimension tables are normalized. In the hybrid model, only the largest dimension tables are normalized, since these tables typically contain many columns of fully denormalized (duplicated) data.

7. A new strategy goes live. With 10% of traffic the corresponding revenue gain is 5%; with 100% of traffic the corresponding gain is 1%. Why?

8. Commodity pricing strategy

9. How writing code in Hive differs from MySQL

10. Improve the user experience of advertising
