How to mine user requirements

Source: Internet
Author: User
Keywords: demand management, demand analysis, demand mining, project requirements

I. Overview of demand satisfaction

1.1 What is demand satisfaction

Suppose users search for "Octopus Paul". In terms of textual relevance, the search engine only has to return results whose content relates to "Octopus Paul". But does that satisfy the users?

User A: I heard Octopus Paul has died and want to see the latest results. Why is everything from August? I have to keep paging...

User B: My colleagues are all talking about Octopus Paul dying today. Who is Octopus Paul? I am out of the loop again. I want to find his life story, but everything is the latest news and there is no introduction to him. I will try another query.

User C: I am a die-hard football fan. After looking at Octopus Paul I want to see football-related content: Wayne Rooney, or whether Steven Gerrard has scored again. There are no recommendations, so I have to type the queries myself.

User D: I am looking for an Octopus Paul avatar, which would look really cool. Why are none of the results square? How am I supposed to use such flat images?

User E: I want an Octopus Paul wallpaper; maybe it will bring me luck the next time I buy a lottery ticket. Why are all the images so small...

(The above is based on an analysis of user sessions from 2010-10-27.)

Generally speaking, the user expresses a need to the search engine, the search engine understands that need, and it provides resources that fit the different requirements. The whole process can be collectively referred to as demand satisfaction. In short, all rank work beyond basic text relevance falls under demand satisfaction: the search results must not only relate literally to the text the user typed, but also meet the user's various needs.

[Figure: the position of demand satisfaction in the rank system]

1.2 Why demand satisfaction is needed

Users express their needs through queries, and for most queries, especially those with implied requirements, results that merely match the text literally may not meet those needs. At present, our sorting system is based mainly on the single dimension of text relevance: the weight reflects how closely the terms in the query relate to the obj. Within this system, relevant results do not necessarily satisfy the user. In the "Octopus Paul" example above, the requirements are clearly hard to address along the text-relevance dimension, especially sudden timeliness requirements and broad (pan-demand) requirements.

1.3 What work demand satisfaction covers

From the example above, demand satisfaction has to solve timeliness requirements, multiple requirements, related recommendations, size requirements, material requirements, browsing guidance, and so on. All rank strategy and query analysis done for these purposes, beyond basic text relevance, can be considered demand-satisfaction work, as can front-end result display and user-guided browsing. Image demand satisfaction can be divided, along different dimensions, into the following areas:

1) Demand identification

2) Resource Construction

3) Demand-based weight adjustment

4) Results organization and recommendation

5) User-guided interaction

II. How to do demand satisfaction

The core issues that demand satisfaction has to solve:

1) Demand identification

2) Resource construction

3) Demand-based weight adjustment

2.1 Demand identification

2.1.1 Types of requirements

Identifying the needs of a query and the strength of those needs is the most basic work. First, there must be a system that can describe the various requirements completely. Next comes the question of how to identify them, so that the needs of each query are mapped onto that system.

Identify requirements through query classification:

The current online query classification system is based on topic attributes, including scenery, place names, people, cars, and so on. Requirements along some dimensions differ by category: scenery queries call for large, sharp pictures that do not contain people, while chat queries call for smaller pictures, preferably animated GIFs. The projects under this strategy are: size weight adjustment, format weight adjustment, face demand, and person versus non-person.

Demand identification based on statistics

Statistical analysis of large amounts of data can reveal what the results of a query have in common. Many kinds of data are available for analysis, such as user behavior data, click feedback, and search results. For example, cluster a query's search results by some feature; if one cluster contains many pictures, more than a set threshold, then that cluster is taken to represent the query's need on that feature. The face-demand recognition running online works this way. Mining user feedback, including user clicks and query reformulations, is what best reflects user needs. We have done little work and have little experience here; it is the focus of our follow-up work.
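
The threshold rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method running online: the function name `detect_demands`, the feature labels, and the 0.5 threshold are all hypothetical.

```python
from collections import Counter

def detect_demands(result_labels, threshold=0.5):
    """Return feature values whose share of the search results exceeds the threshold.

    result_labels holds one feature value per search result (for example
    "face" or "no_face"). The default threshold of 0.5 is a hypothetical value.
    """
    if not result_labels:
        return {}
    counts = Counter(result_labels)
    total = len(result_labels)
    return {
        label: count / total
        for label, count in counts.items()
        if count / total > threshold
    }

# Hypothetical feature values for the top results of one query.
labels = ["face"] * 7 + ["no_face"] * 3
print(detect_demands(labels))  # {'face': 0.7}
```

A share that only equals the threshold is not reported, which matches the "more than a set threshold" wording.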

Proper Names & demand words

The most direct way is to check whether the keywords in the query contain proper names or demand words. For example, "Red BMW" explicitly expresses a color requirement.
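
A minimal sketch of demand-word matching might look like the following. The lexicon `DEMAND_WORDS`, its entries, and the function name `explicit_demands` are assumptions made for this example; a real lexicon would be far larger and would also cover proper names.

```python
# Hypothetical demand-word lexicon: word -> (dimension, value).
DEMAND_WORDS = {
    "red": ("color", "red"),
    "latest": ("timeliness", "latest"),
}

def explicit_demands(query):
    """Return the demand dimensions that the query states explicitly."""
    demands = {}
    for term in query.lower().split():
        if term in DEMAND_WORDS:
            dimension, value = DEMAND_WORDS[term]
            demands[dimension] = value
    return demands

print(explicit_demands("Red BMW"))  # {'color': 'red'}
```

Splitting on whitespace is a simplification; queries in languages without spaces would need a word segmenter first.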

Timeliness requirements

Timeliness requirements include three parts: sudden (burst) timeliness, periodic timeliness, and timeliness of demand. What currently runs online handles sudden timeliness. Recognition is judged mainly from bursts in retrieval volume, the number of resources, and actual events. Burst volume is measured from the user retrieval frequency accumulated per hour: using 15 consecutive days of retrieval frequency, the slope of the burst is calculated, and the strength of the timeliness demand is judged by the size of that slope. This method only suits popular queries. Long-tail queries have very low retrieval frequency and cannot be identified this way. Such queries usually contain multiple terms, so they can be judged by keyword hits, that is, through events: the main point is the proportion of the query's key terms that hit a timeliness event. These events are obtained by actively mining time-sensitive queries, clustering them, and training keywords for each cluster.
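
The slope calculation can be illustrated with an ordinary least-squares fit over the retrieval counts in the window. This is a sketch under assumptions: the text does not say how the slope is fitted, so least squares is a stand-in, and the function name and the sample counts are hypothetical.

```python
def least_squares_slope(values):
    """Slope of the least-squares line through the points (0, v0), (1, v1), ..."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    variance = sum((x - mean_x) ** 2 for x in range(n))
    return covariance / variance

# Hypothetical retrieval counts for 15 consecutive days.
steady = [100] * 15
bursting = [100] * 14 + [1500]

# A flat series has zero slope; a burst on the last day gives a positive slope.
print(least_squares_slope(steady))
print(least_squares_slope(bursting))
```

The larger the slope, the stronger the timeliness demand would be judged; any cutoff between "strong" and "weak" would have to be tuned on real data.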

2.1.2 Strength of demand

To do demand satisfaction well, it is not enough to identify what type of demand a query has; we must also identify how strong that demand is, because strength directly guides the subsequent weight adjustment. Every dimension needs a demand strength, and that strength determines the weight of the dimension when the weights are combined. Timeliness demand, for example, is strong, so resources that meet it must be ranked at the front. Image definition and saturation, by contrast, are not strongly demanded for most queries, so their weight adjustment cannot be too large. How strength is calculated depends on the requirements of the rank model that follows. Ideally, the strength on every dimension would be computed dynamically for each query. We have little experience in this area; if accurate calculation is not possible for now, strength can be assigned manually for the time being, for example by setting the strength of each requirement dimension by hand for each query category. Some ways to think about this:

Explicit demand is strong demand

When users express a need by including demand words in the query, that is strong demand. Examples: "the latest Andy Lau pictures", "red BMW".

For statistically mined requirements, the proportion by which the decision value exceeds the threshold determines the strength of the demand

When statistics are used to mine user demand, an attribute of some dimension is usually selected and a statistic is computed for it; the strength of the demand can then be judged from the distribution of that statistic. Take timeliness: if within some period a query's retrieval volume bursts to 100 times yesterday's volume, and the threshold we set is 2 times, then the query can be considered to have an especially strong timeliness requirement. Or take size requirements mined from click data: for avatar queries, most clicks go to 100*100 images, but if their share of total clicks is not very high, say only 60%, then size is a requirement of only moderate strength for this query.
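
The two examples above can be expressed as one small grading function. This is a hedged sketch: the function name `strength_from_ratio`, the three labels, the `strong_factor` cutoff of 10, and the 0.5 click-share threshold are assumptions chosen only so that the two examples from the text come out as described.

```python
def strength_from_ratio(value, threshold, strong_factor=10.0):
    """Grade a demand by how far a statistic exceeds its threshold.

    strong_factor is a hypothetical cutoff: a value at least strong_factor
    times the threshold is graded "strong".
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    ratio = value / threshold
    if ratio < 1:
        return "none"
    if ratio >= strong_factor:
        return "strong"
    return "general"

# Retrieval volume is 100 times yesterday's; the threshold is 2 times.
print(strength_from_ratio(100, 2))    # strong
# 60% of clicks go to 100*100 images; 0.5 is a hypothetical threshold.
print(strength_from_ratio(0.6, 0.5))  # general
```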

2.2 Satisfaction of demand

Once the needs of a query have been identified, the next step is to provide the appropriate resources.

2.2.1 Mining of resources

How to obtain resources that meet the needs is the other core issue of demand satisfaction. The main work is to use one feature, or a combination of several, to separate the resources that meet a requirement from those that do not, so as to find the resources users need and remove those that do not qualify.

Content attribute characteristics

The content attribute dimension can be divided into low-level physical features, mid-level object recognition, and high-level semantic features. Low-level physical features are relatively simple, and we can already use them: size, color, format, clarity, and so on. Mid-level features are not used much yet: person versus non-person, pornographic pictures, whole-car recognition, mobile phone picture recognition, and so on. High-level semantic features, including scene recognition and image style recognition, are our future development direction.

Topic attribute dimension

Similar to the query classification system, resources can also be classified by topic. We currently classify only at the site level, and the effect is not ideal: first, site granularity is too coarse; second, the recall of site classification is a big problem. We want to classify at obj granularity, or at least at page level. If resources are classified by topic attribute and matched against the classification of query requirements, the effect is multiplied.

Collection of timeliness resources

Our time-sensitive resources currently come mainly from mining the timeliness library and from news resources; distinguishing time-sensitive resources is relatively easy.

2.2.2 Demand-based weight adjustment

Once the needs of a query are clear and the resources that meet them have been mined, how do we rank those resources toward the front? Each demand dimension has its own weight-adjustment strategy. Take format weighting: if the query has a GIF requirement, the weight of an animated GIF is multiplied by 1.2, while a static image is demoted, its weight scaled by 0.1. Or take timeliness: results from the timeliness library are inserted directly into the first three pages. This is because timeliness is a strong demand dimension, and simple weighting cannot guarantee that the results move into the first three pages.

These examples show that there are currently 2 types of adjustment strategy: raising the weight within the total weight, and directly adjusting the final order of the results. At present the adjustments are simply superimposed on one another. The advantage is that this is simple and direct. The shortcomings are more numerous, the biggest being that it is not controllable: how much one dimension's adjustment affects the final result, how many dimensions have adjusted the weight, and how large each adjustment is are all unknown.

In the future, demand-based weight adjustment should first profile in detail how well each resource meets the need, so that the score has an intuitive physical meaning. Second, according to the strength of the dimension, the dimension's score should be reflected in the final result, whether as a cross-tier adjustment or as fine-tuning within a tier. For example:

1) Strong demand: results that meet the requirement are moved directly to the top tier, as with timeliness requirements.

2) General demand: results that meet the requirement can be promoted to a higher tier according to certain rules.

3) Weak demand: no tier promotion; the weight is adjusted within the same tier.
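
The three strength levels can be sketched as a small re-ranking routine over rank tiers (bands of result positions). Everything here is an assumption made for illustration: the `Result` fields, the rule that general demand promotes by exactly one tier, and the within-tier `boost` of 1.2 (borrowed from the GIF example) are not confirmed by the text.

```python
from dataclasses import dataclass, replace

@dataclass
class Result:
    obj_id: str
    tier: int           # 0 is the top tier; larger numbers rank lower
    weight: float       # orders results inside one tier
    meets_demand: bool  # whether the obj satisfies the demand dimension

def adjust(result, strength, boost=1.2):
    """Adjust one result for one demand dimension of the given strength."""
    if not result.meets_demand:
        return result
    if strength == "strong":
        return replace(result, tier=0)                         # jump to the top tier
    if strength == "general":
        return replace(result, tier=max(result.tier - 1, 0))   # promote one tier
    return replace(result, weight=result.weight * boost)       # stay in tier

def rerank(results, strength):
    """Apply the adjustment, then sort by tier and by weight inside a tier."""
    adjusted = [adjust(r, strength) for r in results]
    return sorted(adjusted, key=lambda r: (r.tier, -r.weight))

results = [
    Result("a", 1, 0.9, False),
    Result("b", 2, 0.5, True),
    Result("c", 1, 0.8, True),
]
print([r.obj_id for r in rerank(results, "strong")])  # ['c', 'b', 'a']
print([r.obj_id for r in rerank(results, "weak")])    # ['c', 'a', 'b']
```

Keeping tier and weight as separate fields makes each adjustment easy to inspect, which addresses the controllability problem of stacking multipliers on a single total weight.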

2.3 Effect of demand satisfaction

With query demand identification, resource identification, and demand-based weight adjustment done, is the user actually satisfied? A search engine ultimately serves the user, and a user who is happy with the results is the most important goal. So how do we know whether the user is satisfied? Users give feedback on the information the search engine provides. The feedback includes clicks on search results, active reformulations of the query, and the behavior that follows. By analyzing these data we can gauge user satisfaction.

One use is correcting demand identification: click feedback can show whether the identified need of a query is correct and whether it should be withdrawn. For timeliness, for example, user feedback can determine whether a misjudged query, or a query whose burst has passed, should exit. Whether this approach is sound still has to be investigated. After all, there are many possible reasons why a user does not click an image: the demand may have been misidentified, the dimension may have been misidentified, or the problem may lie in the rank. At present, user feedback is applied only as click-based weight adjustment; whether feedback can be effective on individual dimensions needs detailed investigation and analysis. In addition, the demand behind a query changes over time, and user feedback allows timely adjustment.
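
One possible form of a feedback-based exit rule is sketched below. It is an assumption-laden illustration, not an investigated method: the function name `should_exit`, the click-through-rate criterion, and both default thresholds are hypothetical, and, as noted above, a low click rate can have causes other than a misidentified demand.

```python
def should_exit(impressions, clicks, min_ctr=0.05, min_impressions=100):
    """Decide whether an identified demand should be withdrawn for a query.

    impressions and clicks count only the results promoted because of the
    demand. Both thresholds are hypothetical values.
    """
    if impressions < min_impressions:
        return False  # too little feedback to decide either way
    return clicks / impressions < min_ctr

print(should_exit(1000, 10))   # True: users rarely click the promoted results
print(should_exit(1000, 200))  # False: the promoted results are being clicked
```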

III. Prospects for demand satisfaction

3.1 Establishing a system framework for demand satisfaction

We have already done many demand-satisfaction projects: color, format, size, face, facial expression, avatar, and so on. Each was handled case by case, yet they have a great deal in common. From a research perspective, each can be divided into three parts: demand identification, resource satisfaction, and demand-based weight adjustment. Doing them as scattered points has many drawbacks. First, research is duplicated and efficiency is low. Second, the front-end weight-adjustment system is patched piece by piece and grows more and more chaotic, with no clear system framework. Third, it easily leads to duplicated weight adjustment. The better idea at present is to establish a system framework for demand satisfaction, consisting of front-end demand analysis and back-end demand-based weight adjustment. The front-end analysis determines which requirements a query has and how strong each one is, and passes this information to the back end. The back end must have resources for each requirement, and then combines the requirements to give each obj an appropriate tier and weight. Later demand-satisfaction projects would then only need to adjust front-end demand identification; if no new requirement dimension is added, the back-end resources and weight adjustment need not change.
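
The split between front-end analysis and back-end adjustment can be sketched as two data structures and one combining function. This is only one way to read the framework: the class names, the strength scale of 0 to 1, and the additive formula in `combined_weight` are assumptions, and the sketch covers only the weight, not the tier.

```python
from dataclasses import dataclass, field

@dataclass
class QueryDemands:
    """Front-end output: demand strength per dimension, assumed to lie in [0, 1]."""
    query: str
    strengths: dict = field(default_factory=dict)

@dataclass
class Obj:
    """Back-end resource: text relevance plus a match score per dimension."""
    obj_id: str
    relevance: float
    matches: dict = field(default_factory=dict)

def combined_weight(obj, demands):
    """Text relevance plus each dimension's match score scaled by its strength."""
    bonus = sum(
        strength * obj.matches.get(dimension, 0.0)
        for dimension, strength in demands.strengths.items()
    )
    return obj.relevance + bonus

def rank(objs, demands):
    """Order objs from highest to lowest combined weight."""
    return sorted(objs, key=lambda o: combined_weight(o, demands), reverse=True)

demands = QueryDemands("octopus paul wallpaper", {"size": 0.8})
objs = [
    Obj("small", 0.9, {"size": 0.0}),
    Obj("large", 0.7, {"size": 1.0}),
]
print([o.obj_id for o in rank(objs, demands)])  # ['large', 'small']
```

Adding a new demand-satisfaction project then means emitting one more key in `strengths`; the back end changes only when a new dimension needs match scores.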

3.2 More intelligent demand identification

1. Mining requirements from user behavior data. Everything users do during a search can be recorded: clicks, page turns, query reformulations, dwell time, where they go next, where they came from, and so on. With these data we can better understand the user's intent. At present the only application is click-based weight adjustment, and very little has been done on demand mining; this is a key direction of our future work.

2. Personalized demand mining. Analyzing sessions can reveal the individual needs of each user. Today, whether the results we display are good or bad is judged mainly by analyzing sessions to infer user needs. For high-frequency queries, the displayed results reflect statistics over large-scale user behavior, a universal model. Because this model is statistical, it is objective, but it may hurt users with personalized needs.

3. Demand strength identification. The need on each dimension must carry a strength, because when the weights of the dimensions are combined, the strength determines the weight of each dimension. Timeliness demand, for example, is very strong, so resources that meet it must be ranked at the front. Image definition and saturation, by contrast, are not strongly demanded for most queries, so their weight adjustment cannot be too large.

3.3 Resource Construction

At present we have done a lot of work on query classification, query demand identification, and query analysis; by comparison, we have done less on resource construction. Next we want to push forward obj-level resource classification, covering several resource categories by content attribute: icon recognition, map recognition, cartoon and animation recognition, and recognition of some screenshots.

3.4 Demand Guidance

Image search is a browse-oriented product, and guiding users to browse more conveniently is one of the key tasks ahead. We have tried displaying results by category on the results page for celebrity queries; in the future we want to offer this kind of active guidance for more categories of query. For reference, the PM's definition of a "theme" query: a query whose people, events, or objects cover a wide range of content, which shows up concretely as search results that mix several clearly distinct aspects. In effect this means pan-demand, multiple-demand queries. For displaying the results of such queries, a results page with multiple tabs is not necessarily the best form; how to make it work better needs deeper research and innovative ideas. This area also includes optimizing the RS display, displaying dynamic summaries, and continually upgrading how images are presented.

IV. Conclusion

Image demand satisfaction has only just started; in the future it will develop steadily toward intelligence, automation, and diversification. Our ultimate goal is for demand satisfaction to disappear as a separate direction, with demand mining and resource satisfaction fully automated, so that we reach the state of "no sword in the hand, but a sword in the heart."
