The future of software analysis

This article was originally published in IEEE Software magazine and is brought to you by InfoQ and the IEEE Computer Society.

For this issue, we invited a panel of six eminent experts in software analysis to describe the issues in the field that they consider most important or most seriously overlooked. They all find current practice too narrow: the audience for software analysis should extend beyond developers (Ahmed Hassan), and its results should go beyond mere numbers (Per Runeson). Analysis should prove its usefulness (Abram Hindle, Martin Shepperd). "Natural" software analysis, based on statistical natural language processing, faces great opportunities (Prem Devanbu). Finally, software analysis needs both information analysts and field agents, like Chloe O'Brian and Jack Bauer in the TV show "24" (Sunghun Kim). We hope you enjoy it! --Guest editors Tim Menzies and Tom Zimmermann

Software analysis: Beyond the developer

Ahmed E. Hassan

Software analysis (SA) brings the notion of business intelligence to the software industry through fact-based decision support systems. Today, SA focuses mainly on helping individual developers with decisions in their day-to-day coding and bug-fixing activities by mining developer-oriented repositories, such as version control systems and bug tracking systems. For example, by mining the actual risk of past changes, we can automatically determine the risk, or bugginess, of a current code change [1].

To become a powerful, strategic decision-making instrument, future SA research must move beyond such mundane tasks. SA can help a variety of stakeholders in a project, including marketing, sales, support, and legal teams, not just developers. SA must mine resources and knowledge across all aspects of a project, not only the developer-oriented repositories. Traditional mining of code repositories should be combined with mining of customer-facing repositories, such as operational logs, customer support call records, blogs, and video discussions.

Only by taking such a multifaceted view can SA support more influential, business-level decisions rather than all-too-common questions such as "Is this file buggy?" or "Which programmer can fix it?" Only then can we help developers and their managers hold more strategic discussions about the importance of a piece of code in terms of user satisfaction and revenue; help support staff better answer customer inquiries with the aid of error logs and videos; help marketers better target their campaigns based on actual usage data; and help salespeople set prices by understanding the intrinsic value of each feature to the customer.

Prove its value to the practitioner

Abram Hindle

The low-hanging fruit of software analysis, metrics and reporting, has been a great success. Modern software services such as GitHub, Bitbucket, Ohloh, Jira, and FogBugz have widely adopted visualization and even defect cost estimation. On this front, we can pat ourselves on the back, even if the developers behind those services have never read one of our papers.

Yet the sweeter fruit of software analysis has been disappointing: data mining in the software world tends to forget the software. I believe future software analysis will have to take several levels of context into account: the software development domain (non-functional requirements, environment, tools, style, and so on), the software itself (database, application, and so on), and the overall project context (requirements, glossary, architecture, community, and so on). For example, the n-gram models used in natural language processing ignore the nesting structure of source code, and we should take that structure into account. We should therefore stop blindly applying the latest data mining tools and instead modify and customize them to suit software analysis.

I see improvements in the precision or recall of duplicate defect detection or commit classification, but I do not see what these improvements mean for practitioners. To show that it matters, software analysis must demonstrate that its cost-benefit ratio is reasonable. By comparison, doing nothing is sometimes very effective. We have to evaluate these techniques against the real needs of practitioners. It is not enough to ask whether a 10 percent improvement in accuracy matters to the person classifying defects or to the manager; we must evaluate its economics, which determine whether practitioners can afford our techniques.

The future of software analysis depends on proving its value to practitioners, demonstrating its economic feasibility, providing the necessary tools and techniques, and making effective use of our software engineering domain knowledge to create analysis that is more meaningful and less superficial.

Numbers alone are not enough

Per Runeson

In recent years, interest in software analysis has grown within the software engineering community. This is a welcome trend toward the "systematic, disciplined, quantifiable approach" to software development described in the IEEE Std 610.12 definition of software engineering. However, numbers alone are not enough. The questions that software engineering poses are rarely answered by "4" or "y = 3.1x + 2". Numbers and equations are important for expressing relationships in data, but to be of practical use they must be accompanied by interpretation and visualization.

Interpretation here means mapping the numerical findings back into the real, living world of software engineering: into the organization and its corporate culture, into the business and the market. This is largely a shift from the quantitative to the qualitative. For example, suppose software analysis finds a negative correlation between a project manager's experience and project success (that is, the more experienced the manager, the more likely the project is to fail). Does that mean we should hire inexperienced managers? No. Do not forget a third factor: the complexity of the project. Experienced managers tend to be put in charge of more complex projects, and those projects are more likely to fail. It is this kind of interpretive work that makes software analysis truly useful.
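The manager example can be made concrete with a small sketch. The project counts below are invented purely for illustration and come from no real study; they are chosen so that experienced managers look worse overall yet better once projects are split by complexity.

```python
from collections import defaultdict

# Hypothetical records: (experienced_manager, complex_project, failed).
# The counts are invented to illustrate confounding, not taken from real data.
projects = (
    [(True, True, True)] * 12 + [(True, True, False)] * 18       # experienced, complex
    + [(True, False, True)] * 1 + [(True, False, False)] * 9     # experienced, simple
    + [(False, True, True)] * 5 + [(False, True, False)] * 5     # inexperienced, complex
    + [(False, False, True)] * 6 + [(False, False, False)] * 24  # inexperienced, simple
)


def failure_rate_by_experience(records):
    """Return {experienced_manager: fraction of that group's projects that failed}."""
    outcomes = defaultdict(list)
    for experienced, _complex, failed in records:
        outcomes[experienced].append(failed)
    return {exp: sum(fails) / len(fails) for exp, fails in outcomes.items()}


overall = failure_rate_by_experience(projects)
complex_only = failure_rate_by_experience([p for p in projects if p[1]])
simple_only = failure_rate_by_experience([p for p in projects if not p[1]])

for label, rates in [("all", overall), ("complex", complex_only), ("simple", simple_only)]:
    print(f"{label:8s} experienced={rates[True]:.3f} inexperienced={rates[False]:.3f}")
```

In this made-up data set the experienced group fails more often overall, but less often within both the complex and the simple projects, because experienced managers are assigned most of the complex work. Stratifying by the suspected third factor is only one simple way to check for such an effect.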

Visualization is another way to bring the numbers from software analysis into managers' day-to-day work. Although most software managers have good technical and analytical skills, they often lack the time to dig into the details, and they need a visual way to grasp the findings behind the numbers. The charts produced by statistics and spreadsheet tools are a good start, but more research is needed to actually deliver the output of software analysis to decision makers. Visualization is a powerful tool that lets software analysis show its full strength. In short, we should continue to advance software analysis research, but please do not forget interpretation and visualization.

Three questions for software analysis

Martin Shepperd

Data mining of various software engineering artifacts (such as source code and change data), using advanced machine learning and statistical methods, is becoming a fast-growing industry. Many interesting and useful models and findings are emerging, but these methods are not without danger. As researchers, we should ask ourselves three questions:

First, how much better is my model than a naive strategy, such as guessing or using the most common value (the mode)? This may seem like an unnecessary question, but if the goal of an analysis is merely to beat another model or result, we may end up with a model that is simply less bad than the alternative rather than one worth using. If this sounds alarmist, let me give an example: Stephen MacDonell and I recently showed that some previously published prediction models based on regression and case-based reasoning actually predict worse than simple guessing. The seemingly naive benchmark approach also has the advantage of simplicity and ease of use.
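A minimal sketch of such a baseline check follows. The effort values, the model predictions, and the function names are all hypothetical. The guessing baseline here predicts each case with the actual value of another randomly chosen case, which is one plausible reading of "guessing" and not necessarily the procedure used in the study mentioned above.

```python
import random
import statistics


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return statistics.fmean([abs(a - p) for a, p in zip(actual, predicted)])


def random_guess_mae(actual, runs=1000, seed=0):
    """MAE when each case is 'predicted' by the value of another random case."""
    rng = random.Random(seed)
    n = len(actual)
    errors = []
    for _ in range(runs):
        guesses = [actual[rng.choice([j for j in range(n) if j != i])] for i in range(n)]
        errors.append(mean_absolute_error(actual, guesses))
    return statistics.fmean(errors)


def improvement_over_baseline(actual, predicted, baseline_mae):
    """Fraction of baseline error removed by the model; <= 0 means no better."""
    return 1 - mean_absolute_error(actual, predicted) / baseline_mae


# Hypothetical effort values (person-days) and model predictions, invented for illustration.
actual = [12, 30, 45, 8, 60, 22, 35, 15]
model_predictions = [20, 25, 30, 25, 35, 30, 28, 26]

# Simplification: the median is taken over all cases, including the one being predicted.
median_mae = mean_absolute_error(actual, [statistics.median(actual)] * len(actual))
guess_mae = random_guess_mae(actual)

print("improvement over median  :", improvement_over_baseline(actual, model_predictions, median_mae))
print("improvement over guessing:", improvement_over_baseline(actual, model_predictions, guess_mae))
```

A value near zero or below means the model adds little or nothing over the naive strategy, however sophisticated it looks.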

Second, how large is the effect in practice? This is generally judged by effect size [3]. One danger is that if, as is common, we focus only on null hypothesis significance testing and report p-values, then when n is very large (as is often the case in data analysis), even a very small effect is statistically significant yet negligible in practice. Because our goal is to find results of practical value, it is important not to be fooled by p-values.
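The interplay between sample size, p-value, and effect size can be illustrated with a short sketch. It assumes two equal-sized groups with a common standard deviation and uses a normal-approximation z-test; the means, standard deviation, and group sizes are made up for illustration.

```python
import math


def cohens_d(mean_a, mean_b, pooled_sd):
    """Standardized mean difference (Cohen's d), a common effect size measure."""
    return (mean_b - mean_a) / pooled_sd


def two_sided_p_value(mean_a, mean_b, pooled_sd, n_per_group):
    """Two-sided p-value of a z-test for two equal-sized groups (normal approximation)."""
    standard_error = pooled_sd * math.sqrt(2 / n_per_group)
    z = abs(mean_b - mean_a) / standard_error
    return math.erfc(z / math.sqrt(2))


# Hypothetical summary statistics: the same tiny difference at two sample sizes.
mean_a, mean_b, pooled_sd = 100.0, 100.5, 10.0

for n in (50, 100_000):
    d = cohens_d(mean_a, mean_b, pooled_sd)
    p = two_sided_p_value(mean_a, mean_b, pooled_sd, n)
    print(f"n per group = {n:>7}: d = {d:.2f}, p = {p:.3g}")
```

The effect size is identical in both runs (d = 0.05, which is very small), but with 100,000 observations per group the p-value drops far below any conventional threshold. Reporting the effect size next to the p-value makes that difference visible.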

Third, how much does the result change when one or more of the inputs change slightly; that is, how sensitive is the analysis to its inputs? We must remind ourselves that we are dealing with noisy and uncertain data, so it is important to recognize that a small measurement error in an input may affect the model's output. Sensitivity analysis is a systematic way to examine this vulnerability [4]. With it, users learn how heavily a model relies on a particular input, so they at least know where to invest extra effort to ensure accuracy.
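One simple form of sensitivity analysis perturbs a single input at a time and records how much the output moves. The sketch below assumes this one-at-a-time approach; `effort_model` and its coefficients are a hypothetical toy model, not a published one.

```python
def one_at_a_time_sensitivity(model, inputs, relative_change=0.05):
    """Relative change in model output when each input alone is increased.

    Assumes the baseline output is non-zero.
    """
    baseline = model(**inputs)
    sensitivity = {}
    for name, value in inputs.items():
        perturbed = dict(inputs, **{name: value * (1 + relative_change)})
        sensitivity[name] = (model(**perturbed) - baseline) / baseline
    return sensitivity


def effort_model(size_kloc, team_experience_years, defect_density):
    """Hypothetical toy effort model, invented purely for illustration."""
    return 3.0 * size_kloc ** 1.2 * (1 + defect_density) / (1 + 0.1 * team_experience_years)


# Hypothetical input values for one project.
inputs = {"size_kloc": 50.0, "team_experience_years": 5.0, "defect_density": 0.2}
sensitivity = one_at_a_time_sensitivity(effort_model, inputs)

# List inputs from most to least influential.
for name, change in sorted(sensitivity.items(), key=lambda item: -abs(item[1])):
    print(f"{name:22s} {change:+.2%}")
```

For this toy model, a 5 percent error in the size estimate shifts the output more than the same relative error in either of the other inputs, so size is where extra measurement effort would pay off. One-at-a-time perturbation does not capture interactions between inputs; more thorough methods vary several inputs together.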
