DeepCode and AI tools poised to revolutionize static code analysis

Developers use static analysis tools to identify problems in their code earlier in the development lifecycle. However, the overall architecture of these tools has changed only incrementally, through the addition of new rules crafted by experts. Researchers are now starting to use AI to automatically generate much more elaborate rule sets for parsing code. This can help identify more problems earlier in the lifecycle and provide better feedback.

Some companies, like the game maker Ubisoft, are already working on these kinds of tools internally. A team of researchers at ETH Zurich is now making a similar AI tool, called DeepCode, available for mainstream adoption. It analyzes Java, JavaScript and Python code using about 250,000 rules, compared to about 4,000 for traditional static analysis tools. We caught up with Boris Paskalev, CEO at DeepCode, to find out how this works and what’s next.

What experiences and related work informed your decision to use AI to improve software development?

Boris Paskalev: The idea for using AI to improve software came from longer-term research done at the Secure, Reliable, and Intelligent Systems Lab at the Department of Computer Science, ETH Zurich (http://www.sri.inf.ethz.ch). During a period of several years, we explored a number of concepts, built several research systems based on these (some of which are widely used), and received various awards. We observed the enormous impact our technology could have on software construction. As a result, we started DeepCode with the vision of pushing the limit of these AI techniques and bringing those benefits to every software developer worldwide.

How does DeepCode compare with other approaches like static or dynamic analysis in terms of usage, performance, or the kinds of problems it can identify?

Paskalev: DeepCode relies on a creative and non-trivial combination of static analysis and custom machine learning algorithms. Unlike traditional static analysis, it does not rely on manually hardcoded rules, but learns these automatically from data and uses them to analyze your program. This concept of never-ending learning enables the system to constantly improve with more data, without supervision.

DeepCode also enables analysis with zero configuration, which means one can simply point DeepCode to a repository and receive results several seconds later, without needing to compile the program or locate all external code. These features are especially desirable in an enterprise setting, where running the code via dynamic analysis or trying to perform standard static analysis can be very time-consuming and difficult.

How does DeepCode fit into the developer workflow, and how does this contrast with other approaches for finding similar bugs, such as identifying a problem in QA or after code is released?

Paskalev: Currently, we have optimized DeepCode to report issues at code review time, as this is a serious pain point in the software creation lifecycle. However, it is possible to integrate DeepCode at any step of the lifecycle.

How does DeepCode compare, contrast, and complement JSNice, Nice2Predict, and DeGuard?

Paskalev: JSNice and DeGuard are systems we created which target the specific problem of code layout deobfuscation. DeepCode is a more general system which aims to automatically find a wide range of issues in code. This makes DeepCode applicable not only when trying to understand someone else’s code (e.g. to audit it for security), but also when writing and committing new code.

What other research on using AI to explore bugs have you come across, and how does DeepCode compare and contrast with these?

Paskalev: The field of using AI for code is fairly new but growing. However, we are currently not aware of any system with the capabilities of DeepCode. Unlike other systems that try to use AI methods directly over code, DeepCode is based on AI that is actually able to learn interpretable rules. This means the rules can be examined by a human and easily integrated into an analyzer.

Can you say more about the process of parsing code with the AI tools and building up the rule-set? What kinds of AI or other analytics techniques are used?

Paskalev: DeepCode is based on custom AI and semantic analysis techniques specifically designed to learn rules and other information from code, as opposed to techniques designed for other data (e.g., images, videos), which are less effective when dealing with code.

How do you go about classifying code as a mistake?

Paskalev: Our AI engine learns rules based on patterns that others have fixed in the past, and it understands which problem each fix addressed based on the commit messages and bug databases. Then, it uses the learned rules to analyze your code; any rules that trigger are reported to the developer.
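
To make the idea of learning rules from past fixes more concrete, here is a deliberately tiny Python sketch. It is not DeepCode's algorithm, which the interview does not describe in detail. It assumes a hypothetical input format of (before, after, commit message) triples, treats a commit as a fix if its message contains a few keywords, and uses exact line matching where a real system would use semantic analysis. The names `Rule`, `learn_rules` and `analyze` are invented for this illustration.

```python
import difflib
from collections import Counter
from dataclasses import dataclass

# Assumption: a commit counts as a "fix" if its message contains one of these.
FIX_WORDS = ("fix", "bug", "vulnerab")


@dataclass(frozen=True)
class Rule:
    bad: str      # line as it looked before the fix
    good: str     # line as it looked after the fix
    support: int  # number of fix commits that made this change


def changed_lines(before, after):
    """Yield (old_line, new_line) for every single-line replacement."""
    a = [line.strip() for line in before.splitlines()]
    b = [line.strip() for line in after.splitlines()]
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "replace" and i2 - i1 == 1 and j2 - j1 == 1:
            yield a[i1], b[j1]


def learn_rules(commits, min_support=2):
    """Keep line changes that recur across at least min_support fix commits."""
    counts = Counter()
    for before, after, message in commits:
        if any(word in message.lower() for word in FIX_WORDS):
            # Count each distinct change once per commit.
            counts.update(set(changed_lines(before, after)))
    return [Rule(bad, good, n) for (bad, good), n in counts.items()
            if n >= min_support]


def analyze(source, rules):
    """Return (line number, rule) for every line matching a learned pattern."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for rule in rules:
            if line.strip() == rule.bad:
                findings.append((lineno, rule))
    return findings
```

Each finding carries the `good` line seen in earlier fixes, which loosely mirrors the idea of showing developers how others fixed a similar issue. Exact string matching is the weakest part of this sketch: it only triggers on lines that are textually identical to ones fixed before.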

What have you learned about making recommendations for fixing bugs?

Paskalev: We learned that simply localizing the bug is not enough. The real challenge is to explain the issue and provide actionable feedback on what the problem actually is. DeepCode connects the report to how others have fixed a similar issue, which is an important step towards that goal.

What languages does it support now, and what is involved in adding support for new ones?

Paskalev: Currently, DeepCode supports Java, JavaScript, and Python. Adding a language requires adding a parser and extending our semantic code analyzer to handle special features of the language. Because of the particular way DeepCode is architected, we can add a language every few months.

How does DeepCode differ from traditional static analysis tools?

Paskalev: Static analysis tools available out there often come with a set of hardcoded rules that aim to capture what is considered “bad” in code. Then, they check your code against these rules. Over the last decade, many companies have created such tools, e.g. Coverity, GrammaTech, JetBrains, SonarSource, and others. That type of approach typically gets one to a few thousand rules across tens of programming languages.
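
For contrast, a hardcoded rule of the kind described above might look like the following minimal Python sketch, built on the standard library's `ast` module. It is an illustration only and is not taken from any of the tools named in the interview. The class name `NoneComparisonRule` and the helper `check` are hypothetical.

```python
import ast


class NoneComparisonRule(ast.NodeVisitor):
    """Hand-written rule: flag '== None' and '!= None' comparisons."""

    MESSAGE = "compare to None with 'is' or 'is not'"

    def __init__(self):
        self.findings = []

    def visit_Compare(self, node):
        # A chained comparison such as a == b == None has several operators,
        # so pair each operator with the operands on either side of it.
        operands = [node.left, *node.comparators]
        for op, left, right in zip(node.ops, operands, operands[1:]):
            is_equality = isinstance(op, (ast.Eq, ast.NotEq))
            has_none = any(
                isinstance(x, ast.Constant) and x.value is None
                for x in (left, right)
            )
            if is_equality and has_none:
                self.findings.append((node.lineno, self.MESSAGE))
        self.generic_visit(node)


def check(source):
    """Parse the source and return (line number, message) findings."""
    rule = NoneComparisonRule()
    rule.visit(ast.parse(source))
    return rule.findings
```

Every such rule has to be written and maintained by a person, which helps explain why hand-built rule sets grow slowly.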

250,000 rules seem like a lot compared to 4,000. Is it the case that it can identify more types of problems, or that it can provide greater granularity in identifying how to rectify an issue, or perhaps a little bit of both?

Paskalev: We identify many more types of issues than existing hardcoded rule analyzers cover. We also provide a more detailed explanation of what the issue is and how others have fixed a similar problem. This enables users to more quickly figure out what fix they should apply.

What categories of problems does it identify now – is it just different categories of bugs or can it find opportunities for performance improvement?

Paskalev: DeepCode finds bugs, security issues, possible performance improvements and also code style issues. We learn these from commits in open source code and we use natural language processing to understand the issue that the commits fix.

Can DeepCode be used for code or architecture refactoring? Is that something you are looking at doing in the future?

Paskalev: Some of our suggestions do indeed propose refactoring code, but not yet on a project-wide architectural level. Our platform’s utility is to enable any service that requires a deep understanding of your code to be quickly and easily created. We are already scoping the launch of several exciting services that some of our early adopters have asked for.

How do you expect the use and technology of DeepCode to evolve, along with the use of AI to improve developer workflow in general?

Paskalev: Our platform is constantly getting better. This will enable developers to work on much larger projects/scopes with the same or smaller effort while minimizing the risk of defects and costly production problems.