OpenStack develops its own static code analysis tool for Python

For OpenStack, necessity proved to be the mother of invention, leading to development of a static code analysis tool for Python called Bandit.

It has become standard fare for large, enterprise organizations to reach out to the open source community to fill the void between the software their projects need, and the amount of software their information technology teams can actually develop. Whether it's something as simple as packaging the slf4j library with their applications to help simplify the task of logging, or something as complex as leveraging OpenStack to provide a globally distributed system of hypervisor-based compute nodes, more often than not, open source applications are becoming the software of choice.

The BMW Group is a perfect example. Recognizing the cost-saving benefits of virtualization and cloud-based computing, BMW began developing its own software that could bring automation and resource provisioning to new levels. "In about 2011, we developed a piece of software internally and we called it our own internal cloud," said Dr. Stephan Lenz, BMW's IT infrastructure manager. And although BMW was admittedly ahead of the curve in recognizing the role cloud-based computing would play in the modern world of IT, the automotive giant couldn't be expected to divert its technology budget toward managing and maintaining its internally built, cloud-based systems. This is why BMW chose OpenStack as its cloud computing platform.

Sadly, the world is sometimes full of injustices and cruel ironies.

Managing and maintaining software quality

Dealing with non-functional requirements is always a challenge, especially on large open source projects such as OpenStack, where source code contributions come in from innumerable sources all over the globe. Even the most obsessive-compulsive project leads can't possibly go over every line of submitted code and evaluate it against an encyclopedic list of possible security gaffes. "Manual review of code takes a long time," said Michael Xin, Rackspace's manager of security engineering. "The general rule of thumb is 200 or 300 lines per hour." When you're dealing with hundreds of potential subprojects, let alone hundreds of lines of code, doing line-by-line analysis is an impossibility. That's where static code analyzers come in.

Static code analysis has turned into big business, especially after Apple released a vulnerability-ridden product update that could have been prevented easily had static code analysis been performed after every build. HP's Fortify product has a good reputation in the industry, as does Klocwork. But when OpenStack went to evaluate a static code analysis tool to integrate into its continuous integration and continuous delivery systems, it couldn't find a product that dealt with Python. "Unfortunately, the commercial vendors do not support Python," Xin said. "It is very difficult to do data flow analysis on a dynamic language." So on that rare occasion when an open source project solicited the help of the commercial community, there wasn't a vendor that could help.

Of course, that wouldn't deter the security team. Integrating static code analysis with a continuous integration process is a standard best practice and an important part of maintaining code quality. If a static code analysis tool for Python couldn't be bought externally, one would need to be built internally, with the end result being Project Bandit.

Building Bandit

Bandit, now being integrated into more and more OpenStack projects, provides the ability to scan source code and identify potential security-related problems, including possible SQL injections, the use of unsafe libraries, the existence of unreachable code, variable taint analysis and more. Problems are identified right at the line of code in which they appear, and the tool assigns both a severity and a confidence level to each alert.
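To make the idea concrete, here is a minimal sketch of how a scanner can walk Python source and report a line number, severity and confidence for each suspicious call. It uses only the standard library's ast module. The rule table, the Finding class and the scan function are hypothetical names invented for this illustration; this is not Bandit's code or its real rule set.

```python
import ast
from dataclasses import dataclass

# Hypothetical rule table for illustration only; not Bandit's real rules.
# Maps a call name to (severity, confidence, message).
RISKY_CALLS = {
    "eval": ("HIGH", "HIGH", "eval() executes arbitrary code"),
    "exec": ("HIGH", "HIGH", "exec() executes arbitrary code"),
    "pickle.loads": ("MEDIUM", "HIGH", "unpickling untrusted data is unsafe"),
}


@dataclass
class Finding:
    line: int
    severity: str
    confidence: str
    message: str


def call_name(node):
    """Return a dotted name such as 'pickle.loads' for simple calls, else ''."""
    func = node.func
    if isinstance(func, ast.Name):
        return func.id
    if isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name):
        return f"{func.value.id}.{func.attr}"
    return ""


def scan(source):
    """Parse the source and collect one Finding per risky call, ordered by line."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Call):
            rule = RISKY_CALLS.get(call_name(node))
            if rule:
                findings.append(Finding(node.lineno, *rule))
    return sorted(findings, key=lambda f: f.line)


SAMPLE = """import pickle
data = pickle.loads(blob)
result = eval(user_input)
print(result)
"""

for f in scan(SAMPLE):
    print(f"line {f.line}: [{f.severity}/{f.confidence}] {f.message}")
```

The sample is only parsed, never executed, which is what makes the analysis static. A real tool carries far more rules and context than this toy table.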

Another compelling feature of Bandit is its customizability. Source code analysis can take an extremely long time as a source code repository grows. To speed things up, various plug-ins can be turned off, or certain directories can be excluded from scans. After all, why should the tool search for SQL injection points if the project doesn't use a database? Or why should a folder containing nothing but tests be subject to any scrutiny when none of it will go into production? The ability to customize the tool makes its use extremely compelling.
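The effect of those two knobs can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the setting names, the check names and the sample file list are hypothetical and do not reflect Bandit's actual configuration format.

```python
from pathlib import PurePosixPath

# Hypothetical settings; names and values are illustrative only.
EXCLUDED_DIRS = {"tests", "docs"}
DISABLED_CHECKS = {"sql_injection"}
ALL_CHECKS = ["sql_injection", "unsafe_library", "hardcoded_password"]


def files_to_scan(paths):
    """Keep Python files that do not live under an excluded directory."""
    selected = []
    for raw in paths:
        path = PurePosixPath(raw)
        if path.suffix != ".py":
            continue
        # path.parts[:-1] holds the directory components of the path.
        if EXCLUDED_DIRS.intersection(path.parts[:-1]):
            continue
        selected.append(raw)
    return selected


def active_checks():
    """Return the checks that remain after the disabled ones are dropped."""
    return [name for name in ALL_CHECKS if name not in DISABLED_CHECKS]


REPO = [
    "app/api.py",
    "app/tests/test_api.py",
    "docs/conf.py",
    "README.md",
    "app/db/models.py",
]

print(files_to_scan(REPO))
print(active_checks())
```

With the test and documentation folders skipped and one check disabled, the scanner has fewer files to parse and fewer rules to apply per file, which is where the time savings come from.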

Bandit easy to use

But perhaps the most compelling aspect of Bandit is the ease with which users can write customized plug-ins for the tool. Specialized problem domains have custom security needs, so the ability to write custom plug-ins that help to identify security issues that might not be pertinent to other use cases is extremely valuable. "Bandit provides a very flexible way to write plug-ins," Xin said when describing the level of effort required to add in custom security checks.
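As a rough picture of what a plug-in mechanism of this kind can look like, the sketch below registers project-specific checks against Python AST node types with a decorator. The checks decorator, the registry and both example rules are hypothetical and written for this article; they are not Bandit's plug-in API, which should be looked up in the project's own documentation.

```python
import ast
from collections import defaultdict

# Hypothetical plug-in registry: AST node type name -> list of check functions.
_CHECKS = defaultdict(list)


def checks(node_type):
    """Decorator that registers a function to run on nodes of the given type."""
    def register(func):
        _CHECKS[node_type].append(func)
        return func
    return register


@checks("Call")
def no_shell_true(node):
    """Project-specific rule: flag any call that passes shell=True."""
    for kw in node.keywords:
        if (kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True):
            return ("HIGH", "MEDIUM", "shell=True may allow command injection")
    return None


@checks("Assert")
def no_assert(node):
    """Project-specific rule: asserts vanish when Python runs with -O."""
    return ("LOW", "HIGH", "assert is stripped in optimized mode")


def run_checks(source):
    """Run every registered check against matching nodes; sort by line."""
    results = []
    for node in ast.walk(ast.parse(source)):
        for check in _CHECKS.get(type(node).__name__, []):
            issue = check(node)
            if issue:
                results.append((node.lineno, check.__name__) + issue)
    return sorted(results)


SAMPLE = """import subprocess
subprocess.call(cmd, shell=True)
assert user.is_admin
subprocess.call(["ls", "-l"])
"""

for line, name, severity, confidence, message in run_checks(SAMPLE):
    print(f"line {line}: {name} [{severity}/{confidence}] {message}")
```

Adding a new rule in this design means writing one decorated function; the scanning loop never changes, which is the property that makes custom checks cheap to add.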

And how has Bandit performed? "In Rackspace, we've been running Bandit on a couple of projects," Xin said. "Whenever code is checked into GitHub, Bandit automatically kicks off and reports are sent to the team. It gives us a baseline to describe security performance."
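One way to read the "baseline" idea is as a comparison between the findings recorded earlier and the findings after a new check-in. The sketch below is an assumption about how such a gate could work, not a description of Rackspace's setup; the function names and the sample findings are hypothetical.

```python
from collections import Counter


def summarize(findings):
    """Count findings per severity; each finding is a (line, severity) pair."""
    return Counter(severity for _line, severity in findings)


def regressions(baseline, current):
    """Return the severities whose count grew relative to the baseline."""
    return {
        sev: current[sev] - baseline[sev]
        for sev in current
        if current[sev] > baseline[sev]
    }


def gate(baseline, current):
    """Exit-code style result: 0 if nothing got worse, 1 otherwise."""
    return 1 if regressions(baseline, current) else 0


# Hypothetical scan results before and after a check-in.
BASELINE = summarize([(10, "LOW"), (42, "MEDIUM")])
AFTER_CHECKIN = summarize([(10, "LOW"), (42, "MEDIUM"), (57, "HIGH")])

print(regressions(BASELINE, AFTER_CHECKIN))
print(gate(BASELINE, AFTER_CHECKIN))
```

A continuous integration job could use the returned value as its exit status, so that a check-in that introduces new findings is reported to the team without existing, already-known findings blocking every build.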

Necessity really is the mother of invention, and although OpenStack likely never wanted to immerse itself in the world of developing a static code analysis tool for Python, the lack of any commercial products that would meet its needs led to Project Bandit's inception. And although it may not have been as easy as buying an existing solution off the shelf, it has proven to be highly effective, and other interested Python users will be able to benefit through the open source nature of the project.

How has your experience been with static code analysis tools? Let us know.
