AI techniques are widely used in modern applications to improve their capabilities. Now, software developers are starting to look at how artificial intelligence techniques, such as deep learning, can improve their ability to understand complex software, said Steven Lowe, principal consultant at ThoughtWorks Inc., a Chicago-based application development consultancy, at DeveloperWeek in San Francisco. Lowe has been working on a novel approach that uses deep learning to analyze the structure of software and quickly gain insights from old codebases.
Perhaps deep learning's greatest utility in the short run can come from exploring the gap between the way software has been coded and a company's existing business model, Lowe said. He said he believes one of the most important tools for a modern software organization lies in applying domain-driven design to model how workers see a company's existing business process. This model provides a hypothesis for testing out assumptions. The real value of deep learning comes when the underlying model can be tested out against real-world deployment. Aligning an enterprise codebase with the domain model makes it easier to iterate on new business ideas with less effort.
Deep learning requires less effort
There are literally hundreds of artificial intelligence techniques, many of which build upon each other. Deep learning is an interesting modality that allows a developer to craft algorithms with minimal effort. Developers can simply start with a large set of data to train algorithms without having to supervise the process. It has proven its strengths in Google's AlphaGo project, the Watson Jeopardy challenge, lipreading and teaching computers to recognize cats from pictures.
Lowe said he wanted to develop a set of deep learning algorithms to make it simple to visualize large codebases. One day, a colleague asked him to take a look at a rather large legacy system with over 2 million lines of code. He first used a variety of code analysis tools to generate metrics about the code. These could point him to suspicious areas worth reviewing.
Lowe suspected it might be possible to visualize this complex code in a way that would yield insights. He experimented with a variety of visualization tools. These made it possible to create a force graph in which the weights between nodes illustrated how parts of the code interacted. But this just showed where there were giant balls of mud that made little sense.
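The data behind a force graph of this kind can be sketched in a few lines. The article does not say which tools or languages Lowe worked with, so the example below is only an illustration under stated assumptions: it treats Python modules as nodes, counts import statements as edge weights, and uses a made-up three-module codebase (`SAMPLE`) and a hypothetical helper name (`dependency_edges`).

```python
import ast
from collections import Counter


def dependency_edges(sources):
    """Count how often each module imports each other module.

    `sources` maps a module name to its source text (an assumed input shape).
    Returns a Counter keyed by (importer, imported) pairs; the counts can
    serve as edge weights in a force-directed layout.
    """
    edges = Counter()
    for module, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.Import):
                targets = [alias.name for alias in node.names]
            elif isinstance(node, ast.ImportFrom) and node.module:
                targets = [node.module]
            else:
                continue
            for target in targets:
                # Only keep edges between modules inside the codebase.
                if target in sources and target != module:
                    edges[(module, target)] += 1
    return edges


# Tiny made-up codebase, for illustration only.
SAMPLE = {
    "billing": "import orders\nimport customers\nfrom orders import total\n",
    "orders": "import customers\n",
    "customers": "",
}

if __name__ == "__main__":
    for (src, dst), weight in sorted(dependency_edges(SAMPLE).items()):
        print(f"{src} -> {dst} (weight {weight})")
```

On a real legacy system, an edge list like this tends to produce exactly the dense tangle described above, which is why structure alone was not enough.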
Making code visualization useful
There are a number of other promising techniques for software visualization. Adrian Kuhn developed a tool for Software Cartography almost 8 years ago. Primitive.io demonstrated a virtual reality tool for visualizing the structure of code. These tools can make it easier to understand code structure at a glance, but there are still challenges for developers looking for the next right step.
For Lowe, the frustration was that the visualizations made it difficult to tell the difference between bugs and bad data models. It was also difficult to identify deep inheritance hierarchies within the code. Lowe thought it might be possible to label things, but that would take away from the power of visualization.
Combine big data visualization with business design
One promising technique was to apply latent semantic indexing, also called latent semantic analysis, which analyzes documents to mine the semantics or meaning of the words contained within. It generates vectors that model the words used in a corpus of documents and even application code. A related tool is Word2vec, a neural network-based word embedding technique that can be run on Google's TensorFlow. For example, it could represent: queen = king - man + woman.
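The queen = king - man + woman example is plain vector arithmetic followed by a nearest-neighbor lookup. The sketch below uses hand-made two-dimensional vectors rather than anything Word2vec learned; the numbers, the two axes (royalty, gender) and the `analogy` helper are all assumptions chosen so the arithmetic is easy to follow.

```python
import math

# Hand-made toy vectors: (royalty, gender). These are NOT real Word2vec
# embeddings, which typically have hundreds of learned dimensions.
VECTORS = {
    "king": (1.0, 1.0),
    "queen": (1.0, -1.0),
    "man": (0.0, 1.0),
    "woman": (0.0, -1.0),
    "apple": (-1.0, 0.0),
}


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def analogy(a, b, c, vectors=VECTORS):
    """Return the word whose vector is closest to vec(a) - vec(b) + vec(c)."""
    target = tuple(
        x - y + z for x, y, z in zip(vectors[a], vectors[b], vectors[c])
    )
    # Exclude the query words themselves, as analogy lookups usually do.
    candidates = (w for w in vectors if w not in (a, b, c))
    return max(candidates, key=lambda w: cosine(vectors[w], target))


if __name__ == "__main__":
    print(analogy("king", "man", "woman"))  # prints "queen"
```

With learned embeddings the result is approximate rather than exact, but the lookup works the same way.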
Lowe applied Word2vec to process user stories and code for a software project. He said this could be a promising technique for comparing how closely application code matches a business domain model. Domain-driven design leverages user stories and domain expertise to quickly generate a business model. This initial model only represents a hypothesis of how the business works based on enterprise domain experts. It must be tested out through actual working code to see how close the hypothesis comes to reality, Lowe said.
As the organization implements this model in software, it can discover where there are gaps and inaccurate assumptions. The business team can then tune the model to see what actually improves business processes and goals. But the process of tuning the model can be challenging when the application code does not reflect the working business model. Developers need to spend a lot of time coming up with kludges to fill in the gaps.
Deep learning is just the beginning
Lowe said he believes better tools to quickly visualize where code and business models diverge can make it easier to determine where to focus development resources to improve business agility. Plotting the results of latent semantic analysis of code and user stories into the same visualization can make it easier to find these gaps.
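A much simpler stand-in can illustrate the idea of looking for gaps between user stories and code. The sketch below is not latent semantic analysis or Word2vec, and it is not Lowe's method; it only compares raw term counts. It splits identifiers into words, scores overall vocabulary overlap with cosine similarity, and lists story terms that never appear in the code. The sample story, the sample code, the stopword list and all function names are made up for illustration.

```python
import math
import re
from collections import Counter

# Small, assumed stopword list covering filler words and Python keywords.
STOPWORDS = {
    "a", "an", "the", "as", "i", "can", "and", "for", "to", "of",
    "class", "def", "self", "return",
}


def terms(text):
    """Count lowercase words, splitting camelCase and snake_case identifiers."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    words = (w.lower() for w in re.findall(r"[A-Za-z]+", spaced))
    return Counter(w for w in words if w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-count vectors (0.0 if either is empty)."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def missing_terms(stories, code):
    """Domain words used in the stories that never show up in the code."""
    return sorted(set(terms(stories)) - set(terms(code)))


# Made-up inputs, for illustration only.
STORIES = "As a customer I can cancel an order and receive a refund for the invoice"
CODE = (
    "class OrderService:\n"
    "    def cancelOrder(self, order_id): ...\n"
    "    def calculate_invoice_total(self, invoice): ...\n"
)

if __name__ == "__main__":
    print(f"vocabulary overlap: {cosine(terms(STORIES), terms(CODE)):.2f}")
    print("in stories but not in code:", missing_terms(STORIES, CODE))
```

Words such as "refund" that appear in the stories but not in the code are candidate spots where the code and the domain model may have drifted apart. A semantic technique would additionally catch cases where the code uses a different word for the same concept.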
Analyzing software structure is just one use case for leveraging deep learning for developers. Other research is looking at how deep learning and other AI techniques could automate code development. Lowe said this kind of research is still in its early stages, and he encouraged other developers to launch their own research projects.
"We cannot deny that deep learning and related technologies are powerful tools, and we are not using them. The problem is not software; our brains have limits," Lowe said. "We are reading code for orientation and understanding. If we could use machine learning to speed that up, it would be a huge improvement. This is where visualization and virtual reality could help to take advantage of more of our senses."