Painless Unit Testing with AgitarOne: Video Tech Brief

Discussions


  1. Alberto Savoia, the CTO and co-founder of Agitar Software, talks about their new AgitarOne, a tool for dealing with what some call "the steaming pile of legacy code". Legacy in this context means code that hasn't been unit tested. Agitar's new tool brings full automation to the testing of both new and legacy code. Out of the box, AgitarOne generates basic unit tests for at least 80% of the code in a project in a matter of hours, even if the sources for some of the classes aren't available. Development teams can then reduce the risk of changing business applications and focus their energy on refining the tests and writing new application logic. Agitar also offers a free service, JUnit Factory, which automatically generates JUnit tests that characterize the specific behavior of Java code. JUnit Factory is built on the same underlying technology as AgitarOne but provides only JUnit generation; it's a good way of gauging the technology's capabilities. The tools only take you so far, since they generate tests that characterize the actual behavior of the code. Alberto shares his wisdom on developing an all-encompassing approach to testing in his book The Way of Testivus (PDF download, 500 KB), also available in print upon request from Agitar Software. Alberto Savoia co-founded Agitar in June 2002. Before Agitar, he worked at Google as the engineering manager in charge of the highly successful and profitable ads group. In October 1998 he co-founded and became CTO of Velogic, Inc., the pioneer in Internet performance and scalability testing, acquired by Keynote Systems in 2000. You can learn more from Alberto on his blog, hosted at Artima.
  2. ...we invented magic technology called "mockitator" that runs inside of "agitator"...
    That sounded funny :-) On a serious note, paying >$50K for a 10-seat license is rather excessive for almost everyone. Best, Nikita Ivanov. GridGain - Grid Computing Made Simple
  3. Hi Nikita,
    ...we invented magic technology called "mockitator" that runs inside of "agitator"...
    The Mockitator name was meant to be funny :-) or at least funny-ish. You should see the code/product names that we didn't pick. We also have an internal tool called the "mutilator" that we use to experiment with mutation testing.
    On a serious note, paying >$50K for a 10-seat license is rather excessive for almost everyone.
    I agree that AgitarOne is not inexpensive. But considering that the fully loaded cost (i.e. salaries + benefits + HW + offices + support + ...) for a 10-person development team is ~$2M for most of our customers, $50K is only 2.5%. For teams that believe in the benefits of having unit tests for their new and legacy code, the test amplification and automation provided by AgitarOne allows them to have exhaustive tests without exhausted developers - which is priceless IMO (cue the MasterCard commercial music). We have found that it takes 3-4 lines of JUnit for every line of Java to achieve relatively thorough coverage. That's a lot of manual testing and a lot of developer time, especially since much of the test code they have to write is basic template code and/or obvious test cases that can be derived from the code. By automating/accelerating *some* of the test development, we leave developers more time to focus on composing the more complicated test cases: those that require human intelligence and specific domain knowledge. Furthermore, for teams that don't have or don't want to spend the money, we still have options: 1) For the people who don't *have* money to spend on it (i.e. students, open-source developers, researchers), we offer full-featured free licenses. Just ask. There are already several universities using our tools in their programming classes. 2) For people who don't *want* to spend any money (even on hardware to run the server), we offer JUnit Factory. Even though we designed JUnit Factory for open-source developers and students, we have a bunch of well-known and very rich organizations using it on a regular basis. Alberto
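The 3-4 lines of JUnit per line of Java that Alberto cites is easy to see in miniature. Below is a hypothetical sketch (all names invented; plain asserts stand in for JUnit so the example is self-contained): a three-line method whose branches each need their own test case.

```java
// Hypothetical example of the test-to-code ratio: three lines of logic,
// three branches, and each branch needs its own check. In real JUnit each
// case is several lines (setup, call, assertEquals), hence the 3:1+ ratio.
public class RatioExample {

    // Three lines of production logic under test.
    static int clamp(int value, int lo, int hi) {
        if (value < lo) return lo;
        if (value > hi) return hi;
        return value;
    }

    static void check(boolean condition, String message) {
        if (!condition) throw new AssertionError(message);
    }

    public static void main(String[] args) {
        check(clamp(-5, 0, 10) == 0, "below range should clamp to lo");
        check(clamp(15, 0, 10) == 10, "above range should clamp to hi");
        check(clamp(7, 0, 10) == 7, "in range should pass through");
        System.out.println("all clamp checks passed");
    }
}
```

Even for this trivial method, thorough coverage already takes more lines of test scaffolding than there are lines of code under test.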
  4. Nikita, Instead of always criticizing others, isn't it a good idea to accept the fact?
  5. Nikita,
    Instead of always criticizing others, isn't it a good idea to accept the fact?
    What is wrong with criticism? I think criticism is healthy when it's justified. Unfortunately my good credit would not be enough to finance this product - I would have to take out a second mortgage for it ;-) Jokes aside, I actually have a question that has been bugging me. After Agitar generates tens of thousands of lines of unit tests for my "legacy" code, what do I do when I see errors in 5,000 tests at once? What I mean is, how do I maintain such a huge load of generated unit tests without having "exhausted" developers? Another question is how long it takes to run these tests. Basically, after I refactor a line of code, how long does it take me to find out that I didn't break anything? Good luck with the product! Best, Dmitriy Setrakyan GridGain - Grid Computing Made Simple
  6. What is wrong with criticism? I think criticism is healthy when it's justified.

    I don't mind some criticism and feedback - as long as we are all respectful in expressing our opinions (which has been the case) and we are both open to the possibility that the other person might have a point.
    I actually have a question that has been bugging me. After Agitar generates tens of thousands of lines of unit tests for my "legacy" code, what do I do when I see errors in 5,000 tests at once? What I mean is, how do I maintain such a huge load of generated unit tests without having "exhausted" developers?

    Good question. The first thing I should point out is that if you are working with legacy code, you should run the tests very frequently - ideally after each substantive change. If you generate the tests, then spend 3 months making changes and re-run the tests, you will get a lot of failures at once, which would be a problem. But if you run the tests several times a day, you should not run into a situation where you have 5,000 or even 500 failures. And if you do, it's a sign that you have made a really major change; at that point you are usually happy that the tests started flashing red. BTW, this (i.e. running the tests frequently and not letting failures accumulate) is exactly what you'd do even if the tests were written manually. If you do get 5,000 failures after a change, there are a couple of possibilities: 1) Oooops, I did not realize that so many other classes depended on the behavior I just changed. I'd better backtrack. 2) All the failures are expected and intended consequences of the change. In this case most of the failure messages will be similar (e.g. "expected <old value>, got <new value>"), so you can scan a few and automatically repair/regenerate the tests so they are in sync with the newest version of the code - another advantage of test automation. Without it, you'd have to edit a bunch of tests by hand.
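The workflow Alberto describes rests on characterization tests: tests that pin down the code's *current* behavior (right or wrong) and fail loudly when that behavior shifts. A minimal hypothetical sketch, with plain Java standing in for generated JUnit and all names invented:

```java
// Sketch of a characterization test: the expected value was recorded from
// the current implementation, so any behavior change produces an
// "expected ... got ..." failure that can be scanned and re-recorded.
public class CharacterizationExample {

    // Legacy method whose exact current behavior we want to lock in.
    static String formatPrice(double amount) {
        return "$" + Math.round(amount * 100) / 100.0;
    }

    public static void main(String[] args) {
        String actual = formatPrice(19.999);
        String expected = "$20.0"; // recorded from the current implementation
        if (!expected.equals(actual)) {
            // This is the message you scan after an intentional change,
            // before regenerating the expectation to re-sync the test.
            throw new AssertionError("expected " + expected + " got " + actual);
        }
        System.out.println("characterization holds: " + actual);
    }
}
```

The point is not that "$20.0" is correct pricing logic; it is that the test detects any drift from today's behavior, which is exactly what you want when changing legacy code.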
    Another question is how long does it take to run these tests? Basically, after I refactored a line of code, how long does it take me to find out that I didn't break anything?

    Another good question. It depends on the size of your code, of course. If you have a big project with, say, thousands of Java classes, your AgitarOne server will probably consist of multiple CPUs and the tests will run in parallel. I try to have enough CPUs (which are cheap compared to developer time and time-to-market delays) to run most of our thorough test suites in under an hour or so. We also have some smoke test suites that run in minutes. BTW, on JUnit Factory you can not only generate tests for free, but also run your tests in parallel and generate a fancy-schmancy test run, code coverage, and code analysis report suitable for framing - also for free. We even pay for the electricity :-). Here's an example of a dashboard: http://www.junitfactory.com/agitar-server/dashboards/developer/8D296ED9EAC8022B08E4DDCC2387BE5C/latest/index.html
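The parallel execution Alberto mentions can be sketched in miniature with a thread pool: independent test tasks fan out across workers, so suite wall-clock time shrinks roughly with the worker count. This is an illustrative sketch only, not AgitarOne's actual mechanism, and the test names are made up.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of parallel test execution: each Callable plays the role of one
// independent test class; a fixed pool runs them concurrently.
public class ParallelSuite {

    static List<String> runSuite() {
        List<Callable<String>> tests = List.of(
                () -> { Thread.sleep(50); return "FooTest: pass"; },
                () -> { Thread.sleep(50); return "BarTest: pass"; },
                () -> { Thread.sleep(50); return "BazTest: pass"; },
                () -> { Thread.sleep(50); return "QuxTest: pass"; });
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<String> results = new ArrayList<>();
            for (Future<String> f : pool.invokeAll(tests)) {
                results.add(f.get());
            }
            return results;
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        runSuite().forEach(System.out::println);
        long ms = (System.nanoTime() - start) / 1_000_000;
        // With 4 workers, four 50 ms tests finish in roughly 50 ms, not 200 ms.
        System.out.println("suite time: about " + ms + " ms");
    }
}
```

The same fan-out idea, scaled up to many CPUs on a server, is why even a thorough suite over thousands of classes can finish in well under the serial time.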
    Good luck with the product!
    Thank you and same to you. I'll check out GridGain to see if we can use it for our servers. Alberto
  7. BTW, on JUnit Factory, you can not only generate tests for free, but you can also run your tests in parallel and generate a fancy-schmancy test run, code coverage, and code analysis report suitable for framing - also for free. We even pay for the electricity :-).
    Yes, this sounds great except for the fact that I don't want to be doing this on an existing, commercial project - particularly when your legal terms mean that I have to accept uploading my code to your servers. NO WAY!
  8. After Agitar generates tens of thousands of lines of unit tests for my "legacy" code, what do I do when I see errors in 5,000 tests at once?
    You scream? Someone in the audience at a presentation Kent Beck did used the phrase "getting yourself into useful trouble". I think that is a very fitting term. If you have just broken your application, you want the gongs to sound, the walls to tremble, and the horseman of system architecture to at least saddle his horse. Your legacy code must be really awful for that to happen, though. You should probably fix it right now, and for that you need tests, according to "Refactoring" by Martin Fowler. I agree with him. I am more concerned about the effects of this tool on a "usefulness" scale - the same concern you are talking about, but on a different level. You have 10k tests, all automatically generated. Some test wanted behaviour. Some of them are bound to test behaviour that is really a bug nobody has gotten around to fixing. Still, your tests are at the unit level, and it will probably be very hard to link a failed test to behaviour the user wants - or to one of the bugs I mentioned, this being legacy code. I have an uncomfortable feeling this is a tool that can very easily turn into a comfort tool, only there to achieve the magic 80% test coverage. If nobody knows what the business rules behind those 80% are, the bomb may still be ticking. Then again, it seems I am more of a Dan North/BDD fan than a Kent Beck/TDD fan. I am thoroughly biased, and proud of it. I wish someone would build a "use case coverage" tool for testing: how much of the stuff the customer is paying me to do have I actually tested? The rest of the tests are lint. Geir
  9. No, it is not. I'm glad that you are so much better off than me that a $50K/10-seat license in today's market doesn't bother you. Hope that one day I too will be looking down from my ivory tower, thinking that $50K is just chump change :-) Best, Nikita Ivanov. GridGain - Grid Computing Made Simple
  10. ... and our developers need to eat too - even if it is mostly pizza.
    No, it is not. I'm glad that you are so much better off than me that a $50K/10-seat license in today's market doesn't bother you.

    Hope that one day I too will be looking down from my ivory tower, thinking that $50K is just chump change :-)

    Best,
    Nikita Ivanov.
    Hi Nikita, I can pretty much assure you that most of our customers are as far from an ivory tower as you can imagine. They are often working with thousands of legacy code classes, with the original developers gone, and with NO tests. They are terrified to touch the code for fear of breaking backward compatibility which, in many cases, has to go back several versions. It's a very nasty and difficult job, and one that's almost impossible to do without some test automation. In these situations our technology is hardly an ivory-tower luxury. Also, in today's market, good developers are very hard to find and expensive to keep. Many companies pay recruiters 20-30%+ of a developer's salary to help them find the right person - that's $20-40K+! With average developer tenure at 2 years or less, that's $10-20K/yr/developer just in recruiting fees. IMO, a technology that helps you make the most of the developers you have, and keeps them happy and producing high-quality, well-tested code, is definitely worth some money. It would be a false economy to hire expensive developers but not be willing to arm them with the tools they need to be as productive as they can be. But that's just my opinion. I like to give my developers fast CPUs, big monitors, and pretty much any software tool they believe they need to be most effective. The other side of the equation is that it takes a LOT of money and development time to create a technology like AgitarOne - we are talking about tens of millions of dollars and many years of development with lots of the very expensive and hard-to-find developers I just mentioned :-). I am a big believer in open source, and Agitar is a major financial supporter of the open-source efforts we leverage (we employ and fund two of the three JUnit developers, Kent Beck and David Saff, and we also employ Jeff Fredrick, the top contributor to CruiseControl.)
But for something as complex and sophisticated as AgitarOne to come to life, you need a significant initial investment and some very focused effort by a lot of developers over several years. I don't think there's anything wrong (either morally or business-wise) with charging profitable companies (much bigger and more profitable than Agitar) the prices we do to recoup our initial expenses and to fund further development to help them even more. They benefit, we get to keep developing exciting technology, and smaller/younger companies also benefit because we make a lot of our technology freely available. Speaking of which, I should reiterate that for teams that are too small or cannot afford a team/enterprise license, JUnit Factory offers the most requested features of AgitarOne (i.e. automated JUnit test generation, distributed test execution, and our fancy-schmancy reporting and dashboard) for free. Frankly, I think we are being pretty darn good and generous citizens :-). Alberto PS Here's a link to another JUnit Factory dashboard example. This one is for the IM version of a game called "Diplomacy" that some of our developers like to play in their ample free time: http://www.junitfactory.com/agitar-server/dashboards/developer/A1E247B1C05647B1F539D47F6BDEC358/latest/index.html
  11. Agitar testing complicated

    For those of you who commented that this would mean fewer exhausted developers... I would say think again. The test cases, the assertions, and their interpretation are too complex for an average developer. It takes an above-average person to make sense of it all. In short, using Agitar is no less exhausting.
    The test cases, the assertions, and their interpretation are too complex for an average developer. It takes an above-average person to make sense of it all.
    In short, using Agitar is no less exhausting.
    First of all, let me state that I believe the ideal state of affairs is for developers to write tests for their code while they are coding it. This helps to ensure that the code is testable, that it will be more solid, etc. If you develop using TDD, or write nice and thorough tests for most of your code without the help of any tools, then congratulations and more power to you. If every programmer did that, the industry would be in much better shape and we would not have started Agitar to address this issue. But the situation, as anyone can tell, is very different from that. Of all the Java code out there, I'd be surprised if more than 20% of it has automated unit tests to go with it. The actual number is probably less than 10%. On the test interpretation issue: it's tautological that a test written by a developer will be easier to interpret by that developer (although not necessarily by other developers). Again, manual unit tests are great. Every non-trivial Java class deserves them. Too bad there are so few of them. But even with manual tests, over the years I have found that the readability and ease of interpretation of unit tests is proportional to the readability, quality, and complexity of the code. If you need to generate unit tests for a class that can't be constructed/initialized without the server running, and that has dozens of dependencies on other classes (that also cannot be constructed or initialized), you have to resort to using mocks which, whether written by hand or automatically generated, make a test harder to read. Ugly code --> ugly tests. But as I wrote in The Way of Testivus: ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ An ugly test is better than no test. When the code is ugly, the tests may be ugly. You don't like to write ugly tests, but ugly code needs testing the most. Don't let ugly code stop you from writing tests, but let ugly code stop you from writing more of it.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ You can make legacy code less ugly by refactoring (but you may need some ugly tests to help you do that safely). And you can make the generated tests more readable by writing "test helpers" that will be picked up by the test generator to produce better, more realistic test data and more relevant test assertions. On the claim that Agitar is no less exhausting: I strongly disagree with this one. Let's do the math. Imagine being handed 3,000 Java classes that you did not write and having to provide tests for them - a typical situation for many of our users. Imagine that many of the classes, and most of the core classes on which everything else depends, can't be constructed/initialized without making heavy use of mocks. Assuming an average of 50 lines of code per class and applying the 3-lines-of-JUnit per line-of-Java ratio required to achieve decent code coverage, someone would have to write some 450,000 lines of JUnit (i.e. 3,000 * 50 * 3). Even if you crank out an amazing 1,000 lines of JUnit per day, and never take a day off, you are looking at well over a year of doing nothing but writing JUnit. You would not be exhausted - you'd be dead :-). That's why the above scenario never happens. When faced with providing tests for legacy code, the task looks so monumental that people don't even try. Even writing a small subset of those tests for the parts you think are at risk is a big challenge, for the reasons I already mentioned (i.e. the code is untestable without some heavy/difficult mocking/refactoring, etc.). With AgitarOne, you can generate a basic set of unit tests in a matter of hours that will get you some 80% coverage. Admittedly, at the beginning, many of those tests will be "ugly" because the only way to generate any tests at all was to make aggressive use of mocking techniques.
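The "ugly code needs mocks" point can be illustrated with a hand-rolled stub: a class that normally needs a live server becomes testable once the dependency sits behind an interface. This is a hypothetical sketch with invented names; a real generated test would use heavier mocking machinery.

```java
// The external dependency that can't run in a unit test (imagine it
// needs a live server connection). All names here are hypothetical.
interface PriceServer {
    double lookup(String sku);
}

// Legacy-style class that depends on the server through the interface.
class LegacyBilling {
    private final PriceServer server;

    LegacyBilling(PriceServer server) {
        this.server = server;
    }

    double total(String sku, int qty) {
        return server.lookup(sku) * qty;
    }
}

public class MockExample {
    public static void main(String[] args) {
        // The stub replaces the unavailable server. Uglier than a plain
        // test, but it makes the class testable at all.
        PriceServer stub = sku -> 2.50;
        LegacyBilling billing = new LegacyBilling(stub);
        if (billing.total("ABC-123", 4) != 10.0) {
            throw new AssertionError("expected 10.0");
        }
        System.out.println("stubbed billing test passed");
    }
}
```

When the dependency is *not* already behind an interface, you first need exactly the kind of refactoring-with-ugly-tests that the Testivus quote above describes.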
But if one is willing to invest on the order of 1/10th of the effort required to write all the tests manually - creating some test helpers and refactoring some code to make it testable - you will have a very powerful and readable set of tests. The second, very important, point I want to make is that you don't need to look at every single test that is generated. Why would you? As I covered in the interview, in testing legacy code the main objective and challenge is to make changes/additions to some portions of the code without unintentionally breaking other portions. This means that you only need to look at the tests that fail - which will be a very small subset of the overall set. As I said in the interview, testing the "steaming pile of legacy code" is not an easy job; it's often a seemingly impossible job that would require months and years of writing tests by hand. But with some test automation and test amplification, this impossible but necessary task becomes possible, manageable, and doable in days and weeks instead of months and years. If you disagree, perhaps we can set up a challenge between two groups of developers. Let's pick a typical pile of steaming legacy code (say 1,000 Java classes) that neither group has seen before. Let's take one week or one month of time to come up with some tests: one group working manually, the other with AgitarOne. I am sure that the group with AgitarOne will achieve excellent coverage, but that's not the point - coverage is a necessary but not sufficient condition. The real question is: will the tests catch regressions? So let's introduce some random intentional changes/defects/regressions in the code and see which suite of tests detects the most potential regressions. The fact is that testing is a combinatorial problem. Every conditional statement (e.g. an 'if' statement) requires at least two tests. A compound 'if' statement (e.g. if (a || b && (c || d))) would require many more.
Most developers and most teams don't have the time, skills, or desire to spend 50%+ of their time writing all the tests that are needed. I believe the current state of affairs provides ample and irrefutable evidence of that. In summary: to test efficiently and effectively, some automation is not a luxury but a necessity. I stand by my position: if you want exhaustive tests without exhausted developers, you need some test automation or test amplification. AgitarOne might not be perfect yet, but I know of no other technology that even comes close. Alberto
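The combinatorial point above is easy to verify by hand. For a compound condition like a || (b && (c || d)), plain branch coverage (the if taken and not taken) needs only two tests, but exercising each boolean sub-term takes several more. A small hypothetical sketch:

```java
// Sketch of why testing is combinatorial: one compound condition,
// and the growing set of inputs needed to exercise it thoroughly.
public class BranchExample {

    static boolean eligible(boolean a, boolean b, boolean c, boolean d) {
        if (a || (b && (c || d))) {
            return true;
        }
        return false;
    }

    static void check(boolean condition) {
        if (!condition) throw new AssertionError("unexpected result");
    }

    public static void main(String[] args) {
        // Two tests give plain branch coverage (taken / not taken)...
        check(eligible(true, false, false, false));  // taken via a
        check(!eligible(false, false, true, true));  // not taken (a, b false)
        // ...but exercising each sub-term independently needs more cases:
        check(eligible(false, true, true, false));   // taken via b && c
        check(eligible(false, true, false, true));   // taken via b && d
        check(!eligible(false, true, false, false)); // b true but c||d false
        System.out.println("all 5 branch cases pass");
    }
}
```

Multiply this by every conditional in a few thousand classes and the case for generating the mechanical portion of the tests makes itself.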
  13. licensing for smaller teams

    Hi, I agree with previous comments that the $50K price tag is too high. Do you have any other licensing options for smaller teams, say < 10 developers? -Regards Hemant
  14. How about JUnit Factory for free?

    Do you have any other licensing options for smaller teams, say < 10 developers?
    Hi Hemant, We opened up JUnit Factory (JUF - originally meant for open-source developers, students, and researchers, and invitation-only) to anyone who wants to use it. You get Agitar's test generation engine, distributed test execution, and dashboard reporting, all for free. Since we opened up JUF, we've had thousands of users from dozens of countries. Curiously enough, most of the heavy users (i.e. those generating thousands of tests day after day) are commercial organizations (banks, auto makers, etc.) rather than open-source developers, students, or researchers. Over time we'll keep adding features that people request and more CPUs to keep up with demand. But for the time being, would that address your immediate needs? Alberto