Common sense and Code Quality - Part 1


If you are involved in a software project (as an individual coder, technical team lead, architect or project manager) chances are that code quality might not be the first thing on your mind. However, the truth is, it needs to be on everyone's radar. It is one of those things that needs well thought out strategy and continued focus throughout the project's lifecycle. Otherwise it simply spirals out of control and comes back to bite when project can ill afford a quality issue.

This article takes a very simplistic and common sense approach to code quality. The intent is to demystify code quality and help project teams pick process and tool that makes sense to them.

Just to contain the scope of the article, I have restricted the rest of the discussion, to a Java / J2EE based technology project in an enterprise scenario. The basic definition of quality and ways to ensure that, should be similar in technology projects using other technology stacks and operating in non corporate world e.g. in open source arena.

Who should care about code quality?

Let's start with a quick questionnaire:
  1. Do you deliver and / or review code written in Java?
  2. Do you manage / update / configure any 3rd party product written in Java?
  3. Do you contribute code in any java project which has legacy code?
  4. Do you contribute code in any java project which has sizeable number of classes (say more than 100) and you want to have a grasp on interdependence of those classes?
  5. Are you interested in assessing if there are structural issues in a given java project?
If the answer is yes to any / many of these questions, you should care about code quality.
The truth of the matter is that you might not have realized it yet and code quality (measuring, ensuring, delivering) might not show up as a distinct item in your role and responsibilities. But it is only a matter of time that it will catch up and cause grief if left unaddressed. It is much better approach to handle this monster proactively.

What is a high quality code anyway?

If you google it up or discuss this, you generally get two types of answers.

First type is generic *ity stuff (Flexibility, Reusability, Portability, Maintainability, Reliability, Testability etc.). While they are important, it is not always clear as to how exactly to measure them and how exactly to improve them.

Second type is highly specific technical parameters e.g. cyclometic complexity, Afferent coupling, Efferent coupling etc. There are well documented mathematical formulae to calculate these parameters, software that will calculate them for you, and relatively easy to get to a concrete actionable that will improve these numbers. However, converting the improvement in numbers to improvement in code quality remains a specialized skill.

So, net net, there is no easy answer. Let's try to change that. Let's put a series of questions that - from common sense - anyone in a team that writes / maintains high quality code base should be able to answer in affirmative.

Question 1: Are you confident that as you add new code, none of the existing, working functionality will break?

Do you / your team check in code? I think it is safe to assume, yes. Does an average developer of your team check in code more than once a day? Let's assume, yes. Is it possible for an average developer of your team, on an average day, to know at the top of his head, what all other developers have checked in and how those code snippets are supposed to work? No. Even if you have all Newton and Einstein in your team, it is an emphatic no. So, how do you ensure that as the coders are frantically churning out code, they are not actually breaking more than they are creating?

The answer should be unit testing. Cover as much code as you can cover by unit test. (If your answer is something else could you comment about it in the article, please? I would love to hear about your suggestion.)

Have an automated way of reporting to everyone in the team on the success of all unit tests every morning. If unit tests are broken, fixing them gets the highest priority for the day.

Also have an automated report to everyone in the team every morning reporting on the code coverage percentage. Ideally the code coverage percentage should increase in every report. At the very least it should remain same. If it goes down on any report, halt everything and investigate.

My common sense says that this has to be the most important code quality measure and process. (Again, if you have a different opinion, please leave a comment.) Fortunately, sorting out this bit, is comparatively easy. Just use these toolset:
  1. Unit testing framework: JUnit, TestNG
  2. Unit test coverage tool: EclEmma, Cobertura
  3. A build tool: Maven, Ant
  4. A continuous integration tool: Jenkins, TeamCity
  5. A web dashboard for the report: Sonar
I am not saying this is the single / best answer. All I am saying, if you don't have a better answer, this answer is easy, free and it works.

One note of caution. Many times, when teams start with this, someone or other googles around and finds out that good quality products are supposed to have 80% unit test coverage. In comparison the product turns out to be in a much worse state. This has many implications including morale and political issues. It is important to emphasize here that 80% code coverage in isolation does not guarantee anything. What is really important is to get a working process in place and continuously improve the test coverage.

Question 2: As you add new code, are you sure you are not committing the same silly mistakes that generally coders do? E.g. did you free up all resources in final block?

Anyone who codes commits mistakes. You are lucky if the compiler catches them for you and spits out a stack trace. But what about those that compiler does not catch but coding community knows from experience to be bad code. If you have worked on banking software a decade ago, the only way to catch the silly mistakes was by having someone senior from the team to review your code. Things have not changed much. You should still have an extra pair of eyes look at your code and design. But luckily there is some help as well. You could use this toolset:
  1. Any source code analyzer: PMD, Checkstyle, Findbugs, Crap4j
  2. A build tool: Maven, Ant
  3. A continuous integration tool: Jenkins, TeamCity
  4. A web dashboard for the report: Sonar
Again, I am not saying this is the single / best answer. All I am saying, if you don't have a better answer, this answer is easy, free and it works.

One note of caution. Most of the projects which start with these are inundated with hundreds (if not thousands) of items flagged by these source code analyzers. It is very important to spend some time upfront with these tools and throttle the reporting. Fortunately it is very easy to add / delete rules to these source code analyzers effectively configuring these to report only what you / your team thinks is worthy of flagging. The trick is to ensure that the rules are relevant to your team and the reports are treated with utmost respect. It is no good if the tools keep reporting a bunch of issues and nobody in the team is either convinced that they are relevant or nobody is sure who is expected to fix them.

I will draw part 1 of this article to a close here. The first couple of questions that we have discussed in this article, I believe, are the most important. They should be taken up first by any technology project which sees value in having a handle on the quality of code. The next part will touch on advanced topics like structural analysis, mutation testing etc.

Until then, happy coding.

A version of this article - slightly edited, is also available at this link at Javalobby.
If you would like to read the second article of this series, please click here

If you want to get in touch, you can look me up at Linkedin or Google + .

No comments: