I have been redesigning a J2EE system recently that was written about a year ago and I have done this countless in my career of over fifteen years. But I found a quite a number of code smells that show up every time. I find that badly design procedural code is easier to fix than badly design object oriented code. Often the systems are written in object oriented languages and using state of the art technologies such as Hibernate/Spring, so you would think that they also follow good principles of object oriented or domain driven design, but you will be surprised. First, let me summarize characteristics of good systems that I learned at school and over the years:
- loosely coupled or separation of concerns – this is by far the most important ingredient of good software.
- good abstraction that focus on areas of domain that are focus of the system
- separation of concerns – good use layers. Here by layer, I mean logical layer not physical layer (tier), which is also source of confusion to many.
- single responsibility principle – high encapsulated modules that have high cohesion within and low coupling across
Also, when you are developing large systems, it is also important to consider physical separation of modules and interdependencies, which means system with:
- low cyclomatic complexity
- Acyclic Dependency Principle – No circular dependencies
- Dependency injection – This greatly helps coupling and facilitates testing.
- Multiple codebases – Though, it doesn’t matter for small systems, but when dealing with hundreds of thousands of lines of code, it is important to separate code into multiple codebases along the line of business functionality.
And now the actual code smells that I have found time and again while maintaining, extending or rewriting systems:
Procedural code in disguise of Object Oriented code
It says, old habits are hard to break, often I find systems that are written in object oriented languages, but in procedural style. The smells I often find are:
- anemic model (thanks to the EJB legacy), where domain classes are just like C structs and all business logic is in services. Often I find that service managers have methods that take domain class as an argument and do some business logic and return. Such business logic is better off in the domain object itself.
- static methods and attributes – this is also big code smell often from developers that started with C like languages.
This would be the winner of code smell that is found most often. For example, in my current system I found that domain classes were more like structs, and services where all the fun happened. The services did all database queries (Hibernate) and business logic. Worse, the application had severe performance issues so for every service there was a sister class for caching the service methods and often those caching classes also had database access methods. Clearly, they violated coupling and separation of concerns. I am trying to divide all those responsibilities into domain classes, Dao/Repositories, and Services. In this case, most of the business logic will go to the domain classes, all database access will go to Dao/Repositories and high level business or orchestration logic will go to the services. In this design, cross cutting concerns like transactions, caching are defined at service layer in declarative manner. For example, I added some caching annotations to signify the methods that should be cached. Another violation of separation of concerns was coupling of user view in the service layer and a lot of service managers took Locale as argument and looked up resource file. All that logic will be in the view layer.
Factories and Singletons
The GoF made Factories and Singleton a must have pattern for all cool object oriented systems, but in reality they make maintenance and testing hard. I found plenty of usage of these patterns in my current system and I am replacing them with dependency injection.
Too much Code
Though, for a large system you do end up with a lot of code, but I have seen even smaller projects with a lot of code. Also, some languages are more verbose than others. For example, often I find that code I write in Ruby takes less than half number of lines of code.
Distinction between Classes and Objects
I have found this smell in a number of projects where many of the classes were just classes with different values or slightly different data. For example, I found over 400 classes in my current project that I would consider objects with a little bit of difference. Though, for complicated systems where meta information is slightly different, you may use adaptive object model pattern, but that is not for the faint of heart.
This is another reason for large code base when the system adds some technical requirements when they are not really needed. For example, in my current system we don’t have internationalization requirement, but we have sporadic use of localization. It may not be too bad, except all that localization is inconsistent, which would have to be changed if i18n becomes real requirement. Meanwhile, you will have to maintain all localization code. In this regard, I would follow agile principle of YAGNI.
Cross Cutting Concerns/Separation of Policies
The cross cutting concerns such as caching, security, transactions should be handled centrally in a declarative manner. Though, AOP is not for faint of heart, but EJB 3.0 and Spring provide nice support for that. Also, with annotations in Java, you can define your own policies. For example, caching was one thing that I found embedded in every piece of code (in addition to separate sister classes for services) in my system and it was really hard to track how things work.
Strings as glorified DSL
This is another code smell that I have seen where strings store some complicated data structure that you have to parse. For example, prior to JDK 1.4, developers had to parse line numbers of source code from stack trace, because it would return String. I found countless examples of this in my current system where strings stored complicated data structures or in other cases where strings were used when actual typing such as Long, Date would have been better.
Pareto principle (80/20 Rule)
I believe the system should be design based on the majority of use cases. Unfortunately, I have seen some systems where the whole design was sacrificed to fit some corner cases. This made the whole system very convoluted. The special cases should be handled specially and not at the cost of entire system. Again, I found an example of this in my current system where primary keys were defined with convoluted abstraction just because a couple of classes could not support surrogate keys for primary ids.
These days, I find most the projects are driven by quick wins. I find that in most of the development shops, developers are facing impossible deadlines where they must ship the system. In these circumstances, they do the quick and dirty job to ship the product, but soon it becomes the maintenance hell. I have found that since the dot bombs era, the working environment in most shops have degraded for developers and offshoring has worsened this situation. Many of the managers consider developers “code monkeys” that can be easily replaced. Unfortunately, I don’t see things improving anytime soon.
PS: Further reading: