Twitter Updates

Archive for February, 2009

Does software quality matters?

Tuesday, February 24th, 2009

In last couple of weeks, I followed interesting debate between Bob Martin and Joel Spolsky/Jeff Attwood (listen Hanselminutes) on the issue of SOLID principles. The SOLID acronym stands for five principles, i.e.,

  • Single Responsibility Principle – this principle is based on the principle of Cohesion from Tom Demarco’s Structured Analysis and Design book and mandates that a class should have one, and only one, reason to change.
  • Open Closed Principle – this principle is based on Bertrand Meyer’s book on Object Oriented Software Construction that says that a class should be open to extend without modifying it.
  • Liskov Substitution Principle – this principle was introduced by Barbara Liskov that says derived classes must be substitutable for their base classes.
  • Dependency Inversion Principle – states that your should depend on abstractions, not on concrete implementation.
  • Interface Segregation Principle – states that you should make fine grained interfaces that are client specific.

I learned these principles many years agao and also watched Bob Martin in Best Practices 2004 conference, where he talked about these principles. These principles sound good, though in reality they should be general guidelines rather than principles. But the heat of the debate was the way these principles have been evangilized by Bob Martin who insisted on using these all the time. Over the years, I have seen similar debates between Cedric Beust and Bob Martin over the use of test driven development. There is also debate on topic of TDD and architecture between Bob Martin and Jim ‘O Coplien. Overall, I find that the issues from these debates boil down to the following:

Usable software vs quality

One of the controversial point that Jeff Attwood raised was that the quality does not matter if no one is using your application. I find there is lot of truth to that. In fact, this is the route that most startup takes when developing a new idea and trying something new. For example, everyone was blaming Twitter for choosing Rails when it had scalability and reliability issues. However, Twitter would not have existed if it were written with most scalable and reliable language or framework that took two more years or if the application itself wasn’t as useful. I tend to follow the age old advice: First make it work, then make it right and then make it fast. There is a reason why rapid prototyping frameworks such as Rails, Django, Zend framework, etc. are popular because they allow you to focus more on business functionality and to reduce time to market. So, I agree the first goal of the software should be to make the software that solves real problems or add value. Nevertheless, if first implementation is horrible then it takes hercules effort to make it right and some companies like Friendster never recover.

External design vs internal design

One of the earliest advice I got on software development was to write manual before writing the code. It focuses you to solve business problem of the customer rather than writing with top down architecture, which is in a way similar to behavioral driven design in spirit. I find that most useful software developed bottom up and driven by the users. Kent Beck often says that you can’t hide a bad architecture with a good GUI. Nevertheless, for an average user, the usability matters a lot. I remember back in early 90s, IBM OS/2 was superior operating system than Windows but largely loss the market due to the usability (and marketing) issues. The same could be told why people think Macs are better than PC. Rails is also a good example that became popular because it could whip up a webapp in ten minutes despite the fact that its code has been plagued with maintenance issues from monolithic architecture and tricks like chain of alias methods. Other examples include Wordpress and Drupal both written in PHP and are the leader in the blogging and content management area due to their usability rather than quality of the code. Again as your software crosses some threshold of number of users it may have to be redesigned, e.g. Rails recently announced that it will merge with another framework Merb in 3.0 because Merb has been designed with micro-kernel and pluggability based architecture. This also reminds me of merge between Struts and WebWork that turned out to be failure. Joel Spolsky cautions about software rewrites in his blogs and book and I have also blogged on Software rewrites earlier. In the end you want to extend your application incrementally, which is not an easy thing to do. Ultimately, I think software development is all about people rather than technology, principles or best practice. If you have good people then can take your application to the next step regardless if it was written in PHP, Java or Erlang.

Evolutionary architecture vs up front architecture

This has been brought up in debate between Jim Coplien and Bob Martin, where Jim took the position of more upfront design and architecture and Bob took position of evolutionary architecture. I have a lot of respect for Jim Coplien, I still have a copy of Advanced C++ I bought back in ‘92 and it introduced me to the concepts of abstraction, body/handle, envelop/letter principles which are sort of DIP. In the interview with Bob Martin, Jim Coplien raised a lot of good points that YAGNI and test-driven based bottom design can create architecture meltdown. Though, I believe good software is developed bottom up, but I do believe some architecture and domain analysis beneficial. I am not necessarily suggesting BDUF (big design up front) or BRUF (big requirements up front), but iteration 0 style architecture and domain analysis when solving a complex problem. For example, the domain driven design by Eric Evans or Responsibility driven design by Rebecca Wirfs-Brock require working closely with the domain experts to analyze the business problems and capturing essential knowledge that may not be obvious. Any investment in proper domain analysis simplifies rest of development and make your application more extensible. There are a number of agile methodologies such as feature driven development and DSDM that encourage some upfront architecture and domain analysis, which I find beneficial.

Extensibility and maintenance

Once your product is hit and loved by users, your next task becomes extending it and adding more features. At this stage, all -ilities such as scalability, performance, security, extensibility becomes more important. This is also more common in corporate environment and most of the developers live their entire lives maintaining the code written by others. Nevertheless, each team should decide on what practices and principles are appropriate and follow them. Agile methodologies encourage collective ownership and pair programming that can spread knowledge and skills, though there are some alternatives such as code reviews or mentoring. I also think, having a technical lead who ensures overall quality and keeps the bar up for rest of developers can help with extensibility and maintenance.

Test driven Development

Bob Martin has been adament proponent of test driven development with his three laws of TDD. I blogged about original debate between Cedric Beust and Bob Martin back in June, 2006 and showed how Bob Martin’s position was unpragmatic. This reaction has also been echoed by Joel, Jeff, Cedric and Jim, who agree 100% coverage is unrealistic. Lately, more and more people are joining this group. I watched recently an interview of Luke Francl who expressed similar sentiments. In my earlier blog, I wrote various types of testing required for writing a reliable and enterprise level software. One of the selling point of unit testing has been ease of refactoring with confidence but I have also found too many unit tests stifle refactoring because they require changing both the code and unit tests. I have found that testing only public interfaces or a bit high level testing can produce reliable software and also is less prone to breakage when doing refactoring. Though, no one refuses value of testing, but it needs to be practical.

Dependency management

A number of SOLID principles and extended principles such as reuse principle, dependency principles mandates removing or minimizing dependnecies between packages. Those principles were made in C++ environment that historically had problems with managing dependencies. Again, the problem comes from religious attitude with these principles. For example, I have worked in companies where different teams shared their code with java jar files and it created dependency hell. I used jDepend to reduce such dependency but it’s not needed in all situation. I am currently building a common platform that will be used by dozens of teams and I intentionally removed or minimized static dependencies and instead used services.

Conclusion

Unfortunately, in software industry buzzwords are often used to foster consulting or selling books. There are lots of books on principles, best practices, etc. The entire agile movement has turned into prescriptive methodologies as opposed to being pragmatic and flexible based on the environment. People like uncle bob use these principles in draconian fashion such as always stressing on 100% test coverage or always keeping interfaces separate from implementation (DIP). Such advice may bring more money from consulting work or publication, but is disingenuous for practical use. In the end, software development is about people and not about technology or principles and you need good judgement to pick what practices and solutions suit the problem.

Software Estimation

Monday, February 23rd, 2009

Software estimation is a difficult art that I am still learning despite developing software for more than twenty years. I have worked on a number of projects that started with some broad vision and manager asked me how many man-months will it take. You feel like a guy who is asked how long will it take you to survey a cave without going inside (see Software Estimates and the Parable of the Cave). So based on some initial requirements, you make up some numbers. But, often that number translates into commitment and some target date. This issue has been also brought up by Software Estimation by Steve McConnell, Manage It by Johanna Rothman, Lean Software Development by Mary Poppendieck and a number of other people. So it must be made clear that your estimate is not the target date.

As a project is always constrained by iron triangle of schedule/cost/functionality or sometime referred to as cost/quality/schedule or cost/resourcs/schedule. It is crucial to find what’s driving the project as also suggested by Johanna Rothman in her book Manage It. I have seen a number of cases where dates were arbitrary picked, sometime referred to as “happy date”. Though, at other times, dates may depend on marketing campaign, seasons, tax time, olympics, etc. So, you can negotiate between functionality and schedule based on what’s driving the project. Following are some of techniques that I have found useful with estimation:

  • Get the vision and requirements straight – It’s important about the charter, constraints and requirements for the project as any misdirection here would lead to disaster. Luke Hohmann in his book Beyond Software Architecture recommends starting with good vision and mission statement. Johanna Rothman also recommends creating a project charter before starting the project.
  • Probablistic based estimation – Despite the fact, you are often pressured to produce more precise estimates even though they would be inaccurate, it is better to give estimate with some probablity. Both Johanna Rothman and Steve McConnell cite cone of uncertainty, where your estimate becomes more accurate as project progresses.


  • Based on best/worst/most-likely case – use following formula from Steve McConnell’s book can be used when estimates are more accurate:
     expected case = (best-case + (4 * most-likely) + worse-case) / 6
     

    If estimates are not accurate, then Steve McConnell recommends

     expected case = (best-case + (3 * most-likely) + (2*worse-case)) / 6
     

    Bob Martin also similar formula from his article PERT, CPM, and Agile Project Management:

      Mean = (best-case+worst-case+(4*most-likely))/6
      Variance = ((worst-case-best-case)/6)^2
     
  • Iterative development – No matter if you are working on small or large project, the only way to bring some reality and feedback on initial estimate is to develop iteratively starting with highest valued features.
  • T-shirt based estimation – I find t-shirt based estimation useful when estimating with minimal information available. For example, you may have to estimate projects that you can deliver in Q1, Q2, etc and you can order them in small, medium, large and compare them against their business value.
  • Spiking can also help in areas that are new to the team and spending a little time creating walking skeleton or tracer bullet can give you some idea on the size of the effort for the project.
  • Delphi estimation – where PM and team prepares task list, assumptions and estimate in private and reviews them together.
  • Divide and conquer/Decomposition/WBS – as with any large effort, breaking a project into smaller subsystems, components, services and tasks will help estimate better. In general any errors in estimation of smaller tasks will cancel each other.
  • Estimate fine grained tasks – I can rarely estimate with some accuracy for tasks that are longer than a few days so it’s important to estimate only fine grained tasks. XP has a concept of inch pebble and story points that can help in this case. The idea is that each task is either done or not done.
  • Planning poker a technique from Agile Estimating and Planning by Mike Cohn, where each member of the team pick an estimate for a story based on fibonacci numbers, but don’t show until everyone selects some number. The members then pick some average or may ask member with highest or lowest estimates to explain.
  • Historical data – though I rarely see PM track estimates but tracking them can help future projects and new projects can use LOC, man-months, function-points, # of services, files, interfaces, bugs from prior projects for estimation.
  • Schedule chicken – Kent Beck often talks about schedule chicken where you have some some meeting about who is ontrack and you hope there is someone who is behind so that you don’t have to admit you are behind as well. Integrity is big part of the XP and agile methodologies so it encourages transparency and honesty instead of schedule chicken.
  • Better to overestimate than underestimate – programmers often underestimate and though there is risk of student syndrome or Parkinson’s law but it’s better to overestimate.
  • Don’t question developer’s estimate – even though developers tend to underestimate, some managers still question them, which is not a good idea.
  • In XP or Scrum, you use story points, which can be ideal hours or based on some multiplier. These numbers are generally follow fibonacci sequence such as 1, 2, 3, 5, 8, 13, 21.
  • Function points use number of external input/output/queries, internal logical files/external interface files and it can be used as unit of measurements similar to story points.
  • Estimation quality factory (EQF) as proposed by Tom Demarco in his paper A Defined Process For Project Postmortem Review can be used to check how accurate estimates are.
  • Include vacation, sick, holidays as well as non-development activities such as testing, deployment, configuration, migration, etc in your project plan.
  • Scheduling is all about ordering with highest value features. I find rolling-wave scheduling based on milestones useful when planning iterations.

I often find projects turn into death march projects due to overly optimistic estimates and “queen of denial” manager who holds developers’ estimates as commitment and refuses to accept the reality. I don’t find any good solution to estimation except iterative development starting with highest value, which creates biggest value for the business. Though, often agile methodologies recommend starting with simplest thing possible but I like Rational Unified Process for risk management that suggests doing highest risk tasks first. Also, I find multi-tasking diminishes productivity so it’s better to assign people to only one project if possible. Finally, I have seen many managers accept more work than the team can handle in order to appear more committed in front of upper management. It takes courage to say NO and in the end it can save your credibility and not to mention unnecessary overtime.

IP addresses of Spammers from my Honeypot

Thursday, February 5th, 2009

I have an old guest book application that was originally written in J2EE in ‘98 and then moved to Rails a couple of years ago. Though, I don’t get much guest entries, but it does get a lot of SPAM. It has become sort of honeypot and I keep IP addresses of those spammers and in case anyone is interested you can download that list from http://plexobject.com/spammers_ip_addresses.txt.