Design by Committee
Most large organizations have an architecture committee, platform team,
center of excellence, or some other review body that sets the vision,
direction, and technology choices for software projects. It is generally a
good idea to have a uniform vision and a smaller set of technologies across
projects in order to minimize maintenance and the learning curve.
One of the things these committees are responsible for is the enterprise
architecture. The enterprise architecture includes the hardware platform,
software platform, tools, etc. It also includes the application
architecture, which determines how the system will be broken into subsystems.
In some respects these committees work like industry standards bodies or
consortiums of organizations. Depending on the size and diversity of these
committees, the design process often becomes somewhat bureaucratic. The
situation can be worse if you follow strict IT governance practices or ITIL
processes.
Disagreement among committee members often results in a lowest common
denominator or low-quality solution.
Another frequent observation is that the solution is often overly complex
and over-engineered for the problem domain. Worse, due to the diversity of
opinions, the architecture is not uniform and consistent. One of the reasons
for such inconsistent or complex architecture is the pissing contest. In
general, software architects don’t work well with other architects, and the
design meetings turn into show-offs for coming up with the most clever solution.
On the other hand, an architecture by a single experienced person is often
simple and consistent. The open source community offers many examples
of simple solutions that work better than designs by committee. The open
source community often uses the term “benevolent dictator” for the person
who is in charge of the unified vision and architecture. So,
a software project should probably be designed by a single architect.
Software architecture is still an art that is learned through apprenticeship.
So, it helps if there is an apprentice who can do the grunt work such as
documenting designs and taking care of details. However, I don’t mean an
architect who does not do hands-on work, but rather one similar to
James O. Coplien’s pattern “Architect Also Implements”.
March 21, 2006
Design by Committee
March 20, 2006
How overwork leads to dumb workers
One of the best ideas from agile methodologies is the balance of work and life. In the original XP book, Kent Beck laid down the 40-hour week as one of the core practices of XP. However, in the second edition he switched to the term “Energized Work”, which is a bit vague. I guess if you want to be less controversial, you choose vaguer terms. Anyway, the idea is pretty much the same: knowledge work such as developing software needs a relaxed mind. I have seen everywhere that when you are tired you make stupid mistakes and spend the next day or days making up for them.
I read another interesting article, “Be smarter at work, slack off”, that reinforces the same idea. It quotes Peter Cappelli, a professor of management at Wharton: “You can turn a smart person into an idiot just by overworking him.”
Nevertheless, most real work still requires 50-60 hours a week. I have seen plenty of death-march projects, and the fear of losing jobs to offshoring has made things much worse.
March 14, 2006
Log Locally and Query Globally
I have found the rule “log locally and query globally” quite handy in large distributed applications. However, I still find many projects that try to create a centralized logging server to receive all log messages. In many cases this adds too much overhead to logging. Asynchronous logging can reduce the direct overhead, but it is still problematic: log messages are lost, or confusion ensues, when the log server crashes. A single logging server also often becomes a bottleneck for resources such as network, disk, and CPU. I admit I have designed some systems where I set up a centralized logging service. In some instances I used a wrapper on top of UNIX’s syslog server. One of the drawbacks of syslog was that it dropped messages under heavy load. That said, syslog is useful for system-level logs, especially security violations, which you don’t want to keep only locally, because if the local machine is compromised you will lose all logging information. For application logging, however, I have found that local logging works much better. On the other hand, this requires a centralized service that can search the logs on all machines.
One of the difficulties with local logging is creating a complete picture of the logs. For example, a user request goes through various services running on different servers, and for debugging it is important to trace the complete business transaction. In such cases, I have found a transaction correlation id quite useful. For example, each business transaction id can be stored as part of the log messages, or the user’s session information can be used, and later used to create the complete picture. Also, it is important that all servers use the NTP service and have their clocks synchronized so that the log messages are in the right time order. For querying, you can create a local search agent on each machine that is contacted by the centralized query system. This way you can query multiple log files simultaneously and return results much more efficiently.
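To make the idea concrete, here is a minimal Python sketch of the approach, not the code from any of the systems mentioned above. Everything in it is an assumption made for illustration: the line format (an ISO timestamp, a host name, and a txn= correlation id), the function names format_line, local_search, and query_globally, and the in-memory lists that stand in for log files on each machine. In a real deployment local_search would run inside an agent on each server and be reached over the network.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def format_line(ts, host, txn_id, message):
    """Hypothetical log format: '<ISO timestamp> <host> txn=<id> <message>'."""
    return f"{ts.isoformat()} {host} txn={txn_id} {message}"


def local_search(lines, txn_id):
    """Stand-in for the per-machine search agent: filter one host's log."""
    needle = f"txn={txn_id} "  # trailing space avoids matching txn=t1 in txn=t12
    return [line for line in lines if needle in line]


def parse_ts(line):
    """Extract the timestamp that starts every line."""
    return datetime.fromisoformat(line.split(" ", 1)[0])


def query_globally(logs_by_host, txn_id):
    """Ask every host in parallel, then merge the answers in time order.

    Assumes each host's log is already in time order and that clocks are
    synchronized (for example with NTP), as the text requires.
    """
    with ThreadPoolExecutor() as pool:
        per_host = list(
            pool.map(lambda lines: local_search(lines, txn_id),
                     logs_by_host.values())
        )
    return list(heapq.merge(*per_host, key=parse_ts))


if __name__ == "__main__":
    logs = {
        "web01": [
            format_line(datetime(2006, 3, 14, 10, 0, 0), "web01", "t1", "received order"),
            format_line(datetime(2006, 3, 14, 10, 0, 3), "web01", "t1", "sent response"),
        ],
        "db01": [
            format_line(datetime(2006, 3, 14, 10, 0, 1), "db01", "t1", "charged card"),
        ],
    }
    for line in query_globally(logs, "t1"):
        print(line)
```

The merge step relies on each host returning its matches already sorted, so the central query only interleaves the results instead of sorting everything again.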
March 4, 2006
Responsibility vs Accountability
According to Jim Coplien’s book on organizational patterns, you cannot give responsibility to anyone. Someone can only accept it; however, you can hold someone accountable. According to Kent Beck, accountability entails reporting back on what happened, and responsibility means sticking with the problem while trying to solve it. However, you cannot hold someone accountable if you don’t provide him with enough resources to complete his job. I think this last statement is especially true, and I have dealt with numerous situations where management never provided the tools or resources to finish the job or do the job right. For example, recently I was trying to track down an application bug that caused some financial damage. However, due to the complexity of the system, the log files ran to terabytes. Worse, there aren’t any data-mining tools to query the logs; instead you have to run zgrep on huge compressed files, and it takes days to search. In the meantime management keeps asking when the financial data will be ready. This is another example of an organization where daily firefighting prevents anyone from thinking about long-term solutions to common problems.
March 2, 2006
Glory of Firefighting
I have observed a few dysfunctional organizations where firefighting
is the norm and is rewarded. In such organizations the normal process of
software development is viewed as mundane and people do a poor-quality job.
However, they use this low-quality process to become heroes when crises
emerge. In one organization, they created special roles and teams, like a
SWAT team, to handle crises or firefighting. In these organizations, the
emphasis is on firefighting rather than fire prevention.
February 26, 2006
A few old usenet postings:
I guess not many people use Usenet anymore, but here are links to some of my
past postings:
Software rewrites
I just read a great blog entry by Jim Shore about software rewrites. Like many,
I have been in situations where I wished we could redo an application, because
it would have turned out much better. As some people say, always do things
twice and throw away your first try. However, when you are working on a large,
complicated piece of software, a rewrite can be an enormous effort despite the
temptation. According to Jim, if your platform or language does not change,
don’t throw away your code; instead refactor or redesign to work with the
existing software. I have heard similar advice from Joel Spolsky. I have been
involved with a few projects with complete rewrites. In the early 90s, there
was widespread use of the term “re-engineering”, which related to business
re-engineering and encompassed a lot of software rewrites. Later, I was
involved with an ITS project for the Illinois Department of Transportation,
where we rewrote the old CTIC project, which was written in C/UNIX, as a new
system, GCM Travel, in C++/Java/CORBA. It turned out to be a major effort,
over budget and years late. Unlike a private company, the government had
plenty of tax dollars to keep spending on the project. When I was at United,
I worked on pretty badly written software and had to decide whether to
continue with the software or rewrite it. I opted for incremental refactoring
because, unlike on the CTIC project, I would have had to not only write a new
version but also maintain the old software. It is a terribly risky strategy
to maintain two versions of software. Martin Fowler also describes the
Strangler Pattern for rewriting software. Unfortunately, I am
currently involved with another grand rewrite project (it was started
before I joined). It also has a requirement to maintain two different
versions of the software, and I can see a lot of time will be spent on
integrating the two versions. I have already seen people taking
shortcuts to finish the project, and before you know it, it is
going to follow the same road that the previous project did.
February 19, 2006
Finished Personal and Company’s Web Redesign
Since I moved my personal website from my school to my own domain back in 1998,
I had done little to touch it up. So, I finally spent some time changing the
look. I then realized my company’s website could use a new look too, as it had
not been changed since around the same time. I wanted to keep both sites easy
to maintain without any dynamic content management, so I used the same
technique that I have been using for over ten years, i.e., I wrote some
templates and created Ruby scripts to generate the HTML files. It’s not state
of the art, but given that my server isn’t too powerful, I think it will work.
February 14, 2006
Refactoring
Since Fowler’s Refactoring book and the adoption of the term in the agile community, everyone is on board with refactoring. It’s a great tool to prevent stale code. The ability to adapt to change is a matter of survival for any business, and you continuously need to align your software to make sure it can change at the speed of the business.
Strictly speaking, refactoring is modifying code without changing its external behavior. Often at the end of release cycles, developers apply quick hacks to finish the deliverables. Most agile environments dedicate a few days after each release to cleaning up the code. Many also use a geek week after a couple of iterations to do this refactoring or make structural changes to the application. The point of my rant is twofold: types of refactoring and when to use refactoring.
Software needs to continuously adapt as business requirements change; however, I find people use the term refactoring too loosely. I consider refactoring a minor change in the code that does not break the contract, yet every day I hear people use the term “refactoring” when they mean redesigning, rearchitecting, or completely rewriting the software or a piece of it. I acknowledge it’s very much a buzzword, but too often it’s used improperly. I have seen a number of projects that grow from a modest size with a couple of average developers to a large size. In many cases the code is not properly maintained and incurs high technical debt. The development team may then address the design or architecture issues that are at the heart of the problem and start making major changes to the software.
Such a major redesign or rearchitecture effort cannot be labeled simply refactoring. Instead, it should be referred to as design refactoring, as opposed to code refactoring. Ideally, the development staff should allocate a few days or a week each quarter to address design and architecture issues and adapt the design to the emerging needs of the customers. You can call it a geek week, and spending a week each quarter will be a lot more effective than rewriting the software.
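To illustrate what code refactoring in the strict sense looks like, here is a small Python sketch. The invoice example, the function names, and the 10% member discount are all made up for illustration. The structure of the code changes, but callers see exactly the same results, which is what separates this from a redesign or rewrite.

```python
# Before: the discount rule is buried inside the loop (hypothetical example).
def invoice_total_before(items, is_member):
    total = 0
    for price_cents, qty in items:
        if is_member:
            total = total + (price_cents * qty) * 90 // 100
        else:
            total = total + price_cents * qty
    return total


# After: the per-line rule is extracted into its own function.
# The contract (arguments in, total in cents out) is unchanged.
def line_total(price_cents, qty, is_member):
    full = price_cents * qty
    return full * 90 // 100 if is_member else full


def invoice_total_after(items, is_member):
    return sum(line_total(p, q, is_member) for p, q in items)


if __name__ == "__main__":
    items = [(1999, 2), (500, 1)]
    # Same inputs, same outputs: external behavior is preserved.
    assert invoice_total_before(items, True) == invoice_total_after(items, True)
    assert invoice_total_before(items, False) == invoice_total_after(items, False)
```

Swapping the database, changing what the total means, or splitting the system into new services would alter the contract, and that is the kind of change that belongs under design refactoring rather than code refactoring.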
Most teams favor speedy delivery of the software, and this often creates pressure to move ahead with minimal design, architecture, or refactoring changes. I find that the architecture needs to evolve continuously based on needs, and that small changes to the design prevent code rot and technical debt. In some cases, this may increase the scope of the project; for example, you may need to use a different database or service platform to support performance or scalability. For a sizeable project, it makes sense to spend a few weeks nailing down the immediate and emerging needs of the application so that it is able to support the demands placed on it. One thing that can make a big difference is having an experienced architect who has built similar systems before and understands the business domain of the application. Having an architect helps ensure that the software architecture is consistent and that key design aspects are preserved when building new components or updating existing ones.
February 6, 2006
Heroes

Most corporate cultures encourage heroism, where the dedication or value of employees is measured by how many evenings or weekends they spend at work. This largely affects how employees are treated by management and whether they are promoted. Many such environments also encourage incompetence. For example, if you do a sloppy job at work, leading to a buggy product or software, and then spend nights and weekends fixing it, you are lauded and recognized as a dedicated worker. However, if you finish your job on schedule and without any problems and leave work on time, then your dedication is questioned. Most companies don’t realize that when employees are spending 13-14 hours a day at work, they find ways to do their personal tasks there, such as making private calls, dropping off dry cleaning, etc. Nevertheless, it is a corporate culture that rewards people who make heroic efforts to make up for their incompetence.