Shahzad Bhatti

May 26, 2009

Tracing memory leaks with command line Java tools

Filed under: Java — admin @ 3:42 pm

I was recently tracking down a memory leak in one of our applications. The problem occurred only in our production environment, and the network latency between the production subnet and my local subnet was too high to run a profiler (I tried). So I had to use other tools to look at a snapshot of the memory in use. I found jmap and jhat, which ship with the JDK (1.6 in our case), pretty handy. When memory usage was high, I dumped the heap using jmap, passing the process id of the JVM, e.g.

 jmap -F -dump:format=b,file=/tmp/dump.out <pid>

I then copied the file to my local machine and ran jhat using

 jhat dump.out

jhat is fairly slow, but it eventually starts a web server on port 7000, and you can then point your browser to

 http://mymachine:7000

jhat can also read heap data written by the hprof agent, which you can enable by adding the following VM option (format=b makes hprof write the binary format that jhat reads):

 -agentlib:hprof=heap=dump,format=b

and then open the resulting file with jhat. In addition to the above tools, you can log garbage collection in verbose mode using the following VM options:

 -verbose:gc -verbose:class -Xloggc:gc.out -XX:+PrintGCDetails -XX:+PrintGCTimeStamps

Finally, you can use jconsole to look at the memory graph by enabling JMX in your application server, e.g.

 -Dcom.sun.management.jmxremote.port=9003 -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote
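The heap figures that jconsole graphs come from the JVM's standard memory MXBean, so the same numbers can also be logged from inside the application when attaching a remote console is impractical. The following is only a minimal sketch; the class name HeapUsageProbe and its methods are made up for illustration and are not part of the setup described here.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

/** Hypothetical helper (not from the original setup) that reads the heap figures jconsole plots. */
final class HeapUsageProbe {

    private HeapUsageProbe() {
    }

    /** Returns the heap currently in use, in bytes, as reported by the platform MemoryMXBean. */
    static long usedHeapBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage().getUsed();
    }

    /** Formats a one-line summary suitable for a log file. */
    static String summary() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() can be -1 when the maximum is undefined, so it is printed as-is.
        return "heap used=" + heap.getUsed()
                + " committed=" + heap.getCommitted()
                + " max=" + heap.getMax();
    }

    public static void main(String[] args) {
        System.out.println(summary());
    }
}
```

Logging this summary periodically gives a rough trend line; a used-heap figure that keeps climbing after full collections is a hint to take a jmap dump.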

Despite all these options, I find a profiler much better for finding memory leaks, but these tools helped a bit.


May 6, 2009

Integrating Spring’s event notification with JMS

Filed under: Java — admin @ 3:56 pm

I recently had to deploy a service in a cluster, and I wanted to keep some cached data synchronized between the server instances. Since the service already made some use of Spring, I decided to leverage Spring’s built-in event notification with some glue to convert those Spring events into JMS messages. First, I defined an event that I can use in the application. As a Map of Strings to Strings seemed simple, I used it to store message properties, e.g.

    import java.util.HashMap;
    import java.util.Map;

    import org.apache.commons.lang.builder.ToStringBuilder;
    import org.apache.commons.lang.builder.ToStringStyle;
    import org.springframework.context.ApplicationEvent;

    /** Application event that carries its payload as String name/value properties. */
    public class MessageEvent extends ApplicationEvent {
        private final Map<String, String> map;

        /** Builds the property map from alternating name/value arguments. */
        public MessageEvent(final Object source, final String... properties) {
            super(source);
            map = new HashMap<String, String>();
            // A trailing name without a value is ignored.
            for (int i = 0; i < properties.length - 1; i += 2) {
                map.put(properties[i], properties[i + 1]);
            }
        }

        public MessageEvent(final Object source, final Map<String, String> map) {
            super(source);
            this.map = map;
        }

        public String getProperty(final String key) {
            return map.get(key);
        }

        /** Returns true only if the property is present and equals "true", ignoring case. */
        public boolean isProperty(final String key) {
            return Boolean.parseBoolean(map.get(key));
        }

        public Map<String, String> getProperties() {
            return map;
        }

        @Override
        public String toString() {
            return new ToStringBuilder(this, ToStringStyle.MULTI_LINE_STYLE)
                    .append("source", getSource())
                    .append("map", this.map)
                    .toString();
        }
    }

I then created a class to convert the event’s property map to and from a JMS message, e.g.

    import java.util.Enumeration;
    import java.util.HashMap;
    import java.util.Map;

    import javax.jms.JMSException;
    import javax.jms.MapMessage;
    import javax.jms.Message;
    import javax.jms.Session;

    import org.springframework.jms.support.converter.MessageConversionException;
    import org.springframework.jms.support.converter.MessageConverter;

    /** Converts between a Map of Strings and a JMS MapMessage. */
    public class MapMessageConverter implements MessageConverter {
        public Object fromMessage(final Message message) throws JMSException, MessageConversionException {
            if (!(message instanceof MapMessage)) {
                throw new MessageConversionException("Message isn't a MapMessage");
            }
            MapMessage mapMessage = (MapMessage) message;
            Map<String, String> map = new HashMap<String, String>();
            // getMapNames() returns an untyped Enumeration whose elements are the entry names.
            Enumeration<?> en = mapMessage.getMapNames();
            while (en.hasMoreElements()) {
                String name = (String) en.nextElement();
                map.put(name, mapMessage.getString(name));
            }
            return map;
        }

        @SuppressWarnings("unchecked")
        public Message toMessage(final Object object, final Session session) throws JMSException, MessageConversionException {
            if (!(object instanceof Map)) {
                throw new MessageConversionException("Object isn't a Map");
            }
            Map<String, String> map = (Map<String, String>) object;
            MapMessage message = session.createMapMessage();
            for (Map.Entry<String, String> e : map.entrySet()) {
                message.setString(e.getKey(), e.getValue());
            }
            return message;
        }
    }

Next, I created a class that listens for Spring application events, converts them into JMS messages and publishes them:

    import java.util.Map;

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.Session;

    import org.springframework.context.ApplicationEvent;
    import org.springframework.context.ApplicationListener;
    import org.springframework.jms.core.JmsTemplate;
    import org.springframework.jms.core.MessageCreator;
    import org.springframework.jms.support.converter.MessageConverter;

    /** Forwards locally published MessageEvents to the JMS topic. */
    public class PublisherAdapter implements ApplicationListener {
        private JmsTemplate jmsTemplate;
        private MessageConverter converter;

        public void setMessageConverter(final MessageConverter converter) {
            this.converter = converter;
        }

        public void setJmsTemplate(final JmsTemplate jmsTemplate) {
            this.jmsTemplate = jmsTemplate;
        }

        public void publish(final Map<String, String> map) {
            jmsTemplate.send(new MessageCreator() {
                public Message createMessage(final Session session) throws JMSException {
                    return converter.toMessage(map, session);
                }
            });
        }

        public void onApplicationEvent(final ApplicationEvent event) {
            // Skip events that ListenerAdapter created from incoming JMS messages;
            // re-publishing them would bounce the same message around the topic forever.
            if (event instanceof MessageEvent && !(event.getSource() instanceof ListenerAdapter)) {
                publish(((MessageEvent) event).getProperties());
            }
        }
    }

Then, I created a JMS listener that listens for messages on the topic and converts them into Spring application events:

    import java.util.Map;

    import javax.jms.JMSException;
    import javax.jms.Message;
    import javax.jms.MessageListener;

    import org.apache.log4j.Logger;
    import org.springframework.beans.BeansException;
    import org.springframework.context.ApplicationContext;
    import org.springframework.context.ApplicationContextAware;
    import org.springframework.jms.support.converter.MessageConverter;

    /** Republishes messages received from the JMS topic as local Spring events. */
    public class ListenerAdapter implements MessageListener, ApplicationContextAware {
        private static final Logger logger = Logger.getLogger(ListenerAdapter.class);
        private MessageConverter converter;
        private ApplicationContext applicationContext;

        public void setMessageConverter(final MessageConverter converter) {
            this.converter = converter;
        }

        public void setApplicationContext(final ApplicationContext applicationContext) throws BeansException {
            this.applicationContext = applicationContext;
        }

        @SuppressWarnings("unchecked")
        public void onMessage(final Message message) {
            try {
                Map<String, String> map = (Map<String, String>) converter.fromMessage(message);
                // The event source is this adapter, marking the event as one that arrived over JMS.
                applicationContext.publishEvent(new MessageEvent(this, map));
            } catch (JMSException e) {
                // onMessage() cannot throw checked exceptions, so log the failure and drop the message.
                logger.error("Failed to convert JMS message", e);
            }
        }
    }
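To see how the two adapters interact, it can help to simulate the round trip without Spring or JMS. The sketch below is a deliberately simplified stand-in: Topic, Node and Origin are hypothetical names, the topic is an in-memory list of subscribers, and the Origin flag plays the role of the event source. It shows why only locally raised events should be forwarded to the topic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Standalone simulation (no Spring, no JMS) of two nodes relaying events over a shared topic. */
final class EventRelaySimulation {

    /** Where an event came from: application code on this node, or the topic listener. */
    enum Origin { LOCAL, REMOTE }

    /** In-memory stand-in for the JMS topic: every subscriber receives every message. */
    static final class Topic {
        private final List<Node> subscribers = new ArrayList<Node>();
        int messagesSent = 0;

        void subscribe(Node node) {
            subscribers.add(node);
        }

        void send(Map<String, String> payload) {
            messagesSent++;
            for (Node node : new ArrayList<Node>(subscribers)) {
                node.onMessage(payload);
            }
        }
    }

    /** Stand-in for one server instance with its publisher and listener sides. */
    static final class Node {
        private final Topic topic;
        final List<Map<String, String>> cacheUpdates = new ArrayList<Map<String, String>>();

        Node(Topic topic) {
            this.topic = topic;
            topic.subscribe(this);
        }

        /** Application code raises a local event. */
        void publishLocal(Map<String, String> payload) {
            onEvent(payload, Origin.LOCAL);
        }

        /** Listener side: a topic message becomes an event marked REMOTE. */
        void onMessage(Map<String, String> payload) {
            onEvent(payload, Origin.REMOTE);
        }

        /** Every event reaches the cache listener; only LOCAL events are forwarded to the topic. */
        private void onEvent(Map<String, String> payload, Origin origin) {
            cacheUpdates.add(payload);
            if (origin == Origin.LOCAL) {
                topic.send(payload);
            }
        }
    }

    /** Publishes one event on node A and returns {messagesSent, updatesOnA, updatesOnB}. */
    static int[] run() {
        Topic topic = new Topic();
        Node a = new Node(topic);
        Node b = new Node(topic);
        Map<String, String> payload = new HashMap<String, String>();
        payload.put("syncId", "42");
        a.publishLocal(payload);
        return new int[] { topic.messagesSent, a.cacheUpdates.size(), b.cacheUpdates.size() };
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("sent=" + r[0] + " updatesOnA=" + r[1] + " updatesOnB=" + r[2]);
    }
}
```

In this simulation one local publish produces exactly one topic message. The originating node also sees its own event a second time when the topic echoes it back, so a cache listener built this way should tolerate duplicate notifications.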

Next, here is the Spring configuration to bootstrap these listeners:

    <?xml version="1.0" encoding="UTF-8"?>
    <beans xmlns="http://www.springframework.org/schema/beans"
           xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
           xsi:schemaLocation="http://www.springframework.org/schema/beans
                               http://www.springframework.org/schema/beans/spring-beans.xsd">

        <!-- jmsConnectionFactory and jmsTransactionManager are assumed to be defined elsewhere. -->
        <bean id="mapMessageConverter" class="com.amazon.jasper.messaging.spring.MapMessageConverter"/>

        <bean id="springTopic" class="org.apache.activemq.command.ActiveMQTopic">
            <constructor-arg index="0" value="springTopic"/>
        </bean>

        <bean id="springJmsTemplate" class="org.springframework.jms.core.JmsTemplate" scope="prototype">
            <property name="connectionFactory" ref="jmsConnectionFactory"/>
            <property name="deliveryPersistent" value="true"/>
            <property name="messageConverter" ref="mapMessageConverter"/>
            <property name="defaultDestination" ref="springTopic"/>
        </bean>

        <bean id="publisherAdapter" class="com.amazon.jasper.messaging.spring.PublisherAdapter" scope="prototype">
            <property name="jmsTemplate" ref="springJmsTemplate"/>
            <property name="messageConverter" ref="mapMessageConverter"/>
        </bean>

        <bean id="springTopicListener" class="com.amazon.jasper.messaging.spring.ListenerAdapter" scope="prototype">
            <property name="messageConverter" ref="mapMessageConverter"/>
        </bean>

        <bean class="org.springframework.jms.listener.DefaultMessageListenerContainer" init-method="start" destroy-method="stop" scope="prototype">
            <property name="connectionFactory" ref="jmsConnectionFactory"/>
            <property name="destination" ref="springTopic"/>
            <property name="messageListener" ref="springTopicListener"/>
            <property name="transactionManager" ref="jmsTransactionManager"/>
            <!-- One consumer per node: on a topic, each concurrent consumer would receive its own copy of every message. -->
            <property name="concurrentConsumers" value="1"/>
        </bean>
    </beans>

Finally, here is how you would actually use this plumbing in your code:

    import org.springframework.context.ApplicationContext;
    import org.springframework.context.ApplicationContextAware;
    import org.springframework.context.ApplicationEvent;
    import org.springframework.context.ApplicationListener;

    public class Myclass implements ApplicationListener, ApplicationContextAware {
        // Property names; the values shown here are placeholders for illustration.
        private static final String SYNC_ID = "syncId";
        private static final String SYNC_XPDL = "syncXpdl";

        private ApplicationContext ctx;

        //  ...

        // Illustrative method name: publish an event wherever the cached data changes.
        public void notifyChange(final String syncId) {
            ctx.publishEvent(new MessageEvent(this, SYNC_ID, syncId, SYNC_XPDL, "true"));
        }

        public void setApplicationContext(ApplicationContext applicationContext) {
            this.ctx = applicationContext;
        }

        public void onApplicationEvent(ApplicationEvent event) {
            if (event instanceof MessageEvent) {
                MessageEvent msgEvt = (MessageEvent) event;
                // do cache coherence
            }
        }
    }

All you need to do is add your class to your Spring configuration file, and it will automatically be registered as a listener for Spring events. All this is fairly simple, but I hope it helps you with similar needs in your code.


April 22, 2009

Waterfall model of Hardware Acquisition

Filed under: Computing — admin @ 5:22 pm

I have been ordering a bunch of machines for a new project we are launching in a month, and despite the fact that I work for the biggest provider of cloud computing, we still use an old-fashioned acquisition process in which purchasing machines can take months. So I have been struggling to find the right size for the different services we will be deploying. I find this acquisition model similar to waterfall projects, where you try to come up with all the requirements at the beginning and in a lot of cases end up either with more requirements than you need (because you won’t be able to change them later) or with requirements that turn out to be unsuitable. On the other hand, I find hardware based on “cloud computing” or “virtualization” similar to agile methodologies, where you start with just enough resources and then add more servers as you grow. Also, as the retail business is highly seasonal, you can grow and shrink as needed. This is why I like services like EC2, S3 or SimpleDB, but due to security and other corporate policies we can’t use them for internal applications. I feel like the cobbler’s child who has no shoes, even though we provide these services to thousands of other companies.


March 25, 2009

When in Rome, code like how Romans code

Filed under: Languages — admin @ 12:11 pm

I have been programming for over twenty years and have learned a number of programming languages along the way. One recurring behavior I have seen in a lot of programmers is that they carry programming habits (good or bad) from their old language(s) into the new one. This could be how you design the application, style of coding, naming conventions, etc. I remember when I switched from C to C++, I was used to procedural thinking and had to learn how to break the problem into classes and how to assign responsibility to different classes. Similarly, when I started using Java back in 1995, I had to learn Java’s peculiar style. For example, I used to declare public methods in C++ at the top and all private methods, including attributes, at the bottom. I also tended to prefix member attributes with underscores. I slowly learned Java’s style of declaring class attributes at the top, using all uppercase for constants, camel case, etc.

In the early 2000s, I learned Ruby from the Pickaxe book, which taught me Ruby in an object-oriented style, and I missed all its functional and meta-programming features. I slowly learned a more functional style of programming and meta-programming. I even had to switch back to underscores as opposed to camel case. I read the Ruby code of other programmers to learn how they code and what conventions they use. I did similar exercises when I learned Python, Erlang, Scala, Objective-C, etc., i.e., I tried to learn not only the language itself and its core and third-party libraries, but also how people write code, package applications or create libraries. It helps if there are examples of good usage or style for that language. For example, I have seen plenty of abuses of JavaScript by people who misunderstood its prototype-based and functional roots and used it as either a procedural or a class-oriented language.

At work, we do code reviews before any code check-in, and I see conventions and styles of other languages mixed in all the time. I think learning different styles of programming makes you a better programmer. For example, I learned from functional programming how immutability can make programs safer, and I tend to use it more in other object-oriented or multi-paradigm languages that don’t enforce immutability. In other cases, though, it’s hard to force a feature from one language into another when that feature isn’t inherently available. For example, I like Ruby’s mixins and Scala’s traits, but I can’t really use them in languages that support only single inheritance, such as Java. So instead of jumping through hoops to use features from another language, I try to use the style suitable for that specific language, such as using multiple interfaces. I have been learning iPhone development and reading the iPhone SDK book by Jonathan Zdziarski. One peculiar thing about his coding examples is that he does not use Interface Builder and creates all UI components in code. Though such a style is acceptable in many situations, I would prefer to use Interface Builder and follow the path of least resistance.
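As a rough illustration of both points in Java, here is a small sketch of a value class that is immutable by convention (final class, final fields, no setters) and that picks up several capabilities through multiple interfaces rather than mixins. The names Describable, Scalable and Point are invented for this example.

```java
/** Small capability interfaces stand in for the behavior a mixin or trait would contribute. */
interface Describable {
    String describe();
}

interface Scalable<T> {
    T scale(int factor);
}

/** Immutable value: final class, final fields, no setters; "changes" return a new instance. */
final class Point implements Describable, Scalable<Point>, Comparable<Point> {
    private final int x;
    private final int y;

    Point(int x, int y) {
        this.x = x;
        this.y = y;
    }

    int x() {
        return x;
    }

    int y() {
        return y;
    }

    /** Returns a new Point; the receiver is never modified. */
    public Point scale(int factor) {
        return new Point(x * factor, y * factor);
    }

    public String describe() {
        return "(" + x + ", " + y + ")";
    }

    /** Orders points by x first, then by y. */
    public int compareTo(Point other) {
        if (x != other.x) {
            return x < other.x ? -1 : 1;
        }
        if (y != other.y) {
            return y < other.y ? -1 : 1;
        }
        return 0;
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point q = p.scale(3);
        System.out.println(p.describe() + " -> " + q.describe());
    }
}
```

The interfaces give the class several types but, unlike a Ruby mixin or Scala trait, no shared implementation, so each method is written out in the class itself. That is the trade-off of staying within the language's own style.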

In practice, you will often find multiple styles or approaches for doing something within a single language. For example, Ruby encourages multiple ways of doing things, which can be quite confusing. I like Python’s philosophy of having only one way to do things, but there are plenty of divergent opinions in that language as well. Another somewhat related topic is how to pick a language, as languages vary in their core areas of strength. For example, Java was originally marketed as a language for the Web platform, but these days I tend to use Ruby or Python for web development and Java for system development. Also, I tend to use Erlang for network-oriented or concurrent applications and C/C++ where performance is critical. Last year, there was a big hoopla over Erlang’s awful performance in a search-engine task, which sparked the WideFinder benchmarks, but it missed the point that Erlang’s core strength is distributed/concurrent applications and not text searching. So, in a nutshell, I think it helps to pick a language based on the problem and take advantage of its strengths. Finally, stick to the general coding style and conventions of the language, especially when working with a large codebase or a large number of programmers.


February 24, 2009

Does software quality matter?

Filed under: Methodologies — admin @ 9:24 pm

In the last couple of weeks, I followed an interesting debate between Bob Martin and Joel Spolsky/Jeff Atwood (listen to Hanselminutes) on the issue of SOLID principles. The SOLID acronym stands for five principles, i.e.,

Single Responsibility Principle – this principle is based on the principle of cohesion from Tom DeMarco’s Structured Analysis and System Specification book and mandates that a class should have one, and only one, reason to change.
Open Closed Principle – this principle is based on Bertrand Meyer’s book Object-Oriented Software Construction and says that a class should be open for extension without being modified.
Liskov Substitution Principle – this principle was introduced by Barbara Liskov and says that derived classes must be substitutable for their base classes.
Interface Segregation Principle – states that you should make fine-grained interfaces that are client specific.
Dependency Inversion Principle – states that you should depend on abstractions, not on concrete implementations.
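To make one of these concrete, here is a minimal sketch of the Dependency Inversion Principle in Java. All the names (Notifier, InMemoryNotifier, OrderService) are hypothetical and chosen only for illustration: the high-level class depends on an interface, and a concrete implementation is handed in from outside.

```java
import java.util.ArrayList;
import java.util.List;

/** Abstraction that the high-level code depends on (hypothetical name). */
interface Notifier {
    void send(String message);
}

/** One concrete implementation; it can be swapped without touching OrderService. */
final class InMemoryNotifier implements Notifier {
    final List<String> sent = new ArrayList<String>();

    public void send(String message) {
        sent.add(message);
    }
}

/** High-level policy that knows only the Notifier interface, never a concrete class. */
final class OrderService {
    private final Notifier notifier;

    OrderService(Notifier notifier) {
        this.notifier = notifier;
    }

    void placeOrder(String item) {
        notifier.send("order placed: " + item);
    }
}

final class DipDemo {
    public static void main(String[] args) {
        InMemoryNotifier notifier = new InMemoryNotifier();
        new OrderService(notifier).placeOrder("book");
        System.out.println(notifier.sent);
    }
}
```

Whether the extra interface pays for itself depends on how likely the implementation is to vary, which is why it works better as a guideline than as a rule.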

I learned these principles many years ago and attended Bob Martin’s talk at the Best Practices 2004 conference, where he spoke about them. These principles sound good, though in reality they should be broader guidelines rather than principles. But the heat of the debate was about the way these principles have been evangelized by Bob Martin, who insisted on using them all the time. Over the years, I have seen similar debates between Cedric Beust and Bob Martin over the use of test-driven development. There is also a debate on the topic of TDD and architecture between Bob Martin and Jim Coplien. Overall, I find that the issues from these debates boil down to the following:

Usable software vs High quality but unused software

One of the controversial points that Jeff Atwood raised was that quality does not matter if no one is using your application. I find there is a lot of truth to that. In fact, this is the route that most startups take when developing a new idea and trying something new. For example, everyone was blaming Twitter for choosing Rails when it had scalability and reliability issues. However, Twitter would not have existed if it had been written with the most scalable and reliable language or framework but had taken two more years, or if the application itself weren’t as useful. I tend to follow the age-old advice: first make it work, then make it right, and then make it fast. There is a reason why rapid prototyping frameworks such as Rails, Django, Zend Framework, etc. are popular: they allow you to focus more on business functionality and reduce time to market. So I agree that the first goal should be to build software that solves real problems or adds value. Nevertheless, if the first implementation is horrible, it takes a Herculean effort to make it right, and some companies like Friendster never recover.

Customer Experience vs Internal design

One of the earliest pieces of advice I got on software development was to write the manual before writing the code. It focuses you on solving the customer’s business problem rather than on writing with a top-down architecture, which is similar in spirit to behavior-driven design. I find that most useful software is developed bottom up and driven by its users. Kent Beck often says that you can’t hide a bad architecture with a good GUI. Nevertheless, for an average user, usability matters a lot. I remember back in the early 90s, IBM OS/2 was a superior operating system to Windows but largely lost the market due to usability (and marketing) issues. The same could be said about why people think Macs are better than PCs. Rails is also a good example: it became popular because you could whip up a webapp in ten minutes, despite the fact that its code has been plagued with maintenance issues from its monolithic architecture and tricks like chains of alias methods. Other examples include WordPress and Drupal, both written in PHP, which lead the blogging and content management areas due to their usability rather than the quality of their code. Again, once your software crosses some threshold in number of users, it may have to be redesigned, e.g. Rails recently announced that it will merge with another framework, Merb, in 3.0 because Merb has been designed with a micro-kernel and pluggable architecture. This also reminds me of the merger between Struts and WebWork, which turned out to be a failure. Joel Spolsky cautions against software rewrites in his blog and book, and I have also blogged on software rewrites earlier. In the end, you want to extend your application incrementally using the Strangler Fig model, which is not an easy thing to do. Ultimately, people matter more than technology, processes or best practices in software development, as good people can ship good software regardless of the language or tools they use.

Evolutionary architecture vs Up front architecture

This was brought up in the debate between Jim Coplien and Bob Martin, where Jim took the position of more upfront design and architecture and Bob took the position of evolutionary architecture. I have a lot of respect for Jim Coplien; I still have a copy of Advanced C++ that I bought back in ’92, and it introduced me to the concepts of abstraction and the body/handle and envelope/letter idioms, which are sort of DIP. In the interview with Bob Martin, Jim Coplien made a good point that YAGNI and test-driven, bottom-up design can create architecture meltdown. Though I believe good software is developed bottom up, I do believe some architecture and domain analysis is beneficial. I am not necessarily suggesting BDUF (big design up front) or BRUF (big requirements up front), but iteration-0 style architecture and domain analysis when solving a complex problem. For example, domain-driven design by Eric Evans and responsibility-driven design by Rebecca Wirfs-Brock require working closely with domain experts to analyze the business problems and capture essential knowledge that may not be obvious. Any investment in proper domain analysis simplifies the rest of development and makes your application more extensible. There are a number of agile methodologies, such as feature-driven development and DSDM, that encourage some upfront architecture and domain analysis, which I find beneficial.

Extensibility and maintenance

Once your product is a hit and loved by users, your next task becomes extending it and adding more features. At this stage, all the -ilities such as scalability, performance, security and extensibility become more important. Each team can decide which practices and principles are appropriate and follow them. Agile methodologies encourage collective ownership and pair programming, which can spread knowledge and skills, though there are alternatives such as code reviews or mentoring. I think having a technical lead who ensures overall quality and keeps the bar high for the rest of the developers can help with extensibility and maintenance.

Test driven Development

Bob Martin has been an adamant proponent of test-driven development with his three laws of TDD. I blogged about the original debate between Cedric Beust and Bob Martin back in June 2006 and showed how Bob Martin’s position was not pragmatic. This reaction has also been echoed by Joel, Jeff, Cedric and Jim, who agree that 100% coverage is unrealistic. Lately, more and more people are joining this group. I recently watched an interview with Luke Francl, who expressed similar sentiments. In an earlier blog post, I wrote about the various types of testing required for writing reliable, enterprise-level software. One of the selling points of unit testing has been ease of refactoring with confidence, but I have also found that too many unit tests stifle refactoring because they require changing both the code and the unit tests. I have found that testing only public interfaces, or testing at a slightly higher level without depending on the internal implementation, can produce reliable software and is also less prone to breakage during refactoring. No one denies the value of testing, but it needs to be practical.

Dependency management

A number of SOLID principles and extended principles, such as the reuse principle and the dependency principles, mandate removing or minimizing dependencies between packages. Those principles were formulated in a C++ environment that historically had problems with managing dependencies. Again, the problem comes from a religious attitude toward these principles. For example, I have worked in companies where different teams shared their code via Java jar files, and it created dependency hell. I have used JDepend in the past to reduce such dependencies, but it’s not needed in all situations. I am currently building a common platform that will be used by dozens of teams, and I intentionally removed or minimized static dependencies and used services instead.

Conclusion

Unfortunately, in the software industry new buzzwords appear every few years and are often used to promote consulting or sell books. There are lots of books on principles, best practices, etc., and the entire agile movement has turned into prescriptive methodologies as opposed to being pragmatic and flexible based on the environment. People like Uncle Bob apply these principles in draconian fashion, such as always insisting on 100% test coverage or always keeping interfaces separate from implementation (DIP). Such advice may bring in more money from consulting work or publications, but it is disingenuous for practical use. In the end, people matter more in software development than arbitrary technology or principles, and you need good judgment to pick the practices and solutions that suit the problem at hand.


February 23, 2009

Software Estimation

Filed under: Project Management,Technology — admin @ 6:04 pm

Software estimation is a difficult art that I am still learning despite developing software for more than twenty years. I have worked on a number of projects that started with some broad vision, and a manager asked me how many man-months it would take. You feel like a guy who is asked how long it will take to survey a cave without going inside (see Software Estimates and the Parable of the Cave). So, based on some initial requirements, you make up some numbers. But often that number turns into a commitment and a target date. This issue has also been raised in Software Estimation by Steve McConnell, Manage It by Johanna Rothman, Lean Software Development by Mary Poppendieck, and by a number of other people. So it must be made clear that your estimate is not the target date.

A project is always constrained by the iron triangle of schedule/cost/functionality, sometimes referred to as cost/quality/schedule or cost/resources/schedule. It is crucial to find out what’s driving the project, as Johanna Rothman also suggests in her book Manage It. I have seen a number of cases where dates were picked arbitrarily, sometimes referred to as a “happy date”. At other times, dates may depend on a marketing campaign, seasons, tax time, the Olympics, etc. So you can negotiate between functionality and schedule based on what’s driving the project. Following are some of the techniques that I have found useful for estimation:

Get the vision and requirements straight – It’s important to be clear about the charter, constraints and requirements for the project, as any misdirection here would lead to disaster. Luke Hohmann in his book Beyond Software Architecture recommends starting with a good vision and mission statement. Johanna Rothman also recommends creating a project charter before starting the project.
Probabilistic estimation – Even though you are often pressured to produce precise estimates that would nonetheless be inaccurate, it is better to give an estimate with some probability. Both Johanna Rothman and Steve McConnell cite the cone of uncertainty, where your estimate becomes more accurate as the project progresses.
Based on best/worst/most-likely case – the following formula from Steve McConnell’s book can be used when estimates are more accurate:

expected_case = (best_case + (4 * most_likely) + worst_case) / 6

If estimates are not accurate, then Steve McConnell recommends

expected_case = (best_case + (3 * most_likely) + (2 * worst_case)) / 6

Bob Martin also gives a similar formula in his article PERT, CPM, and Agile Project Management:

Mean     = (best_case + worst_case + (4 * most_likely)) / 6

Variance = ((worst_case - best_case) / 6) ^ 2

Iterative development – Whether you are working on a small or a large project, the only way to get realistic feedback on the initial estimate is to develop iteratively, starting with the highest-value features.
T-shirt based estimation – I find T-shirt sizing useful when estimating with minimal information. For example, you may have to estimate projects to deliver in Q1, Q2, etc.; you can sort them into small, medium and large and compare those sizes against their business value.
Spiking – a spike can also help in areas that are new to the team; spending a little time creating a walking skeleton or tracer bullet can give you some idea of the size of the effort.
Delphi estimation – the PM and the team members prepare task lists, assumptions and estimates in private and then review them together.
Divide and conquer/Decomposition/WBS – as with any large effort, breaking a project into smaller subsystems, components, services and tasks will help you estimate better. In general, errors in the estimates of smaller tasks tend to cancel each other out.
Estimate fine-grained tasks – I can rarely estimate with any accuracy tasks that take longer than a few days, so it’s important to estimate only fine-grained tasks. XP has a concept of inch pebble and story points that can help in this case. The idea is that each task is either done or not done.
Planning poker – a technique from Agile Estimating and Planning by Mike Cohn, where each member of the team picks an estimate for a story from Fibonacci numbers but doesn’t reveal it until everyone has chosen. The team then settles on an average or asks the members with the highest and lowest estimates to explain.
Historical data – I rarely see PMs track estimates, but tracking them helps future projects: new projects can use LOC, man-months, function points, and the number of services, files, interfaces and bugs from prior projects for estimation.
Schedule chicken – Kent Beck often talks about schedule chicken, where in a status meeting about who is on track you hope someone else is behind so that you don’t have to admit you are behind as well. Integrity is a big part of XP and agile methodologies, which encourage transparency and honesty instead of schedule chicken.
Better to overestimate than underestimate – programmers often underestimate, and though there is a risk of student syndrome or Parkinson’s law, it’s better to overestimate.
Don’t question developers’ estimates – even though developers tend to underestimate, some managers still push back on their estimates, which is not a good idea.
Story points – in XP or Scrum, you use story points, which can be ideal hours or based on some multiplier. These numbers generally follow the Fibonacci sequence, such as 1, 2, 3, 5, 8, 13, 21.
Function points – these use the number of external inputs, outputs and queries, internal logical files and external interface files, and can serve as a unit of measurement similar to story points.
Estimation quality factor (EQF) – as proposed by Tom DeMarco in his paper A Defined Process For Project Postmortem Review, it can be used to check how accurate estimates are.
Include vacation, sick days and holidays as well as non-development activities such as testing, deployment, configuration, migration, etc. in your project plan.
Scheduling is all about ordering work so that the highest-value features come first. I find rolling-wave scheduling based on milestones useful when planning iterations.
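
The best/worst/most-likely formulas listed above are easy to get wrong by hand, so here is a minimal Java sketch of them. The class and method names are my own invention for illustration; only the arithmetic comes from the formulas above.

```java
/** Hypothetical helper illustrating the three-point estimation formulas. */
final class ThreePointEstimate {
    private ThreePointEstimate() {}

    /** For fairly accurate estimates: (best + 4 * most_likely + worst) / 6. */
    static double expected(double best, double mostLikely, double worst) {
        return (best + 4 * mostLikely + worst) / 6.0;
    }

    /** For less accurate estimates: (best + 3 * most_likely + 2 * worst) / 6. */
    static double expectedPessimistic(double best, double mostLikely, double worst) {
        return (best + 3 * mostLikely + 2 * worst) / 6.0;
    }

    /** PERT variance: ((worst - best) / 6) squared. */
    static double variance(double best, double worst) {
        double standardDeviation = (worst - best) / 6.0;
        return standardDeviation * standardDeviation;
    }
}
```

For a task estimated at 2 days best case, 4 days most likely and 12 days worst case, expected(2, 4, 12) gives 5 days, while the pessimistic variant gives about 6.3 days.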

Summary

I often find that projects turn into death marches because of overly optimistic estimates and a “queen of denial” manager who treats developers’ estimates as commitments and refuses to accept reality. One way to overcome bad estimation is to adopt iterative development that delivers small features ordered by value, so that the features with the biggest business value come first. Another way is to follow the advice of the Rational Unified Process, which uses risk management to tackle the highest-risk tasks first. Some managers are keen to accept more work than the team can handle in order to aim high, and it takes courage to say NO. In the end, under-promise and over-deliver: it can save your credibility, not to mention unnecessary overtime and stress on the team.


February 5, 2009

IP addresses of Spammers from my Honeypot

Filed under: SPAM — admin @ 3:59 pm

I have an old guest book application that was originally written in J2EE in ’98 and then moved to Rails a couple of years ago. I don’t get many guest entries, but it does get a lot of SPAM. It has become a sort of honeypot, and I keep the IP addresses of those spammers; in case anyone is interested, you can download the list from http://plexobject.com/spammers_ip_addresses.txt.


January 25, 2009

Review of Clean Code

Filed under: Design — admin @ 8:54 pm

I just finished Clean Code: A Handbook of Agile Software Craftsmanship, a compilation of patterns for writing clean, maintainable code written by Uncle Bob and other folks from his consulting company Object Mentor, including Tim Ottinger and Michael Feathers (author of Working Effectively with Legacy Code). This book is similar to Implementation Patterns by Kent Beck, which I recently read and which covers similar coding practices. This book is a lot thicker, with seventeen chapters, though plenty of pages are filled with tedious listings of Java code.

The first chapter shares thoughts on good code from a number of innovators and authors such as Kent Beck, Bjarne Stroustrup, Grady Booch, Dave Thomas, etc. They mention various attributes of good code, such as being easy to read, efficient, DRY, focused, literate and minimal and having complete error handling, and warn that bad code leads to a mess or a broken-windows mentality.

Chapter 2 offers the golden advice of using intention-revealing names to improve readability and of following the principle of least surprise. This chapter denounces the use of Hungarian notation and member prefixes such as _ or m_.

Chapter 3 describes functions and methods and advises small, cohesive functions. It suggests keeping one level of abstraction per function so that code reads from top to bottom. It also recommends polymorphic methods over switch statements, if-else chains or functions that take flag/boolean arguments. The chapter discourages functions with side effects and those that create temporal coupling. It describes the age-old advice of command-query separation, though it skips its roots in Bertrand Meyer’s design by contract. Finally, it urges the use of exceptions rather than error codes.

Chapter 4 describes writing good comments that focus on intent and warns against redundant and misleading comments.

Chapter 5 illustrates good formatting such as indentation and horizontal and vertical spacing.

Chapter 6 describes the use of objects and data structures. It covers polymorphism, the Law of Demeter, encapsulation, DTO/value objects, etc.

Chapter 7 discusses error handling. Again, it encourages exceptions rather than return codes or error codes. It discourages checked exceptions because they violate the open/closed principle. It also discourages returning or passing null and recommends throwing an exception or returning an empty collection instead.
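
As a small illustration of the advice on null, here is a sketch of a lookup method that returns an empty collection when nothing is found. The OrderRepository class and its methods are hypothetical names of my own, not code from the book.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical repository used only to illustrate "empty collection instead of null". */
final class OrderRepository {
    private final Map<String, List<String>> ordersByCustomer =
            new HashMap<String, List<String>>();

    void addOrder(String customer, String order) {
        List<String> orders = ordersByCustomer.get(customer);
        if (orders == null) {
            orders = new ArrayList<String>();
            ordersByCustomer.put(customer, orders);
        }
        orders.add(order);
    }

    /** Never returns null: unknown customers get an empty list. */
    List<String> findOrders(String customer) {
        List<String> orders = ordersByCustomer.get(customer);
        if (orders == null) {
            return Collections.emptyList();
        }
        // Read-only view so callers cannot modify internal state
        return Collections.unmodifiableList(orders);
    }
}
```

Callers can then loop over findOrders(...) directly without a null check.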

Chapter 8 explains how to define boundaries between components.

Chapter 9 gives advice on unit testing such as TDD, keeping tests clean, one assert per test, a single concept per test, etc. Such advice should be taken with caution, as one assert per test may not capture a unit of testing properly.

Chapter 10 is similar to chapter 6 and contains recommendations on writing classes, such as encapsulation, keeping classes small, the single responsibility principle, cohesion, the dependency inversion principle, etc.

Chapter 11 covers building systems and managing complexity. It suggests dependency injection and the use of factories. It also advocates AOP for managing cross-cutting concerns.

Chapter 12 talks about emergent design, a concept I first heard from Andy Hunt. This chapter relays advice from Kent Beck: run all the tests, refactor, remove duplication, express the intent of the programmer and minimize the number of classes and methods. It advocates template methods to remove duplication and the command/visitor patterns to express design to other developers.

Chapter 13 discusses concurrency and suggests applying the single responsibility principle to keep concurrent code separate from other code, limiting the scope of data, keeping threads independent, and using immutable objects or copies of data. It recommends keeping critical sections small. The chapter also offers advice on testing threaded code and suggests making threaded code pluggable.

Chapter 14 shows an exercise in incrementally improving code.

Chapter 15 describes the JUnit framework and walks the reader through improving tests.

Chapter 16 walks the reader through another refactoring exercise.

Chapter 17 covers a number of smells, heuristics and anti-patterns. It warns against poorly written comments and against builds or tests that require more than one step. The chapter discourages passing too many arguments to functions and using output or flag arguments. It encourages testing boundary conditions and respecting safeties rather than overriding them. It also warns against writing code at the wrong level of abstraction, i.e., exposing low-level logic through an interface. The chapter also forbids base classes that know about their derivatives and interfaces that expose too much information. Other smells include artificial coupling (sharing constants), feature envy (coupling formatting logic) and flag arguments. This chapter also relays Kent Beck’s advice about using explanatory (local/temporary) variables. The chapter recommends using constants, preferring structure over convention, encapsulating conditionals, avoiding temporal coupling, keeping functions at the same level of abstraction and keeping configurable data at high levels. The chapter also contains Java-specific advice such as using wildcards to avoid long import lists, not using an interface to inherit constants, etc. Finally, the chapter cautions against insufficient tests and recommends using a coverage tool, testing boundary conditions, etc.

In conclusion, this book contains advice from a wide range of authors on writing good, maintainable code. As a coder with over 15 years of experience, I can attest that writing good code requires a lot of micro-decisions and attention to detail, and you generally have to keep improving and refactoring your code to maintain a good design. Sometimes it’s hard to maintain a good design when delivering features under tight deadlines, so you may have to take shortcuts. Kent often talks about courage in XP, and programmers need courage to fight for writing maintainable code and to take time to refactor existing code. Finally, I would caution against applying these practices arbitrarily, as another favorite rule of mine is that every practice or pattern depends on the situation. Unfortunately, there are a lot of people in the software industry, including Uncle Bob, who apply these rules in draconian fashion, such as always insisting on 100% test coverage or on interfaces separate from implementation (DIP). Such advice may bring more money from consulting work or publications, but it is disingenuous for practical use in real projects.


January 18, 2009

Tips from Implementation Patterns

Filed under: Computing — admin @ 10:05 pm

I recently read Kent Beck’s Implementation Patterns book. The book contains a number of low-level programming techniques for improving the design of a program. Kent Beck is a grand master of programming and a great communicator. If there is one thing you can learn from this book, it is to communicate design effectively through code. The book is fairly concise, with ten chapters and 130 pages. Following are some of my favorite tips from the book:

Values, Principles and Patterns

Building on his pioneering work in design patterns, Kent uses patterns to identify common programming techniques and, as in his extreme programming approach to agile development, divides those techniques into values, principles and patterns. The values focus on high-level goals such as communication, simplicity and flexibility. The principles cover local consequences (minimizing side effects), DRY, keeping data and logic together, symmetry, declarative expression (annotations) and rate of change (Reuse/Release Equivalency Principle).

Class

Chapter 5 describes different ways of organizing code using classes, interfaces, versioned interfaces, abstract classes, value objects, etc. Kent shows the importance of the dependency inversion principle of coding to interfaces. He describes the trade-offs between interfaces and abstract classes, such as a change in the implementation versus a change in the interface itself. One of the hardest things in the real world is evolving interfaces with new behavior, and Kent describes versioned interfaces for it. Kent also encourages the use of value or immutable objects to make the program side-effect free, as in functional languages. He bemoans procedural interfaces because of the temporal dependency they impose. Inheritance is also difficult to get right, so Kent gives a lot of advice on making sure subclasses follow the Liskov Substitution Principle. Kent discourages conditional logic and encourages delegation based on polymorphism. He also briefly shows the pluggable selector for implementing plugin-like behavior. Finally, he discourages library classes with static methods and encourages instance methods.

State

Chapter 6 describes patterns for state such as access, variables, parameters, initialization, etc. Unlike functional languages that don’t allow mutable state, imperative languages have to manage state that changes over time. Kent prefers indirect access to state over direct access, especially when there is a dependency between multiple pieces of data. He also prefers keeping the scope of variables local. This chapter also gives good advice on naming variables and parameters. Finally, it describes techniques for eager and lazy initialization.

Behavior

Chapter 7 describes patterns for control flow, methods and exceptions. In object-oriented languages, messages are the fundamental mechanism for controlling flow and communicating with objects. Kent also describes a technique for double dispatch (similar to the visitor pattern) that provides polymorphic behavior at the cost of additional coding and maintenance overhead. Other topics include guard clauses, method naming and exception handling.
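
To make the double dispatch idea concrete, here is a minimal sketch. The Shape and Device names are my own illustration, not taken from the book: the first call picks the method by the shape’s type, and the second call picks it by the device’s type.

```java
/** Hypothetical types illustrating double dispatch; names are not from the book. */
interface Shape {
    String drawOn(Device device);   // first dispatch: on the shape
}

interface Device {
    String drawCircle(Circle circle);   // second dispatch: on the device
    String drawSquare(Square square);
}

final class Circle implements Shape {
    public String drawOn(Device device) { return device.drawCircle(this); }
}

final class Square implements Shape {
    public String drawOn(Device device) { return device.drawSquare(this); }
}

final class Printer implements Device {
    public String drawCircle(Circle circle) { return "printed circle"; }
    public String drawSquare(Square square) { return "printed square"; }
}

final class Screen implements Device {
    public String drawCircle(Circle circle) { return "circle on screen"; }
    public String drawSquare(Square square) { return "square on screen"; }
}
```

The maintenance cost mentioned above shows up as soon as a new shape is added: every Device implementation needs a new method.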

Methods

Chapter 8 describes how to divide logic into methods. It describes the composed method, which is built from calls to other methods. It encourages symmetry and keeping called methods at the same level of abstraction. Kent also shows the method object, or function object, which mimics a functional style of programming. This chapter also describes conversion methods, factory methods, getter/setter methods and the advice to return a copy from a method instead of an internal reference.

Collections

Chapter 9 describes collections such as arrays, lists, sets and maps.

Evolving Framework

This chapter describes how to evolve frameworks without breaking applications. Kent draws on his experience with the JUnit framework and Eclipse (via his buddy Erich Gamma). One interesting topic is how clients use framework objects, and the chapter presents three styles: instantiation, configuration and implementation. Kent shows how the implementation style allows clients to implement a framework interface and extend behavior. Kent also describes the trade-offs of extending interfaces and using specialized interfaces as done by AWT (LayoutManager2). The chapter also offers advice on the use of internal classes by clients and how they can be instantiated using constructors, factories, etc.

Conclusion

This book shows a lot of techniques and patterns that most experienced programmers use daily, knowingly or unknowingly. Nevertheless, it helps to review these techniques and their trade-offs. I wish the book gave a lot more examples and described antipatterns, as I often found the topics a bit dull.


January 15, 2009

Tips from Effective Java, 2nd edition

Filed under: Java — admin @ 1:56 pm

I have been using Java since ’95 and I just reread Effective Java, Second Edition by Joshua Bloch; it is a must-have reference book for any Java programmer. It consists of over 75 guidelines for writing better code in Java, or any other object-oriented language for that matter. Here are some of my favorite tips from the book:

Factories/Builders

This book does not cover design patterns in general, but it encourages the use of factories instead of constructors in certain cases. I follow this advice when I have a lot of constructors and it is difficult to tell which one should be used. Factory methods can also instantiate and return subclass types. The downside is that factory methods look like other static methods, so they may not be obvious. I also use builders when I need to create a class with a lot of parameters, some of which are optional. I often use the following interface to build objects:

 public interface Builder<T> {
     public T build();
 }
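
As a rough sketch of how such a builder might be used, here is a hypothetical ServerConfig class with one required and two optional parameters. The class, its fields and the default values are assumptions made up for this example; the Builder interface is repeated so the sketch compiles on its own.

```java
/** The builder interface from above, repeated so the sketch is self-contained. */
interface Builder<T> {
    T build();
}

/** Hypothetical immutable class with one required and two optional parameters. */
final class ServerConfig {
    private final String host;
    private final int port;
    private final boolean ssl;

    private ServerConfig(ConfigBuilder builder) {
        this.host = builder.host;
        this.port = builder.port;
        this.ssl = builder.ssl;
    }

    String host() { return host; }
    int port() { return port; }
    boolean ssl() { return ssl; }

    static final class ConfigBuilder implements Builder<ServerConfig> {
        private final String host;    // required
        private int port = 8080;      // optional, assumed default
        private boolean ssl = false;  // optional, assumed default

        ConfigBuilder(String host) { this.host = host; }

        ConfigBuilder port(int port) { this.port = port; return this; }
        ConfigBuilder ssl(boolean ssl) { this.ssl = ssl; return this; }

        public ServerConfig build() { return new ServerConfig(this); }
    }
}
```

A caller then names only the options it cares about, e.g. new ServerConfig.ConfigBuilder("example.org").port(9003).build().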

Use enum for defining Singleton

As Java does not offer a language-level construct for defining Singletons (as opposed to Scala), defining a Singleton can be difficult. The general rule I follow is to make the constructor private and define a static method for getting the instance. However, there are still gotchas in preserving Singletons, especially upon deserialization. Joshua recommends using an enum because it automatically takes care of those edge cases.
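
A minimal sketch of the enum approach follows. IdGenerator is a hypothetical example of my own; the point is only that a single-element enum gives exactly one instance without any hand-written serialization code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical singleton: the single enum constant is the only instance. */
enum IdGenerator {
    INSTANCE;

    private final AtomicLong counter = new AtomicLong();

    /** Returns the next id; safe to call from multiple threads. */
    long nextId() {
        return counter.incrementAndGet();
    }
}
```

Clients simply call IdGenerator.INSTANCE.nextId().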

Avoid finalize

Since Java borrowed much of its syntax from C++, it created the finalize method as a sort of destructor. It is often misunderstood by new Java developers because it’s not a destructor. Also, it may not be called by the JVM at all. Joshua recommends using a try/finally block for any kind of resource allocation/release. I think closures in Java 7 may also help in this regard.
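
Here is a small sketch of the try/finally approach. FirstLineReader is a hypothetical helper made up for this example; the resource is released in the finally block rather than left to finalize.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical helper showing explicit release of a resource with try/finally. */
final class FirstLineReader {
    private FirstLineReader() {}

    static String firstLine(Reader source) throws IOException {
        BufferedReader reader = new BufferedReader(source);
        try {
            return reader.readLine();
        } finally {
            // Runs on normal return and when readLine() throws
            reader.close();
        }
    }
}
```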

Generics

There are tons of gotchas with using generics in Java. One of my favorite tips for reducing the amount of code is to define helper methods that take advantage of type inference, e.g.

   public static <K,V> Map<K, V> newMap() {
         return new HashMap<K, V>();
   }

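One confusing part of generics is when to use extends vs super; the book gives the easy acronym PECS (producer extends, consumer super) for remembering it. For example, the following code shows two methods of a generic stack: popAll drains the stack into a destination collection (a consumer, so super), and pushAll adds every item from a source (a producer, so extends):

   // Consumer: dst receives E values from the stack, so use super
   void popAll(Collection<? super E> dst) {
      while (!isEmpty())
         dst.add(pop());
   }
   // Producer: src supplies E values to the stack, so use extends
   void pushAll(Iterable<? extends E> src) {
      for (E e : src)
         push(e);
   }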

Functional Programming

Though Java is not a functional language, Joshua offers some tips on creating functors or function objects, e.g.

         interface UnaryFunction<T> {
            public T apply(T arg);
         }

Now this interface can be implemented for various operations and give a flavor of functional programming. Again, closures in Java 7 would help in this regard.
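
As an illustration, here is one way such an interface might be implemented and applied to a list. The Functions class, the TRIM constant and the map helper are hypothetical names of my own; the interface is repeated so the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

/** The function-object interface from above, repeated so the sketch is self-contained. */
interface UnaryFunction<T> {
    T apply(T arg);
}

/** Hypothetical helpers built on UnaryFunction. */
final class Functions {
    private Functions() {}

    /** A function object that trims whitespace from a string. */
    static final UnaryFunction<String> TRIM = new UnaryFunction<String>() {
        public String apply(String arg) {
            return arg.trim();
        }
    };

    /** Applies fn to every item and returns the results in a new list. */
    static <T> List<T> map(List<T> items, UnaryFunction<T> fn) {
        List<T> result = new ArrayList<T>(items.size());
        for (T item : items) {
            result.add(fn.apply(item));
        }
        return result;
    }
}
```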

EnumSet and EnumMap

I have rarely used EnumSet and EnumMap in practice, but the book offers useful tips for using them instead of bit masking and manipulation.
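
For a flavor of what this looks like, here is a short sketch in which a hypothetical Permission enum replaces the usual bit flags. The enum and the helper class are invented for this example; EnumSet and EnumMap are the standard java.util classes.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical permissions that might otherwise be int bit flags. */
enum Permission { READ, WRITE, EXECUTE }

final class PermissionDemo {
    private PermissionDemo() {}

    /** Replaces something like READ_FLAG | WRITE_FLAG. */
    static Set<Permission> readWrite() {
        return EnumSet.of(Permission.READ, Permission.WRITE);
    }

    /** EnumMap keys are enum constants and iterate in declaration order. */
    static Map<Permission, String> labels() {
        Map<Permission, String> labels =
                new EnumMap<Permission, String>(Permission.class);
        labels.put(Permission.READ, "r");
        labels.put(Permission.WRITE, "w");
        labels.put(Permission.EXECUTE, "x");
        return labels;
    }
}
```

Checking a flag becomes readWrite().contains(Permission.WRITE) instead of a bitwise AND.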

Threading

I like Java Concurrency in Practice for concurrency and threading topics, but this book also offers good tips, such as using immutable classes and Java’s synchronization features to write thread-safe applications. One of my favorite tips is the double-check idiom, which I learned when Java first came out and synchronization was somewhat expensive. However, I stopped using it because of flaws in the Java memory model. Java 1.5 fixed the memory model, so we can use it again by declaring the shared field volatile and doing something like:

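   private volatile FieldHolder fh;
   ...
   FieldHolder getFieldHolder() {
     FieldHolder ref = fh;              // single volatile read on the fast path
     if (ref == null) {
         synchronized (this) {
             ref = fh;                  // check again while holding the lock
             if (ref == null) {
                fh = ref = compute();
             }
         }
     }
     return ref;
   }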

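Another tip is to use the lazy initialization holder class idiom for initializing a static field lazily; it relies on the JVM’s thread-safe class initialization, so the accessor itself needs no explicit synchronization, e.g.

 class MyClass {
     // FieldHolder is not initialized until get() first touches it
     private static class FieldHolder {
         static final Field field = compute();
     }
     static Field get() {
         return FieldHolder.field;
     }
     static synchronized Field compute() {
        ...
     }
 }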

Other threading tips include the use of Executors and other classes from the new Java concurrency library (java.util.concurrent).
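
As a rough sketch of the Executors style, the following hypothetical ParallelSum class submits one task per value to a fixed thread pool and collects the results through Futures. The class name and the pool size of 4 are arbitrary choices for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical example of running tasks on a thread pool instead of raw threads. */
final class ParallelSum {
    private ParallelSum() {}

    static int sumOfSquares(List<Integer> values)
            throws InterruptedException, ExecutionException {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size
        try {
            List<Callable<Integer>> tasks = new ArrayList<Callable<Integer>>();
            for (final Integer value : values) {
                tasks.add(new Callable<Integer>() {
                    public Integer call() {
                        return value * value;
                    }
                });
            }
            int total = 0;
            // invokeAll blocks until every task has completed
            for (Future<Integer> result : pool.invokeAll(tasks)) {
                total += result.get();
            }
            return total;
        } finally {
            pool.shutdown(); // always release the worker threads
        }
    }
}
```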

Cloning and Serialization

The short answer for cloning is not to use it and to copy objects with constructors instead. Serialization creates a lot of security problems and can be difficult when inheritance is involved.
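
A copy constructor can be sketched as follows. Playlist is a hypothetical class made up for this example; the copy gets its own list, so changes to it do not affect the original.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical class that offers a copy constructor instead of clone(). */
final class Playlist {
    private final String name;
    private final List<String> tracks;

    Playlist(String name, List<String> tracks) {
        this.name = name;
        this.tracks = new ArrayList<String>(tracks); // defensive copy
    }

    /** Copy constructor: the new playlist has its own track list. */
    Playlist(Playlist other) {
        this(other.name, other.tracks);
    }

    void add(String track) { tracks.add(track); }
    int size() { return tracks.size(); }
    String name() { return name; }
}
```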

Exceptions

Joshua also offers a lot of good advice on exceptions, such as throwing exceptions at the proper level of abstraction. He still suggests checked exceptions for recoverable errors, but my suggestion is to use RuntimeException for both application/non-recoverable exceptions and system/recoverable exceptions.
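
To illustrate both points, here is a sketch that translates a low-level IOException into an unchecked exception at the caller’s level of abstraction. StorageException and UserStore are hypothetical names, and the failing readFromDisk method only simulates an I/O error.

```java
import java.io.IOException;

/** Hypothetical unchecked exception at the storage layer's level of abstraction. */
class StorageException extends RuntimeException {
    StorageException(String message, Throwable cause) {
        super(message, cause);
    }
}

/** Hypothetical store that hides low-level I/O failures from its callers. */
final class UserStore {
    String loadUser(String id) {
        try {
            return readFromDisk(id);
        } catch (IOException e) {
            // Translate to a higher-level exception and keep the cause for debugging
            throw new StorageException("Unable to load user " + id, e);
        }
    }

    private String readFromDisk(String id) throws IOException {
        throw new IOException("disk unavailable"); // simulated failure
    }
}
```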

equals, hashCode, toString, Comparable

Writing an equals method when inheritance is used can also be difficult, and Joshua offers a lot of good advice on writing a correct one. Another gotcha is to make sure hashCode matches the semantics of equals. Also, it’s a good idea to implement the Comparable interface so that it’s easy to use sorted collections.
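
Here is a minimal sketch of a final value class where equals, hashCode and compareTo all use the same fields. Version is a hypothetical class of my own, and making it final sidesteps the inheritance problems mentioned above.

```java
/** Hypothetical value class with consistent equals, hashCode and compareTo. */
final class Version implements Comparable<Version> {
    private final int major;
    private final int minor;

    Version(int major, int minor) {
        this.major = major;
        this.minor = minor;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Version)) return false;
        Version other = (Version) o;
        return major == other.major && minor == other.minor;
    }

    @Override
    public int hashCode() {
        // Uses the same fields as equals, so equal objects share a hash code
        return 31 * major + minor;
    }

    public int compareTo(Version other) {
        if (major != other.major) return major < other.major ? -1 : 1;
        if (minor != other.minor) return minor < other.minor ? -1 : 1;
        return 0;
    }

    @Override
    public String toString() {
        return major + "." + minor;
    }
}
```

Because compareTo is consistent with equals, Version objects behave predictably in both HashSet and TreeSet.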

Conclusion

I briefly described some of my favorite tips from the book; again, it’s an absolute desk reference.

