I think Sprint is the wrong metaphor for an iteration, because managers seem to think it means developers will run like hell to finish the work. My project adopted Scrum recently, but it is far from the true practices, principles and values of agility. For the past many months we have been developing like hell, and as a result our quality has suffered.
May 19, 2008
May 16, 2008
Integrating with lots of Services and AJAX
About a year and a half ago, I was involved in a system rewrite for a project that communicated with dozens of data sources. The web application allowed users to generate ad-hoc reports and perform various transactions on the data. The original project was in Perl/Mason, which proved difficult to scale as demands for adding more data sources grew. The performance of the system also became problematic because the old system waited for all data before displaying anything to the user. I was assigned to redesign the system, and I rebuilt it on a lightweight J2EE stack including Spring, Hibernate, Spring MVC and Sitemesh, and used DWR, Prototype and Scriptaculous for the AJAX-based web interface. In this blog, I am going to focus on the high-level design, especially integration with other services.
High level Design
The system consisted of the following layers:
- Domain layer – This layer defined the domain classes. Most of the data structures were just aggregates of heterogeneous data with a little structure. Also, users wanted to add new data sources with minimal time and effort, so a number of generic classes were defined to represent them.
- Data provider layer – This layer provided services to search and aggregate data from different sources. Basically, each provider published the query data it required and the output data it supported.
- Data aggregation layer – This layer collected data from multiple data sources and allowed UI to pull the data as it became available.
- Service layer – This layer provided high level operations for querying, reporting and transactions.
- Presentation – This layer provided the web-based interface and made significant use of AJAX to show the data incrementally.
Domain layer
Most of the data services simply returned rows of data with little structure and commonality. So, I designed general-purpose classes to represent rowsets and columns:
MetaField
represents the meta information for each atomic data element used for reporting purposes. It stores information such as the name and type of the field.
DataField
represents both a MetaField and its value. The value could be one of the following:
- UnInitailized – This is a marker interface signifying that the data is not yet populated. It was used to give users visual clues for data elements still waiting for a response from the data providers.
- DataError – This class stores an error encountered while accessing the data item. It also had subclasses such as:
- UnAvailable – the data is not available from the service
- TimeoutError – the service timed out
- ServerError – any unexpected server-side error
- The actual value from the data provider.
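As a rough sketch of what these classes might have looked like (the field names and the DataSketch wrapper are my own assumptions, not the original code):

```java
import java.io.Serializable;

public class DataSketch {
    // Meta information for an atomic data element: its name and type.
    public static class MetaField implements Serializable {
        public final String name;
        public final Class<?> type;
        public MetaField(String name, Class<?> type) {
            this.name = name;
            this.type = type;
        }
    }

    // Marker for a value that has not yet arrived from a provider.
    public interface UnInitailized {}

    // A MetaField paired with its current value (UnInitailized, an error, or a real value).
    public static class DataField {
        public final MetaField meta;
        private Object value = new UnInitailized() {};
        public DataField(MetaField meta) { this.meta = meta; }
        public boolean isPopulated() { return !(value instanceof UnInitailized); }
        public void setValue(Object value) { this.value = value; }
        public Object getValue() { return value; }
    }
}
```

The UI can test `isPopulated()` to decide whether to render a spinner or the actual cell value.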
DataSink
The data providers allowed clients to specify the amount of data needed; however, many providers had internal limits on the size of the data they could return, which required multiple invocations of the underlying services. The DataSink interface allowed higher-level services to consume data from each provider in a streaming fashion, which improved UI interaction and minimized the memory required to buffer data from the service providers. Here is the interface for the DataSink callback:
/**
 * This method allows clients to consume a set of tuples. The client returns
 * true when it wants to stop processing more data, and no further calls
 * will be made to the providers.
 * @param set - set of new tuples received from the data providers
 * @return - true if the client wants to stop
 */
public boolean consume(TupleSet set);

/**
 * This method notifies the client that the data provider is done fetching
 * all required data.
 */
public void dataEnded();
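To make the streaming contract concrete, here is a hypothetical sink that buffers rows until a client-specified limit is reached; the TupleSet shape and the LimitingSink class are illustrative assumptions, not the original code:

```java
import java.util.ArrayList;
import java.util.List;

public class DataSinkExample {
    // Assumed shape of TupleSet: one batch of rows from a provider call.
    public static class TupleSet {
        public final List<Object[]> rows;
        public TupleSet(List<Object[]> rows) { this.rows = rows; }
    }

    public interface DataSink {
        boolean consume(TupleSet set);   // return true to stop receiving data
        void dataEnded();                // provider has no more data
    }

    // A sink that buffers rows up to a limit, then asks the provider to stop.
    public static class LimitingSink implements DataSink {
        private final int limit;
        private final List<Object[]> buffer = new ArrayList<Object[]>();
        private boolean ended;
        public LimitingSink(int limit) { this.limit = limit; }
        public boolean consume(TupleSet set) {
            buffer.addAll(set.rows);
            return buffer.size() >= limit;  // true == stop calling me
        }
        public void dataEnded() { ended = true; }
        public List<Object[]> rows() { return buffer; }
        public boolean isEnded() { return ended; }
    }
}
```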
DataProvider
This interface was used to integrate with each of the data services:
public interface DataProvider {
public int getPriority();
public MetaField[] getRequestMetaData();
public MetaField[] getResponseMetaData();
public DataSink invoke(Map context, DataField[] inputParameters) throws DataProviderException;
}
DataLocator
This class used a configuration file to locate all the data providers needed for a query:
public interface DataProviderLocator {
public DataProvider[] getDataProviders(MetaField[] input, MetaField[] output);
}
DataExecutor
This class used Java’s Executors to send off queries to different data providers in parallel.
public interface DataExecutor {
public void execute();
}
The implementation of this class manages the dependencies between the data providers and runs them in separate threads.
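A minimal sketch of how such an executor might fan out queries with Java's Executors (the Provider interface and queryAll helper are simplified stand-ins, not the original code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelProviders {
    // Stand-in for a DataProvider call: fetch rows for a query.
    public interface Provider {
        List<String> fetch(String query) throws Exception;
    }

    // Sends the same query to every provider in parallel and collects all rows.
    public static List<String> queryAll(List<Provider> providers, final String query)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(providers.size());
        try {
            List<Future<List<String>>> futures = new ArrayList<Future<List<String>>>();
            for (final Provider p : providers) {
                futures.add(pool.submit(new Callable<List<String>>() {
                    public List<String> call() throws Exception {
                        return p.fetch(query);
                    }
                }));
            }
            List<String> all = new ArrayList<String>();
            for (Future<List<String>> f : futures) {
                all.addAll(f.get());  // blocks until that provider finishes
            }
            return all;
        } finally {
            pool.shutdown();
        }
    }
}
```

The real executor pushed results into the aggregator as they arrived rather than blocking on each future in order, but the fan-out pattern is the same.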
DataAggregator
This class stored the results of all data providers in a rowset format where each row was an array of data fields. It was consumed by the AJAX clients, which polled for new data.
public interface DataAggregator {
public void add(DataField[] keyFields, DataField[] valueFields);
public DataField[] keyFields();
public DataField[] dequeue(DataField[] keyFields) throws NoMoreDataException;
}
The first method is used by the DataExecutor to add data to the aggregator. In our application, each report had some kind of key field, such as SKU#. In some cases that key was passed by the user, and in other cases it was queried before the actual search. The second method returned those key fields. The third method was used by the AJAX clients to query new data.
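A much-simplified aggregator along these lines can be sketched with a per-key queue; the class and method signatures here are illustrative, not the original implementation:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class SimpleAggregator {
    // Rows queued per key (e.g. per SKU); pollers drain them as they arrive.
    private final Map<String, LinkedBlockingQueue<String[]>> rowsByKey =
            new ConcurrentHashMap<String, LinkedBlockingQueue<String[]>>();

    // Called by the executor threads as provider responses arrive.
    public void add(String key, String[] valueFields) {
        LinkedBlockingQueue<String[]> q = rowsByKey.get(key);
        if (q == null) {
            rowsByKey.putIfAbsent(key, new LinkedBlockingQueue<String[]>());
            q = rowsByKey.get(key);
        }
        q.offer(valueFields);
    }

    // Called by the polling client; returns null when nothing new has arrived.
    public String[] dequeue(String key) {
        LinkedBlockingQueue<String[]> q = rowsByKey.get(key);
        return q == null ? null : q.poll();
    }
}
```

The production version additionally tracked which providers had finished so it could throw NoMoreDataException when the search was complete.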
Service Layer
This layer provided an abstraction for communicating with the underlying data locators, providers and aggregators.
public interface DataProviderService{
public DataAggregator search(DataField[] inputFields, DataField[] outputFields);
}
End to End Example
+--------------------------+
| Client selects |
| input/output fields |
| and sends search request |
| View renders initial |
| table. |
+--------------------------+
| ^
V |
+-----------------------+
| Web Controller |
| creates DataFields |
| and calls service. |
| It then stores |
| aggregator in session.|
+-----------------------+
| ^
V |
+------------------------+
| Service calls locators,|
| and executor. |
| |
| It returns aggregator |
| |
+------------------------+
| ^
V |
+------------------------+
| Executor calls |
| providers and adds |
| responses to aggregator|
| |
+------------------------+
| ^
V |
+---------------------+
| Providers call |
| underlying services |
| or database queries |
+---------------------+
+------------------------+
| Client sends AJAX |
| request for new data |
| fields. View uses |
| $('cellid').value to |
| update table. |
+------------------------+
|
V
+-----------------------+
| Web Controller |
| calls aggregator |
| to get new fields |
| It cleans up aggreg. |
| when done. |
+-----------------------+
|
V
+----------------+
| Aggregator |
+----------------+
- Client selects a type of report, where each report has slightly different input data fields.
- Client opens the application and selects the data fields he/she is interested in.
- Client hits the search button.
- Web Controller intercepts the request and converts the form into arrays of input and output data field objects.
- Web Controller calls the search method of DataProviderService and stores the DataAggregator in the session. Though our application ran on multiple servers, we used sticky sessions and didn't need to replicate the search results. The controller then sent the key fields back to the view.
- The view used the key data to populate the table for the report, then started polling the server for incoming data.
- Each poll request finds new data and returns it to the view, which populates the table cells. When all data has been polled, the aggregator throws NoMoreDataException and the view stops polling.
- The view also stops polling after two minutes in case a service stalls; in that case, the aggregator is cleared from the session by a background thread.
Lessons Learned
This design has served us well in terms of performance and extensibility, but we had some scalability issues because we allowed the output of one provider to be used as input to another. Some threads sat idle as a result, so we added smarts to the executors to spawn threads only when input data was available. Also, though some of the data sources provided asynchronous services, most didn't, and for others we had to use the database. If the services had been purely asynchronous, we could have used a reactive style of concurrency with only two threads per search instead of almost one thread per provider: one thread would send requests to all services, and another would poll responses from the unfinished providers and add them to the aggregator as they completed. I think this kind of application is much better suited to a language like Erlang, which provides extremely lightweight processes so you can easily launch hundreds of thousands of them. Erlang also has built-in support for the tuples used in our application.
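The two-thread layout described above can be simulated with a sender thread and a poller thread sharing a queue. This is a toy sketch, not the original code; a real version would issue asynchronous service calls instead of offering strings:

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class TwoThreadSearch {
    // One thread fires all requests; a second drains responses into the aggregator.
    public static int run(final int providerCount) throws InterruptedException {
        final BlockingQueue<String> responses = new LinkedBlockingQueue<String>();
        Thread sender = new Thread(new Runnable() {
            public void run() {
                // In the real system this would fire one async request per provider.
                for (int i = 0; i < providerCount; i++) {
                    responses.offer("response-" + i);
                }
            }
        });
        final int[] aggregated = {0};
        Thread poller = new Thread(new Runnable() {
            public void run() {
                try {
                    for (int i = 0; i < providerCount; i++) {
                        if (responses.poll(5, TimeUnit.SECONDS) != null) {
                            aggregated[0]++;  // stand-in for adding to the aggregator
                        }
                    }
                } catch (InterruptedException ignored) { }
            }
        });
        sender.start();
        poller.start();
        sender.join();
        poller.join();
        return aggregated[0];
    }
}
```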
May 10, 2008
My blog has been hacked and spammed!
I have been blogging since 2002, and I started with Blojsom, which was based on Java/JSP. It worked pretty well, but when I switched my ISP a couple of years ago, I could not run my own Tomcat, so I switched to WordPress. Though it is much more user friendly, I had a lot of problems with SPAM in comments and finally just disabled them. However, lately spammers have gotten much more sophisticated: they added SPAM to my header.php and footer.php and even modified and added blog entries in the database. As a result, my blog has been removed from the search engines. I manually fixed the contents, but I found that it gets changed every day. I am not quite sure how they are getting access. For now, I have changed my database password and added some ways to detect file changes. Let me know if you have any ways to fix this for good.
May 7, 2008
How not to handle errors for an Ecommerce site!
I have been reading and hearing a great deal about the Scala language, which is an object-oriented and functional hybrid implemented on the JVM and CLR. So, I decided to buy the only book available, from Artima. I had been a long-time subscriber of Artima, so when I ordered I logged in and entered my credit card information. However, when I hit enter, I got a page with a cryptic error message. This was not what I expected: I hoped to get a link to the book, but instead I had no idea what had happened. Worse, there was no indication of what to do or who to contact. I found a phone number on the website, but when I called it, the phone company told me it was disconnected. The only email I could find on the site was for the webmaster, so I emailed the webmaster and explained what happened. But the webmaster didn't get back to me. I knew Bill Venners ran the site, so I searched for his email and finally found it through Google. I emailed Bill and explained the situation. Bill was quick to respond, and I finally got the link to the ebook within half an hour. Though Bill explained that the bug was not common and that they have very high standards for testing, I was aggravated by the way errors were handled without giving any clues to the customer. Clearly, building a site that handles credit cards requires higher standards. In case of an error while performing a transaction, I expect a clear message about what went wrong, whether my credit card was charged (it was, in my case), and some kind of contact page, email address, IM/chat or phone number. I also like the email confirmation that you generally get from ecommerce sites after a transaction.
In the end, I was glad to get the PDF for the book. I am finding Scala is a really cool language with features from a number of languages like Lisp, Smalltalk, Eiffel, Haskell, Erlang, ML, etc. Most of all, it takes advantage of Java’s Ecosystem that has tons of libraries and tools.
IT Sweatshops
Though I blogged about sweatshops a few years ago, recently I was talking to a friend who works for a company nominated as one of the "15 best places to work for" in Seattle Metropolitan's May issue, and I found out that the company's HR department pushed employees to vote to get onto the list. The company is not a bad place to work, but like many other companies it is a sort of sweatshop. Having worked for more than sixteen years and at over ten companies as an employee and consultant, I can relate. The truth is that IT departments in most companies are sweatshops, where workers are pushed to work incredible hours and sacrifice nights and weekends. In fact, my current employer is no different. In retrospect, I have found five big reasons that contribute to such environments:
- Taylorism – Despite the popularity of agile methodologies and the claim that agility has crossed the chasm, I have found that a command-and-control structure based on a Taylorist mentality is still rampant in most places. Management in most places thinks that giving workers impossible deadlines will force them to work harder, which means putting in 60+ hours/week. I have seen companies claim to be agile and promote working smart over working hard, but their practices were no different. These people try to increase velocity by any means (e.g. overtime). I heard one manager brag about how his team had higher velocity than the consultants they hired to learn agile practices, ignoring the fact that the team was putting in 70-80 hours/week.
- Offshoring/H1 – Though this may not be a politically correct thing to say, offshoring and H1 visas have lowered the value of software developers. Despite Paul Graham's essay on productivity, most management measures programmers by their rates. Also, many H1 visa holders are not married and tend to work longer hours until they get their green cards.
- Dot com Boom/Bomb – Though this may be nostalgia, I feel programmers had more respect before the dot com boom/bust. I admit that during the boom programmers were overvalued, but they have not regained their prior status.
- No overtime – The fact that IT workers are not eligible for overtime gives management an incentive to ask for any amount of work. There have been some lawsuits against game companies and IBM, but things would be a lot different if this rule changed. This is probably one of the reasons I would be open to creating a workers' union for IT folks.
- 24/7 – With the widespread usage of the Internet, all companies want to become 24/7 shops even when they don't need to be. This has added convenience for consumers, but IT folks have to pay for it. For example, in my company developers are responsible for operations, which adds considerable work.
Conclusion
I don't see this trend subsiding easily in the near future. Most companies measure dedication, and award promotions, by how many hours one puts in. To most employers and recruiters, the phrase "family friendly environment" is a code word for a candidate who is not committed. The only solutions to the sweatshop mentality I see are adopting agile values, changing overtime policies, or becoming an independent contractor and making your own rules. Agile practices offer a glimmer of hope here, but bad habits are hard to break. Many companies adopt agile processes without adopting the key values they promote. In the early 90s, when Total Quality Management was in vogue, I saw my company change the titles of managers to facilitators and directors to coaches, and yet their culture remained the same. Today, I see many companies change the titles of team leads or project managers to scrum masters and managers to product owners, and think they are doing agile.
February 15, 2008
Throttling Server in Java
I am about to release another J2EE web application, and I have been profiling and load testing the application before deployment. In my last blog I showed how I am collecting the profiling data. I found that after load testing with an increasing number of users, the system became slower and eventually crashed. In the past, I have used a throttling module in Apache to stop accepting new connections, but it didn't take memory/CPU into account. I am going to add throttling to the Java server (Tomcat) so that the system will reject new requests when memory becomes too low or system load becomes very high. Luckily, we are using Java 6, which has added a number of nice JMX APIs for collecting system information; for example, I have added the following:
import java.lang.management.ManagementFactory;
...
long freeMemory = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax() -
ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();
double loadAverage = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
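A throttling check built on these two values might look like the following. The ThrottleGuard name and thresholds are my own; in a real deployment a servlet filter would call shouldReject() before accepting each request:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class ThrottleGuard {
    private final long minFreeHeapBytes;
    private final double maxLoadAverage;

    public ThrottleGuard(long minFreeHeapBytes, double maxLoadAverage) {
        this.minFreeHeapBytes = minFreeHeapBytes;
        this.maxLoadAverage = maxLoadAverage;
    }

    // Returns true when the server should reject new requests.
    public boolean shouldReject() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long free = heap.getMax() - heap.getUsed();
        double load = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        // getSystemLoadAverage() returns a negative value when it is unavailable.
        return free < minFreeHeapBytes || (load >= 0 && load > maxLoadAverage);
    }
}
```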
These parameters were enough for my work, though I also tracked other information such as CPU time per thread:
int threads = ManagementFactory.getThreadMXBean().getThreadCount();
long[] threadIds = ManagementFactory.getThreadMXBean().getAllThreadIds();
StringBuilder threadInfo = new StringBuilder();
for (long id : threadIds) {
threadInfo.append("thread " + id + " cputime " + ManagementFactory.getThreadMXBean().getThreadCpuTime(id) +
", usertime " + ManagementFactory.getThreadMXBean().getThreadUserTime(id) + "\n");
}
Also the system start time:
Date started = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());
long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
and other system information:
List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
int cpus = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();
Java 6 also comes with plenty of command line tools to monitor a server, such as the following.
Memory Usage
The jstat command can be used to monitor the memory usage and garbage collection statistics as follows:
jstat -gcutil pid
In order to get a heap histogram, use jmap as follows:
jmap -histo:live pid
To create a memory dump automatically when running out of memory, use the following command line option:
java -XX:+HeapDumpOnOutOfMemoryError
You can use jhat to analyze the heap as follows:
jhat heap.dump.out
Monitoring threads
Use jstack to dump stack traces of all threads as follows:
jstack pid
Use JTop (shipped with the JDK demos) and JConsole to monitor the application, e.g.
java -jar $JAVA_HOME/demo/management/JTop/JTop.jar
February 14, 2008
Tracking server health in Java
In order to deploy a scalable application, it is necessary to know how the application behaves with a varying number of users. I am about to release another web application at work, so I have been profiling and load testing it. I wrote a blog entry a year ago on Load and Functional Testing with Selenium and Grinder. Today, I am showing somewhat simpler things that can be done in a J2EE application to keep track of its health.
Data Collector
When working for big companies, we often used commercial software such as a complex event processor or messaging middleware to keep track of all kinds of alerts and events. Here, I am showing a simple class that uses an LRU-based map to keep track of profile data:
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Date;
import java.util.List;

import com.amazon.otb.cache.CacheMap;

/**
 * Profile DataCollector used to store profiling data.
 */
public class ProfileDataCollector {
    private static ProfileDataCollector instance = new ProfileDataCollector();
    private CacheMap<String, Object> map;

    private ProfileDataCollector() {
        map = new CacheMap<String, Object>(1000, 300, null);
        map.put("ProfileStarted", new Date());
    }

    public static ProfileDataCollector getInstance() {
        return instance;
    }

    public void add(String name, Object value) {
        map.put(name, value);
    }

    public long increment(String name) {
        Long number = (Long) map.get(name);
        if (number == null) {
            number = new Long(1);
        } else {
            number = new Long(number.longValue() + 1);
        }
        add(name, number);
        return number.longValue();
    }

    public long decrement(String name) {
        Long number = (Long) map.get(name);
        if (number == null) {
            number = new Long(0);
        } else {
            number = new Long(number.longValue() - 1);
        }
        add(name, number);
        return number.longValue();
    }

    public long sum(String name, long total) {
        Long number = (Long) map.get(name);
        if (number == null) {
            number = new Long(total);
        } else {
            number = new Long(number.longValue() + total);
        }
        add(name, number);
        return number.longValue();
    }

    public long average(String name, long elapsed) {
        long number = increment(name + "_times");
        long sum = sum(name + "_total", elapsed);
        long average = sum / number;
        add(name + "_average", average);
        return average;
    }

    public void lapse(String name) {
        add(name, new Long(System.currentTimeMillis()));
    }

    public long elapsed(String name) {
        Long started = (Long) map.get(name);
        if (started != null) {
            long time = System.currentTimeMillis() - started.longValue();
            return average(name, time);
        } else {
            return -1;
        }
    }

    public String[][] getProfileData(String keyPrefix) {
        List<String> keys = new ArrayList<String>(map.keySet());
        Collections.sort(keys);
        String[][] data = new String[keys.size() + 5][];
        Runtime runtime = Runtime.getRuntime();
        Date started = (Date) map.get("ProfileStarted");
        long elapsed = (System.currentTimeMillis() - started.getTime()) / 1000;
        int n = 0;
        double systemLoad = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
        data[n++] = new String[] {keyPrefix + "TotalMemoryInMegs", String.valueOf(runtime.totalMemory() / 1024 / 1024)};
        data[n++] = new String[] {keyPrefix + "FreeMemoryInMegs", String.valueOf(runtime.freeMemory() / 1024 / 1024)};
        data[n++] = new String[] {keyPrefix + "ActiveThreads", String.valueOf(Thread.activeCount())};
        data[n++] = new String[] {keyPrefix + "ServerRunningInSecs", String.valueOf(elapsed)};
        data[n++] = new String[] {keyPrefix + "SystemLoadAverage", String.valueOf(systemLoad)};

        for (String key : keys) {
            CacheMap.TimedValue tv = map.getTimedValue(key);
            data[n++] = new String[] {keyPrefix + key, tv.value + " @" + new Date(tv.time)};
        }
        return data;
    }
}
Where CacheMap is a simple LRU-based map:
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.concurrent.Callable;
import java.util.concurrent.Executors;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.locks.ReentrantLock;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import org.apache.log4j.Logger;

/**
 * CacheMap - provides lightweight caching based on LRU size and timeout
 * and asynchronous reloads.
 */
public class CacheMap<K, V> implements Map<K, V> {
    private final static Logger log = Logger.getLogger(CacheMap.class);
    private final static int MAX_THREADS = 10; // for all cache items across VM
    private final static int MAX_ITEMS = 1000; // for all cache items across VM
    private final static ExecutorService executorService = Executors.newFixedThreadPool(MAX_THREADS);
    private final static boolean lockSync = false;

    public class TimedValue<V> {
        public final long time;
        public V value;
        private boolean updating = false;

        TimedValue(V value) {
            this.value = value;
            this.time = System.currentTimeMillis();
        }

        public boolean isExpired(long timeoutInSecs) {
            long timeDiff = System.currentTimeMillis() - time;
            return timeDiff > timeoutInSecs * 1000;
        }

        public synchronized boolean markUpdating() {
            if (!this.updating) {
                this.updating = true;
                return true;
            }
            return false;
        }

        @Override
        public String toString() {
            long timeDiff = System.currentTimeMillis() - time;
            return "TimedValue(" + value + ") expiring in " + timeDiff;
        }
    }

    class FixedSizeLruLinkedHashMap<K, V> extends LinkedHashMap<K, V> {
        private final int maxSize;

        public FixedSizeLruLinkedHashMap(int initialCapacity, float loadFactor, int maxSize) {
            super(initialCapacity, loadFactor, true);
            this.maxSize = maxSize;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            return size() > maxSize;
        }
    }

    private final Cacheable classCacheable;
    private final Map<K, TimedValue<V>> map;
    private final Map<Object, ReentrantLock> locks;
    private CacheLoader<K, V> defaultCacheLoader;
    private final long expireEntriesPeriod;
    private long lastexpireEntriesTime;

    public CacheMap(Cacheable cacheable) {
        this(cacheable, null, 5);
    }

    public CacheMap(int timeoutInSecs, CacheLoader<K, V> defaultCacheLoader) {
        this(0, timeoutInSecs, defaultCacheLoader);
    }

    public CacheMap(int maxCapacity, int timeoutInSecs, CacheLoader<K, V> defaultCacheLoader) {
        this(new CacheableImpl(maxCapacity, timeoutInSecs, true, false), defaultCacheLoader, 5);
    }

    public CacheMap(Cacheable cacheable, CacheLoader<K, V> defaultCacheLoader, long expireEntriesPeriodInSecs) {
        this.classCacheable = cacheable;
        this.defaultCacheLoader = defaultCacheLoader;
        this.expireEntriesPeriod = expireEntriesPeriodInSecs * 1000;
        int maxCapacity = cacheable != null && cacheable.maxCapacity() > 0 && cacheable.maxCapacity() < MAX_ITEMS ?
                cacheable.maxCapacity() : MAX_ITEMS;
        this.map = Collections.synchronizedMap(new FixedSizeLruLinkedHashMap<K, TimedValue<V>>(maxCapacity / 10, 0.75f, maxCapacity));
        this.locks = new HashMap<Object, ReentrantLock>();
    }

    public void setDefaultCacheLoader(CacheLoader<K, V> defaultCacheLoader) {
        this.defaultCacheLoader = defaultCacheLoader;
    }

    public int size() {
        return map.size();
    }

    public boolean isEmpty() {
        return map.isEmpty();
    }

    public boolean containsKey(Object key) {
        return map.containsKey(key);
    }

    public boolean containsValue(Object value) {
        return map.containsValue(value);
    }

    public V put(K key, V value) {
        TimedValue<V> old = map.put(key, new TimedValue<V>(value));
        if (old != null) {
            return old.value;
        } else {
            return null;
        }
    }

    public V remove(Object key) {
        TimedValue<V> old = map.remove(key);
        if (old != null) {
            return old.value;
        } else {
            return null;
        }
    }

    public void putAll(Map<? extends K, ? extends V> m) {
        for (Entry<? extends K, ? extends V> e : m.entrySet()) {
            put(e.getKey(), e.getValue());
        }
    }

    public void clear() {
        map.clear();
    }

    public Set<K> keySet() {
        return map.keySet();
    }

    public Collection<V> values() {
        List<V> list = new ArrayList<V>(map.size());
        for (TimedValue<V> e : map.values()) {
            list.add(e.value);
        }
        return list;
    }

    public Set<Map.Entry<K, V>> entrySet() {
        Set<Map.Entry<K, V>> set = new HashSet<Entry<K, V>>();
        for (final Map.Entry<K, TimedValue<V>> e : map.entrySet()) {
            set.add(new Map.Entry<K, V>() {
                public K getKey() {
                    return e.getKey();
                }
                public V getValue() {
                    return e.getValue().value;
                }
                public V setValue(V value) {
                    TimedValue<V> old = e.getValue();
                    e.getValue().value = value;
                    if (old != null) {
                        return old.value;
                    } else {
                        return null;
                    }
                }
            });
        }
        return set;
    }

    public TimedValue<V> getTimedValue(Object key) {
        return this.map.get(key);
    }

    public V get(Object key) {
        V value = null;
        if (classCacheable != null && defaultCacheLoader != null) {
            value = get((K) key, classCacheable, defaultCacheLoader, null);
        } else {
            TimedValue<V> item = this.map.get(key);
            if (item == null) {
                value = null;
            } else {
                value = item.value;
            }
        }
        return value;
    }

    public V get(K key, Cacheable cacheable, CacheLoader<K, V> loader, boolean[] wasCached) {
        TimedValue<V> item = this.map.get(key);
        V value = null;
        ReentrantLock lock = null;

        // expire old entries
        if (System.currentTimeMillis() - this.lastexpireEntriesTime > this.expireEntriesPeriod) {
            expireEntries(cacheable);
        }

        try {
            synchronized (this) {
                if (lockSync && cacheable.synchronizeAccess()) {
                    lock = lock(key);
                }
            }
            if (item == null) {
                // load initial value
                value = reloadSynchronously(key, loader);
            } else {
                value = item.value;
                boolean cached = true;
                if (cacheable.timeoutInSecs() > 0 && item.isExpired(cacheable.timeoutInSecs())) {
                    if (!cacheable.canReloadAsynchronously()) {
                        // reload it now - don't use cached value
                        cached = false;
                        log.debug("Reloading expired entry synchronously " + key);
                        value = reloadSynchronously(key, loader);
                    } else if (item.markUpdating()) {
                        log.debug("Reloading expired entry asynchronously " + key);
                        reloadAsynchronously(key, loader);
                    }
                }
                if (wasCached != null) {
                    wasCached[0] = cached;
                }
            }
        } finally {
            if (lock != null) {
                lock.unlock();
                locks.remove(key);
            }
        }
        return value;
    }

    @Override
    public String toString() {
        return super.toString() + "--" + map;
    }

    private ReentrantLock lock(Object key) {
        ReentrantLock lock = null;
        synchronized (locks) {
            lock = locks.get(key);
            if (lock == null) {
                lock = new ReentrantLock();
                locks.put(key, lock);
            }
        }
        lock.lock();
        return lock;
    }

    private V reloadSynchronously(final K key, final CacheLoader<K, V> loader) {
        try {
            V value = loader.loadCache(key);
            put(key, value);
            return value;
        } catch (Exception e) {
            log.error("Failed to load " + key, e);
            throw new RuntimeException("Failed to load " + key + " for " + classCacheable, e);
        }
    }

    private void reloadAsynchronously(final K key, final CacheLoader<K, V> loader) {
        if (log.isDebugEnabled()) {
            log.debug("requesting reload for " + key.toString() + ": " + Thread.currentThread().getName());
        }
        executorService.submit(new Callable<V>() {
            public V call() throws Exception {
                if (log.isDebugEnabled()) {
                    log.debug("reloading for " + key.toString() + ": " + Thread.currentThread().getName());
                }
                return reloadSynchronously(key, loader);
            }
        });
    }

    private void expireEntries(final Cacheable cacheable) {
        Iterator<TimedValue<V>> it = new ArrayList<TimedValue<V>>(this.map.values()).iterator();
        while (it.hasNext()) {
            TimedValue<V> value = it.next();
            if (cacheable.timeoutInSecs() > 0 && value.isExpired(cacheable.timeoutInSecs())) {
                it.remove();
            }
        }
        this.lastexpireEntriesTime = System.currentTimeMillis();
    }
}
Session Tracking
I added a listener to keep track of active users/sessions (those who accessed the system in the last 5 minutes), as opposed to all sessions, which may take 30 minutes or more to expire.
public class SessionLogger implements HttpSessionListener, HttpSessionBindingListener {
    private transient static Log log = LogFactory.getLog(SessionLogger.class);
    private static int sessionCount;
    private static final String OBJECTS_IN_SESSION_COUNT = "Objects in Session";
    private static final long ACTIVE_THRESHOLD = 5 * 60 * 1000; // 5 minutes
    private static Map<HttpSession, HttpSession> activeSessions = Collections.synchronizedMap(new HashMap<HttpSession, HttpSession>());
    private static Map<HttpSession, String> allUsers = Collections.synchronizedMap(new HashMap<HttpSession, String>());
    private static Map<HttpSession, String> activeUsers = Collections.synchronizedMap(new HashMap<HttpSession, String>());

    private static void checkActiveSessions() {
        long now = System.currentTimeMillis();
        Iterator<HttpSession> it = activeSessions.keySet().iterator();
        activeUsers.clear();
        while (it.hasNext()) {
            HttpSession session = it.next();
            if (now - session.getLastAccessedTime() > ACTIVE_THRESHOLD) {
                it.remove();
            } else {
                activeUsers.put(session, (String) session.getAttribute(UserUtil.OTB_REAL_USER));
            }
        }
        ProfileDataCollector.getInstance().add("TotalSessions", String.valueOf(sessionCount));
        ProfileDataCollector.getInstance().add("ActiveSessions", String.valueOf(activeSessions.size()));
        ProfileDataCollector.getInstance().add("ActiveUsers", activeUsers.values().toString());
        ProfileDataCollector.getInstance().add("AllUsers", allUsers.values().toString());
    }

    public void sessionCreated(HttpSessionEvent se) {
        HttpSession session = se.getSession();
        synchronized (this) {
            sessionCount++;
            activeSessions.put(session, session);
            checkActiveSessions();
        }
        allUsers.put(session, (String) session.getAttribute(UserUtil.OTB_REAL_USER));
    }

    public void sessionDestroyed(HttpSessionEvent se) {
        HttpSession session = se.getSession();
        allUsers.remove(session);
        synchronized (this) {
            sessionCount--;
            activeSessions.remove(session);
            checkActiveSessions();
        }
    }

    public void valueBound(HttpSessionBindingEvent event) {
        Integer old = (Integer) event.getSession().getAttribute(OBJECTS_IN_SESSION_COUNT);
        if (old == null) {
            old = new Integer(0);
        }
        Integer count = new Integer(old.intValue() + 1);
        event.getSession().setAttribute(OBJECTS_IN_SESSION_COUNT, count);
        ProfileDataCollector.getInstance().add("TotalSessionValues", String.valueOf(count));
    }

    public void valueUnbound(HttpSessionBindingEvent event) {
        Integer old = (Integer) event.getSession().getAttribute(OBJECTS_IN_SESSION_COUNT);
        if (old == null) {
            old = new Integer(0);
        }
        Integer count = new Integer(old.intValue() - 1);
        event.getSession().setAttribute(OBJECTS_IN_SESSION_COUNT, count);
        ProfileDataCollector.getInstance().add("TotalSessionValues", String.valueOf(count));
    }
}
Publishing Profiling Data
I then added the profiling calls to the code that needs to be profiled. Though I have used AspectJ in the past for this kind of work, for now I am just adding the code where needed, e.g.
try {
    ProfileDataCollector.getInstance().increment("ActiveSearches");
    ProfileDataCollector.getInstance().lapse("SearchLapse");
    ...
} finally {
    ProfileDataCollector.getInstance().increment("TotalSearches");
    ProfileDataCollector.getInstance().decrement("ActiveSearches");
    ProfileDataCollector.getInstance().elapsed("SearchLapse");
}
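The try/finally bookkeeping above is easy to get wrong (forgetting the decrement would leak an "active" count forever), so it can be factored into a small helper. Below is a minimal sketch; ProfiledScope is a hypothetical helper of my own, and StubCollector is a stand-in for the post's ProfileDataCollector, whose real API may differ:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Stand-in for ProfileDataCollector (assumed API); just named counters.
class StubCollector {
    private static final StubCollector INSTANCE = new StubCollector();
    private final Map<String, AtomicLong> counters = new ConcurrentHashMap<String, AtomicLong>();

    static StubCollector getInstance() { return INSTANCE; }

    void increment(String name) { counter(name).incrementAndGet(); }
    void decrement(String name) { counter(name).decrementAndGet(); }
    long get(String name) { return counter(name).get(); }

    private AtomicLong counter(String name) {
        counters.putIfAbsent(name, new AtomicLong());
        return counters.get(name);
    }
}

// Hypothetical helper that pairs the begin/end bookkeeping so the caller
// cannot forget the decrement in the finally block.
public class ProfiledScope {
    private final String activeName;
    private final String totalName;

    ProfiledScope(String activeName, String totalName) {
        this.activeName = activeName;
        this.totalName = totalName;
        StubCollector.getInstance().increment(activeName);
    }

    void close() {
        StubCollector.getInstance().increment(totalName);
        StubCollector.getInstance().decrement(activeName);
    }

    public static void main(String[] args) {
        ProfiledScope scope = new ProfiledScope("ActiveSearches", "TotalSearches");
        try {
            // ... the search work being profiled would run here ...
        } finally {
            scope.close();
        }
        System.out.println(StubCollector.getInstance().get("ActiveSearches")); // prints 0
        System.out.println(StubCollector.getInstance().get("TotalSearches"));  // prints 1
    }
}
```

The same idea extends naturally to the lapse/elapsed timing pair.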
Viewing Profile Data
Finally, I added a servlet to return the profiling data to AJAX calls, e.g.
import java.io.IOException;

import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class ProfileServlet extends HttpServlet {
    protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
        response.setContentType("text/plain");
        String format = request.getParameter("format");
        if ("line".equals(format)) {
            response.getWriter().println(getProfileLine());
        } else if ("json".equals(format)) {
            response.getWriter().println(getProfileJson());
        } else {
            response.getWriter().println(getProfileTable());
        }
    }

    protected void doGet(HttpServletRequest req, HttpServletResponse rsp) throws ServletException, IOException {
        doPost(req, rsp);
    }

    public static String[][] getProfileData() {
        return ProfileDataCollector.getInstance().getProfileData("");
    }

    public static String getProfileLine() {
        StringBuilder sb = new StringBuilder();
        String[][] data = getProfileData();
        for (String[] param : data) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(param[0]).append('=').append(param[1]);
        }
        return sb.toString();
    }

    public static String getProfileTable() {
        StringBuilder sb = new StringBuilder("<table width='100%' border='2'>");
        String[][] data = getProfileData();
        sb.append("<tr>");
        for (String[] param : data) {
            sb.append("<th>").append(param[0]).append("</th>");
        }
        sb.append("</tr>");
        sb.append("<tr>");
        for (String[] param : data) {
            sb.append("<td>").append(param[1]).append("</td>");
        }
        sb.append("</tr>");
        sb.append("</table>");
        return sb.toString();
    }

    public static String getProfileJson() {
        StringBuilder sb = newJsonString();
        String[][] data = getProfileData();
        for (String[] param : data) {
            appendJsonString(sb, param[0], param[1]);
        }
        return endJsonString(sb);
    }

    protected static StringBuilder newJsonString() {
        // Note: new StringBuilder('[') would treat the char as an initial
        // capacity, so the opening bracket must be passed as a String.
        return new StringBuilder("[");
    }

    protected static void appendJsonString(StringBuilder sb, String name, String id) {
        if (sb.length() > 1) {
            sb.append(',');
        }
        sb.append("{ name:'").append(name.replace("'", "\\'")).append("', id:'").append(id.replace("'", "\\'")).append("' }");
    }

    protected static String endJsonString(StringBuilder sb) {
        sb.append(']');
        return sb.toString();
    }

    protected static void addJsonProperty(StringBuilder sb, String name, String value) {
        sb.append('"').append(name).append('"');
        sb.append(':');
        sb.append('"').append(value).append('"');
    }

    protected static void startJsonObj(StringBuilder sb) {
        sb.append('{');
    }

    protected static void addNewJsonProperty(StringBuilder sb) {
        sb.append(',');
    }

    protected static void endJsonObj(StringBuilder sb) {
        sb.append('}');
    }
}
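One subtlety when building JSON by hand: in Java the string literal "\'" is identical to "'", so a call like replace("'", "\'") does nothing, and double quotes and backslashes inside values also need real escaping. Below is a minimal sketch of such an escaper; the class and method names are my own illustration, it is not a full RFC 8259 implementation, and a real project would be better served by a JSON library:

```java
public class JsonEscapeDemo {
    // Minimal escaper for values embedded in hand-built JSON; handles only
    // the common special characters, not the full set required by the spec.
    static String esc(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n");  break;
                case '\r': out.append("\\r");  break;
                case '\t': out.append("\\t");  break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(esc("he said \"hi\"\nback\\slash"));
        // prints: he said \"hi\"\nback\\slash
    }
}
```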
And then a JSP to show the results using AJAX calls with the Prototype library:
<script src="/otb-static/scriptaculous/prototype.js" type="text/javascript"></script>
<script>
var updater = null;
function initAjax() {
updater = new Ajax.PeriodicalUpdater('profile_div', '/profile?format=table', {
method: 'get',
insertion: Insertion.Top,
frequency: 15,
decay: 2
});
}
function startRequest() {
updater.start();
}
function stopRequest() {
updater.stop();
}
</script>
</head>
<body onLoad="initAjax(); startRequest();">
<div id="profile_div">
</div>
Testing
I then wrote a unit test that calls various actions of the web application, and I am going to use my old pal Grinder to do some real load testing and to monitor the health of the server. I am not showing the test here because it is very application specific.
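The monitoring side of the loop can be sketched in plain Java, though. The snippet below is purely illustrative: it stands up a throwaway HTTP endpoint (in place of the real profile servlet, which is not shown here) and fetches it the way a health-check poller would:

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpHandler;
import com.sun.net.httpserver.HttpServer;

import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.InetSocketAddress;
import java.net.URL;

public class ProfilePollDemo {
    public static void main(String[] args) throws Exception {
        // Throwaway endpoint standing in for /profile?format=line.
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/profile", new HttpHandler() {
            public void handle(HttpExchange ex) throws IOException {
                byte[] body = "TotalSessions=3&ActiveSessions=1".getBytes("UTF-8");
                ex.sendResponseHeaders(200, body.length);
                OutputStream os = ex.getResponseBody();
                os.write(body);
                os.close();
            }
        });
        server.start();
        try {
            // Fetch the profile line the way a periodic monitor would.
            int port = server.getAddress().getPort();
            URL url = new URL("http://localhost:" + port + "/profile?format=line");
            HttpURLConnection conn = (HttpURLConnection) url.openConnection();
            InputStream in = conn.getInputStream();
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) > 0) {
                buf.write(chunk, 0, n);
            }
            in.close();
            System.out.println(buf.toString("UTF-8"));
        } finally {
            server.stop(0);
        }
    }
}
```

In a real setup the poller would parse the name=value pairs and alert on thresholds; Grinder handles the load-generation side.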
February 7, 2008
February 5, 2008
Does experience matter?
I came across a blog entry by DHH, Years of irrelevance, and an old blog post by Jeff Atwood on Skill Disparities in Programming. Both cover a similar topic: are programmers with more experience better than programmers with less experience? For example, DHH asserts that after six months with a technology you can be pretty experienced. Similarly, Jeff cites research findings where a person with just two years of experience is just as good as a person with seven or more. This topic often comes up in job postings, when companies require applicants to have a certain number of years of experience. Martin Fowler also weighed in recently in his blog entry PreferDesignSkills, where he prefers a person with broad experience in design and programming over a specialist. I have seen my share of discrimination in the job market, when a recruiter is looking for X years of experience or will only consider a person with Websphere experience and not Weblogic. It is even worse when a new technology is involved, and I faced similar discrimination when transitioning from J2EE to Rails. I totally agree that requiring experience with a specific technology does not matter much, because a smart person can easily learn it. This has also been discussed many times by Scott Ambler in Generalists vs. Specialists, which I have quoted on my website for many years. Since there is a considerable difference in productivity and quality between programmers, this is an important question. Based on my twenty years of programming experience, sixteen of them professional, I agree with these findings. I have seen people with twenty years of experience doing the same job in big companies with no desire to learn anything else. I have learned it is incredibly important to diversify your skills and be a generalist rather than a specialist. It is the only way a programmer with more years of experience can be more valuable than a programmer with fewer.
When hiring, I would look for a person with broad skills who has a track record of getting things done and a passion for learning new things. As Joel Spolsky often says, when hiring look for smart people who get things done. There are always new things coming, and a good programmer will find a way to learn them in a short amount of time, as DHH mentioned. I have worked as a software developer and as a systems engineer/administrator, and that helped my understanding of overall systems. I have used various languages over the years, including Basic, FORTRAN, COBOL, C, C++, Java, Perl, Python, Ruby, Erlang, and Haskell. As Dave Thomas says, learning a new language teaches you to think differently, and that can lead you to new solutions.
Besides being a generalist, another way experience matters is design. Though with a small system the productivity and quality of an experienced and a junior programmer may seem similar, I have found that with larger systems the difference in quality does show up. (Note, I am only considering smart programmers with varying numbers of years here, not really bad programmers.) I have observed that the design of an experienced programmer (more than five years of experience) will be more flexible, because he/she has likely worked on a similar problem before (which may also give a productivity advantage). I have also found that junior programmers struggle with roles and responsibilities (Rebecca Wirfs-Brock) and good object oriented design (or domain driven design). I have observed that senior programmers have a better understanding of things like separation of concerns, modularization, encapsulation, loose coupling, and scalability.
In a nutshell, any good programmer can learn a new technology very quickly and can solve any problem, but I believe an experienced programmer with broad general experience will be more valuable than a junior programmer, and the design of a senior programmer with broad experience will be much better in terms of good design principles and -ilities. I think design is something you learn over the years, because each design decision may draw on thousands of small lessons learned along the way. As far as development teams are concerned, I like to have one really experienced programmer and others with junior to mid-level experience. This gives a good apprenticeship environment for junior people to learn, as software development is still an art. Finally, as with everything in software, there are no hard rules, and everything depends on the environment and the people.
December 23, 2007
Released ErlSDB 0.1
I started working on an Erlang library to access Amazon’s SimpleDB web service and I released an early version of the library this weekend. Here are some notes on its usage:
Installing
svn checkout http://erlsdb.googlecode.com/svn/trunk/ erlsdb-read-only
Building
Type make to build the library.
Testing
Edit the Makefile to add your access key and secret key, then type make test
Usage
Take a look at test/erlsdb_test.erl to learn the usage; here is some sample code:
Starting Server
erlsdb:start(type, [#sdb_state{ access_key = "YourAccessKey", secret_key = "YourSecretKey", domain = "YourDomain" } ])
Creating Domain
erlsdb:create_domain()
Note that the server will use the domain that was passed during initialization.
Listing all Domains
{ok, List, _} = erlsdb:list_domains()
Deleting Domain
erlsdb:delete_domain()
Adding an item
Attributes = lists:sort([
    ["StreetAddress", "705 5th Ave"],
    ["City", "Seattle"],
    ["State", "WA"],
    ["Zip", "98101"]
]),
erlsdb:put_attributes("TccAddr", Attributes)
Retrieving an item
{ok, UnsortedAttrs} = erlsdb:get_attributes("TccAddr")
Deleting an item
erlsdb:delete_attributes("TccAddr")
