Shahzad Bhatti Welcome to my ramblings and rants!

May 7, 2008

IT Sweatshops

Filed under: Computing — admin @ 11:50 am

Though, I have blogged on Sweatshops a few years ago, but recently I was talking to a friend who works for a company that was nominated as “15 best places to work for” in Seattle Metropolitan’s May issue, but I found out that the company’s HR department pushed employees to vote to get into the top list. What I found was that the company is not bad place to work, but like many other companies is a sort of sweatshop. Having worked for more than sixteen years and over ten companies as an employee and consultant, I could relate to it as well. The truth is that IT departments in most companies are sweatshops, where workers are pushed to make incredible hours and sacrifice nights and weekends. In fact, my current employer is no different. In retrospect, I have found five big reasons that contribute to such environments:

  1. Taylorism – Despite popularity of agile methodologies and the claim that Agility has crossed the chasm, I have found command-control structure based on Taylorism mentality is still rampant in most places. The management in most places think that giving workers impossible deadlines will force them to work harder, which implies putting 60+hours/week. I have seen companies claim to be agile and promote working smart over working hard, but their practices were no different. These people try to increase velocity by any means (e.g. overtime). I heard one manager brag how his team has higher velocity than the consultants they hired to learn agile practices, ignoring the fact that team was putting 70-80 hours/week.
  2. Offshoring/H1 – Though, this may not be politically correct thing to say, but offshoring and H1 visas lowered the values of software developers. Despite Paul Graham’s essay on productivity, most management measure programmers by their rates. Also, most of H1 visa holders are not married and tend to work longer until they get their green cards.
  3. Dot com Boom/Bomb – Though, this may be nostalgia, but I feel programmers had more respect before the dot com boom/bomb. Though, I admit during boom, programmers were over valued, but they have not gained prior status.
  4. No overtime – The fact that IT workers are not eligible for overtime adds incentive for management to ask for any amount of work. Though, I have seen some lawsuits from game companies and IBM, but things would be a lot different if this rule changed. This probably be one of the reason, I would be open to creating a worker union for IT folks.
  5. 24/7 With the widespread usage of Internet, all companies want to become 24/7 shops even when they didn’t need it. Though, this has added convenience for mass consumers, but IT folks have to pay for it. For example, in my company developers are responsible for operations, which add considerable work.

Conclusion
I don’t see this trend will subside easily in near future. Most companies measure dedication and promotion by how many hours one put. To most employers and recruiters, the word “family friendly environment” is a code word for a candidate who is not committed. The only solutions to sweatshop mentality I see are adopting agile values, changing overtime policies or becoming independent contractors and make your own rules. Though, agile practices offer glimmer hope to address this, but bad habits are hard to break. Many companies adopt agile processes without adopting key values that they promote. In early 90s when Total Quality Management was in vogue, I saw my company changed titles of managers to facilitators and titles of directors to coaches, and yet their cultured remained the same. Today, I see many companies change titles of team leads or project managers to scrum masters and titles of managers to product owners, and think they are doing agile.

February 15, 2008

Throttling Server in Java

Filed under: Computing,Java — admin @ 4:11 pm

I am about to release another J2EE web application and I have been profiling and load testing the application before the deployment. In my last blog I showed how I am collecting the profiling data. I found that after running load testing with increasingly number of users, the system becomes slower and eventually crashes. In past, I have used a throttling module in Apache to stop accepting new connections, but it didn’t take into memory/cpu into account. I am going to add a throttling to the Java server (Tomcat) so that the system will reject any new requests when the memory becomes too low or system load becomes very high. Luckily, we are using Java 6 which has added a number of nice JMX APIs for collecting system information, for example I have added following

import java.lang.management.ManagementFactory;
...
   long freeMemory = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax() -
                ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getUsed();

   double loadAverage = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();

These parameters were enough for my work, though I also tracked other information such as CPU time by thread:

        int threads = ManagementFactory.getThreadMXBean().getThreadCount();
        long[] threadIds = ManagementFactory.getThreadMXBean().getAllThreadIds();
        StringBuilder threadInfo = new StringBuilder();
        for (long id : threadIds) {
            threadInfo.append("thread " + id + " cputime " +  ManagementFactory.getThreadMXBean().getThreadCpuTime(id) +
                ", usertime " + ManagementFactory.getThreadMXBean().getThreadUserTime(id));
        }

Also system start time:


            Date started = new Date(ManagementFactory.getRuntimeMXBean().getStartTime());
            long uptime ManagementFactory.getRuntimeMXBean().getUptime();

and other system information:

            List jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
            int cpus = ManagementFactory.getOperatingSystemMXBean().getAvailableProcessors();

Java 6 also comes with plenty of command line tools to monitor server such as

Memory Usage

The jstat command can be used to monitor the memory usage and garbage collection statistics as follows:

jstat -gcutil
  

In order to get heap histogram use jmap as follows:

jmap -histo:live pid

Creating memory dump automatically when running out of memory with following command option:

java -XX:+HeapDumpOnOutOfMemoryError

You can use jhat to analyze heap as follows:

jhat heap.dump.out

Monitoring threads

Use jstack to dump stack traces of all threads as follows:

jstack pid

Use JTop and JConsole to monitor the application, e.g.

java -jar /demo/management/JTop/JTop.jar

February 14, 2008

Tracking server health in Java

Filed under: Computing,Java — admin @ 5:27 pm

In order to deploy a scalable application, it is necessary to know how the application behaves with varying number of users. I am about to release another web application at work so I have been profiling and load testing it. I wrote a blog entry a year ago on Load and Functional Testing with Selenium and Grinder. Today, I am showing somewhat simpler things that can be done on J2EE application to keep track of its health.

Data Collector

Though, when working for big companies, often we used commercial software such as complex event processor or messaging middleware to keep track of all kind of alerts and events. But here I am showing a simple class that uses LRU based map to keep track of profile data:

  1 import java.lang.management.ManagementFactory;

  2 import java.util.ArrayList;
  3 import java.util.Collections;
  4 import java.util.Date;

  5 import java.util.List;
  6
  7 import com.amazon.otb.cache.CacheMap;
  8 

  9 /**
 10  * Profile DataCollector used to store profiling data.

 11  */
 12 public class ProfileDataCollector {
 13     private static ProfileDataCollector instance = new ProfileDataCollector();

 14     private CacheMap<String, Object> map;
 15     private ProfileDataCollector() {
 16         map = new CacheMap<String, Object>(1000, 300, null);

 17         map.put("ProfileStarted", new Date());
 18     }
 19
 20     public static ProfileDataCollector getInstance() {

 21         return instance;
 22     }
 23
 24     public void add(String name, Object value) {

 25         map.put(name, value);
 26     }
 27     public long increment(String name) {
 28         Long number = (Long) map.get(name);

 29         if (number == null) {
 30             number = new Long(1);

 31         } else {
 32             number = new Long(number.longValue()+1);
 33         }

 34         add(name, number);
 35         return number.longValue();
 36     }
 37 

 38     public long decrement(String name) {
 39         Long number = (Long) map.get(name);
 40         if (number == null) {

 41             number = new Long(0);
 42         } else {
 43             number = new Long(number.longValue()-1);

 44         }
 45         add(name, number);
 46         return number.longValue();
 47     }

 48
 49     public long sum(String name, long total) {
 50         Long number = (Long) map.get(name);

 51         if (number == null) {
 52             number = new Long(total);
 53         } else {

 54             number = new Long(number.longValue()+total);
 55         }
 56         add(name, number);
 57         return number.longValue();

 58     }
 59
 60
 61     public long average(String name, long elapsed) {

 62         long number = increment(name + "_times");
 63         long sum = sum(name + "_total", elapsed);
 64         long average = sum / number;

 65         add(name + "_average", average);
 66         return average;
 67     }
 68 

 69     public void lapse(String name) {
 70         add(name, new Long(System.currentTimeMillis()));
 71     }

 72
 73     public long elapsed(String name) {
 74         Long started = (Long) map.get(name);
 75         if (started != null) {

 76             long time = System.currentTimeMillis() - started.longValue();
 77             return average(name, time);
 78         } else {

 79             return -1;
 80         }
 81     }
 82 

 83     public String[][] getProfileData(String keyPrefix) {
 84         List<String> keys = new ArrayList<String>(map.keySet()); 

 85         Collections.sort(keys);
 86         String[][] data = new String[keys.size()+5][];
 87         Runtime runtime = Runtime.getRuntime();

 88         Date started = (Date) map.get("ProfileStarted");
 89         long elapsed = (System.currentTimeMillis() - started.getTime()) / 1000;
 90         int n = 0;

 91         double systemLoad = ManagementFactory.getOperatingSystemMXBean().getSystemLoadAverage();
 92         data[n++] = new String[] {keyPrefix + "TotalMemoryInMegs", String.valueOf(runtime.totalMemory()/1024/1024)};

 93         data[n++] = new String[] {keyPrefix + "FreeMemoryInMegs", String.valueOf(runtime.freeMemory()/1024/1024)};
 94         data[n++] = new String[] {keyPrefix + "ActiveThreads", String.valueOf(Thread.activeCount())};

 95         data[n++] = new String[] {keyPrefix + "ServerRunningInSecs", String.valueOf(elapsed)};
 96         data[n++] = new String[] {keyPrefix + "SystemLoadAverage", String.valueOf(systemLoad)};

 97
 98         for (String key : keys) {
 99             CacheMap.TimedValue tv = map.getTimedValue(key);
100             data[n++] = new String[] {keyPrefix + key, tv.value + " @" + new Date(tv.time)};

101         }
102         return data;
103     }
104 }
105
106

Where CacheMap is a simple LRU based map, i.e.,

  1 import java.util.ArrayList;

  2 import java.util.Collection;
  3 import java.util.Collections;
  4 import java.util.concurrent.Callable;

  5 import java.util.concurrent.Executors;
  6 import java.util.concurrent.ExecutorService;
  7 import java.util.concurrent.locks.ReentrantLock;

  8 import java.util.HashMap;
  9 import java.util.HashSet;
 10 import java.util.Iterator;

 11 import java.util.LinkedHashMap;
 12 import java.util.List;
 13 import java.util.Map;

 14 import java.util.Set;
 15 import org.apache.log4j.Logger;
 16
 17 

 18
 19 /**
 20  * CacheMap - provides lightweight caching based on LRU size and timeout

 21  * and asynchronous reloads.
 22  *

 23  */
 24 public class CacheMap<K, V> implements Map<K, V> {

 25     private final static Logger log = Logger.getLogger(CacheMap.class);
 26     private final static int MAX_THREADS = 10; // for all cache items across VM

 27 private final static int MAX_ITEMS = 1000; // for all cache items across VM
 28     private final static ExecutorService executorService = Executors.newFixedThreadPool(MAX_THREADS);

 29     private final static boolean lockSync = false;
 30 

 31     public class TimedValue<V> {
 32         public final long time;

 33         public V value;
 34         private boolean updating = false;
 35         TimedValue(V value) {

 36             this.value = value;
 37             this.time = System.currentTimeMillis();
 38         }
 39         public boolean isExpired(long timeoutInSecs) {

 40             long timeDiff = System.currentTimeMillis() - time;
 41             return timeDiff > timeoutInSecs * 1000;
 42         }

 43         public synchronized boolean markUpdating() {
 44             if (!this.updating) {

 45                 this.updating = true;
 46                 return true;
 47             }

 48             return false;
 49         }
 50
 51         @Override

 52         public String toString() {
 53             long timeDiff = System.currentTimeMillis() - time;
 54             return "TimedValue(" + value + ") expiring in " + timeDiff;

 55         }
 56     }
 57
 58
 59     class FixedSizeLruLinkedHashMap<K, V> extends LinkedHashMap<K, V> {

 60         private final int maxSize;
 61
 62         public FixedSizeLruLinkedHashMap(int initialCapacity, float loadFactor, int maxSize) {

 63             super(initialCapacity, loadFactor, true);
 64             this.maxSize = maxSize;
 65         }
 66 

 67         @Override
 68         protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
 69             return size() > maxSize;

 70         }
 71     }
 72
 73     private final Cacheable classCacheable;

 74     private final Map<K, TimedValue<V>> map;
 75     private final Map<Object, ReentrantLock> locks;

 76     private CacheLoader<K, V> defaultCacheLoader;
 77     private final long expireEntriesPeriod;

 78     private long lastexpireEntriesTime;
 79
 80
 81     public CacheMap(Cacheable cacheable) {

 82         this(cacheable, null, 5);
 83     }
 84
 85     public CacheMap(int timeoutInSecs, CacheLoader<K,V> defaultCacheLoader) {

 86         this(0, timeoutInSecs, defaultCacheLoader);
 87     }
 88
 89     public CacheMap(int maxCapacity, int timeoutInSecs, CacheLoader<K,V> defaultCacheLoader) {

 90         this(new CacheableImpl(maxCapacity, timeoutInSecs, true, false), defaultCacheLoader, 5);
 91     }

 92
 93     public CacheMap(Cacheable cacheable, CacheLoader<K,V> defaultCacheLoader, long expireEntriesPeriodInSecs) {
 94         this.classCacheable = cacheable;

 95         this.defaultCacheLoader = defaultCacheLoader;
 96         this.expireEntriesPeriod = expireEntriesPeriodInSecs * 1000;
 97         int maxCapacity = cacheable != null && cacheable.maxCapacity() > 0 && cacheable.maxCapacity() < MAX_ITEMS ? cacheable.maxCapacity() : MAX_ITEMS;

 98         this.map = Collections.synchronizedMap(new FixedSizeLruLinkedHashMap<K, TimedValue<V>>(maxCapacity/ 10, 0.75f, maxCapacity));
 99         this.locks = new HashMap<Object, ReentrantLock>();

100     }
101
102     public void setDefaultCacheLoader(CacheLoader<K,V> defaultCacheLoader) {
103         this.defaultCacheLoader = defaultCacheLoader;

104     }
105
106     public int size() {
107         return map.size();

108     }
109
110     public boolean isEmpty() {
111         return map.isEmpty();

112     }
113
114     public boolean containsKey(Object key) {
115         return map.containsKey(key);

116     }
117
118     public boolean containsValue(Object value) {
119         return map.containsValue(value);

120     }
121
122     public V put(K key, V value) {
123         TimedValue<V> old = map.put(key, new TimedValue<V>(value));

124         if (old != null) {
125             return old.value;
126         } else {

127             return null;
128         }
129     }
130
131     public V remove(Object key) {

132         TimedValue<V> old = map.remove(key);
133         if (old != null) {
134             return old.value;

135         } else {
136             return null;
137         }
138     }

139
140
141     public void putAll(Map<? extends K, ? extends V> m) {

142         for (Entry<? extends K, ? extends V> e : m.entrySet()) {
143             put(e.getKey(), e.getValue());

144         }
145     }
146
147
148     public void clear() {

149         map.clear();
150     }
151
152
153     public Set<K> keySet() {

154         return map.keySet();
155     }
156
157
158     public Collection<V> values() {

159         List<V> list = new ArrayList<V>(map.size());
160         for (TimedValue<V> e : map.values()) {

161             list.add(e.value);
162         }
163         return list;
164     }
165
166     public Set<Map.Entry<K,V>> entrySet() {

167         Set<Map.Entry<K,V>> set = new HashSet<Entry<K,V>>();
168         for (final Map.Entry<K, TimedValue<V>> e : map.entrySet()) {

169             set.add(new Map.Entry<K,V>() {
170                 public K getKey() {
171                     return e.getKey();

172                 }
173                 public V getValue() {
174                     return e.getValue().value;
175                 }
176                 public V setValue(V value) {

177                     TimedValue<V> old = e.getValue();
178                     e.getValue().value = value;
179                     if (old != null) {
180                         return old.value;

181                     } else {
182                         return null;
183                     }
184                 }

185             });
186         }
187         return set;
188     }
189
190     public TimedValue<V> getTimedValue(Object key) {

191         return this.map.get(key);
192     }
193
194     public V get(Object key) {
195         V value = null;

196         if (classCacheable != null && defaultCacheLoader != null) {
197             value = get((K)key, classCacheable, defaultCacheLoader, null);
198         } else {

199             TimedValue<V> item = this.map.get(key);
200             if (item == null) {
201                 value = null;

202             } else {
203                 value = item.value;
204             }
205         }
206         return value;

207     }
208
209     public V get(K key, Cacheable cacheable, CacheLoader<K, V> loader, boolean[] wasCached) {
210         TimedValue<V> item = this.map.get(key);

211         V value = null;
212         ReentrantLock lock = null;
213
214         //expire old entries
215         if (System.currentTimeMillis() - this.lastexpireEntriesTime > this.expireEntriesPeriod) {

216             expireEntries(cacheable);
217         }
218
219
220         try {
221             synchronized(this) {

222                 if (lockSync && cacheable.synchronizeAccess()) {
223                     lock = lock(key);
224                 }
225             }
226             //

227             if (item == null) {
228                 // load initial value
229                 value = reloadSynchronously(key, loader);
230
231             } else {

232
233                 value = item.value;
234
235                 boolean cached = true;
236                 if (cacheable.timeoutInSecs() > 0 && item.isExpired(cacheable.timeoutInSecs())) {

237
238                     if (! cacheable.canReloadAsynchronously()) {
239                         // ---> reload it now - don't used cached value
240                         cached = false;

241                         log.debug("Reloading expired entry synchronously " + key);
242                         value = reloadSynchronously(key, loader);
243                     } else if (item.markUpdating()) {

244                         log.debug("Reloading expired entry asynchronously " + key);
245                         reloadAsynchronously(key, loader);
246                     }
247                 }
248                 if (wasCached != null) {

249                     wasCached[0] = cached;
250                 }
251             }
252         } finally {

253             if (lock != null) {
254                 lock.unlock();
255                 locks.remove(key);
256                 //log.debug("Unlocking " + key);

257             }
258         }
259         return value;
260     }
261
262 

263     @Override
264     public String toString() {
265         return super.toString() + "--" + map;

266     }
267
268
269     private ReentrantLock lock(Object key) {
270         ReentrantLock lock = null;

271         synchronized (locks) {
272             lock = locks.get(key);
273             if (lock == null) {
274                 lock = new ReentrantLock();

275                 locks.put(key, lock);
276             }
277         }
278         //log.debug("Locking " + key);
279         lock.lock();

280         return lock;
281     }
282
283     private V reloadSynchronously(final K key, final CacheLoader<K, V> loader) {

284         try {
285             V value = loader.loadCache(key);
286             put(key, value);
287             //log.info("------reloadSynchronously loaded key " + key + ", cache size: " + this.size() + " -- " + System.identityHashCode(map));

288             return value;
289         } catch (Exception e) {
290             log.error("Failed to load " + key, e);

291             throw new RuntimeException("Failed to load " + key + " for " + classCacheable, e);
292         }

293     }
294
295
296     private void reloadAsynchronously(final K key, final CacheLoader<K,V> loader) {

297         if (log.isDebugEnabled()) {
298             log.debug("requesting reload for "+key.toString()+": "+Thread.currentThread().getName());
299         }
300         executorService.submit(new Callable<V>() {

301             public V call() throws Exception {
302                 if (log.isDebugEnabled()) {
303                     log.debug("reloading for "+key.toString()+": "+Thread.currentThread().getName());

304                 }
305                 return reloadSynchronously(key, loader);
306             }
307         });
308     }

309
310     private void expireEntries(final Cacheable cacheable) {
311         Iterator<TimedValue<V>> it = new ArrayList<TimedValue<V>>(this.map.values()).iterator();

312         while (it.hasNext()) {
313             TimedValue<V> value = it.next();
314             if (cacheable.timeoutInSecs() > 0 && value.isExpired(cacheable.timeoutInSecs())) {

315                 it.remove();
316             }
317         }
318         this.lastexpireEntriesTime = System.currentTimeMillis();
319     }
320 }

321
322

Session Tracking

I added a listener to keep track of active users/sessions (who accessed the system in last 5 mintutes) as opposed to all sessions which may take 30 minutes or more to expire.

 1 public class SessionLogger implements HttpSessionListener, HttpSessionBindingListener {

 2     private transient static Log log = LogFactory.getLog(SessionLogger.class);
 3     private static int sessionCount;

 4     private static final String OBJECTS_IN_SESSION_COUNT = "Objects in Session";
 5     private static final long ACTIVE_THRESHOLD = 5 * 60 * 1000; // 5 minutes

 6     private static Map<HttpSession, HttpSession> activeSessions = Collections.synchronizedMap(new HashMap<HttpSession, HttpSession>());
 7     private static Map<HttpSession, String> allUsers = Collections.synchronizedMap(new HashMap<HttpSession, String>());

 8     private static Map<HttpSession, String> activeUsers = Collections.synchronizedMap(new HashMap<HttpSession, String>());
 9 

10     private static void checkActiveSessions() {
11         long now = System.currentTimeMillis();
12         Iterator<HttpSession> it = activeSessions.keySet().iterator();

13         activeUsers.clear();
14         while (it.hasNext()) {
15             HttpSession session = it.next();
16             if (now-session.getLastAccessedTime() > ACTIVE_THRESHOLD) {

17                 it.remove();
18             } else {
19                 activeUsers.put(session,(String) session.getAttribute(UserUtil.OTB_REAL_USER));
20             }
21         }

22         ProfileDataCollector.getInstance().add("TotalSessions", String.valueOf(sessionCount));
23         ProfileDataCollector.getInstance().add("ActiveSessions", String.valueOf(activeSessions.size()));
24         ProfileDataCollector.getInstance().add("ActiveUsers", activeUsers.values().toString());
25         ProfileDataCollector.getInstance().add("AllUsers", allUsers.values().toString());

26     }
27
28     public void sessionCreated(HttpSessionEvent se) {
29         HttpSession session = se.getSession();
30         synchronized (this) {

31             sessionCount++;
32             activeSessions.put(session, session);
33             activeSessions.put(session, session);
34             checkActiveSessions();
35         }
36         allUsers.put(session, (String) session.getAttribute(UserUtil.OTB_REAL_USER));

37     }
38
39     public void sessionDestroyed(HttpSessionEvent se) {
40         HttpSession session = se.getSession();
41         allUsers.remove(session);

42         synchronized (this) {
43             sessionCount--;
44             activeSessions.remove(session);
45             checkActiveSessions();
46         }

47     }
48
49     public void valueBound(HttpSessionBindingEvent event) {
50         String username = (String) event.getSession().getAttribute(UserUtil.OTB_REAL_USER);
51         Integer old = (Integer) event.getSession().getAttribute(OBJECTS_IN_SESSION_COUNT);

52         if (old == null) {
53             old = new Integer(0);
54         }

55         Integer count = new Integer(old.intValue()+1);
56         event.getSession().setAttribute(OBJECTS_IN_SESSION_COUNT, count);
57         ProfileDataCollector.getInstance().add("TotalSessionValues", String.valueOf(count));
58     }

59
60     public void valueUnbound(HttpSessionBindingEvent event) {
61         String username = (String) event.getSession().getAttribute(UserUtil.OTB_REAL_USER);
62         Integer old = (Integer) event.getSession().getAttribute(OBJECTS_IN_SESSION_COUNT);
63         if (old == null) {

64             old = new Integer(0);
65         }
66         Integer count = new Integer(old.intValue()-1);

67         event.getSession().setAttribute(OBJECTS_IN_SESSION_COUNT, count);
68         ProfileDataCollector.getInstance().add("TotalSessionValues", String.valueOf(count));
69     }
70 }
71

Publishing profiling data

I then added code that needs to be profiled. Though, I have used AspectJ in past for this kind of work, but for now I am just adding this code where needed, e.g.

 1 try {

 2     ProfileDataCollector.getInstance().increment("ActiveSearches");
 3     ProfileDataCollector.getInstance().lapse("SearchLapse");
 4 ...

 5
 6 } finally {
 7      ProfileDataCollector.getInstance()..increment("TotalSearches");
 8      ProfileDataCollector.getInstance()..decrement("ActiveSearches");

 9      ProfileDataCollector.getInstance()..elapsed("SearchLapse");
10 }

Viewing Profile Data

Finally, I added a servlet to return profiling data with AJAX call, e.g.

  1 import java.io.IOException;

  2
  3 import javax.servlet.ServletException;
  4 import javax.servlet.http.HttpServlet;
  5 import javax.servlet.http.HttpServletRequest;

  6 import javax.servlet.http.HttpServletResponse;
  7 public class ProfileServlet extends HttpServlet {

  8     protected void doPost(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
  9         response.setContentType("text/plain");

 10         String format = request.getParameter("format");
 11         if ("line".equals(format)) {
 12             response.getWriter().println(getProfileLine());

 13         } else if ("json".equals(format)) {
 14             response.getWriter().println(getProfileJson());
 15         } else {

 16             response.getWriter().println(getProfileTable());
 17         }
 18     }
 19
 20     protected void doGet(HttpServletRequest req, HttpServletResponse rsp) throws ServletException, IOException {

 21         doPost(req, rsp);
 22     }
 23
 24
 25     public static String[][] getProfileData() {

 26         return ProfileDataCollector.getInstance().getProfileData("");
 27     }
 28
 29     public static String getProfileLine() {

 30         StringBuilder sb = new StringBuilder();
 31         String[][] data = getProfileData();
 32         for (String[] param: data) {

 33             sb.append(param[0] + "=" + param[1]);
 34         }
 35         return sb.toString();

 36     }
 37
 38     public static String getProfileTable() {
 39         StringBuilder sb = new StringBuilder("<table width='100%' border='2'>");

 40         String[][] data = getProfileData();
 41         sb.append("<tr>");
 42         for (String[] param: data) {
 43             sb.append("<th>" + param[0] + "</th>");

 44         }
 45         sb.append("</tr>");
 46         sb.append("<tr>");
 47         for (String[] param: data) {

 48             sb.append("<td>" + param[1] + "</td>");
 49         }
 50         sb.append("</tr>");

 51         sb.append("</table>");
 52         return sb.toString();
 53     }
 54 

 55     public static String getProfileJson() {
 56         StringBuilder sb = newJsonString();
 57         String[][] data = getProfileData();
 58         for (String[] param: data) {

 59             appendJsonString(sb, param[0], param[1]);
 60         }
 61         return endJsonString(sb);

 62     }
 63
 64
 65     protected static StringBuilder newJsonString() {

 66         return new StringBuilder('[');
 67     }
 68 

 69     protected static void appendJsonString(StringBuilder sb, String name, String id) {
 70         if (sb.length() > 1) {

 71             sb.append(',');
 72         }
 73         sb.append("{ name:'").append(name.replace("'", "\'")).append("', id:'").append(id.replace("'", "\'")).append(

 74         "' }");
 75     }
 76
 77     protected static String endJsonString(StringBuilder sb) {

 78         sb.append(']');
 79         return sb.toString();
 80     }
 81 

 82     protected static void addJsonProperty(StringBuilder sb, String name, String value) {
 83         sb.append(""" + name + """);

 84         sb.append(":");
 85         sb.append(""" + value + """);
 86     }

 87
 88     protected static void startJsonObj(StringBuilder sb) {
 89         sb.append("{");

 90
 91     }
 92
 93     protected static void addNewJsonProperty(StringBuilder sb) {

 94         sb.append(",");
 95
 96     }
 97
 98     protected static void endJsonObj(StringBuilder sb) {

 99         sb.append("}");
100
101     }
102 }
103

And then a JSP to show the results using AJAX calls with Prototype library:

       <script src="/otb-static/scriptaculous/prototype.js" type="text/javascript"></script>

        <script>
        var updater = null;
        function initAjax() {
            updater = new Ajax.PeriodicalUpdater('profile_div', '/profile?format=table', {
                method: 'get',
                insertion: Insertion.Top,
                frequency: 15,
                decay: 2
                });
        }
        function startRequest() {
            updater.start();
        }
        function stopRequest() {
            updater.stop();
        }
        </script>
   </head>
    <body onLoad="initAjax(), startRequest()">
    <ul id="profile_div">
    </ul>

Testing

I then wrote a unit test to call various actions of the web application, I am going to use my old pal Grinder to do some real load testing and monitor the health of the server. I am not showing the test here, because it’s very application specific.

February 5, 2008

Does experience matter?

Filed under: Computing — admin @ 1:27 pm

I came across a blog entry of DHH, Years of irrelevance
and and an old blog post of Jeff Atwood on Skill Disparities in Programming. Both of them cover somewhat similar topic, i.e. are programmers with more experience better than programmers with less experience. For example, DHH asserts that after six month of experience with a technology you can be pretty experienced. Similarly, Jeff shows various research findings where person with just two year experience is just as good as person with seven or more years of experience. This topic often comes up with job postings when companies require applicant to have certain number of years of experience. Martin Fowler also spoke recently in his blog entry PreferDesignSkills, where he prefers person with broad experience in design and programming than a specialized person. I have seen my share of discrimination in job market when recruiter is looking for X years of experience or is only looking for person with Websphere and would not consider Weblogic experience. It is even worse when new technology is involved and I faced similar discrimination when transitioning from J2EE to Rails. I totally agree that requiring experience with some technology does not matter because a smart person can easily learn it. This also has been discussed many times by Scott Ambler in Generalists vs Specialists, which I have quoted in my website for many years. Since, there is a considerable amount of difference between productivity and quality between programmers, this is an important question. Based on my twenty years of programming experience with sixteen years doing professionally I agree with these findings. I have seen people with twenty years of experience doing same job in big companies with no desire to learn anything else. I have learned it’s incredibly important to diversify your skills and be a generalist than specialist. It is the only way a programmer with more years of experience can be more valuable than programmer with fewer years of experience. When hiring I would look for a person with broad skills who has track record of getting things done and has passion to learn new things. As Joel Spolsky often says when hiring look for smart people who get the things done. There are always new things coming and a good programmer will find a way to learn that in short amount of time as DHH mentioned. I have worked as software developer and systems engineer/administrator and that helped my understanding of overall systems. I have used various languages over the years including Basic, FORTRAN, COBOL, C, C++, Java, PERL, Python, Ruby, Erlang, Haskell. As Dave Thomas talks about learning new language to think differently, and it can show you finding new solutions.

Besides being generalist, another way experience matters is with design. Though with small system, productivity and quality of experienced and junior programmer may seem similar, but I have found with larger systems quality does show up. (Note, I am only considering smart programmers with varying number of years here and not considering really bad programmers.) I have observed that design of experienced programmer (more than five years of experience) will be more flexible because he/she would have likely worked on similar problem before (which may also give productivity advantage). I have also found junior programmers struggle with roles and responsibilities (Rebecca J Wirfs) and good object oriented design (or domain driven design). I have observed that senior programmers have better understanding of things like separation of concerns, modularization, encapsulation, loose coupling and scalability.

In nutshell, any good programmer can learn a new technology very quickly and can solve any problem, but I believe experienced programmer with more general experience will be more valuable than junior programmer and quality of design of a senior programmer with broad experience will be much better in terms of good design principles and -ilities. I think, design is something you learn over the years because for each design decision you may be using thousands of small lessons you have learned over the years. As far as development teams are concerned, I like to have one really experienced programmer and others with junior to mid level experience. This gives a good apprenticeship environment for junior people to learn as software development is still an art. Finally, as everything in software, there aren’t hard rules and everything depends on the environment and people.

December 23, 2007

Released ErlSDB 0.1

Filed under: Erlang,SimpleDB,Web Services — admin @ 7:09 pm

I started working on an Erlang library to access Amazon’s SimpleDB web service and I released an early version of the library this weekend. Here are some notes on its usage:
Installing

svn checkout http://erlsdb.googlecode.com/svn/trunk/ erlsdb-read-only

Building

make

Testing

edit Makefile and add access key and secret key, then type make test

Usage

Take a look at test/erlsdb_test.erl to learn usage, here is a sample code

Starting Server

erlsdb:start(type,
    	[#sdb_state{
		access_key = "YourAccessKey",
		secret_key = "YourSecretKey",
		domain = "YourDomain"
		}
	])

Creating Domain

    erlsdb:create_domain()

Note that the server will use the domain that was passed during initialization.

Listing all Domains

    {ok, List, _} = erlsdb:list_domains()

Deleting Domain

    erlsdb:delete_domain()

Adding an item

    Attributes = lists:sort([
	["StreetAddress", "705 5th Ave"],
        ["City", "Seattle"],
        ["State", "WA"],
        ["Zip", "98101"]
	]),
    erlsdb:put_attributes("TccAddr", Attributes)

Retrieving an item

    {ok, UnsortedAttrs} = erlsdb:get_attributes("TccAddr")

Deleting an item

    erlsdb:delete_attributes("TccAddr"),

December 16, 2007

Accessing SimpleDB using Java and Erlang

Filed under: Computing — admin @ 10:05 am

Amazon has recently announced limited beta availability of a new web service SimpleDB, which is somewhat similar to Amazon’s Dynamo service can store key-value based data. However, Amazon’s Dynamo service is not publicly available. The SimpleDB web service provides both REST and SOAP based APIs, though the REST APIs are really XML over HTTP and use simple query and RPC style parameters to create or retrieve data. You can use either GET or POST to submit HTTP requests. A key concept in SimpleDB are Domains which are comparable to the buckets in S3 and provides naming scope. Similar to S3, you are limited to the number of domains you can create. Inside domain, you can create rows of key/value pairs. The collection of key/value pair for a single row is called Item. When you save an item, you give it a name. Unlike a row in relational database, you can save items with varying number of of key/value pairs to the same domain. It is suggested that you use that name to retrieve the item. Though, SimpleDB also supports simple query language, but it only works if the query takes less than five seconds. SimpleDB is not meant to be used in a relational style, rather you are encouraged to store all related data you need with the item. However, there is limit on number of key/values you can save in an item (~255) , size of key/value (~1024) or total size of data in a single domain (~10 GB). All keys and values must be String. This also means in order for queries to work, the data must be stored in lexicographical format. Another big difference with the relational database is that SimpleDB is designed for high availability and it guarantees eventual consistency. Also, it provides transactions only at a single key/value pair. Thus, it is not suitable for applications that require higher level of consistency or transactions (See CAP principle or BASE transactions). On the other hand due to high availability, you can built highly scalable applications. Since the data is replicated across multiple servers, you may also get performance benefit by using multiple threads when reading a number of items concurrently.

Following are the core APIs that provide basic CRUD operations:

  • Create/Delete/List domains
  • Save attributes
  • Update attributes that replace previously stored attributes
  • Read attributes using Item name with option to query using attribute key/value pairs
  • Delete attributes

I have been using SimpleDB for a few months for some internal applications at Amazon. Here I am showing a Java and Erlang client code to access SimpleDB. Since, the service is actually not available for public yet so you may not be able to try it.

Java Client

  1 import java.io.BufferedReader;
  2 import java.io.IOException;
  3 import java.io.InputStreamReader;
  4 import java.io.UnsupportedEncodingException;
  5 import java.net.HttpURLConnection;
  6 import java.net.MalformedURLException;
  7 import java.net.URL;
  8 import java.net.URLEncoder;
  9 import java.security.InvalidKeyException;
 10 import java.security.NoSuchAlgorithmException;
 11 import java.text.Collator;
 12 import java.text.FieldPosition;
 13 import java.text.NumberFormat;
 14 import java.text.SimpleDateFormat;
 15 import java.util.ArrayList;
 16 import java.util.Date;
 17 import java.util.HashMap;
 18 import java.util.List;
 19 import java.util.Map;
 20 import java.util.TreeMap;
 21
 22 import javax.crypto.Mac;
 23 import javax.crypto.spec.SecretKeySpec;
 24 import javax.xml.parsers.DocumentBuilder;
 25 import javax.xml.parsers.DocumentBuilderFactory;
 26 import javax.xml.parsers.ParserConfigurationException;
 27
 28 import org.w3c.dom.Document;
 29 import org.w3c.dom.Element;
 30 import org.w3c.dom.Node;
 31 import org.w3c.dom.NodeList;
 32 import org.xml.sax.InputSource;
 33 import org.xml.sax.SAXException;
 34
 35 /**
 36  * This class provides business delegate to access SDB web service using REST approach.
 37  *
 38  */
 39 public class SdbDelegate {
 40     private static final String DEFAULT_ENCODING = "UTF-8";
 41     private static final String HMAC_SHA1_ALGORITHM = "HmacSHA1";
 42     private static final String MORETOKEN_STRING = "MoreToken";
 43
 44     private final String accessId;
 45     private final String secretKey;
 46     private final String uri;
 47     private final String version;
 48
 49     public static class Response {
 50         public final Document rootDocument;
 51         public final Element rootElement;
 52         public final String requestId;
 53         public final String boxUsage;
 54         public final String moreToken;
 55         public final Object result;
 56         public Response(Document rootDocument, Element rootElement, String requestId, String boxUsage, String moreToken, Object result) {
 57             this.rootDocument = rootDocument;
 58             this.rootElement = rootElement;
 59             this.requestId = requestId;
 60             this.boxUsage = boxUsage;
 61             this.moreToken = moreToken;
 62             this.result = result;
 63         }
 64     }
 65
 66     public SdbDelegate() {
 67         this("yourkey", "yoursecret", "http://sdb.amazonaws.com", "2007-11-07");
 68     }
 69
 70
 71     public SdbDelegate(String accessId, String secretKey, String uri, String version) {
 72         this.accessId = accessId;
 73         this.secretKey = secretKey;
 74         this.uri = uri;
 75         this.version = version;
 76     }
 77
 78
 79     public void delete(String bucket, String name) {
 80         deleteAttributes(bucket, name);
 81     }
 82
 83     public void createDomain(String domainName) {
 84         Map<String,String> parameters = getBaseParameters("CreateDomain");
 85         parameters.put("DomainName",encode(domainName));
 86         invoke(parameters);
 87     }
 88
 89     public void deleteDomain(String domainName) {
 90         Map<String,String> parameters = getBaseParameters("DeleteDomain");
 91         parameters.put("DomainName",encode(domainName));
 92         invoke(parameters);
 93     }
 94
 95     public List<String> listDomains() {
 96         return listDomains(null);
 97     }
 98
 99     public List<String> listDomains(StringBuilder moreToken) {
100         return (List<String>) listDomains(moreToken, null);
101     }
102
103     public List<String> listDomains(StringBuilder moreToken, Integer maxResults) {
104         Map<String,String> parameters = getBaseParameters("ListDomains");
105         if (moreToken != null && moreToken.length() > 0) {
106             parameters.put(MORETOKEN_STRING, moreToken.toString());
107         }
108         if (maxResults != null) {
109             parameters.put("MaxResults",maxResults.toString());
110         }
111         Response response = invoke(parameters);
112         if (moreToken != null) {
113             moreToken.setLength(0);
114             moreToken.append(response.moreToken);
115         }
116
117         Element rootElement = (Element) response.rootElement;
118         NodeList domainNodes = rootElement.getElementsByTagName("DomainName");
119         List<String> domains = new ArrayList<String>();
120         for (int i = 0; i < domainNodes.getLength(); ++i) {
121             Node domainNode = domainNodes.item(i);
122             Node domainNameNode = domainNode.getFirstChild();
123             String domainName = domainNameNode.getNodeValue();
124             domains.add(domainName);
125         }
126
127         //
128         NodeList moreTokenNodes = rootElement.getElementsByTagName(MORETOKEN_STRING);
129         if (moreTokenNodes.getLength() > 0) {
130             Element moreTokenElement = (Element) moreTokenNodes.item(0);
131             moreTokenElement.normalize();
132             Node tokenNode = moreTokenElement.getFirstChild();
133             String newMoreToken = tokenNode.getNodeValue();
134             if (moreToken != null && (newMoreToken.compareToIgnoreCase("null") == 0 || newMoreToken.length() == 0)) {
135                 moreToken.setLength(0);
136             }
137         }
138         return domains;
139     }
140
141
142     public void deleteAttributes(String domain, String identifier) {
143         deleteAttributes(domain, identifier, new HashMap<String, String>());
144     }
145
146     /**
147      * Delete specified atrributes and/or values from item
148      *
149      * @param   attributes      set of attributes
150      * @return                  result of the operation
151      */
152     public void deleteAttributes(String domain, String identifier, Map<String, String> attributes) {
153         Map<String,String> parameters = getBaseParameters("DeleteAttributes");
154         parameters.put("ItemName",identifier);
155         parameters.put("DomainName",domain);
156         if (attributes != null) {
157             int i = 0;
158             for (Map.Entry<String, String> attr : attributes.entrySet()) {
159                 String name = "Attribute."+ i + ".Name";
160                 parameters.put(name,attr.getKey());
161                 if (attr.getValue() != null) {
162                     String value = "Attribute."+ i + ".Value";
163                     parameters.put(value, attr.getValue());
164                 }
165                 ++i;
166             }
167         }
168         invoke(parameters);
169     }
170
171
172     public void putAttributes(String domain, String identifier, Map<String, String> attributes) {
173         putAttributes(domain, identifier, attributes, true);
174     }
175
176
177     public void putAttributes(String domain, String identifier, Map<String, String> attributes, boolean replace) {
178         Map<String,String> parameters = getBaseParameters("PutAttributes");
179         parameters.put("ItemName", identifier);
180         parameters.put("DomainName", domain);
181         int i = 0;
182         if (attributes.size() > 255) throw new IllegalArgumentException("more than 255 attributes not supported");
183         System.out.println("Writing " + attributes.size() + " attributes for " + identifier);
184         for (Map.Entry<String, String> attr : attributes.entrySet()) {
185             if (attr.getKey().length() > 700) throw new IllegalArgumentException("name '" + attr.getKey() + "' more than 700 bytes");
186             if (attr.getValue().length() > 700) throw new IllegalArgumentException("value '" + attr.getValue() + "' of name '" + attr.getKey() + "' more than 700 bytes");
187             String name = "Attribute."+ i + ".Name";
188             parameters.put(name,attr.getKey());
189             String value = "Attribute."+ i + ".Value";
190             parameters.put(value, attr.getValue());
191             i++;
192         }
193         //
194         if (replace) {
195             parameters.put("Replace","true");
196         }
197         invoke(parameters);
198    }
199
200     public void replaceAttributes(String domain, String identifier, Map<String, String> attributes) {
201          putAttributes(domain, identifier, attributes,true);
202     }
203
204
205     public Map<String, String> getAttributes(String domain, String identifier){
206         return getAttributes(domain, identifier, new HashMap<String, String>());
207     }
208
209     public Map<String, String> getAttributes(String domain, String identifier, Map<String, String> attributes) {
210         String NAME_STRING = "Name";
211         String VALUE_STRING = "Value";
212         String ATTRIBUTE_STRING = "Attribute";
213
214         Map<String,String> parameters =  getBaseParameters("GetAttributes");
215         parameters.put("ItemName", identifier);
216         parameters.put("DomainName",domain);
217
218         int k = 0;
219         for (Map.Entry<String, String> attr : attributes.entrySet()) {
220             String name = "AttributeName."+k;
221             parameters.put(name,attr.getKey());
222             ++k;
223         }
224
225         //
226         Response response =  invoke(parameters);
227         Element rootElement = (Element) response.rootElement;
228         NodeList attributeNodes = rootElement.getElementsByTagName(ATTRIBUTE_STRING);
229         Map<String, String> returnAttributes = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER);
230         //
231         for (int i = 0; i < attributeNodes.getLength(); ++i) {
232             Node attributeNode = attributeNodes.item(i);
233
234             Element attributeElement = (Element)attributeNode;
235
236             NodeList nameNodes = attributeElement.getElementsByTagName(NAME_STRING);
237             Node nameNode = nameNodes.item(0);
238             nameNode.normalize();
239             Node attributeNameNode = nameNode.getFirstChild();
240             String attributeName = attributeNameNode.getNodeValue();
241             NodeList valueNodes = attributeElement.getElementsByTagName(VALUE_STRING);
242             Node valueNode = valueNodes.item(0);
243             valueNode.normalize();
244             Node attributeValueNode = valueNode.getFirstChild();
245             String attributeValue = attributeValueNode.getNodeValue();
246             returnAttributes.put(attributeName,attributeValue);
247         }
248         return returnAttributes;
249     }
250
251
252     protected Response invoke(Map<String,String> parameters) {
253         try {
254             Document document = getResponse(parameters);
255             Element rootElement = document.getDocumentElement();
256             Element responseStatus = (Element) rootElement.getElementsByTagName("ResponseStatus").item(0);
257             Node requestIdNode = responseStatus.getElementsByTagName("RequestId").item(0);
258             String requestId = null;
259             if (requestIdNode != null && requestIdNode.getFirstChild() != null) {
260                 requestId = requestIdNode.getFirstChild().getNodeValue();
261             }
262             String boxUsage = null;
263             Node boxUsagesNode = responseStatus.getElementsByTagName("BoxUsage").item(0);
264             if (boxUsagesNode != null && boxUsagesNode.getFirstChild() != null) {
265                 boxUsage = boxUsagesNode.getFirstChild().getNodeValue();
266             }
267             return new Response(document, rootElement, requestId, boxUsage, null, null);
268         } catch (RuntimeException e) {
269             throw e;
270         } catch (Exception e) {
271             throw new RuntimeException(e);
272         }
273     }
274
275
276     private static String getSignature(String data, String key) throws NoSuchAlgorithmException, InvalidKeyException {
277         SecretKeySpec signingKey = new SecretKeySpec(key.getBytes(),HMAC_SHA1_ALGORITHM);
278         Mac mac = Mac.getInstance(HMAC_SHA1_ALGORITHM);
279         mac.init(signingKey);
280         byte[] rawHmac = mac.doFinal(data.getBytes());
281         return new sun.misc.BASE64Encoder().encode(rawHmac);
282     }
283
284     private Map<String,String> getBaseParameters(String action) {
285         Collator myCollator = Collator.getInstance();
286         // Note that these parameters must be sent in order
287         Map<String,String> baseParameters = new TreeMap<String,String>(myCollator);
288         baseParameters.put("Action", action);
289         baseParameters.put("Version", version);
290         baseParameters.put("AWSAccessKeyId", accessId);
291         baseParameters.put("SignatureVersion", "1");
292
293         Date curTime = new Date();
294         SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ");
295         StringBuffer timeBuffer = new StringBuffer();
296         FieldPosition fp = new FieldPosition(NumberFormat.INTEGER_FIELD);
297         dateFormat.format(curTime, timeBuffer, fp);
298         timeBuffer.insert(timeBuffer.length()-2, ":");
299
300         baseParameters.put("Timestamp",timeBuffer.toString());
301         return baseParameters;
302     }
303
304     private Document getResponse(Map<String,String> parameters) throws UnsupportedEncodingException, MalformedURLException, IOException, NoSuchAlgorithmException, SAXException, ParserConfigurationException, InvalidKeyException {
305         String path = uri + getUrlPath(parameters);
306         URL url = new URL(path);
307         HttpURLConnection connection = (HttpURLConnection) url.openConnection();
308         connection.setDoOutput(true);
309         connection.setRequestMethod("GET");
310         connection.connect();
311
312         System.out.println("Connecting to " + url);
313         int respCode = connection.getResponseCode();
314
315         if (respCode == HttpURLConnection.HTTP_OK) {
316             InputSource inputSource = new InputSource(new BufferedReader(new InputStreamReader(connection.getInputStream())));
317             DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
318             DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
319             return documentBuilder.parse(inputSource);
320         } else {
321             InputSource inputSource = new InputSource(new BufferedReader(new InputStreamReader(connection.getErrorStream())));
322             DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance();
323             DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder();
324             Document document = documentBuilder.parse(inputSource);
325             Element rootElement = document.getDocumentElement();
326             Node requestIdNode = rootElement.getElementsByTagName("RequestID").item(0);
327             StringBuilder errorMessage = new StringBuilder("Failed to invoke input request '" + path + "'");
328
329             //
330             if (requestIdNode != null && requestIdNode.getFirstChild() != null) {
331                 errorMessage.append(" with requestId " + requestIdNode.getFirstChild().getNodeValue());
332                 Element errors = (Element) rootElement.getElementsByTagName("Errors").item(0);
333                 if (errors != null) {
334                     Element error = (Element) errors.getElementsByTagName("Error").item(0);
335                     if (error != null) {
336                         Node messageNode = error.getElementsByTagName("Message").item(0);
337                         if (messageNode != null) errorMessage.append(" due to " + messageNode.getFirstChild().getNodeValue());
338                     }
339                 }
340             }
341             throw new RuntimeException(errorMessage.toString());
342         }
343     }
344
345
346    private String encode(String name){
347         try {
348              return URLEncoder.encode(name, DEFAULT_ENCODING);
349         } catch (UnsupportedEncodingException ex) {
350             throw new RuntimeException("Failed to encode '" + name + "'");
351         }
352     }
353
354
355     private String getUrlPath(Map<String,String> parameters) throws UnsupportedEncodingException, NoSuchAlgorithmException, InvalidKeyException {
356         StringBuilder queryString = new StringBuilder("?");
357         StringBuilder signatureData = new StringBuilder();
358         boolean first = true;
359         for (Map.Entry<String, String> param : parameters.entrySet()) {
360             if (first) {
361                 first = false;
362             } else {
363                 queryString.append("&");
364             }
365
366             signatureData.append(param.getKey() + param.getValue());
367             queryString.append(param.getKey() + "=" + URLEncoder.encode((String)param.getValue(), DEFAULT_ENCODING));
368         }
369
370         queryString.append("&Signature=" + URLEncoder.encode(getSignature(signatureData.toString(),secretKey), DEFAULT_ENCODING));
371         return queryString.toString();
372     }
373 }
374

Java Client Test

 1 import junit.framework.TestCase;
 2 import junit.framework.Test;
 3 import junit.framework.TestSuite;
 4
 5 import java.util.List;
 6 import java.util.Map;
 7 import java.util.HashMap;
 8
 9 public class SdbDelegateTest extends TestCase {
10     private SdbDelegate fixture;
11     private String domainName;
12
13     public void setUp() {
14         this.fixture = new SdbDelegate();
15         this.domainName = "TestDomain";
16     }
17
18     public void testCreateDomain() {
19         fixture.createDomain(domainName);
20         assertTrue(fixture.listDomains().contains(domainName));
21     }
22
23     public void testDeleteDomain() {
24         fixture.deleteDomain(domainName);
25         assertFalse(fixture.listDomains().contains(domainName));
26     }
27
28
29     public void testGetPutAttributes() {
30         fixture.createDomain(domainName);
31         Map<String, String> attributes = new HashMap<String, String>();
32         attributes.put("StreetAddress", "705 5th Ave");
33         attributes.put("City", "Seattle");
34         attributes.put("State", "WA");
35         attributes.put("Zip", "98101");
36         fixture.putAttributes(domainName, "TCC", attributes, true);
37         Map<String, String> attributes2 = fixture.getAttributes(domainName, "TCC");
38         System.out.println("Retrieved " + attributes2);
39         assertEquals(attributes, attributes2);
40     }
41
42     public static Test suite() {
43         TestSuite suite = new TestSuite();
44         suite.addTestSuite(SdbDelegateTest.class);
45         return suite;
46     }
47
48     public static void main(String[] args) {
49         junit.textui.TestRunner.run(suite());
50     }
51 }
52

Erlang Client

  1 %%%-------------------------------------------------------------------
  2 %%% @author Shahzad Bhatti <bhatti@plexobject.com>
  3 %%% @doc
  4 %%%   Business delegate to access SimpleDB
  5 %%% @end
  6 %%%-------------------------------------------------------------------
  7 -module(sdb_delegate).
  8 -include_lib("xmerl/include/xmerl.hrl").
  9
 10
 11 %%%-------------------------------------------------------------------
 12 %%% Public APIs
 13 %%%-------------------------------------------------------------------
 14 -export([list_domains/0, list_domains/1, list_domains/2, delete_domain/1, get_attributes/2, get_attributes/3, create_domain/1, put_attributes/3, put_attributes/4, delete_item/2, delete_attributes/2, delete_attributes/3, replace_attributes/3]).
 15
 16
 17 %%%-------------------------------------------------------------------
 18 %%% Test Methods
 19 %%%-------------------------------------------------------------------
 20 -export([test/0]).
 21
 22
 23 %%%-------------------------------------------------------------------
 24 %%% Configuration Functions %%%
 25 %%%-------------------------------------------------------------------
 26 uri() ->
 27    "http://sdb.amazonaws.com?".
 28
 29
 30 version() ->
 31    "2007-11-07".
 32
 33 access_key() ->
 34    "yourkey".
 35
 36 secret_key() ->
 37     "secretkey".
 38
 39
 40
 41 %%%-------------------------------------------------------------------
 42 %%% Public APIs to access SimpleDB 
 43 %%%-------------------------------------------------------------------
 44 start() ->
 45     crypto:start(),
 46     inets:start().
 47
 48 stop() ->
 49     init:stop().
 50
 51
 52 list_domains() ->
 53     list_domains(nil, nil).
 54 list_domains(MoreToken) ->
 55     list_domains(MoreToken, nil).
 56
 57 list_domains(MoreToken, MaxResults) ->
 58     Base = base_parameters("ListDomains"),
 59     Base1 = if MoreToken == nil -> Base; true -> Base ++ [["MoreToken", MoreToken]] end,
 60     Base2 = if MaxResults == nil -> Base1; true -> Base1 ++ [["MaxResults", MaxResults]] end,
 61     Xml = rest_request(Base2),
 62     xml_values(xmerl_xpath:string("//DomainName/text()", Xml)).
 63
 64
 65 create_domain(Domain) ->
 66     Base = [["DomainName", url_encode(Domain)]|
 67                 base_parameters("CreateDomain")],
 68     rest_request(Base).
 69
 70 delete_domain(Domain) ->
 71     Base = [["DomainName", url_encode(Domain)]|
 72                 base_parameters("DeleteDomain")],
 73     rest_request(Base).
 74
 75 replace_attributes(Domain, ItemName, Attributes) ->
 76     put_attributes(Domain, ItemName, Attributes, true).
 77 put_attributes(Domain, ItemName, Attributes) ->
 78     put_attributes(Domain, ItemName, Attributes, false).
 79 put_attributes(Domain, ItemName, Attributes, Replace) ->
 80     Base = [["DomainName", url_encode(Domain)],
 81             ["ItemName", url_encode(ItemName)]|
 82                 base_parameters("PutAttributes")] ++
 83                 encode_attributes(Attributes),
 84     Base1 = if Replace == false -> Base; true -> Base ++ [["Replace", "true"]] end,
 85     rest_request(Base1).
 86
 87
 88 delete_item(Domain, ItemName) ->
 89     delete_attributes(Domain, ItemName).
 90
 91
 92 delete_attributes(Domain, ItemName) ->
 93     delete_attributes(Domain, ItemName, nil).
 94 delete_attributes(Domain, ItemName, AttributeNames) ->
 95     Base = [["DomainName", url_encode(Domain)],
 96             ["ItemName", url_encode(ItemName)]|
 97                 base_parameters("DeleteAttributes")] ++
 98                 encode_attribute_names(AttributeNames),
 99     rest_request(Base).
100
101 get_attributes(Domain, ItemName) ->
102     get_attributes(Domain, ItemName, nil).
103 get_attributes(Domain, ItemName, AttributeNames) ->
104     Base = [["DomainName", url_encode(Domain)],
105             ["ItemName", url_encode(ItemName)]|
106                 base_parameters("GetAttributes")] ++
107                 encode_attribute_names(AttributeNames),
108     Xml = rest_request(Base),
109     xml_names_values(xmerl_xpath:string("//Attribute", Xml)).
110
111
112 %%%-------------------------------------------------------------------
113 %%% Private Functions
114 %%%-------------------------------------------------------------------
115
116 rest_request(Params) ->
117     Url = uri() ++ query_string(Params),
118     Response = http:request(Url),
119     case Response of
120         {ok, {{_HttpVersion, StatusCode, _ErrorMessage}, _Headers, Body }} ->
121             error_logger:info_msg("URL ~p Status ~p~n", [Url, StatusCode]),
122             {Xml, _Rest} = xmerl_scan:string(Body),
123             %%%error_logger:info_msg("Xml ~p~n", [Xml]),
124             case StatusCode of
125                 200 ->
126                     Xml;
127                 _ ->
128                    Error = xml_values(xmerl_xpath:string("//Message/text()", Xml)),
129                    throw({Error})
130             end;
131         {error, Message} ->
132             case Message of
133                 timeout ->
134                     io:format("URL ~p Timedout, retrying~n", [Url]),
135                     sleep(1000),
136                     rest_request(Params);
137                 true ->
138                     throw({Message})
139             end
140     end.
141
142
143 query_string(Params) ->
144     Params1 = lists:sort(
145         fun([Elem1, _], [Elem2, _]) ->
146             string:to_lower(Elem1) > string:to_lower(Elem2) end,
147         Params),
148     {QueryStr, SignatureData} =
149         lists:foldr(fun query_string/2, {"", ""}, Params1),
150     QueryStr ++ "Signature=" ++ url_encode(signature(SignatureData)).
151
152
153 query_string([Key, Value], {QueryStr, SignatureData}) ->
154     QueryStr1 = QueryStr ++ Key ++ "=" ++ url_encode(Value) ++ "&",
155     SignatureData1 = SignatureData ++ Key ++ Value,
156     {QueryStr1, SignatureData1}.
157
158
159 encode_attributes(Attributes) when Attributes == nil ->
160     [];
161 encode_attributes(Attributes) ->
162     {Encoded, _} = lists:foldr(fun encode_attributes/2, {[], 0}, Attributes),
163     Encoded.
164
165 encode_attributes([Key, Value], {Encoded, I}) ->
166     KeyName = "Attribute." ++ integer_to_list(I) ++ ".Name",
167     KeyValue = "Attribute." ++ integer_to_list(I) ++ ".Value",
168     {[[KeyName, Key], [KeyValue, Value]|Encoded], I+1}.
169
170
171 encode_attribute_names(Attributes) when Attributes == nil ->
172     [];
173 encode_attribute_names(Attributes) ->
174     {Encoded, _} = lists:foldr(fun encode_attribute_names/2, {[], 0}, Attributes),
175     Encoded.
176
177 encode_attribute_names(Key, {Encoded, I}) ->
178     KeyName = "Attribute." ++ integer_to_list(I) ++ ".Name",
179     {[[KeyName, Key]|Encoded], I+1}.
180
181
182
183
184 %%%
185 % Converts a number into 2-digit 
186 %%%
187 two_digit(X) when is_integer(X), X >= 10 ->
188     integer_to_list(X);
189 two_digit(X) when is_integer(X), X < 10 ->
190     "0" ++ integer_to_list(X).
191
192 abs_two_digit(X) when X < 0 ->
193     two_digit(0-X);
194 abs_two_digit(X) when X >= 0 ->
195     two_digit(X).
196
197 %%%
198 % Returns Coordinated Universal Time (Greenwich Mean Time) time zone,
199 %%%
200 timestamp() ->
201     {{_, _, _}, {_LocalHour, _LocalMin, _}} = LocalDateTime = calendar:local_time(),
202     [{{Year, Month, Day}, {Hour, Min, Sec}}] =
203         calendar:local_time_to_universal_time_dst(LocalDateTime),
204     Z = gmt_difference(),
205     integer_to_list(Year) ++ "-" ++ two_digit(Month) ++ "-" ++ two_digit(Day)
206         ++ "T" ++ two_digit(Hour) ++ ":" ++ two_digit(Min) ++ ":" ++
207         two_digit(Sec) ++ Z.
208
209
210 gmt_difference() ->
211     "-08:00".
212
213
214
215 %%%
216 % Returns HMAC encoded access key
217 %%%
218 signature(Data) ->
219     hmac(secret_key(), Data).
220
221 hmac(SecretKey, Data) ->
222     http_base_64:encode(
223           binary_to_list(crypto:sha_mac(SecretKey, Data))).
224
225
226 base_parameters(Action) ->
227     [["Action", Action],
228      ["AWSAccessKeyId", access_key()],
229      ["Version", version()],
230      ["SignatureVersion", "1"],
231      ["Timestamp", timestamp()]].
232
233
234
235 %%%
236 % This method retrieves node value from the XML records that are returned
237 % after scanning tags.
238 %%%
239 xml_values(List) ->
240     lists:foldr(fun xml_values/2, [], List).
241
242 xml_values(#xmlText{value=Value}, List) ->
243     [Value|List].
244
245
246 xml_names_values(List) ->
247     lists:foldr(fun xml_names_values/2, [], List).
248
249 xml_names_values(Xml, List) ->
250     [ #xmlText{value=Name} ]  = xmerl_xpath:string("//Name/text()", Xml),
251     [ #xmlText{value=Value} ]  = xmerl_xpath:string("//Value/text()", Xml),
252     [[Name, Value]|List].
253
254
255 sleep(T) ->
256     receive
257     after T ->
258        true
259     end.
260
261
262 %%%
263 % URL encode - borrowed from CouchDB
264 %%%
265 url_encode([H|T]) ->
266     if
267         H >= $a, $z >= H ->
268             [H|url_encode(T)];
269         H >= $A, $Z >= H ->
270             [H|url_encode(T)];
271         H >= $0, $9 >= H ->
272             [H|url_encode(T)];
273         H == $_; H == $.; H == $-; H == $: ->
274             [H|url_encode(T)];
275         true ->
276             case lists:flatten(io_lib:format("~.16.0B", [H])) of
277                 [X, Y] ->
278                     [$%, X, Y | url_encode(T)];
279                 [X] ->
280                     [$%, $0, X | url_encode(T)]
281             end
282     end;
283 url_encode([]) ->
284     [].
285
286
287 %%%-------------------------------------------------------------------
288 %%% Test Functions
289 %%%-------------------------------------------------------------------
290
291 test_list_domains() ->
292     list_domains().
293
294 test_create_domain() ->
295     create_domain("TestDomain"),
296     ["TestDomain"] = lists:filter(
297                         fun(Elem) -> Elem == "TestDomain" end,
298                         list_domains()).
299
300
301 test_delete_domain() ->
302     delete_domain("TestDomain"),
303     [] = lists:filter(
304                         fun(Elem) -> Elem == "TestDomain" end,
305                         list_domains()).
306
307 test_put_get_attributes() ->
308     create_domain("TestDomain"),
309     Attributes = lists:sort([
310         ["StreetAddress", "705 5th Ave"],
311         ["City", "Seattle"],
312         ["State", "WA"],
313         ["Zip", "98101"]
314         ]),
315     put_attributes("TestDomain", "TCC", Attributes),
316     Attributes = lists:sort(get_attributes("TestDomain", "TCC")).
317
318 test_put_delete_attributes() ->
319     create_domain("TestDomain"),
320     Attributes = lists:sort([
321         ["StreetAddress", "705 5th Ave"],
322         ["City", "Seattle"],
323         ["State", "WA"],
324         ["Zip", "98101"]
325         ]),
326     put_attributes("TestDomain", "TCC", Attributes),
327     delete_attributes("TestDomain", "TCC"),
328     sleep(1000),        %% Let it sync
329     [] = get_attributes("TestDomain", "TCC").
330
331 test() ->
332     start(),
333     test_create_domain(),
334     test_delete_domain(),
335     test_put_get_attributes(),
336     test_put_delete_attributes(),
337     stop().
338

Note that in Erlang, I am hard coding time-zone difference, I wish Erlang had better support for time-zones, and URL encoding for UTF8.

Another word of caution, in order to use this code for real applications, you will need better handle on error processing especially because web services may timeout and you will need to retry.

You can download the code from here:

http://weblog.plexobject.com/SdbDelegate.java

http://weblog.plexobject.com/SdbDelegateTest.java

http://weblog.plexobject.com/sdb_delegate.erl

I am also adding the Erlang code to http://code.google.com/p/erlsdb/.

Further References:

http://awsdocs.s3.amazonaws.com/SDB/2007-11-07/sdb-dg-2007-11-07.pdf?

http://docs.amazonwebservices.com/AmazonSimpleDB/2007-11-07/DeveloperGuide/?

http://aws.amazon.com/simpledb

http://www.informationweek.com/news/showArticle.jhtml?articleID=204803008

You can send me comments or suggestions to improve this code at bhatti AT plexobject DOT com.

Update (Dec 17, 2007): I saw some comments on how unRESTful SimpleDB’s REST APIs are and I agree. Here is my attempt to RESTify them:

Create Domain
PUT http://sdb.amazonaws.com/DomainName

I am using PUT because I am defining the URL and per REST specs I should use PUT instead of POST. Also, I am relying on HTTP headers for authentication information.

Delete Domain
DELETE http://sdb.amazonaws.com/DomainName

List Domains
GET http://sdb.amazonaws.com?MaxDomains=XX&NextToken=YY
I am specifying parameters explicitly rather than using changing URI

Put Attributes
POST http://sdb.amazonaws.com/DomainName/ItemName
—POST BODY-BEGIN—
attributeName=attributeValue&…
—POST BODY-END—

I am specifying attributes as part of the POST body

Delete Attributes
DELETE http://sdb.amazonaws.com/DomainName/ItemName?attributeName1&…

Get Attributes
GET http://sdb.amazonaws.com/DomainName/ItemName?attributeName1&…

Query Items
GET http://sdb.amazonaws.com/DomainName/ItemName?queryExpression=…

Note that these URLs are based on Sam Ruby’s book on REST APIs. Also, Rails uses similar URLs for RESTful controllers.

December 4, 2007

Starting cluster of Erlang nodes on EC2

Filed under: Computing — admin @ 1:19 pm

Erlang is a functional language with strong support for concurrency and distribution. It is quite trivial to start Erlang on multiple hosts and connect them. However, you have to write custom scripts to start these nodes. For example, on a set of local machines, ssh with public/private keys can be used to start cluster of nodes. If you don’t have the computing environment, you can lease your servers using Amazon’s EC2 webservice. In this blog, I am going to show how to start Erlang nodes on the instances of EC2. I have broken the instructions into two parts, setting up the EC2 instances and starting the Erlang cluster.

Setting up EC2

Get an Account

If you don’t have S3 and EC2 account, you can get account for Amazon Simple Storage Service
and account for Amazon Elastic Compute Cloud.

Create X.509 certificate

Select the “AWS Access Key Identifiers under “Your Web Services Account” and follow the “Create New” button in this section to create a new X.509 certificate. Also, save them locally.

Download EC2 Toolkit

Download the Command Line Tools.

Setup Environment Variables

export EC2_HOME=
 export PATH=$PATH:$EC2_HOME/bin
export EC2_PRIVATE_KEY=$EC2_HOME/pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem
export EC2_CERT=$EC2_HOME/cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem

Finding a Suitable AMI

Use following command to view default images:

ec2-describe-images -o self -o amazon

Look for the line containing the public image identified by the ec2-public-images/getting-started.manifest.xml value in the third column. You can also use an image that I created, which includes Erlang distribution with an id of “ami-23c92c4a”.

Generating a Keypair

Use following command to create a keypair:

ec2-add-keypair gsg-keypair

create a file named id_rsa-gsg-keypair and paste everything between (and including) the “—–BEGIN RSA PRIVATE KEY—–” and “—–END RSA PRIVATE KEY—–” lines into it.

Running an Instance

Use following command to start EC2 instance:

ec2-run-instances ami-23c92c4a -k gsg-keypair

and you may see something like:

RESERVATION     r-d8fd14b1      275961154068    default
INSTANCE        i-1246b27b      ami-23c92c4a                    pending gsg-keypair     0               m1.small        2007-12-04T17:34:10+0000

Check status

ec2-describe-instances i-1246b27b

Authorizing Network Access to Your Instances

You can open certain ports you need using

ec2-authorize default -p 22
ec2-authorize default -p 80
ec2-authorize default -p 4369

Connecting to your Instance

ssh -i id_rsa-gsg-keypair root@ec2-72-44-51-166.z-1.compute-1.amazonaws.com

open browser to http://ec2-72-44-51-166.z-1.compute-1.amazonaws.com/

Creating an Image

Modifying an Existing Image

sed -i -e 's/Congratulations!/Congratulations Shahzad Bhatti/' /var/www/html/index.html

ls -l /var/www/html/index.html

Preparing for Bundling

Copy your private key and certificate to the machine being bundled.

cd $EC2_HOME
scp -i id_rsa-gsg-keypair  pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem root@ec2-72-44-51-166.z-1.compute-1.amazonaws.com:/mnt
ec2-bundle-vol -d /mnt -k /mnt/pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem  -c /mnt/cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem -u your-id -s 1536

Bundling

ec2-upload-bundle -b ec2-bucket -m /mnt/image.manifest.xml -a key -s password

Registering the AMI

cd /home/shahbhat/ec2-api-tools-1.2-13740
ec2-register ec2-bucket/image.manifest.xml

Deregistering Your AMI

ec2-deregister ami-47c0252e

Removing Your AMI from Amazon S3

ec2-delete-bundle -b  -p image -a  -s  

Terminating Your Instances

ec2-terminate-instances i-b15bb0d8
/sbin/shutdown -h now

Setting up Erlang Cluster

Here is the fun and easy part:

Starting instances

I am going to start three instances from my Erlang image by opening three different shells and typing following commands:

ssh -i id_rsa-gsg-keypair root@ec2-67-202-33-171.compute-1.amazonaws.com erl -sname master -setcookie ABC
Eshell V5.5  (abort with ^G)
(master@domU-12-31-38-00-6C-81)1>

ssh -i id_rsa-gsg-keypair root@ec2-67-202-33-171.compute-1.amazonaws.com erl -sname slave1 -setcookie ABC
Eshell V5.5  (abort with ^G)
(slave1@domU-12-31-38-00-6C-81)1>

ssh -i id_rsa-gsg-keypair root@ec2-67-202-20-199.compute-1.amazonaws.com erl -sname slave2 -setcookie ABC
Eshell V5.5  (abort with ^G)
(slave2@domU-12-31-38-00-40-F6)1>

Checking status of instances

Starting instances take a couple of minutes and you can check status with following command:

ec2-describe-instances

and it showed me something like:

RESERVATION     r-f7fd149e      275961154068    default
INSTANCE        i-0146b268      ami-23c92c4a    ec2-67-202-33-171.compute-1.amazonaws.com       domU-12-31-38-00-6C-81.compute-1.internal       running gsg-keypair  0               m1.small        2007-12-04T17:29:39+0000
RESERVATION     r-c6fd14af      275961154068    default
INSTANCE        i-1046b279      ami-23c92c4a    ec2-67-202-27-186.compute-1.amazonaws.com       domU-12-31-38-00-35-D6.compute-1.internal       running gsg-keypair  0               m1.small        2007-12-04T17:33:47+0000
RESERVATION     r-d8fd14b1      275961154068    default
INSTANCE        i-1246b27b      ami-23c92c4a    ec2-67-202-20-199.compute-1.amazonaws.com       domU-12-31-38-00-40-F6.compute-1.internal       running gsg-keypair  0               m1.small        2007-12-04T17:34:10+0000

Connecting to the slaves from master

From the master node, I typed following commands to connect to the slave nodes:

 net_adm:ping(list_to_atom("slave1@domU-12-31-38-00-6C-81")).

 net_adm:ping(list_to_atom("slave1@domU-12-31-38-00-6C-81")).

and both returned pong
I then checked connected nodes with following command on the master node:

nodes().

and it returned

[‘slave1@domU-12-31-38-00-6C-81′,’slave2@domU-12-31-38-00-40-F6’]

I also ran the same command on slave1 and slave2 and got
nodes().
[‘master@domU-12-31-38-00-6C-81′,’slave2@domU-12-31-38-00-40-F6’]
(slave1@domU-12-31-38-00-6C-81)2>

nodes().
[‘master@domU-12-31-38-00-6C-81′,’slave1@domU-12-31-38-00-6C-81’]

Testing

I then typed following simple command on the master node to test cluster:

rpc:multicall(nodes(), io, format, ["Hello world~n", []]).
and it returned:
Hello world
Hello world
{[ok,ok],[]}

Shutting down instances

Since, EC2 charges based on usage, you need to shutdown instances after you are done.

ec2-describe-instances

which returned:

RESERVATION     r-f7fd149e      275961154068    default
INSTANCE        i-0146b268      ami-23c92c4a    ec2-67-202-33-171.compute-1.amazonaws.com       domU-12-31-38-00-6C-81.compute-1.internal       running       gsg-keypair     0               m1.small        2007-12-04T17:29:39+0000
RESERVATION     r-c6fd14af      275961154068    default
INSTANCE        i-1046b279      ami-23c92c4a    ec2-67-202-27-186.compute-1.amazonaws.com       domU-12-31-38-00-35-D6.compute-1.internal       running       gsg-keypair     0               m1.small        2007-12-04T17:33:47+0000
RESERVATION     r-d8fd14b1      275961154068    default
INSTANCE        i-1246b27b      ami-23c92c4a    ec2-67-202-20-199.compute-1.amazonaws.com       domU-12-31-38-00-40-F6.compute-1.internal       running       gsg-keypair     0               m1.small        2007-12-04T17:34:10+0000

I then ran ec2-terminate-instances i-0146b268 i-1046b279 i-1246b27b

Conclusion

As you can see once you have the basic EC2 instances running, starting the Erlang cluster is pretty trivial.

November 14, 2007

Evaluating dynamic mathematical expressions in Java/Javascript

Filed under: Computing — admin @ 2:23 pm

Recently, I was looking for a tool to evaluate some mathematical expressions in Java, which were stored in some configuration file. I like dynamic features of languages like Ruby, Python, Javascript or even Erlang to evaluate snippet of code dynamically. Since, Java does not support this inherently, I tried looking for some scripting language that I can embed inside Java. Also, I needed a scripting language that provides two-way integration, i.e., ability to call scripting language from Java and an ability to call Java from the scripting language. So I chose Rhino implementation of Javascript which met those goals. Here is a small example that shows how you can do that using Java and Javascript:

  1
  2 import java.util.Collection;
  3 import java.util.ArrayList;
  4 import java.util.HashSet;
  5 import org.mozilla.javascript.*;
  6 import java.math.BigDecimal;
  7 import junit.framework.TestCase;
  8 import junit.framework.Test;
  9 import junit.framework.TestSuite;
 10
 11
 12
 13 public class JsEvalTest extends TestCase {
 14     private static final String FUNCS_PREFIX = "funcs";
 15     private Context cx;
 16     private Scriptable scope;
 17
 18     public void setUp() throws Exception {
 19         cx = Context.enter();
 20         cx.setLanguageVersion(Context.VERSION_1_2);
 21         scope = cx.initStandardObjects();
 22         ScriptableObject.defineClass(scope, Functions.class);
 23         Scriptable funcs = cx.newObject(scope, "Functions", new Object[] {new ArrayList()});
 24         scope.put(FUNCS_PREFIX, scope, funcs);
 25     }
 26
 27     public void tearDown() {
 28         Context.exit();
 29     }
 30
 31     public void testFormula() throws Exception {
 32         String formula = "cpp = cp / spr";
 33         addDependentVariables(formula);
 34         cx.evaluateString(scope, formula, "<cmd>", 1, null);
 35         Object varValue = scope.get("cpp", scope);
 36         assertEquals(new Double(0.5), varValue);
 37     }
 38
 39     public void testFunction() throws Exception {
 40         String formula = "a = " + FUNCS_PREFIX + ".actuals(cp, spr)";
 41         addDependentVariables(formula);
 42         cx.evaluateString(scope, formula, "<cmd>", 1, null);
 43         Object varValue = scope.get("a", scope);
 44         assertEquals(new Double(20000), varValue);
 45     }
 46
 47     private BigDecimal getValueFor(String var) {
 48         if ("cp".equals(var)) return new BigDecimal(100);
 49         else if ("spr".equals(var)) return new BigDecimal(200);
 50         else return null;
 51     }
 52
 53     public static Test suite() {
 54         TestSuite suite = new TestSuite();
 55         suite.addTestSuite(JsEvalTest.class);
 56         return suite;
 57     }
 58
 59     /////////////////////////////////////////////////////////////////
 60     //
 61     private void addDependentVariables(String formula) {
 62         Collection<String> dependentVars = getDependentVars(formula);
 63         for (String dependentVar : dependentVars) {
 64             Object dependentValue = getValueFor(dependentVar);
 65             if (dependentValue != null) scope.put(dependentVar, scope, dependentValue);
 66         }
 67     }
 68     private Collection<String> getDependentVars(String formula) {
 69         CompilerEnvirons compilerEnv = new CompilerEnvirons();
 70         compilerEnv.initFromContext(cx);
 71         compilerEnv.setGeneratingSource(false);
 72         Parser p = new Parser(compilerEnv, compilerEnv.getErrorReporter());
 73         ScriptOrFnNode root = p.parse(formula, null, 1);
 74         return getDependentVars(root);
 75     }
 76
 77     private Collection<String> getDependentVars(final ScriptOrFnNode tree) {
 78         Collection<String> vars = new HashSet<String>();
 79         getDependentVars(tree, tree, vars);
 80         return vars;
 81     }
 82
 83     private void getDependentVars(final ScriptOrFnNode tree, final Node parent, Collection<String> vars) {
 84         Node node = null;
 85         siblingLoop:
 86         for (;;) {
 87             Node previous = null;
 88             if (node == null) {
 89                 node = parent.getFirstChild();
 90             } else {
 91                 previous = node;
 92                 node = node.getNext();
 93             }
 94             if (node == null) {
 95                 break;
 96             }
 97
 98             int type = node.getType();
 99             if (type == Token.NAME) {
100                 String className = node.getClass().getName();
101                 if ("org.mozilla.javascript.Node$StringNode".equals(className)) {
102                     vars.add(node.getString());
103                 }
104             }
105             getDependentVars(tree, node, vars);
106         }
107     }
108
109
110     public static void main(String[] args) {
111         junit.textui.TestRunner.run(suite());
112     }
113 }
114 ~
115

One gotcha, I came across with Rhino was that I also needed to know the input variables to expressions so that I can bind them properly before executing them. Luckily, Rhino is open source and I looked at the source code and hacked a way to parse the expression and find all named variables. For production, I am caching it, but it’s pretty simple to do. For example:

 77     private Collection<String> getDependentVars(final ScriptOrFnNode tree) {
 78         Collection<String> vars = new HashSet<String>();
 79         getDependentVars(tree, tree, vars);
 80         return vars;
 81     }
 82 
 83     private void getDependentVars(final ScriptOrFnNode tree, final Node parent, Collection<String> vars) {
 84         Node node = null;
 85         siblingLoop:
 86         for (;;) {
 87             Node previous = null;
 88             if (node == null) {
 89                 node = parent.getFirstChild();
 90             } else {
 91                 previous = node;
 92                 node = node.getNext();
 93             }
 94             if (node == null) {
 95                 break;
 96             }
 97 
 98             int type = node.getType();
 99             if (type == Token.NAME) {
100                 String className = node.getClass().getName();
101                 if ("org.mozilla.javascript.Node$StringNode".equals(className)) {
102                     vars.add(node.getString());
103                 }
104             }
105             getDependentVars(tree, node, vars);
106         }
107     }
108 

Another nice thing about Javascript is that you can not only call regular Java classes, but can also define Javascriptish methods, e.g.


import org.mozilla.javascript.*;

public class Functions extends ScriptableObject {
      // The zero-argument constructor used by Rhino runtime to create instances
      public Functions() {
      }

      // Method jsConstructor defines the JavaScript constructor
      public void jsConstructor(Object g) {
      }

      // The class name is defined by the getClassName method
      public String getClassName() { return "Functions"; }

      // The method jsGet_count defines the count property.
      public int jsGet_ratio_value() { return 500;}
      public int jsGet_actuals_value() { return 5000;}

      // Methods can be defined using the jsFunction_ prefix. Here we define
      //  resetCount for JavaScript.
      public double jsFunction_actuals(double a, double b) {
        return a * b;
      }
      public double jsFunction_ratio(double a, double b) {
        return a / b;
      }
}

November 2, 2007

Erlang translation of States Puzzle

Filed under: Computing — admin @ 8:48 pm

I came across neat solution from A Simple Programming Puzzle Seen Through Three Different Lenses regarding states puzzle from Mark Nelson, which was originally posted NPR. Here is translation of Anders Pearson’s solution in Erlang:

 1 -module(states).

 2

 3 %%--------------------------------------------------------------------

 4 %% External exports

 5 %%--------------------------------------------------------------------

 6 -export([

 7          test/0

 8          ]).

 9

10 states() ->

11     ["alabama","alaska","arizona","arkansas","california","colorado",

12           "connecticut","delaware","florida","georgia","hawaii","idaho",

13           "illinois","indiana","iowa","kansas","kentucky","louisiana",

14           "maine","maryland","massachusetts","michigan","minnesota",

15           "mississippi","missouri","montana","nebraska","nevada",

16           "newhampshire","newjersey","newmexico","newyork","northcarolina",

17           "northdakota","ohio","oklahoma","oregon","pennsylvania","rhodeisland",

18           "southcarolina","southdakota","tennessee","texas","utah","vermont",

19           "virginia","washington","westvirginia","wisconsin","wyoming"].

20

21 test() ->

22     States = states(),

23     Dict = dict:new(),

24     {_, _, MatchingStates} = lists:foldl(fun lookup/2, {Dict, States, []}, States),

25     [["northcarolina","southdakota"],["southcarolina","northdakota"]] = MatchingStates.

26

27 lookup(State, {Dict, States, MatchingStates}) ->

28     {State, Dict1, MatchingStates1} = lists:foldl(fun lookup2/2, {State, Dict, MatchingStates}, States),

29     {Dict1, States, MatchingStates1}.

30

31 lookup2(State1, {State2, Dict, MatchingStates}) ->

32     Key = lists:sort(lists:flatten(State1 ++ State2)),

33     Response = dict:find(Key, Dict),

34     MatchingStates1 = case Response of

35         {ok, [State2, State1] } ->

36              MatchingStates;

37         {ok, _} ->

38              [[State1, State2]|MatchingStates];

39         _ ->

40             MatchingStates

41     end,

42     Dict1= dict:store(Key, [State1, State2], Dict),

43     {State2, Dict1, MatchingStates1}.




	

October 2, 2007

RE: Erlang: the verdict

Filed under: Computing — admin @ 9:27 pm

I read an interesting blog from Cedric Beust on Erlang: the verdict . Since, Joe Armstrong’s book, Erlang has come to the spotlight whether it’s ready or not. Cedric is well known in Java community and I have used his tools such as EJBGen that he wrote in late 90s and TestNG: a testing framework. Here is summary of objections that he raised on Erlang and my thoughts:

Object Oriented

His first objection is that Joe is biased against OO and Erlang is not an object oriented language. I can’t say having OO is must for every good language. Erlang is fundamentally functional language with good support for concurrent programming. And as I wrote earlier on Polymorphism in Erlang, Erlang is based on Actor model, where each object has a mailbox associated that receives messages. In this model, the message passing takes literal form rather than method invocation. Also, according to GoF author Ralph Johnson sequential Erlang is functional and concurrent Erlang is object oriented. And I admit method invocation based object oriented is easier than message passing, but it’s just matter of style. I believe abstraction, encapsulation, separation of concerns are more important than inheritance or combination of state/behavior.

Scalability

His second objection is Erlang claims to be scalable, but it only provides primitives to build scalable systems. Though, I agree building scalable systems requires scalable architecture and just using Erlang will not make your system scalable. Also, I agree that you can build scalable systems in any language. I have been building distributed systems for over ten years and I have used CORBA, EJB, messaging middlewares, JINI and all kind of RPC or Web Services. In a lot of ways OTP gives you comparable features, but all that is well integrated into Erlang platform. Also, one of basic requirement for distributed systems is building monitoring and fault tolerant system. Back in 90s when I worked on CORBA, I build pretty complexed monitoring system that allowed sysadmins to watch all servers on machines and see their health. The system also provided ability to automatic and manual start/stop of processes when they crashed. All that was a lot of work, but OTP gives you that capability right away. These days you can also get similar features from application servers and commercial monitoring systems, though open source app servers still lack in that area.

Performance

Cedric also raised recent performance problem that Tim Bray had processing large files in Erlang. Erlang was built for telecommunication systems and though it’s now being marketed as general purpose language, but it still lacks many of the features for general purpose language. Slow I/O or slow regular expression is one of them. I hear a lot of people dismiss performance by just saying computing cycles are cheap and human cycles matter more, and just throw another box. I think CPU cycles still count, especially when difference is in the order of magnitude than other solutions. I also heard from Ebay architect Dan Pritchett, how building large data centers are running into a wall because of power demands. Erlang works best when your application is not CPU-bound, so in that case I would not recommend using it. I think now that Erlang is under spotlight, the future releases of Erlang will address slow I/O, regular expressions and commonly used primitives.

Five 9s

In Erlang fault tolerance is very different from traditional approach in Java, where server tries to handle exceptions and keep running. In Erlang when an error occurs the process simply dies (if it chooses) and supervisor can restart it. Again, this can be built in any language. Erlang provides nice primitives for building fault tolerant systems, though I agree that Erlang does not give this for free, but you still have to build it. Cedric omitted hot code swap, which I have seen only in commercial application servers such as Weblogic and Websphere.

Lock free programming

I have been writing multi-threaded applications for over ten years and I like Erlang’s approach because I have spent too much time in fixing locking and synchronization problems. Cedric complains that Erlang uses locking internally and I am fine with that as long as it’s not exposed to the APIs. Though, I admit in certain cases locking can be useful. For example, one of the exercise from Joe’s book asks reader to implement a correct way to register a process when multiple processes trying to do that simultaneously. And I have seen a lot of clever solutions for that, which would have been easier with locking. I don’t like clever code because it becomes maintenance and debugging hell, so I would have liked simple locking in that case.

Distributed Data

In conclusion, Cedric mentions how distributing data is easier than distributing code. I agree and that is why I like Mnesia instead of centralized database approach. Our CTO Werner Vogels at Amazon has been talking for a while on CAP principle and how databases don’t scale. In one of the system we actually had a database that could not scale and was replaced by a collection of distributed berkley databases. Erlang gives you that capability with Mnesia.

Conclusion

I think Erlang is a specialized language that may not be suitable for all applications, and I have been trying to learn it myself and writing a few applications. I like skepticism that Cedric showed, I think it’s important to have a healthy skepticism when trying new thing. I see too often, a new thing is simply marketed as a Silver Bullet. Also, the authors or makers of new product have their own agenda. Though, you cannot judge or give a verdict just by reading a book. You have to try it yourself in a real project. I like a quote from Matrix movie: “You do not know someone until you fight them.” You can only learn something by trying its limits and not by just reading a book. That is what I am trying to do, and despite the cryptic error messages in Erlang I will stick with it. I think the new era of application development is multi-language, where you use right language for the right task and Erlang is just another tool that will only work for the kind of problems that it inherently was designed to support. Finally, as Fred Brooks wrote there aren’t Silver Bullets, but I see all the time, people are quick to hail new language, methodology or software as that. I believe with good people you can do everything and build the best systems with any language, methodology or process.

« Newer PostsOlder Posts »

Powered by WordPress