February 7, 2008
February 5, 2008
Does experience matter?
I came across a blog entry of DHH, Years of irrelevance
and and an old blog post of Jeff Atwood on Skill Disparities in Programming. Both of them cover somewhat similar topic, i.e. are programmers with more experience better than programmers with less experience. For example, DHH asserts that after six month of experience with a technology you can be pretty experienced. Similarly, Jeff shows various research findings where person with just two year experience is just as good as person with seven or more years of experience. This topic often comes up with job postings when companies require applicant to have certain number of years of experience. Martin Fowler also spoke recently in his blog entry PreferDesignSkills, where he prefers person with broad experience in design and programming than a specialized person. I have seen my share of discrimination in job market when recruiter is looking for X years of experience or is only looking for person with Websphere and would not consider Weblogic experience. It is even worse when new technology is involved and I faced similar discrimination when transitioning from J2EE to Rails. I totally agree that requiring experience with some technology does not matter because a smart person can easily learn it. This also has been discussed many times by Scott Ambler in Generalists vs Specialists, which I have quoted in my website for many years. Since, there is a considerable amount of difference between productivity and quality between programmers, this is an important question. Based on my twenty years of programming experience with sixteen years doing professionally I agree with these findings. I have seen people with twenty years of experience doing same job in big companies with no desire to learn anything else. I have learned it’s incredibly important to diversify your skills and be a generalist than specialist. It is the only way a programmer with more years of experience can be more valuable than programmer with fewer years of experience. When hiring I would look for a person with broad skills who has track record of getting things done and has passion to learn new things. As Joel Spolsky often says when hiring look for smart people who get the things done. There are always new things coming and a good programmer will find a way to learn that in short amount of time as DHH mentioned. I have worked as software developer and systems engineer/administrator and that helped my understanding of overall systems. I have used various languages over the years including Basic, FORTRAN, COBOL, C, C++, Java, PERL, Python, Ruby, Erlang, Haskell. As Dave Thomas talks about learning new language to think differently, and it can show you finding new solutions.
Besides being generalist, another way experience matters is with design. Though with small system, productivity and quality of experienced and junior programmer may seem similar, but I have found with larger systems quality does show up. (Note, I am only considering smart programmers with varying number of years here and not considering really bad programmers.) I have observed that design of experienced programmer (more than five years of experience) will be more flexible because he/she would have likely worked on similar problem before (which may also give productivity advantage). I have also found junior programmers struggle with roles and responsibilities (Rebecca J Wirfs) and good object oriented design (or domain driven design). I have observed that senior programmers have better understanding of things like separation of concerns, modularization, encapsulation, loose coupling and scalability.
In nutshell, any good programmer can learn a new technology very quickly and can solve any problem, but I believe experienced programmer with more general experience will be more valuable than junior programmer and quality of design of a senior programmer with broad experience will be much better in terms of good design principles and -ilities. I think, design is something you learn over the years because for each design decision you may be using thousands of small lessons you have learned over the years. As far as development teams are concerned, I like to have one really experienced programmer and others with junior to mid level experience. This gives a good apprenticeship environment for junior people to learn as software development is still an art. Finally, as everything in software, there aren’t hard rules and everything depends on the environment and people.
December 23, 2007
Released ErlSDB 0.1
I started working on an Erlang library to access Amazon’s SimpleDB web service and I released an early version of the library this weekend. Here are some notes on its usage:
Installing
svn checkout http://erlsdb.googlecode.com/svn/trunk/ erlsdb-read-only
Building
Testing
edit Makefile and add access key and secret key, then type make test
Usage
Take a look at test/erlsdb_test.erl to learn usage, here is a sample code
Starting Server
erlsdb:start(type, [#sdb_state{ access_key = "YourAccessKey", secret_key = "YourSecretKey", domain = "YourDomain" } ])
Creating Domain
erlsdb:create_domain()
Note that the server will use the domain that was passed during initialization.
Listing all Domains
{ok, List, _} = erlsdb:list_domains()
Deleting Domain
erlsdb:delete_domain()
Adding an item
Attributes = lists:sort([ ["StreetAddress", "705 5th Ave"], ["City", "Seattle"], ["State", "WA"], ["Zip", "98101"] ]), erlsdb:put_attributes("TccAddr", Attributes)
Retrieving an item
{ok, UnsortedAttrs} = erlsdb:get_attributes("TccAddr")
Deleting an item
erlsdb:delete_attributes("TccAddr"),
December 16, 2007
Accessing SimpleDB using Java and Erlang
Amazon has recently announced limited beta availability of a new web service SimpleDB, which is somewhat similar to Amazon’s Dynamo service can store key-value based data. However, Amazon’s Dynamo service is not publicly available. The SimpleDB web service provides both REST and SOAP based APIs, though the REST APIs are really XML over HTTP and use simple query and RPC style parameters to create or retrieve data. You can use either GET or POST to submit HTTP requests. A key concept in SimpleDB are Domains which are comparable to the buckets in S3 and provides naming scope. Similar to S3, you are limited to the number of domains you can create. Inside domain, you can create rows of key/value pairs. The collection of key/value pair for a single row is called Item. When you save an item, you give it a name. Unlike a row in relational database, you can save items with varying number of of key/value pairs to the same domain. It is suggested that you use that name to retrieve the item. Though, SimpleDB also supports simple query language, but it only works if the query takes less than five seconds. SimpleDB is not meant to be used in a relational style, rather you are encouraged to store all related data you need with the item. However, there is limit on number of key/values you can save in an item (~255) , size of key/value (~1024) or total size of data in a single domain (~10 GB). All keys and values must be String. This also means in order for queries to work, the data must be stored in lexicographical format. Another big difference with the relational database is that SimpleDB is designed for high availability and it guarantees eventual consistency. Also, it provides transactions only at a single key/value pair. Thus, it is not suitable for applications that require higher level of consistency or transactions (See CAP principle or BASE transactions). On the other hand due to high availability, you can built highly scalable applications. Since the data is replicated across multiple servers, you may also get performance benefit by using multiple threads when reading a number of items concurrently.
Following are the core APIs that provide basic CRUD operations:
- Create/Delete/List domains
- Save attributes
- Update attributes that replace previously stored attributes
- Read attributes using Item name with option to query using attribute key/value pairs
- Delete attributes
I have been using SimpleDB for a few months for some internal applications at Amazon. Here I am showing a Java and Erlang client code to access SimpleDB. Since, the service is actually not available for public yet so you may not be able to try it.
Java Client
1 import java.io.BufferedReader; 2 import java.io.IOException; 3 import java.io.InputStreamReader; 4 import java.io.UnsupportedEncodingException; 5 import java.net.HttpURLConnection; 6 import java.net.MalformedURLException; 7 import java.net.URL; 8 import java.net.URLEncoder; 9 import java.security.InvalidKeyException; 10 import java.security.NoSuchAlgorithmException; 11 import java.text.Collator; 12 import java.text.FieldPosition; 13 import java.text.NumberFormat; 14 import java.text.SimpleDateFormat; 15 import java.util.ArrayList; 16 import java.util.Date; 17 import java.util.HashMap; 18 import java.util.List; 19 import java.util.Map; 20 import java.util.TreeMap; 21 22 import javax.crypto.Mac; 23 import javax.crypto.spec.SecretKeySpec; 24 import javax.xml.parsers.DocumentBuilder; 25 import javax.xml.parsers.DocumentBuilderFactory; 26 import javax.xml.parsers.ParserConfigurationException; 27 28 import org.w3c.dom.Document; 29 import org.w3c.dom.Element; 30 import org.w3c.dom.Node; 31 import org.w3c.dom.NodeList; 32 import org.xml.sax.InputSource; 33 import org.xml.sax.SAXException; 34 35 /** 36 * This class provides business delegate to access SDB web service using REST approach. 37 * 38 */ 39 public class SdbDelegate { 40 private static final String DEFAULT_ENCODING = "UTF-8"; 41 private static final String HMAC_SHA1_ALGORITHM = "HmacSHA1"; 42 private static final String MORETOKEN_STRING = "MoreToken"; 43 44 private final String accessId; 45 private final String secretKey; 46 private final String uri; 47 private final String version; 48 49 public static class Response { 50 public final Document rootDocument; 51 public final Element rootElement; 52 public final String requestId; 53 public final String boxUsage; 54 public final String moreToken; 55 public final Object result; 56 public Response(Document rootDocument, Element rootElement, String requestId, String boxUsage, String moreToken, Object result) { 57 this.rootDocument = rootDocument; 58 this.rootElement = rootElement; 59 this.requestId = requestId; 60 this.boxUsage = boxUsage; 61 this.moreToken = moreToken; 62 this.result = result; 63 } 64 } 65 66 public SdbDelegate() { 67 this("yourkey", "yoursecret", "http://sdb.amazonaws.com", "2007-11-07"); 68 } 69 70 71 public SdbDelegate(String accessId, String secretKey, String uri, String version) { 72 this.accessId = accessId; 73 this.secretKey = secretKey; 74 this.uri = uri; 75 this.version = version; 76 } 77 78 79 public void delete(String bucket, String name) { 80 deleteAttributes(bucket, name); 81 } 82 83 public void createDomain(String domainName) { 84 Map<String,String> parameters = getBaseParameters("CreateDomain"); 85 parameters.put("DomainName",encode(domainName)); 86 invoke(parameters); 87 } 88 89 public void deleteDomain(String domainName) { 90 Map<String,String> parameters = getBaseParameters("DeleteDomain"); 91 parameters.put("DomainName",encode(domainName)); 92 invoke(parameters); 93 } 94 95 public List<String> listDomains() { 96 return listDomains(null); 97 } 98 99 public List<String> listDomains(StringBuilder moreToken) { 100 return (List<String>) listDomains(moreToken, null); 101 } 102 103 public List<String> listDomains(StringBuilder moreToken, Integer maxResults) { 104 Map<String,String> parameters = getBaseParameters("ListDomains"); 105 if (moreToken != null && moreToken.length() > 0) { 106 parameters.put(MORETOKEN_STRING, moreToken.toString()); 107 } 108 if (maxResults != null) { 109 parameters.put("MaxResults",maxResults.toString()); 110 } 111 Response response = invoke(parameters); 112 if (moreToken != null) { 113 moreToken.setLength(0); 114 moreToken.append(response.moreToken); 115 } 116 117 Element rootElement = (Element) response.rootElement; 118 NodeList domainNodes = rootElement.getElementsByTagName("DomainName"); 119 List<String> domains = new ArrayList<String>(); 120 for (int i = 0; i < domainNodes.getLength(); ++i) { 121 Node domainNode = domainNodes.item(i); 122 Node domainNameNode = domainNode.getFirstChild(); 123 String domainName = domainNameNode.getNodeValue(); 124 domains.add(domainName); 125 } 126 127 // 128 NodeList moreTokenNodes = rootElement.getElementsByTagName(MORETOKEN_STRING); 129 if (moreTokenNodes.getLength() > 0) { 130 Element moreTokenElement = (Element) moreTokenNodes.item(0); 131 moreTokenElement.normalize(); 132 Node tokenNode = moreTokenElement.getFirstChild(); 133 String newMoreToken = tokenNode.getNodeValue(); 134 if (moreToken != null && (newMoreToken.compareToIgnoreCase("null") == 0 || newMoreToken.length() == 0)) { 135 moreToken.setLength(0); 136 } 137 } 138 return domains; 139 } 140 141 142 public void deleteAttributes(String domain, String identifier) { 143 deleteAttributes(domain, identifier, new HashMap<String, String>()); 144 } 145 146 /** 147 * Delete specified atrributes and/or values from item 148 * 149 * @param attributes set of attributes 150 * @return result of the operation 151 */ 152 public void deleteAttributes(String domain, String identifier, Map<String, String> attributes) { 153 Map<String,String> parameters = getBaseParameters("DeleteAttributes"); 154 parameters.put("ItemName",identifier); 155 parameters.put("DomainName",domain); 156 if (attributes != null) { 157 int i = 0; 158 for (Map.Entry<String, String> attr : attributes.entrySet()) { 159 String name = "Attribute."+ i + ".Name"; 160 parameters.put(name,attr.getKey()); 161 if (attr.getValue() != null) { 162 String value = "Attribute."+ i + ".Value"; 163 parameters.put(value, attr.getValue()); 164 } 165 ++i; 166 } 167 } 168 invoke(parameters); 169 } 170 171 172 public void putAttributes(String domain, String identifier, Map<String, String> attributes) { 173 putAttributes(domain, identifier, attributes, true); 174 } 175 176 177 public void putAttributes(String domain, String identifier, Map<String, String> attributes, boolean replace) { 178 Map<String,String> parameters = getBaseParameters("PutAttributes"); 179 parameters.put("ItemName", identifier); 180 parameters.put("DomainName", domain); 181 int i = 0; 182 if (attributes.size() > 255) throw new IllegalArgumentException("more than 255 attributes not supported"); 183 System.out.println("Writing " + attributes.size() + " attributes for " + identifier); 184 for (Map.Entry<String, String> attr : attributes.entrySet()) { 185 if (attr.getKey().length() > 700) throw new IllegalArgumentException("name '" + attr.getKey() + "' more than 700 bytes"); 186 if (attr.getValue().length() > 700) throw new IllegalArgumentException("value '" + attr.getValue() + "' of name '" + attr.getKey() + "' more than 700 bytes"); 187 String name = "Attribute."+ i + ".Name"; 188 parameters.put(name,attr.getKey()); 189 String value = "Attribute."+ i + ".Value"; 190 parameters.put(value, attr.getValue()); 191 i++; 192 } 193 // 194 if (replace) { 195 parameters.put("Replace","true"); 196 } 197 invoke(parameters); 198 } 199 200 public void replaceAttributes(String domain, String identifier, Map<String, String> attributes) { 201 putAttributes(domain, identifier, attributes,true); 202 } 203 204 205 public Map<String, String> getAttributes(String domain, String identifier){ 206 return getAttributes(domain, identifier, new HashMap<String, String>()); 207 } 208 209 public Map<String, String> getAttributes(String domain, String identifier, Map<String, String> attributes) { 210 String NAME_STRING = "Name"; 211 String VALUE_STRING = "Value"; 212 String ATTRIBUTE_STRING = "Attribute"; 213 214 Map<String,String> parameters = getBaseParameters("GetAttributes"); 215 parameters.put("ItemName", identifier); 216 parameters.put("DomainName",domain); 217 218 int k = 0; 219 for (Map.Entry<String, String> attr : attributes.entrySet()) { 220 String name = "AttributeName."+k; 221 parameters.put(name,attr.getKey()); 222 ++k; 223 } 224 225 // 226 Response response = invoke(parameters); 227 Element rootElement = (Element) response.rootElement; 228 NodeList attributeNodes = rootElement.getElementsByTagName(ATTRIBUTE_STRING); 229 Map<String, String> returnAttributes = new TreeMap<String, String>(String.CASE_INSENSITIVE_ORDER); 230 // 231 for (int i = 0; i < attributeNodes.getLength(); ++i) { 232 Node attributeNode = attributeNodes.item(i); 233 234 Element attributeElement = (Element)attributeNode; 235 236 NodeList nameNodes = attributeElement.getElementsByTagName(NAME_STRING); 237 Node nameNode = nameNodes.item(0); 238 nameNode.normalize(); 239 Node attributeNameNode = nameNode.getFirstChild(); 240 String attributeName = attributeNameNode.getNodeValue(); 241 NodeList valueNodes = attributeElement.getElementsByTagName(VALUE_STRING); 242 Node valueNode = valueNodes.item(0); 243 valueNode.normalize(); 244 Node attributeValueNode = valueNode.getFirstChild(); 245 String attributeValue = attributeValueNode.getNodeValue(); 246 returnAttributes.put(attributeName,attributeValue); 247 } 248 return returnAttributes; 249 } 250 251 252 protected Response invoke(Map<String,String> parameters) { 253 try { 254 Document document = getResponse(parameters); 255 Element rootElement = document.getDocumentElement(); 256 Element responseStatus = (Element) rootElement.getElementsByTagName("ResponseStatus").item(0); 257 Node requestIdNode = responseStatus.getElementsByTagName("RequestId").item(0); 258 String requestId = null; 259 if (requestIdNode != null && requestIdNode.getFirstChild() != null) { 260 requestId = requestIdNode.getFirstChild().getNodeValue(); 261 } 262 String boxUsage = null; 263 Node boxUsagesNode = responseStatus.getElementsByTagName("BoxUsage").item(0); 264 if (boxUsagesNode != null && boxUsagesNode.getFirstChild() != null) { 265 boxUsage = boxUsagesNode.getFirstChild().getNodeValue(); 266 } 267 return new Response(document, rootElement, requestId, boxUsage, null, null); 268 } catch (RuntimeException e) { 269 throw e; 270 } catch (Exception e) { 271 throw new RuntimeException(e); 272 } 273 } 274 275 276 private static String getSignature(String data, String key) throws NoSuchAlgorithmException, InvalidKeyException { 277 SecretKeySpec signingKey = new SecretKeySpec(key.getBytes(),HMAC_SHA1_ALGORITHM); 278 Mac mac = Mac.getInstance(HMAC_SHA1_ALGORITHM); 279 mac.init(signingKey); 280 byte[] rawHmac = mac.doFinal(data.getBytes()); 281 return new sun.misc.BASE64Encoder().encode(rawHmac); 282 } 283 284 private Map<String,String> getBaseParameters(String action) { 285 Collator myCollator = Collator.getInstance(); 286 // Note that these parameters must be sent in order 287 Map<String,String> baseParameters = new TreeMap<String,String>(myCollator); 288 baseParameters.put("Action", action); 289 baseParameters.put("Version", version); 290 baseParameters.put("AWSAccessKeyId", accessId); 291 baseParameters.put("SignatureVersion", "1"); 292 293 Date curTime = new Date(); 294 SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ssZ"); 295 StringBuffer timeBuffer = new StringBuffer(); 296 FieldPosition fp = new FieldPosition(NumberFormat.INTEGER_FIELD); 297 dateFormat.format(curTime, timeBuffer, fp); 298 timeBuffer.insert(timeBuffer.length()-2, ":"); 299 300 baseParameters.put("Timestamp",timeBuffer.toString()); 301 return baseParameters; 302 } 303 304 private Document getResponse(Map<String,String> parameters) throws UnsupportedEncodingException, MalformedURLException, IOException, NoSuchAlgorithmException, SAXException, ParserConfigurationException, InvalidKeyException { 305 String path = uri + getUrlPath(parameters); 306 URL url = new URL(path); 307 HttpURLConnection connection = (HttpURLConnection) url.openConnection(); 308 connection.setDoOutput(true); 309 connection.setRequestMethod("GET"); 310 connection.connect(); 311 312 System.out.println("Connecting to " + url); 313 int respCode = connection.getResponseCode(); 314 315 if (respCode == HttpURLConnection.HTTP_OK) { 316 InputSource inputSource = new InputSource(new BufferedReader(new InputStreamReader(connection.getInputStream()))); 317 DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance(); 318 DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder(); 319 return documentBuilder.parse(inputSource); 320 } else { 321 InputSource inputSource = new InputSource(new BufferedReader(new InputStreamReader(connection.getErrorStream()))); 322 DocumentBuilderFactory documentBuilderFactory = DocumentBuilderFactory.newInstance(); 323 DocumentBuilder documentBuilder = documentBuilderFactory.newDocumentBuilder(); 324 Document document = documentBuilder.parse(inputSource); 325 Element rootElement = document.getDocumentElement(); 326 Node requestIdNode = rootElement.getElementsByTagName("RequestID").item(0); 327 StringBuilder errorMessage = new StringBuilder("Failed to invoke input request '" + path + "'"); 328 329 // 330 if (requestIdNode != null && requestIdNode.getFirstChild() != null) { 331 errorMessage.append(" with requestId " + requestIdNode.getFirstChild().getNodeValue()); 332 Element errors = (Element) rootElement.getElementsByTagName("Errors").item(0); 333 if (errors != null) { 334 Element error = (Element) errors.getElementsByTagName("Error").item(0); 335 if (error != null) { 336 Node messageNode = error.getElementsByTagName("Message").item(0); 337 if (messageNode != null) errorMessage.append(" due to " + messageNode.getFirstChild().getNodeValue()); 338 } 339 } 340 } 341 throw new RuntimeException(errorMessage.toString()); 342 } 343 } 344 345 346 private String encode(String name){ 347 try { 348 return URLEncoder.encode(name, DEFAULT_ENCODING); 349 } catch (UnsupportedEncodingException ex) { 350 throw new RuntimeException("Failed to encode '" + name + "'"); 351 } 352 } 353 354 355 private String getUrlPath(Map<String,String> parameters) throws UnsupportedEncodingException, NoSuchAlgorithmException, InvalidKeyException { 356 StringBuilder queryString = new StringBuilder("?"); 357 StringBuilder signatureData = new StringBuilder(); 358 boolean first = true; 359 for (Map.Entry<String, String> param : parameters.entrySet()) { 360 if (first) { 361 first = false; 362 } else { 363 queryString.append("&"); 364 } 365 366 signatureData.append(param.getKey() + param.getValue()); 367 queryString.append(param.getKey() + "=" + URLEncoder.encode((String)param.getValue(), DEFAULT_ENCODING)); 368 } 369 370 queryString.append("&Signature=" + URLEncoder.encode(getSignature(signatureData.toString(),secretKey), DEFAULT_ENCODING)); 371 return queryString.toString(); 372 } 373 } 374
Java Client Test
1 import junit.framework.TestCase; 2 import junit.framework.Test; 3 import junit.framework.TestSuite; 4 5 import java.util.List; 6 import java.util.Map; 7 import java.util.HashMap; 8 9 public class SdbDelegateTest extends TestCase { 10 private SdbDelegate fixture; 11 private String domainName; 12 13 public void setUp() { 14 this.fixture = new SdbDelegate(); 15 this.domainName = "TestDomain"; 16 } 17 18 public void testCreateDomain() { 19 fixture.createDomain(domainName); 20 assertTrue(fixture.listDomains().contains(domainName)); 21 } 22 23 public void testDeleteDomain() { 24 fixture.deleteDomain(domainName); 25 assertFalse(fixture.listDomains().contains(domainName)); 26 } 27 28 29 public void testGetPutAttributes() { 30 fixture.createDomain(domainName); 31 Map<String, String> attributes = new HashMap<String, String>(); 32 attributes.put("StreetAddress", "705 5th Ave"); 33 attributes.put("City", "Seattle"); 34 attributes.put("State", "WA"); 35 attributes.put("Zip", "98101"); 36 fixture.putAttributes(domainName, "TCC", attributes, true); 37 Map<String, String> attributes2 = fixture.getAttributes(domainName, "TCC"); 38 System.out.println("Retrieved " + attributes2); 39 assertEquals(attributes, attributes2); 40 } 41 42 public static Test suite() { 43 TestSuite suite = new TestSuite(); 44 suite.addTestSuite(SdbDelegateTest.class); 45 return suite; 46 } 47 48 public static void main(String[] args) { 49 junit.textui.TestRunner.run(suite()); 50 } 51 } 52
Erlang Client
1 %%%------------------------------------------------------------------- 2 %%% @author Shahzad Bhatti <bhatti@plexobject.com> 3 %%% @doc 4 %%% Business delegate to access SimpleDB 5 %%% @end 6 %%%------------------------------------------------------------------- 7 -module(sdb_delegate). 8 -include_lib("xmerl/include/xmerl.hrl"). 9 10 11 %%%------------------------------------------------------------------- 12 %%% Public APIs 13 %%%------------------------------------------------------------------- 14 -export([list_domains/0, list_domains/1, list_domains/2, delete_domain/1, get_attributes/2, get_attributes/3, create_domain/1, put_attributes/3, put_attributes/4, delete_item/2, delete_attributes/2, delete_attributes/3, replace_attributes/3]). 15 16 17 %%%------------------------------------------------------------------- 18 %%% Test Methods 19 %%%------------------------------------------------------------------- 20 -export([test/0]). 21 22 23 %%%------------------------------------------------------------------- 24 %%% Configuration Functions %%% 25 %%%------------------------------------------------------------------- 26 uri() -> 27 "http://sdb.amazonaws.com?". 28 29 30 version() -> 31 "2007-11-07". 32 33 access_key() -> 34 "yourkey". 35 36 secret_key() -> 37 "secretkey". 38 39 40 41 %%%------------------------------------------------------------------- 42 %%% Public APIs to access SimpleDB 43 %%%------------------------------------------------------------------- 44 start() -> 45 crypto:start(), 46 inets:start(). 47 48 stop() -> 49 init:stop(). 50 51 52 list_domains() -> 53 list_domains(nil, nil). 54 list_domains(MoreToken) -> 55 list_domains(MoreToken, nil). 56 57 list_domains(MoreToken, MaxResults) -> 58 Base = base_parameters("ListDomains"), 59 Base1 = if MoreToken == nil -> Base; true -> Base ++ [["MoreToken", MoreToken]] end, 60 Base2 = if MaxResults == nil -> Base1; true -> Base1 ++ [["MaxResults", MaxResults]] end, 61 Xml = rest_request(Base2), 62 xml_values(xmerl_xpath:string("//DomainName/text()", Xml)). 63 64 65 create_domain(Domain) -> 66 Base = [["DomainName", url_encode(Domain)]| 67 base_parameters("CreateDomain")], 68 rest_request(Base). 69 70 delete_domain(Domain) -> 71 Base = [["DomainName", url_encode(Domain)]| 72 base_parameters("DeleteDomain")], 73 rest_request(Base). 74 75 replace_attributes(Domain, ItemName, Attributes) -> 76 put_attributes(Domain, ItemName, Attributes, true). 77 put_attributes(Domain, ItemName, Attributes) -> 78 put_attributes(Domain, ItemName, Attributes, false). 79 put_attributes(Domain, ItemName, Attributes, Replace) -> 80 Base = [["DomainName", url_encode(Domain)], 81 ["ItemName", url_encode(ItemName)]| 82 base_parameters("PutAttributes")] ++ 83 encode_attributes(Attributes), 84 Base1 = if Replace == false -> Base; true -> Base ++ [["Replace", "true"]] end, 85 rest_request(Base1). 86 87 88 delete_item(Domain, ItemName) -> 89 delete_attributes(Domain, ItemName). 90 91 92 delete_attributes(Domain, ItemName) -> 93 delete_attributes(Domain, ItemName, nil). 94 delete_attributes(Domain, ItemName, AttributeNames) -> 95 Base = [["DomainName", url_encode(Domain)], 96 ["ItemName", url_encode(ItemName)]| 97 base_parameters("DeleteAttributes")] ++ 98 encode_attribute_names(AttributeNames), 99 rest_request(Base). 100 101 get_attributes(Domain, ItemName) -> 102 get_attributes(Domain, ItemName, nil). 103 get_attributes(Domain, ItemName, AttributeNames) -> 104 Base = [["DomainName", url_encode(Domain)], 105 ["ItemName", url_encode(ItemName)]| 106 base_parameters("GetAttributes")] ++ 107 encode_attribute_names(AttributeNames), 108 Xml = rest_request(Base), 109 xml_names_values(xmerl_xpath:string("//Attribute", Xml)). 110 111 112 %%%------------------------------------------------------------------- 113 %%% Private Functions 114 %%%------------------------------------------------------------------- 115 116 rest_request(Params) -> 117 Url = uri() ++ query_string(Params), 118 Response = http:request(Url), 119 case Response of 120 {ok, {{_HttpVersion, StatusCode, _ErrorMessage}, _Headers, Body }} -> 121 error_logger:info_msg("URL ~p Status ~p~n", [Url, StatusCode]), 122 {Xml, _Rest} = xmerl_scan:string(Body), 123 %%%error_logger:info_msg("Xml ~p~n", [Xml]), 124 case StatusCode of 125 200 -> 126 Xml; 127 _ -> 128 Error = xml_values(xmerl_xpath:string("//Message/text()", Xml)), 129 throw({Error}) 130 end; 131 {error, Message} -> 132 case Message of 133 timeout -> 134 io:format("URL ~p Timedout, retrying~n", [Url]), 135 sleep(1000), 136 rest_request(Params); 137 true -> 138 throw({Message}) 139 end 140 end. 141 142 143 query_string(Params) -> 144 Params1 = lists:sort( 145 fun([Elem1, _], [Elem2, _]) -> 146 string:to_lower(Elem1) > string:to_lower(Elem2) end, 147 Params), 148 {QueryStr, SignatureData} = 149 lists:foldr(fun query_string/2, {"", ""}, Params1), 150 QueryStr ++ "Signature=" ++ url_encode(signature(SignatureData)). 151 152 153 query_string([Key, Value], {QueryStr, SignatureData}) -> 154 QueryStr1 = QueryStr ++ Key ++ "=" ++ url_encode(Value) ++ "&", 155 SignatureData1 = SignatureData ++ Key ++ Value, 156 {QueryStr1, SignatureData1}. 157 158 159 encode_attributes(Attributes) when Attributes == nil -> 160 []; 161 encode_attributes(Attributes) -> 162 {Encoded, _} = lists:foldr(fun encode_attributes/2, {[], 0}, Attributes), 163 Encoded. 164 165 encode_attributes([Key, Value], {Encoded, I}) -> 166 KeyName = "Attribute." ++ integer_to_list(I) ++ ".Name", 167 KeyValue = "Attribute." ++ integer_to_list(I) ++ ".Value", 168 {[[KeyName, Key], [KeyValue, Value]|Encoded], I+1}. 169 170 171 encode_attribute_names(Attributes) when Attributes == nil -> 172 []; 173 encode_attribute_names(Attributes) -> 174 {Encoded, _} = lists:foldr(fun encode_attribute_names/2, {[], 0}, Attributes), 175 Encoded. 176 177 encode_attribute_names(Key, {Encoded, I}) -> 178 KeyName = "Attribute." ++ integer_to_list(I) ++ ".Name", 179 {[[KeyName, Key]|Encoded], I+1}. 180 181 182 183 184 %%% 185 % Converts a number into 2-digit 186 %%% 187 two_digit(X) when is_integer(X), X >= 10 -> 188 integer_to_list(X); 189 two_digit(X) when is_integer(X), X < 10 -> 190 "0" ++ integer_to_list(X). 191 192 abs_two_digit(X) when X < 0 -> 193 two_digit(0-X); 194 abs_two_digit(X) when X >= 0 -> 195 two_digit(X). 196 197 %%% 198 % Returns Coordinated Universal Time (Greenwich Mean Time) time zone, 199 %%% 200 timestamp() -> 201 {{_, _, _}, {_LocalHour, _LocalMin, _}} = LocalDateTime = calendar:local_time(), 202 [{{Year, Month, Day}, {Hour, Min, Sec}}] = 203 calendar:local_time_to_universal_time_dst(LocalDateTime), 204 Z = gmt_difference(), 205 integer_to_list(Year) ++ "-" ++ two_digit(Month) ++ "-" ++ two_digit(Day) 206 ++ "T" ++ two_digit(Hour) ++ ":" ++ two_digit(Min) ++ ":" ++ 207 two_digit(Sec) ++ Z. 208 209 210 gmt_difference() -> 211 "-08:00". 212 213 214 215 %%% 216 % Returns HMAC encoded access key 217 %%% 218 signature(Data) -> 219 hmac(secret_key(), Data). 220 221 hmac(SecretKey, Data) -> 222 http_base_64:encode( 223 binary_to_list(crypto:sha_mac(SecretKey, Data))). 224 225 226 base_parameters(Action) -> 227 [["Action", Action], 228 ["AWSAccessKeyId", access_key()], 229 ["Version", version()], 230 ["SignatureVersion", "1"], 231 ["Timestamp", timestamp()]]. 232 233 234 235 %%% 236 % This method retrieves node value from the XML records that are returned 237 % after scanning tags. 238 %%% 239 xml_values(List) -> 240 lists:foldr(fun xml_values/2, [], List). 241 242 xml_values(#xmlText{value=Value}, List) -> 243 [Value|List]. 244 245 246 xml_names_values(List) -> 247 lists:foldr(fun xml_names_values/2, [], List). 248 249 xml_names_values(Xml, List) -> 250 [ #xmlText{value=Name} ] = xmerl_xpath:string("//Name/text()", Xml), 251 [ #xmlText{value=Value} ] = xmerl_xpath:string("//Value/text()", Xml), 252 [[Name, Value]|List]. 253 254 255 sleep(T) -> 256 receive 257 after T -> 258 true 259 end. 260 261 262 %%% 263 % URL encode - borrowed from CouchDB 264 %%% 265 url_encode([H|T]) -> 266 if 267 H >= $a, $z >= H -> 268 [H|url_encode(T)]; 269 H >= $A, $Z >= H -> 270 [H|url_encode(T)]; 271 H >= $0, $9 >= H -> 272 [H|url_encode(T)]; 273 H == $_; H == $.; H == $-; H == $: -> 274 [H|url_encode(T)]; 275 true -> 276 case lists:flatten(io_lib:format("~.16.0B", [H])) of 277 [X, Y] -> 278 [$%, X, Y | url_encode(T)]; 279 [X] -> 280 [$%, $0, X | url_encode(T)] 281 end 282 end; 283 url_encode([]) -> 284 []. 285 286 287 %%%------------------------------------------------------------------- 288 %%% Test Functions 289 %%%------------------------------------------------------------------- 290 291 test_list_domains() -> 292 list_domains(). 293 294 test_create_domain() -> 295 create_domain("TestDomain"), 296 ["TestDomain"] = lists:filter( 297 fun(Elem) -> Elem == "TestDomain" end, 298 list_domains()). 299 300 301 test_delete_domain() -> 302 delete_domain("TestDomain"), 303 [] = lists:filter( 304 fun(Elem) -> Elem == "TestDomain" end, 305 list_domains()). 306 307 test_put_get_attributes() -> 308 create_domain("TestDomain"), 309 Attributes = lists:sort([ 310 ["StreetAddress", "705 5th Ave"], 311 ["City", "Seattle"], 312 ["State", "WA"], 313 ["Zip", "98101"] 314 ]), 315 put_attributes("TestDomain", "TCC", Attributes), 316 Attributes = lists:sort(get_attributes("TestDomain", "TCC")). 317 318 test_put_delete_attributes() -> 319 create_domain("TestDomain"), 320 Attributes = lists:sort([ 321 ["StreetAddress", "705 5th Ave"], 322 ["City", "Seattle"], 323 ["State", "WA"], 324 ["Zip", "98101"] 325 ]), 326 put_attributes("TestDomain", "TCC", Attributes), 327 delete_attributes("TestDomain", "TCC"), 328 sleep(1000), %% Let it sync 329 [] = get_attributes("TestDomain", "TCC"). 330 331 test() -> 332 start(), 333 test_create_domain(), 334 test_delete_domain(), 335 test_put_get_attributes(), 336 test_put_delete_attributes(), 337 stop(). 338
Note that in Erlang, I am hard coding time-zone difference, I wish Erlang had better support for time-zones, and URL encoding for UTF8.
Another word of caution, in order to use this code for real applications, you will need better handle on error processing especially because web services may timeout and you will need to retry.
You can download the code from here:
http://weblog.plexobject.com/SdbDelegate.java
http://weblog.plexobject.com/SdbDelegateTest.java
http://weblog.plexobject.com/sdb_delegate.erl
I am also adding the Erlang code to http://code.google.com/p/erlsdb/.
Further References:
http://awsdocs.s3.amazonaws.com/SDB/2007-11-07/sdb-dg-2007-11-07.pdf?
http://docs.amazonwebservices.com/AmazonSimpleDB/2007-11-07/DeveloperGuide/?
http://aws.amazon.com/simpledb
http://www.informationweek.com/news/showArticle.jhtml?articleID=204803008
You can send me comments or suggestions to improve this code at bhatti AT plexobject DOT com.
Update (Dec 17, 2007): I saw some comments on how unRESTful SimpleDB’s REST APIs are and I agree. Here is my attempt to RESTify them:
Create Domain
PUT http://sdb.amazonaws.com/DomainName
I am using PUT because I am defining the URL and per REST specs I should use PUT instead of POST. Also, I am relying on HTTP headers for authentication information.
Delete Domain
DELETE http://sdb.amazonaws.com/DomainName
List Domains
GET http://sdb.amazonaws.com?MaxDomains=XX&NextToken=YY
I am specifying parameters explicitly rather than using changing URI
Put Attributes
POST http://sdb.amazonaws.com/DomainName/ItemName
—POST BODY-BEGIN—
attributeName=attributeValue&…
—POST BODY-END—
I am specifying attributes as part of the POST body
Delete Attributes
DELETE http://sdb.amazonaws.com/DomainName/ItemName?attributeName1&…
Get Attributes
GET http://sdb.amazonaws.com/DomainName/ItemName?attributeName1&…
Query Items
GET http://sdb.amazonaws.com/DomainName/ItemName?queryExpression=…
Note that these URLs are based on Sam Ruby’s book on REST APIs. Also, Rails uses similar URLs for RESTful controllers.
December 11, 2007
Hersh: children raped at Abu Ghraib, Pentagon has videos
Hersh: children raped at Abu Ghraib, Pentagon has videos
” Some of the worst things that happened you don’t know about, okay? Videos, um, there are women there. Some of you may have read that they were passing letters out, communications out to their men. This is at Abu Ghraib … The women were passing messages out saying ‘Please come and kill me, because of what’s happened’ and basically what happened is that those women who were arrested with young boys, children in cases that have been recorded. The boys were sodomized with the cameras rolling. And the worst above all of that is the soundtrack of the boys shrieking that your government has. They are in total terror. It’s going to come out.”
Listen Hersh’s recording
December 4, 2007
Starting cluster of Erlang nodes on EC2
Erlang is a functional language with strong support for concurrency and distribution. It is quite trivial to start Erlang on multiple hosts and connect them. However, you have to write custom scripts to start these nodes. For example, on a set of local machines, ssh with public/private keys can be used to start cluster of nodes. If you don’t have the computing environment, you can lease your servers using Amazon’s EC2 webservice. In this blog, I am going to show how to start Erlang nodes on the instances of EC2. I have broken the instructions into two parts, setting up the EC2 instances and starting the Erlang cluster.
Setting up EC2
Get an Account
If you don’t have S3 and EC2 account, you can get account for Amazon Simple Storage Service
and account for Amazon Elastic Compute Cloud.
Create X.509 certificate
Select the “AWS Access Key Identifiers under “Your Web Services Account” and follow the “Create New” button in this section to create a new X.509 certificate. Also, save them locally.
Download EC2 Toolkit
Download the Command Line Tools.
Setup Environment Variables
export EC2_HOME=export PATH=$PATH:$EC2_HOME/bin export EC2_PRIVATE_KEY=$EC2_HOME/pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem export EC2_CERT=$EC2_HOME/cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem
Finding a Suitable AMI
Use following command to view default images:
ec2-describe-images -o self -o amazon
Look for the line containing the public image identified by the ec2-public-images/getting-started.manifest.xml value in the third column. You can also use an image that I created, which includes Erlang distribution with an id of “ami-23c92c4a”.
Generating a Keypair
Use following command to create a keypair:
ec2-add-keypair gsg-keypair
create a file named id_rsa-gsg-keypair and paste everything between (and including) the “—–BEGIN RSA PRIVATE KEY—–” and “—–END RSA PRIVATE KEY—–” lines into it.
Running an Instance
Use following command to start EC2 instance:
ec2-run-instances ami-23c92c4a -k gsg-keypair
and you may see something like:
RESERVATION r-d8fd14b1 275961154068 default INSTANCE i-1246b27b ami-23c92c4a pending gsg-keypair 0 m1.small 2007-12-04T17:34:10+0000
Check status
ec2-describe-instances i-1246b27b
Authorizing Network Access to Your Instances
You can open certain ports you need using
ec2-authorize default -p 22 ec2-authorize default -p 80 ec2-authorize default -p 4369
Connecting to your Instance
ssh -i id_rsa-gsg-keypair root@ec2-72-44-51-166.z-1.compute-1.amazonaws.com
open browser to http://ec2-72-44-51-166.z-1.compute-1.amazonaws.com/
Creating an Image
Modifying an Existing Image
sed -i -e 's/Congratulations!/Congratulations Shahzad Bhatti/' /var/www/html/index.html ls -l /var/www/html/index.html
Preparing for Bundling
Copy your private key and certificate to the machine being bundled.
cd $EC2_HOME scp -i id_rsa-gsg-keypair pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem root@ec2-72-44-51-166.z-1.compute-1.amazonaws.com:/mnt
ec2-bundle-vol -d /mnt -k /mnt/pk-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem -c /mnt/cert-AR7Z7HJUXAIIHL6CR43HD4PGN75KEJNV.pem -u your-id -s 1536
Bundling
ec2-upload-bundle -b ec2-bucket -m /mnt/image.manifest.xml -a key -s password
Registering the AMI
cd /home/shahbhat/ec2-api-tools-1.2-13740 ec2-register ec2-bucket/image.manifest.xml
Deregistering Your AMI
ec2-deregister ami-47c0252e
Removing Your AMI from Amazon S3
ec2-delete-bundle -b-p image -a -s
Terminating Your Instances
ec2-terminate-instances i-b15bb0d8
/sbin/shutdown -h now
Setting up Erlang Cluster
Here is the fun and easy part:
Starting instances
I am going to start three instances from my Erlang image by opening three different shells and typing following commands:
ssh -i id_rsa-gsg-keypair root@ec2-67-202-33-171.compute-1.amazonaws.com erl -sname master -setcookie ABC Eshell V5.5 (abort with ^G) (master@domU-12-31-38-00-6C-81)1> ssh -i id_rsa-gsg-keypair root@ec2-67-202-33-171.compute-1.amazonaws.com erl -sname slave1 -setcookie ABC Eshell V5.5 (abort with ^G) (slave1@domU-12-31-38-00-6C-81)1> ssh -i id_rsa-gsg-keypair root@ec2-67-202-20-199.compute-1.amazonaws.com erl -sname slave2 -setcookie ABC Eshell V5.5 (abort with ^G) (slave2@domU-12-31-38-00-40-F6)1>
Checking status of instances
Starting instances take a couple of minutes and you can check status with following command:
ec2-describe-instances
and it showed me something like:
RESERVATION r-f7fd149e 275961154068 default INSTANCE i-0146b268 ami-23c92c4a ec2-67-202-33-171.compute-1.amazonaws.com domU-12-31-38-00-6C-81.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:29:39+0000 RESERVATION r-c6fd14af 275961154068 default INSTANCE i-1046b279 ami-23c92c4a ec2-67-202-27-186.compute-1.amazonaws.com domU-12-31-38-00-35-D6.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:33:47+0000 RESERVATION r-d8fd14b1 275961154068 default INSTANCE i-1246b27b ami-23c92c4a ec2-67-202-20-199.compute-1.amazonaws.com domU-12-31-38-00-40-F6.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:34:10+0000
Connecting to the slaves from master
From the master node, I typed following commands to connect to the slave nodes:
net_adm:ping(list_to_atom("slave1@domU-12-31-38-00-6C-81")).
net_adm:ping(list_to_atom("slave1@domU-12-31-38-00-6C-81")).
and both returned pong
I then checked connected nodes with following command on the master node:
nodes().
and it returned
[‘slave1@domU-12-31-38-00-6C-81′,’slave2@domU-12-31-38-00-40-F6’]
I also ran the same command on slave1 and slave2 and got
nodes().
[‘master@domU-12-31-38-00-6C-81′,’slave2@domU-12-31-38-00-40-F6’]
(slave1@domU-12-31-38-00-6C-81)2>
nodes().
[‘master@domU-12-31-38-00-6C-81′,’slave1@domU-12-31-38-00-6C-81’]
Testing
I then typed following simple command on the master node to test cluster:
rpc:multicall(nodes(), io, format, ["Hello world~n", []]).
and it returned:
Hello world
Hello world
{[ok,ok],[]}
Shutting down instances
Since, EC2 charges based on usage, you need to shutdown instances after you are done.
ec2-describe-instances
which returned:
RESERVATION r-f7fd149e 275961154068 default INSTANCE i-0146b268 ami-23c92c4a ec2-67-202-33-171.compute-1.amazonaws.com domU-12-31-38-00-6C-81.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:29:39+0000 RESERVATION r-c6fd14af 275961154068 default INSTANCE i-1046b279 ami-23c92c4a ec2-67-202-27-186.compute-1.amazonaws.com domU-12-31-38-00-35-D6.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:33:47+0000 RESERVATION r-d8fd14b1 275961154068 default INSTANCE i-1246b27b ami-23c92c4a ec2-67-202-20-199.compute-1.amazonaws.com domU-12-31-38-00-40-F6.compute-1.internal running gsg-keypair 0 m1.small 2007-12-04T17:34:10+0000
I then ran ec2-terminate-instances i-0146b268 i-1046b279 i-1246b27b
Conclusion
As you can see once you have the basic EC2 instances running, starting the Erlang cluster is pretty trivial.
November 14, 2007
Evaluating dynamic mathematical expressions in Java/Javascript
Recently, I was looking for a tool to evaluate some mathematical expressions in Java, which were stored in some configuration file. I like dynamic features of languages like Ruby, Python, Javascript or even Erlang to evaluate snippet of code dynamically. Since, Java does not support this inherently, I tried looking for some scripting language that I can embed inside Java. Also, I needed a scripting language that provides two-way integration, i.e., ability to call scripting language from Java and an ability to call Java from the scripting language. So I chose Rhino implementation of Javascript which met those goals. Here is a small example that shows how you can do that using Java and Javascript:
1 2 import java.util.Collection; 3 import java.util.ArrayList; 4 import java.util.HashSet; 5 import org.mozilla.javascript.*; 6 import java.math.BigDecimal; 7 import junit.framework.TestCase; 8 import junit.framework.Test; 9 import junit.framework.TestSuite; 10 11 12 13 public class JsEvalTest extends TestCase { 14 private static final String FUNCS_PREFIX = "funcs"; 15 private Context cx; 16 private Scriptable scope; 17 18 public void setUp() throws Exception { 19 cx = Context.enter(); 20 cx.setLanguageVersion(Context.VERSION_1_2); 21 scope = cx.initStandardObjects(); 22 ScriptableObject.defineClass(scope, Functions.class); 23 Scriptable funcs = cx.newObject(scope, "Functions", new Object[] {new ArrayList()}); 24 scope.put(FUNCS_PREFIX, scope, funcs); 25 } 26 27 public void tearDown() { 28 Context.exit(); 29 } 30 31 public void testFormula() throws Exception { 32 String formula = "cpp = cp / spr"; 33 addDependentVariables(formula); 34 cx.evaluateString(scope, formula, "<cmd>", 1, null); 35 Object varValue = scope.get("cpp", scope); 36 assertEquals(new Double(0.5), varValue); 37 } 38 39 public void testFunction() throws Exception { 40 String formula = "a = " + FUNCS_PREFIX + ".actuals(cp, spr)"; 41 addDependentVariables(formula); 42 cx.evaluateString(scope, formula, "<cmd>", 1, null); 43 Object varValue = scope.get("a", scope); 44 assertEquals(new Double(20000), varValue); 45 } 46 47 private BigDecimal getValueFor(String var) { 48 if ("cp".equals(var)) return new BigDecimal(100); 49 else if ("spr".equals(var)) return new BigDecimal(200); 50 else return null; 51 } 52 53 public static Test suite() { 54 TestSuite suite = new TestSuite(); 55 suite.addTestSuite(JsEvalTest.class); 56 return suite; 57 } 58 59 ///////////////////////////////////////////////////////////////// 60 // 61 private void addDependentVariables(String formula) { 62 Collection<String> dependentVars = getDependentVars(formula); 63 for (String dependentVar : dependentVars) { 64 Object dependentValue = getValueFor(dependentVar); 65 if (dependentValue != null) scope.put(dependentVar, scope, dependentValue); 66 } 67 } 68 private Collection<String> getDependentVars(String formula) { 69 CompilerEnvirons compilerEnv = new CompilerEnvirons(); 70 compilerEnv.initFromContext(cx); 71 compilerEnv.setGeneratingSource(false); 72 Parser p = new Parser(compilerEnv, compilerEnv.getErrorReporter()); 73 ScriptOrFnNode root = p.parse(formula, null, 1); 74 return getDependentVars(root); 75 } 76 77 private Collection<String> getDependentVars(final ScriptOrFnNode tree) { 78 Collection<String> vars = new HashSet<String>(); 79 getDependentVars(tree, tree, vars); 80 return vars; 81 } 82 83 private void getDependentVars(final ScriptOrFnNode tree, final Node parent, Collection<String> vars) { 84 Node node = null; 85 siblingLoop: 86 for (;;) { 87 Node previous = null; 88 if (node == null) { 89 node = parent.getFirstChild(); 90 } else { 91 previous = node; 92 node = node.getNext(); 93 } 94 if (node == null) { 95 break; 96 } 97 98 int type = node.getType(); 99 if (type == Token.NAME) { 100 String className = node.getClass().getName(); 101 if ("org.mozilla.javascript.Node$StringNode".equals(className)) { 102 vars.add(node.getString()); 103 } 104 } 105 getDependentVars(tree, node, vars); 106 } 107 } 108 109 110 public static void main(String[] args) { 111 junit.textui.TestRunner.run(suite()); 112 } 113 } 114 ~ 115
One gotcha, I came across with Rhino was that I also needed to know the input variables to expressions so that I can bind them properly before executing them. Luckily, Rhino is open source and I looked at the source code and hacked a way to parse the expression and find all named variables. For production, I am caching it, but it’s pretty simple to do. For example:
77 private Collection<String> getDependentVars(final ScriptOrFnNode tree) { 78 Collection<String> vars = new HashSet<String>(); 79 getDependentVars(tree, tree, vars); 80 return vars; 81 } 82 83 private void getDependentVars(final ScriptOrFnNode tree, final Node parent, Collection<String> vars) { 84 Node node = null; 85 siblingLoop: 86 for (;;) { 87 Node previous = null; 88 if (node == null) { 89 node = parent.getFirstChild(); 90 } else { 91 previous = node; 92 node = node.getNext(); 93 } 94 if (node == null) { 95 break; 96 } 97 98 int type = node.getType(); 99 if (type == Token.NAME) { 100 String className = node.getClass().getName(); 101 if ("org.mozilla.javascript.Node$StringNode".equals(className)) { 102 vars.add(node.getString()); 103 } 104 } 105 getDependentVars(tree, node, vars); 106 } 107 } 108
Another nice thing about Javascript is that you can not only call regular Java classes, but can also define Javascriptish methods, e.g.
import org.mozilla.javascript.*;
public class Functions extends ScriptableObject {
// The zero-argument constructor used by Rhino runtime to create instances
public Functions() {
}
// Method jsConstructor defines the JavaScript constructor
public void jsConstructor(Object g) {
}
// The class name is defined by the getClassName method
public String getClassName() { return "Functions"; }
// The method jsGet_count defines the count property.
public int jsGet_ratio_value() { return 500;}
public int jsGet_actuals_value() { return 5000;}
// Methods can be defined using the jsFunction_ prefix. Here we define
// resetCount for JavaScript.
public double jsFunction_actuals(double a, double b) {
return a * b;
}
public double jsFunction_ratio(double a, double b) {
return a / b;
}
}
November 2, 2007
Erlang translation of States Puzzle
I came across neat solution from A Simple Programming Puzzle Seen Through Three Different Lenses regarding states puzzle from Mark Nelson, which was originally posted NPR. Here is translation of Anders Pearson’s solution in Erlang:
1 -module(states). 2 3 %%-------------------------------------------------------------------- 4 %% External exports 5 %%-------------------------------------------------------------------- 6 -export([ 7 test/0 8 ]). 9 10 states() -> 11 ["alabama","alaska","arizona","arkansas","california","colorado", 12 "connecticut","delaware","florida","georgia","hawaii","idaho", 13 "illinois","indiana","iowa","kansas","kentucky","louisiana", 14 "maine","maryland","massachusetts","michigan","minnesota", 15 "mississippi","missouri","montana","nebraska","nevada", 16 "newhampshire","newjersey","newmexico","newyork","northcarolina", 17 "northdakota","ohio","oklahoma","oregon","pennsylvania","rhodeisland", 18 "southcarolina","southdakota","tennessee","texas","utah","vermont", 19 "virginia","washington","westvirginia","wisconsin","wyoming"]. 20 21 test() -> 22 States = states(), 23 Dict = dict:new(), 24 {_, _, MatchingStates} = lists:foldl(fun lookup/2, {Dict, States, []}, States), 25 [["northcarolina","southdakota"],["southcarolina","northdakota"]] = MatchingStates. 26 27 lookup(State, {Dict, States, MatchingStates}) -> 28 {State, Dict1, MatchingStates1} = lists:foldl(fun lookup2/2, {State, Dict, MatchingStates}, States), 29 {Dict1, States, MatchingStates1}. 30 31 lookup2(State1, {State2, Dict, MatchingStates}) -> 32 Key = lists:sort(lists:flatten(State1 ++ State2)), 33 Response = dict:find(Key, Dict), 34 MatchingStates1 = case Response of 35 {ok, [State2, State1] } -> 36 MatchingStates; 37 {ok, _} -> 38 [[State1, State2]|MatchingStates]; 39 _ -> 40 MatchingStates 41 end, 42 Dict1= dict:store(Key, [State1, State2], Dict), 43 {State2, Dict1, MatchingStates1}.
October 2, 2007
RE: Erlang: the verdict
I read an interesting blog from Cedric Beust on Erlang: the verdict . Since, Joe Armstrong’s book, Erlang has come to the spotlight whether it’s ready or not. Cedric is well known in Java community and I have used his tools such as EJBGen that he wrote in late 90s and TestNG: a testing framework. Here is summary of objections that he raised on Erlang and my thoughts:
Object Oriented
His first objection is that Joe is biased against OO and Erlang is not an object oriented language. I can’t say having OO is must for every good language. Erlang is fundamentally functional language with good support for concurrent programming. And as I wrote earlier on Polymorphism in Erlang, Erlang is based on Actor model, where each object has a mailbox associated that receives messages. In this model, the message passing takes literal form rather than method invocation. Also, according to GoF author Ralph Johnson sequential Erlang is functional and concurrent Erlang is object oriented. And I admit method invocation based object oriented is easier than message passing, but it’s just matter of style. I believe abstraction, encapsulation, separation of concerns are more important than inheritance or combination of state/behavior.
Scalability
His second objection is Erlang claims to be scalable, but it only provides primitives to build scalable systems. Though, I agree building scalable systems requires scalable architecture and just using Erlang will not make your system scalable. Also, I agree that you can build scalable systems in any language. I have been building distributed systems for over ten years and I have used CORBA, EJB, messaging middlewares, JINI and all kind of RPC or Web Services. In a lot of ways OTP gives you comparable features, but all that is well integrated into Erlang platform. Also, one of basic requirement for distributed systems is building monitoring and fault tolerant system. Back in 90s when I worked on CORBA, I build pretty complexed monitoring system that allowed sysadmins to watch all servers on machines and see their health. The system also provided ability to automatic and manual start/stop of processes when they crashed. All that was a lot of work, but OTP gives you that capability right away. These days you can also get similar features from application servers and commercial monitoring systems, though open source app servers still lack in that area.
Performance
Cedric also raised recent performance problem that Tim Bray had processing large files in Erlang. Erlang was built for telecommunication systems and though it’s now being marketed as general purpose language, but it still lacks many of the features for general purpose language. Slow I/O or slow regular expression is one of them. I hear a lot of people dismiss performance by just saying computing cycles are cheap and human cycles matter more, and just throw another box. I think CPU cycles still count, especially when difference is in the order of magnitude than other solutions. I also heard from Ebay architect Dan Pritchett, how building large data centers are running into a wall because of power demands. Erlang works best when your application is not CPU-bound, so in that case I would not recommend using it. I think now that Erlang is under spotlight, the future releases of Erlang will address slow I/O, regular expressions and commonly used primitives.
Five 9s
In Erlang fault tolerance is very different from traditional approach in Java, where server tries to handle exceptions and keep running. In Erlang when an error occurs the process simply dies (if it chooses) and supervisor can restart it. Again, this can be built in any language. Erlang provides nice primitives for building fault tolerant systems, though I agree that Erlang does not give this for free, but you still have to build it. Cedric omitted hot code swap, which I have seen only in commercial application servers such as Weblogic and Websphere.
Lock free programming
I have been writing multi-threaded applications for over ten years and I like Erlang’s approach because I have spent too much time in fixing locking and synchronization problems. Cedric complains that Erlang uses locking internally and I am fine with that as long as it’s not exposed to the APIs. Though, I admit in certain cases locking can be useful. For example, one of the exercise from Joe’s book asks reader to implement a correct way to register a process when multiple processes trying to do that simultaneously. And I have seen a lot of clever solutions for that, which would have been easier with locking. I don’t like clever code because it becomes maintenance and debugging hell, so I would have liked simple locking in that case.
Distributed Data
In conclusion, Cedric mentions how distributing data is easier than distributing code. I agree and that is why I like Mnesia instead of centralized database approach. Our CTO Werner Vogels at Amazon has been talking for a while on CAP principle and how databases don’t scale. In one of the system we actually had a database that could not scale and was replaced by a collection of distributed berkley databases. Erlang gives you that capability with Mnesia.
Conclusion
I think Erlang is a specialized language that may not be suitable for all applications, and I have been trying to learn it myself and writing a few applications. I like skepticism that Cedric showed, I think it’s important to have a healthy skepticism when trying new thing. I see too often, a new thing is simply marketed as a Silver Bullet. Also, the authors or makers of new product have their own agenda. Though, you cannot judge or give a verdict just by reading a book. You have to try it yourself in a real project. I like a quote from Matrix movie: “You do not know someone until you fight them.” You can only learn something by trying its limits and not by just reading a book. That is what I am trying to do, and despite the cryptic error messages in Erlang I will stick with it. I think the new era of application development is multi-language, where you use right language for the right task and Erlang is just another tool that will only work for the kind of problems that it inherently was designed to support. Finally, as Fred Brooks wrote there aren’t Silver Bullets, but I see all the time, people are quick to hail new language, methodology or software as that. I believe with good people you can do everything and build the best systems with any language, methodology or process.
September 19, 2007
Erlang Solution for finding 51 billionth letter when converting integers from 1 to 999,999,999 are written as words.
I came across an interesting problem from ITA software about If the integers from 1 to 999,999,999 are written as words, sorted alphabetically, and concatenated, what is the 51 billionth letter? For those who don’t know ITA software writes software to search airline flights and I used it when working for Orbitz. Their software is still the fastest I know and I like this problem that they posted for recruiting. Since, I am learning Erlang lately, so I gave a try in it. Here is the source code:
1 -module(ita_puzzle). 2 -compile(export_all). 3 4 5 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 6 %%% Puzzle from ITASoftware http://www.itasoftware.com/careers/puzzles07.html 7 %%% Word Numbers 8 %%% "If the integers from 1 to 999,999,999 are written as words, sorted 9 %%% alphabetically, and concatenated, what is the 51 billionth letter?" 10 %%% To be precise: if the integers from 1 to 999,999,999 are expressed in 11 %%% words (omitting spaces, 'and', and punctuation[1]), and sorted 12 %%% alphabetically so that the first six integers are 13 %%% 14 %%% * eight 15 %%% * eighteen 16 %%% * eighteenmillion 17 %%% * eighteenmillioneight 18 %%% * eighteenmillioneighteen 19 %%% * eighteenmillioneighteenthousand 20 %%% 21 %%% and the last is 22 %%% 23 %%% * twothousandtwohundredtwo 24 %%% 25 %%% then reading top to bottom, left to right, the 28th letter completes 26 %%% the spelling of the integer "eighteenmillion". 27 %%% 28 %%% The 51 billionth letter also completes the spelling of an integer. 29 %%% Which one, and what is the sum of all the integers to that point? 30 %%% [1] For example, 911,610,034 is written 31 %%% "ninehundredelevenmillionsixhundredtenthousandthirtyfour"; 32 %%% 500,000,000 is written "fivehundredmillion"; 1,709 is written 33 %%% "onethousandsevenhundrednine". 34 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 35 36 37 38 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 39 % test function 40 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 41 test() -> 42 "ninehundredelevenmillionsixhundredtenthousandthirtyfour" = number_to_words(911610034), 43 "fivehundredmillion" = number_to_words(500000000), 44 "onethousandsevenhundrednine" = number_to_words(1709), 45 run(). 46 47 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 48 % Run creates 1 through 999999999, sorts them and prints the 49 % number at 51000000000th (51-billion) letter and sum of all 50 % numbers until that number. 51 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 52 run() -> 53 54 statistics(runtime), 55 statistics(wall_clock), 56 57 open(), 58 convert_numbers(999999999), 59 lists:foldl(fun run/2, {0, 51000000000, 0}, "abcdefghijklmnopqrstuvwxyz"), 60 close(), 61 62 {_, RT} = statistics(runtime), 63 {_, WC} = statistics(wall_clock), 64 io:format("Completed in ~p (~p) milliseconds.~n", [RT, WC]). 65 66 run(Letter, {I, Max, Sum}) -> 67 Matched = lists:sort(dets:lookup(wordfile, Letter)), 68 lists:foldl(fun print/2, {I, Max, Sum}, Matched). 69 70 71 72 73 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 74 % Rules for creating number suffixes 75 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 76 number_suffix(N, M) when N >= 1+303, M == 0 -> {"", 303}; 77 number_suffix(N, M) when N >= 1+100, M == 0 -> {"", 100}; 78 number_suffix(N, M) when N >= 1+63, M == 0 -> {"", 63}; 79 number_suffix(N, M) when N >= 1+60, M == 0 -> {"", 60}; 80 number_suffix(N, M) when N >= 1+57, M == 0 -> {"", 57}; 81 number_suffix(N, M) when N >= 1+54, M == 0 -> {"", 54}; 82 number_suffix(N, M) when N >= 1+51, M == 0 -> {"", 51}; 83 number_suffix(N, M) when N >= 1+48, M == 0 -> {"", 48}; 84 number_suffix(N, M) when N >= 1+45, M == 0 -> {"", 45}; 85 number_suffix(N, M) when N >= 1+42, M == 0 -> {"", 42}; 86 number_suffix(N, M) when N >= 1+39, M == 0 -> {"", 39}; 87 number_suffix(N, M) when N >= 1+36, M == 0 -> {"", 36}; 88 number_suffix(N, M) when N >= 1+33, M == 0 -> {"", 33}; 89 number_suffix(N, M) when N >= 1+30, M == 0 -> {"", 30}; 90 number_suffix(N, M) when N >= 1+27, M == 0 -> {"", 27}; 91 number_suffix(N, M) when N >= 1+24, M == 0 -> {"", 24}; 92 number_suffix(N, M) when N >= 1+21, M == 0 -> {"", 21}; 93 number_suffix(N, M) when N >= 1+18, M == 0 -> {"", 18}; 94 number_suffix(N, M) when N >= 1+15, M == 0 -> {"", 15}; 95 number_suffix(N, M) when N >= 1+12, M == 0 -> {"", 12}; 96 number_suffix(N, M) when N >= 1+9, M == 0 -> {"", 9}; 97 number_suffix(N, M) when N >= 1+6, M == 0 -> {"", 6}; 98 number_suffix(N, M) when N >= 1+3, M == 0 -> {"", 3}; 99 %%% 100 number_suffix(N, _) when N >= 1+303 -> {"centillion", 303}; 101 number_suffix(N, _) when N >= 1+100 -> {"googol", 100}; 102 number_suffix(N, _) when N >= 1+63 -> {"vigintillion", 63}; 103 number_suffix(N, _) when N >= 1+60 -> {"novemdecillion", 60}; 104 number_suffix(N, _) when N >= 1+57 -> {"octodecillion", 57}; 105 number_suffix(N, _) when N >= 1+54 -> {"septendecillion", 54}; 106 number_suffix(N, _) when N >= 1+51 -> {"sexdecillion", 51}; 107 number_suffix(N, _) when N >= 1+48 -> {"quindecillion", 48}; 108 number_suffix(N, _) when N >= 1+45 -> {"quattuordecillion", 45}; 109 number_suffix(N, _) when N >= 1+42 -> {"tredecillion", 42}; 110 number_suffix(N, _) when N >= 1+39 -> {"duodecillion", 39}; 111 number_suffix(N, _) when N >= 1+36 -> {"undecillion", 36}; 112 number_suffix(N, _) when N >= 1+33 -> {"decillion", 33}; 113 number_suffix(N, _) when N >= 1+30 -> {"nonillion", 30}; 114 number_suffix(N, _) when N >= 1+27 -> {"octillion", 27}; 115 number_suffix(N, _) when N >= 1+24 -> {"septillion", 24}; 116 number_suffix(N, _) when N >= 1+21 -> {"sextillion", 21}; 117 number_suffix(N, _) when N >= 1+18 -> {"quintillion", 18}; 118 number_suffix(N, _) when N >= 1+15 -> {"quadrillion", 15}; 119 number_suffix(N, _) when N >= 1+12 -> {"trillion", 12}; 120 number_suffix(N, _) when N >= 1+9 -> {"billion", 9}; 121 number_suffix(N, _) when N >= 1+6 -> {"million", 6}; 122 number_suffix(N, _) when N >= 1+3 -> {"thousand", 3}; 123 number_suffix(_, M) when M >= 100 -> {"hundred", 2}; 124 number_suffix(N, _) when N > 2 -> {"", 2}; 125 number_suffix(_, M) when M >= 90 -> {"ninety", 1}; 126 number_suffix(_, M) when M >= 80 -> {"eighty", 1}; 127 number_suffix(_, M) when M >= 70 -> {"seventy", 1}; 128 number_suffix(_, M) when M >= 60 -> {"sixty", 1}; 129 number_suffix(_, M) when M >= 50 -> {"fifty", 1}; 130 number_suffix(_, M) when M >= 40 -> {"fourty", 1}; 131 number_suffix(_, M) when M >= 30 -> {"thirty", 1}; 132 number_suffix(_, M) when M >= 20 -> {"twenty", 1}; 133 number_suffix(_, _) -> {undefined, 0}. 134 135 136 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 137 % Rules for converting numbers to words. 138 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 139 get_word("90") -> "ninety"; 140 get_word("80") -> "eighty"; 141 get_word("70") -> "seventy"; 142 get_word("60") -> "sixty"; 143 get_word("50") -> "fifty"; 144 get_word("40") -> "fourty"; 145 get_word("30") -> "thirty"; 146 get_word("20") -> "twenty"; 147 get_word("19") -> "nineteen"; 148 get_word("18") -> "eighteen"; 149 get_word("17") -> "seventeen"; 150 get_word("16") -> "sixteen"; 151 get_word("15") -> "fifteen"; 152 get_word("14") -> "fourteen"; 153 get_word("13") -> "thirteen"; 154 get_word("12") -> "twelve"; 155 get_word("11") -> "eleven"; 156 get_word("10") -> "ten"; 157 get_word("9") -> "nine"; 158 get_word("8") -> "eight"; 159 get_word("7") -> "seven"; 160 get_word("6") -> "six"; 161 get_word("5") -> "five"; 162 get_word("4") -> "four"; 163 get_word("3") -> "three"; 164 get_word("2") -> "two"; 165 get_word("1") -> "one"; 166 get_word([$0|T]) -> get_word(T); 167 get_word(_N) -> "". 168 169 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 170 % Opening Dets file 171 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 172 open() -> 173 File = "./ita_puzzles.dat", 174 file:delete(File), 175 case dets:open_file(wordfile, [{type, bag}, {file, [File]}]) of 176 {ok, wordfile} -> 177 wordfile; 178 {error,_Reason} -> 179 io:format("cannot open word file ~p~n", [File]), 180 exit(eDetsOpen) 181 end. 182 183 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 184 % Closing Dets file 185 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 186 close() -> dets:close(wordfile). 187 188 189 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 190 % Converting numbers from 1 to some range and store it in dets table. 191 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 192 convert_numbers(1) -> 193 true; 194 convert_numbers(Num) -> 195 Word = number_to_words(Num), 196 dets:insert(wordfile, {hd(Word), Word, Num}), 197 convert_numbers(Num-1). 198 199 print(_X, {Max, Max, Sum}) -> 200 {Max, Max, Sum}; 201 print({H, Word, Num}, {I, Max, Sum}) -> 202 %io:format("Number ~p ~p~n", [Num, Word]), 203 Len = length(Word), 204 Sum1 = Num + Sum, 205 I1 = I + Len, 206 if 207 I < Max, I1 >= Max -> Letter = lists:nth(Max-I, Word), 208 io:format("Found ~pth letter ~p, Word ~p Num ~p, sum ~p~n", [Max, Letter, Word, Num, Sum1]); 209 true -> 210 true 211 end, 212 {I1, Max, Sum1}. 213 214 215 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 216 % Converting a number to word 217 %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% 218 number_to_words(Num) -> 219 lists:flatten(number_to_words(integer_to_list(Num), [])). 220 221 number_to_words([], Out) -> 222 Out; 223 224 number_to_words(L, Out) -> 225 Len = length(L), 226 {Suffix, Cutoff} = number_suffix(Len, list_to_integer(L)), 227 number_to_words(Suffix, Cutoff, Len, L, Out). 228 229 number_to_words(undefined, _Cutoff, _Len, L, Out) -> 230 [get_word(L)|Out]; 231 232 number_to_words(Suffix, _Cutoff, Len, [_H|T], Out) when Len =< 2 -> 233 Suffix ++ get_word(T) ++ Out; 234 235 number_to_words(Suffix, Cutoff, Len, [H|_T]=L, Out) -> 236 {L1, L2} = lists:split(Len-Cutoff, L), 237 get_word(H) ++ number_to_words(L1, [Suffix]) ++ number_to_words(L2, Out). 238 239 240
This problem consists of two sub-problems, first one, which is relatively easy is to convert numbers into words, but the hardest problem is sorting and iterating over the results. And it’s going to take a long time to finish all that. I will have to run the program overnight and will post my solution once it is finished. By the way, I would have loved to see foldl method in dets that takes order_by function, unfortunately I had to create some hack to get chunks of data and then sort in memory. Any suggestions to improve this code would be welcome.
