One pattern I learned early in my career was to separate control flow from data flow. For example, when I first looked at the FTP protocol, I noticed that it used two separate ports for communication between client and server: a data port to transfer files and a control port to send and receive commands for managing the transfer. This allows the server to respond quickly to commands even when the data port is busy transferring large files. In some ways, this is similar to the Bulkhead pattern for partitioning components and limiting the blast radius. When a service reaches its capacity, it slows down all requests, including any requests to control or configure it. Thus, it helps to define a separate channel, a control-service, through which you can manage the service. You may also need to define special access-control policies for the control-service. For example, an admin may need to be on a trusted network for administration. In some cases, you may build the control-service behind the firewall while the data-service is publicly accessible. Another use-case is updating a service’s configuration at runtime: the control-service stores and updates the configuration and then publishes it to the data-service.
March 9, 2018
March 6, 2018
Tips from the second edition of “Release It!”
The first edition of “Release It!” has been one of the most influential books that I have read. It introduced a number of methods for writing fault-tolerant systems, such as the circuit-breaker and bulkhead patterns, so I was excited to read the second edition when it came out. Here are a few tips from the book:
In the first chapter, the author defines stability in terms of robustness, i.e., “A robust system keeps processing transactions, even when transient impulses, persistent stresses, or component failures disrupt normal processing.” He recommends focusing on longevity bugs such as resource leaks, e.g. you can set a timeout when invoking a network or database operation, or detect dead connections before reading or writing. The author cautions against tightly coupled systems that can easily propagate failures to other parts of the system. Similarly, a high number of integration points increases the probability of failure in one of those dependencies. The author suggests looking for blocking calls that can lead to deadlocks when using multiple threads and scrutinizing resource pools that get exhausted by blocking operations. Another cause of instability is a chain reaction, where the failure of one server increases load on the remaining servers; this can be remedied using the bulkhead or circuit-breaker patterns. High memory usage on the server can also constrain resources, and the author recommends using external caching systems such as Redis or memcached. In order to monitor the health of the system, the author suggests using a mock transaction to ensure it’s working as expected and keeping metrics on errors, especially login errors and high-latency warnings. One common self-inflicted failure is the self-denial attack, e.g. a marketing campaign that floods your own servers, which can be mitigated by using a shared-nothing architecture or reducing fan-in to shared resources. The servers can also be subjected to the dogpile effect, where demand spikes after an upgrade, cron job or config change; it can be mitigated by using random clock skew, adding random jitter and using exponential backoff. Finally, the author recommends monitoring slow responses, removing unbounded result sets and failing fast.
Here is a summary of stability patterns that the author recommends:
- Apply timeout with integration points and delayed retries when failure occurs.
- Apply circuit-breaker to prevent cascading failures (along with timeout).
- Apply bulk-head pattern to partition the system in the event of chain reaction failure.
- Steady state
- Data purging
- Log files
- Fail fast, restart fast and reintegrate
- Let it crash with limited granularity, e.g. boundary of actor.
- Supervision for monitoring, restarts, etc.
- Shed Load
- Back-pressure – queues must be finite for finite response time
- Governor – create governor to slow rate of actions (from automated response) so that humans can review it.
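The timeout, fail-fast and circuit-breaker items above can be sketched in a few lines. This is a minimal illustration rather than the book's implementation: the class name `CircuitBreaker`, the parameters `max_failures` and `reset_after`, and the injectable `clock` are all assumptions made for the example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after repeated failures, retries after a cool-down."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # seconds to wait before a trial call
        self.clock = clock                # injectable so the breaker is testable
        self.failures = 0
        self.opened_at = None             # None means the breaker is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of tying up a thread on a sick dependency.
                raise CircuitOpenError("breaker is open")
            # Cool-down elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open, or re-open after a failed trial
            raise
        # Success closes the breaker and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

In practice the wrapped function would also carry its own timeout, since a breaker only counts failures that actually return.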
The next few chapters focus on the network, machines and processes for building and deploying the code. The author offers several mechanisms for scaling and operations, such as load balancing with DNS, using a service registry for upgrades and fail-over, configuration, transparency, and collecting logs and metrics. The author recommends shedding load when under high load and using HTTP 503 to notify the load balancer. For example, queue length can be calculated as: (max-wait-time / mean-processing-time + 1) * processing-threads * 1.5. You can use a listen reject queue to return a 503 error, which prevents clients from reconnecting immediately. The control-plane chapter recommends tools for administration. It recommends a postmortem template that covers what happened, an apology and a commitment to improvement, and that emphasizes system failures (as opposed to human errors). The author recommends adding indicators such as traffic indicators, business transactions, users, resource pool health, database connection health, data consumption, integration point health, cache health, etc.
The security chapter offers standard best practices from OWASP, such as using parameterized queries to protect against SQL injection, and using high-entropy random session-ids and cookies for exchanging session-ids to protect against session hijacking and fixation. To protect against XSS (when a user’s input is rendered in HTML without escaping), the author recommends filtering input and escaping it when rendering. The author recommends using a random nonce and a strict SameSite policy to protect against CSRF. Similarly, the author recommends using the principle of least privilege, access control, etc. The admin tools can offer controls for resetting circuit breakers, adjusting connection pool sizes, disabling specific outbound integrations, reloading configuration, stopping the acceptance of load and toggling feature flags.
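The queue-length formula is easy to turn into a helper. The sketch below is only an illustration: the function name and the decision to round up to a whole number of slots are my assumptions, and both times must be in the same unit.

```python
import math


def max_queue_length(max_wait_time, mean_processing_time, processing_threads, headroom=1.5):
    """(max-wait-time / mean-processing-time + 1) * processing-threads * 1.5

    Rounding up is an assumption of this sketch, not part of the formula.
    """
    slots = (max_wait_time / mean_processing_time + 1) * processing_threads * headroom
    return math.ceil(slots)


# 1 second of acceptable wait, 250 ms mean processing time, 8 worker threads
print(max_queue_length(1.0, 0.25, 8))  # 60
```

Requests arriving when the queue is already at this length are the ones to shed with a 503.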
For ease of deployment, the author recommends automation, immutable infrastructure, continuous deployment, and rolling changes incrementally.
The author offers several recommendations on process and organization, such as the OODA loop for fast learning, functional/autonomous teams, evolutionary architecture, asynchrony patterns, loose clustering and creating options for the future.
Lastly, the author offers chaos engineering as a way to test the resilience of your system, using the Simian Army or writing your own chaos monkey. Overall, the new edition offers a few additional chapters on scaling, deployment, security and chaos engineering, and more war stories from the author’s consulting work.
January 15, 2017
Review of “Whiplash: How to Survive Our Faster Future”
I read “Whiplash: How to Survive Our Faster Future” by Joi Ito and Jeff Howe over the holidays. Joi Ito is a director of the MIT Media Lab. The MIT Media Lab was created by Nicholas Negroponte in 1985 to build an environment where the best ideas from the schools of arts and science can be married to produce the next revolutionary discoveries.
This book narrates anecdotes of how technology revolutionized human development in the past and how it continues to disrupt our lives today. As a consequence of Moore’s law and the Internet, technology is changing at an exponential speed. For such rapidly changing environments, this book provides key lessons that can prepare us for an uncertain future and paradigm shifts. In such exponential times, the invention of new technologies far outpaces our understanding of the moral and ethical consequences of those breakthroughs. As technologies can be used for both good and bad, they offer both the salvation and the demise of humanity.
Here are primary principles that authors present to shape the new world:
Emergence over Authority:
The invention of the Internet has facilitated communication and collaboration all over the world, so the best ideas can be easily shared and exchanged. As a result, institutions that relied on central authority are disintegrating. The authors present several examples of Emergence vs Authority, such as Blogs vs Newspapers, Wikipedia vs Encyclopedia, and central governments vs political revolutions driven by social networks.
In emergent systems, participants use simple rules and exchange information to build complex systems. Examples of emergent systems are ant colonies, slime mold, the brain, flocking birds, stock exchanges, biology, etc. Authoritarian systems enable incremental changes, whereas emergent systems are more adaptive and foster non-linear progress.
Pull Over Push:
Push-based systems control access, whereas pull-based systems use transparency and two-way communication and are able to cope with crises far better than push-based systems. The authors cite an example of a pull-based system from the meltdown of the Fukushima nuclear plant, which occurred as a result of a severe earthquake and a fourteen-foot tsunami. Joi and a team of volunteers across the world collaborated and built Geiger counters to take accurate readings of radiation. Other examples of pull-based systems are crowdfunding and crowdsourcing.
Compasses over Maps:
A map has detailed knowledge and an optimal route, whereas a compass offers more autonomy and flexibility in an unpredictable environment. The authors cite the example of the education system, where standardized tests and curricula deprive students of creativity and passion for learning. They use the culture of the Media Lab as an illustration, where the vision is based on a compass heading. It provides a framework for individual progress while leaving flexibility for interactions between groups.
Risk over Safety:
Traditional businesses are more risk averse: new ventures are thoroughly analyzed and millions are spent on studies. However, the cost of experimenting with new ideas has been drastically reduced in today’s market, so experimentation offers a much better return on investment.
Disobedience over Compliance:
Innovation requires creativity and breaking rules, so high-impact institutions require a culture of disobedience, one where criticism and diverse ideas are embraced.
Practice over Theory:
Due to low-cost of launching new products, an innovating organization requires a culture where experiments are valued more than detailed planning.
Diversity over Ability:
The authors provide lessons from biochemistry companies that used gamers to design protein molecules. They used gamers with diverse backgrounds, and those gamers had better pattern recognition than the biochemists with PhDs. Most organizations say they believe in diversity, yet most lack it, especially high-tech companies such as Facebook, Yahoo and Google.
Resilience over Strength:
Resilient organizations are like the immune system that can successfully recover from failures. The authors gave examples of cyber-security where there are threats from various sources and successful defense requires treating security systems as biological systems and building strong immune systems against those security risks.
Systems over Objects:
Systems over objects emphasize understanding the connections between people, communities, and the environment. Instead of optimizing an individual or an organization, we need to optimize the impact of innovations on an entire natural system.
Conclusion:
In this final chapter, the authors give examples from AI and machine learning, where deep learning and reinforcement learning have allowed machines to beat human experts in Chess and Go. The authors cite “The Singularity is Near” by Ray Kurzweil, who predicts that we will have an intelligence explosion by 2045. In this world, we will have to think about how humans and machines will work together.
November 15, 2016
Tips from “Algorithms to Live By”
“Algorithms to Live By” by Brian Christian and Tom Griffiths reviews computer algorithms from several domains and illustrates practical examples of applying those algorithms to real-life problems. Here is a list of some of those algorithms that I found very useful:
1. Optimal Stopping
This class of problems determines the optimal time to stop further processing when searching or selecting an option. Here are a few examples:
Secretary Hiring Problem
This is a famous math problem, which was defined by a mathematician named Merrill Flood. Its solution, the “Look-Then-Leap Rule”, finds the best candidate by reviewing the first 37% of the candidates without hiring and then hiring the first candidate who is better than all of the past candidates. There are several other applications of this algorithm, such as finding a life partner or apartment hunting. This problem assumes that you cannot go back to a previous candidate once you reject them, but there are variations of this algorithm that allow it, or that handle the case where the selected candidate rejects your offer.
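A small simulation makes the 37% rule concrete. This is a sketch under simplifying assumptions: candidates are distinct integer scores seen in random order, and the function names are mine rather than the book's.

```python
import random


def look_then_leap(scores, look_fraction=0.37):
    """Skip the first look_fraction of candidates, then take the first one
    that beats everyone seen so far."""
    cutoff = int(len(scores) * look_fraction)
    best_seen = max(scores[:cutoff], default=float("-inf"))
    for score in scores[cutoff:]:
        if score > best_seen:
            return score
    return scores[-1]  # nobody beat the benchmark, so we are left with the last one


def success_rate(n=100, trials=10_000, seed=1):
    """Fraction of trials in which the rule picks the single best candidate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        scores = list(range(n))
        rng.shuffle(scores)
        if look_then_leap(scores) == n - 1:
            wins += 1
    return wins / trials


print(success_rate())  # close to 0.37
```

The success rate also comes out near 37%, which is the curious symmetry of this problem.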
Selling a House
When selling a house, you need to determine the range of expected offers and cost of waiting for the best offer.
Finding a Parking Spot
Given the percentage of parking spots that are available, you determine how many vacant spots you can pass on the way to your destination, and from what distance you should take the first vacant spot you see.
2. Explore/Exploit
In this chapter, authors describe several algorithms for exploring available paths and then using the optimal path. Here is a sampling of the approaches based on explore/exploit:
Multi-armed bandit
Given expected value of a slot machine (winnings/# of pulls), you need to maximize winnings. There are several approaches such as:
- Win-Stay – you keep using a slot machine as long as you are winning and then switch to a different machine when you lose.
- Gittins Index – it is named after Gittins, who was a professor at Oxford. It tries to maximize future payoffs by calculating a Gittins index for all slot machines and then selecting the slot machine with the highest Gittins index.
- Regret and optimism – many problems in life can be defined in terms of regret and optimism by imagining being on your deathbed and thinking of decisions that you could have made differently.
- Upper Confidence Bound – it is also referred to as optimism in the face of uncertainty, where you choose your actions as if the environment is as nice as is plausibly possible. Given a range of plausible values for each option, you pick the option whose confidence interval has the highest upper bound.
- A/B Testing – it is often used to test new features by offering the new features to a subset of the customers.
One insight the authors present is that people often explore for too long, favoring new options over the best older option.
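To make the Upper Confidence Bound idea concrete, here is a minimal sketch of the standard UCB1 rule (a well-known variant; the book does not give code). The function names and the simulated payout probabilities are assumptions for illustration.

```python
import math
import random


def ucb1_choose(pulls, wins, total_pulls):
    """Pick the arm with the highest optimistic estimate of its payout rate."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm  # try every machine once before comparing bounds

    def upper_bound(arm):
        mean = wins[arm] / pulls[arm]
        # The bonus shrinks as an arm is pulled more, so uncertainty earns optimism.
        return mean + math.sqrt(2 * math.log(total_pulls) / pulls[arm])

    return max(range(len(pulls)), key=upper_bound)


def play(payout_probs, rounds=2000, seed=7):
    """Simulate slot machines with hidden payout probabilities."""
    rng = random.Random(seed)
    pulls = [0] * len(payout_probs)
    wins = [0] * len(payout_probs)
    for t in range(1, rounds + 1):
        arm = ucb1_choose(pulls, wins, t)
        pulls[arm] += 1
        if rng.random() < payout_probs[arm]:
            wins[arm] += 1
    return pulls


print(play([0.2, 0.5, 0.8]))  # most pulls should go to the last machine
```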
3. Sorting
In this chapter, the authors describe several algorithms for sorting and their computing cost in terms of Big-O notation. Big-O notation is generally used to indicate an algorithm’s worst-case performance, such as:
- O(1): Constant cost
- O(N): Linear cost
- O(N^2): Quadratic cost
- O(2^N): Exponential cost
- O(N!): Factorial cost
Merge-Sort
This algorithm breaks data recursively into smaller sets until there is a single element. It then merges those subsets to create a new sorted list.
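As a quick illustration of that split-and-merge description, a straightforward (not tuned) version looks like this:

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:
        return list(items)  # a single element is already sorted
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])

    # Merge the two sorted halves by repeatedly taking the smaller head.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


print(merge_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```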
Bucket-Sort
A group of n items can be grouped into m buckets in O(nm) time and this insight is used by bucket sorting where items are grouped into a number of sorted buckets. For example, you can use this approach to load returned books into carts based on the shelf numbers.
Sorting is a prerequisite for searching, and there are a lot of practical applications for sorting, such as creating matchups between teams. For example, teams can use round-robin matchups, where each team plays every other team, but that results in a lot of matches (O(N^2)). Instead, competitions such as March Madness use a Merge-Sort-like bracket to move from 64 teams to 32, 16, 8, 4 and the finals. However, it isn’t a full sort, as there are only 63 games in the tournament instead of 192.
4. Caching
In computer design, John von Neumann proposed a memory hierarchy to improve lookup performance, and caching was first used in IBM 360 mainframes. Other computer researchers such as Belady designed algorithms for handling page faults, which load data from disk into memory. There are several algorithms for cache eviction, such as First-In, First-Out, Least-Recently-Used, etc.
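Least-Recently-Used eviction is simple to sketch with Python's `collections.OrderedDict`. The class name and its `get`/`put` methods are illustrative choices of mine, not an API from the book.

```python
from collections import OrderedDict


class LRUCache:
    """Evicts the entry that has gone unused for the longest time."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the least recently used entry


cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now the most recently used
cache.put("c", 3)       # evicts "b"
print(cache.get("b"))   # None
```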
5. Scheduling
Here are a few of the scheduling algorithms described in this chapter:
Earliest Due Date Strategy
It minimizes the maximum lateness by choosing the task with the earliest due date first.
Moore’s algorithm
It is similar to Earliest Due Date, but it throws out the biggest task whenever the new job can’t be completed by its due date, which minimizes the number of late tasks.
The authors give the example of the Getting Things Done (GTD) technique for time management, where small tasks are handled first. The tasks can also have a weight or priority; the scheduler then minimizes the sum of weighted completion times by dividing the weight by the length of each task and selecting the task with the highest density.
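Both the Earliest Due Date rule and the weight-per-length (density) rule amount to a single sort. The sketch below uses made-up task tuples; the tuple layouts and function names are assumptions for illustration.

```python
def earliest_due_date(tasks):
    """tasks: (name, duration, due). Order by due date to minimize maximum lateness."""
    return sorted(tasks, key=lambda t: t[2])


def max_lateness(order):
    """Largest (completion time - due date) when tasks run back to back."""
    clock = 0
    worst = float("-inf")
    for _name, duration, due in order:
        clock += duration
        worst = max(worst, clock - due)
    return worst


def by_density(jobs):
    """jobs: (name, duration, weight). Highest weight per unit of time goes first."""
    return sorted(jobs, key=lambda j: j[2] / j[1], reverse=True)


def weighted_completion_time(order):
    """Sum of weight * completion time when jobs run back to back."""
    clock = 0
    total = 0
    for _name, duration, weight in order:
        clock += duration
        total += weight * clock
    return total


tasks = [("report", 3, 4), ("email", 1, 2), ("review", 2, 3)]
print(max_lateness(tasks), max_lateness(earliest_due_date(tasks)))  # 3 2

jobs = [("big", 4, 4), ("quick", 1, 3), ("mid", 2, 4)]
print(weighted_completion_time(jobs), weighted_completion_time(by_density(jobs)))  # 59 43
```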
Here are a few issues that can arise with priority based tasks:
- Priority Inversion – when a low priority task possesses a resource and scheduler executes a higher priority task, which cannot make any progress. One way to address this issue is by allowing the low-priority task to inherit the priority of higher priority task and let it complete.
- Thrashing – it occurs when the system grinds to a halt because work cannot be completed due to lack of resources.
- Context switching – modern operating systems use context switching to work on multiple tasks, but each slice of time needs to be big enough for the task to make progress. One technique to minimize context switching is interrupt coalescing, which delays hardware interrupts. Similar techniques can be used for batching small tasks, e.g. the Getting Things Done technique encourages setting aside a chunk of time to handle similar tasks such as checking emails, making phone calls, etc.
6. Bayes’s Rule
Reverend Thomas Bayes postulated Bayes’s rule by looking at winning and losing tickets to determine the overall ticket pool. It was later proved by Pierre-Simon Laplace, and the result is commonly referred to as Laplace’s law. Laplace worked out Bayes’s Rule to use prior knowledge in prediction problems.
Copernican Principle
Richard Gott hypothesized that the moment you observe something, it is likely to be in the middle of its lifetime.
Normal or Gaussian distribution
It has a bell curve and can be used to predict average life span.
Power-law distribution
It describes quantities that range over many scales, such as the population of cities or the income of people.
Multiplicative Rule
It multiplies the quantity observed so far by some constant factor.
Average Rule
It uses the distribution’s natural average.
Additive Rule
It predicts that things will go on for just a constant amount longer, such as the “five more minutes” rule.
7. Overfitting
In machine learning, overfitting occurs when a model fits its factors so tightly to the training data that it doesn’t accurately predict the outcome for data that it has not observed.
Cross Validation
Overfitting can be addressed with cross-validation, which assesses a model not just against the training data but also against unseen data.
Regularization
It uses constraints to penalize complexity.
Lasso
It uses the total weight of the different factors as a penalty to minimize complexity.
8. Relaxation
In constraint optimization problems, you need to find the best arrangement of a set of variables given a set of rules and a scoring mechanism, such as the traveling salesman problem (O(N!)). Using constraint relaxation, you remove some of the problem constraints, e.g. you can create a minimum spanning tree that connects all nodes in O(N^2) time. Techniques such as Lagrangian Relaxation remove some of the constraints and add them to the scoring system instead.
9. Randomness
This chapter describes examples of algorithms that are based on random numbers such as:
Monte Carlo Method
It uses random samples to handle qualitatively unmanageable problems.
Hill Climbing
It takes a solution and tries to improve it by permuting some of the factors. It only accepts changes if it results in improvements. However, it may not find the globally optimal solution.
Jitter
It makes random small changes and accepts them even if they don’t improve in order to find the better solution.
Metropolis algorithm
It uses Monte Carlo Method and accepts bad and good tweaks in trying different solutions.
Simulated Annealing
It optimizes problems like annealing by heating up and slowly cooling off.
10. Networking
This chapter describes algorithms used in the computer network such as:
Packet switching
One of the key ideas of the Internet was to use packet switching, where TCP/IP sends data packets over a number of connections, as opposed to the dedicated lines or circuit switching used by phone companies.
Acknowledgement
It is used to let the sender know that a packet has been received. TCP/IP uses a three-way handshake to establish a connection, and the sender resends packets if an ACK is not received.
Exponential Backoff
It increases the average delay after each successive failure.
Flow Control
TCP/IP uses Additive Increase Multiplicative Decrease (AIMD) to gradually increase the number of packets sent and to cut the transmission rate in half when an ACK is not received.
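Additive Increase Multiplicative Decrease fits in a few lines. This is a toy model rather than real TCP: the window is just a number, each event is a round in which the ACK either arrived or did not, and the parameter names are assumptions.

```python
def aimd(events, window=1.0, increase=1.0, decrease=0.5, floor=1.0):
    """Return the window size after each round.

    events: iterable of booleans, True if the ACK arrived in that round.
    """
    history = []
    for acked in events:
        if acked:
            window += increase                        # additive increase
        else:
            window = max(floor, window * decrease)    # multiplicative decrease
        history.append(window)
    return history


print(aimd([True, True, True, False, True]))  # [2.0, 3.0, 4.0, 2.0, 3.0]
```

Plotting the history gives the familiar sawtooth: slow climbs punctuated by sharp drops.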
Bufferbloat
A buffer is a queue that stores outgoing packets, but when the queue length is large, it can add a delay in sending ACK, which would result in redelivery. Explicit Congestion Notification can be used to address those issues.
11. Game Theory
In this chapter, the authors discuss several problems from game theory such as:
Halting problem
This problem was first posed by Alan Turing who asserted that a computer program can never tell whether another program that it uses would take forever to compute something.
Prisoner’s dilemma
It is based on two prisoners who are caught and have to either cooperate or work against each other. In general, defection is the dominant strategy.
Nash Equilibrium
It is a set of strategies where neither player would change their own play given the opponent’s strategy.
The Tragedy of the Commons
It involves a shared-resource system where an individual can act independently in a selfish manner that is contrary to the common good of all participants, e.g. voluntary environmental laws where companies are not required to obey emission levels.
Information cascade
Information cascade occurs where an individual abandons their own information in favor of other people’s action. One application of this class of problems is auction systems. Here are a few variations of the auction systems:
- Sealed-bid – where bidders are unaware of other bid prices, so they have to predict the prices that other bidders will use.
- Dutch or descending auction – where the price starts high and is slowly lowered until someone accepts it.
- English or ascending auction – where the bidding starts at a low price and is then increased.
- Vickrey auction – it is similar to sealed-bid, but the winner pays the second-place bid. It results in better valuation, as bidders are incentivized to bid based on the true value.
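The Vickrey rule above is short enough to show directly. A minimal sketch, assuming bids arrive as a dictionary from bidder name to sealed bid (the function name and data shape are mine):

```python
def vickrey_auction(bids):
    """Highest bidder wins but pays the second-highest bid.

    bids: dict mapping bidder name to sealed bid amount.
    Returns (winner, price).
    """
    if len(bids) < 2:
        raise ValueError("need at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # second-place bid sets the price
    return winner, price


print(vickrey_auction({"ann": 120, "bob": 100, "cat": 90}))  # ('ann', 100)
```

Because the winner's own bid never sets the price, shading a bid below the true value can only lose the auction, not lower the payment.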
Summary
This book presents several domains of algorithms and encourages computational kindness by applying these algorithms in real-life. For example, we can add constraints or reduce the number of available options when making a decision, which would lower the mental labor.
September 30, 2016
Review of “Simple architecture for complex enterprises”
“Simple architecture for complex enterprises” focuses on tackling complexity in IT systems. There are a number of methodologies such as Zachman, TOGAF and EA, but they don’t address how to manage complexity. The author shares the following concerns when implementing an enterprise architecture:
- Unreliable Enterprise Information – when enterprise cannot access or trust its information
- Untimely Enterprise Information – when reliable information is not available in a timely fashion.
- New Projects Underway – when building a new complex IT project without understanding its relationship to the business processes.
- New Companies Being Acquired
- Enterprise Wants to Spin Off Unit
- Need to Identify Outsourcing Opportunities
- Regulatory Requirements
- Need to Automate Relationships with External Partners
- Need to Automate Relationships with Customers
- Poor Relationship Between IT and Business Units
- Poor Interoperability of IT Systems
- IT Systems Unmanageable – when IT systems are built piecemeal and patched together.
The author defines enterprise architecture as:
“An enterprise architecture is a description of the goals of an organization, how these goals are realized by business processes, and how these business processes can be better served through technology.”
The author asserts the need for planning when building enterprise IT systems, argues that complexity hinders the success of these systems and cites several examples from government and industry. The author describes the Zachman framework for organizing architecture artifacts and design documents. John Zachman proposed six descriptive foci in the framework: data, function, network, people, time and motivation.

The author explains that the Zachman framework does not address the complexity of the systems. Next, the author explains TOGAF (The Open Group Architecture Framework), which has four categories:
- Business architecture – business processes
- Application architecture
- Data architecture
- Technical architecture – hardware / software infrastructure
TOGAF defines ADM (Architecture Development Method) as a recipe for creating architecture. The author considers TOGAF a process rather than a framework, one that can complement Zachman. TOGAF defines the following levels of the enterprise continuum:
- enterprise continuum
- foundation architectures
- common system architectures
- industry architectures
- organization architectures (ADM)
TOGAF defines knowledge bases such as TRM (Technical Reference Model) and SIB (Standards Information Base). The ADM defines the following phases:
- Phase: Prelim – framework and principles
- Phase: A – architecture vision (statement of architecture work, architecture vision)
- Phase: B – business architecture (input from stakeholders to get baseline and business objectives)
- Phase: C – information system architectures (baseline data architecture, review principles, models, data architecture)
- Phase: D – technology architecture – infrastructure
- Phase: E – opportunities and solutions
- Phase: F – migration planning
- Phase: G – implementation governance
- Phase: H – architecture change management and then it goes back to Phase A.
TOGAF also lacks complexity management and the author then explains The Federal Enterprise Architecture (FEA) that includes reference models for business, service, components, technical and data. FEA organizes EA into segments of business functionality and enterprise services. FEA creates five reference models:
- The Business Reference Model (BRM) – business view
- The Component Reference Model (CRM)
- The Technical Reference Model (TRM)
- The Data Reference Model
- The Performance Reference Model
In chapter two, the author explains how complexity affects a system, e.g. a Rubik’s Cube of 2 x 2 x 2 dimensions has 8 interior cubes and 3.7 x 10^6 permutations, but a Rubik’s Cube of 4 x 4 x 4 dimensions has 64 interior cubes and 7.4 x 10^45 permutations. The relative complexity of the 4 x 4 x 4 cube is much higher than that of the 2 x 2 x 2 cube, and the author argues that by partitioning a 4 x 4 x 4 cube into eight 2 x 2 x 2 cubes, you can lower its complexity. The author defines the following five laws of partitions:
- Partitions must be true partitions
- Partition definitions must be appropriate (e.g. organizing clothing store by color may not be helpful to customers)
- Partition subset numbers must be appropriate
- Partition subset sizes must be roughly equal
- Subset interactions must be minimal and well defined
Further, the author suggests simplification to reduce complexity when partitioning, either by removing partition subsets along with their associated items or by removing other items from one or more partition subsets while leaving the subsets themselves in place. The process of partitioning can be done iteratively by choosing one of the partition subsets and simplifying it. The author narrates the story of John Boyd, who came up with the iterative process of observe, orient, plan, act (OOPA) when he was observing how pilots used aircraft in dogfights at the Air Force. He also observed that the faster you iterate on OOPA, the better your chances of winning the dogfight.
In chapter three, the author shows how mathematics can be used for partitioning. He describes the number of system states as the best measure of complexity, and the relative complexity of two systems as the ratio of the number of states in those systems. For example, a system with two variables, each taking six states, can take 6^2 states, i.e.,
C = S^V where C is the complexity, V is the number of variables and S is the number of significant states on average.
In a business process, the number of paths and decision points within each path is the best measure of complexity, i.e.,
O = P^D where D is the number of decision points, P is the number of paths for each decision point and O is the number of outcomes.
The author introduces the concept of homomorphism, where observing one system lets you make predictions about another system, e.g. he treats dice systems, software systems and business processes as homomorphic. A system with two six-sided dice has 36 possible states (P^D or 6^2). However, we can reduce the number of states by dividing the dice into multiple buckets, e.g. two dice split into two buckets of one die each have 12 states instead of 36. The general formula for the number of states of B buckets with D dice per bucket and F faces per die is:
B * F^D
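A few lines of arithmetic show how quickly partitioning pays off. The helper names below are mine, and the twelve-dice case is an extra example I added under the same formulas.

```python
def system_states(states_per_variable, variables):
    """C = S^V: states of an unpartitioned system."""
    return states_per_variable ** variables


def partitioned_states(buckets, dice_per_bucket, faces):
    """B * F^D: states when the dice are split into independent buckets."""
    return buckets * faces ** dice_per_bucket


print(system_states(6, 2))           # 36: two dice considered together
print(partitioned_states(2, 1, 6))   # 12: two buckets of one die each

# Twelve dice together versus four buckets of three dice
print(system_states(6, 12))          # 2176782336
print(partitioned_states(4, 3, 6))   # 864
```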
This chapter describes the concept of equivalence relations, which have the following properties:
- E(a, a) – always true — reflexivity
- E(a, b) implies E(b, a) — symmetry
- E(a, b) and E(b, c) implies E(a, c) — transitivity
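These three properties can be checked by brute force on a small set, and a relation that passes splits the set into equivalence classes, i.e. a true partition. The sketch below uses integers and "same parity" purely as an illustrative relation; the function names are assumptions.

```python
from itertools import product


def is_equivalence(items, related):
    """Check reflexivity, symmetry and transitivity over a finite collection."""
    reflexive = all(related(a, a) for a in items)
    symmetric = all(related(b, a)
                    for a, b in product(items, repeat=2) if related(a, b))
    transitive = all(related(a, c)
                     for a, b, c in product(items, repeat=3)
                     if related(a, b) and related(b, c))
    return reflexive and symmetric and transitive


def equivalence_classes(items, related):
    """Group items so that each item is related to its class representative."""
    classes = []
    for item in items:
        for cls in classes:
            if related(item, cls[0]):
                cls.append(item)
                break
        else:
            classes.append([item])  # no existing class matched
    return classes


numbers = list(range(6))
same_parity = lambda a, b: a % 2 == b % 2
print(is_equivalence(numbers, same_parity))        # True
print(equivalence_classes(numbers, same_parity))   # [[0, 2, 4], [1, 3, 5]]

# "Within one of each other" is reflexive and symmetric but not transitive.
print(is_equivalence(numbers, lambda a, b: abs(a - b) <= 1))  # False
```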
In chapter four, the author explains simple iterative partitions (SIP), which create a diagrammatic overview of the enterprise system that focuses on what the enterprise does (as opposed to how). SIP starts with an autonomous business capability (ABC), which represents an equivalence class, i.e. one of the sets that make up the partition. The ABC model includes a process component and a technology component, along with relationships for implementation and deployment. In addition to implementation and deployment, the author adds an ABC type that is used as a category of ABC, such as human resources. These types can also be defined in a hierarchical fashion, and different implementations of the same ABC type are considered siblings. The implementations can also be composed so that one ABC is part of another ABC. Another type of relationship is the partner relationship at the implementation or deployment level, where one ABC may create an information request, an information broadcast or a work request.
In chapter five, the author explains the SIP process, which has the following goals:
- Complexity control
- Logic-based decisions
- Value-driven deliverables
- Reproducible results
- Verifiable architectures
- Flexible methodology
The SIP process consists of the following six primary phases, grouped into Preliminary, Preparatory and Iteration stages:
- Phase-0 – Evaluation
- Phase-1 – Preparation
- Phase-2 – Partitioning
- Phase-3 – Simplification
- Phase-4 – Prioritization
- Phase-5 – Iteration
Phase-0 (enterprise architecture evaluation) addresses the following issues:
- Unreliable enterprise information
- Untimely enterprise information
- New complex projects underway
- New companies being acquired
- Enterprise wants to spin off unit
- Need to identify outsourcing opportunities
- Regulatory requirements
- Need to automate relationships with external partners
- Need to automate relationships with customers
- Poor relationships between IT and business units
- Poor interoperability of IT systems
- IT systems unmanageable
Phase-1 (SIP preparation) has the following deliverables:
- Audit of organizational readiness
- Training
- Governance model
- SIP blend
- Enterprise-specific tools
Phase-2 (partitioning) decomposes the enterprise into ABC (discrete autonomous business capability) units. Phase-3 (partition simplification) applies the five laws of partitions:
- Partitions must be true partitions
- Partition definitions must be appropriate
- Partition numbers must be appropriate
- Partition sizes must be roughly equal
- Partition interactions must be minimal and well defined.
Phase-4 (ABC prioritization) uses value graph analysis to estimate potential payoff and risk. The value graph analysis addresses the following factors:
- Market drivers
- Cost
- Organizational risk
- Technical risk
- Financial value
- Organizational preparedness
- Team readiness
- Status quo
Phase-5 (ABC iteration) uses an iterative approach to simplify the architecture.
Chapter six describes the NPfIT project as a case study in complexity. NPfIT promised an integrated system connecting every patient, physician, laboratory, pharmacy and healthcare facility in the UK. Its infrastructure provided a new national network, directory services and a care records service (CRS). NPfIT was split into five regional groups of patients, and it allowed appointments at any facility, prescription fulfillment and picture archiving. Despite a huge budget of $9.8 billion, there were several concerns such as failure to communicate, a monolithic approach, stifling of innovation, lack of record confidentiality and the quality of shared data. The SIP approach would have helped, e.g. phase-1 audits organizational readiness, training and partitioning. Phase-2 would have addressed complexity and dropped the multiple regional implementations. Phase-3 would have simplified partitions into subsets such as patient registration, appointment booking, prescriptions, patient records and lab tests.
Chapter seven focuses on guarding technical boundaries. For example, two systems may communicate via RPC, shared databases or a data access layer, but the author suggests service-oriented architecture (SOA) for better interoperability and scalability. The author suggests using a guard or envoy entity to handle messages entering or leaving the system. The chapter defines the following rules to encapsulate the software for a given ABC:
- Autonomy
- Explicit boundaries
- Partitioning of functionality
- Dependencies defined by policy
- Asynchronicity
- Partitioning of data
- No cross-fortress transactions
- Single-point security
- Inside trust
- Keep it simple
Chapter eight summarizes the book, explaining why complexity is the real enemy and how simplicity pays. It reiterates how SIP can simplify architecture by partitioning a system into ABC units.
February 6, 2016
Building a Generic Data Service
As REST-based micro-services have become prevalent, I often find that web and mobile clients have to connect to different services to gather data. You may have to call dozens of services to display data on a single screen or page. Also, you may only need a subset of the data from each service, but you still have to pay for the bandwidth and parsing cost.
I created a new Java framework, PlexDataProviders, for aggregating and querying data from various underlying sources, which can be used to build a general-purpose data service. PlexDataProviders is a lightweight Java framework that abstracts access to various data providers such as databases, files, web services, etc. and allows aggregating data from them.
The PlexDataProviders framework is divided into two components:
- Data Provider – This component defines interfaces that are implemented to access data sources such as database or web services.
- Query Engine – This component is used for querying and aggregating data.
The query engine can determine dependencies between providers, and it also allows you to use the output of one data provider as input to another data provider. For example, let’s assume:
- data-provider A requires input-a1, input-a2 and produces output-a1, output-a2
- data-provider B requires input-b1 and output-a1 and produces output-b1, output-b2
Then you can pass input-a1, input-a2 and input-b1 to the query engine and request the output-a1, output-a2, output-b1 and output-b2 data fields.
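The dependency resolution described above can be sketched as a simple forward-chaining loop. This is only an illustration under stated assumptions, not the PlexDataProviders implementation: `ProviderPlanner`, `Provider` and `plan` are hypothetical names, data fields are modelled as plain strings, and the sketch assumes that input-b1 is supplied together with input-a1 and input-a2.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of provider ordering; not the PlexDataProviders API.
final class ProviderPlanner {
    static final class Provider {
        final String name;
        final Set<String> inputs;
        final Set<String> outputs;

        Provider(String name, List<String> inputs, List<String> outputs) {
            this.name = name;
            this.inputs = new HashSet<>(inputs);
            this.outputs = new HashSet<>(outputs);
        }
    }

    // Returns provider names in an order in which every provider's inputs
    // are already available (supplied by the caller or produced earlier).
    static List<String> plan(List<Provider> providers, Set<String> inputs,
            Set<String> requested) {
        Set<String> available = new HashSet<>(inputs);
        List<Provider> pending = new ArrayList<>(providers);
        List<String> order = new ArrayList<>();
        boolean progressed = true;
        while (!available.containsAll(requested) && progressed) {
            progressed = false;
            for (Provider p : new ArrayList<>(pending)) {
                if (available.containsAll(p.inputs)) {
                    // The provider can run now; its outputs become inputs for others.
                    available.addAll(p.outputs);
                    order.add(p.name);
                    pending.remove(p);
                    progressed = true;
                }
            }
        }
        if (!available.containsAll(requested)) {
            throw new IllegalArgumentException("cannot produce all requested fields");
        }
        return order;
    }
}
```

With providers A and B defined as in the example, the sketch schedules A before B because B needs output-a1. For brevity it runs every provider whose inputs are satisfied; a real engine would presumably skip providers that contribute nothing to the requested fields.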
Benefits
PlexDataProviders offers the following benefits:
- It provides a unified way to search data and abstracts integration to underlying data sources.
- It helps simplify client-side logic, as clients can use a single data service to query all data instead of using multiple data services.
- It also helps with managing end-points, as you only need a single end-point instead of connecting to multiple web services.
- As clients can specify the data they need, this helps with payload size and network bandwidth.
- The clients only need to create a single data parser so it keeps JSON parsing logic simple.
- As PlexDataProviders supports multi-threading, it also helps with latency of the data fetch requests.
- It supports partial failure, so that a failure in a single data provider doesn’t affect other data providers and the data service can still return partial results.
- It supports timeouts, so that clients can receive whatever data is available within a given timeout interval.
Data Structure
Following are the primary data structures:
- MetaField – This class defines meta information for each data field such as name, kind, type, etc.
- MetaFieldType – This enum class defines the supported data types, i.e.
  - SCALAR_TEXT – simple text
  - SCALAR_INTEGER – integer numbers
  - SCALAR_DECIMAL – decimal numbers
  - SCALAR_DATE – dates
  - SCALAR_BOOLEAN – boolean
  - VECTOR_TEXT – array of text
  - VECTOR_INTEGER – array of integers
  - VECTOR_DECIMAL – array of decimals
  - VECTOR_DATE – array of dates
  - VECTOR_BOOLEAN – array of boolean
  - BINARY – binary data
  - ROWSET – nested data rowsets
- Metadata – This class defines a set of MetaFields used in DataRow/DataRowSet
- DataRow – This class abstracts a row of data fields
- DataRowSet – This class abstracts a set of rows
PlexDataProviders also supports nested structures, where a data field in a DataRow can be an instance of DataRowSet.
Adding a Data Provider
The data provider implements the following two interfaces:
[codesyntax lang="java"]
public interface DataProducer {
    void produce(DataRowSet requestFields, DataRowSet responseFields,
            QueryConfiguration config) throws DataProviderException;
}
[/codesyntax]
Note that QueryConfiguration defines additional parameters such as:
- pagination parameters
- ordering/grouping
- filtering parameters
- timeout parameters
The timeout parameter can be used to return all available data within a defined time, e.g. the query engine may invoke the underlying data providers in multiple threads, and if an underlying query takes too long, the engine returns whatever data is available.
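The timeout and partial-failure behaviour can be sketched with a plain `ExecutorService`. This is an assumption about how such an engine might work, not PlexDataProviders code: `PartialResults` and `fetch` are hypothetical names, and each provider is reduced to a `Callable<String>`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: collect whatever finishes within the timeout.
final class PartialResults {
    final Map<String, String> data = new LinkedHashMap<>();
    final Map<String, String> errorsByProviderName = new LinkedHashMap<>();

    static PartialResults fetch(Map<String, Callable<String>> providers,
            long timeoutMillis) {
        List<String> names = new ArrayList<>(providers.keySet());
        List<Callable<String>> tasks = new ArrayList<>();
        for (String name : names) {
            tasks.add(providers.get(name));
        }
        PartialResults result = new PartialResults();
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            // invokeAll waits up to the timeout and cancels unfinished tasks;
            // the futures come back in the same order as the tasks.
            List<Future<String>> futures =
                    pool.invokeAll(tasks, timeoutMillis, TimeUnit.MILLISECONDS);
            for (int i = 0; i < names.size(); i++) {
                try {
                    result.data.put(names.get(i), futures.get(i).get());
                } catch (CancellationException e) {
                    result.errorsByProviderName.put(names.get(i), "timed out");
                } catch (ExecutionException e) {
                    // One failing provider does not affect the others.
                    result.errorsByProviderName.put(names.get(i),
                            String.valueOf(e.getCause()));
                }
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and return what was collected so far.
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return result;
    }
}
```

The design choice here is to report slow or failing providers in a separate error map instead of failing the whole query, which mirrors the partial-results idea described above.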
[codesyntax lang="java"]
public interface DataProvider extends DataProducer, Comparable<DataProvider> {
    String getName();
    int getRank();
    Metadata getMandatoryRequestMetadata();
    Metadata getOptionalRequestMetadata();
    Metadata getResponseMetadata();
    TaskGranularity getTaskGranularity();
}
[/codesyntax]
Each provider defines a name, a rank (or priority when matching for the best provider) and sets of mandatory/optional input and output data fields. The data provider can also define its granularity as coarse-grained or fine-grained, and the implementation may execute those providers on different threads.
PlexDataProviders also provides interfaces for converting data from domain objects to DataRowSet. Here is an example of a provider implementation:
[codesyntax lang="java"]
public class SecuritiesBySymbolsProvider extends BaseProvider {
    private static Metadata parameterMeta = Metadata.from(SharedMeta.symbol);
    private static Metadata optionalMeta = Metadata.from();
    private static SecurityMarshaller marshaller = new SecurityMarshaller();
    public SecuritiesBySymbolsProvider() {
        super("SecuritiesBySymbolsProvider", parameterMeta, optionalMeta,
                marshaller.getMetadata());
    }
    @Override
    public void produce(DataRowSet parameter, DataRowSet response,
            QueryConfiguration config) throws DataProviderException {
        final String id = parameter.getValueAsText(SharedMeta.symbol, 0);
        Map<String, Object> criteria = new HashMap<>();
        criteria.put("symbol", id.toUpperCase());
        Collection<Security> securities = DaoLocator.securityDao.query(criteria);
        DataRowSet rowset = marshaller.marshal(securities);
        addRowSet(response, rowset, 0);
    }
}
[/codesyntax]
Typically, you will create a data provider for each kind of query that you want to support. Each data provider specifies a set of required and optional data fields that can be used to generate output data fields.
Here is an example of marshalling data from Security domain objects to DataRowSet:
[codesyntax lang="java"]
public DataRowSet marshal(Security security) {
    DataRowSet rowset = new DataRowSet(responseMeta);
    marshal(rowset, security, 0);
    return rowset;
}
public DataRowSet marshal(Collection<Security> securities) {
    DataRowSet rowset = new DataRowSet(responseMeta);
    for (Security security : securities) {
        marshal(rowset, security, rowset.size());
    }
    return rowset;
}
...
[/codesyntax]
PlexDataProviders provides the DataProviderLocator interface for registering and looking up providers, e.g.
[codesyntax lang="java"]
public interface DataProviderLocator {
    void register(DataProvider provider);
    Collection<DataProvider> locate(Metadata requestFields, Metadata responseFields);
    ...
}
[/codesyntax]
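To make the lookup idea concrete, here is a minimal sketch of how a locator might match providers against request and response fields and order them by rank. It is not the DataProviderLocatorImpl that ships with the framework: `SimpleLocator` and its nested `Provider` are hypothetical, fields are plain strings, and treating a lower rank as higher priority is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of provider lookup; not the framework's locator.
final class SimpleLocator {
    static final class Provider {
        final String name;
        final int rank;
        final Set<String> mandatoryInputs;
        final Set<String> outputs;

        Provider(String name, int rank, List<String> mandatoryInputs,
                List<String> outputs) {
            this.name = name;
            this.rank = rank;
            this.mandatoryInputs = new HashSet<>(mandatoryInputs);
            this.outputs = new HashSet<>(outputs);
        }
    }

    private final List<Provider> providers = new ArrayList<>();

    void register(Provider provider) {
        providers.add(provider);
    }

    // Keeps providers whose mandatory inputs are all present and that
    // produce at least one requested field, then sorts them by rank.
    List<Provider> locate(Set<String> requestFields, Set<String> responseFields) {
        List<Provider> matches = new ArrayList<>();
        for (Provider p : providers) {
            boolean inputsSatisfied = requestFields.containsAll(p.mandatoryInputs);
            boolean producesRequested = !Collections.disjoint(p.outputs, responseFields);
            if (inputsSatisfied && producesRequested) {
                matches.add(p);
            }
        }
        // Assumption: a lower rank means a higher priority.
        matches.sort(Comparator.comparingInt((Provider p) -> p.rank));
        return matches;
    }
}
```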
PlexDataProviders comes with a small application that provides data services by implementing various data providers. It uses the PlexService framework for defining the service, e.g.
[codesyntax lang="java"]
@WebService
@Path("/data")
public class DataServiceImpl implements DataService {
    private DataProviderLocator dataProviderLocator = new DataProviderLocatorImpl();
    private QueryEngine queryEngine = new QueryEngineImpl(dataProviderLocator);
    public DataServiceImpl() {
        dataProviderLocator.register(new AccountsByIdsProvider());
        dataProviderLocator.register(new AccountsByUseridProvider());
        dataProviderLocator.register(new CompaniesBySymbolsProvider());
        dataProviderLocator.register(new OrdersByAccountIdsProvider());
        dataProviderLocator.register(new PositionGroupsBySymbolsProvider());
        dataProviderLocator.register(new PositionsBySymbolsProvider());
        dataProviderLocator.register(new QuotesBySymbolsProvider());
        dataProviderLocator.register(new SecuritiesBySymbolsProvider());
        dataProviderLocator.register(new UsersByIdsProvider());
        dataProviderLocator.register(new WatchlistByUserProvider());
        dataProviderLocator.register(new SymbolsProvider());
        dataProviderLocator.register(new UsersProvider());
        dataProviderLocator.register(new SymbolSearchProvider());
    }
    @Override
    @GET
    public DataResponse query(Request webRequest) {
        final DataRequest dataRequest = DataRequest.from(webRequest.getProperties());
        return queryEngine.query(dataRequest);
    }
}
[/codesyntax]
As you can see, the data service simply builds a DataRequest from the input data fields and sends the response back to clients.
Here is an example client that passes a search query data field and requests quote data fields with company details:
[codesyntax lang="java"]
public void testGetQuoteBySearch() throws Throwable {
    String jsonResp = TestWebUtils.httpGet("http://localhost:" + DEFAULT_PORT
            + "/data?responseFields=exchange,symbol,quote.bidPrice,quote.askPrice,quote.sales,company.name&symbolQuery=AAPL");
    ...
[/codesyntax]
Note that the above request uses three data providers. First, it uses SymbolSearchProvider to search for symbols matching the given query. It then uses the symbol data field to request company and quote data fields from QuotesBySymbolsProvider and CompaniesBySymbolsProvider. The PlexDataProviders framework takes care of all dependency management between providers.
Here is an example JSON response from the data service:
[codesyntax lang="javascript"]
{
"queryResponse": {
"fields": [
[{
"symbol": "AAPL_X"
}, {
"quote.sales": [
[{
"symbol": "AAPL_X"
}, {
"timeOfSale.volume": 56
}, {
"timeOfSale.exchange": "DOW"
}, {
"timeOfSale.date": 1455426008762
}, {
"timeOfSale.price": 69.49132317180353
}],
[{
"symbol": "AAPL_X"
}, {
"timeOfSale.volume": 54
}, {
"timeOfSale.exchange": "NYSE"
}, {
"timeOfSale.date": 1455426008762
}, {
"timeOfSale.price": 16.677774132458076
}],
[{
"symbol": "AAPL_X"
}, {
"timeOfSale.volume": 99
}, {
"timeOfSale.exchange": "NASDAQ"
}, {
"timeOfSale.date": 1455426008762
}, {
"timeOfSale.price": 42.17891320885568
}],
[{
"symbol": "AAPL_X"
}, {
"timeOfSale.volume": 49
}, {
"timeOfSale.exchange": "DOW"
}, {
"timeOfSale.date": 1455426008762
}, {
"timeOfSale.price": 69.61680149649729
}],
[{
"symbol": "AAPL_X"
}, {
"timeOfSale.volume": 69
}, {
"timeOfSale.exchange": "NYSE"
}, {
"timeOfSale.date": 1455426008762
}, {
"timeOfSale.price": 25.353316897552833
}]
]
}, {
"quote.askPrice": 54.99300665695502
}, {
"quote.bidPrice": 26.935682182171643
}, {
"exchange": "DOW"
}, {
"company.name": "AAPL - name"
}],
[{
"symbol": "AAPL"
}, {
"exchange": "NASDAQ"
}]
],
"errorsByProviderName": {},
"providers": ["QuotesBySymbolsProvider", "SymbolSearchProvider", "CompaniesBySymbolsProvider"]
}
}
[/codesyntax]
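In the response above, each row is a list of single-key objects and a nested rowset such as quote.sales is a list of such rows. A client could flatten that shape into ordinary maps along these lines. This is a hedged sketch: `ResponseRows` is a hypothetical helper, and it assumes the JSON has already been parsed into `List` and `Map` objects by whatever JSON library the client uses.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical client-side helper; operates on already-parsed JSON.
final class ResponseRows {
    // Merges the single-key objects of one row into one map. A value that is
    // itself a list of rows (a nested rowset) is flattened recursively.
    static Map<String, Object> flattenRow(List<?> row) {
        Map<String, Object> flat = new LinkedHashMap<>();
        for (Object cell : row) {
            for (Map.Entry<?, ?> e : ((Map<?, ?>) cell).entrySet()) {
                Object value = e.getValue();
                if (isRowSet(value)) {
                    value = flattenRows((List<?>) value);
                }
                flat.put(String.valueOf(e.getKey()), value);
            }
        }
        return flat;
    }

    static List<Map<String, Object>> flattenRows(List<?> rows) {
        List<Map<String, Object>> out = new ArrayList<>();
        for (Object row : rows) {
            out.add(flattenRow((List<?>) row));
        }
        return out;
    }

    // Treats a non-empty list whose first element is a list as a nested
    // rowset, so plain arrays of text or numbers are left untouched.
    private static boolean isRowSet(Object value) {
        return value instanceof List && !((List<?>) value).isEmpty()
                && ((List<?>) value).get(0) instanceof List;
    }
}
```

Passing the parsed "fields" array to `flattenRows` would yield one map per row, with quote.sales holding a list of maps for the individual sales.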
PlexDataProviders is available from GitHub and is licensed under the liberal MIT license. It also comes with a small sample application for demo purposes. Feel free to send me your suggestions.
August 17, 2014
PlexService Overview – a Micro-service framework for defining HTTP/Websockets and JMS based Services
I recently created a new framework, PlexService, for serving micro-services, which can be accessed via HTTP, Websockets or JMS interfaces. You can choose among these access mechanisms based on the needs of your services. For example, as JMS services are inherently asynchronous, they provide a good foundation for building scalable and reactive services. You may choose the HTTP stack for implementing REST services or choose Websockets for implementing interactive services.
The PlexService framework provides basic support for encoding POJO objects into JSON for service consumption. Developers define service configuration via annotations that specify gateway types, encoding scheme, end-points, etc.
PlexService supports role-based security, where you can specify the list of roles that can access each service. The service providers implement how to verify roles, which are then enforced by the PlexService framework.
If you implement all services in JMS, you can easily expose them via HTTP or Websockets by configuring the web-to-JMS bridge. The bridge routes all requests from HTTP/Websockets to JMS and listens for incoming messages, which are then routed back to web clients.
PlexService provides basic metrics such as latency, invocations, errors, etc., which are exposed via a JMX interface. PlexService uses Jetty for serving web services. Developers provide JMS containers at runtime if required.
Building/Installing
Check out the code using
git clone git@github.com:bhatti/PlexService.git
Compile and build the jar file using
./gradlew jar
Copy the jar file and add it to your application manually.
Defining role-based security
PlexService allows developers to define role-based security, which is invoked when accessing services, e.g.
public class BuggerRoleAuthorizer implements RoleAuthorizer {
private final UserRepository userRepository;
public BuggerRoleAuthorizer(UserRepository userRepository) {
this.userRepository = userRepository;
}
@Override
public void authorize(Request request, String[] roles) throws AuthException {
String sessionId = request.getSessionId();
User user = userRepository.getUserBySessionId(sessionId);
if (user == null) {
throw new AuthException(Constants.SC_UNAUTHORIZED,
request.getSessionId(), request.getRemoteAddress(),
"failed to validate session-id");
}
for (String role : roles) {
if (!user.getRoles().contains(role)) {
throw new AuthException(Constants.SC_UNAUTHORIZED,
request.getSessionId(), request.getRemoteAddress(),
"failed to match role");
}
}
}
}
Typically, the login service will store the session-id, which is then passed to the implementation of RoleAuthorizer, e.g.
@ServiceConfig(gateway = GatewayType.HTTP, requestClass = Void.class, endpoint = "/login", method = Method.POST, codec = CodecType.JSON)
public class LoginService extends AbstractUserService implements RequestHandler {
public LoginService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
String username = request.getStringProperty("username");
String password = request.getStringProperty("password");
User user = userRepository.authenticate(username, password);
AbstractResponseBuilder responseBuilder = request.getResponseBuilder();
if (user == null) {
throw new AuthException(Constants.SC_UNAUTHORIZED,
request.getSessionId(), request.getRemoteAddress(),
"failed to authenticate");
} else {
responseBuilder.addSessionId(userRepository.getSessionId(user));
responseBuilder.send(user);
}
}
}
In the above example, the session-id is added to the response upon successful login and is then passed with future requests. For HTTP services, you may use cookies to store session-ids; otherwise you need to pass the session-id as a parameter.
Here is how you can invoke the login service from curl:
curl --cookie-jar cookies.txt -v -k -H "Content-Type: application/json" -X POST "http://127.0.0.1:8181/login?username=erica&password=pass"
which would return:
Content-Type: application/json
Set-Cookie: PlexSessionID=5 Expires: Thu, 01 Jan 1970 00:00:00 GMT
{"id":5,"username":"erica","email":"erica@plexobject.com","roles":["Employee"]}
Defining Services
Defining a REST service for creating a user
Here is how you can define a REST service:
@ServiceConfig(gateway = GatewayType.HTTP, requestClass = User.class,
rolesAllowed = "Administrator", endpoint = "/users", method = Method.POST,
codec = CodecType.JSON)
public class CreateUserService extends AbstractUserService implements
RequestHandler {
public CreateUserService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
User user = request.getPayload();
user.validate();
User saved = userRepository.save(user);
request.getResponseBuilder().send(saved);
}
}
The ServiceConfig annotation defines that this service can be accessed via HTTP at the “/users” URI. PlexService will decode the JSON payload into a User object and will ensure that the service can only be accessed by a user who has the Administrator role.
Here is how you can invoke this service from curl:
curl --cookie cookies.txt -k -H "Content-Type: application/json" -X POST "http://127.0.0.1:8181/users" -d "{\"username\":\"david\",\"password\":\"pass\",\"email\":\"david@plexobject.com\",\"roles\":[\"Employee\"]}"
Defining a Web service over Websockets for creating a user
Here is how you can define a Websocket-based service:
@ServiceConfig(gateway = GatewayType.WEBSOCKET, requestClass = User.class,
rolesAllowed = "Administrator", endpoint = "/users", method = Method.POST,
codec = CodecType.JSON)
public class CreateUserService extends AbstractUserService implements
RequestHandler {
public CreateUserService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
User user = request.getPayload();
user.validate();
User saved = userRepository.save(user);
request.getResponseBuilder().send(saved);
}
}
The ServiceConfig annotation defines that this service can be accessed via Websocket at the “/users” endpoint. However, as opposed to an HTTP-based service, this endpoint is not enforced in the HTTP request and can be in any format as long as it’s unique for a service.
Here is how you can access websocket service from javascript:
var ws = new WebSocket("ws://127.0.0.1:8181/users");
ws.onopen = function() {
var req = {"payload":"", "endpoint":"/login", "method":"POST", "username":"scott", "password":"pass"};
ws.send(JSON.stringify(req));
};
ws.onmessage = function (evt) {
alert("Message: " + evt.data);
};
ws.onclose = function() {
};
ws.onerror = function(err) {
};
Note that websockets are not supported by all browsers, and the above code will work only in supported browsers such as IE 11+, FF 31+, Chrome 36+, etc.
Defining a JMS service for creating a user
Here is how you can create a JMS service:
@ServiceConfig(gateway = GatewayType.JMS, requestClass = User.class,
rolesAllowed = "Administrator", endpoint = "queue:{scope}-create-user-service-queue",
method = Method.MESSAGE,
codec = CodecType.JSON)
public class CreateUserService extends AbstractUserService implements RequestHandler {
public CreateUserService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
User user = request.getPayload();
user.validate();
User saved = userRepository.save(user);
request.getResponseBuilder().send(saved);
}
}
Note that the only difference is the type of gateway (along with its endpoint and method). PlexService also supports variables in end-points, which are populated from configuration. For example, you may create a scope variable to create different queues/topics for different developers/environments. PlexService will serialize POJO classes into JSON when delivering messages over JMS.
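Populating an end-point such as `queue:{scope}-create-user-service-queue` from configuration can be illustrated with a small substitution routine. This is a sketch only: `EndpointTemplate` and `expand` are hypothetical names, the configuration is modelled as a plain map, and PlexService's own substitution logic may differ.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of {variable} substitution in end-point names.
final class EndpointTemplate {
    private static final Pattern VARIABLE = Pattern.compile("\\{(\\w+)\\}");

    // Replaces every {name} in the end-point with the configured value and
    // fails fast when a variable has no value.
    static String expand(String endpoint, Map<String, String> config) {
        Matcher m = VARIABLE.matcher(endpoint);
        StringBuffer sb = new StringBuffer();
        while (m.find()) {
            String value = config.get(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("undefined variable: " + m.group(1));
            }
            // quoteReplacement guards against '$' or '\' in configured values.
            m.appendReplacement(sb, Matcher.quoteReplacement(value));
        }
        m.appendTail(sb);
        return sb.toString();
    }
}
```

With a configuration that maps scope to dev, the end-point above expands to `queue:dev-create-user-service-queue`, so each developer or environment gets its own queue.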
Defining a REST service with parameterized URLs
PlexService allows developers to define URIs for services that contain variables. These variables are then populated from actual requests. These can be used for implementing REST services, e.g.
@ServiceConfig(gateway = GatewayType.HTTP, requestClass = BugReport.class,
rolesAllowed = "Employee", endpoint = "/projects/{projectId}/bugreports",
method = Method.POST,
codec = CodecType.JSON)
public class CreateBugReportService extends AbstractBugReportService implements RequestHandler {
public CreateBugReportService(BugReportRepository bugReportRepository,
UserRepository userRepository) {
super(bugReportRepository, userRepository);
}
@Override
public void handle(Request request) {
BugReport report = request.getPayload();
report.validate();
BugReport saved = bugReportRepository.save(report);
request.getResponseBuilder().send(saved);
}
}
Here is an example of invoking this service from curl:
curl --cookie cookies.txt -k -H "Content-Type: application/json" -X POST "http://127.0.0.1:8181/projects/2/bugreports" -d "{\"title\":\"As an administrator, I would like to assign roles to users so that they can perform required actions.\",\"description\":\"As an administrator, I would like to assign roles to users so that they can perform required actions.\",\"bugNumber\":\"story-201\",\"assignedTo\":\"mike\",\"developedBy\":\"mike\"}"
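Matching a request path against a parameterized end-point such as `/projects/{projectId}/bugreports` can be sketched as follows. This only illustrates the general technique: `PathTemplate` is a hypothetical class, and it is an assumption that each variable matches exactly one path segment.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of URI-template matching; not PlexService code.
final class PathTemplate {
    private final List<String> names = new ArrayList<>();
    private final Pattern regex;

    PathTemplate(String template) {
        StringBuilder sb = new StringBuilder();
        for (String segment : template.split("/")) {
            if (segment.isEmpty()) {
                continue;
            }
            sb.append('/');
            if (segment.startsWith("{") && segment.endsWith("}")) {
                // A variable matches exactly one path segment.
                names.add(segment.substring(1, segment.length() - 1));
                sb.append("([^/]+)");
            } else {
                sb.append(Pattern.quote(segment));
            }
        }
        regex = Pattern.compile(sb.toString());
    }

    // Returns the extracted variables, or null if the path does not match.
    Map<String, String> match(String path) {
        Matcher m = regex.matcher(path);
        if (!m.matches()) {
            return null;
        }
        Map<String, String> vars = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            vars.put(names.get(i), m.group(i + 1));
        }
        return vars;
    }
}
```

For the curl request above, matching `/projects/2/bugreports` against the template yields projectId = 2, which a handler could then read as a request property.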
Using variables with Websocket based service
You can also create variables for websocket endpoints, similar to JMS, which are initialized from parameters.
@ServiceConfig(gateway = GatewayType.WEBSOCKET, requestClass = BugReport.class,
rolesAllowed = "Employee", endpoint = "{variable}-create-bugreport-service-channel",
method = Method.MESSAGE, codec = CodecType.JSON)
public class CreateBugReportService extends AbstractBugReportService implements
RequestHandler {
public CreateBugReportService(BugReportRepository bugReportRepository,
UserRepository userRepository) {
super(bugReportRepository, userRepository);
}
@Override
public void handle(Request request) {
BugReport report = request.getPayload();
report.validate();
BugReport saved = bugReportRepository.save(report);
request.getResponseBuilder().send(saved);
}
}
Here is another example of consuming a websocket-based service from javascript:
var ws = new WebSocket("ws://127.0.0.1:8181/users");
ws.onopen = function() {
var req = {"payload":{"title":"my title", "description":"my description","bugNumber":"story-201", "assignedTo":"mike", "developedBy":"mike"},"PlexSessionID":"4", "endpoint":"/projects/2/bugreports/2/assign", "method":"POST"};
ws.send(JSON.stringify(req));
};
ws.onmessage = function (evt) {
alert("Message: " + evt.data);
};
ws.onclose = function() {
};
ws.onerror = function(err) {
};
Defining a REST service for querying users
Here is an example REST service, which uses GET request to query users:
@ServiceConfig(gateway = GatewayType.HTTP, requestClass = User.class,
rolesAllowed = "Administrator", endpoint = "/users", method = Method.GET,
codec = CodecType.JSON)
public class QueryUserService extends AbstractUserService implements
RequestHandler {
public QueryUserService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
Collection<User> users = userRepository.getAll(new Predicate<User>() {
@Override
public boolean accept(User u) {
return true;
}
});
request.getResponseBuilder().send(users);
}
}
Here is how you can invoke this service from curl:
curl --cookie cookies.txt -k -H "Content-Type: application/json" "http://127.0.0.1:8181/users"
which would return a JSON array such as:
[{"id":2,"username":"alex","email":"alex@plexobject.com","roles":["Employee"]},{"id":3,"username":"jeff","email":"jeff@plexobject.com","roles":["Employee","Manager"]},{"id":4,"username":"scott","email":"scott@plexobject.com","roles":["Employee","Administrator","Manager"]},{"id":5,"username":"erica","email":"erica@plexobject.com","roles":["Employee"]}]
Defining a JMS service for querying users
Here is an example of defining a service for querying users via JMS:
@ServiceConfig(gateway = GatewayType.JMS, requestClass = User.class,
rolesAllowed = "Administrator", endpoint = "queue:{scope}-query-user-service-queue",
method = Method.MESSAGE,
codec = CodecType.JSON)
public class QueryUserService extends AbstractUserService implements RequestHandler {
public QueryUserService(UserRepository userRepository) {
super(userRepository);
}
@Override
public void handle(Request request) {
Collection<User> users = userRepository.getAll(new Predicate<User>() {
@Override
public boolean accept(User u) {
return true;
}
});
request.getResponseBuilder().send(users);
}
}
The end-point can contain variables such as scope that are initialized from configuration.
Registering services and starting service container
You will need to register services with ServiceRegistry at runtime, which would initialize and start those services, e.g.
Collection<RequestHandler> services = new HashSet<>();
services.add(new CreateUserService(userRepository));
services.add(new UpdateUserService(userRepository));
services.add(new QueryUserService(userRepository));
services.add(new DeleteUserService(userRepository));
services.add(new LoginService(userRepository));
services.add(new CreateProjectService(projectRepository, userRepository));
services.add(new UpdateProjectService(projectRepository, userRepository));
services.add(new QueryProjectService(projectRepository, userRepository));
services.add(new AddProjectMemberService(projectRepository, userRepository));
services.add(new RemoveProjectMemberService(projectRepository, userRepository));
services.add(new CreateBugReportService(bugreportRepository, userRepository));
services.add(new UpdateBugReportService(bugreportRepository, userRepository));
services.add(new QueryBugReportService(bugreportRepository, userRepository));
services.add(new QueryProjectBugReportService(bugreportRepository, userRepository));
services.add(new AssignBugReportService(bugreportRepository, userRepository));
serviceRegistry = new ServiceRegistry(config, services, new BuggerRoleAuthorizer(userRepository));
serviceRegistry.start();
Creating Http to JMS bridge
You may choose to write all services as JMS services and then expose them via HTTP using the bridge provided by PlexService, e.g.
final String mappingJson = IOUtils.toString(new FileInputStream( args[1]));
Collection<HttpToJmsEntry> entries = new JsonObjectCodec().decode(
mappingJson, new TypeReference<List<HttpToJmsEntry>>() {
});
WebToJmsBridge bridge = new WebToJmsBridge(new Configuration(args[0]), entries, GatewayType.HTTP);
bridge.startBridge();
Creating Websocket to JMS bridge
Similarly, you may expose JMS services via websockets based transport using the bridge:
final String mappingJson = IOUtils.toString(new FileInputStream( args[1]));
Collection<HttpToJmsEntry> entries = new JsonObjectCodec().decode(
mappingJson, new TypeReference<List<HttpToJmsEntry>>() {
});
WebToJmsBridge bridge = new WebToJmsBridge(new Configuration(args[0]), entries, GatewayType.WEBSOCKET);
bridge.startBridge();
Here is the JSON configuration for the bridge:
[
{"codecType":"JSON","path":"/projects/{projectId}/bugreports/{id}/assign","method":"POST",
"destination":"queue:{scope}-assign-bugreport-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{projectId}/bugreports","method":"GET",
"destination":"queue:{scope}-query-project-bugreport-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/users","method":"GET",
"destination":"queue:{scope}-query-user-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects","method":"GET",
"destination":"queue:{scope}-query-projects-service","timeoutSecs":30},
{"codecType":"JSON","path":"/bugreports","method":"GET",
"destination":"queue:{scope}-bugreports-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{id}/membership/add","method":"POST",
"destination":"queue:{scope}-add-project-member-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{id}/membership/remove","method":"POST",
"destination":"queue:{scope}-remove-project-member-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{projectId}/bugreports","method":"POST",
"destination":"queue:{scope}-create-bugreport-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/users","method":"POST",
"destination":"queue:{scope}-create-user-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects","method":"POST",
"destination":"queue:{scope}-create-projects-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/users/{id}","method":"POST",
"destination":"queue:{scope}-update-user-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/users/{id}/delete","method":"POST",
"destination":"queue:{scope}-delete-user-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{id}","method":"POST",
"destination":"queue:{scope}-update-project-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/projects/{projectId}/bugreports/{id}","method":"POST",
"destination":"queue:{scope}-update-bugreport-service-queue","timeoutSecs":30},
{"codecType":"JSON","path":"/login","method":"POST",
"destination":"queue:{scope}-login-service-queue","timeoutSecs":30}]
Defining a Streaming Quotes Service over Websockets
Suppose you are building a high-performance streaming quote service for providing real-time stock quotes. You can easily build it using the PlexService framework, e.g.
@ServiceConfig(gateway = GatewayType.WEBSOCKET, requestClass = Void.class, endpoint = "/quotes", method = Method.MESSAGE, codec = CodecType.JSON)
public class QuoteServer implements RequestHandler {
    public enum Action {
        SUBSCRIBE, UNSUBSCRIBE
    }
    static final Logger log = LoggerFactory.getLogger(QuoteServer.class);
    private QuoteStreamer quoteStreamer = new QuoteStreamer();
    @Override
    public void handle(Request request) {
        String symbol = request.getProperty("symbol");
        String actionVal = request.getProperty("action");
        log.info("Received " + request);
        // reject requests that are missing the symbol or the action
        ValidationException
                .builder()
                .assertNonNull(symbol, "undefined_symbol", "symbol",
                        "symbol not specified")
                .assertNonNull(actionVal, "undefined_action", "action",
                        "action not specified").end();
        Action action = Action.valueOf(actionVal.toUpperCase());
        if (action == Action.SUBSCRIBE) {
            quoteStreamer.add(symbol, request.getResponseBuilder());
        } else {
            quoteStreamer.remove(symbol, request.getResponseBuilder());
        }
    }
    public static void main(String[] args) throws Exception {
        Configuration config = new Configuration(args[0]);
        QuoteServer service = new QuoteServer();
        Collection<RequestHandler> services = new ArrayList<>();
        // register the instance created above instead of constructing a
        // second QuoteServer (and with it a second QuoteStreamer timer)
        services.add(service);
        //
        ServiceRegistry serviceRegistry = new ServiceRegistry(config, services, null);
        serviceRegistry.start();
        Thread.currentThread().join();
    }
}
The above example defines a service that listens on websockets and responds to subscribe or unsubscribe requests from web clients.
You can define a mock QuoteStreamer as follows, which periodically sends quotes to all subscribers:
public class QuoteStreamer extends TimerTask {
private int delay = 1000;
private Map<String, Collection<ResponseDispatcher>> subscribers = new ConcurrentHashMap<>();
private QuoteCache quoteCache = new QuoteCache();
private final Timer timer = new Timer(true);
public QuoteStreamer() {
timer.schedule(this, delay, delay);
}
public void add(String symbol, ResponseDispatcher dispatcher) {
symbol = symbol.toUpperCase();
synchronized (symbol.intern()) {
Collection<ResponseDispatcher> dispatchers = subscribers
.get(symbol);
if (dispatchers == null) {
dispatchers = new HashSet<ResponseDispatcher>();
subscribers.put(symbol, dispatchers);
}
dispatchers.add(dispatcher);
}
}
public void remove(String symbol, ResponseDispatcher dispatcher) {
symbol = symbol.toUpperCase();
synchronized (symbol.intern()) {
Collection<ResponseDispatcher> dispatchers = subscribers
.get(symbol);
if (dispatchers != null) {
dispatchers.remove(dispatcher);
}
}
}
@Override
public void run() {
for (Map.Entry<String, Collection<ResponseDispatcher>> e : subscribers
.entrySet()) {
Quote q = quoteCache.getLatestQuote(e.getKey());
Collection<ResponseDispatcher> dispatchers = new ArrayList<>(
e.getValue());
for (ResponseDispatcher d : dispatchers) {
try {
d.send(q);
} catch (Exception ex) {
remove(e.getKey(), d);
}
}
}
}
}
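The add/remove bookkeeping in QuoteStreamer can also be written without synchronizing on interned strings. The following is only a sketch of that alternative and not the PlexService code: SubscriberRegistry and its type parameter D are hypothetical names, and it relies solely on ConcurrentHashMap.computeIfAbsent and ConcurrentHashMap.newKeySet from the JDK.

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: maps an upper-cased symbol to a thread-safe set of subscribers.
class SubscriberRegistry<D> {
    private final Map<String, Set<D>> subscribers = new ConcurrentHashMap<>();

    // Registers a subscriber, creating the per-symbol set atomically on first use.
    public void add(String symbol, D dispatcher) {
        subscribers
            .computeIfAbsent(symbol.toUpperCase(Locale.ROOT), k -> ConcurrentHashMap.newKeySet())
            .add(dispatcher);
    }

    // Removes a subscriber; unknown symbols are ignored.
    public void remove(String symbol, D dispatcher) {
        Set<D> set = subscribers.get(symbol.toUpperCase(Locale.ROOT));
        if (set != null) {
            set.remove(dispatcher);
        }
    }

    // Returns a read-only view of the current subscribers for a symbol.
    public Set<D> get(String symbol) {
        Set<D> set = subscribers.get(symbol.toUpperCase(Locale.ROOT));
        return set == null ? Collections.emptySet() : Collections.unmodifiableSet(set);
    }

    public static void main(String[] args) {
        SubscriberRegistry<String> registry = new SubscriberRegistry<>();
        registry.add("aapl", "client-1");
        registry.add("AAPL", "client-2");
        registry.remove("aapl", "client-1");
        System.out.println(registry.get("AAPL")); // [client-2]
    }
}
```

Because the per-symbol sets are concurrent, a timer thread could iterate over them while clients subscribe, without copying the collection first.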
Here is a sample javascript/html client, which allows users to subscribe to different stock symbols:
var ws = new WebSocket("ws://127.0.0.1:8181/quotes");
ws.onopen = function() {
};
var lasts = {};
ws.onmessage = function (evt) {
//console.log(evt.data);
var quote = JSON.parse(evt.data).payload;
var d = new Date(quote.timestamp);
$('#time').text(d.toString());
$('#company').text(quote.company);
$('#last').text(quote.last.toFixed(2));
var prev = lasts[quote.company];
if (prev != undefined) {
var change = quote.last - prev;
if (change >= 0) {
$('#change').css({'background-color':'green'});
} else {
$('#change').css({'background-color':'red'});
}
$('#change').text(change.toFixed(2));
} else {
$('#change').text('N/A');
}
lasts[quote.company] = quote.last;
};
ws.onclose = function() {
};
ws.onerror = function(err) {
};
function send(payload) {
$('#input').text(payload);
ws.send(payload);
}
$(document).ready(function() {
$("#subscribe").click(function() {
var symbol = $("#symbol").val();
var req = {"endpoint":"/quotes", "symbol":symbol, "action":"subscribe"};
send(JSON.stringify(req));
});
});
$(document).ready(function() {
$("#unsubscribe").click(function() {
var symbol = $("#symbol").val();
var req = {"endpoint":"/quotes", "symbol":symbol, "action":"unsubscribe"};
send(JSON.stringify(req));
});
});
<script>
<body>
<form>
Symbol:<input type="text" id="symbol" value="AAPL" size="4" />
<input type="button" id="subscribe" value="Subscribe"/>
<input type="button" id="unsubscribe" value="Unsubscribe"/>
</form>
<br>
<table id="quotes" class="quote" width="600" border="2" cellpadding="0" cellspacing="3">
<thead>
<tr>
<th>Time</th>
<th>Company</th>
<th>Last</th>
<th>Change</th>
</tr>
</thead>
<tbody>
<tr>
<td id="time"></td>
<td id="company"></td>
<td id="last"></td>
<td id="change"></td>
</tr>
</tbody>
</table>
</body>
PlexService includes this sample code; you can start the streaming quote server by running the “quote.sh” command and then opening the quote.html file in your browser.
Using JMX
PlexService uses JMX to expose key metrics and lifecycle methods to start or stop services. You can use jconsole to access the JMX controls, e.g.
jconsole localhost:9191


Summary
PlexService comes with a full-fledged sample application under the plexsvc-sample folder, and you can browse the JavaDocs to view the APIs.
April 23, 2014
Implementing Reactive Extensions (RX) using Java 8
In my last blog, I described new lambda support in Java 8. In order to try new Java features in more depth, I implemented Reactive extensions in Java 8. In short, reactive extensions allow processing synchronous and asynchronous data in a uniform manner. They provide unified interfaces that can be used as an iterator or as a callback method for asynchronous processing. Microsoft's RX library is huge, so I only implemented the core features and focused on the Observable API. Here is a brief overview of the implementation:
Creating Observable from Collection
Here is how you can create Observable from a collection:
List<String> names = Arrays.asList("One", "Two", "Three", "Four", "Five");
Observable.from(names).subscribe(System.out::println,
Throwable::printStackTrace, () -> System.out.println("done"));
In Microsoft’s version of RX, Observable takes an Observer for subscription, which defines three methods: onNext, onError and onCompleted. onNext is invoked to push the next element of data, onError is used to notify errors and onCompleted is called when all data has been processed. In my implementation, the Observable interface defines two overloaded subscribe methods: the first takes callback functions for onNext and onError, and the second takes three callback functions including onCompleted. I chose to use separate function parameters instead of a single interface so that the caller can pass inline lambda functions instead of passing an implementation of the Observer interface.
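To make the design choice concrete, here is a minimal, synchronous sketch of the two subscribe overloads. It is not the RxJava8 source: MiniObservable is a hypothetical name, there is no scheduler, and the callbacks are invoked on the caller's thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical, synchronous stand-in for an Observable; not the RxJava8 API itself.
class MiniObservable<T> {
    private final Iterable<T> source;

    private MiniObservable(Iterable<T> source) {
        this.source = source;
    }

    public static <T> MiniObservable<T> from(Iterable<T> source) {
        return new MiniObservable<>(source);
    }

    // Two-callback overload: onCompleted defaults to a no-op.
    public void subscribe(Consumer<? super T> onNext, Consumer<Throwable> onError) {
        subscribe(onNext, onError, () -> { });
    }

    // Three-callback overload: pushes each item, then signals completion,
    // or reports the first exception through onError and stops.
    public void subscribe(Consumer<? super T> onNext, Consumer<Throwable> onError,
                          Runnable onCompleted) {
        try {
            for (T item : source) {
                onNext.accept(item);
            }
        } catch (Exception e) {
            onError.accept(e);
            return;
        }
        onCompleted.run();
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("One", "Two", "Three");
        MiniObservable.from(names).subscribe(System.out::println,
            Throwable::printStackTrace, () -> System.out.println("done"));
    }
}
```

Since each callback is a plain functional interface, callers can pass lambdas or method references directly, which is the benefit described above.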
Creating Observable from Array of objects
Here is how you can create Observable from an array of objects:
Observable.from("Erica", "Matt", "John", "Mike").subscribe(System.out::println,
Throwable::printStackTrace, () -> System.out.println("done"));
Creating Observable from Stream
Here is how you can create Observable from stream:
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
// note third argument for onComplete is optional
Observable.from(names).subscribe(name -> System.out.println(name),
error -> error.printStackTrace());
Creating Observable from Iterator
Here is how you can create Observable from iterator:
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names.iterator()).subscribe(name -> System.out.println(name),
error -> error.printStackTrace());
Creating Observable from Spliterator
Here is how you can create Observable from spliterator:
List<String> names = Arrays.asList("One", "Two", "Three", "Four", "Five");
Observable.from(names.spliterator()).subscribe(System.out::println,
Throwable::printStackTrace);
Creating Observable from a single object
Here is how you can create Observable from a single object:
Observable.just("value").subscribe(v -> System.out.println(v),
error -> error.printStackTrace());
// if a single object is collection, it would be treated as a single entity, e.g.
Observable.just(Arrays.asList(1, 2, 3)).subscribe( num -> System.out.println(num),
error -> error.printStackTrace());
Creating Observable for an error
Here is how you can create Observable that would return an error:
Observable.throwing(new Error("test error")).subscribe(System.out::println,
error -> System.err.println(error));
// this will print error
Creating Observable from a consumer function
Here is how you can create an Observable from a user function that invokes the onNext, onError and onCompleted functions:
List<String> names = Arrays.asList("One", "Two", "Three", "Four", "Five");
Observable.create(observer -> {
for (String name : names) {
observer.onNext(name);
}
observer.onCompleted();
}).subscribe(System.out::println, Throwable::printStackTrace);
Creating Observable from range
Here is how you can create an Observable that emits numbers from the start of a range (inclusive) to its end (exclusive):
// Creates a range of numbers beginning at the first argument and stopping before the second
Observable.range(4, 8).subscribe(num -> System.out.println(num),
error -> error.printStackTrace());
// will print 4, 5, 6, 7
Creating empty Observable
It would call onCompleted right away:
Observable.empty().subscribe(System.out::println,
Throwable::printStackTrace, () -> System.out.println("Completed"));
Creating never Observable
It would not call any of the callback methods:
Observable.never().subscribe(System.out::println, Throwable::printStackTrace);
Changing Scheduler
By default Observable notifies observer asynchronously using thread-pool scheduler but you can change default scheduler as follows:
Using thread-pool scheduler
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).subscribeOn(Scheduler.getThreadPoolScheduler()).
subscribe(System.out::println, Throwable::printStackTrace);
Using new-thread scheduler
It will create new thread
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).subscribeOn(Scheduler.getNewThreadScheduler()).
subscribe(System.out::println, Throwable::printStackTrace);
Using timer thread with interval
It will notify at each interval
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).subscribeOn(Scheduler.getTimerSchedulerWithMilliInterval(1000)).
subscribe(System.out::println, Throwable::printStackTrace);
// this will print each name every second
Using immediate scheduler
This scheduler calls callback functions right away on the same thread. You can use this if you have synchronous data and don’t want to create another thread. On the downside, you cannot unsubscribe with this scheduler.
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).subscribeOn(Scheduler.getImmediateScheduler()).
subscribe(System.out::println, Throwable::printStackTrace);
Transforming
Observables keep a sequence of items as streams, and they support the map/flatMap operations of the standard Stream class, e.g.
Map
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).map(name -> name.hashCode()).
subscribe(System.out::println, Throwable::printStackTrace);
FlatMap
Stream<List<Integer>> integerListStream = Stream.of( Arrays.asList(1, 2),
Arrays.asList(3, 4), Arrays.asList(5));
Observable.from(integerListStream).flatMap(integerList -> integerList.stream()).
subscribe(System.out::println, Throwable::printStackTrace);
Filtering
Observables provide the basic filtering support of Java Streams, e.g.
Filter
Stream<String> names = Stream.of("One", "Two", "Three", "Four",
"Five");
Observable.from(names).filter(name -> name.startsWith("T")).
subscribe(System.out::println, Throwable::printStackTrace);
// This will only print Two and Three
Skip
skips given number of elements
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).skip(2).subscribe(System.out::println,
Throwable::printStackTrace);
// This will skip One and Two
Limit
Stream<String> names = Stream.of("One", "Two", "Three", "Four", "Five");
Observable.from(names).limit(2).subscribe(System.out::println,
Throwable::printStackTrace);
// This will only print first two strings
Distinct
Stream<String> names = Stream.of("One", "Two", "Three", "One");
Observable.from(names).distinct().subscribe(System.out::println,
Throwable::printStackTrace);
// This will print One only once
Merge
This concatenates the data of two observables:
Observable<Integer> observable1 = Observable.from(Stream.of(1, 2, 3));
Observable<Integer> observable2 = Observable.from(Stream.of(4, 5, 6));
observable1.merge(observable2).subscribe(System.out::println,
Throwable::printStackTrace);
// This will print 1, 2, 3, 4, 5, 6
Summary
In summary, as Java 8 already supports a lot of functional primitives, adding support for reactive extensions was quite straightforward. For example, Netflix’s implementation of reactive extensions in Java consists of over 80K lines of code, but it took only a few hundred lines to implement the core features with Java 8. You can download or fork the code from https://github.com/bhatti/RxJava8.
April 17, 2014
Introduction to Java 8 Lambda and Stream Syntax
Introduction
Java 8 was released in March 2014 with the most significant language-level enhancements since Java 5 back in 2004. The biggest new feature is the introduction of lambdas. A lambda or closure is a block of code that you can pass to other methods or return from methods. Previously, Java supported a form of closure via anonymous class syntax, e.g.
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.Dimension;
import java.awt.FlowLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
public class SwingExample extends JFrame {
public SwingExample() {
this.getContentPane().setLayout(new FlowLayout());
final JButton btn = new JButton("Click Me");
btn.setPreferredSize(new Dimension(400,200));
add(btn);
btn.addActionListener(new ActionListener() {
public void actionPerformed(ActionEvent e) {
btn.setText("Clicked");
btn.setEnabled(false);
}
});
}
private static void createAndShowGUI() {
JFrame frame = new SwingExample();
frame.pack();
frame.setVisible(true);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
}
public static void main(String[] args) {
javax.swing.SwingUtilities.invokeLater(new Runnable() {
public void run() {
createAndShowGUI();
}
});
}
}
In the above example, the anonymous class can use local variables of the method that declares it as long as they are declared final. Here is how the example looks with Java 8 syntax:
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import java.awt.Dimension;
import java.awt.FlowLayout;
import javax.swing.JButton;
import javax.swing.JFrame;
public class SwingExample8 extends JFrame {
public SwingExample8() {
this.getContentPane().setLayout(new FlowLayout());
JButton btn = new JButton("Click Me");
btn.setPreferredSize(new Dimension(400,200));
add(btn);
btn.addActionListener(e -> {
btn.setText("Clicked");
btn.setEnabled(false);
}
);
}
private static void createAndShowGUI() {
JFrame frame = new SwingExample8();
frame.pack();
frame.setVisible(true);
frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
}
public static void main(String[] args) {
javax.swing.SwingUtilities.invokeLater(new Runnable() {
public void run() {
createAndShowGUI();
}
});
}
}
As you can see, lambda syntax is very minimal. In addition, lambda syntax doesn’t require that you declare captured local variables as final, though they must not be modified (they must be effectively final). Java lambdas also add type inference so that you don’t have to declare the types of arguments. The lambda features are implemented using the “invokedynamic” instruction to dispatch method calls, which was added in Java 7 to support dynamic languages. For example, let’s take a simple class:
public class Run8 {
public static void main(String[] args) {
Runnable r = () -> System.out.println("hello there");
r.run();
}
}
You can de-compile it using:
javap -p Run8
You will see, it generated lambda$main$0 method, e.g.
public class Run8 {
public Run8();
public static void main(java.lang.String[]);
private static void lambda$main$0();
}
You can see real byte code using:
javap -p -c Run8
public class Run8 {
public Run8();
Code:
0: aload_0
1: invokespecial #1 // Method java/lang/Object."<init>":()V
4: return
public static void main(java.lang.String[]);
Code:
0: invokedynamic #2, 0 // InvokeDynamic #0:run:()Ljava/lang/Runnable;
5: astore_1
6: aload_1
7: invokeinterface #3, 1 // InterfaceMethod java/lang/Runnable.run:()V
12: return
private static void lambda$main$0();
Code:
0: getstatic #4 // Field java/lang/System.out:Ljava/io/PrintStream;
3: ldc #5 // String hello there
5: invokevirtual #6 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
8: return
}
This means lambdas don’t have to keep a reference to the enclosing class, and “this” inside a lambda does not create a new scope; it refers to the enclosing instance.
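The scoping difference is easiest to see side by side. In this small sketch (ThisScope is a made-up class name), the lambda's “this” is the enclosing object, while the anonymous class has a “this” of its own:

```java
import java.util.function.Supplier;

// Shows that "this" inside a lambda is the enclosing instance, while an
// anonymous class introduces its own "this".
class ThisScope {
    @Override
    public String toString() {
        return "outer";
    }

    Supplier<String> viaLambda() {
        return () -> this.toString();   // "this" is the ThisScope instance
    }

    Supplier<String> viaAnonymousClass() {
        return new Supplier<String>() {
            @Override
            public String get() {
                return this.toString(); // "this" is the anonymous Supplier
            }

            @Override
            public String toString() {
                return "inner";
            }
        };
    }

    public static void main(String[] args) {
        ThisScope scope = new ThisScope();
        System.out.println(scope.viaLambda().get());          // outer
        System.out.println(scope.viaAnonymousClass().get());  // inner
    }
}
```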
Types of Functions
Java 8 provides predefined functions (See http://docs.oracle.com/javase/8/docs/api/java/util/function/package-summary.html), but there are four major types:
Supplier: () -> T
Consumer: T -> ()
Predicate: T -> boolean
Function: T -> R
The supplier takes no arguments and produces an object, the consumer takes an argument for consumption, the predicate evaluates a given argument by returning true/false, and the function maps an argument of type T to an object of type R.
Supplier example
import java.util.function.*;
public class Supply8 {
public static void main(String[] args) {
Supplier<Double> random1 = Math::random;
System.out.println(random1.get());
//
DoubleSupplier random2 = Math::random;
System.out.println(random2.getAsDouble());
}
}
Note: Java 8 provides special functions for primitive types that you can use instead of using wrapper classes for primitive types. Here is another example that shows how you can write efficient log messages:
import java.util.function.*;
public class SupplyLog {
private static boolean debugEnabled;
public static void debug(Supplier<String> msg) {
if (debugEnabled) {
System.out.println(msg.get());
}
}
public static void main(String[] args) {
debug(() -> "this will not be printed");
debugEnabled = true;
debug(() -> "this will be printed");
}
}
Consumer example
import java.util.function.*;
public class Consume8 {
public static void main(String[] args) {
Consumer<String> consumer = s -> System.out.println(s);
consumer.accept("hello there");
consumer.andThen(consumer).accept("this will be printed twice");
}
}
Predicate example
import java.util.function.*;
public class Predicate8 {
public static void main(String[] args) {
Predicate<Integer> gradeA = score -> score >= 90;
System.out.println(gradeA.test(80));
System.out.println(gradeA.test(90));
}
}
In addition to the test method, you can also use the and, negate and or methods to combine predicates.
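As a brief illustration of combining predicates, the sketch below builds two derived predicates from simpler ones. The class name and the score thresholds are made up for the example.

```java
import java.util.function.Predicate;

class PredicateCombine {
    static final Predicate<Integer> GRADE_A = score -> score >= 90;
    static final Predicate<Integer> PASSING = score -> score >= 60;
    // True for scores that pass but are below an A.
    static final Predicate<Integer> PASSED_BELOW_A = PASSING.and(GRADE_A.negate());
    // True for scores that are either an A or failing.
    static final Predicate<Integer> A_OR_FAILING = GRADE_A.or(PASSING.negate());

    public static void main(String[] args) {
        System.out.println(PASSED_BELOW_A.test(75)); // true
        System.out.println(PASSED_BELOW_A.test(95)); // false
        System.out.println(A_OR_FAILING.test(40));   // true
        System.out.println(A_OR_FAILING.test(75));   // false
    }
}
```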
Function example
import java.util.function.*;
public class Function8 {
public static void main(String[] args) {
BinaryOperator<Integer> adder = (n1, n2) -> n1 + n2;
System.out.println("sum " + adder.apply(4, 5));
Function<Double, Double> square = x -> x * x;
System.out.println("square " + square.apply(5.0));
}
}
Custom Functions
In addition to predefined functions, you can define your own interface for functions as long as it declares a single abstract method. You can optionally annotate the interface with @FunctionalInterface so that the compiler can verify it, e.g.
public class CustomFunction8 {
@FunctionalInterface
interface Command<T> {
void execute(T obj);
}
private static <T> void invoke(Command<T> cmd, T arg) {
cmd.execute(arg);
}
public static void main(String[] args) {
Command<Integer> cmd = arg -> System.out.println(arg);
invoke(cmd, 5);
}
}
Method Reference
In addition to passing lambda, you can also pass instance or static methods as closures using method reference. There are four kinds of method references:
- Reference to a static method ContainingClass::staticMethodName
- Reference to an instance method of a particular object ContainingObject::instanceMethodName
- Reference to an instance method of an arbitrary object of a particular type ContainingType::methodName
- Reference to a constructor ClassName::new
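The four kinds listed above can be sketched with standard JDK classes only. The class and field names here are chosen for illustration.

```java
import java.util.function.BiFunction;
import java.util.function.Function;
import java.util.function.Supplier;

class MethodRefKinds {
    private static final String GREETING = "hello";

    // 1. Reference to a static method
    static final Function<String, Integer> PARSE = Integer::parseInt;
    // 2. Reference to an instance method of a particular object
    static final Supplier<Integer> LENGTH = GREETING::length;
    // 3. Reference to an instance method of an arbitrary object of a particular type
    static final BiFunction<String, String, Boolean> SAME = String::equalsIgnoreCase;
    // 4. Reference to a constructor
    static final Function<String, StringBuilder> BUILDER = StringBuilder::new;

    public static void main(String[] args) {
        System.out.println(PARSE.apply("42"));               // 42
        System.out.println(LENGTH.get());                    // 5
        System.out.println(SAME.apply("Java", "JAVA"));      // true
        System.out.println(BUILDER.apply("ab").append("c")); // abc
    }
}
```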
Streams
In addition to lambda support, Java 8 has updated collection classes to support streams. Streams don’t store elements; they behave as pipelines that compute lazily. Whereas collections have a finite size, streams can be unbounded, and a stream can be consumed only once. Streams can be accessed from collections using stream() and parallelStream() methods or from an array via Arrays.stream(Object[]). There are also static factory methods on the stream classes, such as Stream.of(Object[]), IntStream.range(int, int), etc. Common intermediate methods that act as pipes and that you can invoke on streams:
- filter()
- distinct()
- limit()
- map()
- peek()
- sorted()
- unordered()
Of the above, sorted, distinct and limit are stateful, whereas filter, map, peek and unordered are stateless. And here are terminal operations that trigger evaluation on streams:
- findFirst()
- min()
- max()
- reduce()
- sum() (on primitive streams such as IntStream)
You can implement your own streams by using helper methods in StreamSupport class.
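For instance, StreamSupport can turn any Iterator into a stream. The following is a small sketch in which the helper name streamOf is my own, while Spliterators.spliteratorUnknownSize and StreamSupport.stream are standard JDK methods.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Collectors;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

class IteratorStream {
    // Wraps an Iterator in a sequential Stream; the size is unknown,
    // so the only characteristic reported is ORDERED.
    static <T> Stream<T> streamOf(Iterator<T> iterator) {
        Spliterator<T> spliterator =
            Spliterators.spliteratorUnknownSize(iterator, Spliterator.ORDERED);
        return StreamSupport.stream(spliterator, false);
    }

    public static void main(String[] args) {
        Iterator<String> symbols = Arrays.asList("AAPL", "MSFT", "ORCL").iterator();
        List<String> lower = streamOf(symbols)
            .map(String::toLowerCase)
            .collect(Collectors.toList());
        System.out.println(lower); // [aapl, msft, orcl]
    }
}
```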
Iterating
Java 8 streams support a forEach method for iterating, which takes a consumer function, e.g.
import java.util.stream.Stream;
public class StreamForEach {
public static void main(String[] args) {
Stream<String> symbols = Stream.of("AAPL", "MSFT", "ORCL", "NFLX", "TSLA");
symbols.forEach(System.out::println);
}
}
Parallel iteration:
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;
public class ParStreamForEach {
public static void main(String[] args) {
List<String> symbols = Arrays.asList("AAPL", "MSFT", "ORCL", "NFLX", "TSLA");
System.out.println("unordered");
symbols.parallelStream().forEach(System.out::println);
System.out.println("ordered");
symbols.parallelStream().forEachOrdered(System.out::println);
}
}
Note that iterating a parallel stream with forEach is unordered by default, but you can force ordered iteration by using the forEachOrdered method instead.
Filtering
We already saw predicate functions; the filtering support in streams extracts the elements of a collection for which a given predicate evaluates to true. Let’s create a couple of classes that we will use later:
import java.util.ArrayList;
import java.util.Collection;
import java.util.function.IntPredicate;
public class Game implements IntPredicate {
enum Type {
AGE_ALL,
AGE_13_OR_ABOVE,
AGE_18_OR_ABOVE
}
public final String name;
public final Type type;
public final Collection<Player> players = new ArrayList<>();
public Game(String name, Type type) {
this.name = name;
this.type = type;
}
public boolean suitableForAll() {
return type == Type.AGE_ALL;
}
public void add(Player player) {
if (test(player.age)) {
this.players.add(player);
}
}
@Override
public boolean test(int age) {
switch (type) {
case AGE_18_OR_ABOVE:
return age >= 18;
case AGE_13_OR_ABOVE:
return age >= 13;
default:
return true;
}
}
@Override
public String toString() {
return name;
}
}
public class Player {
public final String name;
public final int age;
public Player(String name, int age) {
this.name = name;
this.age = age;
}
@Override
public String toString() {
return name;
}
}
Now let’s create a class that will filter games by types:
import java.util.Arrays;
import java.util.Collection;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GameFilter {
public static void main(String[] args) {
Collection<Game> games = Arrays.asList(new Game("Birdie", Game.Type.AGE_ALL), new Game("Draw", Game.Type.AGE_ALL), new Game("Poker", Game.Type.AGE_18_OR_ABOVE), new Game("Torpedo", Game.Type.AGE_13_OR_ABOVE));
Collection<Game> suitableForAll = games.stream().filter(Game::suitableForAll).collect(toList());
System.out.println("suitable for all");
suitableForAll.stream().forEach(System.out::println);
Collection<Game> adultOnly = games.stream().filter(game -> game.type == Game.Type.AGE_18_OR_ABOVE).limit(10).collect(toList());
System.out.println("suitable for adults only");
adultOnly.stream().forEach(System.out::println);
}
}
As you can see, filter can accept a lambda or a method reference.
Map
Map operation on streams applies a given function to transform each element in stream and produces another stream with transformed elements.
import java.util.Arrays;
import java.util.Collection;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GameMap {
public static void main(String[] args) {
Collection<Game> games = Arrays.asList(new Game("Birdie", Game.Type.AGE_ALL), new Game("Draw", Game.Type.AGE_ALL), new Game("Poker", Game.Type.AGE_18_OR_ABOVE), new Game("Torpedo", Game.Type.AGE_13_OR_ABOVE));
Collection<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
for (Game game : games) {
for (Player player : players) {
game.add(player);
}
}
//
Collection<Game.Type> types = games.stream().map(game -> game.type).collect(toList());
System.out.println("types:");
types.stream().forEach(System.out::println);
Collection<Player> allPlayers = games.stream().flatMap(game -> game.players.stream()).collect(toList());
System.out.println("\nplayers:");
allPlayers.stream().forEach(System.out::println);
}
}
Note that flatMap takes a collection of objects for each input, flattens them and produces a single stream. Java streams also provide map methods for primitive types such as mapToLong, mapToDouble, etc.
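Here is a compact, standalone sketch of both ideas, using plain lists instead of the Game and Player classes so that it runs on its own. The class and method names are illustrative.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlatMapPrimitive {
    // flatMap turns a stream of lists into one stream of their elements.
    static List<Integer> flatten(List<List<Integer>> nested) {
        return nested.stream().flatMap(List::stream).collect(Collectors.toList());
    }

    // mapToInt produces an IntStream, which avoids boxing and offers sum().
    static int totalLength(List<String> words) {
        return words.stream().mapToInt(String::length).sum();
    }

    public static void main(String[] args) {
        List<List<Integer>> nested = Arrays.asList(
            Arrays.asList(1, 2), Arrays.asList(3, 4), Arrays.asList(5));
        System.out.println(flatten(nested));                              // [1, 2, 3, 4, 5]
        System.out.println(totalLength(Arrays.asList("AAPL", "MSFT")));   // 8
    }
}
```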
Sorting
Previously, you had to implement Comparable interface or provide Comparator for sorting but you can now pass lambda for comparison, e.g.
import java.util.Arrays;
import java.util.List;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
import java.util.Comparator;
public class GameSort {
public static void main(String[] args) {
List<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
players.sort(Comparator.comparing(player -> player.age));
System.out.println(players);
}
}
Min/Max
Java streams provide helper methods for calculating min/max, e.g.
import java.util.Arrays;
import java.util.Collection;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
import java.util.Comparator;
public class GameMinMax {
public static void main(String[] args) {
Collection<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
Player min = players.stream().min(Comparator.comparing(player -> player.age)).get();
Player max = players.stream().max(Comparator.comparing(player -> player.age)).get();
System.out.println("min " + min + ", max " + max);
}
}
Reduce/Fold
Reduce or fold generalizes the problem where we compute a single value from a collection, e.g.
import java.util.Arrays;
import java.util.Collection;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GameReduce {
public static void main(String[] args) {
Collection<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
double averageAge1 = players.stream().mapToInt(player -> player.age).average().getAsDouble();
double averageAge2 = (double) players.stream().mapToInt(player -> player.age).reduce(0, Integer::sum) / players.size();
double averageAge3 = (double) players.stream().mapToInt(player -> player.age).reduce(0, (sum, age) -> sum + age) / players.size();
System.out.println("average age " + averageAge1 + ", " + averageAge2 + ", or " + averageAge3);
}
}
Grouping/Partitioning
The groupingBy method of Collectors allows grouping a collection, e.g.
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GameGrouping {
public static void main(String[] args) {
Collection<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
Map<Integer, List<Player>> playersByAge = players.stream().collect(groupingBy(player -> player.age));
System.out.println(playersByAge);
}
}
partitioningBy splits a collection into two groups based on a predicate, e.g.
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GamePartition {
public static void main(String[] args) {
Collection<Player> players = Arrays.asList(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
Map<Boolean, List<Player>> playersByAge = players.stream().collect(partitioningBy(player -> player.age >= 18));
System.out.println(playersByAge);
}
}
String joining
Here is an example of creating a string from collection:
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
public class GameJoin {
public static void main(String[] args) {
Stream<Player> players = Stream.of(new Player("John", 10), new Player("David", 15), new Player("Matt", 20), new Player("Dan", 30), new Player("Erica", 5));
System.out.println(players.map(Player::toString).collect(joining(",", "[", "]")));
}
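}
Lazy Evaluation
The Java Map interface now supports lazy evaluation through computeIfAbsent, which computes and adds a value only if the key is not present. Its mapping function must not modify the map itself, so a recursive memoizer such as the one below checks and fills the cache explicitly (a recursive computeIfAbsent on a HashMap throws ConcurrentModificationException on Java 9 and later):
import java.util.HashMap;
import java.util.Map;
public class Fib {
private final static Map<Integer,Long> cache = new HashMap<Integer, Long>() {{
put(0,0L);
put(1,1L);
}};
public static long fib(int x) {
// Look up the memoized value first; compute and store it only on a miss
Long cached = cache.get(x);
if (cached != null) {
return cached;
}
long value = fib(x - 1) + fib(x - 2);
cache.put(x, value);
return value;
}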
public static void main(String[] args) {
System.out.println(fib(10));
}
}
Parallel Streams
Streams process elements serially by default, but you can change the stream() method of a collection to parallelStream() to take advantage of parallel processing. Parallel streams use the common ForkJoinPool by default, whose size is derived from the number of processors (Runtime.getRuntime().availableProcessors()), e.g.
import java.util.Arrays;
import java.util.List;
import java.util.stream.IntStream;
import java.util.stream.Stream;
public class ParStreamPrime {
private static boolean isPrime(int n) {
if (n < 2) return false;
if (n%2==0) return n == 2;
for(int i=3;i*i<=n;i+=2) {
if(n%i==0) return false;
}
return true;
}
private static long serialTest(int max) {
long started = System.currentTimeMillis();
IntStream.rangeClosed(1, max).forEach(num -> isPrime(num));
return System.currentTimeMillis() - started;
}
private static long parallelTest(int max) {
long started = System.currentTimeMillis();
IntStream.rangeClosed(1, max).parallel().forEach(num -> isPrime(num));
return System.currentTimeMillis() - started;
}
//
public static void main(String[] args) {
int max = 1000000;
System.out.println("Serial " + serialTest(max));
System.out.println("Parallel " + parallelTest(max));
}
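}
If you need to customize the thread pool size, you can run the parallel stream inside your own ForkJoinPool (this relies on an implementation detail of the stream library rather than a documented guarantee), e.g.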
private static void parallelTest(final int max) throws Exception {
ForkJoinPool forkJoinPool = new ForkJoinPool(2);
// The parallel stream runs on the pool that executes the submitted task
forkJoinPool.submit(() ->
IntStream.rangeClosed(1, max).parallel().forEach(num -> isPrime(num))
).get();
forkJoinPool.shutdown();
}
Default methods and Mixins
Anyone who has to support interfaces for multiple clients knows the frustration of adding new methods, because it requires updating all clients. You can now add default methods and static methods to interfaces, e.g.
interface Vehicle {
float getMaxSpeedMPH();
public static String getType(Vehicle v) {
return v.getClass().getSimpleName();
}
}
interface Car extends Vehicle {
void drive();
public default float getMaxSpeedMPH() {
return 200;
}
}
interface Boat extends Vehicle {
void row();
public default float getMaxSpeedMPH() {
return 100;
}
}
interface Plane extends Vehicle {
void fly();
public default float getMaxSpeedMPH() {
return 500;
}
}
public class AmphiFlyCar implements Car, Boat, Plane {
@Override
public void drive() {
System.out.println("drive");
}
@Override
public void row() {
System.out.println("row");
}
@Override
public void fly() {
System.out.println("fly");
}
public float getMaxSpeedMPH() {
return Plane.super.getMaxSpeedMPH();
}
public static void main(String[] args) {
AmphiFlyCar v = new AmphiFlyCar();
System.out.println(Vehicle.getType(v) + ": " + v.getMaxSpeedMPH());
}
}
Optional
Tony Hoare called nulls a billion dollar mistake, and Java has an infamous NullPointerException error when you dereference a null object. Now you can avoid many of those nasty errors using Optional, which acts like the Maybe monad in other languages, e.g.
import java.util.HashMap;
import java.util.Map;
import static java.util.stream.Collectors.*;
import java.util.stream.Stream;
import java.util.Optional;
public class OptionalExample {
private static Map<String, Player> players = new HashMap<String, Player>() {{
put("John", new Player("John", 10));
put("David", new Player("David", 15));
put("Matt", new Player("Matt", 20));
put("Erica", new Player("Erica", 25));
}};
private static Optional<Player> findPlayerByName(String name) {
Player player = players.get(name);
return player == null ? Optional.empty() : Optional.of(player);
}
private static Integer getAge(Player player) {
return player.age;
}
public static void main(String[] args) {
findPlayerByName("John").ifPresent(System.out::println);
Player player = findPlayerByName("Jeff").orElse(new Player("Jeff", 40));
System.out.println("orElse " + player);
Integer age = findPlayerByName("Jeff").map(OptionalExample::getAge).orElse(-1);
System.out.println("Jeff age " + age);
}
}
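Optional calls can also be chained so that a missing value simply flows through to a default. The sketch below is standalone and uses a made-up email lookup (the names and addresses are hypothetical) rather than the Player class above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class OptionalChain {
    // Hypothetical sample data for the illustration.
    private static final Map<String, String> EMAILS = new HashMap<>();
    static {
        EMAILS.put("John", "john@example.com");
        EMAILS.put("Matt", "matt-without-domain");
    }

    // ofNullable wraps a possibly-null lookup result in one step.
    static Optional<String> findEmail(String name) {
        return Optional.ofNullable(EMAILS.get(name));
    }

    // Returns the domain of a user's email, or "unknown" when the user
    // is missing or the stored address has no '@'.
    static String domainOf(String name) {
        return findEmail(name)
            .filter(email -> email.contains("@"))
            .map(email -> email.substring(email.indexOf('@') + 1))
            .orElse("unknown");
    }

    public static void main(String[] args) {
        System.out.println(domainOf("John")); // example.com
        System.out.println(domainOf("Matt")); // unknown
        System.out.println(domainOf("Jeff")); // unknown
    }
}
```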
CompletableFuture
Java has long supported future, but previously you had to call blocking get() to retrieve the result. With Java 8, you can use CompletableFuture to define the behavior when asynchronous processing is completed, e.g.
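import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
private static final ExecutorService executor = Executors.newFixedThreadPool(2);
public static CompletableFuture<Player> getPlayer(String name) {
// Player(name, age) as in the Optional example; the age here is only a placeholder
return CompletableFuture.supplyAsync(() -> new Player(name, 0), executor);
}
// thenAccept registers a callback instead of blocking on get()
getPlayer("name").thenAccept(player -> System.out.println(player));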
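Callbacks can also be composed. The sketch below is an illustration under assumptions, not code from the original post: loadName and loadScore are hypothetical lookups standing in for slow remote calls. It runs two tasks in parallel, merges their results with thenCombine, and supplies a fallback value with exceptionally if either task fails.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FutureChaining {
    // Hypothetical lookups standing in for slow remote calls.
    static String loadName(int id) {
        return "player-" + id;
    }

    static int loadScore(int id) {
        if (id < 0) {
            throw new IllegalArgumentException("bad id");
        }
        return id * 10;
    }

    static CompletableFuture<String> describe(int id, ExecutorService executor) {
        CompletableFuture<String> name =
                CompletableFuture.supplyAsync(() -> loadName(id), executor);
        CompletableFuture<Integer> score =
                CompletableFuture.supplyAsync(() -> loadScore(id), executor);
        return name
                // Runs once both futures have completed successfully.
                .thenCombine(score, (n, s) -> n + " scored " + s)
                // Replaces a failure in either future with a fallback value.
                .exceptionally(e -> "lookup failed");
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newFixedThreadPool(2);
        try {
            // join() blocks; it is used here only so the demo prints before exiting.
            System.out.println(describe(7, executor).join());  // player-7 scored 70
            System.out.println(describe(-1, executor).join()); // lookup failed
        } finally {
            executor.shutdown();
        }
    }
}
```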
Summary
Over the past few years, functional programming languages have become mainstream, and Java 8 brings many of their capabilities, despite being late. These new features help you write better and more concise code. You will have to change existing code to make more use of immutable objects and to use streams instead of objects when possible. As for stability, I found the Java 8 compiler on Linux a bit buggy; it crashed often, so it may take a little while before Java 8 can be used in production.
September 12, 2013
NFJS Seattle 2013 Review
I attended the NFJS conference over the weekend. It was a short conference, from Friday, Sep 6 to Sunday, Sep 8. It had a number of sessions on Java 8, JavaScript, mobile, and functional programming. Here are some of the sessions that I enjoyed:
Java 8 Language Capabilities
This session gave a brief overview of the new features of Java 8, mainly the new closure/lambda syntax. Venkat Subramaniam is a great speaker, and he was coding live throughout the session.
Concurrency without Pain in Pure Java
This was related to the first talk by Venkat Subramaniam and covered a number of concurrency patterns such as Actors and STM.
Functional SOLID
In this talk, Matt Stine gave an overview of the SOLID principles and how functional programming makes it easier to apply them. This was a more abstract talk and didn’t go into examples of those principles.
Programming with Immutability
This was a more practical talk by Matt Stine, and he gave examples of immutability and functional programming in Java and Groovy with live coding. He mentioned a number of tools that make it easier to build a mutable and immutable pair of classes, and showed how immutable classes can be used with other frameworks such as Hibernate.
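To illustrate the mutable/immutable pair idea, here is a hand-written sketch in plain Java. It does not use any of the tools mentioned in the talk, and the Team class and its Builder are hypothetical names chosen only for this example: the builder is the mutable half, and the built object is the immutable half.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public final class Team {
    private final String name;
    private final List<String> members;

    private Team(String name, List<String> members) {
        this.name = name;
        // Defensive copy, so later changes to the builder do not leak in.
        this.members = Collections.unmodifiableList(new ArrayList<>(members));
    }

    public String getName() { return name; }
    public List<String> getMembers() { return members; }

    // Instead of a setter, return a modified copy and leave this object unchanged.
    public Team withName(String newName) {
        return new Team(newName, members);
    }

    // Mutable counterpart used only while assembling a Team.
    public static final class Builder {
        private String name = "";
        private final List<String> members = new ArrayList<>();

        public Builder name(String name) { this.name = name; return this; }
        public Builder member(String member) { members.add(member); return this; }
        public Team build() { return new Team(name, members); }
    }

    public static void main(String[] args) {
        Builder builder = new Builder().name("Reds").member("John");
        Team team = builder.build();
        builder.member("Erica"); // does not affect the Team that was already built
        Team renamed = team.withName("Blues");
        System.out.println(team.getName() + " " + team.getMembers());       // Reds [John]
        System.out.println(renamed.getName() + " " + renamed.getMembers()); // Blues [John]
    }
}
```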
Rich Web Apps with Angular
This was a short introduction to Angular by Raju Gandhi.
Vagrant: Virtualized Development Environments Made Simple
This was a great introduction to Vagrant and how to set up complete development, testing and production environments on your local desktop.
Simulation Testing with Simulant
This was a short introduction to the Simulant testing library that Stuart Halloway has been using for testing the Datomic database. This library can be used for functional testing and load testing. It saves all of its data in a Datomic database and can be populated from existing data.
Generative Testing
This was another session on a testing framework by Stuart Halloway. This framework provides great support for generating test data and is somewhat similar to QuickCheck, though it doesn’t offer reduction (shrinking failing inputs to a minimal case).
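The idea behind generative testing can be sketched in a few lines of plain Java. This is not the framework from the session, only an illustration under assumptions: randomList, findCounterexample and the two properties are hypothetical names. Random inputs are generated from a fixed seed, a property is checked against each one, and the first failing input is reported as is, with no shrinking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Random;
import java.util.function.Predicate;

public class GenerativeCheck {
    // Generates a random list of 0 to 20 integers in the range [-100, 100].
    static List<Integer> randomList(Random rnd) {
        int size = rnd.nextInt(21);
        List<Integer> list = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            list.add(rnd.nextInt(201) - 100);
        }
        return list;
    }

    // Checks the property against `runs` generated inputs and returns the
    // first input that violates it, or an empty Optional if none does.
    static Optional<List<Integer>> findCounterexample(
            Predicate<List<Integer>> property, int runs, long seed) {
        Random rnd = new Random(seed); // fixed seed keeps failures reproducible
        for (int i = 0; i < runs; i++) {
            List<Integer> input = randomList(rnd);
            if (!property.test(input)) {
                return Optional.of(input);
            }
        }
        return Optional.empty();
    }

    // Property: sorting yields an ascending list of the same size.
    static boolean sortKeepsSizeAndOrder(List<Integer> input) {
        List<Integer> sorted = new ArrayList<>(input);
        Collections.sort(sorted);
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i - 1) > sorted.get(i)) {
                return false;
            }
        }
        return sorted.size() == input.size();
    }

    public static void main(String[] args) {
        // A property that holds for every input: no counterexample is found.
        System.out.println(
                findCounterexample(GenerativeCheck::sortKeepsSizeAndOrder, 200, 42L).isPresent());
        // A deliberately wrong property ("lists never exceed 5 elements").
        findCounterexample(list -> list.size() <= 5, 200, 42L)
                .ifPresent(bad -> System.out.println("counterexample: " + bad));
    }
}
```

A shrinking step, as in QuickCheck, would go further and try to reduce the reported list to the smallest one that still fails.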
Summary
I didn’t go to OSCON this year and enjoyed the smaller setting that NFJS provided. There were a couple of dud sessions, but overall I enjoyed it.