Tips from “Algorithms to Live By”

November 15, 2016

Filed under: Algorithms,Computing — admin @ 10:51 pm

“Algorithms to Live By” by Brian Christian and Tom Griffiths reviews computer algorithms from several domains and shows, with practical examples, how to apply them to real-life problems. Here is a list of the algorithms that I found most useful:

1. Optimal Stopping

This class of problems determines the optimal time to stop further processing when searching or selecting an option. Here are a few examples:

Secretary Hiring Problem
This is a famous math problem, often credited to the mathematician Merrill Flood. Its solution is the “Look-Then-Leap Rule”: you review the first 37% of the candidates without hiring anyone, and then hire the first candidate who is better than all of the candidates seen so far. There are several other applications of this algorithm, such as finding a life partner or apartment hunting. The basic problem assumes that you cannot go back to a candidate once you have rejected them, but there are variations that allow it, as well as variations where the selected candidate may reject your offer.
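
As a rough illustration, the Look-Then-Leap Rule described above can be sketched in Python. The function names, the numeric candidate scores, and the fallback of settling for the last candidate are assumptions made for this sketch, not details taken from the book.

```python
import random

def look_then_leap(scores, look_fraction=0.37):
    """Return the score of the candidate picked by the Look-Then-Leap Rule."""
    cutoff = int(len(scores) * look_fraction)
    # Look phase: remember the best candidate seen, but hire no one.
    best_seen = max(scores[:cutoff], default=float("-inf"))
    # Leap phase: hire the first candidate who beats everyone seen so far.
    for score in scores[cutoff:]:
        if score > best_seen:
            return score
    # Assumption: if nobody beats the benchmark, settle for the last candidate.
    return scores[-1]

def success_rate(num_candidates=100, trials=5000, seed=1):
    """Estimate how often the rule picks the single best candidate."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        # Candidates arrive in random order; the best one has the top score.
        scores = rng.sample(range(num_candidates), num_candidates)
        if look_then_leap(scores) == num_candidates - 1:
            wins += 1
    return wins / trials
```

With 100 candidates, `success_rate()` should land somewhere near 37%, which is where the rule gets its number.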

Selling a House
When selling a house, you need to weigh the range of offers you can expect against the cost of waiting for a better offer.

Finding a Parking Spot
Given the percentage of parking spots that are occupied, you determine how far from your destination you should start taking the first vacant spot you see; any vacant spots before that distance are passed up.

2. Explore/Exploit

In this chapter, the authors describe several algorithms for exploring the available options and then exploiting the best one. Here is a sampling of the approaches based on explore/exploit:
Multi-armed bandit
Given several slot machines, each with an unknown expected value (winnings / number of pulls), you need to maximize your total winnings. There are several approaches, such as:

Win-Stay
You keep using a slot machine as long as you are winning and then switch to a different machine when you lose.
Gittins Index
It is named after John Gittins, who was a professor at Oxford. It tries to maximize future payoffs by calculating a Gittins index for every slot machine and then selecting the machine with the highest index.
Regret and optimism
Many decisions in life can be framed in terms of regret and optimism by imagining yourself on your deathbed and thinking about the decisions you could have made differently.
Upper Confidence Bound
It is also referred to as optimism in the face of uncertainty, where you choose your actions as if the environment is as nice as is plausibly possible. Given a range of plausible values for each option, you pick the option whose confidence interval has the highest upper bound.
A/B Testing
It is often used to test new features by offering the new features to a subset of the customers.
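
To make the Upper Confidence Bound idea above more concrete, here is a minimal sketch using UCB1, one common variant. The choice of UCB1, the function name, and the representation of the machines as two parallel lists are assumptions for illustration; the post does not say which formula the book has in mind.

```python
import math

def ucb1_select(pulls, winnings):
    """Pick the machine with the highest upper confidence bound (UCB1).

    pulls[i] is how often machine i was played and winnings[i] is its
    total payout so far.
    """
    # Try every machine once before comparing bounds.
    for machine, count in enumerate(pulls):
        if count == 0:
            return machine
    total_pulls = sum(pulls)

    def upper_bound(machine):
        mean = winnings[machine] / pulls[machine]
        # The bonus shrinks as a machine is played more often.
        bonus = math.sqrt(2 * math.log(total_pulls) / pulls[machine])
        return mean + bonus

    return max(range(len(pulls)), key=upper_bound)
```

A machine that has barely been tried gets a large bonus, so it can be chosen over a machine with a better average; that is the "optimism" in the name.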

One insight the authors present is that people often explore for longer than they should, favoring a new option over the best older one.

3. Sorting

In this chapter, the authors describe several sorting algorithms and their computing cost in terms of O-notation. O-notation is generally used to indicate an algorithm's worst-case performance as the input size N grows, such as:

O(1): Constant cost
O(N): Linear cost
O(N^2): Quadratic cost
O(2^N): Exponential cost
O(N!): Factorial cost

Merge-Sort
This algorithm recursively splits the data into smaller sets until each set holds a single element. It then merges those sets back together, in order, to create a new sorted list, at a total cost of O(N log N).
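
The split-then-merge steps can be sketched as follows. This is a plain textbook version written for readability, not code from the book.

```python
def merge_sort(items):
    """Return a new sorted list using recursive splitting and merging."""
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort(items[:mid])
    right = merge_sort(items[mid:])
    # Merge: repeatedly take the smaller front element of the two halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    # One half is exhausted; the rest of the other is already sorted.
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```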

Bucket-Sort
A group of n items can be grouped into m buckets in O(nm) time. Bucket sort uses this insight: items are grouped into a number of ordered buckets, without sorting the items inside each bucket. For example, you can use this approach to load returned books into carts based on their shelf numbers.
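
The library example might look like the sketch below. The function name and the `(title, shelf_number)` pairs are hypothetical, and the use of a dictionary for the carts is a convenience of this sketch.

```python
from collections import defaultdict

def load_carts(books):
    """Group (title, shelf_number) pairs into one cart per shelf."""
    carts = defaultdict(list)
    for title, shelf in books:
        carts[shelf].append(title)  # no sorting inside a cart
    # Carts are returned in shelf order; titles keep their arrival order.
    return [(shelf, carts[shelf]) for shelf in sorted(carts)]
```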

Sorting is a prerequisite for searching, and it has a lot of practical applications, such as creating matchups between teams. For example, teams can use a round-robin format where each team plays every other team, but that results in a lot of matches (O(N^2)). Instead, competitions such as March Madness use a Merge-Sort-style bracket to move from 64 teams to 32, 16, 8, 4 and the finals. However, it is not a full sort, as there are only 63 games in the tournament instead of 192.

4. Caching

In computer design, John von Neumann proposed a memory hierarchy to improve lookup performance. It was first used in IBM 360 mainframes. Other computer researchers such as Belady studied page-replacement algorithms, which decide what to evict from memory when a page fault requires loading data from disk. There are several algorithms for cache eviction, such as First-In, First-Out (FIFO) and Least-Recently-Used (LRU).
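
As an illustration of Least-Recently-Used eviction, here is a small sketch built on Python's `collections.OrderedDict`. The class name and the `get`/`put` interface are assumptions chosen for this example.

```python
from collections import OrderedDict

class LRUCache:
    """A fixed-size cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        if key in self.items:
            self.items.move_to_end(key)
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict the least recently used
```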

5. Scheduling

Here are a few of the scheduling algorithms described in this chapter:
Earliest Due Date Strategy
It minimizes the maximum lateness by always choosing the task with the earliest due date first.

Moore’s algorithm
It is similar to Earliest Due Date, but whenever the next task can't be completed by its due date, it throws out the biggest task scheduled so far. This minimizes the number of late tasks.

The authors give the example of the Getting Things Done (GTD) technique for time management, where small tasks are handled first. Tasks can also have a weight or priority; the scheduler then minimizes the sum of weighted completion times by dividing each task's weight by its length and selecting the task with the highest density first.
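
The three strategies above can be sketched in a few lines. The tuple layouts and function names are assumptions for this example, and the tasks are assumed to be available from the start with no dependencies between them.

```python
import heapq

def earliest_due_date(tasks):
    """tasks: (name, duration, due). Order by due date, earliest first."""
    return sorted(tasks, key=lambda task: task[2])

def moores_algorithm(tasks):
    """tasks: (name, duration, due). Return (on_time_names, late_names)."""
    scheduled = []  # max-heap on duration, stored as negated values
    late = []
    elapsed = 0
    for name, duration, due in earliest_due_date(tasks):
        heapq.heappush(scheduled, (-duration, name, due))
        elapsed += duration
        if elapsed > due:
            # Deadline missed: throw out the biggest task scheduled so far.
            neg_duration, dropped, _ = heapq.heappop(scheduled)
            elapsed += neg_duration
            late.append(dropped)
    on_time = sorted(scheduled, key=lambda entry: entry[2])
    return [name for _, name, _ in on_time], late

def by_density(tasks):
    """tasks: (name, duration, weight). Highest weight/duration first."""
    return sorted(tasks, key=lambda task: task[2] / task[1], reverse=True)
```

The late tasks returned by `moores_algorithm` can be done in any order after the on-time ones, since they miss their deadlines either way.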

Here are a few issues that can arise with priority based tasks:

Priority Inversion – occurs when a low-priority task holds a resource that a higher-priority task needs, so the scheduler runs the higher-priority task but it cannot make any progress. One way to address this issue is to let the low-priority task inherit the priority of the higher-priority task so that it can complete and release the resource.
Thrashing – occurs when the system grinds to a halt because it spends its time switching between tasks instead of completing them, usually due to a lack of resources.
Context switching – Modern operating systems use context switching to work on multiple tasks, but each slice of time needs to be big enough for the task to make progress. One technique to minimize context switching is interrupt coalescing, which delays hardware interrupts so that several can be handled together. Similar techniques can be used by batching small tasks, e.g. the Getting Things Done technique encourages setting aside a chunk of time to handle similar tasks such as checking emails, making phone calls, etc.

6. Bayes’s Rule

Reverend Thomas Bayes postulated Bayes's Rule by reasoning backward from the winning and losing tickets drawn to the makeup of the overall ticket pool. Pierre-Simon Laplace later worked the problem out in full, and his result is commonly referred to as Laplace's Law. Laplace also worked out how to use Bayes's Rule to bring prior knowledge into prediction problems.
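
Laplace's Law is usually stated as: count the wins plus one, and divide by the attempts plus two. A minimal sketch, with a function name chosen for this example:

```python
def laplace_estimate(wins, attempts):
    """Estimate the chance of a win on the next attempt (Laplace's Law)."""
    return (wins + 1) / (attempts + 2)
```

With no data at all the estimate is 50%, and after three wins in three attempts it is 4/5 rather than a certain 100%.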

Copernican Principle
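Richard Gott hypothesized that the moment you observe something is unlikely to be special, so it is likely to be in the middle of its lifetime. The best guess is therefore that it will last about as long again as it has lasted so far.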

Normal or Gaussian distribution
It has a bell-shaped curve and describes quantities that cluster around an average, so it can be used to predict things like life spans.

Power-law distribution
It describes quantities that range over many scales, such as the population of cities or the income of people.

Multiplicative Rule
It multiplies the quantity observed so far by some constant factor. It is the appropriate prediction rule for power-law distributions.

Average Rule
It uses the distribution's natural average as the prediction. It is the appropriate prediction rule for normal distributions.

Additive Rule
It predicts that things will go on just a constant amount longer, such as the “five more minutes” rule.
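
The three prediction rules can be written as one-line functions. The function names and default values are assumptions for this sketch; a factor of 2 corresponds to the Copernican Principle, and falling back to the observed value once it exceeds the average is a simplification made here.

```python
def multiplicative_rule(observed_so_far, factor=2.0):
    """Predict the total as a constant multiple of what has been observed."""
    return observed_so_far * factor

def average_rule(observed_so_far, natural_average):
    """Predict the natural average of the distribution.

    Assumption: once the observation already exceeds the average, this
    sketch just returns the observation as a lower bound.
    """
    return max(observed_so_far, natural_average)

def additive_rule(observed_so_far, constant):
    """Predict that things go on a constant amount longer."""
    return observed_so_far + constant
```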

7. Overfitting

In machine learning, overfitting occurs when a model fits the training data so tightly, often by using too many factors, that it doesn't accurately predict the outcome for data that it has not observed.

Cross Validation
Overfitting can be detected with cross-validation, which assesses a model not just against the training data but also against unseen data.

Regularization
It uses constraints to penalize complexity.

Lasso
It uses a penalty on the total weight of the different factors in the model to minimize complexity, which pushes the weights of unimportant factors toward zero.
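
As a sketch of what Lasso minimizes, the function below adds a penalty on the total absolute weight to the usual squared error of a linear model. The function name, the argument layout, and the choice of squared error are assumptions for this example.

```python
def lasso_objective(weights, rows, targets, penalty):
    """Squared prediction error plus a penalty on the total absolute weight."""
    squared_error = 0.0
    for features, target in zip(rows, targets):
        prediction = sum(w * x for w, x in zip(weights, features))
        squared_error += (prediction - target) ** 2
    complexity = sum(abs(w) for w in weights)
    return squared_error + penalty * complexity
```

A larger `penalty` makes every nonzero weight more expensive, so a factor only stays in the model if it reduces the error enough to pay for itself.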

8. Relaxation

In constraint optimization problems, you need to find the best arrangement of a set of variables given a set of rules and a scoring mechanism, such as the traveling salesman problem (O(N!)). Using constraint relaxation, you remove some of the problem's constraints, e.g. instead of a tour you can create a minimum spanning tree that connects all nodes in O(N^2) time, which gives a lower bound on the length of the best tour. Techniques such as Lagrangian Relaxation remove some of the constraints and add them to the scoring system instead.
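
The sketch below compares a brute-force traveling salesman solution with the relaxed minimum spanning tree version. The function names are made up for this example, points are assumed to be (x, y) tuples, and Prim's algorithm is one possible O(N^2) way to build the tree.

```python
import itertools
import math

def shortest_tour_length(points):
    """Brute-force traveling salesman: tries every ordering, O(N!)."""
    first, rest = points[0], points[1:]
    best = math.inf
    for order in itertools.permutations(rest):
        route = [first, *order, first]
        length = sum(math.dist(a, b) for a, b in zip(route, route[1:]))
        best = min(best, length)
    return best

def spanning_tree_length(points):
    """Relaxed problem: minimum spanning tree via Prim's algorithm, O(N^2)."""
    n = len(points)
    in_tree = [False] * n
    cost = [math.inf] * n  # cheapest known edge from the tree to each point
    cost[0] = 0.0
    total = 0.0
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: cost[i])
        in_tree[u] = True
        total += cost[u]
        for v in range(n):
            if not in_tree[v]:
                cost[v] = min(cost[v], math.dist(points[u], points[v]))
    return total
```

Because removing one edge from any tour leaves a spanning tree, `spanning_tree_length` can never exceed `shortest_tour_length` for the same points.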

9. Randomness

This chapter describes examples of algorithms that are based on random numbers such as:

Monte Carlo Method
It uses random samples to estimate answers to problems that are unmanageable to work out exactly.

Hill Climbing
It takes a solution and tries to improve it by tweaking some of the factors. It only accepts a change if it results in an improvement. However, it may not find the globally optimal solution, because it can get stuck on a local maximum.

Jitter
It makes a few small random changes and accepts them even if they don't improve the solution, in order to get off a local maximum and find a better solution.

Metropolis algorithm
It uses the Monte Carlo Method and, while trying different solutions, sometimes accepts bad tweaks as well as good ones.

Simulated Annealing
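It treats optimization like annealing in metallurgy, heating up and then slowly cooling off: it starts out accepting many bad tweaks and accepts fewer and fewer of them as the “temperature” drops.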
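
A minimal sketch of simulated annealing follows. The linear cooling schedule, the parameter defaults, and the example landscape `bumpy` are all assumptions for illustration; real applications tune these carefully.

```python
import math
import random

def simulated_annealing(score, neighbor, start, steps=10_000, start_temp=10.0, seed=0):
    """Maximize score(x), accepting some bad tweaks while the temperature is high."""
    rng = random.Random(seed)
    current = best = start
    for step in range(steps):
        # Cool off linearly from start_temp toward (almost) zero.
        temp = start_temp * (1 - step / steps) + 1e-9
        candidate = neighbor(current, rng)
        delta = score(candidate) - score(current)
        # Always accept improvements; accept a worse candidate with a
        # probability that shrinks as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current = candidate
        if score(current) > score(best):
            best = current
    return best

def bumpy(x):
    """Hypothetical landscape: many local peaks around a broad hill near x = 70."""
    return -((x - 70) ** 2) / 100 + 5 * math.sin(x)

def step_left_or_right(x, rng):
    """Tweak: move one step left or right, staying within 0..100."""
    return min(100, max(0, x + rng.choice((-1, 1))))
```

Calling `simulated_annealing(bumpy, step_left_or_right, start=5)` returns the best position visited. Pure hill climbing from the same start would stop at the first local peak it reaches.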

10. Networking

This chapter describes algorithms used in computer networks such as:

Packet switching
One of the key ideas of the Internet was to use packet switching, where TCP/IP sends data packets over a number of connections, as opposed to the dedicated lines or circuit switching used by phone companies.

Acknowledgement
It is used to let the sender know that a packet has been received. TCP uses a three-way handshake to establish a connection, and the sender resends packets if an ACK is not received.

Exponential Backoff
It increases the average delay before retrying after each successive failure, typically by doubling the maximum wait each time.
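
One common way to implement Exponential Backoff is to double a waiting window after every failure and pick a random delay inside it. The function names, the base of one second, and the 60-second cap are assumptions for this sketch.

```python
import random

def backoff_window(failures, base=1.0, cap=60.0):
    """Maximum wait in seconds after a number of consecutive failures."""
    return min(cap, base * 2 ** failures)

def backoff_delay(failures, rng=random):
    """Pick a random wait inside the current window before retrying."""
    return rng.uniform(0, backoff_window(failures))
```

Randomizing the delay keeps two senders that failed at the same moment from retrying at the same moment again.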

Flow Control
TCP/IP uses Additive Increase, Multiplicative Decrease (AIMD): it increases the number of packets sent by a constant amount while ACKs keep arriving, and cuts the transmission rate in half when an ACK is not received.
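
The sawtooth behavior of Additive Increase, Multiplicative Decrease can be sketched as below. This is a toy model with assumed parameter names, one update per round, and a minimum window of one packet; it is not how a real TCP stack is written.

```python
def aimd(acks, window=1.0, increase=1.0, decrease=0.5):
    """Return the window size after each round, given ACK outcomes.

    acks is a sequence of booleans: True if the round's ACK arrived.
    """
    history = []
    for ack_received in acks:
        if ack_received:
            window += increase  # additive increase
        else:
            window = max(1.0, window * decrease)  # multiplicative decrease
        history.append(window)
    return history
```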

Bufferbloat
A buffer is a queue that stores outgoing packets, but when the queue grows long, it can delay ACKs, which leads the sender to resend packets. Explicit Congestion Notification can be used to address this issue.

11. Game Theory

In this chapter, the authors discuss several problems from game theory such as:

Halting problem
This problem was first posed by Alan Turing, who proved that no computer program can determine, for every possible program, whether that program will eventually finish or take forever to compute something.

Prisoner’s dilemma
It is based on two prisoners who are caught and have to either cooperate with each other or defect. Defection is the dominant strategy for each prisoner, even though both would be better off if they cooperated.

Nash Equilibrium
It is a set of strategies where neither player would want to change their own play, given the opponent's strategy.
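
To tie the last two ideas together, the sketch below checks every pair of moves in a prisoner's dilemma for a Nash Equilibrium. The prison terms in `YEARS` are hypothetical numbers picked for this example, not figures from the book.

```python
MOVES = ("cooperate", "defect")

# Hypothetical sentences in years (lower is better) for (prisoner A, prisoner B).
YEARS = {
    ("cooperate", "cooperate"): (1, 1),
    ("cooperate", "defect"): (10, 0),
    ("defect", "cooperate"): (0, 10),
    ("defect", "defect"): (5, 5),
}

def is_nash_equilibrium(a_move, b_move):
    """True if neither prisoner can shorten their own sentence by switching."""
    a_years, b_years = YEARS[(a_move, b_move)]
    a_cannot_improve = all(YEARS[(alt, b_move)][0] >= a_years for alt in MOVES)
    b_cannot_improve = all(YEARS[(a_move, alt)][1] >= b_years for alt in MOVES)
    return a_cannot_improve and b_cannot_improve

def nash_equilibria():
    """List every pair of moves that is a Nash Equilibrium."""
    return [(a, b) for a in MOVES for b in MOVES if is_nash_equilibrium(a, b)]
```

With these payoffs the only equilibrium is mutual defection, even though mutual cooperation would give both prisoners a shorter sentence.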

The Tragedy of the Commons
It involves a shared-resource system where individuals acting independently in their own self-interest behave contrary to the common good of all participants, e.g. voluntary environmental rules under which companies are not required to stay within emission limits.

Information cascade
An information cascade occurs when an individual abandons their own information in favor of other people's actions. One place where this class of problems shows up is auction systems. Here are a few variations of auction systems:

Sealed-bid – where bidders are unaware of the other bids, so they have to predict the prices that other bidders will offer.
Dutch or descending auction – where the price starts high and is slowly lowered until someone accepts it.
English or ascending auction – where the bidding starts at a low price and is then increased until only one bidder remains.
Vickrey auction – it is similar to sealed-bid, but the winner pays the second-highest bid. It results in better valuations, as bidders are incentivized to bid the true value.
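
The Vickrey rule is short enough to sketch directly. The function name and the bidder-to-amount dictionary are assumptions for this example, and ties are broken arbitrarily by sort order.

```python
def vickrey_auction(bids):
    """bids maps bidder -> sealed bid. Return (winner, price_paid)."""
    if len(bids) < 2:
        raise ValueError("a Vickrey auction needs at least two bids")
    ranked = sorted(bids.items(), key=lambda item: item[1], reverse=True)
    winner = ranked[0][0]
    price = ranked[1][1]  # the winner pays the second-highest bid
    return winner, price
```

Because the price does not depend on the winner's own bid, shading a bid below the true value can only lose the auction, never lower the price paid.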

Summary

This book presents algorithms from several domains and encourages computational kindness by applying these algorithms in real life. For example, we can add constraints or reduce the number of available options when asking someone to make a decision, which lowers the mental labor.

Shahzad Bhatti Welcome to my ramblings and rants!