Saturday, June 15, 2013

Project Euler - 14 - Longest Collatz sequence - memoization

This post is about the Project Euler - 14 - Longest Collatz sequence problem. This is one of the perfect examples of memoization technique. This is the actual problem.

The following iterative sequence is defined for the set of positive integers:

n → n/2 (n is even)
n → 3n + 1 (n is odd)

Using the rule above and starting with 13, we generate the following sequence:
13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms. Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.

Which starting number, under one million, produces the longest chain?

NOTE: Once the chain starts the terms are allowed to go above one million.

Simple Solution


The problem statement looks pretty simple and a brute force will work just fine. This code runs in 20 seconds.

def Series(N):
    Counter = 1
    while N != 1:
        N = N/2 if N % 2 == 0 else 3*N+1
        Counter += 1
    return Counter

Max, Element = 0, 0
for x in xrange (1, 1000000):
    series = Series(x)
    if (Max < series):
        Max = series
        Element = x

print Max, Element



As I had to wait for 20 seconds to get the solution, I was thinking about improving the performance. Then I found this. From the problem statement, 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1, starting with any number, if it becomes 13 at any point of time, it has to take the same route and if it becomes 13 it will take 10 steps to become 1. So I wanted to store this information first time itself, so that I don't have to recalculate. This is called Memoization. Quoting from wikipedia

In computing, memoization is an optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs

Optimized (Memoized) Solution


To optimize it, I just had to change the Series function alone. I stored the computed values in a dictionary and if the current number in the sequence happens to be in the dictionary, I ll just return the value corresponding to it. This program responded with the result in 2 seconds. That is 10 times faster than the first version.

Dict = {}

def Series(N):
    Counter = 1   #It is 1, because we are counting the initial N
    Original = N
    while N != 1:
        N = N/2 if N % 2 == 0 else 3*N+1
        if N in Dict:                    # Check if the length has been computed for this number already
            Counter += Dict[N]           # Add it to the current length counter
            Dict [Original] = Counter    # Store the computed value for future look back
            return Counter
        Counter += 1
    Dict [Original] = Counter            # Store the computed value for future look back
    return Counter





No comments: