Project Euler: Problem 12

It’s time once again for a favorite blog theme, The Project Euler post. This time around I am answering problem twelve. The website states the problem as:
The sequence of triangle numbers is generated by adding the natural numbers. So the 7th triangle number would be 1 + 2 + 3 + 4 + 5 + 6 + 7 = 28. The first ten terms would be:

1, 3, 6, 10, 15, 21, 28, 36, 45, 55, ...

Let us list the factors of the first seven triangle numbers:

1: 1
3: 1,3
6: 1,2,3,6
10: 1,2,5,10
15: 1,3,5,15
21: 1,3,7,21
28: 1,2,4,7,14,28
We can see that 28 is the first triangle number to have over five divisors.

What is the value of the first triangle number to have over five hundred divisors?

o spice things up I decided to use a language I haven't used for these in a while, Perl. I also included the usual suspects: Python and Haskell. So, here's the Perl code:

  1. #!/usr/bin/perl
  2.  
  3. use strict;
  4. use warnings;
  5.  
  6. my $index = 7;
  7. my $total = 28;
  8. my $divisors = 0;
  9.  
  10. sub divisors
  11. {
  12. my ($number) = @_;
  13. my $sq_n = sqrt($number);
  14. my $i = 1;
  15. my $t = 0;
  16.  
  17. while ($i <= $sq_n)
  18. {
  19. $t += 2 unless ($number % $i);
  20.  
  21. $i += 1;
  22. }
  23.  
  24. return $t;
  25. }
  26.  
  27. while( divisors($total) <= 500)
  28. {
  29. $index += 1;
  30. $total += $index;
  31. }
  32.  
  33. print "$total\n";

Nothing really new or interesting to mention in this code. Here is the Python code:
  1. #!/usr/bin/python
  2.  
  3. """
  4. solution for problem 12 in python.
  5. """
  6. import math
  7.  
  8. def get_divisors(number):
  9. tlist = []
  10. for x in xrange(2, int(math.sqrt(number))):
  11. d,r = divmod(number,x)
  12. if r == 0:
  13. tlist.append(x)
  14. tlist.append(d)
  15.  
  16. return len([1, number] + tlist)
  17.  
  18. def triangle_nums():
  19. iterator = 7
  20. num = 28
  21.  
  22. while True:
  23. yield num
  24. iterator += 1
  25. num += iterator
  26.  
  27. if __name__ == "__main__":
  28. tn = triangle_nums()
  29. for t in tn:
  30. tl = get_divisors(t)
  31. if tl > 500:
  32. print "num: %d\ncount: %d" % (t,tl)
  33. break

Pretty standard stuff for the most part. I think the only non-standard thing worth mentioning is the infinite triangle number generator. This took a little finangling, but I got it to work in the end.

Here is the Haskell code:

  1. module Main where
  2.  
  3. get_div_len number = foldl1 (+) [2 | x <- [1..x], number `mod` x == 0]
  4. where x = round . sqrt $ fromInteger number
  5.  
  6. main :: IO()
  7. main = do
  8. print . head $ dropWhile (\x -> fst x <= 499) (map (\x -> (get_div_len x ,x)) xs)
  9. where xs = map (\y -> sum [1..y]) [7..]

After creating these solutions, I did my usual, highly accurate, testing method to determine the speed of the computation. I was surprised by my results:

Perl: 12.462s
Python: 17.783s
Haskell (compiled): 13.877s

Normally the Haskell solution would be significantly faster, and I have a theory as to why the Haskell times are so close. In Perl and Python I’m doing two additions – one for the increase of the index number and another to increase the total number. In Haskell I’m doing 1 + n additions; the first is to increase the index, and the remaining additions (n) are those used to calculate the sum of all the numbers between (and including) 1 and the index. As the index variable gets larger, that calculation takes more and more time to perform. I would write this up as suboptimal. After spending some time traveling “the tubes,” I discovered the State Monad, which is the reason why this blog post took me so long. I had to spend a week going through random blogs, skimming books, and beating my head against a wall (more than usual) to figure this out.

Quick diversion, for those of you who do not know what the State Monad is, let me take a moment to to try and explain what it is and why it’s important in this context. Those of us that come from an imperative language (I am one of you in this regard) are used to being able to do a simple addition such as (in pseudo code):

Variable = 2
Variable = variable + 3 or Variable += 3

We can’t do this in Haskell; instead we have to create a new variable name for each new variable assignment. We could also create a function that recursively goes forward, generating the next number in the sequence and bringing our needed variables with us before going deeper down the recursion rabbit hole. With the State Monad, however, we can write our function in such a way that the necessary variables are implicitly passed. Take a look at the new solution to see what I mean:

  1. {-# LANGUAGE BangPatterns, UnboxedTuples #-}
  2. module Main where
  3.  
  4. import Control.Monad
  5. import Control.Monad.State
  6.  
  7. type MyState = (Int, Int)
  8. s0 = (7, 28)
  9.  
  10. tick = do
  11. (n,o) <- get
  12. let divs = getDivLen (n,o)
  13. if divs <= 500
  14. then do
  15. let n' = n + 1
  16. let o' = o + n'
  17. put (n', o')
  18. tick
  19. else
  20.  
  21. getDivLen :: MyState -> Int
  22. getDivLen (!n, !o) = foldl1 (+) [2 | x <- [1..x], o `mod` x == 0]
  23. where x = round . sqrt $ fromIntegral o
  24.  
  25. main :: IO ()
  26. main = print $ evalState tick s0

The tick function does not have any input parameters. All the information that the function needs comes from the “get” function call, which grabs the current state from the State Monad. If the tick function does not find a number of divisors greater than five hundred, it inserts new values back into the State Monad, and goes down to the next level of recursion.

It took me a long time to figure this out, mostly because of the lack of examples on the internet concerning the State Monad. If I wanted to create a random number generator I would have been set, but sadly I just wanted to create something that would hold a tuple of numbers and increment them accordingly. So I highly modified one of the “random number generator” examples.

My “highly accurate” speed test results for the new version is:
Haskell (compiled): 2.664s

which is a vast improvement (> 11s) over the previous implementation.

While a Project Euler problem may not have been the best way to learn about using the State Monad, I'm glad I stumbled upon it. I hope that it can be used as an example for others if they want to learn how to use the this particular monad to create things other than pseudo-random number generators.

One last thing – some of the brighter crayons in the box (which is most of you, based on the level of comments that I receive) might have noticed that I skipped problem 11. There is a simple response to that. I still haven’t solved it.

Scanl and Int

So basically there are two main issues with your original code. First, as others pointed out, is how you generate the triangles numbers - your method recalculates the prefix sums anew for each new number, instead of adding to the running total, which is indeed what scanl does. The reason I'm writing this is to provide some timings, so, on my computer using the scanl takes down the execution time from 15.71s to 9.26s.

Another thing is, we can really use Ints instead of Integers - and that is the main reason for your second version's fast execution too. Just by specifying the type, with

  1. xs :: [Int]
  2. xs = scanl (+) 28 [8..]

the run-time becomes 1.58s! (compiled with -O2 flag of course!).

Hi WillNess, Thank you for

Hi WillNess,

Thank you for your comment and teaching me how I could improve the performance of my program. I'm still intrigued and a little ignorant to a lot of those really handy built in functions like scanl. I will admit that I'm still unsure of what patterns to look for that tell me to use that instead of writing a recursive function by hand. Do you have any wisdom that you could impart regarding that?

List comprehensions and upper bound

In all fairness, you could have used list comprehensions in Python too

  1. [x for x in xrange(...) if number % x == 0]

which is both more pythonic and about twice faster, but you also should, as in Haskell, only increase a counter and not build the list:
  1. for x in xrange(...):
  2. if number % x == 0:
  3. count += 2

Finally, since the number of divisors of the triangle number of index i is bounded by the product of the number of divisors of i by that of i+1, don't compute it if not necessary (should get you at least one order of magnitude faster).

Soli, I'm generally under the

Soli,

I'm generally under the impression that anything you do with a list comprehension; you should use a generator instead. Unless, you need to a list. ;) You are right though, that I should have just increased the counter and not made a list.

another solution

I was surprised that your first solution in Haskell took so long as mine takes under a fifth of a second compiled and does not use monads. (I can do problem 1 of Project Euler but I don't yet grok monads :) It is a little longer but over 10 times faster than even your fast version. I guess this is because I use the fact that the sum of the first n integers is n(n + 1) /2 and that the number of factors of that product can be decomposed into the product of the number of factors of n and (n + 1)/2. I think the first is the key mathematical insight that leads to a faster problem.

  1. prob12 :: Int -> Integer
  2. prob12 nd =
  3. let n = snd $ head $ filter ((> nd) . fst) $ zip (map numFactors [1..]) [1..]
  4. in (n * (n + 1)) `div` 2
  5. where
  6. numFactors :: Integer -> Int
  7. numFactors n
  8. -- this is not true in general but is true for n and n + 1 as they cannot have
  9. -- any common factors other than 1
  10. | odd n = (length $ factors n) * (length $ factors ((n + 1) `div` 2))
  11. | otherwise = (length $ factors (n `div` 2)) * (length $ factors (n + 1))
  12.  
  13.  
  14. factors :: Integer -> [Integer]
  15. factors n =
  16. if isqrt == head lfs
  17. then sfs ++ tail lfs
  18. else sfs ++ lfs
  19. where
  20. isqrt = floor $ sqrt $ fromInteger n
  21. sfs = [x | x <- [ 1 .. isqrt], n `rem` x == 0]
  22. lfs = map (div n) sfs
  23.  
  24. > prob12 500
  25. 76576500
  26. it :: Integer
  27. (0.19 secs, 316297808 bytes)

math insight , Project Euler and good performance

The key point buried in poorly written first reply is that for Project Euler the idea is to use mathematical insight to get good performance. In this case that mathematical insight is that that the sum of the first n integers is n(n + 1)/2. fwiw I can get to almost a tenth of a second as we only need the number of factors and not the actual factors and we can limit the arguments and return values to Ints:

  1. prob12 :: Int -> Int
  2. prob12 nd =
  3. let n = snd $ head $ filter ((> nd) . fst) $ zip (map nfs [1..]) [1..]
  4. in (n * (n + 1)) `quot` 2
  5. where
  6. nfs n
  7. -- this is not true in general but is true for n and n + 1 as they cannot have
  8. -- any common factors other than 1
  9. | odd n = numFactors n * numFactors ((n + 1) `quot` 2)
  10. | otherwise = numFactors (n `quot` 2) * numFactors (n + 1)
  11.  
  12. numFactors :: Int -> Int
  13. numFactors n
  14. | (n `rem` isqrt == 0) = nfs + 1
  15. | otherwise = nfs
  16. where
  17. isqrt = floor $ sqrt $ fromIntegral n
  18. nfs = 2 * length [x | x <- [ 1 .. (isqrt - 1)], n `rem` x == 0]
  19.  
  20. > prob12 500
  21. 76576500
  22. it :: Int
  23. (0.12 secs, 250272392 bytes)

George, Thank you for taking

George,

Thank you for taking another stab at explaining what you meant the first time. I and anyone else who reads this will appriciate it. Those are some impressive run times as well, thank you for sharing your code.

source of performance

Thanks for your generous , positive comments. (I wish I could delete my first comment). Your response inspired me to followup as to the source of the performance. It turns out that the second math insight is more important than I thought as it brings the time down from 2.5 to 0.1 seconds. That math insight is: if a and h have no common factors than number of factors of (a * b) = number of Factors of a * number of factors of b. Following is the last (I promise) version with slightly better comments.

  1. -- key math insights to get good performance:
  2. -- 1. sum of first n numbers is (n * (n+1)) / 2
  3. -- 2. if gcd (a, b) = 1 then number of factors of (a * b) = number of Factors of a * number of factors of b
  4. prob12 :: Int -> Int
  5. prob12 nd =
  6. let n = snd $ head $ filter ((> nd) . fst) $ map numOfFactors [1..]
  7. in (n * (n + 1)) `quot` 2
  8. where
  9. numOfFactors n =
  10. -- gcd (n,n+1) = 1 thus gcd of the odd one and half the even one is 1.
  11. -- following gives a speedup of a factor of 25 over naive numFactors $ (n * (n + 1)) `quot` 2
  12. let num
  13. | odd n = numFactors n * numFactors ((n + 1) `quot` 2)
  14. | otherwise = numFactors (n `quot` 2) * numFactors (n + 1)
  15. in (num, n)
  16.  
  17. numFactors :: Int -> Int
  18. numFactors n
  19. | (n `rem` isqrt == 0) = nfs + 1
  20. | otherwise = nfs
  21. where
  22. isqrt = floor $ sqrt $ fromIntegral n
  23. nfs = 2 * length [x | x <- [ 1 .. (isqrt - 1)], n `rem` x == 0]
  24.  
  25. prob12 500
  26. 76576500
  27. it :: Int
  28. (0.13 secs, 249290832 bytes)

This is a simple two-liner in

This is a simple two-liner in perl6 ... but it's also painfully slow in the current implementations.

  1. sub divisors($n) { (1..$n).grep($n %% *); }
  2.  
  3. say ([\+] 1..*).first({ divisors($_) > 500 });

Colomon, this is perl6 code

Colomon,

this is perl6 code isn't it?

I promise that next time I

I promise that next time I will pay more attention to the comments before I comment back!

Another way to generate triangles numbers

triangles = scanl (+) 1 [2..]

scanl is basically a foldl that keeps tracks of the intermediate results, so exactly what you want to calculate the triangles : you add all the naturals in order and keep track of all the cumulative sums.

With this you get all the speed of your second solution while keeping your code elegantly State-free (which is accounted a good thing in most case in Haskell).

Thank you for that Jedai, I

Thank you for that Jedai, I was not aware of scanl. I will look into that deeper.

Triangles

You can generate the triangle numbers like this in Haskell:

triangles = 1:zipWith (+) triangles [2..]

Hi Thorsden, Looks like you

Hi Thorsden,

Looks like you created a recursive based version of Jedai's scanl based triangle number generator. Thank you for contributing it, it's really nice to have options and to see all the different ways of how to compute the same thing. It's really kind of eye opening and awe inspiring.

fibonacci

Hi Bryce,

I got the idea from a similar definition of the Fibonacci sequence which had me very impressed when I encountered it (from: The Haskell School of Expression, Paul Hudak, Cambridge University Press):

fib = 1:1:zipWith (+) fib (tail fib)

scanl is even nicer for triangles, of course, although it wouldn't work for the Fibonacci case, I think. At least not directly, because calculating the Fibonacci sequence requires keeping the last two results.

a simpler approach

Good for you for learning the State monad, but it wasn't really necessary in this case -- you could have instead done something like:

  1. xs :: [Int]
  2. xs = map snd (xs' [(7,28)])
  3. where
  4. xs' :: [(Int,Int)] -> [(Int,Int)]
  5. xs' ((n,s) : _) = (n,s) : (xs' [(n+1, s+n+1)])

Steven, Thanks for the praise

Steven,

Thanks for the praise for learing how to use the State Monad, it definetly was an adventure. Also, you condensed down all my "hard" work with the state monad quite nicely. I will have to remember this if I ever need to do something like this again. Thank you for sharing!

Note that one of the reasons

Note that one of the reasons your State implementation is faster is because you're using `Int` instead of `Integer`. If you remove `type MyState = (Int, Int)` and `getDivLen :: MyState -> Int`, performance drops. Likewise, if you force `Int` in your initial implementation, performance improves.

Thank you Anon for pointing

Thank you Anon for pointing that out. When I was tweaking things, I did change it from 'Int' to 'Integer' and nearly doubled my run time (approx 4.5 seconds). Sadly with everything else I wanted to express in my write up I forgot to include that.

Your get_divisors() is wrong:

Your get_divisors() is wrong: it doesn't give the correct answer for perfect squares.

If you made it this far down into the article, hopefully you liked it enough to share it with your friends. Thanks if you do, I appreciate it.

Bookmark and Share