Python

Project Euler - Problem 17

It's been too long since I posted a solution to one of these challenges. How time flies when you're having fun.

Here's the problem:
If the numbers 1 to 5 are written out in words: one, two, three, four, five, then there are 3 + 3 + 5 + 4 + 4 = 19 letters used in total.

If all the numbers from 1 to 1000 (one thousand) inclusive were written out in words, how many letters would be used?

Here is the Python code:

    #!/usr/bin/env python
    # Python 2 script (xrange, print statement).

    ones = {'1': 'one', '2': 'two', '3': 'three', '4': 'four', '5': 'five',
            '6': 'six', '7': 'seven', '8': 'eight', '9': 'nine', '0': ''}

    tens = {'2': 'twenty', '3': 'thirty', '4': 'forty', '5': 'fifty',
            '6': 'sixty', '7': 'seventy', '8': 'eighty', '9': 'ninety'}

    teens = {'10': 'ten', '11': 'eleven', '12': 'twelve', '13': 'thirteen',
             '14': 'fourteen', '15': 'fifteen', '16': 'sixteen', '17': 'seventeen',
             '18': 'eighteen', '19': 'nineteen'}

    # NOTE: hundreds[0] is the integer 0, which format() renders as "0".
    # That adds one character to each of the 99 numbers below 100. The loop
    # never visits the exact hundreds (y starts at 1), and "onehundred"
    # through "ninehundred" also total 99 letters, so the two cancel out.
    hundreds = {0: 0, 1: "onehundredand", 2: "twohundredand",
                3: "threehundredand", 4: "fourhundredand",
                5: "fivehundredand", 6: "sixhundredand",
                7: "sevenhundredand", 8: "eighthundredand",
                9: "ninehundredand"}

    if __name__ == "__main__":
        tot = 0
        for h in xrange(10):
            for y in xrange(1, 100):
                try:
                    # Two-digit numbers unpack cleanly; one-digit numbers
                    # raise ValueError and are handled below.
                    t, o = tuple(str(y))
                    if t == '1':
                        tot += len("{h}{t}".format(h=hundreds[h], t=teens[t + o]))
                    else:
                        tot += len("{h}{t}{o}".format(h=hundreds[h], t=tens[t],
                                                      o=ones[o]))
                except ValueError:
                    tot += len("{h}{o}".format(h=hundreds[h], o=ones[str(y)]))
        tot += len('onethousand')
        print tot

Even though I wrote it, I still look at it and think "that's not mine." It's been a long time since I wrote a for loop within a for loop. There isn't anything wrong with it, it's just not my style. This time however I wasn't really able to come up with a solution that would allow me to break out of the two for loops.

The one part of the code that I was surprised "worked" was breaking up the digits by turning the number into a string and then a tuple. This let me lean on an exception: unpacking a one-digit number into two names raises a ValueError. That exception is only thrown for the single-digit values, roughly 10% of the time. While exceptions might be expensive, the other 90% of the time the code hums along without an extra length check. Everything has costs, but I think that the cost of throwing an exception 10% of the time as opposed to testing a conditional 100% of the time is a cost I'm willing to accept.
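For comparison, here is a minimal Python 3 sketch of the same count that splits the digits arithmetically with divmod instead of unpacking a string. The names SMALL, TENS, spell and letter_count are my own illustrations, not part of the original script, and the sketch spells out the exact hundreds explicitly.

```python
# Python 3 sketch: build each number's spelling, then count the letters.
SMALL = ["", "one", "two", "three", "four", "five", "six", "seven", "eight",
         "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def spell(n):
    """Spell 1 <= n <= 1000 in British usage, without spaces or hyphens."""
    if n == 1000:
        return "onethousand"
    hundreds, rest = divmod(n, 100)
    words = ""
    if hundreds:
        words += SMALL[hundreds] + "hundred"
        if rest:
            # "and" only appears when something follows the hundreds.
            words += "and"
    if rest < 20:
        words += SMALL[rest]
    else:
        tens, ones = divmod(rest, 10)
        words += TENS[tens] + SMALL[ones]
    return words


def letter_count(limit):
    """Total letters used when writing out 1..limit inclusive."""
    return sum(len(spell(n)) for n in range(1, limit + 1))


if __name__ == "__main__":
    print(letter_count(5))     # the worked example from the problem statement
    print(letter_count(1000))
```

This version trades the exception for a couple of comparisons per number, which makes the handling of 100, 200 and so on visible in the code.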

I will admit that I did not code up another solution in a different programming language. While part of that is due to being lazy - it's good for the soul once in a while - I'm also not sure how I can code this up in a functional language. I'm sure it can be done, I just don't know how. (If anyone has a link or idea, please share it.) But because I do like to compare things I tweaked the code to run within Python 3.3.0. The differences in time are so minimal that I'm not even going to post them. If you're really inspired you can read the Python 3 code here.

Questions and comments welcomed. One quick side note to my readers: I'm getting married this year (Yay!) and a lot of my free time is spent juggling and planning. So I might not be blogging as frequently as usual. Thanks for your patience.

First Flask web app

As everyone should know by now, I love coding challenges. A while ago I came across this one, which is rather long:
Using python with gevent 0.13.x and your choice of additional libraries and/or frameworks, implement a single HTTP server with API endpoints that provide the following functionalities:

      A Fibonacci endpoint that accepts a number and returns the Fibonacci calculation for that number, and returns result in JSON format. Example:

      $ curl -s 'http://127.0.0.1:8080/fib/13'
      {"response": 233}
      $ curl -s 'http://127.0.0.1:8080/fib/12'
      {"response": 144}

      An endpoint that fetches the Google homepage and returns the sha1 of the response message-body (HTTP body data). Example:

      $ curl -s 'http://127.0.0.1:8080/google-body'
      {"response": "272cca559ffe719d20ac90adb9fc4e5716479e96"}

      Using some external storage of your choice (can be redis, memcache, sqlite, mysql, etc), provide a means to store and then retrieve a value. Example:

      $ curl -d 'value=something' 'http://127.0.0.1:8080/store'
      $ curl 'http://127.0.0.1:8080/store'
      {"response": "something"}

At one point in my past a coworker talked about his love of the Flask micro framework. Since this is just a simple web API, I figured I'd give it a shot. This is a bit of a complicated task with many pieces, so let's set a game plan for the rest of this post. I'm going to share the specific functions related to each piece of functionality, then at the very end I will share the entire code base so everyone can see how it all fits together. One last note - to try the app out, you need to run the run.py script, which is also included below. Ready … BREAK!

Let's tackle the google-body endpoint first:

    @app.route('/google-body')
    def google_body():
        try:
            # Hash the raw body of Google's home page with SHA-1.
            sh = sha.new(urlopen('http://www.google.com').read())
            return jsonify({'response' : sh.hexdigest()})
        except Exception as e:
            return jsonify({'response' : 'ERROR: %s' % str(e)})

While this code block is very compact, it's not difficult to understand. Go out to the internet, get the source code for Google's home page and feed that into the sha object. Wrap the hexdigest result in a dictionary, throw that into the jsonify function, and send it on its way. Don't forget to package it all in a nice try/except block for safety.
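The sha module used above only exists in Python 2. As a rough sketch of the same hashing step on Python 3, hashlib.sha1 does the equivalent job. The helper name sha1_hex is my own, and a fixed byte string stands in for the downloaded page so the example runs without network access.

```python
import hashlib


def sha1_hex(body):
    """Return the SHA-1 hex digest of a bytes payload."""
    return hashlib.sha1(body).hexdigest()


if __name__ == "__main__":
    # A fixed payload stands in for the body returned by urlopen(...).read().
    print(sha1_hex(b"abc"))
```

Note that the digest depends on the exact bytes Google serves at that moment, so the value in the challenge's example will almost certainly differ from what you see.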

On to the Fibonacci API call:

    @app.route('/fib/<number>')
    def fib(number):
        try:
            return jsonify({'response' : real_fib(int(number))})
        except ValueError:
            return jsonify({'response' : 'ERROR: Input not a number'})

    @lru_cache()
    def real_fib(n):
        """
        This code was modified from the fib code in the python3 functools
        documentation.
        """
        if n < 2:
            return n
        return real_fib(n-1) + real_fib(n-2)

During PyCon US 2012, I became aware of the lru_cache decorator in Python 3 (it lives in functools and has been available since 3.2). I also learned that Raymond Hettinger wrote code that would allow it to work in Python 2, which allowed me to copy the code from the Python 3 documentation with little modification. Knowing about lru_cache allowed me to write a nice and concise Fibonacci function reminiscent of something I might write in Haskell or some other functional language.
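To see what the cache buys you, here is a small Python 3 sketch using functools.lru_cache directly rather than the backport. The function name cached_fib and the choice of maxsize=None are mine.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def cached_fib(n):
    """Naive recursive Fibonacci; the cache makes each n cost one call."""
    if n < 2:
        return n
    return cached_fib(n - 1) + cached_fib(n - 2)


if __name__ == "__main__":
    print(cached_fib(13))
    print(cached_fib(12))
    # cache_info() reports hits and misses, which shows the memoization working.
    print(cached_fib.cache_info())
```

Without the decorator the same recursion recomputes the smaller values over and over, so the running time grows exponentially with n.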

Now the last part of this challenge, and the longest. Before we get to the python, I feel that it might help your understanding if you know what kind of database schema we're working with. So I'm going to post that first, then the python code.

    drop table if exists entries;
    create table entries (
        id integer primary key autoincrement,
        value string not null
    );

    @app.route('/store', methods=['GET', 'POST'])
    def store():
        if request.method == 'POST':
            try:
                g.db.execute('insert into entries (value) values (?)',
                             [request.form['value']])
                g.db.commit()
                resp = jsonify()
                resp.status_code = 200
                return resp
            except Exception as e:
                return jsonify({"response" : "ERROR: %s" % str(e)})
        else:
            try:
                cur = g.db.execute('select value from entries order by id desc')
                # fetchone returns a single row as a tuple, or None when the
                # table is empty. To better meet the requirements, output
                # just the first column of that row.
                return jsonify({'response' : cur.fetchone()[0]})
            except TypeError:
                # Indexing None (no rows yet) raises TypeError, not IndexError.
                return jsonify({'response' : 'NOTHING IN THE DATABASE'})

This step is obviously a little more complex: the function has to handle both the GET and the POST HTTP methods while using an outside database to store and retrieve the information. I believe the code here is simple enough to follow, so I won't explain every line. For the POST method I had to do some juggling to get the desired result. In the challenge's example, the POST succeeds but prints no response. I approximated this by calling the jsonify function with no arguments to create a Flask response object carrying no data, and then setting the status code of that response.
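The database half of that endpoint can be tried on its own, without Flask. The following is a sketch under a few assumptions of mine: an in-memory SQLite database, helper names save_value and latest_value, and an explicit None check for the empty table (fetchone returns None when there is no row).

```python
import sqlite3

SCHEMA = """
create table entries (
    id integer primary key autoincrement,
    value text not null
);
"""


def save_value(db, value):
    """Insert one value using a parameterized query."""
    db.execute("insert into entries (value) values (?)", [value])
    db.commit()


def latest_value(db):
    """Return the most recently stored value, or None if nothing is stored."""
    cur = db.execute("select value from entries order by id desc limit 1")
    row = cur.fetchone()
    return None if row is None else row[0]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    print(latest_value(db))      # nothing stored yet
    save_value(db, "something")
    print(latest_value(db))
```

Checking for None up front avoids relying on an exception for the empty case, at the cost of one extra line.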

At the end of the day I'm pretty happy with this. It wasn't a hard challenge, but it certainly allowed me to learn a bit about Flask. If I did it over again, I might improve things by creating my own decorator to abstract all the try/except blocks. The errors would have to become more generic and maybe less helpful, but that is a worthwhile cost for being able to live the “Don't Repeat Yourself” mentality. The decorator would look something like this (the code below has not been tested - caveat emptor):

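    from functools import wraps

    def try_block(f):
        # wraps needs the wrapped function as its argument.
        @wraps(f)
        def wrapper(*args, **kwds):
            try:
                return f(*args, **kwds)
            except ValueError:
                return jsonify({'response' : 'ERROR: Input not a number'})
        # A decorator has to return the wrapper it built.
        return wrapper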

As more functions start to use this decorator it'll most likely get uglier as it has to juggle more and more exceptions. While that is certainly a cost of having all of the error checking wrapped up in one location, the benefit that I can see is if another programmer were to add a new piece of functionality, he would know exactly where to go to add in the exceptions if it wasn't there already.

    @app.route('/fib/<number>')
    @try_block
    def fib(number):
        return jsonify({'response' : real_fib(int(number))})

I think this would clean up the function quite a bit.
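Here is a Flask-free sketch of that decorator idea that can be run as is. The handler fib_handler is a hypothetical stand-in for the view function, and it returns a plain dict where the real code would call jsonify.

```python
from functools import wraps


def try_block(f):
    """Turn a ValueError raised by f into an error payload."""
    @wraps(f)
    def wrapper(*args, **kwds):
        try:
            return f(*args, **kwds)
        except ValueError:
            return {'response': 'ERROR: Input not a number'}
    return wrapper


@try_block
def fib_handler(number):
    # Hypothetical stand-in for the Flask view; no jsonify here.
    n = int(number)  # raises ValueError for non-numeric input
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return {'response': a}


if __name__ == "__main__":
    print(fib_handler("13"))
    print(fib_handler("thirteen"))
```

Because of wraps, fib_handler keeps its own name, which matters in Flask since the endpoint name defaults to the function's name.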

Next time I'm going to show how to do this in Haskell using the Yesod web framework. As always, questions and comments are welcomed.

run.py

    from gevent.wsgi import WSGIServer
    # The application module is the challenge.py file listed below.
    from challenge import app, init_db

    init_db()
    http_server = WSGIServer(('0.0.0.0', 8080), app)
    http_server.serve_forever()

challenge.py

    from __future__ import with_statement
    from urllib2 import urlopen
    from contextlib import closing
    from flask import Flask, request, g, jsonify
    from lru_cache import lru_cache
    import sqlite3
    import sha  # deprecated since Python 2.5; hashlib.sha1 is the modern equivalent

    DATABASE = '/tmp/challenge.db'
    DEBUG = True
    SECRET_KEY = 'c29tZXRoaW5nY2xldmVyaGVyZQ==\n'
    USERNAME = 'challenge'
    PASSWORD = 'chang3m3'

    app = Flask(__name__)
    app.config.from_object(__name__)

    app.config.from_envvar('CHALLENGE_SETTINGS', silent=True)

    def connect_db():
        return sqlite3.connect(app.config['DATABASE'])

    def init_db():
        with closing(connect_db()) as db:
            with app.open_resource('schema.sql') as f:
                db.cursor().executescript(f.read())
            db.commit()

    @app.before_request
    def before_request():
        g.db = connect_db()

    @app.teardown_request
    def teardown_request(exception):
        g.db.close()

    @app.route('/fib/<number>')
    def fib(number):
        try:
            return jsonify({'response' : real_fib(int(number))})
        except ValueError:
            return jsonify({'response' : 'ERROR: Input not a number'})

    @lru_cache()
    def real_fib(n):
        """
        This code was modified from the fib code in the python3 functools
        documentation.
        """
        if n < 2:
            return n
        return real_fib(n-1) + real_fib(n-2)

    @app.route('/google-body')
    def google_body():
        try:
            sh = sha.new(urlopen('http://www.google.com').read())
            return jsonify({'response' : sh.hexdigest()})
        except Exception as e:
            return jsonify({'response' : 'ERROR: %s' % str(e)})

    @app.route('/store', methods=['GET', 'POST'])
    def store():
        if request.method == 'POST':
            try:
                g.db.execute('insert into entries (value) values (?)',
                             [request.form['value']])
                g.db.commit()
                resp = jsonify()
                resp.status_code = 200
                return resp
            except Exception as e:
                return jsonify({"response" : "ERROR: %s" % str(e)})
        else:
            try:
                cur = g.db.execute('select value from entries order by id desc')
                # fetchone returns a single row as a tuple, or None when the
                # table is empty. To better meet the requirements, output
                # just the first column of that row.
                return jsonify({'response' : cur.fetchone()[0]})
            except TypeError:
                # Indexing None (no rows yet) raises TypeError, not IndexError.
                return jsonify({'response' : 'NOTHING IN THE DATABASE'})

    if __name__ == '__main__':
        app.run()

Project Euler: Problem 16

I'm not dead yet! I've just been insanely busy the last month or two with changing jobs and preparing my first programming presentation for BayPiggies and Silicon Valley Code Camp (which is a post for the near future). Both of these have kept me away from my blog. Let me make it up to you with a solution to Project Euler problem #16.

The challenge is:

2^15 = 32768 and the sum of its digits is 3 + 2 + 7 + 6 + 8 = 26.
What is the sum of the digits of the number 2^1000?

Let's start with some Python code:

    #!/usr/bin/python

    print sum([int(i) for i in str(2 ** 1000)])

For this solution, using a more functional approach definitely reduced the code base. But one thing I was a little surprised about is that passing a list comprehension to the sum function is actually faster than passing a generator expression. Usually one hears how generator expressions are preferred over list comprehensions because they are more efficient with memory, among other reasons. However, it's actually faster to give sum a list. One quick caveat: this whole sum and list comprehension thing applies to Python 2. The same also seems to be true for Python 3, at least from the interpreter:

    >>> import timeit
    >>> timeit.timeit("sum(int(x) for x in str(2 ** 1000))", number=1000)
    0.11109958100132644
    >>> timeit.timeit("sum([int(x) for x in str(2 ** 1000)])", number=1000)
    0.09597363900684286
    >>> timeit.timeit("sum(int(x) for x in str(2 ** 1000))", number=10000)
    1.051396899012616
    >>> timeit.timeit("sum([int(x) for x in str(2 ** 1000)])", number=10000)
    0.9054670640034601
    >>> timeit.timeit("sum(int(x) for x in str(2 ** 1000))", number=100000)
    10.498383879996254
    >>> timeit.timeit("sum([int(x) for x in str(2 ** 1000)])", number=100000)
    8.992312036993098

On to the Haskell code:

    module Main where

    import Data.Char

    main :: IO ()
    main = print . sum . map digitToInt . show $ 2 ^ 1000

Maybe it's just me and my Haskell/Python-centric brain, but I think the algorithm is simple enough to easily see the similarities and differences between the two languages. If I wanted to write the Haskell code to better match the Python code (syntactic differences aside), it would look like this: (inside the Haskell interpreter)

    Prelude Data.Char> print . sum $ [ digitToInt x | x <- show (2 ^ 1000)]

Even though this code may be easier to read for a Python programmer, it's not “good” Haskell code. It'll get the job done, but the map is obfuscated by the list comprehension. We can also adjust the Python code to make it resemble Haskell by using map:

    print sum(map(int, str(2 ** 1000)))

But that might get you “dinged” because some people think that using map is “too functional” or “not Pythonic”, even if the code might be faster. I don't subscribe to that line of thinking...but that's a discussion for another time.
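For anyone following along on Python 3, here are the three spellings side by side as a small sketch. The function names are mine, and the exponent is a parameter so the 2^15 example from the problem statement can be checked too.

```python
def digit_sum_listcomp(exponent):
    """Sum the digits of 2**exponent using a list comprehension."""
    return sum([int(d) for d in str(2 ** exponent)])


def digit_sum_genexp(exponent):
    """Same computation with a generator expression."""
    return sum(int(d) for d in str(2 ** exponent))


def digit_sum_map(exponent):
    """Same computation with map, closest to the Haskell version."""
    return sum(map(int, str(2 ** exponent)))


if __name__ == "__main__":
    print(digit_sum_listcomp(15))   # the worked example: 3 + 2 + 7 + 6 + 8
    print(digit_sum_map(1000))
```

All three return the same number; which one is fastest depends on the interpreter and version, so it is worth timing on your own machine.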

Times:
Python – list comprehension: .032s
Python – map: .030s
Haskell – list (interpreted): .155s
Haskell – map (interpreted): .155s
Haskell – list (compiled): .006s
Haskell – map (compiled): .006s

As always, questions, comments, and complaints are encouraged. I hope everyone will forgive me for not posting for so long... sometimes life happens.

Project Euler: Problem 14

“The following iterative sequence is defined for the set of positive integers: n → n/2 (n is even) n → 3n + 1 (n is odd).  Using the rule above and starting with 13, we generate the following sequence: 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1.  It can be seen that this sequence (starting at 13 and finishing at 1) contains 10 terms.  Although it has not been proved yet (Collatz Problem), it is thought that all starting numbers finish at 1.  Which starting number, under one million, produces the longest chain?  NOTE: Once the chain starts the terms are allowed to go above one million.”

Not many of you may be aware of this, but about a year ago I wrote up a blog post that discussed Collatz chains in Haskell.  You can find that post here: . Having some of the code already written made coming up with the solution easier.  However, just because I had one function doesn't mean I had the whole problem licked.  I still had a fair amount of work in front of me.  Below is my code from the first attempt at a solution:

    module Main where

    import Data.List

    chain' :: Integer -> [Integer]
    chain' 1 = [1]
    chain' n
       | n <= 0 = []
       | even n = n : chain' (n `div` 2)
       | odd n = n : chain' (n * 3 + 1)

    main :: IO ()
    main = do
        let seqx = map chain' [3..1000000]
        let lengthx = map length seqx
        print . maximum $ zip lengthx seqx

This code appears to be logically correct but was incredibly slow - so slow that after over 2 minutes it still hadn’t completed.  I admit I can be a little impatient with these things from time to time, but in this case something was obviously wrong.

I devised two optimizations:

  • Reverse the order of the list. I will be more likely to find the number with the longest chain near 1,000,000 than 3.
  • Use odd numbers only. This is based on the fact that in the chain' function an odd number gets multiplied right off the bat, whereas an even number is instantly divided by 2, and also on the assumption that a higher number will be more likely to have a longer chain.  (I admit this was a complete experiment - I had no proof that it would work ahead of time, and knew it gave me the right answer only after the fact.)

The code then morphed into:

    module Main where

    import Data.List

    chain' :: Integer -> [Integer]
    chain' 1 = [1]
    chain' n
       | n <= 0 = []
       | even n = n : chain' (n `div` 2)
       | odd n = n : chain' (n * 3 + 1)

    main :: IO ()
    main = do
        let seqx = map chain' [999999,999997..3]
        let lengthx = map length seqx
        print . maximum $ zip lengthx seqx

The problem I ran into with this code was that I received stack overflow errors; my list of tuples, each holding another long list of integers, was taking up too much memory.  I fixed this problem by computing the length of the list immediately after generating it.  The new code looked like this:

    import Data.List

    chain' :: Integer -> [Integer]
    chain' 1 = [1]
    chain' n
       | n <= 0 = []
       | even n = n : chain' (n `div` 2)
       | odd n = n : chain' (n * 3 + 1)

    main :: IO ()
    main = do
        let seqx = map (\x -> (x, length $ chain' x)) [999999,999997..3]
        print . maximum $ seqx

This got me a result within the one minute time frame, but it still wasn't the right answer.  Can you figure out why?  Using the great code Jedai posted in the comments of my Apache log post, I was able to get my answer and finally complete the problem:

    module Main where

    import Data.Tuple
    import Data.List (sortBy)
    import Data.Function (on)

    chain' :: Integer -> [Integer]
    chain' 1 = [1]
    chain' n
       | n <= 0 = []
       | even n = n : chain' (n `div` 2)
       | odd n = n : chain' (n * 3 + 1)

    main :: IO ()
    main = do
        let seqx = map (\x -> (x, length $ chain' x)) [999999,999997..3]
        print . fst . head $ sortBy (flip compare `on` snd) seqx
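If you want a hint for the "can you figure out why?" question, one plausible reading is that tuples compare on their first element first, so taking the maximum of (start, length) pairs picks the largest starting number instead of the longest chain. The Python 3 sketch below shows the effect on a handful of small starting values. The helper chain_length and the sample range are my own.

```python
def chain_length(n):
    """Number of terms in the Collatz chain from n down to 1."""
    count = 1
    while n > 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        count += 1
    return count


# (start, length) pairs for 11, 9, 7, 5, 3, mirroring the descending odd range.
pairs = [(n, chain_length(n)) for n in range(11, 2, -2)]

# Plain max compares the start value first, so it ignores the lengths.
by_tuple_order = max(pairs)

# Comparing on the second element finds the longest chain instead.
by_length = max(pairs, key=lambda pair: pair[1])

if __name__ == "__main__":
    print(by_tuple_order)
    print(by_length)
```

Sorting on snd, as the final Haskell version does, plays the same role as the key argument here.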

After figuring that out, getting the Python answer was a breeze:

    #!/usr/bin/python
    """Python solution for Project Euler problem #14."""

    from itertools import imap

    def sequence(number):
        t_num = number
        count = 1

        while t_num > 1:
            if t_num % 2 == 0:
                # Floor division keeps t_num an integer.
                t_num //= 2
            else:
                t_num = (t_num * 3) + 1

            count += 1

        # The count comes first so that max() compares chain lengths.
        return (count, number)

    if __name__ == "__main__":
        print max(imap(sequence, xrange(999999, 3, -2)))

Here are the speed numbers:
Haskell (compiled): 14.758s
Python: 18.537s
Haskell (runghc): 15.217s

I think the use of recursion in my Haskell code is affecting its speed of computation.  As I learned from problem 12, I can use the State Monad again to speed things up.  But I also learned from the comments of problem 12 that some people were able to substitute a scan or fold in the State Monad’s place.  So I decided to shoot for one more solution.  After studying up on scan and fold, and finding that neither was really what I wanted, I found iterate. Using iterate I was able to change the program to this:

    module Main where

    import Data.Tuple
    import Data.List (sortBy, iterate)
    import Data.Function (on)

    chain' :: Integer -> Int
    chain' n
        | n < 1 = 0
        | otherwise = 1 + (length $ (takeWhile ( > 1) $ iterate (\x -> if even x then x `div` 2 else x * 3 + 1) n))

    main :: IO ()
    main = do
        let seqx = map (\x -> (x, chain' x)) [999999,999997..3]
        print . fst . head $ sortBy (flip compare `on` snd) seqx

The new chain' function doesn't read as cleanly as the old one, but it does remove the recursion I was talking about earlier.  The computer gods rewarded my efforts by reducing the run times to these:

Haskell (compiled): 10.933s
Haskell (runghc): 11.744s

From 14.758 to 10.933 - almost 4 seconds taken off the clock!  I think a speed up like that calls for some celebrating.  Which is exactly what I'm going to do before I start on problem 15.
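For Python readers, the iterate and takeWhile combination from the last Haskell version can be approximated with a small generator plus itertools.takewhile. This is only a sketch in Python 3: the iterate helper is my own, since the standard library has no function by that name, and collatz_step is likewise an illustrative name.

```python
from itertools import takewhile


def iterate(f, x):
    """Yield x, f(x), f(f(x)), ... forever, like Haskell's iterate."""
    while True:
        yield x
        x = f(x)


def collatz_step(x):
    """One step of the sequence: halve evens, triple-plus-one odds."""
    return x // 2 if x % 2 == 0 else 3 * x + 1


def chain_length(n):
    """Count the terms from n down to 1; 0 for non-positive input."""
    if n < 1:
        return 0
    # Count every term greater than 1, then add one for the final 1.
    above_one = takewhile(lambda x: x > 1, iterate(collatz_step, n))
    return 1 + sum(1 for _ in above_one)


if __name__ == "__main__":
    print(chain_length(13))   # the chain from the problem statement
```

The generator is lazy in the same way the Haskell list is, so takewhile stops pulling values as soon as the chain reaches 1.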

If you made it this far down into the article, hopefully you liked it enough to share it with your friends. Thanks if you do, I appreciate it.
