Project Euler: Problem 8

The website has problem eight stated as such,“Find the greatest product of five consecutive digits in the 1000-digit number.
73167176531330624919225119674426574742355349194934
96983520312774506326239578318016984801869478851843
85861560789112949495459501737958331952853208805511
12540698747158523863050715693290963295227443043557
66896648950445244523161731856403098711121722383113
62229893423380308135336276614282806444486645238749
30358907296290491560440772390713810515859307960866
70172427121883998797908792274921901699720888093776
65727333001053367881220235421809751254540594752243
52584907711670556013604839586446706324415722155397
53697817977846174064955149290862569321978468622482
83972241375657056057490261407972968652414535100474
82166370484403199890008895243450658541227588666881
16427171479924442928230863465674813919123162824586
17866458359124566529476545682848912883142607690042
24219022671055626321111109370544217506941658960408
07198403850962455444362981230987879927244284909188
84580156166097919133875499200524063689912560717606
05886116467109405077541002256983155200055935729725
71636269561882670428252483600823257530420752963450”

Yeah, looks pretty intimadating to me too. ;) Lucky enough for both of us this really isn't all that hard to figure out.

Python

  1. #!/usr/bin/python3
  2. """
  3. code solution for Euler Project #8
  4. """
  5. import operator
  6. from functools import reduce
  7.  
  8. if __name__ == "__main__":
  9. index = 0
  10. high_score = 0
  11. investigate_this = 7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
  12.  
  13.  
  14. digit_list = [int(x) for x in str(investigate_this)]
  15. digit_list_length = len(digit_list)
  16.  
  17. while index < digit_list_length:
  18. if index == 0:
  19. temp_product = reduce(operator.mul, digit_list[index: index +5])
  20. elif index == 1:
  21. temp_product = reduce(operator.mul, digit_list[index - 1: index + 4])
  22. elif index == digit_list_length - 1:
  23. temp_product = reduce(operator.mul, digit_list[index - 4: index + 1])
  24. elif index == digit_list_length - 2:
  25. temp_product = reduce(operator.mul, digit_list[index - 3: index + 2])
  26. else:
  27. temp_product = reduce(operator.mul, digit_list[index -2 : index + 3])
  28.  
  29. if temp_product > high_score:
  30. high_score = temp_product
  31.  
  32. index += 1
  33.  
  34. print(high_score)

For my first attempt I took an approach of having a travelling index and checking both two digits before and two digits after the index. This worked out pretty well except that I needed to add four special use cases (the first two index points and the last two). Some people may say that it's not “pythonic” to use reduce, and in some ways I can agree with it. However, in this case my reduce fuctions really aren't very complicated and I think they look cleaner than using for loops in this instance. Does anyone out there have any opinions?

Haskell

  1. module Main where
  2.  
  3. import Data.Char
  4.  
  5. investigate_this = map digitToInt $ show 7316717653133062491922511967442657474235534919493496983520312774506326239578318016984801869478851843858615607891129494954595017379583319528532088055111254069874715852386305071569329096329522744304355766896648950445244523161731856403098711121722383113622298934233803081353362766142828064444866452387493035890729629049156044077239071381051585930796086670172427121883998797908792274921901699720888093776657273330010533678812202354218097512545405947522435258490771167055601360483958644670632441572215539753697817977846174064955149290862569321978468622482839722413756570560574902614079729686524145351004748216637048440319989000889524345065854122758866688116427171479924442928230863465674813919123162824586178664583591245665294765456828489128831426076900422421902267105562632111110937054421750694165896040807198403850962455444362981230987879927244284909188845801561660979191338754992005240636899125607176060588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
  6.  
  7. investigate_length = length investigate_this
  8.  
  9. productof5 index
  10. | index >= investigate_length - 4 = []
  11. | otherwise = prod5 : productof5 (index + 1)
  12. where prod5 = product [ investigate_this !! y | y <- [index..(index + 4)]]
  13.  
  14. main :: IO()
  15. main = print . maximum $ productof5 0

In my next solution, I followed the same idea of using a traveling index, but this time I only paid attention to the index and the next four digits. The advantage to this was that I only needed to worry about one special case, and that was if the number of multipliers were less than five total. This did seem to simplify things.

Sorry to all my Perl & Java readers. Life is kind of chaotic at the moment and I only really had the energy to do these two languages. One day I hope to revisit this problem in those languages, and when I do I'll update this post with the new code additions.

I realize in the last few times I've posted project euler solution I haven't said anything so just to nail this thought home amongst the people reading this. I look forward to hearing any and all constructive comments regarding algorythms and code. Oh, and one more thing, I apologize in advance for the formatting of the rather large number.

Code to lyrics

It's not everyday that one can turn song lyrics into code. I've been a big fan of the group One Minute Silence for a pretty long time (1998 if memory serves me correctly), but it wasn't until recently that I was listening to the song “I Can Change” that I noticed that the spoken word part that happens at both the beginning and the end of the song followed a pattern. Noticing this pattern I realized I could reproduce the lyrics in code:

  1. #!/usr/bin/python3
  2.  
  3. def print_lyrics(words):
  4. if "I" in words:
  5. print("If I can change " + words[0] + ", " + words[1] + " can change.")
  6. else:
  7. print("If I can change " + words[0] + ", I can change " + words[1] + ",")
  8.  
  9. if __name__ == "__main__":
  10. number = [ str(2 ** x) for x in range(0,4)]
  11. number2 = number[1:]
  12. number2.append('I')
  13. for x in zip(number, number2):
  14. print_lyrics(x)

As a side note, I'm not sure if Yap intended to follow the powers of two, but it worked out really well when trying to generate the numerical output. And if you think your the kind of person that enjoys that hip-hop metal with anti-capitalist message, then you might want to give the song a try, and maybe the rest of their material if you like it.

Oh yeah, before I forget, Happy Friday everyone.

Project Euler: Problem 7

Here is the seventh installment of this series. And just to keep things interesting I'm going to do something different this time around. I still performed the solutions for Haskell, Java, and Perl (if you want to view them check out my github repo), but instead of posting the code for all four languages, I'm going to be talking about the evolution of my Python solution, showing you all three revisions, and talking briefly about how or why things changed.

Python code:
Solution 1:

  1. #!/usr/bin/python3
  2.  
  3. import math
  4.  
  5. def is_prime(divided):
  6. divisor = 3
  7. sqrt_divided = math.sqrt(divided)
  8. while divisor <= sqrt_divided:
  9. if divided % divisor == 0:
  10. return False
  11.  
  12. divisor += 2
  13. return True
  14.  
  15. if __name__ == "__main__":
  16. prime_list = [2]
  17. count = 3
  18.  
  19. while len(prime_list) < 10001:
  20. if is_prime(count) == True:
  21. prime_list.append(count)
  22.  
  23. count += 2
  24.  
  25. print(prime_list[-1])

Looking at this first solution, we see that it follows the Haskell solution quite closely. This is of course not an accident; I'm lazy. ;) And since I'm still learning Haskell, I find it easier to learn the basics of a language through translation. So this Python solution was my first solution, which I then translated to Haskell.

Solution 2:

  1. #!/usr/bin/python3
  2.  
  3. import math
  4.  
  5. def is_prime(divided):
  6. divisor = 3
  7. sqrt_divided = math.sqrt(divided)
  8. while divisor <= sqrt_divided:
  9. if divided % divisor == 0:
  10. return False
  11.  
  12. divisor += 2
  13. return True
  14.  
  15.  
  16. if __name__ == "__main__":
  17. count = 1 #to substitute the prime number 2 in the example
  18. last = 0
  19. iterator = 3
  20.  
  21. while count < 10001:
  22. if is_prime(iterator):
  23. count += 1
  24. last = iterator
  25.  
  26. iterator += 2
  27.  
  28. print(last)

So in coming up with my Java solution, I didn't want to deal with trying to come up with a linked list-based solution for number storage. At which point I had the great thought: why not just hold the last number to be prime, and not all primes. Spending all this time with lists and arrays instead of just straight Ints has really played with my thought patterns. So after I wrote up the Java solution, I translated it to Python, to see if there would be a reduction in the execution time based on only using Ints instead of a list of Ints. Paint me a little surprised when I saw that the difference was minimal if non-existent most of the time.

Solution 3:

  1. #!/usr/bin/python3
  2.  
  3. import math
  4.  
  5. def is_prime(divided):
  6. divisor = 3
  7. sqrt_divided = int(math.sqrt(divided))
  8. while divisor <= sqrt_divided:
  9. if divided % divisor == 0:
  10. return False
  11.  
  12. divisor += 2
  13. return True
  14.  
  15.  
  16. if __name__ == "__main__":
  17. # using a list to store each int value
  18. data = [1,0,3]
  19.  
  20. while data[0] < 10001:
  21. if is_prime(data[2]):
  22. data[0] += 1
  23. data[1] = data[2]
  24.  
  25. data[2] += 2
  26.  
  27. print(data[1])

After digging into how Python does its memory management, like Lists being mutable while Ints are not, I modified the second solution so that the Ints are now in a list. This way the Python memory manager does not have to create a new variable to store each value. After these changes I got the speed difference I was expecting to see between the first and second revision.

Also, you might have noticed that in the is_prime function, I cast math.sqrt to an Int. As math.sqrt returns a float, every time that comparison happens for the while loop there has to be a conversion of divisor from its current state of int to a float. Math.sqrt returns a float, thus sqrt_divided is a float, and divisor is an int. So every time that while loop cycles again, there has to be a conversion of an int to a float. By casting that float to an int during the variable declaration, I save myself a few thousand cast operations and shave off a couple hundredths of a second from the total run time.

Language Run Times:
Python:
solution 1: .452s
solution 2: .433s
solution 3: .376
Haskell: .187s
Perl : .333s
Java: .128s

I don't think there are enough words in the English language to express my surprise that Java actually had the lowest overall run time on this one. I reran the Java solution a couple of times to see if it was an error, but it's correct. This is something that I will have to dig into more deeply to find out why.

And Now For Something Completely Different...

Or maybe not. I came across this programming challenge through means I'm not quite comfortable sharing on the interwebs. So disregard the origin of this challenge, and just put those spare brain cells toward the challenge and my answer.

The challenge:
Write a program that assigns products to sites in a way that maximizes the scores over
the set of customers. Each customer can only have one product and each product can
only be offered to one customer. Your program should run on the command line and
take as input two newline separated files, the first containing the names of the products
and the second containing the names of the customers. The output should be the total
SS and a matching between customers and products. You do not need to worry about
malformed input, but you should certainly handle both upper and lower case names.

The algorithm for generating the number is as follows:
• If the length of the product name is even, the base score is the number
of vowels in the customer’s name multiplied by 1.5.
• If the length of the product name is odd, the base score is the number of consonants in
the customer’s name multiplied by 1.
• If the length of the product name shares any common factors (besides 1) with the
length of the customer’s name, the score is increased by 50% above the base score.

My answer:

  1. #!/usr/bin/python
  2.  
  3. import sys
  4. from optparse import OptionParser
  5.  
  6. def build_list_from_file(filename):
  7. """
  8. builds a list from a newline delimited file.
  9. INPUT: string object, the filename
  10. OUTPUT: a list object with the lines of the file
  11. """
  12. temp_list = []
  13. try:
  14. with open(filename) as file:
  15. for lines in file:
  16. temp_list.append(lines.strip())
  17. except IOError:
  18. print "the file %s does not exist. Please look into it." % filename
  19. sys.exit(1)
  20.  
  21. return temp_list
  22.  
  23. def count_vowels(name):
  24. """
  25. counts the number of vowels in a string.
  26. INPUT: string object, the string to be parsed
  27. OUTPUT: int object, the number of vowels in the string
  28. """
  29. vowels = 0
  30. for x in name.lower(): #setting all characters to lower case for comparison
  31. for y in ['a', 'e', 'i', 'o', 'u']:
  32. if x == y:
  33. vowels += 1
  34.  
  35. return vowels
  36.  
  37.  
  38. def count_consinants(name):
  39. return len(name) - count_vowels(name)
  40.  
  41.  
  42. def generate_ss(matrix_object):
  43. """
  44. The function to generate the SS score based on the product name
  45. and customer name.
  46. INPUT: tuple object, with the name in the first element and
  47. the product name in the second
  48. OUTPUT: a float object with the value after the computation
  49. """
  50. cust = matrix_object[0]
  51. prod = matrix_object[1]
  52. cust_factors = get_factors(cust)
  53. prod_factors = get_factors(prod)
  54. ss = 0.0
  55.  
  56. if len(prod) % 2 == 0:
  57. ss = float(1.5 * count_vowels(cust))
  58. else:
  59. ss = float(count_consinants(cust)) # muliplicaton by 1 is implied.
  60.  
  61. # by using sets, we can now use itersection to see if there are any
  62. # common factors between the two names
  63. if len(cust_factors.intersection(prod_factors)) > 0:
  64. ss *= 1.5
  65.  
  66. return ss
  67.  
  68.  
  69. def get_factors(name):
  70. """
  71. Retrive the factors of a particular number.
  72. INPUT: int object, number to gather factors of.
  73. OUTPUT: set object, the factors of a particular number
  74. """
  75. # using a variable instead of calling len() twice, should be faster
  76. top_num = len(name)
  77. return set([x for x in range(2,top_num+1) if top_num % x == 0])
  78.  
  79.  
  80. if __name__ == "__main__":
  81. """self explanitory... I hope."""
  82. parser = OptionParser()
  83. parser.add_option("-p", "--prod", dest="prod",
  84. help = "Path to the prodcts file.Which should be newline delimited file.")
  85. parser.add_option("-c", "--cust", dest="cust",
  86. help = "Path to the customer file.Which should be newline delimited file.")
  87.  
  88. (options,args) = parser.parse_args()
  89.  
  90. if not (options.prod and options.cust):
  91. print "A file for customers (-c) and one for products (-p) are needed."
  92. sys.exit(1)
  93.  
  94. customer_list = build_list_from_file(options.cust)
  95. prodcut_list = build_list_from_file(options.prod)
  96. matrix = []
  97. final_list = []
  98.  
  99. matrix = [(a,b) for a in customer_list for b in prodcut_list]
  100.  
  101. ss_list = map(generate_ss, matrix)
  102.  
  103. # since we're zipping things up, might as well sort it
  104. # with largest SS value first
  105. zipped_by_ss = list(reversed(sorted(zip( ss_list, matrix))))
  106.  
  107. # need to prime the list in order for this to work
  108. final_list.append(zipped_by_ss[0])
  109. del zipped_by_ss[0]
  110.  
  111. while 1 < len(zipped_by_ss):
  112. j = 0
  113. while j < len(zipped_by_ss):
  114. if ( zipped_by_ss[j][1][0] is final_list[-1][1][0] or
  115. zipped_by_ss[j][1][0] is final_list[-1][1][1] or
  116. zipped_by_ss[j][1][1] is final_list[-1][1][0] or
  117. zipped_by_ss[j][1][1] is final_list[-1][1][1]
  118. ):
  119. del zipped_by_ss[j]
  120. continue
  121. j += 1
  122.  
  123.  
  124. final_list.append(zipped_by_ss[0])
  125. del zipped_by_ss[0]
  126.  
  127. for item in final_list:
  128. print "%s has item %s, with SS score of %s." % (item[1][0], item[1][1],item[0])

Thoughts about the code
Before I posted this code here, I actually had the matrix generation done with two for loops. Changing it to the list comprehension that is in the code reduced the code size by a good eight lines or so, increased the readability of that section of the program, and gave me a bit of a speed up to boot. I will remember this little trick for the future.

When I was in school I took the requisite algorithms course, but that was mostly about sorting and some other things. I don't recall them going over a “duplicate removal” algorithm. For this part of the program I believe my program runs correctly though I feel like it’s ugly code. Doing the removal of duplicates with two indexes made sense to me for removing the duplicates in line, but I feel like this could be optimized more. I'm totally open to comments on all of my code, but I would really appreciate it if people would focus more on that aspect of the program. Also, if anyone knows of any “duplicate removal” style algorithms, or places where I could learn more about them, I would be very grateful.

Syndicate content