Here is the seventh installment of this series. And just to keep things interesting I'm going to do something different this time around. I still performed the solutions for Haskell, Java, and Perl (if you want to view them check out my github repo), but instead of posting the code for all four languages, I'm going to be talking about the evolution of my Python solution, showing you all three revisions, and talking briefly about how or why things changed.
Python code:
Solution 1:
#!/usr/bin/python3 import math def is_prime(divided): divisor = 3 sqrt_divided = math.sqrt(divided) while divisor <= sqrt_divided: if divided % divisor == 0: return False divisor += 2 return True if __name__ == "__main__": prime_list = [2] count = 3 while len(prime_list) < 10001: if is_prime(count) == True: prime_list.append(count) count += 2 print(prime_list[-1])
Looking at this first solution, we see that it follows the Haskell solution quite closely. This is of course not an accident; I'm lazy. ;) And since I'm still learning Haskell, I find it easier to learn the basics of a language through translation. So this Python solution was my first solution, which I then translated to Haskell.
Solution 2:
#!/usr/bin/python3 import math def is_prime(divided): divisor = 3 sqrt_divided = math.sqrt(divided) while divisor <= sqrt_divided: if divided % divisor == 0: return False divisor += 2 return True if __name__ == "__main__": count = 1 #to substitute the prime number 2 in the example last = 0 iterator = 3 while count < 10001: if is_prime(iterator): count += 1 last = iterator iterator += 2 print(last)
So in coming up with my Java solution, I didn't want to deal with trying to come up with a linked list-based solution for number storage. At which point I had the great thought: why not just hold the last number to be prime, and not all primes. Spending all this time with lists and arrays instead of just straight Ints has really played with my thought patterns. So after I wrote up the Java solution, I translated it to Python, to see if there would be a reduction in the execution time based on only using Ints instead of a list of Ints. Paint me a little surprised when I saw that the difference was minimal if non-existent most of the time.
Solution 3:
#!/usr/bin/python3 import math def is_prime(divided): divisor = 3 sqrt_divided = int(math.sqrt(divided)) while divisor <= sqrt_divided: if divided % divisor == 0: return False divisor += 2 return True if __name__ == "__main__": # using a list to store each int value data = [1,0,3] while data[0] < 10001: if is_prime(data[2]): data[0] += 1 data[1] = data[2] data[2] += 2 print(data[1])
After digging into how Python does its memory management, like Lists being mutable while Ints are not, I modified the second solution so that the Ints are now in a list. This way the Python memory manager does not have to create a new variable to store each value. After these changes I got the speed difference I was expecting to see between the first and second revision.
Also, you might have noticed that in the is_prime function, I cast math.sqrt to an Int. As math.sqrt returns a float, every time that comparison happens for the while loop there has to be a conversion of divisor from its current state of int to a float. Math.sqrt returns a float, thus sqrt_divided is a float, and divisor is an int. So every time that while loop cycles again, there has to be a conversion of an int to a float. By casting that float to an int during the variable declaration, I save myself a few thousand cast operations and shave off a couple hundredths of a second from the total run time.
Language Run Times:
Python:
solution 1: .452s
solution 2: .433s
solution 3: .376
Haskell: .187s
Perl : .333s
Java: .128s
I don't think there are enough words in the English language to express my surprise that Java actually had the lowest overall run time on this one. I reran the Java solution a couple of times to see if it was an error, but it's correct. This is something that I will have to dig into more deeply to find out why.

One optimization is to make
One optimization is to make is_prime use the list of primes already calculated instead of test all odd possible divisors. Another optimization is to check only the numbers 6k +/- 1.
Lists are not magic pixie dust
Replacing variables with list items is a pointless obfuscation that makes your code slightly slower (as you've discovered). The speed difference between version 2 and version 3 comes from the int(math.sqrt()) change, since you're performing the type conversion only once, and not once per loop.
This is why changing two things at once is not a good technique: you end up not knowing which of the changes caused the speedup.
Hey Marius, Thanks for the
Hey Marius,
Thanks for the comment. After researching what you said I have to agree that you're are absolutely right, the code is faster without using lists and that the improvement came from doing the int cast.
In my defense, I did the change from ints to list first, and at that time I saw a slight speed improvement... although for the life of me I can't seem to reproduce it now.
However this does bring up a weird scenario. If ints are immutable, and a new int has to be created every time a variable is changed, one would think that there would be a lot of time wasted dealing with memory management and garbage collection. And that the time spend doing that would disappear if using a list to store the information, being those are mutable.
It doesn't matter if you put
It doesn't matter if you put an immutable int into a list, or assign it directly to a name. In both cases when you do arithmetic Python has to create a new immutable int to hold the new value. (Small ints are cached, so the memory-management overhead will be reduced, but that's an implementation detail.)
AAAAHHHHH!! That makes a lot
AAAAHHHHH!! That makes a lot of sense now that you explain it.
Thanks for sharing that one!
Cool stuff
Very cool examples. I read the first couple sentences of your post and then tried implementing it myself before reading the rest. I ended up with your solution #2, only I was foolishly testing for primes by counting down from the sqrt of the number to be divided, rather than counting up to the sqrt. When I changed this my runtime increased dramatically :)
Very smart optimization in solution 3. I wonder if it's possible to make this any faster...?
Correction
When I changed this my runtime increased dramatically :)
->
When I changed this my speed increased/runtime decreased dramatically :)
That's what I get for posting too quickly... :op
Hey Nat, Thanks for the
Hey Nat,
Thanks for the comments. I think most people would probably come up with the second solution first, it's simpler. But as I said in the blog I guess I'm use to thinking about these things in list data type patterns, which although are not incorrect, they are just not as efficient for all uses.
After you asked your question regarding if there was a way to speed things up bit; I tried putting in the same list manipulation technique that I used in solution 3 and put into the is_prime function. Oddly enough it made the code slower than the original solution. I admit I'm completely baffled by this. As I told Nico I hope to spend some time with profiling tools in the near future and hopefully I might be able to get a little more insight into why these things are happening the way they are.
Yeah, actually I tried the
Yeah, actually I tried the same thing earlier--implementing the is_prime function with lists instead of straight ints, and it made it considerably slower. Let me know if you figure it out, because I'd be interested to know what's going on.
runtime comparison
That Java is fast, is not such a surprise to me. I found the position of the Haskell solution (2nd in terms of speed) more surprising. Do you have any idea why Haskell is so fast in this case, and why all Python solutions are at the bottom of the list (in terms of speed)?
Hi Nico, Thanks for
Hi Nico,
Thanks for commenting and asking that question.
The reason why the Haskell solution is much faster this time around is because of an algorithmic change I made to the code but didn't talk about because this blog post was much more Python centric. If you check out the code for problem 3 against problem 7 you'll see that I pulled out the square root function call to now only being once per divided, instead before it was being called once per divisor. This change was pretty much the reason for the huge number improvement between problem 3's and problem 7's run times.
As for the slow performance of the Python code, the only thing that I can guess might be the problem is that I'm running the code through Python 3.1 and not 2.7 or 2.6. If I run the fastest python solution in 2.6 the runtime speed decreases to .235s or so.
What I will want to for the next blog is spend some time learning how to use the profiling tools for Java, Haskell, and Python. And use then to discover the reason as to why the Python code is so much slower than the rest of the languages for this solution.