De Morgan’s and (un)Pythonic code


The term “Pythonic” is a subjective and ambiguous one. There are plenty of sites on the internet that try to explain what it means. Some of the better sties will even tell you why one way of doing something is better than another. I am adding my scrollingtext to the latter list.

I believe that one of the key requirements for writing pythonic code is readability, as briefly mentioned in “The Zen of Python” by Tim Peters. I also believe that this little poem (I use that term loosely here) gives some pretty good hints on how to write pythonic code. (This is for more of the thought process outside the general coding style stuff defined in PEP-8). Here is the Zen of Python:

“Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one-- and preferably only one --obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!”

Since this post is about pythonic code, I want to take a quick moment to show you what I mean, saving you a google search or two in the process. Here is a class method that I would consider non-pythonic code:

  1. def inNodeExistInPool(self, poolName, nodeName):
  2. try:
  3. currentNodes = self.__getNodesInPool(poolName)
  4. if currentNodes:
  5. for n in currentNodes:
  6. if n == nodeName:
  7. return True
  8. return False
  9. except Exception, e:
  10. raise Exception('Exception in isNodeExistInPool: ' + poolName +
  11. ', Node: ' + nodeName + ', Error: ' + str(e))

Now here is a more pythonic version of the same code:

  1. def isNodeExistInPool(self, poolName, nodeName):
  2. try:
  3. if nodeName in self.__getNodesInPool(poolName):
  4. return True
  5. else:
  6. return False
  7. except Exception, e:
  8. raise Exception('Exception in isNodeExistInPool: %s, Node: %s, Error: ' %
  9. (poolName, nodeName, str(e)))

While the first version of the code above is not wrong and would probably be the correct way to do this in other programming languages, it’s not the best way to do it in python. (I also expect to see quite a few comments on how other python programmers might improve the first or second versions of these functions.)

Now that we have that little example out of the way, let me share with you the motivation for this blog post. At work one day I saw some code that looked like this:

  1. # ‘a’ being an instance of a class
  2. if not a.att1 and not a.att2 and not a.att3 and not a.att4:
  3. # more code down here

oddly enough, I remembered De Morgan’s Law from my formal logic courses, so I used it to simplify the expression. Here is the proof (in formal logic notation because it’s quicker):

  • ~P ^ ~Q ^ ~R ^ ~S
  • (~P ^ ~Q) ^ (~R ^ ~S)
  • ~(P v Q) ^ ~(R v S)
  • (~(P v Q) ^ ~(R v S))
  • ~((P v Q) v (R v S))
  • ~(P v Q v R v S)

After the logic juggling above I renamed the variables and then modified the code to resemble the ending result:

  1. if not (a.att1 or a.att2 or a.att3 or a.att4):
  2. # still more code down here

After the change we can immediately see that I’ve reduced number of not calls. While I’m sure there is some performance improvement, I seriously doubt that it’s something that might be noticed by a human being, so we’ll just discount this as any form of optimization. I believe I’ve also made the code easier to read. The line itself is smaller, so a maintainer would have less “things” to juggle in their head to verify that the expression should pass or fail. Thus this code change has fulfilled some of the ideas in the “Zen of Python,” particularly the parts about “Simple is better than complex,” and “Readability counts.” However, in the final product I’ve had to add some parentheses in order for the logic to be correct, and this is where my ultimate question lies. In a lot of “pythonic” code examples on the web, one doesn’t really see “if” statements written like this one. So a somewhat self-conscious part of me thinks that I even though I have gone through the trouble of trying to make the code cleaner, I still haven’t gotten the code to be “pythonic”.

After thinking about it a little more, I decided that my version is pythonic. While I did add some clutter by adding parentheses, I also removed 3 “not calls” and made the line easier to understand. I believe this follows the spirit of pythonic code, even if it does not follow the the code to the letter.

Top photo credit goes to: mromtz

Pangrams


If you haven’t figured it out by now, I enjoy solving problems. And over the course of the last year or two, I’ve learned that interview questions make for great problems to work on. Actual interview problems are nice because they are usually quick, but have a quirk or two in there that makes them challenging, unlike simple questions like the Fizz Buzz problem that just checks if you have the most basic coding skills. (Has anyone actually been asked that question in an interview?)

The most recent problem I got to sink my teeth into (found it on a recruiting site, but not going to share where I got it; wouldn’t be fair to the company posting the problem) is for finding pangrams in sentences. If you don’t know what a pangram is Wikipedia defines them as, “a sentence using every letter of the alphabet at least once.” Yeah, I didn’t know what they were either until I started programming this little puzzle. Here is the code:

  1. module Main (main) where
  2.  
  3. import System (getArgs)
  4. import qualified Data.Set as S
  5. import qualified Data.Text as T
  6. import qualified Data.Text.IO as TI (readFile)
  7.  
  8. buildList :: FilePath -> IO [T.Text]
  9. buildList filename = TI.readFile filename >>=
  10. return . map (T.toLower . T.filter (/=' ')) . T.lines
  11.  
  12. compareAndPrint :: S.Set Char -> String
  13. compareAndPrint sset = if S.null result
  14. then "NULL"
  15. else S.toList result
  16. where result = S.difference (S.fromList ['a'..'z']) sset
  17.  
  18. main = do
  19. args <- getArgs
  20. sentences <- buildList $ head args
  21. mapM_ (putStrLn . compareAndPrint) $ map( S.fromList . T.unpack) sentences

I came up with the solution pretty quickly by using Sets. Having a set of the alphabet and finding the difference of the letters used in the sentence makes the problem almost trivial. The hard part for me was figuring out how to filter out the spaces and change all characters to lower case in the buildList function. I eventually figured it out, but it took some head against wall action to get it right.

This is going to be my last post for this year. I would like to wish you all Happy Holidays and a Happy New Year. Thank for reading and see you again in 2012. I would also like to thank everyone from planet.haskell.org who decided to read this. Welcome!

99 bottles

It’s been a while since I put up one of these simple code posts. Let me fix that. :P

Quiet a while ago, this code needed to have the dust blown off of it. Programming Praxis had its readers program out the song for “Nintey-Nine Bottles of Beer on the Wall".

Here is my version of it:

  1. #!/usr/bin/env python
  2.  
  3. from __future__ import print_function
  4.  
  5. def output(in_int):
  6. if in_int != 1:
  7. return "%d bottles of beer on the wall, %d bottles of beer." \
  8. "You take one down, pass it around, %d bottles of beer" \
  9. "on the wall." % (in_int,in_int, in_int - 1)
  10. else:
  11. return "%d bottle of beer on the wall, %d bottle of beer." \
  12. "You take one down, pass it around, %d bottles of beer" \
  13. "on the wall." % (in_int,in_int, in_int - 1)
  14.  
  15. if __name__ == "__main__":
  16. [ print(output(x)) for x in reversed(xrange(1,99)) ]

That’s all there is to it really. The only enlightening thing about this I can say is that this algorithm reminds me a lot of my Code to lyrics post I wrote a little over a year ago. It’s amazing how simple songs are when they’re broken down to their basic elements. Anyone feel like setting up an “automatic song generation” business with me?

Git Churn

I’m still really on the fence regarding the flame war between Git and Mercurial; they both have their strengths and weaknesses. (In case you’re wondering, the reason I haven’t been forced to make a decision yet is because $WORK uses Perforce .) When I learn about one trick in one of these tools, I try to see if that trick is possible in the other. So far this search has been pretty limited, and both applications seem to be equally capable.

My buddy OJ pointed me out to a Mercurial extension called churn, which is a pretty nifty tool. It allows you to quickly see who made how many changes to either a particular file or an entire repo (emails removed to avoid these people being spammed).

guido 1970354 *******************************************************************
jack.jansen 1665771 *********************************************************
solipsis 1588311 ******************************************************
georg 1331332 *********************************************
martin 1005541 **********************************
benjamin 697038 ************************
thomas 460885 ****************
christian 445461 ***************
fdrake 437229 ***************
tools 418957 **************

After finding out about this cool little feature I decided to see if Git had something similar. After a little digging around I came across this thread on StackOverflow(2) which talked about the Git shortlog command. Running this command with the “-sne” argument (as given from the StackOverflow page) on my Project Euler git repo returned these results:

129 Bryce Verdier
42 Bryce Verdier
4 Bryce
1 bryce

While there are mentions of the gitstats project, I didn’t want to bring that into the equation. I just want to compare the core applications. Using a third party application, like gitstats, wouldn’t be fair to Mercurial.

Seeing how both of these tools seemed to have similar capabilities, I decided to investigate a little more. I started off by trying to see if there might have been a flag to remove the stars from the Mercurial churn output; while the stars are cool, I think the Git output looks cleaner without them. Sadly, there isn’t a flag in churn to remove the stars.

One thing I did like about churn over Git shortlog was the ability to see either the number of lines changed or the number of change sets. I think that having more options for how you want information presented is a good thing. I also liked the output of churn over that of shortlog. I want to see the user information before I see the number of changes they made; I think it’s easier to process information that way than the other way around. Sadly, even though there is a quite a lot of documentation regarding formatting within the git log man page, there doesn’t seem to be a print variable for summarizing one’s commits. Or for changing the formatting if one uses the “-s” flag in the shortlog command.

As hard as I tried to break the tie between Git and Mercurial, these two particular extensions each have their own advantages and disadvantages. While it might be ideal to attempt to combine churn and shortlog into the ULTIMATE CHANGE TRACKING EXTENSION, that is a task for another day (and probably another programmer.) I guess the flame war between these two tools is just going to have to continue for a little while longer.

Syndicate content