Showing posts with label python. Show all posts

Friday, February 10, 2012

Free resources for learning Python

I've blogged on here a bit about Python, but it occurred to me that I haven't actually provided any resources for anyone interested in learning Python.

A great place to start is learnpython.org. It's got simple tutorials as well as an interactive shell so you can try it out as you learn.

There's also Dive Into Python, a free book that has also been published in print by Apress.

Dive Into Python covers Python 2, but there's a free Python 3 version too, also published by Apress.

Of course there are a lot of other free resources available, but less is more, and these two are a great place to start. A more comprehensive list of resources, free and non-free, is available here: http://wiki.python.org/moin/PythonBooks

Have fun!

Monday, July 18, 2011

Python mysqldb UnicodeDecodeError: 'ascii' codec can't decode byte

Solution:

If you run into the error mentioned in the title of this post using Python's MySQLdb module version 1.2.1 or earlier, decode your data/query first and keep the result (decode() returns a new string, it doesn't change the original):

mydata = mydata.decode('utf8')

(changing 'utf8' to whatever encoding your data happens to be in)

Details:

So I was writing some code in python on Ubuntu, and it was working just fine. When I went to run it in RHEL, I got this error:

Traceback (most recent call last):
  File "", line 50, in ?
  File "/usr/lib64/python2.4/site-packages/MySQLdb/cursors.py", line 146, in execute
    query = query.encode(charset)
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 223: ordinal not in range(128)


My first thought was that it was due to the incredibly old version of Python that ships with RHEL 5 (Python 2.4), but it didn't take me long to realize the problem was with the MySQLdb module itself. Ubuntu 10.10 ships with version 1.2.2 of that module, while RHEL 5 ships with version 1.2.1. A minor difference, but apparently in that time this bug was fixed:
http://sourceforge.net/tracker/index.php?func=detail&aid=1521274&group_id=22307&atid=374932

Apparently MySQLdb 1.2.1 tries to indiscriminately encode the data to be put into the database to utf8 (well, at least when you specify utf8 as the database character set), without checking whether the string is already utf8 or not. My solution was just to decode my data from utf8 (to unicode) before passing it to my mysql query, at which point the encoding works just fine.

Like so (the first line's the relevant one):

mydata = mydata.decode('utf8')
query = ('INSERT INTO %(database)s (%(column)s) VALUES (%(value)s)' % {'database': database, 'column': column, 'value': mydata})
cursor.execute(query)


Of course, you should modify the 'utf8' part to whatever encoding your data is in.

Edit: If you're using MySQLdb.escape_string(), make sure you run that first before doing the decode, like so:

mydata = MySQLdb.escape_string(mydata).decode('utf8')
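
To see where the error in the traceback comes from, here's a minimal sketch. It's written in Python 3 syntax (my assumption for illustration; the code in this post targets Python 2.4). In Python 2, calling encode() on a byte string first decodes it as ASCII behind the scenes, and that hidden step is what blows up on a byte like 0xc3. The names raw and implicit_ascii_decode are made up for this example and have nothing to do with MySQLdb itself.

```python
# Python 3 sketch of what Python 2 did implicitly when encode() was
# called on a byte string: it first decoded the bytes as ASCII.
raw = "caf\u00e9".encode("utf8")  # UTF-8 bytes; the accented e becomes 0xc3 0xa9


def implicit_ascii_decode(data):
    """Mimic the hidden ASCII decode; return the error text, or None."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return str(exc)
    return None


error = implicit_ascii_decode(raw)

# Decoding explicitly with the right codec gives text, and text can be
# encoded to UTF-8 again without any hidden ASCII step.
text = raw.decode("utf8")
round_trip = text.encode("utf8")

if __name__ == "__main__":
    print(error)
    print(round_trip == raw)
```

The error text names the same codec and the same offending byte as the traceback above, which is why decoding up front with the correct encoding makes the problem go away.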

Wednesday, June 1, 2011

nullege.com: search python source code

the title sums it up pretty well. I know you can find just about anything using google, but supposedly nullege.com is a little smarter when it comes to python. it's a cool idea, at any rate. I haven't had a need to try it out yet, but I'm sure I will before long:


http://nullege.com/

Wednesday, March 9, 2011

python's only problem: concurrency

python has its naysayers, but that's what jealousy will do to you :) seriously though, python does have one major issue: concurrent/parallel programming. now I'm no engineer, and I'm sure this is an oversimplification, but three ways of accomplishing this are:


  1. using multiple threads

  2. using multiple processes

  3. event-based programming



the main killer out of these three is trying to do multi-threaded programming in python. the short answer: don't, at least not for CPU-bound work. when Python was designed, it was given the philosophy that people using it should have to worry about as little as possible under the hood. while that's great, the way the standard interpreter (CPython) handles threads makes them inefficient for CPU-bound tasks: only one thread can execute Python bytecode at a time, so adding threads to CPU-bound code tends to make it slower rather than faster. ironically the problem seems to be even worse on machines with multiple processors. the problem has a name: the Global Interpreter Lock (GIL). you can read more about it here:

http://wiki.python.org/moin/GlobalInterpreterLock
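
if you'd rather see the effect than read about it, here's a rough sketch (Python 3 syntax, standard library only) that times the same CPU-bound loop run back to back and then in threads. the function names are mine, and the numbers depend entirely on your machine and interpreter, so I'm not quoting any timings. on a standard CPython build with the GIL, I'd expect the threaded run to be no faster than the sequential one.

```python
import threading
import time


def count_down(n):
    """Pure-Python busy loop: CPU-bound work with no I/O."""
    while n > 0:
        n -= 1
    return n


def run_sequential(jobs, n):
    """Run the loop `jobs` times, one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(jobs):
        count_down(n)
    return time.perf_counter() - start


def run_threaded(jobs, n):
    """Run the loop in `jobs` threads at once; return elapsed seconds."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(jobs)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    jobs, n = 2, 2_000_000
    print("sequential: %.2fs" % run_sequential(jobs, n))
    print("threaded:   %.2fs" % run_threaded(jobs, n))
```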

there's also a great presentation that explains the problem pretty well if you have time to watch it. I'll put it at the bottom of this post.

then there's the issue of event-based programming. the problem in this arena (in my opinion, of course) is that there are too many choices for event-based programming, and none of them are included in the python standard library, so there's no standard as of yet.

lastly there's multi-process programming, which thankfully is in a better place than the other two. although it's only been fairly recently (within the last couple of years), python now has a multiprocessing module in the standard library. so if you're looking to write some concurrent/parallel code, I'd say start here before wading through the alternatives that aren't in the standard library:

http://docs.python.org/library/multiprocessing.html
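
to give a feel for the module, here's a small sketch in Python 3 syntax that farms a CPU-bound function out to a pool of worker processes. count_primes and count_primes_parallel are names I made up for the example, and trial division is just a convenient way to burn CPU, not a recommendation for finding primes.

```python
from multiprocessing import Pool


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for candidate in range(2, limit):
        if all(candidate % d for d in range(2, int(candidate ** 0.5) + 1)):
            count += 1
    return count


def count_primes_parallel(limits, workers=2):
    """Spread the calls to count_primes over a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(count_primes, limits)


if __name__ == "__main__":
    # The guard matters: worker processes may re-import this module.
    print(count_primes_parallel([10_000, 20_000, 30_000]))
```

each worker is a separate process with its own interpreter, so the GIL in one doesn't hold up the others. the trade-off is that arguments and results have to be pickled and shipped between processes, so this pays off for chunky CPU-bound jobs rather than lots of tiny ones.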

as the speed of computing becomes more about multiple processors than faster clock speeds, this issue will only become more important. thankfully there's a standard solution (the multiprocessing module) as well as plenty of alternatives, and hopefully in the near future one or two of these alternatives will become standard, or at least a de facto standard.

here's a page that has links to some resources related to concurrency in python, as well as links to plenty of the aforementioned alternatives:

http://wiki.python.org/moin/Concurrency

and here's that video I promised:
[blip.tv ?posts_id=2243379&dest=-1]

Tuesday, March 8, 2011

Python and MySQL autocommit

Solution:

If your MySQL database is using the InnoDB engine, commit your changes after database transactions:
cursor.connection.commit()

Or just turn on autocommit to automatically commit after every database transaction:
cursor.connection.autocommit(True)

Details:

So I was using Python's MySQLdb module to edit a MySQL database, and I noticed that even though Python was telling me my modifications were taking place, I wasn't seeing any changes. In addition, when I would log onto the database from something other than Python and try to make changes, I would get this error:

ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

Well, apparently the MySQLdb module now disables MySQL autocommit by default. So now, when I'm done with my transactions to the database, I need to commit my changes by calling the commit() method of the connection object. The connection object is what's returned when you call the MySQLdb.connect() function. I haven't really been using that object in my code other than to create the cursor object, which is what I normally use:

db = MySQLdb.connect(...)
cursor = db.cursor()


But the cursor can access the connection object, which I can then use to commit the changes:

cursor.connection.commit()

The strange thing with all of this is that I just started noticing this problem, even though according to the MySQLdb module authors, this functionality (disabling autocommit by default) has been in place since version 1.2.0 of the module. But when I look at the package versions of python-mysqldb for Ubuntu (what I'm currently using and have been for a while), it looks like it's been past version 1.2.0 for the last several years:
http://packages.ubuntu.com/search?keywords=python-mysqldb&searchon=names&suite=all&section=all

Maybe this "feature" has just recently made it into Ubuntu's package for this module. Or maybe I'm missing something else here. At any rate, at least I know what's going on.

You can see here for more information:
http://mysql-python.sourceforge.net/FAQ.html#my-data-disappeared-or-won-t-go-away

Edit:
Okay, so apparently what happened is that in my other code where I was modifying MySQL databases, the tables used the default engine (MyISAM), which isn't transactional. The database I was having problems with was using the InnoDB engine, which is a transactional storage engine. This explains why I just now saw this issue.

More information here:
http://stackoverflow.com/questions/1617637/pythons-mysqldb-not-getting-updated-row

As well as an alternate solution: instead of committing after every transaction, I can turn autocommit on when I'm working with databases using the InnoDB engine, so they'll function just like the rest:

cursor.connection.autocommit(True)
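
If you want to watch the "my changes aren't showing up" behavior without setting up a MySQL server, here's a rough sketch using sqlite3 from the standard library as a stand-in (Python 3 syntax). To be clear, this is not MySQLdb or InnoDB; it's just another transactional database with the same commit() idea, and the notes table and the function names are invented for the example. It assumes sqlite3's default settings, where an INSERT quietly opens a transaction.

```python
import os
import sqlite3
import tempfile


def row_count(conn):
    """Return how many rows this connection can currently see."""
    return conn.execute("SELECT COUNT(*) FROM notes").fetchall()[0][0]


def demo():
    """Return the reader's row count before and after the writer commits."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        writer = sqlite3.connect(path)
        reader = sqlite3.connect(path)
        try:
            writer.execute("CREATE TABLE notes (body TEXT)")
            writer.commit()
            # The INSERT opens a transaction that stays open until commit().
            writer.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
            before = row_count(reader)  # second connection, nothing committed yet
            writer.commit()
            after = row_count(reader)  # same query, now the row is visible
        finally:
            writer.close()
            reader.close()
    return before, after


if __name__ == "__main__":
    print(demo())
```

demo() returns (0, 1): the second connection sees no rows until the first one commits, which is the same thing I was running into from the mysql command line.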

Thursday, February 17, 2011

Wednesday, October 27, 2010

Why Python?

It's occurred to me I haven't posted much on Python, which has been my preferred computer language for the last couple of years. Nothing else compares: it's high-level, object-oriented, actually encourages easily-readable code, and doesn't get in my way as a programmer.

Granted, I'm not an expert in the field, but every time I have to program in other high-level languages such as PHP, Java, bash, or (in a worst-case scenario) Perl, I realize how much Python has spoiled me.

Here's a link to an amazing article that got me to consider Python (and I would imagine it's probably done the same for many others):

http://www.linuxjournal.com/article/3882

Enjoy!

Thursday, July 15, 2010

Genie/Vala

You wouldn't know this from reading my blog, but I'm a Python fanboy. Maybe I haven't blogged anything about Python because it's already so completely awesome that there isn't much to blog about. Well, anyway, today I stumbled upon an interesting programming language called Genie:

http://live.gnome.org/Genie
http://en.wikipedia.org/wiki/Genie_%28programming_language%29

Genie is pretty intriguing for many reasons. It's a high-level language that uses the Vala compiler. What is Vala, you might ask? Vala is a high-level language that can be used to write cross-platform code. Its syntax is similar to C#, but unlike C#, Python, Java, or many other modern high-level languages, it doesn't need a runtime environment/VM to run, because it compiles to C, which then of course compiles to machine code. And because it compiles to C, it can be used to write cross-platform applications on anything that supports GLib, which at the very least covers the big 3 (Windows, OS X, GNU/Linux).

So Vala itself is pretty cool. What makes Genie even cooler is that it has more of a Python-like syntax, which, of course, is one of the coolest things about Python (for those of us that happen to like Python). And Genie and Vala can be used alongside each other.

How's this for a Hello World?:

init
	print "Hello, world!"


Granted, not quite as basic as a Python Hello World:

print "Hello, world!"


but basic enough. It's sure nice on the eyes without all those curly braces and semicolons at any rate.

So... if at any point I am developing an application and want the benefits of a compiled language that doesn't rely on a VM, I'll definitely be checking out Genie. And for those of you who aren't too fond of Genie's syntactical style, I'd recommend checking out Vala.

If nothing else, it sure beats trudging through C/C++. Yeah, they're not bad languages, but it seems to me with the way technology is continually advancing, it's ridiculous that C/C++ are still sometimes the best options for software development, especially when it comes to performance. Who knows, maybe Genie/Vala are the next C++. Then again, maybe not, but they're definitely worth looking into.