tag:blogger.com,1999:blog-201426682024-03-13T23:06:51.506-07:00$0.02 poorercludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.comBlogger9125tag:blogger.com,1999:blog-20142668.post-1146996067608275672006-05-07T02:45:00.000-07:002007-09-20T00:39:32.973-07:00Using MySql 5.0 stored procedures with MySQLdbUsing <a href='http://mysql.com/'>MySql 5.0</a> stored procedures with <a href='http://sourceforge.net/projects/mysql-python'>MySQLdb</a><br /><br />When I wanted to try out MySQL 5.0 stored procedures I didn't find too much on the web. This was a few months ago and there may be some better tutorials out there now but I figured I would share some of my tricks.<br /><br />Why use stored procedures in the first place? While stored procedures give a you a performance boost (I have not benchmarked so I can't say how much) the argument I usually use when making the case for using stored procedures is abstraction. One can make changes to a stored procedure and not have to touch the calling code just as long as you don't break the interface. There have been many instances where I have avoided a code change - QA cycle - rollout just by adding an additional field to a select statement in a stored procedure.<br /><br />Step 0: Install and configure MySql 5.0 <br />MySql is at version 5.0.21 at the time of this writing. Installing MySql can be tricky, and going into all the detail of installing MySql is beyond the scope of this article, but here are some tips... <br /><br />I usually install from source. The key is to follow the directions provided in the install notes and make sure that the directory the MySql installs to is owned by "mysql:mysql". Once the permissions are correct and the database has been installed, mysql usually fires right up.<br /><br />Step 1: Installing MySQLdb.<br />Get the latest version of MySQLdb (1.2.1_p2 when I wrote this) and install...<br />$> tar zxfv MySQL-python-1.2.1_p2.tar.gz && cd MySQL-python-1.2.1_p2/<br />#> python setup.py install<br /><br />Step 2: Create some tables and insert some data<br /><code><pre><br />mysql> use test;<br />mysql> create table products (<br /> productid int unsigned not null primary key auto_increment,<br /> categoryid int unsigned not null,<br /> productname varchar(80) not null,<br /> description varchar(255) not null,<br /> statuscd tinyint(1) unsigned not null,<br /> createdt datetime not null,<br /> index cat_prod_idx <br /> (categoryid, productid, productname, description, statuscd)<br /> ) type=innodb;<br /><br />mysql> create table categories (<br /> categoryid int unsigned not null primary key auto_increment,<br /> category varchar(80) not null,<br /> description varchar(255) not null,<br /> statuscd tinyint(1) unsigned not null,<br /> createdt datetime not null<br /> ) type=innodb;<br /><br />mysql> insert into categories (category, description, statuscd, createdt) <br />values ('cool category', 'these are all cool products', 1, now());<br /><br />mysql> insert into categories (category, description, statuscd, createdt) <br />values ('lame category', 'these are the not so cool products', 1, now());<br /><br />mysql> insert into products (categoryid, productname, description, statuscd, createdt) <br />values (1, 'slackware linux', 'this is my distro of choice.', 1, now());<br /><br />mysql> insert into products (categoryid, productname, description, statuscd, createdt) <br />values (1, 'MicroSoft Windows', 'see category for description.', 2, now());<br /></pre><br /></code><br /><br />Step 3: Build the 4 basic types of queries (select, insert, update and delete) <br /><br />Select stored procedures are pretty simple the things to note are:<br />1. It is good to only bring back fields that you need 'select *'s' are evil<br />2. I prefixed the input parameter with an underscore so that it is easily identified by MySql and the next person that has to read this sp.<br /><br /><code><pre><br />mysql> delimiter //<br />mysql> create procedure usp_get_products_by_category<br /> (<br /> _categoryid int unsigned<br /> )<br /> begin<br /> select<br /> p.productid,<br /> p.productname,<br /> p.description,<br /> c.category,<br /> c.description<br /> from <br /> products p<br /> inner join categories c on p.categoryid = c.categoryid<br /> where<br /> p.categoryid = _categoryid<br /> and p.statuscd = 1<br /> and c.statuscd = 1;<br /> end;<br /> //<br />mysql> delimiter ;<br /></pre><br /></code><br /><br />When writing an insert sp you always want to return the id of the record you just created hence the "select LAST_INSERT_ID()". This means that you will get a one record result set back after you call the insert sp and the first and only field in the the record will be the id.<br /><br /><code><pre><br />mysql> delimiter //<br />mysql> create procedure usp_ins_product<br /> (<br /> _categoryid int unsigned,<br /> _productname varchar(80),<br /> _description varchar(255),<br /> _statuscd tinyint(1) unsigned<br /> )<br /> begin<br /> insert into products<br /> (categoryid, productname, description, statuscd, createdt)<br /> values<br /> (_categoryid, _productname, _description, _statuscd, now());<br /> select LAST_INSERT_ID();<br /> end;<br /> //<br />mysql> delimiter ;<br /></pre><br /></code><br /><br />When I write my update and delete stored procedures I usually let MySql do it's thing. I always check the number of rows updated through the MySQLdb cursor object:<br />"count = curs.execute(stmt)"<br /><br /><code><pre><br />mysql> delimiter //<br />mysql> create procedure usp_upd_productstatus<br /> (<br /> _productid int unsigned,<br /> _statuscd tinyint(1) unsigned<br /> )<br /> begin<br /> update products<br /> set statuscd = _statuscd<br /> where<br /> productid = _productid;<br /> end;<br /> //<br />mysql> delimiter ;<br /><br />mysql> delimiter //<br />mysql> create procedure usp_del_product<br /> (<br /> _productid int unsigned<br /> )<br /> begin<br /> delete from products where productid = _productid;<br /> end;<br /> //<br />mysql> delimiter ;<br /></pre><br /></code><br /><br />Step 4: Testing it all out! <br /><br />Fire up a python interpreter and ...<br /><code><pre><br />>>> import MySQLdb<br />>>> from MySQLdb.constants import CLIENT<br />>>> cnn = MySQLdb.connect(host='localhost', user='USER', passwd='PASS', db='test', client_flag=CLIENT.MULTI_STATEMENTS)<br />>>> cnn.autocommit(1)<br />>>> curs = cnn.cursor()<br />>>> count = curs.execute('call usp_upd_productstatus(%d, %d)' % (1,0)) # update<br />>>> count = curs.execute('call usp_get_products_by_category(%d)' % (2)) # select<br />>>> curs.fetchall()<br />>>> curs.execute("call usp_ins_product(1, 'FreeBSD', 'Another great OS', 1)")<br />>>> curs.fetchall() # this returns the id of the new 'FreeBSD' record<br />>>> count = curs.execute('call usp_del_product(%d)' % (2)) # delete <br />>>> curs.close()<br />>>> cnn.close()<br /></pre><br /></code><br />The INNODB storage engine supports row level locking which means writes won't tie up the entire table. Because I created the tables using the 'type=innodb' I had to set the autocommit flag to true in the connection otherwise my queries won't get commited.cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com0tag:blogger.com,1999:blog-20142668.post-1141067341068786252006-02-27T10:47:00.000-08:002006-02-27T11:09:01.093-08:00URL to DSL: Turning JobJitsu's URLs into a job query language.Useful URL's (KIHF)<br />URL to <a href="http://en.wikipedia.org/wiki/Domain-specific_language">DSL</a>: Turning <a href="http://jobjitsu.com">JobJitsu's</a> URLs into a job query language.<br /><br />One of the goals behind <a href="http://jobjitsu.com">JobJitsu</a> is to uphold the KISS (Keep It Simple Stupid) philosophy and where ever possible uphold the KIHF (Keep It Hacker Friendly) philosophy. That is especially true for URL's. One thing I really like about del.icio.us is their <a href="http://del.icio.us/help/navigation">site navigation</a>. It is pretty simple, and at the same time very expressive/powerful. That being said we decided to make our URL's useful for searching and navigation.<br /><br />For example:<br />When searching for python software engineering jobs in San Diego, one would use a url like... <br /> <code>http://jobjitsu.com/jobs/sandiego/sw/python</code><br />The structure of the url is:<br /> <code>http://jobjitsu.com/jobs/locations/categories/tags</code><br /><br />where<br /> locations -> comma separated list of locations<br /> categories - > comma separated list of categories<br /> tags -> list of tags and <i>operators</i> <br /> operators -> ,+!<br /> , := or<br /> + := and<br /> ! := not<br /><br />A more complex example:<br />Show me all the jobs in San Diego or Los Angeles that are in the software engineering or quality assurance categories having the tags python and lisp and linux but not microsoft...<br /> <code>http://jobjitsu.com/jobs/sandiego,losangeles/sw,qa/+python+lisp+linux!microsoft</code><br /><br />But wait there is more...<br />By adding the feed sub-domain to your url you'll get a jobs <a href="http://en.wikipedia.org/wiki/Atom_(standard)">atom</a> feed:<br /> <code>http://jobjitsu.com/jobs/sandiego/sw/python</code><br />gives you something that looks good in a browser, and <br /> <code>http://feed.jobjitsu.com/jobs/sandiego/sw/python</code><br />will give you something that looks good in your feed reader. This allows you to easily syndicate your job searches. So if you always want to see the latest lisp jobs you can now embed:<br /> <code>http://feed.jobjitsu.com/jobs/all/all/lisp</code><br />into your google/ig homepage.<br /><br />Now for the fun stuff... how do we make it work?<br />We first built a url library to handle most of the low level url parsing which allows us to build a parse tree of operators and tokens. we then created the sql-generation code that allowed us to translate this parse tree into sql statements and finally we wrapped up the whole thing w/ some query caching. Disclaimer: this is all still first pass and is subject to bugs and change but I figured I would post it cause it is kind of neat.<br /><br />The Sql-Generation is not included cause it is quite hackish (even more so then the rest of my code) and will need to be improved...<br /><br />Job handler...<br /><code><br /><pre><br />from ats.cache import memcache_mgr<br />from ats.db import dao<br />from ats.log import logger<br />from ats.util import converter<br />from ats.util import url as url_lib<br />from common.model import job as job_model <br />from common.model import sqlgen<br />from common.ui import navigation<br />from home.view.jobs import jobs as jobtmpl <br />from home.view.sidebar import sidebar<br />from mod_python import apache<br /><br /># ------------------------------------------------------------------<br /># globals <br /># ------------------------------------------------------------------<br />cache = memcache_mgr.get_cache()<br />log = logger.get_logger()<br /><br /># ------------------------------------------------------------------<br /># handler <br /># ------------------------------------------------------------------<br />def handler(req, sesh):<br /> """ This function will return all the jobs based on the query string<br /> """<br /> global log<br /> jobs, start, pagesize, error = [], 0, 0, 0<br /> try:<br /> jobs, start, pagesize = get_jobs(req.uri)<br /> jobs = sort_jobs(jobs)<br /> if jobs and start > 0:<br /> jobs = jobs[start-1:pagesize]<br /> except Exception, e:<br /> log.exception(e)<br /> error = 1<br /><br /> if error:<br /> # do some error handling here<br /> pass<br /><br /> return jobtmpl(searchList=[{<br /> "site_nav": navigation.get_site_nav('jobs'),<br /> "sesh": sesh,<br /> "jobs": jobs,<br /> "start": start,<br /> "pagesize": pagesize,<br /> "sidebar": <br /> sidebar(searchList=[{"sesh":sesh,"metros": job_model.get_metros()}])}]) <br /><br /># ------------------------------------------------------------------<br /># get jobs from cache or db<br /># ------------------------------------------------------------------<br />def get_jobs(url):<br /> """ Check the cache to see if this query has been run recently<br /> if so then return those results. <br /> On Cache miss we construct the query and go to the database<br /> """<br /> global cache<br /> if not url: return [], 0, 0 <br /> urlkey = 'jj:jobs:%s' % url<br /> segments = url_lib.unpack(<br /> url, <br /> ['locations', 'categories', 'tags', 'start', 'pagesize'],<br /> keep_blank_values=0) <br /> start = converter.toint(segments.get('start', 1))<br /> pagesize = converter.toint(segments.get('pagesize', 50))<br /> <br /> # check cache first<br /> jobs = cache.get(urlkey)<br /> if jobs:<br /> return jobs, start, pagesize<br /><br /> # on cache miss parse query string and run query<br /> locations = filter( # split on ',' + filter empties and 'all'<br /> lambda x: x and x != 'all',<br /> segments.get('locations', '').lower().split(','))<br /> categories = filter( # split on ',' + filter empties and 'all'<br /> lambda x: x and x != 'all',<br /> segments.get('categories', '').lower().split(','))<br /> tags = url_lib.process(segments.get('tags', 'all').lower())<br /><br /> # run query<br /> ct, jobs = dao.get_records(<br /> 'jobjitsu', <br /> construct_sql(locations, categories, tags))<br /><br /> # cache for 5 min<br /> if ct > 0:<br /> cache.set(urlkey, jobs, 5)<br /> else:<br /> jobs, start, pagesize = [], 0, 0<br /><br /> return jobs, start, pagesize<br /><br /># ------------------------------------------------------------------<br /># translate query string to job query <br /># ------------------------------------------------------------------<br />def construct_sql(locations, categories, tags):<br /> """ build sql stmt from sections of query string of the form:<br /> /jobs/locations/categories/tags/start/pagesize<br /> returns (stmt, hash(url), start, pagesize)<br /> """ <br /> global log<br /><br /> # construct the query<br /> sql = []<br /> if locations:<br /> sql.append(sqlgen.job_loc_or(locations))<br /> if categories:<br /> sql.append(sqlgen.job_cat_or(categories))<br /> if tags and tags[0]:<br /> sql.append(sqlgen.job_tags_and(tags[0]))<br /> if tags and tags[1]:<br /> sql.append(sqlgen.job_tags_or(tags[1]))<br /> if tags and tags[2]:<br /> sql.append(sqlgen.job_tags_not(tags[2]))<br /> stmt = "%s%s;" % (" select j.* from jb_jobs j where "," \nand ".join(sql))<br /> return stmt<br /><br /># ------------------------------------------------------------------<br /># utility <br /># ------------------------------------------------------------------<br />def sort_jobs(jobs):<br /> if not jobs: return [] <br /> sorted = jobs[:]<br /> sorted.sort(lambda x,y: cmp(y.createdt, x.createdt))<br /> return sorted <br /></pre><br /></code><br /><br /><br />Low Level URI handling...<br /><code><br /><pre><br />from ats.db import dao<br /><br />escape_char = "'"<br />op_and = '+'<br />op_or = ','<br />op_not = '!'<br />operators = (op_and, op_or, op_not) <br /><br />def unpack(uri, keys, keep_blank_values=0):<br /> """ This function will take a req.uri and a list and return a dict<br /> of key->val pairs where the keys are taken from the list and <br /> values are taken from the req.uri.split('/')<br /> """<br /> d = {}<br /> if keep_blank_values:<br /> for key in keys:<br /> d[key] = None<br /> if not uri or not keys:<br /> return d<br /> try:<br /> num_keys = len(keys) <br /> tokens = uri.split('/')[2:]<br /> for i in xrange(len(tokens)):<br /> if i < num_keys and tokens[i]:<br /> d[keys[i]] = tokens[i]<br /> return d<br /> except: <br /> return d<br /><br />def process(str):<br /> global log<br /> if not str or str == 'all': return None<br /> try: <br /> return simplify_qry_expr(classify_qry_tokens(tokenize(str)))<br /> except Exception, e:<br /> log.exception(e) <br /> return None<br /> <br />def tokenize(str):<br /> """ takes a string like 'red,green+blue' and returns a list <br /> in [OPER1, TOKEN1, OPER2, TOKEN2, ...] formant e.g.: <br /> [',', 'red', ',', 'green', '+', 'blue']<br /> this list can then be later operated on and transformed into<br /> a sql query by examining the operators and tokens<br /> """<br /> global operators, op_or, escape_char<br /> tokens = []<br /> word = [] <br /> i = 0<br /> while i < len(str):<br />#########################################################<br />### save this section for when we impl group'd exprs ###<br />#########################################################<br /># if str[i] == '(':<br /># jmp = match_brace(str[i:]) <br /># tokens.append(tokenize(str[i+1:i+jmp]))<br /># i += jmp+1<br /># continue<br /># if str[i] == ')':<br /># if word: <br /># tokens.append("".join(word))<br /># if tokens and tokens[0] not in operators:<br /># return [op_or] + tokens<br /># return tokens<br />#########################################################<br /> if str[i] == escape_char:<br /> i+=1<br /> word.append(str[i]) <br /> i+=1<br /> continue<br /> elif str[i] in operators:<br /> if word: <br /> tokens.append(dao.escape("".join(word)))<br /> word = []<br /> tokens.append(str[i])<br /> else:<br /> word.append(str[i])<br /> i += 1<br /> if word: <br /> tokens.append(dao.escape("".join(word)))<br /> if tokens and tokens[0] not in operators:<br /> return [op_or] + tokens<br /> return tokens<br /><br />def match_brace(str):<br /> """ Given a string '(hello (not) cool) world' will return 17<br /> useful for (grouped expressions) -- coming soon<br /> """<br /> brc_count = 1<br /> length = len(str)<br /> for i in xrange(length-1):<br /> if str[i+1] == '(':<br /> brc_count += 1<br /> elif str[i+1] == ')':<br /> brc_count -= 1<br /> if brc_count == 0:<br /> return i + 1<br /> if str[length-1] == ')':<br /> return length-1<br /> return 0<br /><br />def classify_qry_tokens(tokens):<br /> """ Takes a flat list of the form op, word, op, word, ...<br /> returns a tuple of lists representing AND-tokens, OR-tokens, NOT-tokens<br /> """<br /> global operators, op_and, op_or, op_not<br /> _and, _or, _not = [], [], []<br /> coppied = tokens[:]<br /> while coppied:<br /> token = coppied.pop()<br /> oper = coppied.pop()<br /> if isinstance(token, list):<br /> token = classify(token)<br /> if oper == op_and: _and.append(token)<br /> elif oper == op_or: _or.append(token)<br /> elif oper == op_not: _not.append(token)<br /> return _and, _or, _not <br /><br />def simplify_qry_expr(expr):<br /> """ OR is lowest level operator <br /> it only matters if AND and NOT aren't present.<br /> * AND + OR reduces to just AND..<br /> * NOT + OR reduces to just NOT...<br /> """<br /> if expr[0] or expr[2]:<br /> _or = []<br /> else:<br /> _or = expr[1]<br /> return expr[0], _or, expr[2] <br /><br />if __name__ == "__main__":<br /> print "keep_blank_values=0"<br /> print """/cool/666/42 - ['evilid', 'goodid']"""<br /> d = unpack("/cool/666/42", ['evilid', 'goodid'])<br /> for k,v in d.items(): print k, v<br /> <br /> print """/cool/666/42 - ['evilid']"""<br /> d = unpack("/cool/666/42", ['evilid'])<br /> for k,v in d.items(): print k, v<br /> <br /> print """/cool/666 - ['evilid', 'goodid']"""<br /> d = unpack("/cool/666", ['evilid', 'goodid'])<br /> for k,v in d.items(): print k, v<br /><br /> print "\nkeep_blank_values=1"<br /> print """/cool/666/42 - ['evilid', 'goodid']"""<br /> d = unpack("/cool/666/42", ['evilid', 'goodid'], 1)<br /> for k,v in d.items(): print k, v<br /> <br /> print """/cool/666/42 - ['evilid']"""<br /> d = unpack("/cool/666/42", ['evilid'], 1)<br /> for k,v in d.items(): print k, v<br /> <br /> print """/cool/666 - ['evilid', 'goodid']"""<br /> d = unpack("/cool/666", ['evilid', 'goodid'], 1)<br /> for k,v in d.items(): print k, v<br /></pre><br /></code>cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com0tag:blogger.com,1999:blog-20142668.post-1140212373175475562006-02-17T12:56:00.000-08:002007-09-20T00:41:19.335-07:00The Technology behind JobJitsu.comJobJitsu Technology Overview:<br />So now a little bit about the technology behind the job board...<br /><br /><a href="http://jobjitsu.com">JobJitsu</a> is more of a home grown web app. we have some general purpose modules that handle most of the common web application facilities:<ul><br /> <li>cache -- a wrapper around the <a href="ftp://ftp.tummy.com/pub/python-memcached/">memcache client</a> provided by <a href="http://tummy.com/">tummy.com</a><br /> <li>conf -- where we keep common configuration settings<br /> <li>db -- a data access module that manages connections and uses Greg Stein's <a href="http://www.lyra.org/greg/python/dtuple.py">DTuple.py</a><br /> <li>dispatch -- <a href="http://2centspoorer.blogspot.com/2006/01/creating-your-own-modpython-request.html">this guy</a> handles the url to resource mapping, sessions and a few other things.<br /> <li>log -- wrapper around python logging module<br /> <li>mail -- mail wrapper <br /> <li>re -- common precompiled regex's (email, url and etc.)<br /> <li>sesh -- our session module<br /> <li>util -- utility stuffs like (string manipulation, url parsing and etc.) <br /></ul><br />The presentation is handled by <a href="http://cheetahtemplate.org">Cheetah</a>. Cheetah has nice syntax (no-xml), flexibility, and performance (precompiled templates). One other feature that I find hard to live without is the template inheritance. Cheetah allows you to define a base page which other pages can extend (good for header + sidebar + footer). <br /><br />We are using <a href="http://www.mysql.com/">MySQL</a> 5.0 (which supports stored procedures) on top of the innodb storage engine for persistence.<br /><br />Finally what modern web app would be complete without some cool AJAX? <a href="http://blog.mmrichards.com/">Matt</a> has some neat <a href="http://undefined.org/python/#simple_json">JSON</a> stuffs that he is working on.<br /><br />Next post will be an in depth look at how we manage our data including how we defined our data model, a quick discussion on stored procedures, indexing strategies and how we actually pass data back and forth between Python and the MySQLcludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com1tag:blogger.com,1999:blog-20142668.post-1140116490035269652006-02-16T10:57:00.000-08:002007-09-20T00:40:56.598-07:00AmbitionIt's all about Ambition<br /><br />The real challenge of starting a job board is not the technology behind it, but getting employers to post jobs and job seekers to search it. There is a critical mass that needs to happen on both sides before the job board becomes relevant. In the past I believe that this was accomplished with massive advertising campaigns, <a href="http://blog.mmrichards.com/">Matt</a> and I discussed the idea of running a 30 second commercial during the superbowl but decided to save that for next year and take a more grass roots style approach instead. <br /><br />Getting employers to post shouldn't be too hard... make it free to post for the first 6 months, when other job boards are charging hundreds of dollars a posting free isn't so bad. <br /><br />Getting job seekers to use it will be a bit harder (enter the shameless self promotion you are now reading). A good place to start will be by documenting the project in my blog. I figure the worst case scenario is that in the future someone may find a shred of something useful here, while the best case scenario would be that I might be able to generate some *interest*.<br /><br />I am convinced that if you give 2 good hackers 2 months with nothing else to do and they can build just about anything, we on the other hand have day jobs so this is going to be a tight deadline for us. <br /><br />That being said, my philosophy on making things happen is this... <br /> 1) If you have half a brain.<br /> 2) And you are completely honest with yourself and others.<br /> 3) And you are not afraid of a little hard work.<br /> <br />Then the only thing stopping you from achieving whatever it is that you set out to do is your own ambition.<br /><br />The name of the site is going to be <a href="http://jobjitsu.com/">JobJitsu.com</a>.<br />Why did we choose <a href="http://jobjitsu.com/">JobJitsu.com</a>? <br />Simple cause JobNinja was already taken.<br /><br />Anyway enough Tony Robins speak, tomorrow's entry is all bout tech.cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com4tag:blogger.com,1999:blog-20142668.post-1140024424175785512006-02-15T09:17:00.000-08:002007-09-20T00:40:56.598-07:002 hackers2 hackers, 2 months, and an idea...<br /><br />Build a better job board. I was talking to my friend <a href="http://blog.mmrichards.com/">Matt</a> and about the idea of building a better job board. It didn't take much to convince each other that we could do much better then what is currently out there. <br /><br />Monster, Dice, HotJobs, Jobing (spelling?) et al have all gotten fat dumb and happy - Internet Explorer style (oh yes I did), the one thing they all have in common is how hard they suck. It's like there has been a feature freeze since 2002, moreover these job boards have all been overrun by staffing companies.<br /><br />I am going to assert that coming up with better technology then what the current job boards have to offer won't be a problem. So what is it that we are going to do better? Good question, the answer is a closely guarded secret, but you should have some idea by 4/1/06 (our first launch date). Over the next 6-8 weeks I will blog about our methods and progress. <br /><br />What makes us so sure that we can pull this off? <br />We have a <a href="http://python.org/">secret weapon</a>. <br /><br />More on that tomorrow.cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com3tag:blogger.com,1999:blog-20142668.post-1136874337646290852006-01-09T21:50:00.000-08:002007-09-20T00:39:32.974-07:00Creating your own mod_python request dispatcherCreating your own mod_python request dispatcher that maps url's to python request handlers.<br /><br />Conceptually I have something that looks like this:<br /><pre><br /> +-------------+<br /> | Clients |<br /> | Web Browser |<br /> +-------------+<br /> | [http://somecooldomain.com/coorequest]<br /> | ^ <br /> | | [Cool Response] <br /> V |<br /> +--------+ +------------+ +---------------+ +---------+<br /> | Apache | --> | Mod_Python | --> | dispatcher.py | --> | cool.py | --+<br /> | (A) | <-- | | <-- | (B) | | (D) | |<br /> +--------+ +------------+ +---------------+ +---------+ | <br /> | ^ |<br /> | | |<br /> | +------------------------+<br /> V<br /> +-------------+<br /> | urlmap.conf |<br /> | (C) |<br /> +-------------+<br /></pre><br /><br />A) Apache Configuration:<br />On my server I have mod_python set up to handle any request that comes in without a file extension. That way I can have those cool REST style URI's. I use the apache "FilesMatch" directive in my httpd.conf file to look for any request that does not contain a '.' before the '?' in the query string:<pre><br /><code><br /> <FilesMatch "(^[^\.]*$|^[^\?]*[\?]+[^$]+$)"><br /> SetHandler python-program<br /> PythonHandler common.dispatch.dispatcher<br /> PythonDebug On<br /> </FilesMatch><br /></code><br /></pre><br />B) dispatcher.py<br />The source for dispatcher.py can be found <a href="http://socallocal.com/blog/dispatcher.py">here</a>.<br /><br />C) Example urlmap.conf<br /><pre><br /><code><br />[somecooldomain.com]<br />/=cooldomain.handlers.home.handler<br />/signup=cooldomain.handlers.member.signup<br />/login=cooldomain.handlers.member.login<br />/logout=cooldomain.handlers.member.logout<br />/cool=cooldomain.handlers.cool.handler<br /></code><br /></pre><br /><br />D) cool.py<br />This is the handler that generates your content. In my content handlers I usually do things like query the database, process business logic, select a <a href="http://www.cheetahtemplate.org/">cheetah</a> template and return the result as html.<br /><br />In summary I use this approch for a couple of reasons, first it allows me to decouple my url's from my python code that way I don't have python code + html sitting around in the same directory. Secondly I get one piece of code that handles every request. (good for sessions and things like that)cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com6tag:blogger.com,1999:blog-20142668.post-1136576688180839592006-01-06T11:26:00.000-08:002007-09-20T00:39:32.974-07:00urlencoder/decoder for pythonEDIT (I so stand corrected.):<br /><br />I could have just used:<br /> urllib.quote<br /> urllib.unquote<br /><br />---------------------------------------------------<br /><br />Every once and a while I run into the need for a urlencoder/decoder function that just accepts a string and returns a string. <br /><br />The one that comes w/ urllib(2) takes a dictionary and returns a "name=val" formatted string.<br />My only problem is that it does 2 things at once:<br />1) url encodes.<br />2) constructs a query or post string.<br /><br />So I wrote a quick utility to do just #1:<br /><pre><br /><code><br />_keys = [<br /> "$", "&", "+", ",", "/", ":", ";", "=", "?", "@", " ", '"',<br /> "<", ">", "#", "%", "{", "}", "|", "\\", "^", "~", "[", "]", "`"]<br /><br />_vals = [<br /> '%24', '%26', '%2B', '%2C', '%2F', '%3A', '%3B', '%3D', '%3F',<br /> '%40', '%20', '%22', '%3C', '%3E', '%23', '%25', '%7B', '%7D',<br /> '%7C', '%5C', '%5E', '%7E', '%5B', '%5D', '%60']<br /><br />def encode(str=""):<br /> """ URL Encodes a string with out side effects<br /> """<br /> return "".join([_swap(x) for x in str])<br /><br />def decode(str=""):<br /> """ Takes a URL encoded string and decodes it with out side effects<br /> """<br /> if not str: return None<br /> for v in _vals:<br /> if v in str: str = str.replace(v, _keys[_vals.index(v)])<br /> return str<br /><br />def _swap(x):<br /> """ Helper function for encode.<br /> """<br /> if x in _keys: return _vals[_keys.index(x)]<br /> return x<br /><br />### units ###<br />if __name__ == "__main__":<br /> assert("".join(_keys) == decode(encode("".join(_keys))))<br /> assert("".join(_vals) == encode(decode("".join(_vals))))<br /> print "passed all unit tests."<br /></code><br /></pre>cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com5tag:blogger.com,1999:blog-20142668.post-1135625345232373012005-12-26T11:21:00.000-08:002007-09-20T00:39:32.974-07:00DTuple for O/R mappingWhy use <a href="http://rubyforge.org/projects/activerecord/">ActiveRecord</a> or <a href="http://www.sqlobject.org/">SQLObject</a> when there is <a href="http://www.lyra.org/greg/python/dtuple.py">DTuple</a>?<br /><br />Both of these systems attempt to hide sql through an object oriented abstraction. (I thought that this was something that only the J2EE guys tried to do.) Long story short it won't perform nearly as well as hand coded SQL and it may be trying to solve one problem by introducing another. Why learn another DSL when SQL works so well?<br /><br />It's not the mapping that I have issue with it's the fact that sql is code generated behind the scenes as if it were not meant for human consumption. Don't get me wrong I am a huge fan of O/R mapping and I am a huge fan of abstraction, however I think that attempting to talk to a database while speaking objects actually makes it harder to express what you actually want the database to do. The other disadvantage that this approach has is that it may make your code hard to debug.<br /><br />What am I suggesting? Well for Ruby I am not quite sure but for python there is <a href="http://www.lyra.org/greg/python/dtuple.py">DTuple -- written by Greg Stein</a>. I am not sure why DTuple gets no ink but it should. First off it has been around 4-ever since 1996 I think, and secondly it roks! It examines the result of your query and returns a collection of typed objects. This is done dynamically and preforms quite nicely. You basically get O/R Mapping for free and don't have to learn any object query syntax.<br /><br />Example Data Access object:<br />(caution this code is almost 2 years old and may need a little tweaking with the latest version of MySQLdb)<br /><br />Pay special attention to the get_records where DTuple is used...<br /><code><pre><br />import dtuple<br />import MySQLdb<br />from common.util import config<br /><br />def _connect():<br /> """ Login to the database return the connection<br /> """<br /> conf = config.get_config()<br /> return MySQLdb.connect(<br /> host=conf.get('db', 'host'),<br /> user=conf.get('db', 'user'),<br /> passwd=conf.get('db', 'passwd'),<br /> db=conf.get('db','database'))<br /><br />def _get_conn(relogin = 0):<br /> """ Returns a persistent connection to the database<br /> attempts to reconnect if the connection is lost.<br /> """<br /> global CONN<br /><br /> if relogin:<br /> CONN = _connect()<br /> return CONN<br /> else:<br /> try:<br /> conn = CONN<br /> return conn<br /> except NameError:<br /> CONN = _connect()<br /> return CONN<br /><br />def _exec_stmt(stmt):<br /> """ Runs SQL statement and returns (count, curs)<br /> """<br /> curs = None<br /> try:<br /> curs = _get_conn().cursor()<br /> count = curs.execute(stmt)<br /> except MySQLdb.OperationalError, e:<br /> curs = _get_conn(relogin=1).cursor()<br /> count = curs.execute(stmt)<br /> return count, curs<br /><br />def do_test(stmt):<br /> count, curs = _exec_stmt(stmt)<br /> try: return curs.fetchall()<br /> finally: curs.close()<br /><br />def get_first_record(stmt=None):<br /> """ Runs select statement and returns 1st record<br /> """<br /> rec = get_records(stmt, 0, 1)<br /> if rec[0] == 1: return rec[1][0]<br /> return None<br /><br />def get_records(stmt=None, start=0, max=500):<br /> """ Runs select statement and returns a list of the form:<br /> (record count, result set)<br /> """<br /> if stmt == None:<br /> return None<br /> limit_stmt = "%s limit %d, %d" % (stmt, start, max)<br /> count, curs = _exec_stmt(limit_stmt)<br /> try:<br /> return count, map(<br /> lambda x:<br /> dtuple.DatabaseTuple(<br /> dtuple.TupleDescriptor(curs.description),<br /> x),<br /> curs.fetchall())<br /> finally: curs.close()<br /><br />def do_insert(stmt=None):<br /> """ Runs insert statement and returns the id of the newly<br /> inserted record or -1 on error.<br /> """<br /> if stmt == None:<br /> return -1<br /> count, curs = _exec_stmt(stmt)<br /> if curs == None:<br /> return -1<br /> try: return curs.insert_id()<br /> finally: curs.close()<br /><br />def do_update(stmt=None):<br /> """ Runs update statement and returns the number of records<br /> updated or -1 on failure.<br /> """<br /> if stmt == None:<br /> return -1<br /> count, curs = _exec_stmt(stmt)<br /> curs.close()<br /> return count<br /><br />def do_delete(stmt=None):<br /> """ Runs delete statement and returns the number of records<br /> deleted or -1 on failure.<br /> """<br /> return do_update(stmt) # same impl as update<br /><br />def escape_string(str=""):<br /> return MySQLdb.escape_string(str)<br /></pre><br /></code>cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com1tag:blogger.com,1999:blog-20142668.post-1135376437967581182005-12-23T13:58:00.000-08:002007-09-20T00:40:56.598-07:00It all starts here"<a href="http://www.geocities.com/SunsetStrip/Alley/9797/" title="holdstrong">It all starts here</a>" is the title of a classic wmhc song and also makes a sweet title for my fist post.<br /><br />The purpose of this blog is to allow me to document my adventures in both hacking and starting my own company, and who knows one day I may even post something useful to someone interested in the same stuffs.<br /><br />I am a bit of a language nerd so there is a good chance you may see some ramblings on <a href="http://www.blogger.com/python.org">Python</a> and <a href="http://planet.lisp.org/">lisp</a>. <br /><br />In the past I have worked at 2 <a href="http://www.paulgraham.com/">startups</a>, next up I plan on starting my own. My goal is to document that process here and hopefully help others interested in doing the same.<br /><br />Worst case: I get to contribute my $0.02 -- hence the name.cludwinhttp://www.blogger.com/profile/14797643674981332072noreply@blogger.com0