Go Back   CORTEX Forums > Best Practices > Tips and Techniques > Data Warehousing Tips and Techniques > DWH Tip Feeds
Register Blogs FAQ Members List Calendar Search Today's Posts Mark Forums Read

MySQL - the best stored routine is the one you don't write

This is a discussion on MySQL - the best stored routine is the one you don't write within the DWH Tip Feeds forums, part of the Data Warehousing Tips and Techniques category; At Fosdem 2010 , already two weeks ago, I had the pleasure of hearing Geert van der Kelen explain the work he has been doing on connecting MySQL and Python ...


Reply
 
LinkBack Thread Tools Search this Thread Display Modes
Old 18th February 2010, 08:48 AM   #1
News Bot
 
Join Date: Nov 2007
Posts: 15,067
Latest News Headlines is on a distinguished road
Post MySQL - the best stored routine is the one you don't write

At Fosdem 2010, already two weeks ago, I had the pleasure of hearing Geert van der Kelen explain the work he has been doing on connecting MySQL and Python. I don't know anything about Python, but anybody that has the courage, perseverance and coding skills to create an implementation of the the MySQL wire protocol from scratch is a class-A programmer in my book. So, I encourage everyone that needs MySQL connectivity for Python programs to check out Geert's brainchild, MySQL Connector/Python.

In relation to MySQL Connector/Python, I just read a post from Geert about how he uses the MySQL information_schema to generate some Python code. In this particular case, he needs the data from the COLLATIONS table to maintain a data structure that describes all collations supported by MySQL.

For some reasons that I cannot fathom, Geert needed to generate a structure for each possible collation, not just the ones for which the COLLATIONS table contains a row. To do this, he wrote a stored procedure that uses a cursor to loop through the COLLATIONS table. In the loop, he detects it whenever there's a gap in the sequence of values from the ID column, and then starts a new loop to "fill the gaps". For each iteration of the outer cursor loop, a piece of text is emitted that conforms to the syntax of a Python tuple describing the collation, and each iteration of the inner loop generates the text None, a Python built-in constant.

The final result of the procedure is a snippet of Python code shown below (abbreviated):
..
("cp1251","cp1251_bulgarian_ci"), # 14
("latin1","latin1_danish_ci"), # 15
("hebrew","hebrew_general_ci"), # 16
None,
("tis620","tis620_thai_ci"), # 18
("euckr","euckr_korean_ci"), # 19
..


In the final code, these lines are themselves used to form yet another tuple:
desc = (
None,
("big5","big5_chinese_ci"), # 1
("latin2","latin2_czech_cs"), # 2
("dec8","dec8_swedish_ci"), # 3
("cp850","cp850_general_ci"), # 4
..


This is excellent use of the information schema! However, I am not too thrilled about using a stored routine for this. Enter my fosdem talk about refactoring stored routines.

In this case, performance is not really an issue, so I won't play that card. But many people that do need well-performing stored procedures might start out like Geert and write a cursor loop, and perhaps do some looping inside that loop. One of the big take-aways in my presentation is to become aware of the ways that you can avoid a stored procedure. Geerts procedure is an excellent candidate to illustrate the point. As a bonus, I'm adding the code that is necessary to generate the entire snippet, not just the collection of tuples inside the outer pair of parenthesis.

So, here goes:
set group_concat_max_len := @@max_allowed_packet;

select concat('desc = (',
group_concat('\n '
, if( collations.id is null, 'None',
concat('(', '"', character_set_name, '"',
',', '"', collation_name, '"', ')')
)
, if(ids.id=255, '', ','), ' #', ids.id
order by ids.id
separator ''
), '\n)'
)
from (select (t0.id
Latest News Headlines is offline  
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiTweet this Post!
Reply With Quote
Reply

Bookmarks

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is On
HTML code is On
Trackbacks are On
Pingbacks are On
Refbacks are On


Similar Threads
Thread Thread Starter Forum Replies Last Post
SAS® Text Analytics reveals new insights in stored documents Latest News Headlines Other International Vendors 0 17th February 2010 04:06 AM
SAS® Text Analytics reveals new insights in stored documents Latest News Headlines SAS Forum 0 17th February 2010 01:59 AM
MySQL Stored Functions: Impact of DECLARE HANDLER on Performance Latest News Headlines DWH Tip Feeds 0 25th November 2009 11:03 AM
MySQL: Refactoring a Stored Function Latest News Headlines DWH Tip Feeds 0 25th November 2009 11:03 AM
CALUMO 11: Delivering Outstanding Query and Write-back Performance Calumo CALUMO Blog 0 20th July 2009 04:09 PM


All times are GMT +11. The time now is 07:09 PM.

© The Business Intelligence Group

Search Engine Optimization by vBSEO