<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Patrick&#039;s playground &#187; Python</title>
	<atom:link href="http://www.vankouteren.eu/blog/tag/programming-python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.vankouteren.eu/blog</link>
	<description>Random thoughts, problems and solutions</description>
	<lastBuildDate>Sun, 29 Jan 2012 07:53:06 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<item>
		<title>Python: check for substring speed</title>
		<link>http://www.vankouteren.eu/blog/2009/06/python-check-for-substring-speed/</link>
		<comments>http://www.vankouteren.eu/blog/2009/06/python-check-for-substring-speed/#comments</comments>
		<pubDate>Mon, 29 Jun 2009 15:07:23 +0000</pubDate>
		<dc:creator>Patrick van Kouteren</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[comparison]]></category>
		<category><![CDATA[method speed]]></category>
		<category><![CDATA[substring]]></category>

		<guid isPermaLink="false">http://www.vankouteren.eu/blog/?p=110</guid>
		<description><![CDATA[I was looking for options on how to check if a certain substring (in my case ' FROM ') is present in a SQL query string when I found this blog entry. Just for fun I decided to have a look at how fast these checks would be compared to each other. I was dealing [...]]]></description>
			<content:encoded><![CDATA[<p>I was looking for options on how to check if a certain substring (in my case ' FROM ') is present in a SQL query string when I found <a title="Python check for substring" href="http://bka-bonn.de/wordpress/index.php/2008/12/26/python-trick-check-for-substring/" target="_blank">this</a> blog entry. Just for fun I decided to have a look at how fast these checks would be compared to each other.</p>
<p><span id="more-110"></span></p>
<p>I was dealing with two queries, namely:</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> oid,typname,typlen,typelem,typdefault,typbasetype,typnotnull,typtype
<span style="color: #993333; font-weight: bold;">FROM</span> pg_type;</pre>
<p>And</p>
<pre class="sql"><span style="color: #993333; font-weight: bold;">SELECT</span> attname,attnum,atttypid,attndims,attnotnull,atthasdef,
pg_get_expr<span style="color: #66cc66;">&#40;</span>adbin,adrelid<span style="color: #66cc66;">&#41;</span> <span style="color: #993333; font-weight: bold;">AS</span> adbin
<span style="color: #993333; font-weight: bold;">FROM</span> pg_attribute <span style="color: #993333; font-weight: bold;">LEFT</span> <span style="color: #993333; font-weight: bold;">JOIN</span> pg_attrdef <span style="color: #993333; font-weight: bold;">ON</span> attrelid = adrelid <span style="color: #993333; font-weight: bold;">AND</span> attnum = adnum
<span style="color: #993333; font-weight: bold;">WHERE</span> attisdropped = false <span style="color: #993333; font-weight: bold;">AND</span> attnum &gt; <span style="color: #cc66cc;">0</span> <span style="color: #993333; font-weight: bold;">AND</span> attrelid <span style="color: #993333; font-weight: bold;">IN</span>
<span style="color: #66cc66;">&#40;</span> <span style="color: #993333; font-weight: bold;">SELECT</span> oid <span style="color: #993333; font-weight: bold;">FROM</span> pg_class <span style="color: #993333; font-weight: bold;">WHERE</span> relname=%s <span style="color: #993333; font-weight: bold;">AND</span> relkind=<span style="color: #ff0000;">'r'</span> <span style="color: #993333; font-weight: bold;">ORDER</span> <span style="color: #993333; font-weight: bold;">BY</span> attnum<span style="color: #66cc66;">&#41;</span>;</pre>
<p>The code is pretty simple:</p>
<pre class="python"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">time</span>
&nbsp;
<span style="color: #808080; font-style: italic;"># query holds one of the SQL strings shown above</span>
t1 = <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #483d8b;">' FROM '</span> <span style="color: #ff7700;font-weight:bold;">in</span> query:
    <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'IN found it!'</span>
t2 = <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;Took me &quot;</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>t2-t1<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot; sec.&quot;</span>
&nbsp;
t3 = <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">if</span> query.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' FROM '</span><span style="color: black;">&#41;</span> != <span style="color: #ff4500;">-1</span>:
    <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'FIND found it!'</span>
t4 = <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;Took me &quot;</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>t4-t3<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot; sec.&quot;</span></pre>
<p>The results look as follows:</p>
<pre>IN found it!
Took me 4.72068786621e-05 sec.
FIND found it!
Took me 1.09672546387e-05 sec.
IN found it!
Took me 4.19616699219e-05 sec.
FIND found it!
Took me 1.12056732178e-05 sec.
IN found it!
Took me 3.48091125488e-05 sec.
FIND found it!
Took me 9.05990600586e-06 sec.
Took me 1.90734863281e-06 sec.
Took me 5.00679016113e-06 sec.
IN found it!
Took me 2.59876251221e-05 sec.
FIND found it!
Took me 1.19209289551e-05 sec.
Took me 9.53674316406e-07 sec.
Took me 1.90734863281e-06 sec.
IN found it!
Took me 0.00103211402893 sec.
FIND found it!
Took me 2.50339508057e-05 sec.
Took me 9.53674316406e-07 sec.
Took me 4.05311584473e-06 sec.</pre>
<p>As we can see, both tests answer the same yes/no question. The if-in test returns a boolean, and the find method returns the index of the first occurrence (or -1 if there is none), so neither of them retrieves every instance of ' FROM ' in the string. In the runs above, the find method was faster every time ' FROM ' was found, while the if-in test was faster in the runs where nothing was found (the lines without a 'found it!' message). Keep in mind that each measurement is a single time.time() interval which also includes the print call, so the numbers are noisy.<br />
It doesn't matter much for small strings, but if you need to check for a substring in a large string or a piece of text, or have to do so many times, it can make a difference.<br />
So much for my little experiment. Knowing the answer, I can sleep well again tonight <img src='http://www.vankouteren.eu/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
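<p>For a less noisy comparison, the timeit module from the standard library can repeat each check many times. The following is only a sketch: the query string and the helper names (has_from_in, has_from_find, all_positions) are made up for illustration, and the timings will differ per machine and Python version. It also shows one way to really collect every occurrence, by calling find repeatedly with a start offset.</p>

```python
import timeit

# Hypothetical query, loosely based on the ones in this post
query = "SELECT oid, typname FROM pg_type WHERE oid IN (SELECT oid FROM pg_class)"


def has_from_in(q):
    """Membership test: returns True or False."""
    return ' FROM ' in q


def has_from_find(q):
    """find() returns the index of the first match, or -1 if there is none."""
    return q.find(' FROM ') != -1


def all_positions(q, needle=' FROM '):
    """Collect the start index of every occurrence of needle."""
    positions = []
    start = q.find(needle)
    while start != -1:
        positions.append(start)
        start = q.find(needle, start + 1)
    return positions


runs = 100000
t_in = timeit.timeit(lambda: has_from_in(query), number=runs)
t_find = timeit.timeit(lambda: has_from_find(query), number=runs)

print("in:   %.3f usec per check" % (t_in / runs * 1e6))
print("find: %.3f usec per check" % (t_find / runs * 1e6))
print("occurrences at: %s" % all_positions(query))
```

<p>Printing happens outside the timed calls here, so the numbers reflect only the substring check plus the function call overhead, which is the same for both variants.</p>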
]]></content:encoded>
			<wfw:commentRss>http://www.vankouteren.eu/blog/2009/06/python-check-for-substring-speed/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Getting IPython readline and auto-completion to work on Mac OS X</title>
		<link>http://www.vankouteren.eu/blog/2009/06/getting-ipython-readline-and-auto-completion-to-work-on-mac-os-x/</link>
		<comments>http://www.vankouteren.eu/blog/2009/06/getting-ipython-readline-and-auto-completion-to-work-on-mac-os-x/#comments</comments>
		<pubDate>Wed, 24 Jun 2009 12:11:01 +0000</pubDate>
		<dc:creator>Patrick van Kouteren</dc:creator>
				<category><![CDATA[Mac OS X]]></category>
		<category><![CDATA[auto-complete]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[readline]]></category>

		<guid isPermaLink="false">http://www.vankouteren.eu/blog/?p=98</guid>
		<description><![CDATA[It's taken me some time and a lot of web pages which tried to solve the readline support with all kinds of hacks, but finally I've been able to get readline support and auto-completion for IPython to work. As it can be quite confusing and hard to follow all posts, this will be a step-by-step [...]]]></description>
			<content:encoded><![CDATA[<p>It took me some time and a lot of web pages which tried to fix readline support with all kinds of hacks, but I've finally been able to get readline support and auto-completion for IPython to work. As it can be quite confusing and hard to follow all those posts, this will be a step-by-step approach to get things to work. Note that I've got it working on Mac OS X 10.5.7 Leopard, so it should work on Leopard at least. Other versions might not require the exact same solution.</p>
<p><span id="more-98"></span></p>
<p>If you don't already have IPython you can install it by opening a console and typing</p>
<pre>sudo easy_install ipython</pre>
<p>Mac OS X does include readline functionality, but not 'the real one', GNU readline, because of a license issue. GNU readline can be installed manually, which we will do next. If you've not done so already, open a console and type</p>
<pre>sudo easy_install -f http://ipython.scipy.org/dist/ readline</pre>
<p>Type in your password and readline will be installed. You would expect everything to work now, but it's quite possible that it still doesn't. Try for yourself by typing the following in your console:</p>
<pre>ipython</pre>
<p>IPython will start. Let's see if we can auto-complete here. First type</p>
<pre class="python"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span></pre>
<p>Now that we've done an import, let's see whether auto-complete hints at what we can do with it. On the next line type</p>
<pre class="python"><span style="color: #dc143c;">sys</span>.</pre>
<p>Press Tab directly after the dot. If IPython lists the attributes and functions of sys, you're done with this tutorial. If your cursor just jumps, auto-complete doesn't work and you might want to follow the steps below.<br />
IPython is started through a small launcher script. We need to edit this script to tell it where the readline module is installed.<br />
If you're still in IPython, close it by pressing CTRL+D and confirming (y). You are now back at the console.<br />
The file which we are about to edit (with Vim here) is read-only for normal users, so we need sudo once again. Type the following</p>
<pre>sudo vim /usr/local/bin/ipython</pre>
<p>Enter your password and Vim will open the launcher script, which has just a few lines like</p>
<pre class="python"><span style="color: #808080; font-style: italic;">#!/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python</span>
<span style="color: #808080; font-style: italic;"># EASY-INSTALL-ENTRY-SCRIPT: 'ipython==0.9.1','console_scripts','ipython'</span>
__requires__ = <span style="color: #483d8b;">'ipython==0.9.1'</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span>
<span style="color: #ff7700;font-weight:bold;">from</span> pkg_resources <span style="color: #ff7700;font-weight:bold;">import</span> load_entry_point
&nbsp;
<span style="color: #dc143c;">sys</span>.<span style="color: black;">exit</span><span style="color: black;">&#40;</span>
   load_entry_point<span style="color: black;">&#40;</span><span style="color: #483d8b;">'ipython==0.9.1'</span>, <span style="color: #483d8b;">'console_scripts'</span>, <span style="color: #483d8b;">'ipython'</span><span style="color: black;">&#41;</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
<span style="color: black;">&#41;</span></pre>
<p>First find the location of the readline egg directory which you've just installed. It's probably something like /Library/Python/site-packages/readline-.egg<br />
I've installed the i386 readline module version 2.5.1, so my location is /Library/Python/2.5/site-packages/readline-2.5.1-py2.5-macosx-10.5-i386.egg<br />
We will need this location next.<br />
Go back to the console where the file is still open.<br />
Keep the first two lines and the import sys line. The rest can be commented out.<br />
Now insert the following three lines after import sys:</p>
<pre class="python"><span style="color: #dc143c;">sys</span>.<span style="color: black;">path</span>.<span style="color: black;">insert</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, <span style="color: #483d8b;">'path to readline egg'</span><span style="color: black;">&#41;</span>
<span style="color: #ff7700;font-weight:bold;">import</span> IPython.<span style="color: black;">Shell</span>
IPython.<span style="color: black;">Shell</span>.<span style="color: black;">start</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>.<span style="color: black;">mainloop</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre>
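<p>If you are not sure where the readline egg ended up, a small script can search for it. This is a sketch only: the function name find_readline_eggs is made up, and the directory in the example is the one from my machine, so adjust it to your Python version.</p>

```python
import glob
import os


def find_readline_eggs(site_packages_dirs):
    """Return the readline egg paths found directly inside the given directories."""
    eggs = []
    for directory in site_packages_dirs:
        pattern = os.path.join(directory, "readline-*.egg")
        eggs.extend(sorted(glob.glob(pattern)))
    return eggs


# Assumed location (the one from this post); change it for other Python versions
candidates = ["/Library/Python/2.5/site-packages"]

for egg in find_readline_eggs(candidates):
    print(egg)
```

<p>Whatever path it prints is the value to use for 'path to readline egg' in sys.path.insert. If nothing is printed, the egg is in another directory.</p>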
<p>In my case, the ipython file looks as follows:</p>
<pre class="python"><span style="color: #808080; font-style: italic;">#!/System/Library/Frameworks/Python.framework/Versions/2.5/Resources/Python.app/Contents/MacOS/Python</span>
<span style="color: #808080; font-style: italic;"># EASY-INSTALL-ENTRY-SCRIPT: 'ipython==0.9.1','console_scripts','ipython'</span>
<span style="color: #808080; font-style: italic;">#__requires__ = 'ipython==0.9.1'</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span>
<span style="color: #808080; font-style: italic;">#from pkg_resources import load_entry_point</span>
&nbsp;
<span style="color: #dc143c;">sys</span>.<span style="color: black;">path</span>.<span style="color: black;">insert</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span>, <span style="color: #483d8b;">'/Library/Python/2.5/site-packages/readline-2.5.1-py2.5-macosx-10.5-i386.egg'</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">import</span> IPython.<span style="color: black;">Shell</span>
IPython.<span style="color: black;">Shell</span>.<span style="color: black;">start</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>.<span style="color: black;">mainloop</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
<span style="color: #808080; font-style: italic;">#sys.exit(</span>
<span style="color: #808080; font-style: italic;">#   load_entry_point('ipython==0.9.1', 'console_scripts', 'ipython')()</span>
<span style="color: #808080; font-style: italic;">#)</span></pre>
<p>Save and quit the file by typing</p>
<pre class="bash">:wq</pre>
<p>You're back at the command line again. Let's test the auto-complete. Again type</p>
<pre>ipython</pre>
<p>IPython will start. Let's see if we can auto-complete here. First type</p>
<pre class="python"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">sys</span></pre>
<p>Now that we've done an import, let's see whether auto-complete hints at what we can do with it. On the next line type</p>
<pre class="python"><span style="color: #dc143c;">sys</span>.</pre>
<p>Press Tab directly after the dot. If all went well, IPython lists the attributes and functions of sys.</p>
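<p>To see which readline implementation Python actually picks up, you can inspect the module itself. The sketch below relies on a commonly used heuristic, which I have not verified on every system: the readline that ships with Mac OS X is backed by libedit and mentions 'libedit' in its docstring, while GNU readline does not. The function name describe_readline is made up for this example.</p>

```python
def describe_readline():
    """Describe which readline module the running interpreter imports."""
    try:
        import readline
    except ImportError:
        return "no readline module available"
    # Heuristic: the libedit-backed module mentions libedit in its docstring
    doc = readline.__doc__ or ""
    if "libedit" in doc:
        kind = "libedit"
    else:
        kind = "GNU readline (or compatible)"
    # Built-in modules have no __file__ attribute
    location = getattr(readline, "__file__", "built into the interpreter")
    return "%s, loaded from %s" % (kind, location)


print(describe_readline())
```

<p>Run it with the same interpreter that IPython uses. After the sys.path.insert trick, the reported location should point into the readline egg.</p>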
<p><strong>Sites that helped me and / or might be useful for you:</strong></p>
<ul>
<li><a title="IPython on Mac OS X" href="http://hurley.wordpress.com/2008/11/15/ipython-on-mac-os-x/" target="_blank">IPython on Mac OS X</a></li>
<li><a title="Incredible Vehicle" href="http://incrediblevehicle.com/2009/06/11/readline-python-ipython-and-mac-os-x/" target="_blank">Incredible Vehicle</a></li>
<li><a title="Launchpad" href="https://bugs.launchpad.net/ipython/+bug/254023" target="_blank">Launchpad</a></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://www.vankouteren.eu/blog/2009/06/getting-ipython-readline-and-auto-completion-to-work-on-mac-os-x/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
		<item>
		<title>Python ParserFactory</title>
		<link>http://www.vankouteren.eu/blog/2009/05/python-parserfactory/</link>
		<comments>http://www.vankouteren.eu/blog/2009/05/python-parserfactory/#comments</comments>
		<pubDate>Tue, 05 May 2009 05:13:32 +0000</pubDate>
		<dc:creator>Patrick van Kouteren</dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Factory model]]></category>
		<category><![CDATA[ParserFactory]]></category>

		<guid isPermaLink="false">http://www.vankouteren.eu/blog/?p=62</guid>
		<description><![CDATA[Yesterday I had an idea about how to manage parsers in dynamic way. My idea was like this A managing class can access the parsers folder and check which parsers (files) are available A parser describes which file formats (and which versions of it) it can parse The result should be that parser classes are [...]]]></description>
			<content:encoded><![CDATA[<div>
<p>Yesterday I had an idea about how to manage parsers in a dynamic way. My idea was like this:</p>
<ul>
<li>A managing class can access the parsers folder and check which parsers (files) are available</li>
<li>A parser describes which file formats (and which versions of it) it can parse</li>
</ul>
</div>
<div>The result should be that parser classes implement a parser interface and a ParserFactory can get info from the available parsers. I have been working with Java, PHP, Haskell and some other languages, but I am rather new to Python, so this is an interesting problem for getting to know Python a little better.</div>
<div><span id="more-62"></span>With an inheritance structure (which I thought was not fully supported by Python &lt; 2.6) we can ensure that every parser implementing the parser interface provides the functions which the ParserFactory calls to obtain information about that particular parser.</div>
<div>Basically, what I would like to see is this:</div>
<table border="1">
<tbody>
<tr>
<td>In a script I have a file <em>f</em></td>
</tr>
<tr>
<td>I call <em>pf = ParserFactory()</em></td>
</tr>
<tr>
<td><p>Upon calling the ParserFactory constructor, the ParserFactory class gathers information from all available parsers. Say they are in a folder <em>parsers</em>. The ParserFactory will then get the contents of the folder <em>parsers</em> and construct an object for each class file, from which it can request information.</p>
<p>So for example <em>psimi.py</em> will give the ParserFactory a PSIMIParser object. The ParserFactory can then call a method like <em>getSupportedFileFormats()</em> which returns the file formats which can be parsed by this PSIMIParser.</p></td>
</tr>
</tbody>
</table>
<div>It would really be great if such a thing were possible, so that people can create their own parser by implementing the parser interface and don't have to worry about anything else, as the rest is done by the ParserFactory.</div>
<div>I currently have this code:</div>
<p>Testfile:</p>
<pre class="python"><span style="color: #ff7700;font-weight:bold;">import</span> parsers.<span style="color: black;">parserfactory</span>
&nbsp;
pf = parsers.<span style="color: black;">parserfactory</span>.<span style="color: black;">ParserFactory</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
pf.<span style="color: black;">getAvailableParsers</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span></pre>
<p>Then for a single parser I have a GenericParser interface which every parser should implement:</p>
<pre class="python"><span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
This GenericParser is an abstract superclass. It defines methods which subclasses should override. This guarantees us that
certain functions exist.
@author: Patrick van Kouteren
&nbsp;
@version: 0.1
&quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> GenericParser:
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    self.fileformats will be a dictionary in the form 'extension : version '. E.g. 'xml : 1.0'
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">NotImplementedError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;This is a GenericParser. Please define a dictionary 'self.fileformats' here with the extensions (and versions) which your parser can parse!&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> getSupportedFileFormats<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">NotImplementedError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;This is a GenericParser. Please return the dictionary 'self.fileformats' here, which should be defined in __init__&quot;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> parse<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, filename<span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">NotImplementedError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;This is a GenericParser. Please implement a proper parse function&quot;</span><span style="color: black;">&#41;</span></pre>
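<p>On Python 2.6 and later, the abc module offers another way to get this guarantee: a class with abstract methods cannot be instantiated until a subclass overrides them all. The following is a minimal sketch in Python 3 syntax, not the code I am actually using. The PSIMIParser body and its 'xml': '1.0' entry are placeholders for illustration.</p>

```python
from abc import ABC, abstractmethod


class GenericParser(ABC):
    """Abstract superclass: subclasses must override both methods."""

    @abstractmethod
    def getSupportedFileFormats(self):
        """Return a dictionary in the form {extension: version}."""

    @abstractmethod
    def parse(self, filename):
        """Parse the given file and return the parsed data."""


class PSIMIParser(GenericParser):
    """Hypothetical parser, only here to show the mechanism."""

    def __init__(self):
        # Placeholder value, taken from the 'xml : 1.0' docstring example
        self.fileformats = {"xml": "1.0"}

    def getSupportedFileFormats(self):
        return self.fileformats

    def parse(self, filename):
        # A real parser would read and interpret the file here
        return "parsed " + filename


parser = PSIMIParser()
print(parser.getSupportedFileFormats())

try:
    GenericParser()
except TypeError as error:
    print("Cannot instantiate the abstract class: %s" % error)
```

<p>The difference with raising NotImplementedError is the moment of failure: with abc the error appears when the object is created, not when a missing method is first called.</p>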
<p>The crux is in this getSupportedFileFormats function. I would like to call this function on every parser file. This file contains a parser class, so basically I want to create an object from a file, but I don't know the object's name from the file.</p>
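<p>One possible way around not knowing the class name is to import the module by its name and ask it which classes it defines. The sketch below uses importlib and inspect from the standard library (Python 3 syntax); the function name find_subclasses is made up. As a stand-in for a parser module it inspects json.decoder and looks for subclasses of ValueError. In the setup of this post the call would be something like find_subclasses("parsers.psimi", GenericParser), which is an assumption about the package layout.</p>

```python
import importlib
import inspect


def find_subclasses(module_name, base):
    """Import module_name and return the classes defined there that subclass base."""
    module = importlib.import_module(module_name)
    found = []
    for name, obj in inspect.getmembers(module, inspect.isclass):
        # Skip the base class itself and classes merely imported into the module
        if issubclass(obj, base) and obj is not base and obj.__module__ == module.__name__:
            found.append(obj)
    return found


# Stand-in example using a standard library module instead of a parser file
for cls in find_subclasses("json.decoder", ValueError):
    print(cls.__name__)
```

<p>Each class that is returned can be instantiated without knowing its name in advance, for example cls().getSupportedFileFormats() for a parser class. This avoids scanning the source text for import and class lines.</p>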
<p>Currently my ParserFactory looks as follows (note that I'm still working on it, so not all is finished yet!):</p>
<pre class="python"><span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
This ParserFactory contains knowledge about how to parse files. It can be fed a file and return the parsed data.
It uses the parsers in this parsers folder, but abstracts away various operations.
@author: Patrick van Kouteren
&nbsp;
@version: 0.1
&quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span>, <span style="color: #dc143c;">types</span>, <span style="color: #dc143c;">sys</span>
&nbsp;
<span style="color: #ff7700;font-weight:bold;">class</span> ParserFactory:
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Gather info about the contents of the database which are important for parsing
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> <span style="color: #0000cd;">__init__</span><span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #008000;">self</span>.<span style="color: black;">importedDatabases</span> = <span style="color: #008000;">self</span>.<span style="color: black;">checkImportedDatabases</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Check which databases (and which versions of them) are present in the database
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> checkImportedDatabases<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        databases = <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> databases 
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Return a list of databases and their version which are present (imported) in IBIDAS
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> getImportedDatabases<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">self</span>.<span style="color: black;">importedDatabases</span>
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Check the parser directory for files which import the GenericParser class. If a file does so, it is guaranteed
    that we can call certain methods to request properties
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> getAvailableParsers<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
       parserdir = <span style="color: #dc143c;">sys</span>.<span style="color: black;">path</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> + <span style="color: #483d8b;">&quot;/parsers&quot;</span>
       classnames = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
       parserfiles = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
       <span style="color: #ff7700;font-weight:bold;">for</span> subdir, dirs, files <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">os</span>.<span style="color: black;">walk</span><span style="color: black;">&#40;</span>parserdir<span style="color: black;">&#41;</span>:
           <span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #008000;">file</span> <span style="color: #ff7700;font-weight:bold;">in</span> files:
               <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">file</span>.<span style="color: black;">endswith</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;.py&quot;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #008000;">file</span>.<span style="color: black;">startswith</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;__&quot;</span><span style="color: black;">&#41;</span>:
                   parserfiles.<span style="color: black;">append</span><span style="color: black;">&#40;</span>parserdir + <span style="color: #483d8b;">&quot;/&quot;</span> + <span style="color: #008000;">file</span><span style="color: black;">&#41;</span>
       <span style="color: #ff7700;font-weight:bold;">for</span> parserfile <span style="color: #ff7700;font-weight:bold;">in</span> parserfiles:
           fileobject = <span style="color: #008000;">open</span><span style="color: black;">&#40;</span>parserfile<span style="color: black;">&#41;</span>
           content = fileobject.<span style="color: black;">read</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
           fileobject.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
           importfound = <span style="color: #ff4500;">0</span>
           <span style="color: #ff7700;font-weight:bold;">for</span> l <span style="color: #ff7700;font-weight:bold;">in</span> content.<span style="color: black;">splitlines</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
               <span style="color: #ff7700;font-weight:bold;">if</span> l.<span style="color: black;">startswith</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;import&quot;</span><span style="color: black;">&#41;</span>:
                   <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot; If this line contains genericparser, we know that this file is interesting &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
                   <span style="color: #ff7700;font-weight:bold;">if</span> l.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;genericparser&quot;</span><span style="color: black;">&#41;</span> &gt; <span style="color: #ff4500;">0</span>:
                       importfound += <span style="color: #ff4500;">1</span>
               <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
               If we:
                * find a class definition
                * have found an import of generic parser
                * find a genericparser argument
               Then we know that this parser is a subclass of genericparser
               &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
               <span style="color: #ff7700;font-weight:bold;">if</span> l.<span style="color: black;">startswith</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;class&quot;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> importfound &gt; <span style="color: #ff4500;">0</span> <span style="color: #ff7700;font-weight:bold;">and</span> l.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;genericparser&quot;</span><span style="color: black;">&#41;</span> &gt; <span style="color: #ff4500;">0</span>:
                   <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">re</span>
                   m = <span style="color: #dc143c;">re</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\W</span>+&quot;</span>, l<span style="color: black;">&#41;</span>
                   classname = m<span style="color: black;">&#91;</span>m.<span style="color: black;">index</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;class&quot;</span><span style="color: black;">&#41;</span> + <span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span>
                   <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">&quot;file &quot;</span> + parserfile + <span style="color: #483d8b;">&quot; contains a callable parser class called &quot;</span> + classname
                   thisfilename = parserfile<span style="color: black;">&#91;</span>parserfile.<span style="color: black;">rfind</span><span style="color: black;">&#40;</span>u<span style="color: #483d8b;">&quot;/&quot;</span><span style="color: black;">&#41;</span><span style="color: #ff4500;">+1</span>:<span style="color: black;">&#93;</span>
                   ppath = <span style="color: #483d8b;">&quot;parsers.&quot;</span> + thisfilename<span style="color: black;">&#91;</span>:thisfilename.<span style="color: black;">rfind</span><span style="color: black;">&#40;</span>u<span style="color: #483d8b;">&quot;.&quot;</span><span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>+<span style="color: #483d8b;">&quot;.&quot;</span> + classname
                   <span style="color: #808080; font-style: italic;">#print &quot;the full import path will become &quot; + ppath</span>
                   classnames.<span style="color: black;">append</span><span style="color: black;">&#40;</span>ppath<span style="color: black;">&#41;</span>
       <span style="color: #ff7700;font-weight:bold;">return</span> classnames
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Return the supported file formats. The supported file formats are determined by checking all parsers which import
    the GenericParser. We can request the file formats they support and return this list.
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> getSupportedFileFormats<span style="color: black;">&#40;</span><span style="color: #008000;">self</span><span style="color: black;">&#41;</span>:
        fileformats = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
        parserlist = <span style="color: #008000;">self</span>.<span style="color: black;">getAvailableParsers</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">for</span> <span style="color: #dc143c;">parser</span> <span style="color: #ff7700;font-weight:bold;">in</span> parserlist:
            p = <span style="color: #008000;">self</span>._get_func<span style="color: black;">&#40;</span><span style="color: #dc143c;">parser</span><span style="color: black;">&#41;</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>
            <span style="color: #ff7700;font-weight:bold;">for</span> ff <span style="color: #ff7700;font-weight:bold;">in</span> p.<span style="color: black;">getSupportedFileFormats</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:
                fileformats.<span style="color: black;">append</span><span style="color: black;">&#40;</span>ff<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> fileformats
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Try to import a module. Then we can use this to get its class
    Source: http://code.activestate.com/recipes/223972/
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> _get_mod<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, modulePath<span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">try</span>:
            aMod = <span style="color: #dc143c;">sys</span>.<span style="color: black;">modules</span><span style="color: black;">&#91;</span>modulePath<span style="color: black;">&#93;</span>
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #008000;">isinstance</span><span style="color: black;">&#40;</span>aMod, <span style="color: #dc143c;">types</span>.<span style="color: black;">ModuleType</span><span style="color: black;">&#41;</span>:
                <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">KeyError</span>
        <span style="color: #ff7700;font-weight:bold;">except</span> <span style="color: #008000;">KeyError</span>:
            <span style="color: #808080; font-style: italic;"># The non-empty fromlist [''] is very important: it makes __import__ return the submodule itself, not the top-level package.</span>
            aMod = <span style="color: #008000;">__import__</span><span style="color: black;">&#40;</span>modulePath, <span style="color: #008000;">globals</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>, <span style="color: #008000;">locals</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>, <span style="color: black;">&#91;</span><span style="color: #483d8b;">''</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>
            <span style="color: #dc143c;">sys</span>.<span style="color: black;">modules</span><span style="color: black;">&#91;</span>modulePath<span style="color: black;">&#93;</span> = aMod
        <span style="color: #ff7700;font-weight:bold;">return</span> aMod
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Return the class from 'parsers.file.class'
    Source: http://code.activestate.com/recipes/223972/
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> _get_func<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>,fullFuncName<span style="color: black;">&#41;</span>:
        <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;Retrieve a function object from a full dotted-package name.&quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
        <span style="color: #808080; font-style: italic;"># Parse out the path, module, and function</span>
        lastDot = fullFuncName.<span style="color: black;">rfind</span><span style="color: black;">&#40;</span>u<span style="color: #483d8b;">&quot;.&quot;</span><span style="color: black;">&#41;</span>
        funcName = fullFuncName<span style="color: black;">&#91;</span>lastDot + <span style="color: #ff4500;">1</span>:<span style="color: black;">&#93;</span>
        modPath = fullFuncName<span style="color: black;">&#91;</span>:lastDot<span style="color: black;">&#93;</span>
&nbsp;
        aMod = <span style="color: #008000;">self</span>._get_mod<span style="color: black;">&#40;</span>modPath<span style="color: black;">&#41;</span>
        aFunc = <span style="color: #008000;">getattr</span><span style="color: black;">&#40;</span>aMod, funcName<span style="color: black;">&#41;</span>
&nbsp;
        <span style="color: #808080; font-style: italic;"># Assert that the function is a *callable* attribute.</span>
        <span style="color: #ff7700;font-weight:bold;">assert</span> <span style="color: #008000;">callable</span><span style="color: black;">&#40;</span>aFunc<span style="color: black;">&#41;</span>, u<span style="color: #483d8b;">&quot;%s is not callable.&quot;</span> % fullFuncName
&nbsp;
        <span style="color: #808080; font-style: italic;"># Return a reference to the function itself,</span>
        <span style="color: #808080; font-style: italic;"># not the results of the function.</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> aFunc
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Returns the class where a method is defined
&nbsp;
    def find_defining_class(self, obj, meth_name):
        for ty in type(obj).mro():
            if meth_name in ty.__dict__:
                return ty
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Check if all databases needed to import a file are present
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> checkPrerequisites<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, file_prerequisites<span style="color: black;">&#41;</span>:
        errors = <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span>
        <span style="color: #ff7700;font-weight:bold;">return</span> errors
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Parse a list of files. This means that we not only have to check the prerequisites, but also determine the order in which to
    parse the files, as one file can be a prerequisite for another
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> parseList<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, filelist, filelist_prerequisites<span style="color: black;">&#41;</span>:
        order = <span style="color: #008000;">self</span>.<span style="color: black;">findParseOrder</span><span style="color: black;">&#40;</span>filelist<span style="color: black;">&#41;</span>
&nbsp;
    <span style="color: #ff7700;font-weight:bold;">def</span> parse<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, <span style="color: #008000;">file</span>, file_prerequisites, <span style="color: #dc143c;">parser</span>=<span style="color: #008000;">None</span><span style="color: black;">&#41;</span>:
        <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot; First check if all prerequisites are present &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
        errors = <span style="color: #008000;">self</span>.<span style="color: black;">checkPrerequisites</span><span style="color: black;">&#40;</span>file_prerequisites<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">if</span> errors:
            data = <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span>.<span style="color: black;">join</span><span style="color: black;">&#40;</span>errors<span style="color: black;">&#41;</span>
            <span style="color: #ff7700;font-weight:bold;">raise</span> prerequisitesError<span style="color: black;">&#40;</span>data<span style="color: black;">&#41;</span>
        <span style="color: #ff7700;font-weight:bold;">else</span>:
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #dc143c;">parser</span>:
                <span style="color: #dc143c;">parser</span> = <span style="color: #008000;">self</span>.<span style="color: black;">findParser</span><span style="color: black;">&#40;</span><span style="color: #008000;">file</span><span style="color: black;">&#41;</span>
            <span style="color: #ff7700;font-weight:bold;">else</span>:
                <span style="color: #ff7700;font-weight:bold;">pass</span>
            <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #dc143c;">parser</span>:
                <span style="color: #008000;">self</span>.<span style="color: black;">doParsing</span><span style="color: black;">&#40;</span><span style="color: #008000;">file</span>, <span style="color: #dc143c;">parser</span><span style="color: black;">&#41;</span>
            <span style="color: #ff7700;font-weight:bold;">else</span>:
                <span style="color: #ff7700;font-weight:bold;">raise</span> parserError
&nbsp;
    <span style="color: #483d8b;">&quot;&quot;</span><span style="color: #483d8b;">&quot;
    Try to find a parser based on several criteria.
    1. The file extension: certain file extensions belong to specific formats
    2. The first line:
    &quot;</span><span style="color: #483d8b;">&quot;&quot;</span>
    <span style="color: #ff7700;font-weight:bold;">def</span> findParser<span style="color: black;">&#40;</span><span style="color: #008000;">self</span>, <span style="color: #008000;">file</span><span style="color: black;">&#41;</span>:
        <span style="color: #ff7700;font-weight:bold;">pass</span></pre>
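<p>The dotted-name lookup done by _get_mod and _get_func above can probably be written more compactly with importlib. The following is only a minimal sketch, not the code used in the factory itself: the helper name get_callable is made up for this illustration, and a standard-library class stands in for a real parser class such as 'parsers.file.class'.</p>

```python
import importlib


def get_callable(full_name):
    """Resolve a dotted name such as 'package.module.ClassName' to a callable.

    Hypothetical helper for illustration; it is not part of the factory above.
    """
    # Split the name at the last dot into module path and attribute name.
    module_path, _, attr_name = full_name.rpartition(".")
    if not module_path:
        raise ValueError("%r is not a dotted name" % full_name)
    # import_module returns the leaf module itself, so the [''] fromlist
    # workaround needed with __import__ is not required here.
    module = importlib.import_module(module_path)
    obj = getattr(module, attr_name)
    if not callable(obj):
        raise TypeError("%s is not callable." % full_name)
    # Return a reference to the callable, not the result of calling it.
    return obj


# Usage: a standard-library class stands in for a parser class.
ordered_dict_cls = get_callable("collections.OrderedDict")
instance = ordered_dict_cls()
```

<p>Raising an exception instead of using assert keeps the callable check active even when Python runs with optimisations switched on.</p>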
<p>Thoughts, comments and discussion are welcome. Chris Leary has posted an improvement on this approach <a href="http://blog.cdleary.com/2009/06/registry-pattern-trumps-import-magic/" target="_blank">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.vankouteren.eu/blog/2009/05/python-parserfactory/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

