Python: check for substring speed

I was looking for options on how to check if a certain substring (in my case ‘ FROM ‘) is present in a SQL query string when I found this blog entry. Just for fun I decided to have a look at how fast these checks would be compared to each other.

I was dealing with a two queries, knowing:

SELECT oid,typname,typlen,typlem,typdefault,typbasetype,typnotnull,typtype
FROM pg_type;

And

SELECT attname,attnum,atttypid,attndims,attnotnull,atthasdef,
pg_get_expr(adbin,adrelid) as adbin
FROM pg_attribute LEFT JOIN pg_attrdef ON attrelid = adrelid AND attnum = adnum
WHERE attisdropped = false AND attnum > 0 AND attrelid IN
( SELECT oid FROM pg_class WHERE relname=%s AND relkind='r' ORDER BY attnum);

The code is pretty simple:

import time
t1 = time.time()
      if ' FROM ' in query:
          print 'IN found it!'
      t2 = time.time()
      print "Took me " + str(t2-t1) + " sec."

      t3 = time.time()
      if query.find(' FROM ') != -1:
          print 'FIND found it!'
      t4 = time.time()
      print "Took me " + str(t4-t3) + " sec."

The results look as follows:

IN found it!
Took me 4.72068786621e-05 sec.
FIND found it!
Took me 1.09672546387e-05 sec.
IN found it!
Took me 4.19616699219e-05 sec.
FIND found it!
Took me 1.12056732178e-05 sec.
IN found it!
Took me 3.48091125488e-05 sec.
FIND found it!
Took me 9.05990600586e-06 sec.
Took me 1.90734863281e-06 sec.
Took me 5.00679016113e-06 sec.
IN found it!
Took me 2.59876251221e-05 sec.
FIND found it!
Took me 1.19209289551e-05 sec.
Took me 9.53674316406e-07 sec.
Took me 1.90734863281e-06 sec.
IN found it!
Took me 0.00103211402893 sec.
FIND found it!
Took me 2.50339508057e-05 sec.
Took me 9.53674316406e-07 sec.
Took me 4.05311584473e-06 sec.

As we can see: if we use the if-in test, we get only one result even if there are more instances of ‘ FROM ‘ in the string. When using the find method, all instances are retrieved. When having only one instance in your string, the find method is usually faster. When having multiple instances, the if-in test will be faster.
It doesn’t make much sense with small strings, but if you’re just interested in finding a substring one or more times in a large string or a piece of text, it can make a difference.
So far my little experiment. Knowing the answer, I can sleep well again tonight 😉

Getagged , , ,

Geef een reactie

Het e-mailadres wordt niet gepubliceerd. Vereiste velden zijn gemarkeerd met *