Thursday, July 19, 2018

Making your Python program faster


Here are some techniques for making a Python program run faster, followed by a few tools for measuring where the time goes.

Some Techniques
---------------

Concatenate faster
Avoid:
s = ""
for elt in somelist:
    s += some_function(elt)

Use:
slist = [some_function(elt) for elt in somelist]
s = "".join(slist)

Avoid:
out = "" + head + prologue + query + tail + ""

Use:
out = "%s%s%s%s" % (head, prologue, query, tail)

or:
out = "%(head)s%(prologue)s%(query)s%(tail)s" % locals()
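
To check whether these substitutions pay off on your own interpreter, a small `timeit` comparison helps. The sketch below is written for Python 3 and uses made-up sample data (`parts`); the helper names `concat_plus` and `concat_join` are illustrative only. CPython optimizes some `+=` cases, so the gap may be smaller than expected.

```python
import timeit

def concat_plus(parts):
    # Builds the result with repeated +=, the pattern advised against above.
    s = ""
    for part in parts:
        s += part
    return s

def concat_join(parts):
    # Builds the result with a single join call.
    return "".join(parts)

parts = [str(n) for n in range(1000)]  # hypothetical sample data

# Both approaches must produce the same string.
assert concat_plus(parts) == concat_join(parts)

# Timings depend on interpreter and machine, so measure rather than assume.
print("+=   : %.4f s" % timeit.timeit(lambda: concat_plus(parts), number=200))
print("join : %.4f s" % timeit.timeit(lambda: concat_join(parts), number=200))
```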

Faster looping
Avoid:

newlist = []
for word in oldlist:
    newlist.append(word.upper())

Use any of the following instead:

newlist = map(str.upper, oldlist)

newlist = [s.upper() for s in oldlist]

upper = str.upper
newlist = []
append = newlist.append
for word in oldlist:
    append(upper(word))
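
The variants above can be compared side by side. This is a rough sketch for Python 3 (where `map` returns an iterator, hence the `list()` call); `oldlist` holds invented sample data and the function names are only for illustration.

```python
import timeit

oldlist = ["alpha", "beta", "gamma"] * 1000  # hypothetical sample data

def with_loop(words):
    newlist = []
    for word in words:
        newlist.append(word.upper())
    return newlist

def with_map(words):
    # list() is needed on Python 3, where map returns an iterator.
    return list(map(str.upper, words))

def with_comprehension(words):
    return [s.upper() for s in words]

def with_cached_names(words):
    # Looks up str.upper and newlist.append once, outside the loop.
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in words:
        append(upper(word))
    return newlist

variants = [with_loop, with_map, with_comprehension, with_cached_names]
expected = with_loop(oldlist)
for fn in variants:
    # Every variant must build the same list before its timing means anything.
    assert fn(oldlist) == expected
    elapsed = timeit.timeit(lambda: fn(oldlist), number=100)
    print("%-20s %.4f s" % (fn.__name__, elapsed))
```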

Use local variables as much as possible
Python accesses local variables more efficiently than global variables, so
bind frequently used functions and methods to local names before a loop.

def func(oldlist):
    upper = str.upper
    newlist = []
    append = newlist.append
    for word in oldlist:
        append(upper(word))
    return newlist
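
A quick way to see the effect of local versus global lookups is to time two otherwise identical loops. The sketch below assumes Python 3; `GLOBAL_FACTOR`, `use_global` and `use_local` are made-up names, and the size of the difference will vary between interpreter versions.

```python
import timeit

GLOBAL_FACTOR = 3  # hypothetical module-level value

def use_global(n):
    total = 0
    for i in range(n):
        total += i * GLOBAL_FACTOR  # global name looked up on every iteration
    return total

def use_local(n):
    factor = GLOBAL_FACTOR  # copied into a local name once
    total = 0
    for i in range(n):
        total += i * factor
    return total

# Same result either way; only the lookup cost differs.
assert use_global(1000) == use_local(1000)

print("global: %.4f s" % timeit.timeit(lambda: use_global(100000), number=20))
print("local : %.4f s" % timeit.timeit(lambda: use_local(100000), number=20))
```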

Dictionaries can be used to get record count faster
The following code tests each word for membership before incrementing, so every
word costs more than one dictionary lookup:

wdict = {}
for word in words:
    if word not in wdict:
        wdict[word] = 0
    wdict[word] += 1

It is cheaper to use a `try`/`except` clause when most words repeat, because the
exception is raised only the first time a word is seen:

wdict = {}
for word in words:
    try:
        wdict[word] += 1
    except KeyError:
        wdict[word] = 1

Or use `dict.get()`:

wdict = {}
get = wdict.get
for word in words:
    wdict[word] = get(word, 0) + 1
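
The standard library also offers ready-made containers for this kind of counting. They are not part of the techniques above; the sketch below simply shows `collections.Counter` and `collections.defaultdict` as alternatives, using an invented `words` list.

```python
from collections import Counter, defaultdict

# Hypothetical sample input.
words = "the quick brown fox jumps over the lazy dog the end".split()

# Counter tallies hashable items in a single call.
counts = Counter(words)

# defaultdict(int) supplies 0 for missing keys, so no membership test is needed.
ddict = defaultdict(int)
for word in words:
    ddict[word] += 1

# Both containers end up with the same tallies.
assert dict(counts) == dict(ddict)
print(counts["the"], counts["fox"])
```

Whether these beat the hand-written loops depends on the data and the Python version, so it is worth timing them on a realistic input.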

Reduce repeated imports, since they slow down performance
Function 1 places the import inside the function:

def doit1():
    import string ###### import statement inside function
    string.lower('Python')

for num in range(100000):
    doit1()

Function 2 places the import outside the function:

import string ###### import statement outside function
def doit2():
    string.lower('Python')

for num in range(100000):
    doit2()

Function 2 runs faster because its import statement executes only once, not on every call:

>>> def doit1():
...     import string
...     string.lower('Python')
...
>>> import string
>>> def doit2():
...     string.lower('Python')
...
>>> import timeit
>>> t = timeit.Timer(setup='from __main__ import doit1', stmt='doit1()')
>>> t.timeit()
11.479144930839539
>>> t = timeit.Timer(setup='from __main__ import doit2', stmt='doit2()')
>>> t.timeit()
4.6661689281463623

Use `string` methods instead of importing `string` module
There are cases where you don't even need to import `string`:

def doit3():
    'Python'.lower()

for num in range(100000):
    doit3()

>>> def doit3():
...     'Python'.lower()
...
>>> t = timeit.Timer(setup='from __main__ import doit3', stmt='doit3()')
>>> t.timeit()
2.5606080293655396

This only helps if the `string` module is not imported at all. If another module
has already loaded it, skipping the import makes no difference. To see whether
it is loaded, check `sys.modules`.
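
Checking `sys.modules` takes one line. Here is a minimal sketch, assuming Python 3; `is_loaded` is a made-up helper name, and whether `string` is already loaded before the explicit import depends on what else your program has imported.

```python
import sys

def is_loaded(module_name):
    # sys.modules maps the names of already-imported modules to module objects.
    return module_name in sys.modules

print("string loaded before import:", is_loaded("string"))
import string  # after this line the module is cached in sys.modules
print("string loaded after import: ", is_loaded("string"))
```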

Lazy imports can be used
This imports `email` only once, on the first invocation of parse_email():

email = None

def parse_email():
    global email
    if email is None:
        import email
    ...
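
To make the pattern concrete, here is one way the elided body could look. This is only a sketch, assuming Python 3: the `raw_text` parameter and the call to `email.message_from_string()` are my own choices for illustration, not the original function body.

```python
import sys

email = None  # placeholder until the module is first needed

def parse_email(raw_text):
    # raw_text and the return value are assumptions made for this sketch.
    global email
    if email is None:
        # Because of the global statement, this rebinds the module-level name.
        import email
    return email.message_from_string(raw_text)

msg = parse_email("Subject: hello\n\nbody text")
print(msg["Subject"])
print("email" in sys.modules)
```

After the first call the global name `email` refers to the module, so later calls skip the import statement entirely.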

Data aggregation
Putting the loop inside a function is faster than calling the function inside a loop.

example 1:

import time
x = 0
def doit1(i):
    global x
    x = x + i

list = range(100000)
t = time.time()
for i in list:
    doit1(i)

print "%.3f" % (time.time()-t)

example 2:

import time
x = 0
def doit2(list):
    global x
    for i in list:
        x = x + i

list = range(100000)
t = time.time()
doit2(list)
print "%.3f" % (time.time()-t)

The second example is faster. Here's a demo:

>>> t = time.time()
>>> for i in list:
...     doit1(i)
...
>>> print "%.3f" % (time.time()-t)
0.758
>>> t = time.time()
>>> doit2(list)
>>> print "%.3f" % (time.time()-t)
0.204
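
The same comparison can be repeated with `timeit`, which runs each case several times. The sketch below is written for Python 3 and avoids the global accumulator so that only the per-element function call differs; `add_one`, `loop_outside` and `loop_inside` are hypothetical names.

```python
import timeit

def add_one(total, i):
    return total + i

def loop_outside(values):
    # One function call per element, as in example 1.
    total = 0
    for i in values:
        total = add_one(total, i)
    return total

def loop_inside(values):
    # The whole loop runs inside a single call, as in example 2.
    total = 0
    for i in values:
        total = total + i
    return total

values = range(100000)

# Both versions must compute the same sum.
assert loop_outside(values) == loop_inside(values)

print("call per element: %.4f s" % timeit.timeit(lambda: loop_outside(values), number=20))
print("loop in function: %.4f s" % timeit.timeit(lambda: loop_inside(values), number=20))
```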
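
Reduce interpreter interval checks
You can pass a higher value to `sys.setcheckinterval()` to reduce how often the
interpreter performs its periodic checks (thread switches and signal handling).
In Python 3.2 and later this was replaced by `sys.setswitchinterval()`.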
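
On Python 3.2 and later the equivalent knob is time-based rather than instruction-based. A minimal sketch, assuming Python 3 and an arbitrary example value of 0.05 seconds; this mainly matters for multithreaded programs.

```python
import sys

# Remember the current thread switch interval (in seconds) so it can be restored.
original = sys.getswitchinterval()

# Ask the interpreter to consider thread switches less often.
sys.setswitchinterval(0.05)
current = sys.getswitchinterval()
print("switch interval raised from %s to %s seconds" % (original, current))

# Restore the previous setting when done experimenting.
sys.setswitchinterval(original)
```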


Some Tools to benchmark program speed
-------------------------------------

trace
  - part of the standard library, so trace.py is found on `sys.path`
  - usage: trace.py -t spam.py eggs
  - or simply run: python -m trace -t spam.py eggs

runsnake
  - GUI tool
  - usage: runsnake some_profile_dump.prof

pycallgraph
  - creates call graphs for Python programs
  - generates PNG file showing the graph traces
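
The runsnake usage above expects a profile dump file. One way to produce such a file is the standard `cProfile` module; the sketch below assumes Python 3, assumes that a cProfile dump is the format runsnake wants, and uses a made-up `work()` function as the code being profiled. It writes the dump to the system temp directory and prints a short summary with `pstats`.

```python
import cProfile
import os
import pstats
import tempfile

def work():
    # Hypothetical workload to profile.
    return sum(i * i for i in range(10000))

# File name borrowed from the runsnake usage line; the location is an assumption.
dump_path = os.path.join(tempfile.gettempdir(), "some_profile_dump.prof")

profiler = cProfile.Profile()
profiler.enable()
result = work()
profiler.disable()
profiler.dump_stats(dump_path)  # write the binary profile dump

# Print the five most expensive entries by cumulative time.
stats = pstats.Stats(dump_path)
stats.sort_stats("cumulative").print_stats(5)
```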
