String Pattern Matching
-----------------------
`re`
|
This module provides regular expression tools for advanced string processing.
>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'(\b[a-z]+) \1', r'\1', 'cat in the the hat')
'cat in the hat'
For simple cases, prefer string methods such as `replace()`, which are easier to read and debug:
>>> 'tea for too'.replace('too', 'two')
'tea for two'
Compiling an expression
--> patterns are compiled into bytecode and executed by a matching engine written in C
--> reusing the compiled pattern object can make repeated matching faster
>>> import re
>>> p = re.compile('[a-z]+')
>>> p
re.compile('[a-z]+')
>>> p.match("")
>>> print(p.match(""))
None
>>> print(p.match("abc"))
<_sre.SRE_Match object; span=(0, 3), match='abc'>
match() vs search()
--> match() - matches only at the beginning of the string
--> search() - scans the string for a match at any position
>>> re.match('[a-z]+', '123abc456')
>>> re.search('[a-z]+', '123abc456')
<_sre.SRE_Match object at 0x7f769f7fd4a8>
>>> re.match('[a-z]+', 'abc456')
<_sre.SRE_Match object at 0x7f769ddf5cc8>
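The difference between `match()` and `search()` can also be checked on a compiled pattern. The following is a minimal sketch; `fullmatch()` is added here for comparison and is not part of the notes above.

```python
import re

pattern = re.compile(r'[a-z]+')   # compile once, reuse below
text = '123abc456'

at_start = pattern.match(text)    # only tries position 0, so no match here
anywhere = pattern.search(text)   # scans forward until the first match
whole = pattern.fullmatch('abc')  # succeeds only if the entire string matches

print(at_start)          # None
print(anywhere.group())  # abc
print(anywhere.span())   # (3, 6)
print(whole.group())     # abc
```

A match object exposes the matched text through `group()` and its position through `span()`, which is usually more useful than printing the object itself.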
|
Templating
----------
`string`
|
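You can use the `Template` class from this module to create a base string with placeholders that are filled in later. Placeholders are written as `$name` or `${name}`, and `$$` produces a literal `$`.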
>>> from string import Template
>>> t = Template('${village}folk send $$10 to $cause.')
>>> t.substitute(village='Nottingham', cause='the ditch fund')
'Nottinghamfolk send $10 to the ditch fund.'
The `substitute()` method raises a `KeyError` exception if a placeholder has no value supplied. You can bypass that with `safe_substitute()`, which leaves missing placeholders unchanged.
>>> t = Template('Return the $item to $owner.')
>>> d = dict(item='unladen swallow')
>>> t.substitute(d)
Traceback (most recent call last):
  ...
KeyError: 'owner'
>>> t.safe_substitute(d)
'Return the unladen swallow to $owner.'
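In a script, the two behaviours can be handled side by side. This is a small sketch, not the only way to do it; the fallback message is a made-up example.

```python
from string import Template

t = Template('Return the $item to $owner.')
values = {'item': 'unladen swallow'}  # 'owner' is deliberately missing

try:
    strict = t.substitute(values)
except KeyError as exc:
    # exc.args[0] holds the name of the missing placeholder
    strict = 'missing placeholder: {}'.format(exc.args[0])

lenient = t.safe_substitute(values)   # leaves $owner in place

print(strict)
print(lenient)
```

`substitute()` suits cases where a missing value is a bug, while `safe_substitute()` suits user-supplied templates where partial output is acceptable.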
Subclassing `Template` lets you choose a custom delimiter, for example to build a batch renamer:
>>> import time, os.path
>>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
>>> class BatchRename(Template):
...     delimiter = '%'
...
>>> fmt = input('Enter rename style (%d-date %n-seqnum %f-format): ')
Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f
>>> t = BatchRename(fmt)
>>> date = time.strftime('%d%b%y')
>>> for i, filename in enumerate(photofiles):
...     base, ext = os.path.splitext(filename)
...     newname = t.substitute(d=date, n=i, f=ext)
...     print('{0} --> {1}'.format(filename, newname))
...
img_1074.jpg --> Ashley_0.jpg
img_1076.jpg --> Ashley_1.jpg
img_1077.jpg --> Ashley_2.jpg
|
Tools for working with Lists
----------------------------
`array`
|
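Provides an `array` object that works like a list but stores only homogeneous data, and stores it more compactly. The type code `'H'` in the example below means two-byte unsigned integers.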
>>> from array import array
>>> a = array('H', [4000, 10, 700, 22222])
>>> sum(a)
26932
>>> a[1:3]
array('H', [10, 700])
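Because every element has a fixed size, an array rejects values that do not fit its type code. The sketch below derives the limit from `itemsize` rather than assuming a particular platform.

```python
from array import array

a = array('H', [4000, 10, 700, 22222])
print(a.typecode, a.itemsize)        # type code and bytes per element

largest = 2 ** (8 * a.itemsize) - 1  # biggest value one element can hold
a.append(largest)                    # fits

try:
    a.append(largest + 1)            # one past the limit
    rejected = False
except OverflowError:
    rejected = True

print(len(a), rejected)
```

This strictness is the trade-off for the compact storage: a plain list would accept any value but uses more memory per element.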
* NEED MORE READING * |
`deque`
|
A list-like object with faster appends and pops from the left side but slower lookups in the middle. This makes it well suited to queues and breadth-first tree searches.
>>> from collections import deque
>>> d = deque(["task1", "task2", "task3"])
>>> d.append("task4")
>>> print("Handling", d.popleft())
Handling task1

# starting_node, gen_moves() and is_goal() are placeholders for your own search problem
unsearched = deque([starting_node])
def breadth_first_search(unsearched):
    node = unsearched.popleft()
    for m in gen_moves(node):
        if is_goal(m):
            return m
        unsearched.append(m)
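The search outline above handles a single node per call. A runnable variant might look like the sketch below, which adds a loop and a `seen` set. The small `graph` dictionary and the way `gen_moves` and `is_goal` are passed in as arguments are assumptions made for illustration.

```python
from collections import deque

def breadth_first_search(start, gen_moves, is_goal):
    """Return the first node satisfying is_goal, or None if there is none."""
    unsearched = deque([start])
    seen = {start}                      # avoid visiting a node twice
    while unsearched:
        node = unsearched.popleft()     # oldest node first: breadth-first order
        if is_goal(node):
            return node
        for m in gen_moves(node):
            if m not in seen:
                seen.add(m)
                unsearched.append(m)
    return None

# Hypothetical example graph: each key maps to its neighbours.
graph = {'a': ['b', 'c'], 'b': ['d'], 'c': ['d', 'e'], 'd': [], 'e': []}

found = breadth_first_search('a', lambda n: graph[n], lambda n: n == 'e')
missing = breadth_first_search('a', lambda n: graph[n], lambda n: n == 'z')
print(found, missing)
```

Popping from the left and appending on the right is what keeps the traversal in breadth-first order, and both operations are fast on a `deque`.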
* NEED MORE READING * |
`bisect`
|
Provides functions for manipulating sorted lists, such as inserting an element at its correct position so the list stays sorted.
>>> import bisect
>>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
>>> bisect.insort(scores, (300, 'ruby'))
>>> scores
[(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]
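Besides inserting, the module can locate a position in a sorted list without changing it. The sketch below uses `bisect_left()` for that; the cutoff score of 250 is an arbitrary example value.

```python
import bisect

scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
bisect.insort(scores, (300, 'ruby'))     # list stays sorted after the insert

# Find the first entry whose score is at least 250.
# A one-element tuple compares by score alone against the (score, name) pairs.
i = bisect.bisect_left(scores, (250,))

print(i)          # 2
print(scores[i])  # (300, 'ruby')
```

Both calls use binary search to find the position, so they only work correctly when the list is already sorted.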
|
`heapq`
|
Implements heaps on top of regular lists, keeping the lowest-valued entry at position zero. Useful for applications that repeatedly access the smallest element(s) but don't want to run a full list sort.
>>> from heapq import heapify, heappop, heappush
>>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
>>> heapify(data)                      # rearrange the list into heap order
>>> heappush(data, -5)                 # add a new entry
>>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
[-5, 0, 1]
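When only a few extreme values are needed once, `nsmallest()` and `nlargest()` from the same module avoid managing the heap by hand. A short sketch with the same data:

```python
import heapq

data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]

smallest = heapq.nsmallest(3, data)  # does not modify data
largest = heapq.nlargest(2, data)

heap = list(data)                    # work on a copy
heapq.heapify(heap)                  # heap[0] is now the minimum

print(smallest)  # [0, 1, 2]
print(largest)   # [9, 8]
print(heap[0])   # 0
```

Peeking at `heap[0]` reads the smallest entry without removing it, while `heappop()` removes it and restores heap order.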
|
Collections
-----------
collections.Counter()
|
>>> from collections import Counter
>>> l = ['a', 'c', 'b', 'd', 'a']
>>> c = Counter(l)
>>> c
Counter({'a': 2, 'd': 1, 'c': 1, 'b': 1})
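A `Counter` behaves like a dictionary of counts with a few extra methods. The sketch below shows lookups, `most_common()` and `update()` on the same list.

```python
from collections import Counter

c = Counter(['a', 'c', 'b', 'd', 'a'])

count_a = c['a']           # 2
count_z = c['z']           # 0: missing keys count as zero instead of raising
top = c.most_common(1)     # [('a', 2)]

c.update(['b', 'b'])       # add more observations
new_top = c.most_common(1) # [('b', 3)]

print(count_a, count_z, top, new_top)
```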
|
JSON
----
Basics
|
Sample JSON data:
{
"a": "apple",
"b": "banana",
"c": "carrot"
}
|
Loading JSON data from a file
|
>>> import json
>>> with open('file.json') as f:
...     data = json.load(f)
...
>>> data = json.load(open('file.json'))  # shorter, but does not close the file explicitly
|
Loading JSON data from a string
|
>>> json.loads('{"a": "apple", "b": "banana"}')
{'a': 'apple', 'b': 'banana'}
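Going the other way, `json.dumps()` turns a Python object back into a JSON string. A minimal round-trip sketch, using the sample data from above:

```python
import json

data = json.loads('{"a": "apple", "b": "banana"}')
data['c'] = 'carrot'                 # modify the parsed dictionary

# indent and sort_keys only affect how the text looks, not its content
text = json.dumps(data, indent=2, sort_keys=True)
print(text)

round_trip = json.loads(text)        # parse it again
print(round_trip == data)            # True
```

`json.dump(obj, f)` is the file-based counterpart, mirroring `json.load(f)`.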
|
Random
------
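Generate random strings
|
import random, string
N = 8  # example length of the string to generate
# pick one character at a time
''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
# equivalent, in a single call (Python 3.6+)
''.join(random.choices(string.ascii_uppercase + string.digits, k=N))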
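The one-liners can be wrapped in a small helper. In this sketch, `random_string` and its `rng` parameter are hypothetical names; passing a seeded `random.Random` instance makes the output repeatable, which helps in tests.

```python
import random
import string

ALPHABET = string.ascii_uppercase + string.digits

def random_string(n, rng=random):
    """Return n random characters from ALPHABET (hypothetical helper)."""
    return ''.join(rng.choices(ALPHABET, k=n))

s = random_string(12)
print(s)

# Two generators with the same seed produce the same string.
first = random_string(12, random.Random(42))
second = random_string(12, random.Random(42))
print(first == second)  # True
```

The `random` module is not meant for security purposes; for tokens or passwords the standard library's `secrets` module is the usual choice.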
|
Itertools
---------
Generates all permutations of an iterable (useful for cracking passwords)
|
>>> import itertools
>>> for i in itertools.permutations('abc'):
...     print(i)
...
('a', 'b', 'c')
('a', 'c', 'b')
('b', 'a', 'c')
('b', 'c', 'a')
('c', 'a', 'b')
('c', 'b', 'a')
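`permutations()` also takes an optional length, and it never repeats an element. For candidate strings where characters may repeat, `itertools.product()` is the closer fit. A short sketch comparing the two:

```python
import itertools

# All orderings of the three characters, joined into strings.
perms = [''.join(p) for p in itertools.permutations('abc')]

# Orderings of length 2 only; no character is used twice.
pairs = [''.join(p) for p in itertools.permutations('abc', 2)]

# Every length-2 string over 'ab', repeats allowed.
repeats = [''.join(p) for p in itertools.product('ab', repeat=2)]

print(perms)    # ['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
print(pairs)    # ['ab', 'ac', 'ba', 'bc', 'ca', 'cb']
print(repeats)  # ['aa', 'ab', 'ba', 'bb']
```

Both functions return lazy iterators, so large search spaces can be walked one candidate at a time instead of being built as a list.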
|