Thursday, May 13, 2021

Python Strings

Introduction 
------------ 

 

- a string written directly in source code is called a "string literal"

 

Strings are enclosed in single quotes ('...') or double quotes ("..."); both work the same way.

>>> 'spam eggs'  # single quotes 

'spam eggs' 

>>> 'doesn\'t'  # use \' to escape the single quote... 

"doesn't" 

>>> "doesn't"  # ...or use double quotes instead 

"doesn't" 

>>> '"Yes," he said.' 

'"Yes," he said.' 

>>> "\"Yes,\" he said." 

'"Yes," he said.' 

>>> '"Isn\'t," she said.' 

'"Isn\'t," she said.' 

The print() function produces more readable
output: it omits the enclosing quotes and
prints escaped and special characters.

>>> '"Isn\'t," she said.' 

'"Isn\'t," she said.' 

>>> print('"Isn\'t," she said.') 

"Isn't," she said. 

>>> s = 'First line.\nSecond line.'  # \n means newline 

>>> s  # without print(), \n is included in the output 

'First line.\nSecond line.' 

>>> print(s)  # with print(), \n produces a new line 

First line. 

Second line. 

Prefix the opening quote with `r` to make a "raw string", in which backslashes are not treated as escape characters

>>> print('C:\some\name')  # here \n means newline! 

C:\some 

ame 

>>> print(r'C:\some\name')  # note the r before the quote 

C:\some\name 
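
To make the difference concrete, here is a small sketch that compares a normal literal, a raw literal, and a literal with doubled backslashes. The path used is only an illustrative value and is not taken from the notes above.

```python
# Illustrative path; in a normal literal, \t and \n are escape sequences.
normal = 'C:\temp\new'      # contains a tab and a newline character
raw = r'C:\temp\new'        # backslashes are kept exactly as typed
escaped = 'C:\\temp\\new'   # doubling each backslash gives the same string as raw

print(len(normal))     # 9
print(len(raw))        # 11
print(raw == escaped)  # True
```

A raw string is usually the easier choice for Windows paths and regular expressions, where backslashes are common.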

String literals can span multiple lines when
enclosed in triple quotes (""" or '''). Line
breaks are included in the string automatically.

# the newline right after the opening quotes is part of the string
>>> print(""" 

... hi there 

... im ok 

... """) 

 

hi there 

im ok 

 

>>> 

 

# a `\` at the end of the first line keeps that newline out of the string

>>> print("""\ 

... hi there 

... im ok 

... """) 

hi there 

im ok 

 

>>> 
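
The two variants are easier to compare with repr(), which shows every newline explicitly. This is only a sketch that reuses the text from the sessions above; the variable names are made up for the example.

```python
with_newline = """
hi there
im ok
"""

without_newline = """\
hi there
im ok
"""

print(repr(with_newline))     # '\nhi there\nim ok\n'
print(repr(without_newline))  # 'hi there\nim ok\n'
```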

Strings can be concatenated with '+' and
repeated with '*'

>>> 'hello' + 'world' 

'helloworld' 

>>> 'a' + 'b' + 'c' 

'abc' 

>>> 3 * 'yada' 

'yadayadayada' 

>>> 'shout: ' + 4 * 'oooo' 

'shout: oooooooooooooooo' 

>>> 

Two or more string literals next to each
other are concatenated automatically, without
'+'. This only works with literals, not with
variables or expressions. To concatenate a
variable and a literal, use '+'.

>>> 'hello' 'world' 

'helloworld' 

>>> 'a' 'b' 'c' 

'abc' 

>>> 

>>> firstname = 'John' 

>>> firstname 'Williams' 

  File "<stdin>", line 1 

    firstname 'Williams' 

                       ^ 

SyntaxError: invalid syntax 

>>> 

>>> lastname = 'Williams' 

>>> firstname + lastname 

'JohnWilliams' 

>>> 
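
When a separator is needed between the parts, str.join() is a common alternative to repeated '+'. The sketch below reuses the names from the session above; the choice of a single space as separator is an assumption made for the example.

```python
firstname = 'John'
lastname = 'Williams'

# '+' needs the separator spelled out between every pair of strings.
full_plus = firstname + ' ' + lastname

# str.join() places the separator between all items of an iterable.
full_join = ' '.join([firstname, lastname])

print(full_plus)  # John Williams
print(full_join)  # John Williams
```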

Concatenation of adjacent literals is handy
for breaking a long string across several
lines inside parentheses

>>> z = ('This is a very long string ' 

...      'and will end at this line') 

>>> print(z) 

This is a very long string and will end at this line 

>>> 

Strings can be indexed (subscripted), with
the first character having index 0. Indexing
returns a single character, which is itself
a string of length 1.

>>> fruit = 'apple' 

>>> fruit[0] 

'a' 

>>> fruit[4] 

'e' 

>>> fruit[0] + fruit[4] 

'ae' 

>>> 10 * fruit[0] 

'aaaaaaaaaa' 

>>> 

Indices can also be negative, which counts
from the right. Since -0 is the same as 0,
negative indices start at -1.

# using same example as above 
>>> fruit[-1] 

'e' 

>>> fruit[-2] 

'l' 

>>> 

Strings can also be sliced to obtain
substrings. The start index is always
included and the end index is always
excluded. An omitted start defaults to 0 and
an omitted end defaults to the length of the
string. This makes sure that s[:i] + s[i:] is
always equal to s.
 
Summary of slices and indices: 

 

 +---+---+---+---+---+
 | a | p | p | l | e |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1

# using same example as above 
>>> fruit[0:2]  # characters from 0 (included) to 2 (excluded) 

'ap' 

>>> fruit[2:4]  # characters from 2 (included) to 4 (excluded)

'pl' 

>>> 

>>> fruit[:3]  # index 0 (included) to 3 (excluded) 

'app' 

>>> fruit[2:]  # characters from index 2 (included) to the end

'ple' 

>>> 

>>> fruit[-3:] # characters from index -3 (included) to the end

'ple'          # the omitted end defaults to len(fruit), so the last character is included

>>> 
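
The slicing rules can be checked directly. The following sketch verifies the s[:i] + s[i:] property for every split point, including out-of-range ones, and shows that an omitted bound is the same as writing 0 or len(s) explicitly.

```python
fruit = 'apple'

# s[:i] + s[i:] rebuilds the original string for every split point i,
# even when i lies outside the valid index range.
for i in range(-7, 8):
    assert fruit[:i] + fruit[i:] == fruit

# Omitted bounds default to the start and the end of the string.
print(fruit[-3:] == fruit[-3:len(fruit)])  # True
print(fruit[:3] == fruit[0:3])             # True
```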

Error handling in slices and indices:
using an index that is too large results in
an error, while out-of-range slice indices
are handled gracefully.

>>> fruit[100] 

Traceback (most recent call last): 

  File "<stdin>", line 1, in <module> 

IndexError: string index out of range 

>>> fruit[:100] 

'apple' 

>>> fruit[-100:] 

'apple' 

>>> fruit[3:100] 

'le' 

>>> 
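
If a missing character should not raise an error, the IndexError can be caught. The helper below, char_at, is a hypothetical name invented for this sketch and is not part of Python; it only illustrates the difference between indexing and slicing.

```python
def char_at(s, index, default=''):
    """Return s[index], or default if index is out of range (hypothetical helper)."""
    try:
        return s[index]
    except IndexError:
        return default


fruit = 'apple'
print(char_at(fruit, 1))          # p
print(repr(char_at(fruit, 100)))  # ''
print(repr(fruit[100:101]))       # '' (a slice never raises for large indices)
```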

Strings are immutable. Assigning to an
index or a slice results in an error. To get
a different string, create a new one.

>>> fruit[0] = 'x' 

Traceback (most recent call last): 

  File "<stdin>", line 1, in <module> 

TypeError: 'str' object does not support item assignment 

>>> fruit[1:3] = 'xyz' 

Traceback (most recent call last): 

  File "<stdin>", line 1, in <module> 

TypeError: 'str' object does not support item assignment 

>>> 
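
Creating a new string usually means combining slices of the old one with the new text. A minimal sketch of the two assignments that failed above, written as new strings instead (the variable names are made up for the example):

```python
fruit = 'apple'

# "Replace" the first character: new letter + the rest of the old string.
changed_first = 'x' + fruit[1:]

# "Replace" the slice [1:3]: part before + new text + part after.
changed_slice = fruit[:1] + 'xyz' + fruit[3:]

print(changed_first)  # xpple
print(changed_slice)  # axyzle
print(fruit)          # apple (the original string is unchanged)
```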

The built-in function len() returns the length
of a string

>>> len(fruit) 

5 

>>> 

 

String Formatting 

----------------- 

 

Using str() and repr() 

str() returns a human-readable representation of an object, while repr() returns a representation
meant to be read by the interpreter.
 
As an example, let's convert an integer into string using str() 

>>> type(2) 

<class 'int'> 

>>> type(str(2)) 

<class 'str'> 

>>> 

 

repr(), on the other hand, returns the string as you would write it in source code, with quotes and escape sequences.

>>> s = 'hello world\n' 

>>> print(s) 

hello world      # --> this is what a human sees

 

>>> print(repr(s)) 

'hello world\n'  # --> this is what the interpreter sees

>>> 
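
The difference is more visible on objects that are not strings. The sketch below uses datetime.date from the standard library as an example object; the date itself is arbitrary.

```python
import datetime

day = datetime.date(2021, 5, 13)

print(str(day))   # 2021-05-13
print(repr(day))  # datetime.date(2021, 5, 13)

# For many built-in values, evaluating the repr() gives back an equal object.
print(eval(repr('hello world\n')) == 'hello world\n')  # True
```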

 

Other examples from official docs. 

>>> s = 'Hello, world.' 

>>> str(s) 

'Hello, world.' 

>>> repr(s) 

"'Hello, world.'" 

>>> str(1/7) 

'0.14285714285714285' 

>>> x = 10 * 3.25 

>>> y = 200 * 200 

>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...' 

>>> print(s) 

The value of x is 32.5, and y is 40000... 

>>> # The repr() of a string adds string quotes and backslashes:

... hello = 'hello, world\n' 

>>> hellos = repr(hello) 

>>> print(hellos) 

'hello, world\n' 

>>> # The argument to repr() may be any Python object:

... repr((x, y, ('spam', 'eggs'))) 

"(32.5, 40000, ('spam', 'eggs'))" 

Making tables of squares 
and cubes 

>>> for x in range(1, 11): 

...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ') 

...     # Note use of 'end' on previous line 

...     print(repr(x*x*x).rjust(4)) 

... 

 1   1    1 

 2   4    8 

 3   9   27 

 4  16   64 

 5  25  125 

 6  36  216 

 7  49  343 

 8  64  512 

 9  81  729 

10 100 1000 

 

>>> for x in range(1, 11): 

...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x)) 

... 

 1   1    1 

 2   4    8 

 3   9   27 

 4  16   64 

 5  25  125 

 6  36  216 

 7  49  343 

 8  64  512 

 9  81  729 

10 100 1000 
 
*** need more reading on this 
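
As a starting point for that reading: in a field such as '{0:2d}', the part before the colon selects the argument and the part after it is the format specification, here a minimum width of 2 and integer presentation. Numbers are right-aligned by default, which is why both loops print the same table. A small sketch comparing the two approaches for a single value:

```python
x = 4

by_format = '{0:3d}'.format(x * x)  # minimum width 3, integer presentation
by_rjust = repr(x * x).rjust(3)     # pad on the left with spaces to width 3

print(repr(by_format))        # ' 16'
print(by_format == by_rjust)  # True
```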

Using zfill() to pad a numeric string
with zeros on the left

>>> 2.zfill(5)  # zfill() is a string method, so the number must be converted to a string first

  File "<stdin>", line 1 

    2.zfill(5) 

          ^ 

SyntaxError: invalid syntax 

>>> '2'.zfill(5)  # first way: write the number as a string literal

'00002'

>>> str(2).zfill(5)  # another way: convert with str()

'00002' 

>>> '12'.zfill(5) 

'00012' 

>>> '-3.14'.zfill(7) 

'-003.14' 

>>> '3.14159265359'.zfill(5) 

'3.14159265359' 
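
Zero padding can also be requested in a format specification, which works on numbers directly without converting them to strings first. This is a sketch of the equivalent calls, not a replacement for zfill():

```python
print(str(2).zfill(5))           # 00002
print('{:05d}'.format(2))        # 00002
print('{:07.2f}'.format(-3.14))  # -003.14
```

In both approaches the sign stays in front of the padding zeros.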

Using format() together 
with print() 

Basic usage: 

>>> print('We are the {} who say "{}!"'.format('knights', 'Ni')) 

We are the knights who say "Ni!" 

>>> 
 
The braces {} are called "format fields". A number inside the braces refers to the position of the argument passed to format().

>>> print('{0} and {1}'.format('spam', 'eggs')) 

spam and eggs 

>>> print('{1} and {0}'.format('spam', 'eggs')) 

eggs and spam 

>>> 

 

Keyword arguments can also be used to specify values in format fields.

>>> print('This {food} is {adjective}.'.format( 

...       food='spam', adjective='absolutely horrible')) 

This spam is absolutely horrible. 

>>> 
 
Combining positional and keyword arguments: 

>>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred', 

...       other='Georg')) 

The story of Bill, Manfred, and Georg. 

>>> 
 

'!a' (apply ascii()), '!s' (apply str()) and '!r' (apply repr()) can be used to convert 

the value before it is formatted: 

>>> contents = 'eels' 

>>> print('My hovercraft is full of {}.'.format(contents)) 

My hovercraft is full of eels. 

>>> print('My hovercraft is full of {!r}.'.format(contents)) 

My hovercraft is full of 'eels'. 
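
The three conversions differ most on text that contains non-ASCII characters. A short sketch, using an arbitrary example word:

```python
word = 'café'

print('{!s}'.format(word))  # café
print('{!r}'.format(word))  # 'café'
print('{!a}'.format(word))  # 'caf\xe9'
```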
 
The number of decimal places can be limited with an optional ':' and a format specifier after the field name

>>> import math 

>>> print('The value of PI is approximately {0:.3f}.'.format(math.pi)) 

The value of PI is approximately 3.142. 
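
The format specifier after the colon covers more than decimal places. A few further cases as a sketch; the numbers are arbitrary examples:

```python
print('{:.2f}'.format(2 / 3))             # 0.67
print('{:,}'.format(1234567))             # 1,234,567
print('{:.1%}'.format(0.256))             # 25.6%
print(repr('{:8.3f}'.format(3.14159)))    # '   3.142'
```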
 
Specifying a minimum field width, which is useful for lining up the columns of a table:

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678} 

>>> for name, phone in table.items(): 

...     print('{0:10} ==> {1:10d}'.format(name, phone)) 

... 

Sjoerd     ==>       4127

Jack       ==>       4098

Dcab       ==>       7678
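
By default strings are left-aligned and numbers are right-aligned within the field width, which is what the table above relies on. The alignment can also be set explicitly with '<', '>' and '^'. A sketch with made-up values; the '|' only marks the end of the field:

```python
print('{:6}|'.format('ab'))    # ab    |   (strings: left by default)
print('{:6}|'.format(42))      #     42|   (numbers: right by default)
print('{:<6}|'.format(42))     # 42    |
print('{:>6}|'.format('ab'))   #     ab|
print('{:^6}|'.format('ab'))   #   ab  |
print('{:*^8}'.format('hi'))   # ***hi***  (custom fill character)
```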
 

If you have a really long format string that you don’t want to split up, it would be nice if 

you could reference the variables to be formatted by name instead of by position. This can be 

done by simply passing the dict and using square brackets '[]' to access the keys 

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} 

>>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; ' 

...       'Dcab: {0[Dcab]:d}'.format(table)) 

Jack: 4098; Sjoerd: 4127; Dcab: 8637678 
 

A shorter way to do the same thing is to unpack the dict into keyword arguments with '**':

>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678} 

>>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table)) 

Jack: 4098; Sjoerd: 4127; Dcab: 8637678 

Old string formatting 

The % operator can also be used for string formatting. It interprets the left argument much 

like a sprintf()-style format string to be applied to the right argument, and returns the 

string resulting from this formatting operation. For example: 

>>> import math 

>>> print('The value of PI is approximately %5.3f.' % math.pi) 

The value of PI is approximately 3.142. 
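
With more than one value, the right-hand side of % must be a tuple, or a dict when the placeholders are named. A sketch with made-up names and values:

```python
# Several values: pass a tuple.
print('%s is %d years old' % ('Ann', 30))  # Ann is 30 years old

# Named placeholders: pass a dict.
print('%(name)s is a %(job)s' % {'name': 'John', 'job': 'driver'})  # John is a driver

# A literal percent sign is written as %%.
print(repr('%5.1f%%' % 99.5))  # ' 99.5%'
```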

Advanced slicing techniques 

A step of -1 reverses the string
>>> s = 'rise to vote sir' 

>>> s[::-1] 

'ris etov ot esir' 

>>>  
 
A step of 2 takes every second character (the characters at even indices)

>>> s = 'H1e2l3l4o5w6o7r8l9d' 

>>> s[::2] 

'Helloworld' 

>>> 
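
Both tricks combine well with other string methods. The sketch below uses reversal for a palindrome check and a start offset to pick the odd indices instead of the even ones. The function is_palindrome is a hypothetical helper written for this example.

```python
def is_palindrome(text):
    """Hypothetical helper: True if text reads the same both ways, ignoring spaces and case."""
    letters = text.replace(' ', '').lower()
    return letters == letters[::-1]


print(is_palindrome('rise to vote sir'))  # True
print(is_palindrome('apple'))             # False

s = 'H1e2l3l4o5w6o7r8l9d'
print(s[1::2])  # 123456789 (start at index 1, then every second character)
```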

Converting bytes into a string

bytes.decode(encoding="utf-8")

  - call decode() on a bytes object, for example b'[]\n'.decode("utf-8"); "utf-8" is the default encoding
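
A round trip shows both directions: str.encode() turns a string into bytes and bytes.decode() turns it back. This is a sketch with an arbitrary example text; it also shows that decoding with an unsuitable codec raises an error instead of guessing.

```python
data = 'café\n'.encode('utf-8')  # str -> bytes
print(data)                      # b'caf\xc3\xa9\n'

text = data.decode('utf-8')      # bytes -> str
print(repr(text))                # 'café\n'

# The bytes for 'é' are not valid ASCII, so this decode fails.
try:
    data.decode('ascii')
except UnicodeDecodeError as exc:
    print('cannot decode:', exc.reason)
```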

 

JSON 

---- 

 

- the "json" module can convert a dictionary to a JSON string and vice versa

 

Converting a JSON string to a dict

>>> import json 

>>> json_string = '{"first_name": "Guido", "last_name":"Rossum"}' 

>>> x = json.loads(json_string) 

>>> x['first_name'] 

'Guido' 

Converting a dict to a JSON string

>>> person = {'name': 'John', 'occupation': 'driver'}  # not named `dict`, which would shadow the built-in

>>> json.dumps(person)

'{"name": "John", "occupation": "driver"}' 

>>> 
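
The two functions are inverses of each other for plain dicts, lists, strings and numbers. The sketch below does a round trip and uses the indent and sort_keys parameters of json.dumps() for readable output; the 'languages' key is made up for the example.

```python
import json

person = {'name': 'John', 'occupation': 'driver', 'languages': ['en', 'fr']}

# dict -> JSON string, pretty-printed with sorted keys
text = json.dumps(person, indent=2, sort_keys=True)
print(text)

# JSON string -> dict
restored = json.loads(text)
print(restored == person)          # True
print(type(text), type(restored))  # <class 'str'> <class 'dict'>
```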

 

Sources 

------- 

 

Complete overview of string formatting 
