Sunday, July 15, 2018

Strings


Introduction
------------

- a string written directly in source code is called a "string literal"

strings are contained inside '...' or "..."
>>> 'spam eggs'  # single quotes
'spam eggs'
>>> 'doesn\'t'  # use \' to escape the single quote...
"doesn't"
>>> "doesn't"  # ...or use double quotes instead
"doesn't"
>>> '"Yes," he said.'
'"Yes," he said.'
>>> "\"Yes,\" he said."
'"Yes," he said.'
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
print() function produces a more
readable output
>>> '"Isn\'t," she said.'
'"Isn\'t," she said.'
>>> print('"Isn\'t," she said.')
"Isn't," she said.
>>> s = 'First line.\nSecond line.'  # \n means newline
>>> s  # without print(), \n is included in the output
'First line.\nSecond line.'
>>> print(s)  # with print(), \n produces a new line
First line.
Second line.
Prefix the opening quote with `r` to make a "raw string", in which
backslashes are not treated as escape characters
>>> print('C:\some\name')  # here \n means newline!
C:\some
ame
>>> print(r'C:\some\name')  # note the r before the quote
C:\some\name
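
A small sketch comparing the two spellings, in case it helps. The variable names are made up for this note; the point is that a raw string and a string with doubled backslashes hold the same characters.

```python
# Two ways to spell the same Windows-style path.
escaped = 'C:\\some\\name'   # every backslash doubled
raw = r'C:\some\name'        # raw string: backslashes kept as typed

same_text = (escaped == raw)   # both hold the same 12 characters

raw_newline = r'\n'    # two characters: a backslash and the letter n
real_newline = '\n'    # one character: an actual newline

print(same_text, len(raw), len(raw_newline), len(real_newline))
```

Raw strings are mostly handy for Windows paths and regular expressions, where backslashes are common.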
String literals can span multiple lines
when enclosed in """ or '''
# line breaks are included in the string
>>> print("""
... hi there
... im ok
... """)

hi there
im ok

>>>

# a `\` at the end of a line keeps that line break out of the string
>>> print("""\
... hi there
... im ok
... """)
hi there
im ok

>>>
strings can be combined using '+' and
repeated using '*'
>>> 'hello' + 'world'
'helloworld'
>>> 'a' + 'b' + 'c'
'abc'
>>> 3 * 'yada'
'yadayadayada'
>>> 'shout: ' + 4 * 'oooo'
'shout: oooooooooooooooo'
>>>
String literals placed next to each other
are joined automatically, without '+'. This
only works with literals, not with variables
or expressions. Use '+' to join a variable
and a literal.
>>> 'hello' 'world'
'helloworld'
>>> 'a' 'b' 'c'
'abc'
>>>
>>> firstname = 'John'
>>> firstname 'Williams'
  File "<stdin>", line 1
    firstname 'Williams'
                       ^
SyntaxError: invalid syntax
>>>
>>> lastname = 'Williams'
>>> firstname + lastname
'JohnWilliams'
>>>
A long string can be split across several
source lines by wrapping adjacent literals
in parentheses
>>> z = ('This is a very long string '
...      'and will end at this line')
>>> print(z)
This is a very long string and will end at this line
>>>
Strings can be indexed (subscripted), with
the first character having index 0. This
allows obtaining a specific character.
>>> fruit = 'apple'
>>> fruit[0]
'a'
>>> fruit[4]
'e'
>>> fruit[0] + fruit[4]
'ae'
>>> 10 * fruit[0]
'aaaaaaaaaa'
>>>
Indices can be referenced starting from the
right using negative numbers
# using same example as above
>>> fruit[-1]
'e'
>>> fruit[-2]
'l'
>>>
Strings can also be sliced to obtain
substrings. The start index is always
included while the end index is always
excluded. This makes sure that
s[:i] + s[i:] is always equal to s.

Summary of slices and indices:

 +---+---+---+---+---+
 | a | p | p | l | e |
 +---+---+---+---+---+
 0   1   2   3   4   5
-5  -4  -3  -2  -1
# using same example as above
>>> fruit[0:2]  # characters from 0 (included) to 2 (excluded)
'ap'
>>> fruit[2:4]  # characters from 2 (included) to 4 (excluded)
'pl'
>>>
>>> fruit[:3]  # index 0 (included) to 3 (excluded)
'app'
>>> fruit[2:]  # index 2 (included) to the end of the string
'ple'
>>>
>>> fruit[-3:]  # index -3 (included) to the end; an omitted end defaults to len(fruit)
'ple'
>>>
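
A quick check of the slicing rules, written as a script rather than a REPL session. The names `tail_a` and `tail_b` are just for illustration.

```python
fruit = 'apple'

# For every split point i, the two halves join back into the original string.
for i in range(len(fruit) + 1):
    assert fruit[:i] + fruit[i:] == fruit

# Leaving out the end index means "up to len(fruit)", so the last
# character is part of the result.
tail_a = fruit[-3:]
tail_b = fruit[2:len(fruit)]

print(tail_a, tail_b)
```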
Out-of-range handling in indices and slices.
Using an index that is too large raises an
error, while out-of-range slice indices are
handled gracefully.
>>> fruit[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> fruit[:100]
'apple'
>>> fruit[-100:]
'apple'
>>> fruit[3:100]
'le'
>>>
Strings are immutable. Assigning a value to
an index or a slice results in an error.
The only way to "change" a string is to
create a new one.
>>> fruit[0] = 'x'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> fruit[1:3] = 'xyz'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>>
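
A minimal sketch of creating a new string instead of assigning into the old one. The variable names are made up for this note.

```python
fruit = 'apple'

# Replace the first character by joining a new character with the rest.
first_replaced = 'x' + fruit[1:]

# Replace the slice [1:3] by joining the pieces around it.
middle_replaced = fruit[:1] + 'xyz' + fruit[3:]

# str.replace() also returns a new string; the 1 limits it to one replacement.
via_replace = fruit.replace('a', 'x', 1)

# The original string is untouched in every case.
print(first_replaced, middle_replaced, via_replace, fruit)
```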
built-in function len() returns the length
of a string
>>> len(fruit)
5
>>>

String Formatting
-----------------

Using str() and repr()
str() returns a human-readable string for an object, while repr() returns a representation
that the interpreter can read (for strings, this includes the quotes and escape sequences).

As an example, let's convert an integer into a string using str()
>>> type(2)
<class 'int'>
>>> type(str(2))
<class 'str'>
>>>

On the other hand, repr() will show you how the interpreter sees the input.
>>> s = 'hello world\n'
>>> print(s)
hello world      # --> this is what a human sees

>>> print(repr(s))
'hello world\n'  # --> this is what the interpreter sees
>>>

Other examples from official docs.
>>> s = 'Hello, world.'
>>> str(s)
'Hello, world.'
>>> repr(s)
"'Hello, world.'"
>>> str(1/7)
'0.14285714285714285'
>>> x = 10 * 3.25
>>> y = 200 * 200
>>> s = 'The value of x is ' + repr(x) + ', and y is ' + repr(y) + '...'
>>> print(s)
The value of x is 32.5, and y is 40000...
>>>                          # The repr() of a string adds string quotes and backslashes:
... hello = 'hello, world\n'
>>> hellos = repr(hello)
>>> print(hellos)
'hello, world\n'
>>>                          # The argument to repr() may be any Python object:
... repr((x, y, ('spam', 'eggs')))
"(32.5, 40000, ('spam', 'eggs'))"
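
The difference is easier to see on a custom class. The sketch below uses a hypothetical `Temperature` class (not from the official docs) that defines both `__str__` and `__repr__`, the methods that str() and repr() call.

```python
class Temperature:
    """Hypothetical class used only to illustrate __str__ vs __repr__."""

    def __init__(self, degrees):
        self.degrees = degrees

    def __repr__(self):
        # Aimed at developers: looks like the code that recreates the object.
        return 'Temperature({!r})'.format(self.degrees)

    def __str__(self):
        # Aimed at readers: a friendly description.
        return '{} degrees'.format(self.degrees)


t = Temperature(21.5)
as_str = str(t)      # uses __str__
as_repr = repr(t)    # uses __repr__
in_list = str([t])   # containers show their items with repr()

print(as_str)
print(as_repr)
print(in_list)
```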
Making tables of squares
and cubes
>>> for x in range(1, 11):
...     print(repr(x).rjust(2), repr(x*x).rjust(3), end=' ')
...     # Note use of 'end' on previous line
...     print(repr(x*x*x).rjust(4))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000

>>> for x in range(1, 11):
...     print('{0:2d} {1:3d} {2:4d}'.format(x, x*x, x*x*x))
...
 1   1    1
 2   4    8
 3   9   27
 4  16   64
 5  25  125
 6  36  216
 7  49  343
 8  64  512
 9  81  729
10 100 1000
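
A rough breakdown of the format spec used in the second loop, as far as I understand it: in '{0:2d}', the 0 picks the argument, 2 is the minimum width and d means decimal integer. The sketch below builds the rows both ways and checks that they match; `rows` and `manual` are names made up for this note.

```python
# Rows built with format specs: argument index, minimum width, type d.
rows = ['{0:2d} {1:3d} {2:4d}'.format(x, x * x, x * x * x)
        for x in range(1, 11)]

# The same rows built by hand with str.rjust(), like the first loop.
manual = [' '.join([str(x).rjust(2), str(x * x).rjust(3), str(x ** 3).rjust(4)])
          for x in range(1, 11)]

print(rows[0])
print(rows[-1])
print(rows == manual)
```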

*** need more reading on this
Using zfill() to pad zeros
on strings
>>> 2.zfill(5)  # zfill() is a string method, so the number must be converted first
  File "<stdin>", line 1
    2.zfill(5)
          ^
SyntaxError: invalid syntax
>>> '2'.zfill(5)  # first way to convert
'00002'
>>> str(2).zfill(5)  # and another way
'00002'
>>> '12'.zfill(5)
'00012'
>>> '-3.14'.zfill(7)
'-003.14'
>>> '3.14159265359'.zfill(5)
'3.14159265359'
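
For integers, a format spec with a leading 0 in the width appears to give the same padding without converting to a string first. A small comparison, with variable names made up for this note:

```python
# Padding via the string method.
padded_zfill = str(42).zfill(5)
negative_zfill = '-42'.zfill(5)

# Padding via a format spec: 0 = pad with zeros, 5 = width, d = integer.
padded_format = '{:05d}'.format(42)
negative_format = '{:05d}'.format(-42)

print(padded_zfill, padded_format, negative_zfill, negative_format)
```

Both approaches keep the minus sign in front of the zeros.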
Using format() together
with print()
Basic usage:
>>> print('We are the {} who say "{}!"'.format('knights', 'Ni'))
We are the knights who say "Ni!"
>>>

{} are called "format fields". You may use positional arguments to specify their values.
>>> print('{0} and {1}'.format('spam', 'eggs'))
spam and eggs
>>> print('{1} and {0}'.format('spam', 'eggs'))
eggs and spam
>>>

Keyword arguments can also be used to specify the values of format fields.
>>> print('This {food} is {adjective}.'.format(
...       food='spam', adjective='absolutely horrible'))
This spam is absolutely horrible.
>>>

Combining positional and keyword arguments:
>>> print('The story of {0}, {1}, and {other}.'.format('Bill', 'Manfred',
...       other='Georg'))
The story of Bill, Manfred, and Georg.
>>>
 
'!a' (apply ascii()), '!s' (apply str()) and '!r' (apply repr()) can be used to convert
the value before it is formatted:
>>> contents = 'eels'
>>> print('My hovercraft is full of {}.'.format(contents))
My hovercraft is full of eels.
>>> print('My hovercraft is full of {!r}.'.format(contents))
My hovercraft is full of 'eels'.
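
A small sketch of all three conversions side by side, using a word with a non-ASCII character so that '!a' shows a difference. The variable names are made up for this note.

```python
word = 'café'

plain = '{}'.format(word)    # no conversion
as_s = '{!s}'.format(word)   # str() applied first
as_r = '{!r}'.format(word)   # repr() applied first: adds quotes
as_a = '{!a}'.format(word)   # ascii() applied first: escapes non-ASCII

print(plain, as_s, as_r, as_a)
```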

An optional ":" followed by a format specifier gives more control, for example to limit the number of decimal places
>>> import math
>>> print('The value of PI is approximately {0:.3f}.'.format(math.pi))
The value of PI is approximately 3.142.

Specifying a minimum column width for a table:
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 7678}
>>> for name, phone in table.items():
...     print('{0:10} ==> {1:10d}'.format(name, phone))
...
Sjoerd     ==>       4127
Jack       ==>       4098
Dcab       ==>       7678
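
In the output above the names are left-aligned and the numbers right-aligned, which are the defaults for strings and integers. The alignment can also be set explicitly with '<', '>' and '^'. A minimal sketch, with variable names made up for this note:

```python
name = 'Jack'

left = '{:<10}|'.format(name)       # left-aligned in 10 characters
right = '{:>10}|'.format(name)      # right-aligned in 10 characters
centered = '{:^10}|'.format(name)   # centered in 10 characters
filled = '{:*^10}'.format(name)     # centered, padded with '*'

# Defaults: strings go left, integers go right.
defaults = '{:10}|{:10d}|'.format(name, 4098)

for line in (left, right, centered, filled, defaults):
    print(line)
```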
 
If you have a really long format string that you don’t want to split up, it would be nice if
you could reference the variables to be formatted by name instead of by position. This can be
done by simply passing the dict and using square brackets '[]' to access the keys
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {0[Jack]:d}; Sjoerd: {0[Sjoerd]:d}; '
...       'Dcab: {0[Dcab]:d}'.format(table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
 
A shorter way of doing the above example is to unpack the dict into keyword arguments with '**'.
>>> table = {'Sjoerd': 4127, 'Jack': 4098, 'Dcab': 8637678}
>>> print('Jack: {Jack:d}; Sjoerd: {Sjoerd:d}; Dcab: {Dcab:d}'.format(**table))
Jack: 4098; Sjoerd: 4127; Dcab: 8637678
Old string formatting
The % operator can also be used for string formatting. It interprets the left argument much
like a sprintf()-style format string to be applied to the right argument, and returns the
string resulting from this formatting operation. For example:
>>> import math
>>> print('The value of PI is approximately %5.3f.' % math.pi)
The value of PI is approximately 3.142.
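
A few more % cases, sketched under the assumption that they behave like the sprintf-style rules described above: several values go in a tuple, and named fields take a dict. The variable names and sample values are made up for this note.

```python
import math

single = 'The value of PI is approximately %5.3f.' % math.pi

# Several values: pass a tuple on the right-hand side.
several = '%s has %d items' % ('cart', 3)

# Named fields: pass a dict and use %(key)s in the format string.
by_name = '%(name)s is %(age)d' % {'name': 'John', 'age': 30}

# The equivalent str.format() spelling of the first example.
same_as_format = 'The value of PI is approximately {:5.3f}.'.format(math.pi)

print(single)
print(several)
print(by_name)
```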
Advanced slicing techniques
A step of -1 returns the string in reverse order
>>> s = 'rise to vote sir'
>>> s[::-1]
'ris etov ot esir'
>>>

A step of 2 takes every second character (the characters at even indices)
>>> s = 'H1e2l3l4o5w6o7r8l9d'
>>> s[::2]
'Helloworld'
>>>
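
Two small uses of the step value, as a sketch. `is_palindrome` is a hypothetical helper written for this note, not a built-in.

```python
def is_palindrome(text):
    """Return True if text reads the same in both directions, ignoring spaces and case."""
    letters = text.replace(' ', '').lower()
    return letters == letters[::-1]


s = 'H1e2l3l4o5w6o7r8l9d'
evens = s[::2]    # characters at indices 0, 2, 4, ...
odds = s[1::2]    # characters at indices 1, 3, 5, ...

print(is_palindrome('rise to vote sir'), is_palindrome('apple'))
print(evens, odds)
```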
Converting bytes into a string
bytes.decode(encoding="utf-8")
  - where the bytes object may look like b'[]\n'
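
A minimal round trip between bytes and str, assuming UTF-8 throughout. The variable names are made up for this note.

```python
raw = b'[]\n'

# bytes -> str
text = raw.decode(encoding='utf-8')

# str -> bytes (the reverse direction)
back = text.encode('utf-8')

# Non-ASCII characters take more than one byte in UTF-8.
accented = 'café'.encode('utf-8')

print(repr(text), back == raw, len('café'), len(accented))
```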

JSON
----

- the "json" module converts a dict to a JSON string and vice versa

Converting a JSON string to a dict
>>> import json
>>> json_string = '{"first_name": "Guido", "last_name":"Rossum"}'
>>> x = json.loads(json_string)
>>> x['first_name']
'Guido'
Converting a dict into a JSON string
>>> person = {'name': 'John', 'occupation': 'driver'}  # don't name it `dict`, that shadows the built-in
>>> json.dumps(person)
'{"name": "John", "occupation": "driver"}'
>>>
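
A round-trip sketch with a few more value types. The sample data is made up; `sort_keys` and `indent` are optional arguments of json.dumps() that make the output stable and easier to read.

```python
import json

person = {'name': 'John', 'occupation': 'driver', 'age': 30, 'licensed': True}

# dict -> JSON string (keys sorted so the output is predictable)
encoded = json.dumps(person, sort_keys=True)

# JSON string -> dict
decoded = json.loads(encoded)

# Pretty-printed version for reading
pretty = json.dumps(person, indent=2, sort_keys=True)

print(encoded)
print(decoded == person)
print(pretty)
```

Note that Python's True becomes true in the JSON text and comes back as True after loading.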

Sources
-------

Complete overview of string formatting
