Outline

Let's go over some of the basics of strings and string formatting. Python's built-in string class is named "str". Single quotes, double quotes, and triple quotes all create the same kind of string object. Single quotes are most commonly used. Backslash escapes work the usual way within both single- and double-quoted literals. If you need a literal single quote in your string, you can wrap the string in double quotes. Likewise, single-quoted strings can contain double quotes. Let's look at some examples:

# All of these assignments create a str; the quote style does not change the type
foo = 'example1'
foo = "example2"
foo = """example3"""

# If you need to extend to multiple lines, use triple quotes
foo = """ There is not a difference in using 
the different quotes to make
all of the different kinds of strings """

# To wrap a string across lines without triple quotes, end the line with "\";
# the adjacent string literals are joined into one string
foo = 'This is a string ' \
    'that needs to wrap multiple lines'

# You can replace items in the string
new_foo = foo.replace('multiple', 'many')
print(new_foo)
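
The quoting and escaping rules described above can be sketched as follows. The variable names and sample sentences are made up for illustration; this is a minimal sketch, not an exhaustive list of escape sequences.

```python
# Mixing quote styles avoids the need for escapes.
single = 'She said "hello" to me'
double = "It's a nice day"

# A backslash escape works in either quote style.
escaped = 'It\'s a nice day'
assert escaped == double

# \n and \t are interpreted as a newline and a tab in normal string literals.
two_lines = 'first line\nsecond line'
print(two_lines)
print(len('a\tb'))  # 3: 'a', a tab character, 'b'
```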

String Formatting

The Back Story

Before f-strings arrived in Python 3.6, you had two main ways of formatting a Python string: %-formatting and str.format(). It's important that you understand how they are used and what their limitations are.

%-formatting

Check out the Python documentation on %-Formatting

A word of caution from the Python documentation:

Note: The formatting operations described here exhibit a variety of quirks that lead to a number of common errors (such as failing to display tuples and dictionaries correctly). Using the newer formatted string literals, the str.format() interface or template strings may help avoid these errors. Each of these alternatives provides their own trade-offs and benefits of simplicity, flexibility, and/or extensibility.

String objects have a built-in operation using the % operator, which you can use to format strings. Here is an example:

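def test(got, expected):
    # Compare the two values and pick a prefix for the report line
    if got == expected:
        prefix = ' OK '
    else:
        prefix = '  X '
    # The values on the right are passed as a tuple, one per %s placeholder
    print('%s got: %s expected: %s' % (prefix, repr(got), repr(expected)))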


test('foo', 'bar')
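
The tuple quirk that the documentation warns about can be reproduced with a short sketch. The variable names here are hypothetical; the point is only that a bare tuple on the right of % is treated as the list of arguments rather than as a single value.

```python
point = (3, 4)

# A bare tuple is unpacked as the argument list, so one %s is not enough:
try:
    message = 'point: %s' % point
except TypeError as exc:
    message = f'TypeError: {exc}'
print(message)

# Wrapping the value in a one-element tuple formats it as intended.
safe = 'point: %s' % (point,)
print(safe)  # point: (3, 4)
```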

str.format()

Python improved on %-formatting with str.format(). With str.format(), you use {} to mark the replacement fields. It is also extensible through the __format__() method, which lets a type control how it is converted to a string.

>>> name = 'John'
>>> profession = 'Janitor'
>>> 'Hello, my name is {}.  I am a {}.'.format(name, profession)
'Hello, my name is John.  I am a Janitor.'

You can also pass the arguments in any order and reference them by their index:

>>> 'Hello, my name is {1}.  I am a {0}.'.format(profession, name)
'Hello, my name is John.  I am a Janitor.'

One of the issues with using str.format() is that it can get even more complicated from here. Although it's better than %-formatting, it can still be noisy when you are dealing with multiple parameters and longer strings.
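
As a rough illustration of how str.format() grows with the number of parameters, the sketch below uses named fields and dictionary unpacking. The names first_name, age, and person are assumptions added for this example.

```python
first_name = 'John'
profession = 'Janitor'
age = 42  # made-up value for illustration

# Named fields read better than positional indexes, but every value
# still has to be repeated in the call to format().
by_name = 'Hello, my name is {name}. I am a {job}.'.format(
    name=first_name, job=profession)

# A dictionary can be unpacked into the named fields.
person = {'name': first_name, 'job': profession, 'age': age}
from_dict = '{name} is a {job} and is {age} years old.'.format(**person)

print(by_name)    # Hello, my name is John. I am a Janitor.
print(from_dict)  # John is a Janitor and is 42 years old.
```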

f-strings

F-Strings provide a concise, readable way to include the value of Python expressions inside strings.

An advantage of the f-string is that it supports the __format__() protocol, which allows it to be extended to additional types that want to control how they are converted to strings.
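
Here is a minimal sketch of that protocol. The Temperature class and its 'f' format spec are hypothetical and exist only for this illustration; the same __format__() hook is used by f-strings, str.format(), and the built-in format().

```python
class Temperature:
    """Hypothetical type used only for this illustration."""

    def __init__(self, celsius):
        self.celsius = celsius

    def __format__(self, spec):
        # Treat a spec of 'f' as "show Fahrenheit"; anything else shows Celsius.
        if spec == 'f':
            return f'{self.celsius * 9 / 5 + 32:.1f} F'
        return f'{self.celsius:.1f} C'


temp = Temperature(21.5)
print(f'{temp}')            # 21.5 C
print(f'{temp:f}')          # 70.7 F
print('{:f}'.format(temp))  # same hook, reached through str.format()
print(format(temp, 'f'))    # same hook, reached through format()
```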

You can use 'f' or 'F' as the prefix, although 'f' is preferred.

>>> foo = f'this is a string'
>>> print(foo)
this is a string
>>> foo = F'this is a string'
>>> print(foo)
this is a string

You can also use it in combination with 'r' or 'R' to create a raw f-string.

>>> foo = rf'this is a raw string \to enter\ text in'
>>> print(foo)
this is a raw string \to enter\ text in

f-strings work much like str.format(), but without the extra method call.

>>> foo = 30
>>> print(f'Congratulations you are {foo} years old!')
Congratulations you are 30 years old!

Get familiar with the Python documentation on formatted string literals

You can call functions and use conditional expressions inside the braces.

import math

radius = 10


def area_of_circle(radius):
    # Area is pi * r^2 (2 * pi * r would be the circumference)
    return math.pi * radius ** 2


print(f'Area of a circle with a radius of {radius} is {area_of_circle(radius)}')

gender = "Female"

print(f'Congrats on your new baby {"boy" if gender == "Male" else "girl"}!')

String Methods


Read up on the Python documentation and get familiar with the options.

As in other programming languages, Python strings come with built-in methods. Here are some of the most common string methods.

$ python
Python 3.7.2 (v3.7.2:9a3ffc0492, Dec 24 2018, 02:44:43) 
[Clang 6.0 (clang-600.0.57)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> foo = 'It was the best of times.  It was the worst of times.  '
>>> # s.lower(), s.upper() -- returns the lowercase or uppercase version of the string
... foo.lower()
'it was the best of times.  it was the worst of times.  '
>>> foo.upper()
'IT WAS THE BEST OF TIMES.  IT WAS THE WORST OF TIMES.  '
>>> # s.strip() -- returns a string with the whitespace removed from the start and end
... foo.strip()
'It was the best of times.  It was the worst of times.'
>>> # s.startswith('word'), s.endswith('word') -- tests if the string starts or ends with the given word string
... foo.startswith('It')
True
>>> foo.endswith('times.')
False
>>> # s.find('word') -- searches for the given other string - not a regular expression - returns the first index where it begins or -1 if not found
... foo.find('best')
11
>>> foo.find('word')
-1
>>> # s.replace('old','new') -- returns a string where all occurrences of 'old' have been replaced by 'new'
... foo.replace('times', 'ages')
'It was the best of ages.  It was the worst of ages.  '
>>> # s.split('delim') -- returns a list of substrings separated by the given delimiter.
... # not a regular expression - .split() with no argument will, by default, 
... # return a list split on whitespace characters.
... bar = foo.split()
>>> bar
['It', 'was', 'the', 'best', 'of', 'times.', 'It', 'was', 'the', 'worst', 'of', 'times.']
>>> # s.join(list) -- opposite of split(), joins the elements of the provided list together 
... # using the string as the delimiter
... foo = '-'.join(bar)
>>> foo
'It-was-the-best-of-times.-It-was-the-worst-of-times.'

Prefer .join() to a += b or a = a + b

Get up to date on some code performance improvements

The code in the Python wiki performance tips is a little outdated but still a good reference point worth mentioning.

Avoid this:

 story = ''

 for word in word_list:
     story += word

Remember that strings are immutable, so each concatenation creates a new string and discards the previous one. Use ' '.join(word_list) instead.

 word_list = ['It', 'was', 'the', 'best', 'of', 'times.', 'It', 'was', 'the', 'worst', 'of', 'times.']
 story = ' '.join(word_list)
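
To check the difference on your own machine, a rough timing sketch with the standard timeit module might look like the following. The sample data and the function names are made up for this example, and the results depend on the interpreter: some implementations optimize += in simple loops, so the gap may be smaller than expected.

```python
import timeit

words = ['word'] * 10_000  # made-up sample data


def concat_in_loop(items):
    result = ''
    for item in items:
        result += item  # may build a new string on each pass
    return result


def concat_with_join(items):
    return ''.join(items)


# Both approaches produce the same string.
assert concat_in_loop(words) == concat_with_join(words)

# Timings vary by machine and interpreter, so no expected numbers are shown.
loop_time = timeit.timeit(lambda: concat_in_loop(words), number=50)
join_time = timeit.timeit(lambda: concat_with_join(words), number=50)
print(f'+= in a loop: {loop_time:.4f}s')
print(f"''.join():    {join_time:.4f}s")
```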

Avoid this:

 out = '<html>' + head + prologue + query + tail + '</html>'

Instead use:

head = r'<h1>Heading</h1>'
prologue = r'<p>What is the measure of a man?</p>'
query = r'<p>this is the query of the body</p>'
tail = r'<p>The End.</p>'

out = fr'<html>{head}{prologue}{query}{tail}</html>'

# out result
'<html><h1>Heading</h1><p>What is the measure of a man?</p><p>this is the query of the body</p><p>The End.</p></html>'

Slicing and dicing a string

A great place to end this tutorial on strings and move towards collections is to discuss the Python slice syntax. Use it to refer to sub-parts of strings and lists. The slice s[start:end] covers the elements beginning at start and extending up to, but not including, end. Consider foo = 'Hello World':

>>> foo = 'Hello World'
>>> foo[1:4]          # chars starting at index 1 and extending up to but not including index 4
'ell'
>>> foo[1:]           # omitting either index defaults to the start or end of the string
'ello World'
>>> foo[:7]           
'Hello W'
>>> foo[:]            # omitting both gives us the entire string or list
'Hello World'
>>> foo[-5:-1]         # using the negative index (-), you work backward from the end of the string
'Worl'
>>> foo[-5:]
'World'
>>> foo[:-6]
'Hello'

We can also use slicing to step through the items by their index using the theList[start:end:step] format. The third element is the step count.

>>> numList = [n for n in range(11)]    # create the list using a comprehension
>>> print(numList)
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>>> print(numList[1::2])        # print all the odd index items
[1, 3, 5, 7, 9]
>>> print(numList[::2])         # print all the even index items
[0, 2, 4, 6, 8, 10]
>>> print(numList[1:6:2])       # print all of the odd index items in a specified range
[1, 3, 5]
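
The same step syntax applies to strings. As a small sketch reusing the foo string from earlier (the other variable names are made up for illustration), a negative step walks backward through the sequence:

```python
foo = 'Hello World'

every_other = foo[::2]    # characters at even indexes
reversed_foo = foo[::-1]  # a negative step walks backward

print(every_other)   # HloWrd
print(reversed_foo)  # dlroW olleH
```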

This will be a good place to transition into collections.

 
