9.1.2. String Methods#

import sys
from pathlib import Path

# Find project root by looking for _config.yml
current = Path.cwd()
for parent in [current, *current.parents]:
    if (parent / '_config.yml').exists():
        project_root = parent
        break
else:
    project_root = Path.cwd().parent.parent

# Add project root to path
sys.path.insert(0, str(project_root))

# Import shared teaching helpers and cell magics
from shared import thinkpython, diagram, jupyturtle, structshape
from shared.download import download

Python provides string methods that perform a variety of useful operations. A method is similar to a function: it usually takes arguments and returns a value. The syntax is different, though. A method belongs to an object, so it is written after the object with a dot (dot notation). For example, the method upper(), which returns a new all-uppercase string, is called as 'banana'.upper() and returns 'BANANA'; a function call would instead look like upper('banana').

word = 'banana'
new_word = word.upper()
new_word
'BANANA'

This use of the dot operator specifies the name of the method, upper, and the name of the string to apply the method to, word. The empty parentheses indicate that this method takes no arguments.

A method call is called an invocation; in this case, we would say that we are invoking upper on word.
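Because strings are immutable, a method such as upper() returns a new string and leaves the original untouched. The short sketch below illustrates this and also shows the same method looked up on the str type, with the string passed explicitly. The variable name shouted is only illustrative.

```python
word = 'banana'
shouted = word.upper()        # upper() builds and returns a new string

print(word)                   # banana  (the original is unchanged)
print(shouted)                # BANANA

# The same method looked up on the type, with the string passed explicitly
print(str.upper(word))        # BANANA
```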

methods = [m for m in dir(str) if not m.startswith('_')]
num_str_methods = len(methods)
print(num_str_methods)  # 47
print(methods)
47
['capitalize', 'casefold', 'center', 'count', 'encode', 'endswith', 'expandtabs', 'find', 'format', 'format_map', 'index', 'isalnum', 'isalpha', 'isascii', 'isdecimal', 'isdigit', 'isidentifier', 'islower', 'isnumeric', 'isprintable', 'isspace', 'istitle', 'isupper', 'join', 'ljust', 'lower', 'lstrip', 'maketrans', 'partition', 'removeprefix', 'removesuffix', 'replace', 'rfind', 'rindex', 'rjust', 'rpartition', 'rsplit', 'rstrip', 'split', 'splitlines', 'startswith', 'strip', 'swapcase', 'title', 'translate', 'upper', 'zfill']
from myst_nb import glue
glue("num_str_methods", num_str_methods)
47

In the Python version used to build this page, str has 47 public methods. The table below lists some of the most commonly used ones.

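| Category | Method | Description |
|---|---|---|
| Case | .upper() | Return an all-uppercase copy |
| Case | .lower() | Return an all-lowercase copy |
| Search | .find(x) | Index of first match, -1 if missing |
| Search | .index(x) | Index of first match, raises ValueError if missing |
| Search | .count(x) | Count non-overlapping occurrences |
| Whitespace | .strip() | Remove leading/trailing whitespace |
| Split | .split(x) | Split on delimiter |
| Join | .join(lst) | Join list into string |
| Replace | .replace(a, b) | Replace all occurrences of a with b |
| Check | .isspace() | True if all characters are whitespace |
| Check | .isupper() | True if all cased characters are uppercase |
| Check | .islower() | True if all cased characters are lowercase |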

s = "  Hello, World!  "
words = "the quick brown fox"

# Case
print("--- Case ---")
print(s.upper())
print(s.lower())

# Search
print("\n--- Search ---")
print(words.find("quick"))
print(words.index("fox"))
print(words.count("o"))

# Whitespace
print("\n--- Whitespace ---")
print(repr(s.strip()))

# Split
print("\n--- Split ---")
print(words.split(" "))

# Join
print("\n--- Join ---")
print(", ".join(["apple", "banana", "cherry"]))

# Replace
print("\n--- Replace ---")
print(words.replace("fox", "cat"))

# Check
print("\n--- Check ---")
print("   ".isspace())
print("HELLO".isupper())
print("hello".islower())
--- Case ---
  HELLO, WORLD!  
  hello, world!  

--- Search ---
4
16
2

--- Whitespace ---
'Hello, World!'

--- Split ---
['the', 'quick', 'brown', 'fox']

--- Join ---
apple, banana, cherry

--- Replace ---
the quick brown cat

--- Check ---
True
True
True

9.1.2.1. Case Methods#

Python provides several methods for changing the case of a string. These are useful for normalizing text before comparison or display.

s = 'hello, world!'

print(s.upper())        # 'HELLO, WORLD!'  — all uppercase
print(s.lower())        # 'hello, world!'  — all lowercase
print(s.title())        # 'Hello, World!'  — first letter of each word capitalized
print(s.capitalize())   # 'Hello, world!'  — first letter of string capitalized
print(s.swapcase())     # 'HELLO, WORLD!'  — swap upper and lower
HELLO, WORLD!
hello, world!
Hello, World!
Hello, world!
HELLO, WORLD!

Case methods are often used to make comparisons case-insensitive. For example, at login you might convert both the stored username and the user's input to lowercase before comparing them.

user_input = 'Alice'
username    = 'alice'

print(user_input == username)                    # False
print(user_input.lower() == username.lower())    # True
False
True
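For text that may contain non-English letters, casefold() is a more aggressive alternative to lower() for case-insensitive matching. The sketch below uses the German word 'Straße' as an illustrative example; the variable names a and b are hypothetical.

```python
a = 'Straße'
b = 'STRASSE'

print(a.lower() == b.lower())          # False: lower() keeps 'ß'
print(a.casefold() == b.casefold())    # True: casefold() maps 'ß' to 'ss'
```

For plain ASCII text such as usernames, lower() and casefold() give the same result.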
### EXERCISE: Case Methods
# Difficulty: Basic
user_input = 'PyThOn'
target = 'python'
# 1. Print user_input in upper, lower, and title case
# 2. Print whether user_input matches target case-insensitively
### Your code starts here:


### Your code ends here.

# Solution
user_input = 'PyThOn'
target = 'python'
print(user_input.upper())
print(user_input.lower())
print(user_input.title())
print(user_input.casefold() == target.casefold())
PYTHON
python
Python
True

9.1.2.2. Searching and Testing#

9.1.2.2.1. Finding a Substring#

find(sub) returns the index of the first occurrence of sub, or -1 if not found. index(sub) works the same way but raises a ValueError if the substring is not found.

s = 'data science and data engineering'

print(s.find('data'))     # 0  — first occurrence
print(s.find('data', 5))  # 17 — search starting at index 5
print(s.find('math'))     # -1 — not found

print(s.rfind('data'))    # 17 — last occurrence
0
17
-1
17
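The difference between find() and index() only matters when the substring is missing. The following sketch shows both failure modes side by side, reusing the string from the example above.

```python
s = 'data science and data engineering'

# find() signals a miss with -1, so test the result before using it
pos = s.find('math')
if pos == -1:
    print('find: not found')

# -1 is also a valid index, so a forgotten check fails silently
print(s[pos])             # 'g', the last character of s

# index() signals a miss by raising ValueError
try:
    s.index('math')
except ValueError as err:
    print(f'index: {err}')
```

As a rule of thumb, find() suits cases where a miss is expected and handled, while index() suits cases where a miss is a bug that should not pass unnoticed.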

9.1.2.2.2. Counting Occurrences#

count(sub) returns the number of non-overlapping occurrences of a substring.

s = 'banana'
print(s.count('a'))    # 3
print(s.count('an'))   # 2
3
2
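"Non-overlapping" means that once a match is found, counting resumes after the end of that match. The sketch below contrasts count() with a small helper that allows overlaps; count_overlapping is a hypothetical function written for this illustration, not part of Python.

```python
def count_overlapping(text, sub):
    """Count occurrences of sub in text, allowing overlaps (hypothetical helper)."""
    if not sub:
        raise ValueError('sub must be non-empty')
    total = 0
    start = text.find(sub)
    while start != -1:
        total += 1
        # Advance by one character instead of len(sub) so overlaps are seen
        start = text.find(sub, start + 1)
    return total

s = 'aaaa'
print(s.count('aa'))               # 2 (matches at indices 0 and 2)
print(count_overlapping(s, 'aa'))  # 3 (matches at indices 0, 1, and 2)
```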

9.1.2.2.3. Starts and Ends With#

startswith(prefix) and endswith(suffix) test whether a string begins or ends with a given substring. Both return True or False.

filename = 'report_2025.csv'

print(filename.startswith('report'))   # True
print(filename.endswith('.csv'))       # True
print(filename.endswith('.xlsx'))      # False
True
True
False
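Both methods also accept a tuple of candidates and return True if any one of them matches. A minimal sketch, using a made-up list of filenames:

```python
filenames = ['report_2025.csv', 'notes.txt', 'sales.tsv', 'report_draft.xlsx']

# A tuple of suffixes matches if any one of them matches
tabular = [f for f in filenames if f.endswith(('.csv', '.tsv'))]
print(tabular)    # ['report_2025.csv', 'sales.tsv']

reports = [f for f in filenames if f.startswith('report')]
print(reports)    # ['report_2025.csv', 'report_draft.xlsx']
```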

9.1.2.2.4. The in Operator#

The in operator tests whether a substring appears anywhere in a string. It is the most readable way to check for membership.

s = 'machine learning'

print('learning' in s)    # True
print('deep' in s)        # False
print('deep' not in s)    # True
True
False
True
### EXERCISE: Searching and Testing
# Difficulty: Intermediate
sentence = 'data science uses data pipelines'
# 1. Find the index of the first "data"
# 2. Find the index of "data" starting from position 5
# 3. Count how many times "data" appears
# 4. Check if "science" is in sentence
### Your code starts here:


### Your code ends here.

# Solution
sentence = 'data science uses data pipelines'
print(sentence.find('data'))
print(sentence.find('data', 5))
print(sentence.count('data'))
print('science' in sentence)
0
18
2
True

9.1.2.3. Cleaning#

Real-world text data often contains extra whitespace or unwanted characters. Python provides several methods for cleaning strings.

9.1.2.3.1. Stripping Whitespace#

  • strip() removes leading and trailing whitespace.

  • lstrip() (left strip) removes only leading whitespace.

  • rstrip() (right strip) removes only trailing whitespace.

s = '   hello, world!   '

print(repr(s.strip()))    # 'hello, world!'
print(repr(s.lstrip()))   # 'hello, world!   '
print(repr(s.rstrip()))   # '   hello, world!'
'hello, world!'
'hello, world!   '
'   hello, world!'

You can also pass a string of characters to strip. The argument is treated as a set of characters rather than a prefix or suffix: s.strip('.') removes all leading and trailing periods.

s = '...hello...'
print(s.strip('.'))    # 'hello'
hello
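The argument to strip(), lstrip(), and rstrip() is a set of characters, which can give surprising results when it is mistaken for a suffix. The sketch below contrasts rstrip() with removesuffix(), which is available from Python 3.9 onward; the filename is a made-up example.

```python
print('xxyhelloyx'.strip('xy'))       # 'hello': any mix of 'x' and 'y' is removed

filename = 'access.csv'
print(filename.rstrip('.csv'))        # 'acce': trailing '.', 'c', 's', 'v' characters all go
print(filename.removesuffix('.csv'))  # 'access': only the exact suffix is removed
```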

9.1.2.3.2. Replacing Substrings#

replace(old, new) returns a new string with all occurrences of old replaced by new. An optional third argument limits the number of replacements.

s = 'I like cats. Cats are great.'

print(s.replace('cats', 'dogs'))        # replace all matches (case-sensitive: 'Cats' is untouched)
print(s.replace('cats', 'dogs', 1))     # replace at most one match

# Useful for removing characters
s2 = 'hello, world!'
print(s2.replace(',', '').replace('!', ''))   # 'hello world'
I like dogs. Cats are great.
I like dogs. Cats are great.
hello world
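When many single characters need to be removed or swapped, chaining replace() calls becomes unwieldy. One alternative, sketched here, is translate() with a table built by str.maketrans(). The names remove_punct and swap are illustrative.

```python
s2 = 'hello, world!'

# maketrans('', '', chars) builds a table that deletes every character in chars
remove_punct = str.maketrans('', '', ',!')
print(s2.translate(remove_punct))     # 'hello world'

# The first two arguments map characters one-to-one ('l' -> '0', 'o' -> '1')
swap = str.maketrans('lo', '01')
print(s2.translate(swap))             # 'he001, w1r0d!'
```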
### EXERCISE: Cleaning Strings
# Difficulty: Intermediate
raw = '...  Hello, Python!  ...'
# 1. Strip leading/trailing dots
# 2. Strip leading/trailing whitespace from the result
# 3. Replace "Python" with "Data Science"
# 4. Print the cleaned string
### Your code starts here:


### Your code ends here.

# Solution
raw = '...  Hello, Python!  ...'
clean = raw.strip('.').strip().replace('Python', 'Data Science')
print(clean)
Hello, Data Science!

9.1.2.4. Splitting and Joining#

9.1.2.4.1. Splitting#

split(sep) breaks a string into a list of substrings at each occurrence of the separator sep. If no separator is given, it splits on runs of whitespace and ignores leading and trailing whitespace, so it never produces empty strings.

s = 'Python,R,SQL,Julia'
print(s.split(','))           # ['Python', 'R', 'SQL', 'Julia']

s2 = 'one two   three'
print(s2.split())             # ['one', 'two', 'three']

# Split on a specific delimiter, keeping empty strings
s3 = 'a,,b,,c'
print(s3.split(','))          # ['a', '', 'b', '', 'c']

# Limit the number of splits
s4 = '2025-08-26'
print(s4.split('-', 1))       # ['2025', '08-26']
['Python', 'R', 'SQL', 'Julia']
['one', 'two', 'three']
['a', '', 'b', '', 'c']
['2025', '08-26']
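Note that split() and split(' ') are not equivalent: an explicit separator is matched literally, so repeated or outer spaces produce empty strings. The sketch below shows the difference, along with partition(), which splits once on the first occurrence of the separator. The sample strings are made up.

```python
s = '  a  b '

print(s.split())       # ['a', 'b']
print(s.split(' '))    # ['', '', 'a', '', 'b', '']

# partition() splits once and always returns three parts
key, sep, value = 'name=Alice=Smith'.partition('=')
print(key, value)      # name Alice=Smith
```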

9.1.2.4.2. Joining#

join(iterable) is the inverse of split(). Called on the separator string, it concatenates an iterable of strings (such as a list) into one string, inserting the separator between adjacent elements.

words = ['Python', 'is', 'fun']

print(' '.join(words))     # 'Python is fun'
print('-'.join(words))     # 'Python-is-fun'
print(''.join(words))      # 'Pythonisfun'

# Practical: reassemble a cleaned sentence
sentence = '  too   many   spaces  '
cleaned  = ' '.join(sentence.split())
print(cleaned)             # 'too many spaces'
Python is fun
Python-is-fun
Pythonisfun
too many spaces
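join() requires every element to be a string already. A short sketch of what happens with a list of numbers, and one common way around it; the scores list is hypothetical.

```python
scores = [90, 85, 77]

try:
    ', '.join(scores)             # join() does not convert items for you
except TypeError as err:
    print(f'TypeError: {err}')

# Convert each item to str first
print(', '.join(str(x) for x in scores))   # '90, 85, 77'
```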
### EXERCISE: Splitting and Joining
# Difficulty: Intermediate
record = 'alice,bob,charlie'
# 1. Split the record into a list of names
# 2. Join names with " - "
# 3. Print both the list and joined string
### Your code starts here:


### Your code ends here.

# Solution
record = 'alice,bob,charlie'
names = record.split(',')
joined = ' - '.join(names)
print(names)
print(joined)
['alice', 'bob', 'charlie']
alice - bob - charlie

9.1.2.5. String Formatting#

String formatting inserts values into a string template. Python offers three approaches: f-strings (modern, recommended), str.format(), and % formatting (legacy).

9.1.2.5.1. f-Strings#

An f-string is prefixed with f and uses {} to embed expressions directly inside the string. F-strings are the most readable and most commonly used approach.

name  = 'Alice'
score = 95.678

print(f'Student: {name}')
print(f'Score: {score:.2f}')        # 2 decimal places
print(f'Score: {score:>10.2f}')     # right-aligned, width 10
print(f'{name.upper()}')            # method calls work inside {}
print(f'Double score: {score * 2}') # expressions work inside {}
Student: Alice
Score: 95.68
Score:      95.68
ALICE
Double score: 191.356
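F-strings also support two small conveniences that are handy while debugging: the !r conversion, which inserts the repr() of a value, and the = specifier (Python 3.8 and later), which prints the expression together with its value. A brief sketch using the same variables:

```python
name  = 'Alice'
score = 95.678

print(f'{name!r}')        # 'Alice'  (repr() of the value, quotes included)
print(f'{score=}')        # score=95.678
print(f'{score=:.1f}')    # score=95.7
```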

9.1.2.5.2. Format Specification Mini-Language#

Inside {}, a colon : introduces a format spec that controls how the value is displayed.

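| Spec | Meaning | Example |
|---|---|---|
| .2f | fixed-point, 2 decimal places | f'{3.14159:.2f}' gives '3.14' |
| d | integer | f'{42:d}' gives '42' |
| .2e | scientific notation, 2 decimal places | f'{3.14159:.2e}' gives '3.14e+00' |
| .2% | percentage, 2 decimal places | f'{0.75:.2%}' gives '75.00%' |
| >10 | right-align, width 10 | f'{3.14:>10}' gives '      3.14' |
| <10 | left-align, width 10 | f'{3.14:<10}' gives '3.14      ' |
| ^10 | center, width 10 | f'{3.14:^10}' gives '   3.14   ' |
| , | thousands separator | f'{1000000:,}' gives '1,000,000' |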

pi = 3.14159265
n  = 1000000
r  = 0.756

print(f'{pi:.4f}')      # '3.1416'
print(f'{pi:e}')        # '3.141593e+00'
print(f'{n:,}')         # '1,000,000'
print(f'{r:.1%}')       # '75.6%'
print(f'{pi:^10.2f}')   # '   3.14   '
3.1416
3.141593e+00
1,000,000
75.6%
   3.14   
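Alignment, width, and precision can be combined in one spec, and a width can itself be supplied by a nested {} field. The sketch below lines up a small two-column report; the rows data and the width value are made up for illustration.

```python
rows = [('apple', 1.5), ('watermelon', 12.25)]
width = 12   # hypothetical width of the first column

# {width} is evaluated first, then used as the field width
lines = [f'{item:<{width}}{price:>8.2f}' for item, price in rows]

for line in lines:
    print(line)
```

Each line is 20 characters wide: the item name is left-aligned in 12 columns and the price is right-aligned in 8 columns with two decimal places.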

9.1.2.5.3. str.format()#

str.format() is an older but still widely used formatting approach. Values are passed as arguments and inserted into {} placeholders.

name  = 'Bob'
grade = 88.5

print('Name: {}, Grade: {:.1f}'.format(name, grade))
print('Name: {0}, Grade: {1:.1f}'.format(name, grade))   # positional
print('Name: {n}, Grade: {g:.1f}'.format(n=name, g=grade))  # keyword
Name: Bob, Grade: 88.5
Name: Bob, Grade: 88.5
Name: Bob, Grade: 88.5
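One situation where str.format() remains convenient is when the template is defined separately from the values, since an f-string is evaluated immediately where it is written. A minimal sketch, with a hypothetical template and made-up student records:

```python
template = '{name:<8}{grade:>6.1f}'   # hypothetical report-line template

students = [
    {'name': 'Bob', 'grade': 88.5},
    {'name': 'Carol', 'grade': 91.0},
]

for student in students:
    # ** unpacks the dict into keyword arguments for format()
    print(template.format(**student))
```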
### EXERCISE: String Formatting
# Difficulty: Intermediate
name = 'Alice'
score = 92.456
# 1. Print name and score with score rounded to 1 decimal place using f-string
# 2. Print score as a percentage with 1 decimal place (assume score/100)
# 3. Print name right-aligned in width 10
### Your code starts here:


### Your code ends here.

# Solution
name = 'Alice'
score = 92.456
print(f'Name: {name}, Score: {score:.1f}')
print(f'Percent: {score/100:.1%}')
print(f'{name:>10}')
Name: Alice, Score: 92.5
Percent: 92.5%
     Alice

9.1.2.6. Type-Checking Methods#

Python strings have a family of is*() methods that test the character composition of a string. Each returns True or False.

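| Method | Returns True if… |
|---|---|
| isdigit() | the string is non-empty and all characters are digits (0 to 9, plus other Unicode digits such as '²') |
| isalpha() | the string is non-empty and all characters are letters |
| isalnum() | the string is non-empty and all characters are letters or digits |
| isspace() | the string is non-empty and all characters are whitespace |
| isupper() | all cased characters are uppercase and there is at least one cased character |
| islower() | all cased characters are lowercase and there is at least one cased character |
| istitle() | the string is in title case |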

print('12345'.isdigit())     # True
print('abc'.isalpha())       # True
print('abc123'.isalnum())    # True
print('   '.isspace())       # True
print('HELLO'.isupper())     # True
print('hello'.islower())     # True
print('Hello World'.istitle()) # True

# Mixed cases return False
print('abc123!'.isalnum())   # False — '!' is not alphanumeric
print(''.isdigit())          # False — empty string
True
True
True
True
True
True
True
False
False

These methods are useful for input validation:

user_input = '2025'

if user_input.isdigit():
    year = int(user_input)
    print(f'Valid year: {year}')
else:
    print('Please enter a number.')
Valid year: 2025
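isdigit() is a character test rather than a number parser, so it can disagree with int() in both directions. When the goal is to convert the input, one common alternative is to attempt the conversion and catch the error. The sketch below assumes that approach; parse_int is a hypothetical helper, not a built-in.

```python
def parse_int(text):
    """Return int(text), or None if text is not a valid integer (hypothetical helper)."""
    try:
        return int(text)
    except ValueError:
        return None

print('-5'.isdigit())     # False, although int('-5') works
print('²'.isdigit())      # True, although int('²') raises ValueError

print(parse_int('2025'))  # 2025
print(parse_int('-5'))    # -5
print(parse_int('²'))     # None
print(parse_int('12a'))   # None
```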
### EXERCISE: Type-Checking Methods
# Difficulty: Intermediate
samples = ['123', 'abc', 'abc123', '   ', 'Hello World']
# 1. For each sample, print isdigit, isalpha, and isalnum results
# 2. For "Hello World", print istitle result
### Your code starts here:


### Your code ends here.

# Solution
samples = ['123', 'abc', 'abc123', '   ', 'Hello World']
for s in samples:
    print(s, s.isdigit(), s.isalpha(), s.isalnum())
print('Hello World'.istitle())
123 True False True
abc False True True
abc123 False False True
    False False False
Hello World False False False
True

9.1.2.7. Methods Reference#

Python provides a number of functions, operators, and methods for working with strings. The most commonly used operations are:

| Operation | Syntax | Description |
|---|---|---|
| Length | len(s) | Number of characters |
| Indexing | s[i] | Character at position i |
| Slicing | s[start:stop:step] | Extract substring |
| Concatenation | s1 + s2 | Join two strings |
| Repetition | s * n | Repeat string n times |
| Uppercase | s.upper() | All uppercase |
| Lowercase | s.lower() | All lowercase |
| Title case | s.title() | Capitalize each word |
| Find | s.find(sub) | Index of first match, or -1 |
| Count | s.count(sub) | Number of occurrences |
| Membership | sub in s | Test if substring present |
| Strip | s.strip() | Remove leading/trailing whitespace |
| Replace | s.replace(old, new) | Substitute substring |
| Split | s.split(sep) | String → list |
| Join | sep.join(list) | List → string |
| f-string | f'{var:.2f}' | Formatted string literal |
| Type check | s.isdigit(), etc. | Test character composition |

### EXERCISE: Methods Reference Practice
# Difficulty: Intermediate
s = '  banana split  '
# Use at least 4 methods from this section to:
# 1. Remove outer spaces
# 2. Replace "split" with "bread"
# 3. Convert to uppercase
# 4. Check whether "BANANA" is in the final string
### Your code starts here:


### Your code ends here.

# Solution
s = '  banana split  '
t = s.strip().replace('split', 'bread').upper()
print(t)
print('BANANA' in t)
BANANA BREAD
True