| Strip / clean | .strip(), .lower(), .replace() |
| Split / join | .split(sep), sep.join(seq) |
| Membership | "sub" in s, .find(), .count() |
| Regex search | re.search(pattern, string) |
| Find all | re.findall(pattern, string) |
| Substitute | re.sub(pattern, repl, string) |
| Compile | re.compile(pattern) for reuse |
| Word freq | Counter(words).most_common(n) |
| Markov | {(w1,w2): [w3, ...]} bigram model |
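
The last few rows of the table can be combined into one small pipeline. The following is a minimal sketch, not a reference implementation: the sample text, the `WORD` pattern, and the helper names `build_model` and `generate` are illustrative assumptions rather than anything defined above.

```python
import random
import re
from collections import Counter, defaultdict

# Illustrative sample text (an assumption, not from the original notes).
text = "  The cat sat on the mat. The cat ate the fish on the mat.  "

# Compile: build the pattern once so it can be reused.
WORD = re.compile(r"[a-z']+")

# Strip / clean, then Find all: normalize the text and pull out the words.
words = WORD.findall(text.strip().lower())

# Word freq: the n most common words with their counts.
top = Counter(words).most_common(2)


def build_model(tokens):
    """Markov: map each pair (w1, w2) to the list of words seen right after it."""
    model = defaultdict(list)
    for w1, w2, w3 in zip(tokens, tokens[1:], tokens[2:]):
        model[(w1, w2)].append(w3)
    return model


def generate(model, seed, length=8, rng=random):
    """Extend a two-word seed by sampling followers until length or a dead end."""
    out = list(seed)
    while len(out) < length:
        followers = model.get(tuple(out[-2:]))
        if not followers:
            break  # this pair never appeared with a following word
        out.append(rng.choice(followers))
    return " ".join(out)


model = build_model(words)
print(top[0])
print(generate(model, ("the", "cat")))
```

Storing followers in a list, duplicates included, means `rng.choice` samples each next word in proportion to how often it followed that pair in the text, so no separate probability table is needed.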