Regex for Beginners: A Practical Guide with Real Examples
Regular expressions — "regex" for short — are one of the most powerful tools in computing, and also one of the most feared. A string like ^[\w.-]+@[\w-]+\.[\w.]+$ looks like someone fell asleep on their keyboard. But once you learn the building blocks, regex becomes a superpower for searching, validating, and transforming text.
What Is a Regular Expression?
A regex is a pattern that describes a set of strings. Think of it as a search query on steroids. Where a normal "Find" looks for exact text, regex lets you say things like "find any word that starts with a capital letter and ends with a number" or "match anything that looks like a phone number."
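Most modern programming languages support regex, either built in or through a standard library, and so do most text editors (VS Code, Sublime Text, Notepad++). The core syntax carries over almost everywhere, though each engine has a few small differences of its own.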
The Building Blocks
Literal characters match themselves. hello matches the word "hello" exactly. Simple enough.
Character classes match one character from a set. [aeiou] matches any lowercase vowel. [0-9] matches any digit. The shorthand \d usually does the same thing, although some Unicode-aware engines also let \d match digits from other scripts.
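Here is a minimal sketch of character classes in action, using Python's built-in re module. The sample sentence is made up for illustration, and the re.ASCII flag is only there to keep \d limited to 0-9 so it lines up with [0-9].

```python
import re

text = "Order 66 shipped to Ohio"  # made-up sample text

# [aeiou] matches exactly one lowercase vowel per match.
vowels = re.findall(r"[aeiou]", text)

# [0-9] matches one ASCII digit per match.
digits = re.findall(r"[0-9]", text)

# \d is the shorthand; re.ASCII restricts it to 0-9 so the two agree.
digits_shorthand = re.findall(r"\d", text, flags=re.ASCII)

print(vowels)            # ['e', 'i', 'e', 'o', 'i', 'o']
print(digits)            # ['6', '6']
print(digits_shorthand)  # ['6', '6']
```

Notice that the capital "O" in "Order" and "Ohio" is not matched: character classes are case-sensitive unless you list both cases or turn on a case-insensitive flag.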
Quantifiers control how many times something repeats. + means "one or more," * means "zero or more," and {3} means "exactly three." So \d{3}-\d{4} matches a pattern like "555-1234."
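To see quantifiers at work, here is a small Python sketch. The test strings are invented for the example; the pattern \d{3}-\d{4} is the one from the paragraph above.

```python
import re

# \d{3}-\d{4}: exactly three digits, a hyphen, exactly four digits.
phone = re.compile(r"\d{3}-\d{4}")

# "555-987" is skipped because only three digits follow the hyphen.
found = phone.findall("Call 555-1234 or 555-987.")
print(found)  # ['555-1234']

# + needs at least one character; * is happy with none at all.
plus_match = re.fullmatch(r"a+", "")
star_match = re.fullmatch(r"a*", "")
print(plus_match is None)      # True
print(star_match is not None)  # True
```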
Anchors don't match characters; they match positions. By default, ^ means "start of the string" and $ means "end of the string." In multiline mode they also match at the start and end of each line. Using both ensures your pattern matches the entire string, not just a substring.
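The difference anchors make is easiest to see side by side. This is a rough Python sketch with made-up input strings; re.MULTILINE is Python's flag for letting ^ and $ match at line boundaries, and other languages spell that option differently.

```python
import re

# Without anchors, search() finds the pattern anywhere inside the string.
unanchored = re.search(r"\d{3}-\d{4}", "ext. 555-1234 (home)")
print(unanchored.group())  # 555-1234

# With ^ and $, the whole string has to fit the pattern.
anchored_ok = re.search(r"^\d{3}-\d{4}$", "555-1234")
anchored_bad = re.search(r"^\d{3}-\d{4}$", "ext. 555-1234 (home)")
print(anchored_ok is not None)  # True
print(anchored_bad is None)     # True

# re.MULTILINE makes ^ and $ match at the start and end of every line.
lines = "555-1234\nnot a number\n555-9876"
per_line = re.findall(r"^\d{3}-\d{4}$", lines, flags=re.MULTILINE)
print(per_line)  # ['555-1234', '555-9876']
```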
Five Patterns You'll Actually Use
1. Email validation (basic): ^[\w.-]+@[\w-]+\.[\w.]+$ matches most standard email addresses. It is not bulletproof (email RFCs are wild), but it handles the vast majority of real-world input.
2. Extract numbers from text: \d+\.?\d* matches integers and decimals. Great for scraping prices, measurements, or statistics out of unstructured text. Be aware that it also grabs a trailing period (as in "5." at the end of a sentence) and reads ".5" as just "5".
3. Match a URL: https?://\S+ is a quick and dirty way to pull URLs from text. It matches "http://" or "https://" followed by any run of non-whitespace characters, so trailing punctuation can end up in the match.
4. Find duplicated words: \b(\w+)\s+\1\b uses a "backreference" to find repeated words like "the the." Handy for proofreading.
5. Remove HTML tags: <[^>]+> matches each tag, so replacing every match with an empty string strips the markup. Useful for quick plain-text extraction, though messy or untrusted HTML is better handled by a real parser.
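All five patterns can be tried out in a few lines of Python. Treat this as a sketch rather than production code: the constant names and the sample strings are invented for the example, and only the patterns themselves come from the list above.

```python
import re

# The five patterns from the list, compiled once for reuse.
EMAIL = re.compile(r"^[\w.-]+@[\w-]+\.[\w.]+$")
NUMBER = re.compile(r"\d+\.?\d*")
URL = re.compile(r"https?://\S+")
DUPLICATE = re.compile(r"\b(\w+)\s+\1\b")
TAG = re.compile(r"<[^>]+>")

# 1. Email validation: the second address has no dot after the @ part.
is_email = bool(EMAIL.match("jane.doe@example.com"))
not_email = bool(EMAIL.match("jane.doe@example"))
print(is_email, not_email)  # True False

# 2. Extract numbers (matches come back as strings).
numbers = NUMBER.findall("Was 19.99 now 15 for 2 days")
print(numbers)  # ['19.99', '15', '2']

# 3. Pull URLs out of running text.
urls = URL.findall("Docs at https://example.com/docs and http://example.org")
print(urls)  # ['https://example.com/docs', 'http://example.org']

# 4. Find duplicated words; findall returns the captured word.
dupes = DUPLICATE.findall("Paris in the the spring")
print(dupes)  # ['the']

# 5. Strip tags by replacing every match with an empty string.
plain = TAG.sub("", "<p>Hello <b>world</b></p>")
print(plain)  # Hello world
```

Compiling the patterns up front is optional, but it keeps each one in a single named place, which helps once the same pattern is used in more than one spot.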
Common Mistakes to Avoid
Greedy matching is the most common trap. The pattern <.+> looks like it should match a single HTML tag, but + is greedy — it will match from the first < to the last > on the line. Add ? to make it lazy: <.+?>.
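A quick Python comparison shows the gap between the two. The HTML snippet is a made-up example; the patterns are the ones from the paragraph above.

```python
import re

html = "<b>bold</b> and <i>italic</i>"  # made-up sample markup

# Greedy: .+ runs to the end of the line, then backs up to the LAST ">".
greedy = re.findall(r"<.+>", html)
print(greedy)  # ['<b>bold</b> and <i>italic</i>']

# Lazy: .+? stops at the FIRST ">" it can, so each tag is its own match.
lazy = re.findall(r"<.+?>", html)
print(lazy)  # ['<b>', '</b>', '<i>', '</i>']
```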
Forgetting to escape special characters is another pitfall. Characters like ., +, and ( have special meaning in regex. To match a literal period, put a backslash in front of it: \.
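The sketch below shows what goes wrong when a period is left unescaped, and one way to avoid escaping by hand. The file names and the sample sentence are made up; re.escape is a helper in Python's standard library, and other languages may or may not offer an equivalent.

```python
import re

# Unescaped: "." matches ANY character, so "reportXtxt" slips through.
loose = re.fullmatch(r"report.txt", "reportXtxt")
print(loose is not None)  # True

# Escaped: "\." matches only a literal period.
strict_bad = re.fullmatch(r"report\.txt", "reportXtxt")
strict_ok = re.fullmatch(r"report\.txt", "report.txt")
print(strict_bad is None)     # True
print(strict_ok is not None)  # True

# re.escape() adds the backslashes for you when the text to find
# contains special characters such as + and ( ).
literal = "1+1=2 (really)"
sentence = "Yes, 1+1=2 (really)."
raw_search = re.search(literal, sentence)              # + and ( ) act as syntax
safe_search = re.search(re.escape(literal), sentence)  # treated as plain text
print(raw_search is None)   # True
print(safe_search.group())  # 1+1=2 (really)
```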
Practice Makes Perfect
The fastest way to learn regex is to experiment with a live tester. Our Regex Tester highlights matches in real time as you type your pattern and test string. You can see exactly what's matching (and what isn't) without running any code — and since it runs entirely in your browser, your test data never leaves your machine.