Regular Expressions in MySQL

decodec's Avatar author of Regular Expressions in MySQL
This is an article on Regular Expressions in MySQL in MySQL.
Rated 5.00 By 3 users
A very interesting and useful capability of MySQL is to incorporate Regular Expressions (regex) in SQL queries. The regular expression support in MySQL is extensive. Let's take a look at using Regular Expressions in queries and the supported metacharacters.

Using Regular Expressions in queries



A simple example of using Regular Expressions in a SQL query would be to select all names from a table that start with 'A'.
Code: SQL
SELECT name FROM employees WHERE name REGEXP '^A'
A slight modification in the above example to look for names starting with 'A' or 'D' or 'F' will look like this.
Code: SQL
SELECT name FROM employees WHERE name REGEXP '^(A|D|F)'
If we want to select all names ending with 'P', then the SQL query goes like this
Code: SQL
SELECT name FROM employees WHERE name REGEXP 'P$'
We can use much complex patterns in our SQL queries, but first let's have a look at various MySQL Regular Expression metacharacters.

Regular Expression Metacharacters



*
Matches zero or more instances of the string preceding it
+
Matches one or more instances of the string preceding it
?
Matches zero or one instances of the string preceding it
.
Matches any single character, except a newline
[xyz]
Matches any of x, y, or z (match one of enclosed characters)
[^xyz]
Matches any character not enclosed
[A-Z]
Matches any uppercase letter
[a-z]
Matches any lowercase letter
[0-9]
Matches any digit
^
Anchors the match from the beginning
$
Anchors the match to the end
|
Separates alternatives
{n,m}
String must occur at least n times, but not more than m times
{n}
String must occur exactly n times
{n,}
String must occur at least n times
[[:<:]]
Matches beginning of words
[[:>:]]
Matches ending of words
[:class:]
match a character class i.e.,

[:alpha:] for letters
[:space:] for whitespace
[:punct:] for punctuation
[:upper:] for upper case letters
Extras

MySQL interprets a backslash (\) character as an escape character. To use a backslash in a regular expression, you must escape it with another backslash (\\).

Whether the Regular Expression match is case sensitive or otherwise is decided by the collation method of the table. If your collation method name ends with ci then the comparison/match is case-insensitive, else if it end in cs then the match is case sensitive.

Examples



Checking only for numbers
Code: SQL
SELECT age FROM employees WHERE age REGEXP '^[0-9]+$'

/* starts, ends and contains numbers */
Contains a specific word, for example the skill PHP in skill sets
Code: SQL
SELECT name FROM employees WHERE skill_sets REGEXP '[[:<:]]php[[:>:]]'
Fetching records where employees have entered their 10-digit mobile number as the contact number.
Code: SQL
SELECT name FROM employees WHERE contact_no REGEXP '^[0-9]{10}$'
0
pradeep's Avatar, Join Date: Apr 2005
Team Leader
We can also use RLIKE, it works just like REGEXP.
0
ash0279's Avatar, Join Date: Jan 2008
Newbie Member
this is great!
from my local test results, it seems the more complex the search pattern goes, the faster it performs..
for the developers standpoint, i think this would really help them a lot since regexp are widely used in open source devt tools, esp perl.
my 2 cents..
0
malu's Avatar, Join Date: Feb 2008
Light Poster
Can someone tell me why would this be true?

searching text: [[:<:]]word[[:>:]]

DB: word-word2

Thanks
0
pradeep's Avatar, Join Date: Apr 2005
Team Leader
[[:<:]]word[[:>:]] is a word boundary match!

These markers stand for word boundaries. They match the beginning and end of words, respectively. A word is a sequence of word characters that is not preceded by or followed by word characters. A word character is an alphanumeric character in the alnum class or an underscore (_).
0
mrcosgrove's Avatar, Join Date: Jun 2009
Newbie Member
I dont know if you can help me or not but what i am trying to do is search database for words containing certain letters eg. i fI entered TEST it would find words contianing those letters (or a subset of them)

so how would I write that. I tried:
regexp '[t{0,2}e{0,1},s{0,1}]' but that isnt working it says that its the wrong syntax.

So Im wondering how I would right an expression to do this?

Any help would be appreciated.
0
pradeep's Avatar, Join Date: Apr 2005
Team Leader
Try
Code:
regexp 't{0,2}e{0,1}s{0,1}'
0
gcue's Avatar
Newbie Member
i am trying to match string A to these scenarios (sql table values)

string A = "G283H88620TS"

matching sql table values
MX0G2H883H7426288620TS (contains the string but has characters before, in the middle, and at the end)
UIERG283H88620TSERU (contains the string but has characters before and at the end)
G283H88620TSKDJF (contains the string but has characters at the end
KEKEG283H88620TS (contains the string but has characters at the begining

unfortunately there is no pattern as to where these interrupting characters reside

(this only works for the last 3 scenarios but not the first)

"SELECT * FROM aInventoryInfo WHERE SerialNumber LIKE '%" & $serial & "%'"
0
gicprope's Avatar, Join Date: Oct 2009
Newbie Member
Relational Database Design is one of the most powerful ways to ensure data integrity and a great way to kick-off any project. Very often the first thing developers do when starting a new project, or stub-project, is to design the database. This way the structure of the application is already in place and we just have to fill in the pieces with some server-side code. Iíve found when adding relational constraints to your database design you add in a very powerful error reporting tool that will let you know during the development process that you have allowed something to happen that shouldnít have. In this article, I go through, step by step, showing how to set up a simple relational database and discuss the benefits that are enjoyed.

Letís take a step back and describe what a relational database looks like. In any normal database design there are fields in one table that reference another table. For example, a books table might have a field labeled author_id which is meant to come from a table named authors. Creating hard-coded relations solidifies these associations and actually returns a MySQL error if violated.

As I hinted in the opening I have found this to be invaluable during the development and testing process as MySQL will immediately let me know that I have made a glaring error that otherwise may not have been noticed until after the service has launched. At that point the data could be irreparably corrupt and forced to start from scratch.

So letís get right to it. For the purposes of this article, Iím going to pretend Iím creating a simple Books and Authors website with a simple 2-table setup. The first step is to create our tables.
Code: SQL


CREATE TABLE `library`.`books` (
`id` INT UNSIGNED NOT NULL AUTO_INCREMENT ,
`name` VARCHAR( 150 ) NOT NULL ,
`author_id` INT UNSIGNED NOT NULL ,
PRIMARY KEY ( `id` ) ,
INDEX ( `author_id` )
) ENGINE = InnoDB
0
scvinodkumar's Avatar, Join Date: Nov 2009
Newbie Member
Is this both query are similar? or which query is best to you use.

SELECT name FROM employees WHERE skill_sets REGEXP '[[:<:]]php[[:>:]]'

SELECT name FROM employees WHERE skill_sets LIKE '%php%'