Saturday, February 6, 2010

Mystry of Regular Expressions in Oracle

Hi,

With Database 10g Oracle came up with this new Regular expression support which takes SQL to the new level. i had no background of regular expression so it took me quite a while to digest it so i m sharing it with u.

By using Regular expression functions we can find solutions of many problems like u can replace your old methods of using nesting functions like substr or instr to achieve the result and also use to validate some specific patterns of telephone nos, zip codes, email addresses and so on.

These functions works same as traditional Oracle functions but with additional regular expression support.

There are 4 modes in Regular expressions functions ('x' in 11g)
'c'  Case sensitive matching
'i'  Insensitive case matching
'm' Treat source string as multi-line activating Anchor chars

'n' Allow the period (.) to match any newline character
'x'  Ignores whitespace characters. By default, whitespace characters match themselves.

Oracle 10g has four regular expressions

- REGEXP_LIKE (source, regexp, modes) (This is Condition function can only be used in Where clause)
- REGEXP_SUBSTR(source, regexp, position, occurrence, modes)
- REGEXP_INSTR (source, regexp, position, occurrence, return_option, modes)
- REGEXP_REPLACE (source, regexp, replacement, position, occurrence, modes)

In Oracle 11g we have 1 more addition

- REGEXP_COUNT (source, regexp, position, occurrence, modes)

Today i m covering REGEXP_LIKE

How it works
- Insert the sample data into test table as well as we will use SCOTT Schema for some examples.
Note: There are many ways to achieve the same results i m providing examples as per my understanding.


create table regexp_table (string_value varchar2(60));
insert into regexp_table values('Car');
insert into regexp_table values('Cartoon');
insert into regexp_table values('bicarbonate');
insert into regexp_table values('www.oracle.com');
insert into regexp_table values('myemail@mydomain.com');
insert into regexp_table values('myemail@gmail.com');
insert into regexp_table values('123-09998-2123');
insert into regexp_table values('+7701-815-3456');
insert into regexp_table values('http://www.google.com/mail');
insert into regexp_table values('http://baigsorcl.blogspot.com/');
insert into regexp_table values('My Blog Examples');
insert into regexp_table values('My Regular Expression Examples');
insert into regexp_table values('$10,000.00');
insert into regexp_table values('12323423434234');
insert into regexp_table values('abc.xyz');
insert into regexp_table values('1951-2007');
commit; 

- Find all rows where word 'car' occurs regardless of Case.


select * from regexp_table
where regexp_like(string_value, 'car','i');

STRING_VALUE
------------------------
Car
Cartoon
bicarbonate

- Find all rows which ends with '.com'  and case sensitive

select * from regexp_table
where regexp_like(string_value, '.com$','c');

STRING_VALUE                                                
---------------------------------
www.oracle.com                                              
myemail@mydomain.com                                        
myemail@gmail.com                                           



-Find rows with only numbers in a column

select * from regexp_table
where regexp_like(string_value, '^[0-9]+$');

STRING_VALUE                                    
---------------------------------
12323423434234                                              

- Find all employees who were hired between 1982 to 1987


select empno, ename, hiredate from scott.emp
where regexp_like(to_char(hiredate,'RRRR'),'^198[2-7]$');

EMPNO                  ENAME      HIREDATE                 
---------------------- ---------- -------------------------
7788                   SCOTT      19-APR-87                
7876                   ADAMS      23-MAY-87                
7934                   MILLER     23-JAN-82                


- Find rows contains either 'http' or '@' in source

select * from regexp_table
where regexp_like(string_value, 'http|@');

STRING_VALUE                                                
------------------------------------------------------------
myemail@mydomain.com                                        
myemail@gmail.com                                           
http://www.google.com/mail                                  
http://baigsorcl.blogspot.com/                              

- Find rows with '$' sign in the source 
(Remember $ is special character for regular expression for this u have to use escape character \)

select * from regexp_table
where regexp_like(string_value,'\$');

STRING_VALUE                                                
------------------------
$10,000.00                                                  

- Find rows with " " space as 3rd character in the string

select * from regexp_table
where regexp_like(string_value,'^.. .');

STRING_VALUE                                                
------------------------------------------------------------
My Blog Examples                                            
My Regular Expression Examples                              


More details can found at Oracle documentation , PSOUG website and for basics about regular expressions at Wikipedia

More examples i will post soon.

Have a nice day,
Baig

3 comments:

  1. such a nice information sharing and thanks for that.
    - Hire Dedicated Resources

    ReplyDelete
  2. HI Baig,

    This is pavan. I am novice in this.

    IN my table one column may contains data like this

    variable=1
    abc=abc
    1=1
    variable=variable
    true=true

    I want to retrieve columns which have same data before and after '='
    example. 1=1 and variable=variable.
    The equal symbol '=' is always there.
    whatever data is there before '=' same data should be there after '=' .
    I want to retrieve such data column.

    Can I use regexp_like ? if yes How can I use it?
    are there any other ways?
    Please hlep me.

    ReplyDelete
    Replies
    1. Hi...read through this document you will understand it http://psoug.org/reference/regexp.html

      Zeeshan

      Delete