Go4Expert

Go4Expert (http://www.go4expert.com/)
-   Oracle (http://www.go4expert.com/articles/oracle/)
-   -   Difference between rank, dense_rank and row_number function in Oracle (http://www.go4expert.com/articles/difference-rank-denserank-rownumber-t29532/)

bashamsc 7Mar2013 14:52

Difference between rank, dense_rank and row_number function in Oracle
 
We will try to understand the difference by looking into below examples.

Let us suppose we have a table called rnk_tbl with below data

Code:

EMP_NO        MNG_NO        DPT_NO        EFF_DT                END_DT

123        789        345        01/01/2011        12/31/2011
123        789        345        01/01/2010        12/31/2011
123        789        678        01/01/2012        12/31/2099
123        789        345        01/01/2011        12/31/2011
123        789        345        01/01/2010        12/31/2011
123        789        678        01/01/2012        12/31/2099


Now we will use all the three functions in below query to see the differences

Code:

select emp_no,mng_no,dpt_no,eff_dt,end_dt,
rank() over (partition by emp_no,mng_no,dpt_no order by eff_dt desc,end_dt desc ) rnk,
dense_rank() over (partition by emp_no,mng_no,dpt_no order by eff_dt desc,end_dt desc ) dense_rnk,
row_number() over (partition by emp_no,mng_no,dpt_no order by eff_dt desc,end_dt desc ) row_num
from rnk_tbl


We will get below query

Code:

EMP_NO        MNG_NO        DPT_NO        EFF_DT                END_DT                RNK        DENSE_RNK        ROW_NUM

123        789        345        01/01/2011        12/31/2011        1        1                1
123        789        345        01/01/2011        12/31/2011        1        1                2
123        789        345        01/01/2010        12/31/2011        3        2                3
123        789        345        01/01/2010        12/31/2011        3        2                4
123        789        678        01/01/2012        12/31/2099        1        1                1
123        789        678        01/01/2012        12/31/2099        1        1                2


By looking into the output we can clearly understand the difference between three functions.

Rank function will skip the sequence while assigning the rank for the group of partition and assigns same rank for equals.
As we observe in the above output it skipped 2 for the group of partition (123,789,345). Again it re-started new rank sequence for new partition group.

Dense_rank function will not skip the sequence while assigning the rank for the group of partition and assigns same rank for equals.

Row_number function will just assigns a sequence for the group of partition.

Now we will see with a example how row_number function is useful in eliminating duplicates rows.


Take same table rnk_tbl and data as above.

We can clearly see duplicates in the table.

Below query will be usefull in identifying the duplicates in the table by using row_number function

Code:

select emp_no,mng_no,dpt_no,eff_dt,end_dt,
rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) rnk,
dense_rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) dense_rnk,
row_number() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) row_num
from rnk_tbl

The output will be

Code:

EMP_NO        MNG_NO        DPT_NO        EFF_DT                END_DT                RNK        DENSE_RNK        ROW_NUM

123        789        345        01/01/2010        12/31/2011        1        1                1
123        789        345        01/01/2010        12/31/2011        1        1                2
123        789        345        01/01/2011        12/31/2011        1        1                1
123        789        345        01/01/2011        12/31/2011        1        1                2
123        789        678        01/01/2012        12/31/2099        1        1                1
123        789        678        01/01/2012        12/31/2099        1        1                2

In the above output we can clearly see that rnk and dense_rnk gave same rnk for duplicate records and row_number function incremented the sequence.

If we use below query the duplicates will be removed

Code:

select * from
(
select emp_no,mng_no,dpt_no,eff_dt,end_dt,
rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) rnk,
dense_rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) dense_rnk,
row_number() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) row_num
from rnk_tbl
)
where row_num=1

Output will be as below

Code:

EMP_NO        MNG_NO        DPT_NO        EFF_DT                END_DT                RNK        DENSE_RNK        ROW_NUM

123        789        345        01/01/2010        12/31/2011        1        1                1
123        789        345        01/01/2011        12/31/2011        1        1                1
123        789        678        01/01/2012        12/31/2099        1        1                1

If we want to only duplicate records then we need to use below query

Code:

select * from
(
select emp_no,mng_no,dpt_no,eff_dt,end_dt,
rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) rnk,
dense_rank() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) dense_rnk,
row_number() over (partition by emp_no,mng_no,dpt_no,eff_dt,end_dt order by eff_dt desc,end_dt desc ) row_num
from rnk_tbl
)
where row_num>1

Output will be

Code:

EMP_NO        MNG_NO        DPT_NO        EFF_DT                END_DT                RNK        DENSE_RNK        ROW_NUM

123        789        345        01/01/2010        12/31/2011        1        1                2
123        789        345        01/01/2011        12/31/2011        1        1                2
123        789        678        01/01/2012        12/31/2099        1        1                2



All times are GMT +5.5. The time now is 20:00.