This is continuation of my previous post : http://vkoracle.blogspot.com/2012/08/query-not-using-index-1-of-4.html
Index Selectivity - Selectivity decide which data access method is optimum for execution plan. As a guideline index should be created on columns when query is returning 15% or less of table rows. Selectivity ranges from 0 to 1, 1 being ideal and it can be 1 on primary keys or unique keys indexes. The higher the index selectivity( near to 1) the more chances of optimizer using it for index scan.
Selectivity = No of distinct values in column/Total number of rows.
Selectivity ranges from 0 to 1 and 1 is ideal. Selectivity of 1 exists in primary key or unique keys columns where each value is unique. The poor index selectivity means - the less chance of index being utilized.
Selectivity Example:
SQL>Drop table test;
SQL>Create table test( id number, name varchar(10)) ;
SQL>Insert into test select round(DBMS_RANDOM.VALUE(1,15)), upper(DBMS_RANDOM.STRING('A',10)) NAME from user_objects;
541 Records inserted.
SQL>Create index idx_test_id on test(id);
SQL>Create index idx_test_name on test(name);
SQL>Exec dbms_stats.gather_table_stats(user, 'test', method_opt=>'for all columns size auto' );
SQL>Exec dbms_stats.gather_table_stats(user, 'test' );
SQL>Select index_name,distinct_keys, num_rows from user_indexes where table_name='TEST'
Index_name distinct_keys num_rows
------------------- -------------- --------------
IDX_TEST_NAME 541 541
IDX_TEST_ID 15 541
There are all unique values in IDX_TEST_NAME and selectivity of doing index scan would be 1.
Select id, count(1), round(count(1)/541,2)*100||'%' "% of Total" from test group by id order by id
ID Count %Of Total
-- ---- -------
1 34 6%
2 39 7%
3 33 6%
4 36 7%
5 30 6%
6 47 9%
7 31 6%
8 30 6%
9 36 7%
10 50 9%
11 32 6%
12 45 8%
13 40 7%
14 42 8%
15 16 3%
Select distinct_keys, B.NUM_ROWS, round(distinct_keys/b.num_rows*100)||'%' "Selectivity of using Index" from user_indexes a, user_tables b where a.table_name=b.table_name and a.table_name = 'TEST' and index_name like 'IDX%';
Distinct_Keys Num_rows Selectivity of using Index
-------------- ------------ -----------------------------
15 541 3%
541 541 100%
SQL> select * from test where id=1;
Execution Plan
----------------------------------------------------------
---------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 34 | 476 | 2 (0)|
| 1 | TABLE ACCESS FULL| TEST | 34 | 476 | 2 (0)|
---------------------------------------------------------------
SQL> select * from test where id=2;
Execution Plan
----------------------------------------------------------
---------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
---------------------------------------------------------------
| 0 | SELECT STATEMENT | | 39 | 546 | 2 (0)|
| 1 | TABLE ACCESS FULL| TEST | 39 | 546 | 2 (0)|
---------------------------------------------------------------
SQL> select * from test where id=16;
Execution Plan
----------------------------------------------------------
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 7 | 98 | 2 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 7 | 98 | 2 (0)|
| 2 | INDEX RANGE SCAN | IDX_TEST_ID | 7 | | 1 (0)|
--------------------------------------------------------------------------------
In case of id=1 or 2 optimizer thinks the cost of doing table scan (2) would be cheaper than cost of index scan( 2(index scan) +2(table scan) =4). Look into the following cost when I am forcing optimizer to use index.
SQL> select /*+ index(test, idx_test_id) */ * from test where id=2;
Execution Plan
----------------------------------------------------------
--------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
--------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 34 | 504 | 4 (0)|
| 1 | TABLE ACCESS BY INDEX ROWID| TEST | 34 | 504 | 4 (0)|
| 2 | INDEX RANGE SCAN | IDX_TEST_ID | 34 | | 1 (0)|
--------------------------------------------------------------------------------