It's All About ORACLE

Oracle - The number one Database Management System. Hope this Blog will teach a lot about oracle.

Showing posts with label Indexes. Show all posts
Showing posts with label Indexes. Show all posts

Index Organized Table - IOT

An index-organized table (IOT) is a type of table that stores data in a B*Tree index structure.
Normal relational tables, called heap-organized tables, store rows in any order (unsorted). In contrast to this, index-organized tables store rows in a B-tree index structure that is logically sorted in primary key order. Unlike normal primary key indexes, which store only the columns included in its definition, IOT indexes store all the columns of the table (see below for an exception to this rule - called the overflow area).

Properties and restrictions

  • An IOT must contain a primary key.
  • Rows are accessed via a logical rowid and not a physical rowid like in heap-organized tables.
  • An IOT cannot be in a cluster.
  • An IOT cannot contain a column of LONG data type.
  • You cannot modify an IOT index property using ALTER INDEX (error ORA-25176), you must use an ALTER TABLE instead.

Advantages of an IOT

  • As an IOT has the structure of an index and stores all the columns of the row, accesses via primary key conditions are faster as they don't need to access the table to get additional column values.
  • As an IOT has the structure of an index and is thus sorted in the order of the primary key, accesses of a range of primary key values are also faster.
  • As the index and the table are in the same segment, less storage space is needed.
  • In addition, as rows are stored in the primary key order, you can further reduce space with key compression.
  • As all indexes on an IOT uses logical rowids, they will not become unusable if the table is reorganized.

Row overflow area

If some columns of the table are infrequently accessed, it is possible to offload them into another segment named the overflow area. An overflow segment will decrease the size of the main (or top) segment and will increase the performance of statements that do not need access the columns in the overflow area. The overflow segments can reside in a tablespace different from the main segments.
Notes:
  • The overflow area can contains only columns that are not part of the primary key.
  • If a row cannot fit in a block, you must define an overflow area.
  • Consequently, the primary key values of an IOT must fit in a single block.
The columns of the table that are recorded in the overflow segment are defined using the PCTHRESHOLD and/or INCLUDING options of the OVERFLOW clause (see example below).

Example of an IOT without an overflow area

The following example creates a simple IOT table and shows the objects and segments that are created. (This example was tested on Oracle versions 9.2 to 11.2.)
SQL> CREATE TABLE my_iot (id INTEGER PRIMARY KEY, value VARCHAR2(50))
  2    ORGANIZATION INDEX;

Table created.

SQL> SELECT table_name, iot_type, iot_name FROM user_tables;

TABLE_NAME                     IOT_TYPE     IOT_NAME
------------------------------ ------------ ------------------------------
MY_IOT                         IOT

SQL> SELECT index_name, index_type, table_name FROM user_indexes;

INDEX_NAME                     INDEX_TYPE                  TABLE_NAME
------------------------------ --------------------------- ------------------------------
SYS_IOT_TOP_71133              IOT - TOP                   MY_IOT

SQL> SELECT object_id, object_name, object_type FROM user_objects ORDER BY 1;

 OBJECT_ID OBJECT_NAME          OBJECT_TYPE
---------- -------------------- -------------------
     71133 MY_IOT               TABLE
     71134 SYS_IOT_TOP_71133    INDEX

SQL> SELECT segment_name, segment_type FROM user_segments ORDER BY 1;

SEGMENT_NAME         SEGMENT_TYPE
-------------------- ------------------
SYS_IOT_TOP_71133    INDEX
Note: In 11g, you must use the following syntax to see the same output:
CREATE TABLE my_iot (id INTEGER PRIMARY KEY, value VARCHAR2(50)) 
  SEGMENT CREATION IMMEDIATE 
  ORGANIZATION INDEX;
The reason for is that by default in 11g the segment is created only when the first row is inserted.
As you can see 2 objects are created: the table and the index, but there is only 1 segment (implementation of the object) which is the index one.
The name of the index is by default "SYS_IOT_TOP_" and its type is "IOT - TOP". You can choose the name of your index using the following syntax:
SQL> CREATE TABLE my_iot (id INTEGER CONSTRAINT my_iot_pk PRIMARY KEY, value VARCHAR2(50))
  2    ORGANIZATION INDEX;

Table created.

SQL> SELECT table_name, iot_type, iot_name FROM user_tables;

TABLE_NAME                     IOT_TYPE     IOT_NAME
------------------------------ ------------ ------------------------------
MY_IOT                         IOT

SQL> SELECT index_name, index_type, table_name FROM user_indexes;

INDEX_NAME                     INDEX_TYPE                  TABLE_NAME
------------------------------ --------------------------- ------------------------------
MY_IOT_PK                      IOT - TOP                   MY_IOT

SQL> SELECT object_id, object_name, object_type FROM user_objects ORDER BY 1;

 OBJECT_ID OBJECT_NAME          OBJECT_TYPE
---------- -------------------- -------------------
     71135 MY_IOT               TABLE
     71136 MY_IOT_PK            INDEX

SQL> SELECT segment_name, segment_type FROM user_segments ORDER BY 1;

SEGMENT_NAME         SEGMENT_TYPE
-------------------- ------------------
MY_IOT_PK            INDEX
The IOT_NAME column is empty and will be useful in the example of the next section.


Example of an IOT with an overflow area

The following example creates an IOT with an overflow area and shows the objects and segments that are created. (This example was tested in versions 9.2 to 11.2, see the note in the previous section about 11g.)
SQL> CREATE TABLE my_iot (id INTEGER PRIMARY KEY, value VARCHAR2(50), comments varchar2(1000))
2 ORGANIZATION INDEX 3 INCLUDING value OVERFLOW; Table created. SQL> SELECT table_name, iot_type, iot_name FROM user_tables ORDER BY 1; TABLE_NAME IOT_TYPE IOT_NAME ------------------------------ ------------ ------------------------------ MY_IOT IOT SYS_IOT_OVER_71142 IOT_OVERFLOW MY_IOT SQL> SELECT table_name, column_name FROM user_tab_columns ORDER by table_name, column_id; TABLE_NAME COLUMN_NAME ------------------------------ ------------------------------ MY_IOT ID MY_IOT VALUE MY_IOT COMMENTS SQL> SELECT index_name, index_type, table_name, include_column FROM user_indexes; INDEX_NAME INDEX_TYPE TABLE_NAME INCLUDE_COLUMN ------------------------------ --------------------------- ------------------------- -------------- SYS_IOT_TOP_71142 IOT - TOP MY_IOT 2 SQL> SELECT object_id, object_name, object_type FROM user_objects ORDER BY 1; OBJECT_ID OBJECT_NAME OBJECT_TYPE ---------- -------------------- ------------------- 71142 MY_IOT TABLE 71143 SYS_IOT_OVER_71142 TABLE 71144 SYS_IOT_TOP_71142 INDEX SQL> SELECT segment_name, segment_type FROM user_segments ORDER BY 1; SEGMENT_NAME SEGMENT_TYPE -------------------- ------------------ SYS_IOT_OVER_71142 TABLE SYS_IOT_TOP_71142 INDEX
All the columns up to and including the one named in the INCLUDING option of the OVERFLOW clause are in the top segment; the remaining ones are in the overflow segment.
Here, we see that 2 table objects are created, the main one is of iot_type IOT and the overflow one is of iot_type IOT_OVERFLOW. The name of the overflow table is SYS_IOT_OVER_. You can see now the purpose of the
IOT_NAME column, it gives the name of the IOT table for its overflow one.
You can also see that 2 segments are created: the index of the IOT and the overflow area.
In the end, the USER_INDEXES view gives you the last column included in the index in its INCLUDE_COLUMN column.

FAQ

How to move an IOT into another tablespace?

Based on the example used in the previous section:
SQL> SELECT segment_name, segment_type, tablespace_name FROM user_segments ORDER BY 1;

SEGMENT_NAME                   SEGMENT_TYPE       TABLESPACE_NAME
------------------------------ ------------------ ------------------------------
SYS_IOT_OVER_71142             TABLE              TS_D01
SYS_IOT_TOP_71142              INDEX              TS_D01
Let's assume you want to move all the segments from their current tablespace to another tablespace. You'll quickly discover that you cannot use the standard commands:
SQL> ALTER TABLE sys_iot_over_71142 MOVE TABLESPACE ts_i01;

ALTER TABLE sys_iot_over_71142 MOVE TABLESPACE ts_i01
            *
ERROR at line 1:
ORA-25191: cannot reference overflow table of an index-organized table

SQL> ALTER INDEX sys_iot_top_71142 REBUILD TABLESPACE ts_i01;

ALTER INDEX sys_iot_top_71142 REBUILD TABLESPACE ts_i01
*
ERROR at line 1:
ORA-28650: Primary index on an IOT cannot be rebuilt
You have to first find the name of the associated IOT table:
SQL> SELECT iot_name FROM user_tables WHERE table_name = 'SYS_IOT_OVER_71142';

IOT_NAME
------------------------------
MY_IOT

SQL> SELECT table_name FROM user_indexes WHERE index_name = 'SYS_IOT_TOP_71142';

TABLE_NAME
------------------------------
MY_IOT

Then you can move the segments:
SQL> ALTER TABLE my_iot MOVE TABLESPACE ts_i01;

Table altered.
However, this is only the main part of the IOT - the overflow part continues to reside in its original tablespace:
SQL> SELECT segment_name, segment_type, tablespace_name FROM user_segments ORDER BY 1;
SEGMENT_NAME                   SEGMENT_TYPE       TABLESPACE_NAME
------------------------------ ------------------ ------------------------------
SYS_IOT_OVER_71142             TABLE              TS_D01
SYS_IOT_TOP_71142              INDEX              TS_I01
To move the overflow area also, an additional statement is necessary:
 SQL> ALTER TABLE my_iot MOVE OVERFLOW TABLESPACE ts_i01;

Table altered.

SQL> SELECT segment_name, segment_type, tablespace_name FROM user_segments ORDER BY 1;

SEGMENT_NAME                   SEGMENT_TYPE       TABLESPACE_NAME
------------------------------ ------------------ ------------------------------
SYS_IOT_OVER_71142             TABLE              TS_I01
SYS_IOT_TOP_71142              INDEX              TS_I01
Both steps can be combined in one statement:
SQL> ALTER TABLE my_iot MOVE TABLESPACE ts_d01 OVERFLOW TABLESPACE ts_d01;

Table altered.


How to know which columns are in the overflow segment?

Remember that the INCLUDE_COLUMN of the USER_INDEXES view gives you the last column included in the index and so in the top segment, so joining this view with the USER_TAB_COLUMNS you can separate the columns that are in the top segment from those that are in the overflow one as with the following statement:
SQL> SELECT c.table_name, c.column_name,
2 CASE WHEN c.column_id <= i.include_column THEN 'TOP' ELSE 'OVERFLOW' END segment 3 FROM user_tab_columns c, user_indexes i 4 WHERE i.table_name = c.table_name 5 ORDER by table_name, column_id; TABLE_NAME COLUMN_NAME SEGMENT ------------------------------ ------------------------------ -------- MY_IOT ID TOP MY_IOT VALUE TOP MY_IOT COMMENTS OVERFLOW

http://www.orafaq.com/wiki/Index-organized_table

http://docs.oracle.com/cd/B28359_01/server.111/b28310/tables012.htm#ADMIN11685

http://www.dba-oracle.com/t_index_organized_tables.htm

Oracle Bitmap Indexes Limitations/Restrictions

Overview

Oracle's two major index types are Bitmap indexes and B-Tree indexes. B-Tree indexes are the regular type that OLTP systems make much use of, and bitmap indexes are a highly compressed index type that tends to be used primarily for data warehouses.

Characteristic of Bitmap Indexes
  • For columns with very few unique values (low cardinality)
Columns that have low cardinality are good candidates (if the cardinality of a column is <= 0.1 %  that the column is ideal candidate, consider also 0.2% – 1%)
  • Tables that have no or little insert/update are good candidates (static data in warehouse)
     
  • Stream of bits: each bit relates to a column value in a single row of table
create bitmap index person_region on person (region);

        Row     Region   North   East   West   South
        1       North        1      0      0       0
        2       East         0      1      0       0
        3       West         0      0      1       0
        4       West         0      0      1       0
        5       South        0      0      0       1                            6       North        1      0      0       0 

Advantage of Bitmap Indexes

The advantages of them are that they have a highly compressed structure, making them fast to read and their structure makes it possible for the system to combine multiple indexes together for fast access to the underlying table.
Compressed indexes, like bitmap indexes, represent a trade-off between CPU usage and disk space usage. A compressed structure is faster to read from disk but takes additional CPU cycles to decompress for access - an uncompressed structure imposes a lower CPU load but requires more bandwidth to read in a short time.
One belief concerning bitmap indexes is that they are only suitable for indexing low-cardinality data. This is not necessarily true, and bitmap indexes can be used very successfully for indexing columns with many thousands of different values.

Disadvantage of Bitmap Indexes

The reason for confining bitmap indexes to data warehouses is that the overhead on maintaining them is enormous. A modification to a bitmap index requires a great deal more work on behalf of the system than a modification to a b-tree index. In addition, the concurrency for modifications on bitmap indexes is dreadful. 


Bitmap Indexes and Deadlocks

Bitmap indexes are not appropriate for tables that have lots of single row DML operations (inserts) and especially concurrent single row DML operations. Deadlock situations are the result of concurrent inserts as the following example shows: Open two windows, one for Session 1 and one for Session 2
Session 1Session 2
create table bitmap_index_demo (
  value varchar2(20)
);

insert into bitmap_index_demo
select decode(mod(rownum,2),0,'M','F')
  from all_objects;
create bitmap index
  bitmap_index_demo_idx
  on bitmap_index_demo(value);

insert into bitmap_index_demo
  values ('M');
1 row created.


insert into bitmap_index_demo
  values ('F');
1 row created.
insert into bitmap_index_demo
  values ('F');
...... waiting ......

ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
insert into bitmap_index_demo
  values ('M');
...... waiting ......

Why Bitmap not suitable for Frequently Updating Columns of Table 

Bitmap should not be created on a table or column of a table which suffer more DML operations. I will demonstrate some cases where DML operation in one session on Bitmap index column affected or hanged the DML operation on other session. We will see how column with Bitmap index got locked.

First, create a table and create Bitmap index on it.
CREATE TABLE Bitmap_Demo ( EmpId number, sex char2(1));

INSERT INTO Bitmap_Demo
SELECT emp_btmp.NEXTVAL, CASE WHEN DBMS_RANDOM.VALUE(1,10)>5 THEN 'M' ELSE 'F' END 
FROM USER_OBJECTS;


CREATE BITMAP INDEX BTMP_EMP_IND ON BITMAP_DEMO(SEX);

Test case 1:  
Update Sex on EmpID 6 ( Previous Value: M, New Value: M) and check update of sex on another M employee

Session 1:
UPDATE  Bitmap_Demo SET SEX = 'M' where EmpId = 6;

Session 2:
SQL> UPDATE  Bitmap_Demo SET Sex = 'M' WHERE EmpId = 7;

1 row updated.

SQL> UPDATE  Bitmap_Demo SET Sex = 'F' WHERE EmpId = 7;

1 row updated.

SQL> INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'F');

1 row created.

SQL> INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M');

1 row created.

It allowed all operations.

Test case 2:  
Update Sex on EmpID 6 ( Previous Value: M; New Value: F) 

Session 1:
UPDATE  Bitmap_Demo SET Sex = 'F' WHERE EmpId = 6;

Session 2:
UPDATE  Bitmap_Demo SET Sex = 'M' WHERE EmpId = 7;
This statement got executed.

INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'F');
This statement hanged and waited for ROLLBACK, COMMIT operation on Session1.

UPDATE  Bitmap_Demo SET Sex = 'F' WHERE EmpId = 7;
This statement hanged and waited for ROLLBACK, COMMIT operation on Session1.

INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M');
This statement hanged and waited for ROLLBACK, COMMIT operation on Session1.

Test case 3:  
Insert new employee with Sex M 

Session 1:
INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M');

Session 2:
INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'F');
It is allowed to execute.

INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M');
New data insert with same Sex as in session 1. It hanged and waited for ROLLBACK, COMMIT operation on Session1.

UPDATE  Bitmap_Demo SET Sex = 'F' WHERE EmpId = 7;
Previous Sex of EmpId 7 was also M, when tried to change it to F, it hanged.  It hanged and waited for ROLLBACK, COMMIT operation on Session1. 

UPDATE  Bitmap_Demo SET Sex = 'M' WHERE EmpId = 11;
Previous Sex of EmpId 11 was also F, when tried to change it to M, it hanged.  It hanged and waited for ROLLBACK, COMMIT operation on Session1. 

Test Case 4: 
Insert both sex in both sessions: DEADLOCK
 INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M');
 1 row created.

Session 2:
 INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'F'); 
 1 row created.

Session 1:
 INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'F'); 
...... waiting ......

Session 2:
 INSERT INTO Bitmap_Demo VALUES(emp_btmp.NEXTVAL, 'M'); 
...... waiting ......

Session 1: Immediately after execution of insert M in session 2, error prompt on Session 1
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource

This clarified that we should never create BITMAP index on table/columns which are updated frequently like in OLTP applications with multi-user environment.

Oracle Bitmap Index Concepts

Bitmap indexes are widely used in data warehousing environments. The environments typically have large amounts of data and ad hoc queries, but a low level of concurrent DML transactions. For such applications, bitmap indexing provides:

Fully indexing a large table with a traditional B-tree index can be prohibitively expensive in terms of space because the indexes can be several times larger than the data in the table. Bitmap indexes are typically only a fraction of the size of the indexed data in the table.

An index provides pointers to the rows in a table that contain a given key value. A regular index stores a list of rowids for each key corresponding to the rows with that key value. In a bitmap index, a bitmap for each key value replaces a list of rowids.



Oracle bitmap indexes are very different from standard b-tree indexes. In bitmap structures, a two-dimensional array is created with one column for every row in the table being indexed. Each column represents a distinct value within the bitmapped index. This two-dimensional array represents each value within the index multiplied by the number of rows in the table.


Each bit in the bitmap corresponds to a possible rowid, and if the bit is set, it means that the row with the corresponding rowid contains the key value. A mapping function converts the bit position to an actual rowid, so that the bitmap index provides the same functionality as a regular index. If the number of different key values is small, bitmap indexes save space.

At row retrieval time, Oracle decompresses the bitmap into the RAM data buffers so it can be rapidly scanned for matching values. These matching values are delivered to Oracle in the form of a Row-ID list, and these Row-ID values may directly access the required information.



The real benefit of bitmapped indexing occurs when one table includes multiple bitmapped indexes. Each individual column may have low cardinality. The creation of multiple bitmapped indexes provides a very powerful method for rapidly answering difficult SQL queries.


A bitmap merge operation build ROWID lists

Using this bitmap merge methodology, Oracle can provide sub-second response time when working against multiple low-cardinality columns.

For example, assume there is a motor vehicle database with numerous low-cardinality columns such as car_color, car_make, car_model, and car_year. Each column contains less than 100 distinct values by themselves, and a b-tree index would be fairly useless in a database of 20 million vehicles.


However, combining these indexes together in a query can provide blistering response times a lot faster than the traditional method of reading each one of the 20 million rows in the base table. For example, assume we wanted to find old blue Toyota Corollas manufactured in 1981:

select
   license_plat_nbr
from
   vehicle
where
   color = ?blue?
and
   make = ?toyota?
and
   year = 1981;


Oracle uses a specialized optimizer method called a bitmapped index merge to service this query. In a bitmapped index merge, each Row-ID, or RID, list is built independently by using the bitmaps, and a special merge routine is used in order to compare the RID lists and find the intersecting values.


As the number if distinct values increases, the size of the bitmap increases exponentially, such that an index with 100 values may perform thousands of times faster than a bitmap index on 1,000 distinct column values. 
Also, remember that bitmap indexes are only suitable for static tables and materialized views which are updated at nigh and rebuilt after batch row loading.  If your tables are not read-only during query time, DO NOT consider using bitmap indexes!



  • 1 - 7 distinct key values - Queries against bitmap indexes with a low cardinality are very fast.
  • 8-100 distinct key values - As the number if distinct values increases, performance decreases proportionally.
  • 100 - 10,000 distinct key values - Over 100 distinct values, the bitmap indexes become huge and SQL performance drops off rapidly.
  • Over 10,000 distinct key values - At this point, performance is ten times slower than an index with only 100 distinct values.

    Cardinality

    The advantages of using bitmap indexes are greatest for columns in which the ratio of the number of distinct values to the number of rows in the table is under 1%. We refer to this ratio as the degree of cardinality. A gender column, which has only two distinct values (male and female), is ideal for a bitmap index. However, data warehouse administrators also build bitmap indexes on columns with higher cardinalities.

    For example, on a table with one million rows, a column with 10,000 distinct values is a candidate for a bitmap index. A bitmap index on this column can outperform a B-tree index, particularly when this column is often queried in conjunction with other indexed columns. In fact, in a typical data warehouse environments, a bitmap index can be considered for any non-unique column.

    B-tree indexes are most effective for high-cardinality data: that is, for data with many possible values, such as customer_name or phone_number. In a data warehouse, B-tree indexes should be used only for unique columns or other columns with very high cardinalities (that is, columns that are almost unique). The majority of indexes in a data warehouse should be bitmap indexes.

    In ad hoc queries and similar situations, bitmap indexes can dramatically improve query performance. AND and OR conditions in the WHERE clause of a query can be resolved quickly by performing the corresponding Boolean operations directly on the bitmaps before converting the resulting bitmap to rowids. If the resulting number of rows is small, the query can be answered quickly without resorting to a full table scan.

    Example 6-1 Bitmap Index

    The following shows a portion of a company's customers table.
    SELECT cust_id, cust_gender, cust_marital_status, cust_income_level
    FROM customers;
    
    CUST_ID    C CUST_MARITAL_STATUS  CUST_INCOME_LEVEL
    ---------- - -------------------- ---------------------
    ... 
            70 F                      D: 70,000 - 89,999
            80 F married              H: 150,000 - 169,999
            90 M single               H: 150,000 - 169,999
           100 F                      I: 170,000 - 189,999
           110 F married              C: 50,000 - 69,999
           120 M single               F: 110,000 - 129,999
           130 M                      J: 190,000 - 249,999
           140 M married              G: 130,000 - 149,999
    ...
    
    

    Because cust_gendercust_marital_status, and cust_income_level are all low-cardinality columns (there are only three possible values for marital status and region, two possible values for gender, and 12 for income level), bitmap indexes are ideal for these columns. Do not create a bitmap index on cust_id because this is a unique column. Instead, a unique B-tree index on this column provides the most efficient representation and retrieval.

    Bitmap Join Indexes

    In addition to a bitmap index on a single table, you can create a bitmap join index, which is a bitmap index for the join of two or more tables. A bitmap join index is a space efficient way of reducing the volume of data that must be joined by performing restrictions in advance. For each value in a column of a table, a bitmap join index stores the rowids of corresponding rows in one or more other tables. In a data warehousing environment, the join condition is an equi-inner join between the primary key column or columns of the dimension tables and the foreign key column or columns in the fact table.
    Bitmap join indexes are much more efficient in storage than materialized join views, an alternative for materializing joins in advance. This is because the materialized join views do not compress the rowids of the fact tables.

    Example 6-3 Bitmap Join Index: Example 1
    Creating a bitmap join index with the following sales table:

    SELECT time_id, cust_id, amount FROM sales;
    
    TIME_ID   CUST_ID    AMOUNT
    --------- ---------- ----------
    01-JAN-98      29700       2291
    01-JAN-98       3380        114
    01-JAN-98      67830        553
    01-JAN-98     179330          0
    01-JAN-98     127520        195
    01-JAN-98      33030        280
    ...
    
    CREATE BITMAP INDEX sales_cust_gender_bjix
    ON sales(customers.cust_gender)
    FROM sales, customers
    WHERE sales.cust_id = customers.cust_id
    LOCAL;
    
    
    The following query shows how to use this bitmap join index and illustrates its bitmap pattern:
    SELECT sales.time_id, customers.cust_gender, sales.amount
    FROM sales, customers
    WHERE sales.cust_id = customers.cust_id;
    
    TIME_ID   C AMOUNT
    --------- - ----------
    01-JAN-98 M       2291
    01-JAN-98 F        114
    01-JAN-98 M        553
    01-JAN-98 M          0
    01-JAN-98 M        195
    01-JAN-98 M        280
    01-JAN-98 M         32
    
    
    Example 6-4 Bitmap Join Index: Example 2
    You can create a bitmap join index on more than one column, as in the following example, which uses customers(gender, marital_status):
    CREATE BITMAP INDEX sales_cust_gender_ms_bjix
    ON sales(customers.cust_gender, customers.cust_marital_status)
    FROM sales, customers
    WHERE sales.cust_id = customers.cust_id
    LOCAL NOLOGGING;
    Example 6-5 Bitmap Join Index: Example 3
    
    You can create a bitmap join index on more than one table, as in the following, which uses customers(gender) and products(category):
    CREATE BITMAP INDEX sales_c_gender_p_cat_bjix
    ON sales(customers.cust_gender, products.prod_category)
    FROM sales, customers, products
    WHERE sales.cust_id = customers.cust_id
    AND sales.prod_id = products.prod_id
    LOCAL NOLOGGING;
    
    Example 6-6 Bitmap Join Index: Example 4
    
    You can create a bitmap join index on more than one table, in which the indexed column is joined to the indexed table by using another table. For example, we can build an index on countries.country_name, even though the countries table is not joined directly to the sales table. Instead, the countries table is joined to the customers table, which is joined to the sales table. This type of schema is commonly called a snowflake schema.
    CREATE BITMAP INDEX sales_c_gender_p_cat_bjix
    ON sales(customers.cust_gender, products.prod_category)
    FROM sales, customers, products
    WHERE sales.cust_id = customers.cust_id
    AND sales.prod_id = products.prod_id
    LOCAL NOLOGGING;
    
    

    Bitmap Join Index Restrictions

    Join results must be stored, therefore, bitmap join indexes have the following restrictions:
    • Parallel DML is currently only supported on the fact table. Parallel DML on one of the participating dimension tables will mark the index as unusable.
    • Only one table can be updated concurrently by different transactions when using the bitmap join index.
    • No table can appear twice in the join.
    • You cannot create a bitmap join index on an index-organized table or a temporary table.
    • The columns in the index must all be columns of the dimension tables.
    • The dimension table join columns must be either primary key columns or have unique constraints.
    • If a dimension table has composite primary key, each column in the primary key must be part of the join.

    You will want a bitmap index when:

     Bitmap indexes are primarily intended for data warehousing applications where users query the data rather than update it. They are not suitable for OLTP applications with large numbers of concurrent transactions modifying the data.
    1 - Table column is low cardinality - As a ROUGH guide, consider a bitmap for any index with less than 100 distinct values
        select region, count(*) from sales group by region;
    2 - The table has LOW DML - You must have low insert./update/delete activity.  Updating bitmapped indexes take a lot of resources, and bitmapped indexes are best for largely read-only tables and tables that are batch updated nightly.
    3 - Multiple columns - Your SQL queries reference multiple, low cardinality values in there where clause.  Oracle cost-based SQL optimizer (CBO) will scream when you have bitmap indexes on . 

    Restrictions on Bitmap Indexes 

    Bitmap indexes are subject to the following restrictions:
    • You cannot specify BITMAP when creating a global partitioned index.
    • You cannot create a bitmap secondary index on an index-organized table unless the index-organized table has a mapping table associated with it.
    • You cannot specify both UNIQUE and BITMAP.
    • You cannot specify BITMAP for a domain index.

    Troubleshooting Oracle bitmap indexes:

    Some of the most common problems when implementing bitmap indexes include:
    1. Small table - The CBO may force a full-table scan if your table is small!

    2. Bad stats - Make sure you always analyze the bitmap with dbms_stats right after creation:
    CREATE BITMAP INDEX
    emp_bitmap_idx
    ON index_demo (gender);

    exec dbms_stats.gather_index_stats(OWNNAME=>'SCOTT', INDNAME=>'EMP_BITMAP_IDX');
      3. Test with a hint - To force the use of your new bitmap index, just use a Oracle INDEX hint:
    select /*+ index(emp emp_bitmap_idx) */
       count(*)
    from
       emp, dept
    where
       emp.deptno = dept.deptno;

    Using B-Tree Indexes - Indexes Fundamental

    Content:

    •  When B Tree Index should be used. We will try to understand some of the situations when the optimizer will use B Tree Index.
    •  Index Clustering Factor.
    •  Influence of parameter OPTIMIZER_MODE and OPTIMIZER_INDEX_COST_ADJ on using B Tree Index.

    When to use B-Tree Index:

    Although it can differ from situation to situation that when Oracle Optimizer will decide to use index or not, based on table size, available statistics, index clustering factor, several db parameters like OPTIMIZER_MODE and OPTIMIZER_INDEX_COST_ADJ, OPTIMIZER_INDEX_CACHING etc and other things.
    But mainly B*Tree index are used, when:
    1. You are fetching very small fraction of the rows from the table using index. B*Tree index should be created on the columns that are frequently used in predicates, join condition of query.
    2. You are fetching many/all rows of a table and the index can be used instead of the table (you are only selecting the columns that you have indexed).

    When we access the table using index, Oracle will scan the index and from the index leaf nodes it will get the ROWID, then using the rowid it will read the data block from the table segment. This is typically known as "TABLE ACCES BY ROWID". This is very efficient method when you are accessing a small percentage of rows but however its not that efficient while you are reading a large amount of rows. Now the statement "small percentage" is very much relative. In a thin table it might be 2-3% or 5% but in a fat table it may be upto 20-25%. And Index Clustering Factor has significant impact on this small percentage value.

    Table Access by Index Rowid

     An index is stored sorted by index key. The index might be ascending order or it might be descending order in case of Descending index. The index will be accessed in sorted order by key, it will be sequential access and from the leaf block it will get the rowid to access the data blocks of the table segment and data block are stored randomly, scattered in heap. Therefore when Oracle does a table access by index rowid, it will not access the table in sorted order, it will not go to block 1, block 2, block 3, rather it will search for blocks that are scattered in the heap. For example it might go to block 10, then block 533, then block 777, then again block 10. It will do n number of "table access by index Rowid" to get each and every row and to do it will have to read and reread blocks.

    There will be lots of scattered, single block read from here and there in table segment. Typically in the thin table a single block will hold more number of rows, while in a fat table a block will hold less number of rows.

    Suppose DB_BLOCK_SIZE = 8k. A thin table has 1 Million rows. And if rows are almost about 80 bytes in size, so logically there will be about 8k/80b = 100 rows per block. That means the table has approzimately 100*100 = 10,000 blocks.

    If read 2,00,000 rows via the index; there will be 2,00,000 "TABLE ACCESS BY ROWID" operations. Oracle will read the blocks 2,00,000 times to execute this query but there are only about 10,000 block in the entire table. If the index column data that you are fetching is scattered in all the 10000 blocks then it will read and reread 1 single block in the table on average 20 times. So in this case optimizer will prefer Full Table Scan (FTS) than using index.

    Now if a fat table has 1 million rows, and if avg size of row is about 1600k then per block there will have about 5 rows, so there will be about 2,00,000 blocks (20 times more than thin table), so if you are accessing 2,00,000 "TABLE ACCESS BY ROWID" then in avg you will read the block once, so there will be lesser number of rereading blocks. So optimizer might think about using the index. But this is just a rough measure, but this is actually how it works. Although index clustering factor have a great impact on this.

    CREATE TABLE FAT_EMP(
    EMPLOYEE_ID NUMBER,
    FIRST_NAME VARCHAR2(20),
    LAST_NAME VARCHAR2(25) NOT NULL,
    EMAIL VARCHAR2(232) NOT NULL,
    PHONE_NUMBER VARCHAR2(20),
    HIRE_DATE DATE NOT NULL,
    JOB_ID VARCHAR2(10) NOT NULL,
    SALARY NUMBER(8,2)
    COMMISSION_PCT NUMBER(2,2)
    MANAGER_ID NUMBER(6)
    DEPARTMENT_ID NUMBER(4),
    EMP_STS_FLAG    VARCHAR2(10),
    CTC    NUMBER,
    MANAGER_NAME VARCHAR2(60),
    FIRST_NAME2 VARCHAR2(20),
    LAST_NAME2 VARCHAR2(25) NOT NULL,
    EMAIL2 VARCHAR2(232) NOT NULL,
    PHONE_NUMBER2 VARCHAR2(20),
    HIRE_DATE2 DATE NOT NULL,
    JOB_ID2 VARCHAR2(10) NOT NULL,
    SALARY2 NUMBER(8,2)
    COMMISSION_PCT2 NUMBER(2,2)
    MANAGER_ID2 NUMBER(6)
    DEPARTMENT_ID2 NUMBER(4),
    EMP_STS_FLAG2    VARCHAR2(10),
    CTC2    NUMBER,
    MANAGER_NAME2 VARCHAR2(60),
    FIRST_NAME3 VARCHAR2(20),
    LAST_NAME3 VARCHAR2(25) NOT NULL,
    EMAIL3 VARCHAR2(232) NOT NULL,
    PHONE_NUMBER3 VARCHAR2(20),
    HIRE_DATE3 DATE NOT NULL,
    JOB_ID3 VARCHAR2(10) NOT NULL,
    SALARY3 NUMBER(8,2)
    .
    .
    .
    .
    MANAGER_ID7 NUMBER(6)
    DEPARTMENT_ID7 NUMBER(4),
    EMP_STS_FLAG7    VARCHAR2(10),
    CTC7    NUMBER,
    MANAGER_NAME7 VARCHAR2(60));


    CREATE TABLE THIN_EMP
    AS 
    SELECT EMPLOYEE_ID, FIRST_NAME, LAST_NAME, EMAIL, JOB_ID, SALARY, COMMISSION_PCT, CTC
    FROM FAT_EMP;

    ALTER TABLE FAT_EMP ADD CONSTRAINT FAT_EMP_PK PRIMARY_KEY (EMPLOYEE_ID);
    ALTER TABLE THIN_EMP ADD CONSTRAINT THIN_EMP_PK PRIMARY KEY (EMPLOYEE_ID);

    BEGIN
      SYS.DBMS_STATS.GATHER_TABLE_STATS (
    OwnName => 'SYS',
    TabName  => 'FAT_EMP',
    Method_Opt  => 'FOR ALL COLUMNS SIZE AUTO',
    Cascade  => 'TRUE');
    END;
    /

    BEGIN
      SYS.DBMS_STATS.GATHER_TABLE_STATS (
    OwnName => 'SYS',
    TabName  => 'THIN_EMP',
    Method_Opt  => 'FOR ALL COLUMNS SIZE AUTO',
    Cascade  => 'TRUE');
    END;
    /
    Both tables will now have around 1 million records. We have created primary key on these tables and by default indexes have been created on these tables.

    SELECT TABLE_NAME, BLOCKS, NUM_ROWS, NUM_ROWS/BLOCKS "Rows Per Block"
    FROM DBA_TABLES
    WHERE TABLE_NAME IN ('FAT_EMP', 'THIN_EMP');

    TABLE_NAME BLOCKS NUM_ROWS         "Rows Per Block"
    THIN_EMP 9197 1000000 108.731107966999
    FAT_EMP 203536 1000000 4.91313575976731

    Now selecting around 2% from each table:

    SELECT * FROM THIN_EMP WHERE EMPLOYEE_ID < 20000;

    SELECT * FROM FAT_EMP WHERE EMPLOYEE_ID < 20000;

    In both the cased there is TABLE ACCESS BY ROWID and index is being used for two percent.

    Now Let's check what happen when we fetch 5% of the record:
    SELECT * FROM THIN_EMP WHERE EMPLOYEE_ID < 50000;

    SELECT * FROM FAT_EMP WHERE EMPLOYEE_ID < 50000;

    In case of THIN_EMP table, it is not using the Index anymore. It is doing TABLE ACCESS FULL. But in case of FAT_EMP table it is still using INDEX RANGE SCAN and then TABLE ACCESS BY ROWID

    Now fetching 20% records from FAT_EMP table:
    SELECT * FROM FAT_EMP WHERE EMPLOYEE_ID < 200000;

    We were fetching around 20% records and it is still using the index for the fact table but for THIN_EMP table it was not using index even when we were fetching 5% of rows. 

    With this example we can clear our concept that FTS will be performed if oracle optimizer sees huge reread of blocks to fetch data from table ( as in case of THIN_EMP). In FAT_EMP table data was less in each block so number of reread of block was less.

    Index Clustering Factor is important factor on which Optimizer will decide whether to use index or not when we select 5% rows.

    Index Clustering Factor

    The clustering factor is a measure that indicates how many adjacent index key do not refer to the same data block in the table. It compares the order of the index with the degree of disorder in the table. It is typically the number of Block changes while ypu are reading the table using the index.

    If you look conceptually, we have one index leaf block and four data block, each block contaiing five rows. Suppose scan begin from 100 key value. This Rowid pointing to BLOCK 1 (2), counter will become 1. Now 101 it is going to Block 2 (1). There is a block change, it is reading from a different block so the counter will become 2. Now for Employee_ID 102, it is again referring to Block 1, there is again a block change, reading from Block1 (5). Counter value become 3. For Employee_ID 103, it again reading from different block (BLOCK 2), counter is then set to 4.  For 104 it is reading from same block (BLOCK 2) so counter will remain same 4. 105 referring to different block - BLOCK 1, counter changes to 5. 106 pointing to BLOCK 3 counter will change to 6. 107 and 108 are also pointing to BLOCK 3, counter will remain 6. Read of Employee_ID 109 and 110 mapping to BLOCK 1, again a block change and counter will become 7 and so on.
    Finally for Key Values 115 to 119, all data will be in BLOCK 4 and Clustering Factor will be 10.

    This is how Index Clustering Factor can be defined.
    •  If the clustering factor his high, then Oracle Database performs a relatively high number of I/O during index range scan. The index entries points to random table blocks, so the database may have to read and reread the same blocks over and over again to retrieve the data pointed to by the index.
    •  If the clustering factor is low, then Oracle Database performs a relatively low number of I/O during a large index range scan. The index keys in a range tend to point to the same data block, so the database does not have to read and reread the same block over and over. For example the read of Key Value 115 to 119 in the diagram above, all these values referring to block 4. So Low clustering factor is a good indicator of good index.
    •  The smallest possible value of clustering factor will be same as the number of table blocks, and the largest possible value will be the same as the number of rows in the table.

    We will create two table. One table will have Organized data in particular order of Primary Key 

    CREATE TABLE ORGANIZED (COL1 INT, COL2 VARCHAR2(200), COL3 VARCHAR2(200));

    BEGIN 
     FOR I IN 1..100000
     LOOP 
     INSERT INTO ORGANIZED VALUES(I, DBMS_RANDOM.STRING('x', 10),  DBMS_RANDOM.STRING('Y', 10));
     END LOOP; 
    COMMIT;
    END;
    /

    ALTER TABLE ORGANIZED ADD CONSTRAINT ORGANIZED_PK PRIMARY KEY(COL1);

    In case of sequence populated values using such FOR LOOP, the particular index will hold index key in the same sequence. Typically clustering factor will be low.

    Now Creating table DISORGANIZED.

    CREATE TABLE DISORGANIZED 
    AS SELECT COL1, COL2, COL3 FROM ORGANIZED ORDER BY COL2;

    By making it Order By COL2, values in DISORGANIZED table will not be stored in in order by COL1. It will be disorganized.

    ALTER TABLE DISORGANIZED ADD CONSTRAINT DISORGANIZED_PK PRIMARY KEY(COL1);

    EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'ORGANIZED');
    EXEC DBMS_STATS.GATHER_TABLE_STATS ('SCOTT', 'DISORGANIZED');

    SELECT IND.INDEX_NAME, IND.TABLE_NAME, IND.CLUSTERING_FACTOR, TAB.NUM_ROWS, TAB.BLOCKS
    FROM ALL_INDEXES IND, ALL_TABLES TAB
    WHERE IND.TABLE_NAME = TAB.TABLE_NAME
    AND IND.TABLE_NAME IN ('ORGANIZED','DISORGANIZED');

    SQL> SELECT IND.INDEX_NAME, IND.TABLE_NAME, IND.CLUSTERING_FACTOR, TAB.NUM_ROWS, TAB.BLOCKS
      2  FROM ALL_INDEXES IND, ALL_TABLES TAB
      3  WHERE IND.TABLE_NAME = TAB.TABLE_NAME
      4  AND IND.TABLE_NAME IN ('ORGANIZED','DISORGANIZED');

    INDEX_NAME                     TABLE_NAME                     CLUSTERING_FACTOR   NUM_ROWS     BLOCKS
    ------------------------------ ------------------------------ ----------------- ---------- ----------
    ORGANIZED_PK                   ORGANIZED                                    440     100000        496
    DISORGANIZED_PK                DISORGANIZED                               99766     100000        458

    In Organized table Clustering Factor is close to number of Blocks. This is a very good indicator. In the Disorganized table data is not organized by particular key value, clustering factor is very high, it is close to number of rows 100000. 

    SELECT * /* 0.1 percent */  FROM DISORGANIZED WHERE COL1 < 100;

    SELECT * /* 0.1 percent */  FROM ORGANIZED WHERE COL1 < 100;

    We are fetching 0.1 percent of records and in both cases it is using the index to access the table in both cases.

    Now fetching one percent records:

    SELECT * /* 1 percent */  FROM DISORGANIZED WHERE COL1 < 1000;

    SELECT * /* 1 percent */  FROM ORGANIZED WHERE COL1 < 1000;

    In DISORGANIZED table we are accessing just one percent rows but it is not using the Index any more. It is doing TABLE ACCESS FULL. In case of ORGANIZED table it is using Index.

    SELECT * /* 10 percent */  FROM ORGANIZED WHERE COL1 < 10000;

    It is still using index for 10,000 rows.

    SELECT * /* 15 percent */  FROM ORGANIZED WHERE COL1 < 15000;

    Even with 15 percent data fetch it is using index because the clustering factor is too low.

    Optimizer_Mode Effect

    Apart from clusting factor there are other various thing that has crucial affect on whether Optimizer will use the index or not. Among them most crucial is Optimizer_Mode parameter.

    ALL_ROWS:
      ALL_ROWS is the default mode. If the value of the Optimizer_Mode parameter is set to "ALL_ROWS" then the optimizer will attempt to find an execution plan that completes the statement (typically meaning "returning all rows") in the shortest possible time. ALL_ROWS mode is desinged to minimize computing resource & provide best throughput. The Default value of optimizer_mode in oracle 11g is set to ALL_ROWS. 
      If the Optimizer_Mode is set to ALL_ROWS then the CBO will favor FULL scan compared to index scan as the index scan is an additional IO.
      

    FIRST_ROWS:
     If the value of the Optimizer_Mode parameter is set to "FIRST_ROWS" the optimizer will attempt to find an execution plan to return the first row of a result set as fast as possible. This mode always prefer Index scan over FTS even if FTS is better option.
    It blindly rely on INDEX SCAN.   
    It was deprecated in Oracle 9i.

    FIRST_ROWS_N:
    It was introduced in Oracle 9i. The number N can be 1,10,100 or 1000 (using first_rows(n) hint, the number n can be any positive whole number). The FIRST_ROWS_N mode instructs the optimizer to choose a query execution plan that minimizes the response time to produce the first N rows of query result & it also favors index scan to access table row. It is useful in case of interactive front end /web app where you want to view first N number of rows as soon as possible. You don't care about the whole result set. At a time you want to see first 100 rows.

    Checking default Optimizer Mode:
    SQL> show parameter optimizer_mode

    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ------------------------------
    optimizer_mode                       string      ALL_ROWS

    Creating table EMP which has around 0.2 million records, we have also created Primary Key on Employee_ID, be dafault an Index is created and I have gathered the statistics.

    ALTER TABLE HR.T_EMP 
    ADD CONSTRAINT T_EMP_PK PRIMARY KEY (EMPLOYEE_ID);

    BEGIN
       SYS.DBMS_STATS.GATHER_TABLE_STATS(
    OwnName => 'HR',
    TabNmae =>  'T_EMP',
    Cascade =>  'TRUE');
    END;
    /
    SELECT * FROM HR.T_EMP
    WHERE EMPLOYEE_ID < 100000;


    It is doing TABLE ACCESS FULL. Now we will change the parameter to FIRST_ROWS.

    ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS;

    It is now using Index to access the 50% of the rows. Here now even if we fetch 90% rows, it will still use index no matter there are more IOs. FIRST_ROWS mode will always try to use an index if present on table.

    This is the problem with FIRST_ROWS mode. It blindly uses the Index.

    Now setting OPTIMIZER_MODE to FIRST_ROWS_100, the oracle will try to return first 100 rows.

    ALTER SESSION SET OPTIMIZER_MODE = FIRST_ROWS_100;

    SELECT * FROM HR.T_EMP WHERE EMPLOYEE_ID < 100000;


    We are trying to access 50% of the data and using FIRST_ROWS_100 as optimizer mode. In execution plan it is still using Index in execution plan to give the data. It uses the index to give you the first 100 rows as fast as possible. As can be seen from execution plan ROWS is 100. All the information of ROWS, Bytes, Cost and Time is of first 100 rows. 

    So even if you are fetching large number of rows, if your optimizer mode is FIRST_ROWS or FIRST_ROWS_100 oracle might use the index. 

    Impact of OPTIMZER_INDEX_COST_ADJ:

    •  Using OPTIMZER_INDEX_COST_ADJ parameter you can change the cost of the table access through index scans.
    •  Valid Values goes from 1 to 10,000. The default is 100.
    •  Values greater than 100 make index scans more expensive and favor full table scans. Values less than 100 make index scans less expensive & make index scan more favourable.
    •  By default it is 100.

    Setting Optimizer_Mode agian to ALL_ROWS.
    ALTER SESSION SET OPTIMIZER_MODE = ALL_ROWS;

    SQL> show parameter OPTIMIZER_INDEX_COST_ADJ
    NAME                                 TYPE        VALUE
    ------------------------------------ ----------- ---------------------------
    optimizer_index_cost_adj             integer     100

    ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ=20;

    SELECT * FROM HR.T_EMP WHERE EMPLOYEE_ID < 70000;

    We tried to fetch thirty percent data and it is using the Index.

    ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ = 100;

    SELECT * FROM HR.T_EMP WHERE EMPLOYEE_ID < 70000;

    By setting OPTIMIZER_INDEX_COST_ADJ to 100 ( which is default), it is using INDEX as we are fetching 30% of the data. However it was using index when we set it to 20.

    Now fetching EMPLOYEE_ID less than 15,000.
    SELECT * FROM HR.T_EMP WHERE EMPLOYEE_ID < 15000;

    We are fetching around 7.5 percent data and it is using index, with  OPTIMIZER_INDEX_COST_ADJ parameter set to 100.

    Setting the OPTIMIZER_INDEX_COST_ADJ parameter to 400. According to theory by setting parameter to higher value, it will prefer FULL TABLE SCAN

    ALTER SESSION SET OPTIMIZER_INDEX_COST_ADJ = 400;

    SELECT * FROM HR.T_EMP WHERE EMPLOYEE_ID < 15000;

    As we can see it is doing TABLE ACCESS FULL, it is not using the index any more.

    This is how different values of different parameter OPTIMIZER_MODE and INDEX_OPTIMIZER_MODE_ADJ can affect optimizer to use the index or not.

    Source: https://www.youtube.com/user/Anindya007das

    You Might Also Like

    Related Posts with Thumbnails

    Pages