It's All About ORACLE: Rapidly Changing Dimension (RCD) in Data Warehouse

A dimension is a fast changing or rapidly changing dimension if one or more of its attributes in the table changes very fast and in many rows. Handling rapidly changing dimension in data warehouse is very difficult because of many performance implications.

As you know slowly changing dimension type 2 is used to preserve the history for the changes. But the problem with type 2 is, with each and every change in the dimension attribute, it adds new row to the table. If in case there are dimensions that are changing a lot, table become larger and may cause serious performance issues. Hence, use of the type 2 may not be the wise decision to implement the rapidly changing dimensions.

For example: Consider patient dimension where there are 1000 rows in it. On average basis, each patient changes the 10 of attributes in a year. If you use the type 2 to manage this scenario, there will be 1000*10 = 10000 rows. Imagine if the table has 1 million rows, it’ll become very hard to handle the situation with type 2. Hence we use rapidly changing dimension approach.

Separate Rapidly Changing Attribute by Implementing Junk Dimension

Consider the fact table, in which not all the attributes of the table changes rapidly. There may be some attribute that may be changing rapidly and other not. The idea here is to separate the rapidly changing attribute from the slowly changing ones and move those attribute to another table called junk dimension and maintain the slowly changing attribute in same table. In this way, we can handle situation of increasing table size.

PATIENT

Patient_id

Name

Gender

Marital_status

Weight

BMI

The attribute like patient_id, Name, Gender, Marital_status will not change or changes very rarely. And attribute like weight and BMI (body mass index) changes every month based on the patient visit to hospital. So, we need to separate the weight column out of the patient table otherwise we end up filling the table if we use SCD type 2 on PATIENT dimension. We can put the weight column which is rapidly changing into junk dimension table. Below is the structure of Junk dimension table.

PATIENT_JNK_DIM

Pat_SK

Weight

BMI

Pat_SK is the surrogate key and acts as a primary key for junk dimension table. Below is how our PATIENT table looks after removing weight and BMI from it.

PATIENTPatient_idNameGenderMarital_status

Link Junk Dimension and PATIENT Table

In this step, we must link the junk dimension and patient table. Keep in mind; we cannot simply refer the junk dimension table by adding its primary key to patient table as foreign key. Because any changes made to junk dimension will have to reflect in the patient table, this obviously increases the data in patient dimension. Instead, we create one more table called mini dimension that acts as a bridge between Patient and Junk dimension. We can also add the columns such as start and end date to track the change history. Below is the structure of the mini dimension:
PATIENT_MINI_DIMPat_idPat_SKSTART_DATEEND_DATE

This table is just bridge between two tables and does not require any surrogate key in it. Below is the diagrammatic representation of the Rapidly Changing Dimension implementation.

2 comments:

Admin February 17, 2020 at 4:05 AM: Animachi is really hard to recognize since you took over the website - what you made of it and I really appreciate your commitment to it
BCOM 1st Year TimeTable 2020
BCOM 2nd Year TimeTable 2020
BCOM 3rd Year TimeTable 2020
Blogger February 22, 2020 at 12:57 AM: This comment has been removed by a blog administrator.

It's All About ORACLE

Oracle - The number one Database Management System. Hope this Blog will teach a lot about oracle.

Rapidly Changing Dimension (RCD) in Data Warehouse

Separate Rapidly Changing Attribute by Implementing Junk Dimension

Link Junk Dimension and PATIENT Table

2 comments:

You Might Also Like

Pages

Recent Comments

Translate

CLOCK

Search This Blog

About this blog

Oracle the number one Business solution company also It is most powerful database management system.

In this Blog I will be covering topics on oracle database management system, Working and developing Oracle Application, its Database Administration and Database Development concepts, also bit on Unix OS.

About me

Labels

Blog Archive

My Other Blog