Page 38 - DCAP606_BUSINESS_INTELLIGENCE
P. 38
Unit 3: Dimensional Data Warehouse
This way, in your fact table, you have your old data (i.e. before the day you added) with the SID Notes
of the Employee ‘Emp12 >> ‘B1.’ All new data (i.e. after the day you added) would take the SID
of the employee ‘Emp12 >> ‘B2.’
3.4.1 Advantages of Surrogate Keys
Immutability: Surrogate keys do not change while the row exists. Thus applications cannot
misplace their reference in the database.
Change in Requirements: Attributes that uniquely recognize an entity might change over
the time, which might lead to invalidation of the suitability of the compound keys.
Example: An employee’s network username is chosen as a natural key. If it is merged
with another company, new employees must be inserted. Now, some of the new user names
may lead to conflict because their user names were developed independently.
In these cases, usually a new attribute should be added to the natural key (for example, an
old_company column). In the case of a surrogate key, only the table that characterizes the
surrogate key must be altered. But in the case of natural keys, all tables that use the natural
key will have to change.
Performance: Surrogate keys tend to be a compact data type, such as a four-byte integer.
This allows the database to query the single key column faster than it could multiple
columns.
Uniformity: When every table has a uniform surrogate key, some tasks can be easily
automated by composing the code in a table-independent way.
Validation: It is possible to design key-values that are in coordination with a well-known
pattern which can be automatically verified.
Example: The keys that are intended to be used in some column of some table might be
designed to “look differently from” those that are intended to be used in another column or
table, thereby simplifying the detection of application errors in which the keys have been
misplaced.
3.4.2 Disadvantages of Surrogate Keys
But surrogate keys also come with some disadvantages. The values of surrogate keys have no
relationship with the real world meaning of the data held in a row. Therefore over usage of
surrogate keys lead to the problem of disassociation and creates unnecessary ETL burden and
performance degradation.
Query optimization also becomes difficult when one disassociates the surrogate key with the
natural key. This is because when surrogate key takes the place of primary key, unique index is
applied on that column. And any query based on natural key identifier leads to full table scan as
that query cannot take the advantage of unique index on the surrogate key.
Referential Integrity: Referential integrity must be maintained between all dimension
tables and the fact table. Each fact record contains foreign keys which are related to primary
keys in the dimension tables.
LOVELY PROFESSIONAL UNIVERSITY 33