Aug 8, 2008

Business Intelligence with Data warehousing for the newbie

Let’s start with this a raw value “5”. What is this? Off course it’s a value. Can you derive any meaning from it? I think no. We people call it Data/Fact. Let’s add something more with this raw value - “Sale amount USD 5”. Can you derive some meaning now? I think so. We people call it Information. Now if I add like “Fountain Pen’s sale amount USD 5” – can you know/learn something from this?-yup. This collection of information is known to us as Knowledge.

Ok, this was so simple, now let’s beautify it with three more terms context/domain, orientation and understanding.

It is like, more you relate data with one another, the more it defines the context, so the more its give you a clear understanding.


The figure can be like:

Figure: 1.1 Trend toward knowledge


So, we can come to a summary, that if we want to get knowledge we have to do a good orientation among data according to the context.



Data Orientation

Then let’s talk about data orientation/organization/relationship. Now a question comes. How to relate data? So now we can search for data structures/models. The below one is most popular model in the current market to organize data.

Figure: 1.2 Relational Data Model


What do you think about the above model? Isn’t it a table (table is known as relation)? This model is known as Relational Data Model. Databases that use this model to organize data are known as Relational Database. Following Edgar F. Codd ‘s rules and some other vendor specific rules RDBMS (system for managing relational databases) are developed. But this is not my concern. My concern is how we can define or relate data/fact in such way that it provides us knowledge.

Before going further I want to clear some aspects like, in an enterprise system two types of databases may exist. One is known as transactional/operational support database and the other is known as Decision Support/ analytical software support database.

In transactional Databases day to day transactions records are stored. And it is not much concerned with knowledge or decision making. Whereas, Decision Support/ analytical software support database which is also known as report database, main concern is summary, aggregation and knowledge.


Relational Data Model, more concerned with defining Entities but not fact

What is happening currently? We are defining a table with some columns which represents a particular entity of real world. And when we are putting data into cells maintaining column type then each row is representing an instance of that entity.


Product

Order

Customer

Date

Currency

Amount

Fountain Pen

#0001

Mr. David

01, Jan 2005

USD

5








Figure: 1.3 Relational Model to represent Entity


But if business asks question like:

What is Mr. David’s area?

What is the category of Fountain Pen?

This table can not answer the questions.

So to meet the need of the business we will define Customer and Product as entity and refer then to Sold Product with their id.


Figure: 1.4 Entity Relationships

In the same way we can relate Order with Sold Product.

But my question is why we are relating these tables. What is out objective? Can we achieve the objective in some other way more effectively and efficiently?

Let’s give a closer look. Every business has some key facts/values that define the business.

Selling Business: Sale Quantity and Amount, etc.

Telecom Business: Call duration and amount, etc.

Banking: Withdraw and deposit amount, etc.

Business to business it will vary. We call these facts as Measures of the business which will define revenue, profit, loss, future condition and all other aspects of the business.

Why these are called measures?

As, all the related perspectives of the business are measured by these values.

So, as all the business objects are measured by these facts so analysts have invented a way of relating entities. Like the below one:

Figure: 1.5 Star Schema


Figure: 1.6 Relationship with fact table.

These relate the facts more perfectly with all the entities. The base table that contains the business basic facts is known as Fact Table. The fact table has just measures and the references of the other entities. And the other entities are known as dimensions.

What is the benefit? Still this question can arise in one’s mind. But get a snap, you will get that this single table is now our concentrating point and it can answer the questions like who, how, where, why, when.

Further I will discuss lots of other things but I must mention that this is the base of all sorts of analytical data processing from relational database.

Subject Orientation in Relational database


What is a subject for a business?

- Business stakeholders interest area. Too short Answer!

I mean the certain activity where some stakeholders keep their eye. For easy understanding i can clarify it as "Business Function". So in business; Subject is Function.

Are you going to teach us something new? I know i have to face this question for the previous line. No, i am not going to teach something new. Look, Function is the objective of the business to achieve. About this all related stakeholders are discussing. How to improve this function? where to change? when to change? Who are the major role players of this function? How to motivate them? So, all headaches are about the function. Then what is subject? Subject is a topic of discussion. So without any doubt, in business Functions are subjects. Ha Ha I am really pleased to draw such logic to prove this. Now you are thinking its a simple matter, no need to write that much to make this clear. But at the beginning i know the question seemed to you a little troublesome.





Next topics:


- CWM – Common Warehouse Model

- Star Schema and Snowflake Schema

- How knowledge comes

- More flexibility in defining knowledge through Object Orientation

- Meta Data

- Relational to OLAP

- OLAP Objects

- MDX (Multi dimensional Expressions)

- ETL

- And more.. .



References

http://www.systems-thinking.org/dikw/dikw.htm

http://www.viktoria.se/~dixi/km/chap3.htm

http://en.wikipedia.org/wiki/Codd%27s_12_rules

http://www.tonymarston.net/php-mysql/database-design.html#relational.db








5 comments:

Anonymous said...

This is just too good. Again hats off to u Man. This gonna be a very helpful document for people interested in Data Warehousing as this concept is not so clear so far in our country as far as I know. Thanks a lot. U made it easy to understand.

All the best.

A F Quazi

** I wish ur blog reach people for their benefit.

Mahmudur Rahman Manna said...

Thank you. I have tried to start from the very beginning. Your inspiration will help me to complete this doc.

Anonymous said...

After studying this article I found that Data Warehouse concept is created over analytical data processing form relational database. Of-course its a nice article and I am eagerly waiting for your next topic and if possible then plz inform me after writing your new topics.

Unknown said...

Carry on!

Anonymous said...

Man u kept us waiting. Plz proceed...
Looking forward to the next part of this article.

A F Quazi