BI4ALL-007-Customer Dimension Associations

0
1556

Hello and welcome to this new blog post.

Thank you very much for dropping by to read this. I really appreciate you!

I am sssssuuuuuuuuurrrrrrrrrrrreeeeeeeeeeeeeee you are going to LOVE this one.

On this blog post I am going to talk about a specific idea that I was introduced to in February 2001.

The very first time I saw this idea presented my very first reaction was:

“HOLY COW!! This is the BEST idea I have ever seen!!!”

I immediately said so to the presenter and asked him who had invented this idea. He smiled and indicated it was his good self and he thanked me for my compliment.

In 2001 these things were called “profiles”. The person who invented them remains a good friend of mine.

In this blog post I am going to talk about just one specific instance of “Profiles” and that being customer profiles.

The reason I am going to stick to customer profiles is that they are the easiest to understand and the value of storing them is much more obvious than other forms of profiles.

You can get access to a large scale PDF of what I am talking about by clicking on the button below.


Get Models PDF


Now that you have the large scale PDF available to you? Let’s discuss it. There is some text in it that you can also read.

The diagram in the PDF is the single most important diagram that has ever been presented to me in a data modeling class about Business Intelligence.

Yes, I learned Bill Inmons Time Variance and Stability Analysis from his books in 1993/4.

Yes, I learned Dimensional Modeling from a former Metaphor employee in 1994-5.

But in that period there was a very hot debate of “Inmon vs Kimball” going on.

Since I met Bill Inmon at the 1993 Metaphor Users Conference in San Francisco that debate never made any sense to me.

Bills company, Prism Solutions, was implementing some dimensional models even at that early time.

In 1996 my mentor and I first combined the ideas of Bill and Ralph in to one single consistent architecture that provided “the best of both worlds”.

My mentor worked for PwC at the time. In 1997 he was able to make this new hybrid idea of “archive everything” and “put dimensional models on top” the standard for PwC.

The obvious reason being that it was not only the most comprehensive way of building a data warehouse, but at the time it was also the most E-xpensive. And PwC liked E-xpensive. (Meaning expensive with a capital E).

Once we had standardised on this combined model we wondered if we could bring down the cost of development and support one day.

My mentor and I spent the next 5 years, through to February 2001, trying to figure out how to perfectly store history in a dimensional model. We wanted to be able to emulate Bills TV+SA models inside dimensional models.


 

We failed. After both of us thinking about it for 5 years and not being able to come up with a solution? We both believed it was not possible. So we continued to commonly implement both separate models in our customers.

So the day that this model was presented to me I immediately knew this was the modeling solution my mentor and I had been searching for over the last 5 years.

Later on, in 2002, it was I who thought of the idea of putting the bigint primary key on this table and linking it to all fact tables. The man who was the inventor of the idea returned my compliment because this was a VERY good idea!

So… What does the Customer Dimension Association fact table do for you?

It allows you to perfectly store history for customer attributes over an extended period of time in a highly compressed, highly performant data model.

As you can see it consists almost entirely of the dimension table keys that describe the customer and a set of date from, date to, current flag fields.

It is immediately obvious to all experienced data modelers that this model allows for the daily ETL running and storing of customer images where there is very little redundant data stored.

The current image of the customer is returned by setting the where clause for the current flag to 1.

An historical version of the customer is returned by setting:

desired_date between date_from and date_to.

This means the actual customer dimension table can be a type 1 dimension table because the history of customer attributes is not stored there.

Of course, when I wrote SeETL I supported this type of fact table as a standard table.


 

What sort of analysis does this table make possible?

The possibilities are quite endless.

But the most common use we put this table to is as follows.

Consider you have many good customers today. They have evolved in to good customers over time. So…

      1. Why did they become good customers?
      2. How did they become good customers?
      3. What did they look like 1, 2, 3 years ago?
      4. What development path did they take?

If you can see how these customers evolved over time you can then answer the question…

“Of the customers in our customer base now, who are the ones most likely to turn in to good customers and how should we treat them over time to give us the best chance of turning them in to good customers?”

This is what customer dimensions association fact table does for you.

It allows you to go through the history of the actual customers evolution to detect patterns of evolution.

Very, very valuable and important information.

Lets look at some sql that would show you how this works.

This query is going to return to us the customer unique id, the customer key, first name, surname, age band, income band, life stage, number of dependents, marital status and education level.

It will do this as the customer profile was on 2017-01-01 for every customer who purchased “Product XYZ”, number “#123” between 2020-01-01 and 2020-08-31 who has an income band between 40001 and 70000 and who is a ‘gold’ loyalty card member today.



select
b.customer_unique_id
,h.pk_vm_customer
,h.first_name
,h.surname
,i.age_band
,i.income_band_person_sdesc
,i.life_stage_sdesc
,i.number_dependents
,i.marital_status_sdesc
,i.education_level_sdesc
from
dbo.vf_sale_txn a
,dbo.vf_customer_dims_asoc b -- current customer profile for the sale (s)
,dbo.vm_sale_date c
,dbo.vm_all_demographic d
,dbo.vm_product e
,dbo.vf_customer_dims_asoc f -- customer profile from 2017-01-01
,dbo.vm_loyalty_code g
,dbo.vm_customer h
,dbo.vm_all_demographic i
where 1=1
and e.pk_vm_product = a.dk_vm_product
and d.pk_vm_all_demographic = a.dk_vm_customer_demographic
and c.pk_vm_day = a.dk_vm_sale_date
and a.dk_vf_customer_dims_asoc = b.pk_vf_customer_dims_asoc
and b.dk_vm_loyalty_code = g.pk_vm_loyalty_code ---- They are a GOLD customer now.
and e.product_description = 'Product XYZ'
and e.product_num = '#123'
and c.day_date between '2020-01-01' and '2020-08-31'
and d.income_band_person in ( '040001-045000','045001-050000','050001-055000’,'055001-060000’,'060001-065000','065001-070000’ )
and g.loyalty_sdesc = 'GOLD'
and b.dk_vm_customer = f.dk_vm_customer
and '2017-01-01' between f.date_from and f.date_to ---- Get profile current on 2017-01-01.
and f.dk_vm_demographic = i.pk_vm_all_demographic ---- Get demographics from 2017-01-01
group by
b.customer_unique_id
,h.pk_vm_customer
,h.first_name
,h.surname
,i.age_band
,i.income_band_person_sdesc
,i.life_stage_sdesc
,i.number_dependents
,i.marital_status_sdesc
,i.education_level_sdesc


 

Now, this is just ONE EXAMPLE of how customer profiles linked to sales transactions can be recursively queried to get lists of customers and how they looked 1, 2, 3 years ago to compare and contrast the products such customers are buying today with what they looked like 1, 2, 3 or any number of months or years ago.

This customer profile query starts to give you an understanding of the power of customer profiles in finding in your customer base those customers who are more likely to develop in to profitable customers.

We are prepared to bet you have never seen a piece of sql like this that can give you the answer we just showed we can give you. And yes, we have much more we can show you.

Our Product Catalog Segmentation models were developed as a follow on from customer profiles.

And our Product Catalog Segmentation is nearly as powerful as customer profiles.

So welcome to your first taste of Customer Dimensions Associations. I hope you liked it.

When I showed this to both Bill and Ralph in the early 00s they were suitably impressed. I hope you are suitably impressed too.

If you would like this sort of sub model implemented and populated properly in your company? Just let me know. I would be happy to arrange it.

Obviously we have implemented this many times over and we have the full toolkit to implement such models in customers quickly and easily.

Let me not forget to mention this is just one portion of the very broad set of BI4ALL models.

Sean Kelly and I were selling these data models for EUR100K per copy before his untimely passing.

But for you, here on my blog? This is the “freebie” offered to you as the “teaser”.

Having given you a free copy of #SeETL and given you the design of the Customer Dimension Association table I can’t charge you for those things. And now you have everything you need to implement Customer Dimension Associations your self!

But if you would like me to do it or would like me to organise one of my business partners to do it?

That would be fine too!

I am putting this here to show you “I know something you don’t know that is of great value to you”.

In the last 17 years we have been implementing this idea of Customer Dimension Associations with a bigint primary key that can be linked to fact tables?

I have never seen anyone else do this. And for an idea to not get out “in to the wild” for 17 years is a long time.

In my own opinion every company who has customers they can identify as individuals should have a Customer Dimension Associations table to use for targeting customers for offers and treatments.

But then again, I would say that! Right?

In this post you can see EXACTLY what you would be getting if you implemented this idea.

And I can’t be fairer than that!

Thank you very much for your time and attention!

Best Regards

Peter.

Previous articleBI4ALL-006-New Documentation Technique Demo
Next articleSeETL044 – Sending Emails Using O365
Peter Nolan
Peter Nolan is one of the worlds leading thought leaders in Business Intelligence. Across his 29+ years in BI Peter has consistently invented new and innovative ways of designing and building data warehouses. SeETL now stands alone as the worlds most cost effective data warehouse development tool.