IBI-045-Data Warehouse Data Model Development Productivity

Note: You can listen to the blog post on the video or read the blog post.

Hello and Welcome.

I am Esther.

I am Peter's AI Assistant, here to create voice overs.

Peter is using me as his Assistant, because men prefer to listen to a woman’s voice.

I will simply read Peter's blog posts, so that you have a choice of reading the blog post or listening to my voice.

Hello and welcome, Gentlemen.

I'd like to say thank you very much for coming along and listening to my latest blog post.

As you know, I have been banned from LinkedIn for some years.

Now that I am back, I will post items that I think are of general interest to people in what is now loosely called the data community.

One area I have an interest in is the cost of data modelling and ETL development for data warehouses.

After more than thirty three years in business intelligence, I can assure you that the people paying for your data warehouses also have an interest in how much they cost.

Everyone who knows me knows that I wrote my first ETL software in nineteen ninety five.

That software meant we could sell fixed price data warehousing projects in Australia, for three hundred thousand Australian dollars in services fees, in the second half of the nineteen nineties.

I have been selling fixed price data warehouses since nineteen ninety six.

I have made it clear in my comments, for more than twenty five years, that if a company is buying a data warehouse and the vendor is not offering a fixed price, they should consider another vendor who will offer a fixed price.

If the vendor cannot offer a fixed price, then the vendor should have a very good reason why they can't do that.

In this post I want to publish some productivity numbers for my ETL software with respect to data models.

Just recently I invented a new way to use my ETL software to develop data models.

This new invention means we can create the target tables, and edit them to customise them, without first having to develop the mapping from the source tables.

Needless to say, I was pretty excited about that.

So here are the numbers.

In four working weeks, about one hundred and sixty hours, I added forty two dimension tables and one hundred and ten fact tables to a data model I am working on.

The total number of fields on these one hundred and fifty two tables is thirteen thousand, three hundred, and ninety seven.

On the one hundred and ten fact tables there are three thousand, six hundred, and eighty two dimension lookup keys.

Two of the four weeks were spent manually adding these three thousand, six hundred, and eighty two dimension lookup keys.

The number of data fields from the source system in these tables is seven thousand, five hundred and fifteen.

There are also audit fields, and special fields, added to each target table in the data warehouse.
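For readers who want to check the arithmetic, here is a minimal sketch that works only from the figures quoted above. It assumes that every field which is neither a dimension lookup key nor a source data field is an audit or special field, and that the lookup keys took eighty of the one hundred and sixty hours. Both assumptions are inferred from the post, not stated in it.

```python
# Figures quoted in the post.
DIMENSION_TABLES = 42
FACT_TABLES = 110
TOTAL_FIELDS = 13_397
LOOKUP_KEYS = 3_682
SOURCE_DATA_FIELDS = 7_515
TOTAL_HOURS = 160       # four working weeks
LOOKUP_KEY_HOURS = 80   # assumption: two of the four weeks at 40 hours each

total_tables = DIMENSION_TABLES + FACT_TABLES

# Assumption: whatever is left after lookup keys and source data fields
# is made up of audit fields and special fields.
other_fields = TOTAL_FIELDS - LOOKUP_KEYS - SOURCE_DATA_FIELDS

fields_per_hour = TOTAL_FIELDS / TOTAL_HOURS
lookup_keys_per_hour = LOOKUP_KEYS / LOOKUP_KEY_HOURS
lookup_keys_per_fact_table = LOOKUP_KEYS / FACT_TABLES
other_fields_per_table = other_fields / total_tables

print(f"Tables: {total_tables}")
print(f"Audit and special fields (by subtraction): {other_fields}")
print(f"Fields per hour overall: {fields_per_hour:.1f}")
print(f"Lookup keys per hour: {lookup_keys_per_hour:.1f}")
print(f"Lookup keys per fact table: {lookup_keys_per_fact_table:.1f}")
print(f"Audit and special fields per table: {other_fields_per_table:.1f}")
```

The subtraction leaves two thousand two hundred fields across the one hundred and fifty two tables, which is the part the post attributes to audit and special fields.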

Now.

I know those numbers are unparalleled.

No one else can do that.

Until a few weeks ago, I was developing models by creating the mapping first, so that I could generate the view that went over the top of the staging area.

Then, from the view, I created the target table.

With this new process, being able to generate the target table before I write the mapping, I have been able to vastly increase the productivity of designing template data warehouse data models for large operational systems.
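To make the idea of "target table first, mapping later" concrete, here is a minimal sketch in Python. It is not how SeETL does it; the post does not describe the mechanism. Every name in it (`Column`, `AUDIT_COLUMNS`, `template_table_ddl`, the `dk_` prefix, the example table and column names) is a hypothetical chosen for illustration. It only shows that a template fact table can be generated from a list of lookup keys, data columns, and audit columns with no source-to-target mapping involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    sql_type: str


# Hypothetical audit columns added to every target table.
AUDIT_COLUMNS = (
    Column("dw_batch_number", "INTEGER"),
    Column("dw_row_inserted_at", "TIMESTAMP"),
)


def template_table_ddl(table_name, lookup_keys, data_columns,
                       audit_columns=AUDIT_COLUMNS):
    """Build CREATE TABLE text for a template target table.

    No mapping from source tables is needed here; the transform and
    lookup key code would be written later, when mappings are developed.
    """
    columns = [Column(key, "INTEGER") for key in lookup_keys]
    columns += list(data_columns)
    columns += list(audit_columns)

    names = [column.name for column in columns]
    if len(names) != len(set(names)):
        raise ValueError(f"duplicate column name in {table_name}")

    body = ",\n".join(f"    {c.name} {c.sql_type}" for c in columns)
    return f"CREATE TABLE {table_name} (\n{body}\n);"


# Usage: a made-up fact table with two lookup keys and one data field.
ddl = template_table_ddl(
    "f_sales_txn",
    lookup_keys=["dk_customer", "dk_product"],
    data_columns=[Column("sale_amount", "DECIMAL(18,2)")],
)
print(ddl)
```

The generated text can then be edited by hand to customise the table, which is the step the post says no longer has to wait for the mapping.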

So, Gentlemen.

Those numbers are the new high watermark in data warehouse template target table development productivity.

Of course, this is creating template target tables that will be customised on implementation.

When the mappings are actually written, the transform part of the work still has to be done.

The lookup key code needs to be written, for those three thousand, six hundred, and eighty two, dimension lookup keys.

But with the inventions I have added to SeETL, we are now in the era where we can build dimensional data warehouses for operational systems that are very large in terms of numbers of columns.

This delivers everything Incorta is doing, plus the dimensional data warehouse as well.

We can now build staging areas for very large systems in less than two weeks, including all indexes needed.

Now.

That is what I wanted to say about the numbers.

I know people will want to know more details.

There is a more unfortunate part of this post I have to share with you so that you understand my position.

I will not tell you the current number of tables or fields in this data model.

I will not tell you the target number of fields we will have.

I will not even tell you the industry segment.

Today, I must do my work in total secrecy.

Please allow me to publicly explain part of why I must now live in Fiji and work in total secrecy.

It is because of the way my ex-wife of eighteen years, Jennifer Nolan, and her family have acted.

After I divorced Jennifer in two thousand and seven, I was working with Sean Kelly.

We were very successful in selling our data models to both Talk Talk, and Sky Talk, in the UK, in two thousand and eight and two thousand and ten.

At the Netezza European user group meeting in London, in two thousand and ten, Sky Talk gave the keynote speech about our project with them.

Sky Talk was not allowed to name us for legal reasons.

But we stood at the door and handed out our business cards, mentioning that we did the project they just heard about.

The future was very bright for Sean and me at that time.

Jennifer was so angry at our success, she had her brother harass and abuse Sean publicly.

Jennifer later had her brother harass and abuse Sean's wife and daughter when Sean was in hospital dying of cancer.

Sean Kelly was a good man.

He was a good husband, a good father and a great friend to me, as well as many other men.

Sean Kelly was a scholar and a gentleman.

It was my honour and privilege to work with Sean for nearly ten years.

There are very few men of the character of Sean Kelly.

The world became poorer the day he passed.

Sean held my work permit in Ireland for six years.

Jennifer and her children all got Irish citizenship because of Sean's kindness and generosity.

But that did not matter to Jennifer.

Jennifer still had her brother attack Sean, his wife, and their daughter.

I, personally, find men harassing women and girls to be unacceptable.

I, personally, find it particularly abhorrent that an ex-wife would get her brother to abuse a wife and daughter while their husband and father lay dying of cancer in a hospital.

In my opinion, it was a shame Jennifer’s peers, other women, allowed this harassment to stand.

Despite the fact I divorced Jennifer in two thousand and seven, she is still demanding her family harass anyone associated with me.

Her father refuses to tell Jennifer, or her brother, to stop their crimes.

Her father’s position is that the crimes of his children are nothing to do with him.

I do not wish to bring such harassment on anyone else.

It is for this reason that I must work in secret.

Now.

I hope you found the numbers I quoted interesting.

We have entered a new era of data warehousing.

One in which a company can deliver a full dimensional data warehouse, for even the most massive operational systems.

There is really no upper limit any more on the number of fields that can be mapped to a dimensional data warehouse.

Thank you very much for your time and attention.

Best Regards.

Esther.

Peter's AI Assistant.

Peter Nolan
Peter Nolan is one of the world's leading thought leaders in Business Intelligence. Across his 29+ years in BI, Peter has consistently invented new and innovative ways of designing and building data warehouses. SeETL now stands alone as the world's most cost-effective data warehouse development tool.