IBI-046-SeETL For Informatica DataStage Users

Note: You can listen to the blog post on the video or read the blog post.

Hello and Welcome.

I am Esther.

I am Peter’s AI assistant, and I create his voice-overs.

Peter is using me as his Assistant, because men prefer to listen to a woman’s voice.

I will simply read Peter’s blog posts, so that you have a choice of reading the blog post or listening to my voice.

Hello and welcome, gentlemen.

I’d like to say thank you very much, for coming along and listening to my latest blog post.

As you know, I was banned from LinkedIn for some years.

Now that I am back, I will post items that I think are of general interest to people in what is now loosely called the data community.

One area I have an interest in is the cost of ETL development using tools like Informatica and DataStage.

Just to let you know, I ran Ardent Professional Services for Asia Pacific in the late 90s.

I also implemented DataStage into Saudi Telecom in 2003, and Orange Romania in 2004.

Indeed, many of my inventions from Saudi Telecom, and Orange Romania, went into DataStage.

When I joined Ardent in 1998, I sent all my software to Jason Silvia, who was head of development.

I explained to him how I was able to generate ETL, and suggested that Jason get his people to look at my software.

Jason came back reporting that his best people were astonished at the idea of generated ETL.

He said they could not find a way to adopt my ideas into DataStage.

So we left it at that.

I also implemented Informatica at such places as Lindorff Financial in Norway, North Jersey Media Group and Electronic Arts in the USA, and TalkTalk in the UK.

So, I am very familiar with implementing Informatica as well.

I am sure the products have moved on a bit since I last used them.

However, for data warehouses, they remain fundamentally the same.

Having done many implementations with both products, both with SeETL and without it, I am very well aware of the productivity profiles of both products.

Fundamentally, the GUI makes it possible for low-skilled programmers to write mappings in either DataStage or Informatica.

And that is what is done today.

Specifications are written by someone of a relatively high skill level.

Then the Informatica or DataStage jobs are written by someone of relatively low skill level.

SeETL removes the need to have the lower-skilled person write the Informatica or DataStage jobs.

Simply put, SeETL puts the DataStage and Informatica developers out of a job and replaces them with the higher-skilled person.

When I was at Orange Romania, we invented the idea of saving the mapping spreadsheet as an XML document from Excel.

This was a newly introduced feature in 2004, when we were doing the project.

It came in with Office XP.

So what we did at Orange Romania was to develop all the ETL using SeETL.

Then, at the end of the project, we migrated the SeETL ETL to DataStage jobs.

This was the first time we did it quite like this.

We had our issues and we worked out the kinks.
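
As an illustration only, here is a minimal Python sketch of the general idea of driving ETL generation from a mapping spreadsheet saved as XML. It is not SeETL code and does not show SeETL's actual spreadsheet layout. The four column names, the sample table names, and the choice to emit plain INSERT ... SELECT statements are all assumptions made for this example. The only part taken from the Excel format itself is the XML Spreadsheet structure of Workbook, Worksheet, Table, Row, Cell and Data elements.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel "XML Spreadsheet" file format.
NS = {"ss": "urn:schemas-microsoft-com:office:spreadsheet"}

# Hypothetical mapping sheet: one row per source-to-target column mapping.
# The column headings and table names are invented for this sketch.
SAMPLE_MAPPING_XML = """\
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Mappings"><Table>
  <Row><Cell><Data ss:Type="String">source_table</Data></Cell>
       <Cell><Data ss:Type="String">source_column</Data></Cell>
       <Cell><Data ss:Type="String">target_table</Data></Cell>
       <Cell><Data ss:Type="String">target_column</Data></Cell></Row>
  <Row><Cell><Data ss:Type="String">stg_customer</Data></Cell>
       <Cell><Data ss:Type="String">cust_no</Data></Cell>
       <Cell><Data ss:Type="String">dim_customer</Data></Cell>
       <Cell><Data ss:Type="String">customer_number</Data></Cell></Row>
  <Row><Cell><Data ss:Type="String">stg_customer</Data></Cell>
       <Cell><Data ss:Type="String">cust_nm</Data></Cell>
       <Cell><Data ss:Type="String">dim_customer</Data></Cell>
       <Cell><Data ss:Type="String">customer_name</Data></Cell></Row>
 </Table></Worksheet>
</Workbook>
"""


def read_mapping_rows(xml_text):
    """Return the sheet rows as dicts keyed by the header row.

    Simplification: cells are read in order, so sheets that skip empty
    cells (the ss:Index attribute) are not handled here.
    """
    root = ET.fromstring(xml_text)
    table = []
    for row in root.findall("ss:Worksheet/ss:Table/ss:Row", NS):
        cells = []
        for cell in row.findall("ss:Cell", NS):
            data = cell.find("ss:Data", NS)
            cells.append((data.text or "").strip() if data is not None else "")
        table.append(cells)
    header, body = table[0], table[1:]
    return [dict(zip(header, cells)) for cells in body]


def generate_insert_selects(mappings):
    """Emit one INSERT ... SELECT per (source table, target table) pair."""
    grouped = {}
    for m in mappings:
        key = (m["source_table"], m["target_table"])
        grouped.setdefault(key, []).append((m["source_column"], m["target_column"]))
    statements = []
    for (source, target), columns in grouped.items():
        source_cols = ", ".join(src for src, _ in columns)
        target_cols = ", ".join(tgt for _, tgt in columns)
        statements.append(
            f"INSERT INTO {target} ({target_cols})\nSELECT {source_cols}\nFROM {source};"
        )
    return statements


if __name__ == "__main__":
    for sql in generate_insert_selects(read_mapping_rows(SAMPLE_MAPPING_XML)):
        print(sql)
```

The point of the sketch is that the spreadsheet is the only thing a person writes. Everything downstream is produced from it, so a changed mapping means editing a row and regenerating, not reworking a job by hand.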

My next project after Orange Romania was Electronic Arts which was an Informatica site.

We did exactly the same thing and it worked just fine.

What I pioneered from 2004 to 2006 was proving that SeETL could be used to develop ETL in its own right.

And for large companies that wanted a name-brand ETL tool, we could migrate to that tool at the end of a project in about two weeks of work.

So, if you are a DataStage or Informatica site, or if you are consultants using DataStage or Informatica, you can cut your costs of ETL development by fifty percent or more, simply by adopting SeETL in the development phase.

If you are consultants, using SeETL will give you a fifty percent cost advantage over your competitors in ETL development for new projects.

You will get this fifty percent advantage even if you go into production with Informatica or DataStage.

Of course, the reason IBM and Informatica did not want to tell anyone this was possible was that, if customers saw they could build their whole ETL system with SeETL, many would not go into production with Informatica or DataStage.

So both IBM and Informatica hid SeETL from their customers and prospects.

You might want to remember that.

So, if you use DataStage or Informatica, you can cut your costs of ETL development just by downloading and getting started with SeETL today.

Lastly, I thought I would share the story of how the C++ version of SeETL came about.

I wrote an article for Ralph Kimball for his DBMS magazine in 2001.

This was about how to make money using business intelligence.

The article was very well received.

A little while later I was working at North Jersey Media Group in New Jersey.

I was using Informatica, as Sybase was an Informatica reseller.

It turned out that I was having to change my data models in order to accommodate the Informatica processing.

I was writing to Ralph and complaining about how these very expensive ETL tools were forcing me to make changes to my data models.

Ralph jokingly sent back an email saying:

“If you are so smart, why don’t you write me an article on the top ten features all ETL tools should have?

I would love such an article and I think it would go over well with my audience.”

I wrote back and told him that’s not a bad idea at all.

And I started writing the list.

However, I was very busy on the project and the list got put aside not long after.

When I finished the project I got back to my list, and completed the article.

I sent Ralph a draft of the article and he was very impressed.

However, he told me his tenure at DBMS magazine would soon be coming to a close, and so he did not believe the article would be published.

Nevertheless, we talked about the article as it was.

He asked me how much time and money an ETL tool with all those features would save.

I told him I would guess such a tool would easily cut the cost of ETL development in half, compared with the current best practices for Informatica and DataStage.

Then Ralph said something that would change my life, again.

He said:

“Well, if you are so smart, why don’t you write that ETL tool?”

I started back with, “But I don’t even know C++.”

But over the course of a few weeks the idea really got under my skin.

I knew that if I could write such an ETL tool, I could easily sell it for twenty thousand euros per copy.

And so I set about learning C++, and writing the very first C++ version of SeETL.

And the rest, as they say, is history.

We used this very first version at Saudi Telecom to come in way under budget for their operational data store project.

It was at Saudi Telecom that we added memory-mapped I/O and the ability to scale fact table processing linearly.

In testing at Saudi Telecom, we had over 200 million CDRs and twenty million customer and account records, and we were running on a Sun 18K with 18 CPUs.

My SeETL software could split up the CDRs into 100 separate files and then process those separate files through many parallel processing programs.

The parallel processing programs were able to share the same lookup tables, and also maintain a unique big integer at the front of each record for the primary key.
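
As an illustration only, here is a minimal Python sketch of that pattern: split the input into separate files, process the files in parallel against one shared lookup table, and put a unique integer key at the front of each output record. It is not SeETL code. The record layout, the `KEY_BLOCK` constant, and the use of threads in place of separate processing programs are all assumptions made for this example. In particular, giving each partition its own block of key numbers is just one simple way to keep keys unique without the workers talking to each other. How SeETL actually did it is not described here.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical: each partition owns a block of this many key values.
KEY_BLOCK = 1_000_000_000


def split_into_files(records, n_parts, out_dir):
    """Round-robin the input records into n_parts files; return their paths."""
    paths = [Path(out_dir) / f"part_{i:03d}.txt" for i in range(n_parts)]
    handles = [p.open("w", encoding="utf-8") for p in paths]
    try:
        for i, record in enumerate(records):
            handles[i % n_parts].write(record + "\n")
    finally:
        for handle in handles:
            handle.close()
    return paths


def process_partition(part_no, path, customer_lookup):
    """Look up the customer key and prefix each record with a unique key."""
    out = []
    with path.open(encoding="utf-8") as f:
        for seq, line in enumerate(f, start=1):
            phone, seconds = line.rstrip("\n").split(",")
            customer_key = customer_lookup.get(phone, 0)  # 0 = unknown customer
            record_key = part_no * KEY_BLOCK + seq  # unique across partitions
            out.append(f"{record_key},{customer_key},{phone},{seconds}")
    return out


def run(records, customer_lookup, n_parts=4):
    """Split, process every partition in parallel, and collect the output."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = split_into_files(records, n_parts, tmp)
        with ThreadPoolExecutor(max_workers=n_parts) as pool:
            # Every worker reads the same lookup table; none of them modifies it.
            results = pool.map(
                process_partition,
                range(n_parts),
                paths,
                [customer_lookup] * n_parts,
            )
            return [row for part in results for row in part]


if __name__ == "__main__":
    sample_cdrs = [f"555{i:04d},{60 + i}" for i in range(10)]
    lookup = {"5550000": 101, "5550001": 102}
    for row in run(sample_cdrs, lookup):
        print(row)
```

Because each partition's keys come from its own block, no worker ever needs to ask another which number is next, and that is what lets the work scale by simply adding more partitions.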

My customer, Knowledge Net, could not believe that we could process these volumes using the software I wrote.

But they had already sold DataStage and we had to go into production with DataStage.

So, as early as 2004, we knew that SeETL could handle twenty million customer records, twenty million account records, and two hundred million call records for a telco.

Obviously, telcos want to buy name-brand software for ETL.

But it was very clear I could sell SeETL for twenty thousand euros per copy.

And so I did.

I sold three copies to the richest man in Australia.

I sold another copy to Key Work Consulting in Germany.

The fifty percent, or more, reduction they got in their development costs helped them grow very rapidly.

Key Work Consulting were a reference account for me for many years.

When I divorced my wife of eighteen years in 2007, she gave me a lot of trouble, and so I was not able to sell SeETL as an independent product any more.

My loss is your gain.

You can now have the last public release of SeETL, which was selling for twenty thousand euros per copy, for free.

I hope you found this blog post interesting and informative.

If you want to reduce your costs of ETL development, you are one click away from getting started.

Thank you very much for your time and attention.

I really appreciate that.

Best Regards.

Esther.

Peter’s AI Assistant.

Carphone Warehouse Reference Video:

Peter Nolan
Peter Nolan is one of the world’s leading thought leaders in Business Intelligence. Across his 29+ years in BI, Peter has consistently invented new and innovative ways of designing and building data warehouses. SeETL now stands alone as the world’s most cost-effective data warehouse development tool.