Hello and welcome to our latest blog post!
Thank you for dropping by!
We really appreciate you!
This is one for the “techies” in my audience. Only of great interest to people who are responsible for running schedules of ETL processing for their companies or customers.
One of the things that has to happen is that the ETL support “techies” have to be informed when the ETL fails so they can go and fix it in the middle of the night while the business users are fast asleep in their nice warm beds!
The standard way of informing an ETL support “techie” that a batch has failed, or has completed, is by email.
Of course, SeETL has the option of sending an email on failure. However, this requires the email message be put on every line of the schedule. So in practice what we do is we put emails in to long schedules at important sync points and we put an email at the end of the schedule. Then the absence of the email is the prompt to go and check the schedule.
Of course, we could have error emails being sent as well, we just don’t really do that in practice.
With the advent of Office 365 some of our customers wanted to move the email sending over to Office 365.
This is a little different to the standard smtp email server and so we had to work out how to make it work.
Since we had to work out how to make it work we thought it was worthy of a blog post so others could benefit from our piece of work.
You have access to the source code of our scheduler. It is the pair of programs CTLU008/009.
CTLU008 is the scheduler itself. CTLU009 runs a process group inside a batch.
It is very reliable. I have put some more comments at the end.
If you would like to watch a video showing you what is in the downloads package? Just click on the button below.
So here we are!
This is how to send an email using the SeETL scheduler.
With Office 365 the first thing you need to do is to convert the email password to be used to an encrypted text string to be placed in a file. This string will then be used for authentication with the O365 smtp mail server.
The way you do this is to create a power shell command. So let’s call it:
In this power shell command you put entries something like this. You do not need your email address, you will be prompted for your password and it will be encrypted, it will not be check by this process.
Obviously the “Out-File” is where the encrypted password will be written to:
Read-Host -Prompt "Enter your email password" -AsSecureString | ConvertFrom-SecureString | Out-File "E:\A_IBI_Hosting\CUSTOMER\S01-Schedule\p-nolan-email-password.txt"
Having put that command in to the ps1 text file you just right click and say run as a power shell command.
The power shell window will open with the
"Enter your email password"
prompt. You type in your password and it will write the encrypted password to that file.
The contents of that file will be very long string. The first characters will look something like this.
This is what an encoded password looks like for Office 365.
Now you want to create the power shell command that will actually send your email.
You can have one such command for each different email you want to send.
As I said, for long running ETL we usually send emails at important sync points so we know that progress is being made. And then we have one email at the end.
So next you create an power shell command that will be the actual command executed at run time in the scheduler. For example you could call it:
And the contents would be something like this:
$AdminName = "firstname.lastname@example.org"
$Pass = Get-Content "E:\A_IBI_Hosting\CUSTOMER\S01-Schedule\p-nolan-email-password.txt" | ConvertTo-SecureString
$cred = new-object -typename System.Management.Automation.PSCredential -argumentlist $AdminName, $Pass
Send-MailMessage -From email@example.com -To firstname.lastname@example.org -Subject "IBIWeekDayBatch001 - Batch Completed" -Body "IBIWeekDayBatch001 - Batch Completed" -SmtpServer smtp.office365.com -Credential $cred -UseSsl -Port 25
You store this command usually in the same directory as the password and the power shell commands to run this command.
Now, in order to run that power shell command from inside the SeETL Scheduler you need to create a mechanism by which it can be run.
SeETL executes commands using the “command” command inside the CTLU009 program so the command that is passed to the shell of the external operating system has to be a command that is acceptable to the external operating system. It can not be a power shell command.
So we create a command in a file named something like this:
And inside the command file we put a statement something like this:
PowerShell.exe -Command %1
So now we have to have a way of putting
in to the scheduler to be run when we want it to run. In this case we want it to run at the end of the ETL to send an email notifying the support person that the ETL has ended.
So, in the scheduler in the process commands section you create an entry like this:
Process Name = ibipowershellexe01
Process Command = ####Scheduler_Signal_Files_Folder####\runPowerShellCommand.cmd ?1
Comment = Run a powershell command
Notice you must use a “?” here and not a “%” sign.
You can use variable substitution in the Process Command field as I have done here. And you invoke the
passing it one parameter.
Now this parameter is resolved by SeETL, not the operating system, and to avoid confusion SeETL uses the “?” for parameters.
So this Process Name can be called from any line inside the schedule itself and passed a parameter of a power shell command and it will execute that power shell command.
So, then, in the schedule itself you can put:
Process Command = ibipowershellexe01
Process Command = ibipowershellexe01
Parameter = "####Scheduler_Signal_Files_Folder####\run-m01-send-end-etl-email.ps1"
Yes, we do now repeat the Process Command.
We used to have the ability to have two different values in the batch command line of the scheduler.
Now we tend to use the same for such commands as this because there was no real value in having them be different.
What this command in the actual schedule will do is to execute the file
by using the power shell command.
So this will then be the step that actually sends the email at the end of processing.
So there you have it.
This post has shown you how it is that we send emails via O365 accounts from inside the SeETL scheduler.
Of course, smtp emails are quite simple and it is quite well understood how to send them via a command line.
With the use of O365 it’s a little trickier because we have to use power shell and we have to provide the encrypted password. This is why I thought it was worth my while to put this blog post up for you to show you how this works!
Of course, our scheduler has turned out to be the backbone of our run time processing.
It was developed way back in 2003 in response to a customer request. Over the years it has proven itself to be very reliable.
We start it up at system startup time and it just runs.
We recommend putting the scheduler on the same machine as the database for the schedule.
We have had customers where the databases for the schedule were on a different machine to the scheduler itself.
This leads to the problem that if the ODBC connection fails for any reason the scheduler fails.
We tried very hard to get around that problem but the work arounds were worse than the simple solution of putting the scheduler on the same machine and restarting the schedule when the machine started.
So, today what we do is we install the scheduler on the same machine as the staging area database and we let it run the schedule from there. This has proven very, very reliable.
The only real issue left with the scheduler is that if the machine crashes so hard in the middle of a batch that the scheduler did not have time to write out the fact that it was going down?
Then manual intervention is needed to get the scheduler restarted because it did not record the fact it was dying.
If the DBA performs a shut down of the database properly then SeETL will notice that the database is being shut down and record those processes that have not been completed and the restart will find them and run them.
Today, servers crash so rarely this is not an issue.
When we want to take a machine down we make sure we do not take it down in the middle of an ETL run.
Our SeETL just runs and runs like the energiser bunny. And it has parallelisation features that mean we can take a DataStage schedule of jobs and actually run it faster than DataStage can. That’s why we have the DataStage interface, to do exactly that!
The SeETL scheduler also comes with SQL Server Report Services Reports that expose the schedules and the schedules in progress so that ETL support “techies” can see where any specific schedule is up to.
With schedules commonly having more than 5,000 steps reports for where things are up to are quite useful.
I would like to say thank you very much for reading this blog post!
I hope you like using our SeETL scheduler!