Talend Open Studio is a great open source solution. Its “Cost vs. functionalities” is quite good. But because this is the free version, as opposed to the enterprise edition, it lacks a few core functionalities such as a proper workflow manager or a simple execution dashboard. But as it turns out, creating an execution dashboard is actually quite easy.
There are two functionnalities that EVERY projects based on the “Open Studio” version should have turned on:
- “Use statistics” (tStatCatcher)
- “Use volumetrics” (tFlowMeterCatcher)
The first option will tell all jobs of the project to send execution statistics through tStatCatcher to the target datastore. This way we can collect some metrics such as:
- the job start timestamp
- the job end timestamp and duration
- the job execution status (‘failure’, ‘success’)
If we tick the option “Catch component statistics”, we will also collect component execution statistics.
The second option will route the outputs of all tFlowMeter components through the tFlowMeterCatcher to the target datastore. The tFlowMeter components can be used to catch “inner-join”/”look-up” failure metrics of the flow. This can be usefull to detect dimension look-up errors for example, or data quality issues.
To be able to report using these, we need the datastore to be simple RDMS tables. Just by selecting the “On Databases” option (and configuring it), Talend will automatically create the defined two tables as:
Regarding the execution statistics, there is actually nothing else to do. Talend will feed the data to the execution table automatically. Collecting volume information requires establishing development rules and conventions. For example, any dimension look-up implementations should send failed join records to a tFlowMeter as in this integration job part:
I have established the following development conventions:
- All source component (files, APIs, RDMS) flows should first go through a tFlowMeter with a custom label starting with the keyword in and a short description of the source.
- Any look-up (
tJoin) between datasets or dimension look-up should have a failed records flow going to or through a tFlowMeter with a custom label starting with the keyword error and following by a very short description.
The first development convention helps detecting issues in source systems communication or internal,
if on a particular execution, the label value
count drops to zero, there is definitely something wrong.
The second convention helps dealing with data quality issues, early facts (or late dimensions), integration errors etc.
We can now create a dashboard on these metrics with any reporting tool (example for Tableau). We want to blend both datasets into
the same dashboard. A “simple” way to do so with Tableau is to create a datasource with
ETL_jobs_execution_monitoring in a left
ETL_jobs_volume_monitoring. To avoid duplications, we need to filter the
originis null (We want the job, not the components)
message_type = 'end'
count) might be aggregated by the reporting tool, a smart decision here is to select the average
function, just to be sure.
In any case, to prevent it from aggregating metrics, we need to add the the
pid (Talend internal) or the
system_pid to the dimensions we use. Both are unique (at least within a day).
With all these in mind, it’s quite straightforward to build an execution dashboard like this one: