Implementing a bridge table with Talend
In Kimball’s multi-dimensional data model, a bridge table is a modelling solution for a multi-valued dimension, that is, when a fact in a fact table relates to more than one record in a dimension table (many-to-many). Creating and feeding the bridge table and the associated dimension table can be challenging; here is a solution with an RDBMS and the ETL tool Talend.
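To make the many-to-many idea concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Talend. The table names (dim_diagnosis, bridge_diagnosis_group, fact_visit), the weight column, and the sample rows are all hypothetical and chosen only for illustration; the weighting factor is one common way to avoid double counting, not necessarily the approach used in the post.

```python
import sqlite3


def allocated_cost_by_diagnosis():
    """Build a tiny star schema with a bridge table and query through it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Hypothetical dimension: one row per diagnosis.
        CREATE TABLE dim_diagnosis (
            diagnosis_key INTEGER PRIMARY KEY,
            label TEXT NOT NULL
        );
        -- Bridge: one row per (group, diagnosis) pair. The weights of a
        -- group sum to 1 so that facts are not double counted.
        CREATE TABLE bridge_diagnosis_group (
            group_key INTEGER NOT NULL,
            diagnosis_key INTEGER NOT NULL REFERENCES dim_diagnosis,
            weight REAL NOT NULL,
            PRIMARY KEY (group_key, diagnosis_key)
        );
        -- Fact: each visit points to a group, not to a single diagnosis.
        CREATE TABLE fact_visit (
            visit_id INTEGER PRIMARY KEY,
            group_key INTEGER NOT NULL,
            cost REAL NOT NULL
        );
        """
    )
    conn.executemany(
        "INSERT INTO dim_diagnosis VALUES (?, ?)",
        [(1, "flu"), (2, "asthma")],
    )
    conn.executemany(
        "INSERT INTO bridge_diagnosis_group VALUES (?, ?, ?)",
        [(1, 1, 0.5), (1, 2, 0.5), (2, 1, 1.0)],
    )
    conn.executemany(
        "INSERT INTO fact_visit VALUES (?, ?, ?)",
        [(1, 1, 100.0), (2, 2, 40.0)],
    )
    # Join fact -> bridge -> dimension and allocate the cost by weight.
    rows = conn.execute(
        """
        SELECT d.label, SUM(f.cost * b.weight)
        FROM fact_visit AS f
        JOIN bridge_diagnosis_group AS b ON b.group_key = f.group_key
        JOIN dim_diagnosis AS d ON d.diagnosis_key = b.diagnosis_key
        GROUP BY d.label
        ORDER BY d.label
        """
    ).fetchall()
    conn.close()
    return rows
```

Because the weights within each group sum to 1, the allocated costs add up to the total of the fact table, which is the property a bridge table with weighting factors is meant to preserve.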
If you have used Informatica, or to some extent Talend Studio (that is, not the free version), you know that you can chain jobs together. But if you need to chain jobs that use different technologies, or if you need more than just linear chaining and dependencies, a robust workflow manager comes in handy.
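As a rough illustration of non-linear dependencies, the sketch below uses Python's standard graphlib module (Python 3.9+) to derive a valid run order from a dependency graph. It is not tied to any particular workflow manager, and the job names are hypothetical.

```python
from graphlib import TopologicalSorter


def run_order(dependencies):
    """Return one valid execution order for a job dependency graph.

    `dependencies` maps each job to the set of jobs it must wait for.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical jobs: two independent extracts feed one load, which in
# turn feeds a report. This is a diamond-like graph, not a linear chain.
JOBS = {
    "load_warehouse": {"extract_crm", "extract_erp"},
    "build_report": {"load_warehouse"},
}
```

Calling `run_order(JOBS)` yields both extracts before `load_warehouse`, and `build_report` last; the relative order of the two extracts is not guaranteed, since they do not depend on each other.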
Bulk load between technologies
Data migration is a very common task in (big) data engineering. In the big data landscape, you would probably reach for Sqoop to handle such a task. But in a small to mid-size ecosystem, the right tool is not obvious. In this post, I will introduce a new bulk loading tool, Embulk, which can handle data loading between different technologies.