Structured Query Language Solutions
In general, ETL implies doing transformation processing upstream of the database while ELT implies doing transformation processing using SQL within the database. The main driver for the ELT approach is performance – i.e. utilise the potential of new powerful database technology and recent SQL improvements to do the transformations faster.
In theory the ELT approach offers the following advantages:
However, unless there is a green field situation where there is no incumbent ETL tool in place, existing ETL tools will typically be used, to maximise ROI on software licence costs and in-house skills.
Tools that were originally designed and optimised for traditional ETL may provide some ELT capability, but typically at the cost of losing some of the benefits of using that tool, for example:
In practice, a hybrid approach is often employed where processing is spread between the front-end and back-end depending on the characteristics of the workload and the capacity of the ETL and database engines. The hybrid approach should have clearly defined rules for deciding which approach is taken for a given data stream, and should manage the ELT SQL within the context of the ETL environment and integrate it as closely as possible.
It could be argued that MPP data warehousing appliances should be encouraging the use of ELT; however even the most established ETL tools like Ab Initio, Informatica and Expressor only support ELT by firing off sequential SQL statements. What is really needed is for SQL to be transparently available in the work flow mechanisms of the tool like the other transformation components.
However, the number of data warehouse appliances is small when compared to the more typical data warehouse platforms and so there is little reason for ETL tool vendors to focus on this area of the market, especially as it removes the emphasis from the performance of their product to that of the target database. Modern ETL products have focused on utilising multi-processor configurations and parallel processing threads to get high throughput, and so in some ways the fact that data warehouse appliances make ELT a practical reality make them competitors to the ETL tool vendors!