The History, Present, and Future of ETL Technology
Abstract. Data is abundant, but a large volume of it is unusable. Data may be noisy, unstructured, stored in a medium or format incompatible with direct analysis, and often expensive to access. In most practical cases, data needs to be processed before it can be used to extract valuable business insights. We refer to the nontrivial, end-to-end operation of extracting intelligence from raw data as an ETL process. In this paper, we review how ETL technology has evolved over the last 25 years, from a rather neglected engineering challenge to a first-class citizen in analytics and data processing. We present a brief historical overview of ETL, discuss its various applications and incarnations in modern data processing environments, and outline potential future directions, some feasible and some wishful.