ETL (Extract, Transform, and Load) has long been one of the most daunting tasks on big data platforms such as Apache Hadoop and Spark. Developers typically write large amounts of code to build data pipelines and job control flows. AWS Glue is an ETL tool that makes this work easier and provides an attractive graphical interface. One of its key features is that it is fully serverless: it requires no cluster provisioning or management. Behind the scenes, Glue launches an Apache Spark cluster to process the data in the ETL flow. Pricing is also pay-per-use.
Principal Solutions Architect – Big Data & Cloud, R Systems
Abhi Tripathi is a Principal Big Data & Cloud Solutions Architect with 17+ years of IT experience in the architecture, design, and development of applications and frameworks. He is well versed in developing distributed, enterprise-grade, multi-tiered systems, both on-premises and in the cloud, using AWS, Azure, Big Data, Apache Hadoop and Spark, analytics, and related technologies.
He holds the AWS Solutions Architect and AWS Big Data Specialty certifications.
TBD, Amazon Web Services