Building an Efficient AWS Data Lake Catalog

  JUNE 11, 2020       11:00 AM PST

About the Webinar:

One of the most common challenges organizations face with their data lakes is the inability to find, understand, and trust the data they need to derive business value or gain a competitive edge. Organizations are putting all of their enterprise data into data lakes built on object storage such as Amazon S3. In no time, however, the data lake becomes swampy and unusable due to redundant copies of data. This implicitly increases organizational costs, because searching and indexing the data becomes difficult.

At R Systems, we’ve built an efficient process for cataloging a data lake using Amazon S3, Amazon DynamoDB, AWS Lambda (serverless computing), and Amazon Elasticsearch Service.

Our speakers will discuss & demonstrate the best practices for:

  • Setting up DynamoDB to store catalog information
  • Setting up a Lambda function to trigger data lake file ingestion
  • Demonstrating the catalog's search capabilities
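
As a rough illustration of the DynamoDB and Lambda steps listed above, here is a minimal Python sketch of a Lambda handler that records each newly ingested S3 object in a DynamoDB catalog table. It is not the implementation presented in the webinar. The table name `data-lake-catalog`, the `CATALOG_TABLE` environment variable, and the attribute names (`object_key`, `bucket`, `size_bytes`, `etag`, `ingested_at`) are all assumptions made for this example, and the sketch assumes the table uses `object_key` as its partition key.

```python
import os
from urllib.parse import unquote_plus

# Hypothetical table name, chosen for this sketch only.
CATALOG_TABLE = os.environ.get("CATALOG_TABLE", "data-lake-catalog")


def build_catalog_items(event):
    """Convert an S3 event notification into a list of catalog records."""
    items = []
    for record in event.get("Records", []):
        s3 = record["s3"]
        obj = s3["object"]
        # S3 URL-encodes object keys in event notifications, so decode them.
        key = unquote_plus(obj["key"])
        items.append({
            "object_key": key,                    # assumed partition key
            "bucket": s3["bucket"]["name"],
            "size_bytes": obj.get("size", 0),
            "etag": obj.get("eTag", ""),
            "ingested_at": record.get("eventTime", ""),
        })
    return items


def lambda_handler(event, context):
    """Lambda entry point: write one catalog item per ingested object."""
    # Imported here so build_catalog_items can be exercised without AWS access.
    import boto3

    table = boto3.resource("dynamodb").Table(CATALOG_TABLE)
    items = build_catalog_items(event)
    for item in items:
        table.put_item(Item=item)
    return {"cataloged": len(items)}
```

Keeping the event parsing in its own function means the catalog record format can be checked locally with a sample event, while the handler itself only deals with the DynamoDB write. For the function to run on ingestion, the bucket would also need an S3 event notification for object-created events that targets it.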

Key Takeaways:

  • Building a structured data lake
  • Spending more time processing data and less time finding it
  • Implementing access control
  • Reducing costs by eliminating data redundancies
  • Supporting regulatory compliance

Who Should Attend:

  • Chief Data Officer
  • Director/VP of Engineering
  • Director/VP of Analytics
  • Data Architects, Systems Architects & Cloud Architects
  • Data Engineers

Fill out this form to

Get This Recording!

Speaker Profiles

Ajay Jha

Director – Big Data & Analytics

R Systems


Abhi Tripathi

Principal Solutions Architect – Big Data & Cloud

R Systems