What is Lake Formation?

AWS Lake Formation is a service that makes it easy to set up, secure, and manage your data lake. With Lake Formation, you can discover, cleanse, transform, and ingest data into your data lake from various sources; define fine-grained permissions at the database, table, or column level; and share controlled access across analytics, machine learning, and ETL services.

Moving your data into a data lake can provide better flexibility, cost savings, and scalability. However, manually setting up and managing data lakes can be a complex and time-consuming process. Lake Formation provides the following capabilities, either directly or through other AWS services, to reduce the time to deploy data lakes from many months to a few days or weeks:
  • Ingest and organize data.
  • Cleanse data.
  • Catalog and index data.
  • Analyze data.
  • Secure data at the database, table, and column level.
  • Grant data access to users from a central location.
  • Orchestrate data flows.
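
As a rough illustration of securing data and granting access centrally, the sketch below builds a column-level SELECT grant in Python. It assumes the boto3 SDK and its Lake Formation grant_permissions operation; the role ARN, database, table, and column names are hypothetical placeholders, and the helper names build_column_grant and send_grant are invented for this example.

```python
from typing import Any, Dict, List


def build_column_grant(principal_arn: str, database: str, table: str,
                       columns: List[str]) -> Dict[str, Any]:
    """Build keyword arguments for a column-level SELECT grant."""
    if not columns:
        raise ValueError("at least one column is required")
    return {
        "Principal": {"DataLakePrincipalIdentifier": principal_arn},
        "Resource": {
            "TableWithColumns": {
                "DatabaseName": database,
                "Name": table,
                "ColumnNames": list(columns),
            }
        },
        "Permissions": ["SELECT"],
    }


def send_grant(request: Dict[str, Any]) -> None:
    """Send the grant. Needs boto3 and valid AWS credentials (assumed)."""
    import boto3  # imported lazily so the builder works without the SDK

    boto3.client("lakeformation").grant_permissions(**request)


# Hypothetical principal and resource names, for illustration only.
grant_request = build_column_grant(
    "arn:aws:iam::111122223333:role/AnalystRole",
    "sales_db",
    "orders",
    ["order_id", "order_date"],
)
```

Calling send_grant(grant_request) would submit the grant. Keeping the request builder separate from the network call makes the request easy to inspect or log before anything is changed.
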

Lake Formation uses the following services:
  • AWS Glue to orchestrate trigger-driven jobs that transform data using AWS Glue transforms.
  • AWS Identity and Access Management (IAM) to secure data, with Lake Formation permissions used to grant and revoke access.
  • The AWS Glue Data Catalog as the central metadata repository across several services.
  • Amazon Athena to query data.
  • Amazon SageMaker to analyze data.
  • AWS Glue machine learning transforms to cleanse data.
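
To make the Athena item above concrete, here is a minimal Python sketch of querying a cataloged table. It assumes the boto3 SDK and the Athena start_query_execution operation; the database, table, column names, and S3 output location are hypothetical, as are the helper names build_athena_query and run_query.

```python
import re
from typing import Any, Dict, List

_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_athena_query(database: str, table: str, columns: List[str],
                       output_location: str, limit: int = 10) -> Dict[str, Any]:
    """Build keyword arguments for a simple SELECT on a cataloged table."""
    # Reject anything that is not a plain identifier to avoid SQL injection.
    for name in [database, table] + list(columns):
        if not _IDENTIFIER.match(name):
            raise ValueError("invalid identifier: %r" % name)
    column_list = ", ".join(columns)
    query = 'SELECT %s FROM "%s" LIMIT %d' % (column_list, table, limit)
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Start the query and return its execution ID. Needs boto3 and credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    response = boto3.client("athena").start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names and bucket, for illustration only.
athena_request = build_athena_query(
    "sales_db",
    "orders",
    ["order_id", "order_date"],
    "s3://example-athena-results/",
)
```

With Lake Formation permissions in place, a query like this would be expected to succeed only for the columns the calling principal has been granted.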