Lake Formation Components
Lake Formation provides a console to set up and manage your data lakes. Lake Formation uses AWS Glue API operations,
which are available through several language-specific SDKs and the AWS Command Line Interface (AWS CLI). For
information about using the AWS CLI, see the AWS CLI Command Reference.
Lake Formation uses the Data Catalog to store metadata about data lakes, data sources, transforms, and targets. The
AWS Glue API provides a managed infrastructure for defining, scheduling, and running extract, transform, and load
(ETL) operations on your data. For more information, see AWS Glue API.
Lake Formation Console
You use the Lake Formation console to define your data lake. The console calls AWS Glue API operations to perform
the following tasks:
- Define AWS Glue objects such as jobs, tables, crawlers, and connections.
- Schedule when crawlers run.
- Define events or schedules for job triggers.
- Search and filter lists of Lake Formation objects.
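For readers who prefer to script these tasks, the following is a minimal sketch of AWS Glue API operations that plausibly correspond to them. It assumes `glue` is a boto3 Glue client (for example, `boto3.client("glue")`). The helper function names and the cron expression in the comment are illustrative and do not come from this guide, and the text above does not say which operations the console itself calls.

```python
def schedule_crawler(glue, crawler_name, cron_expression):
    """Set when an existing crawler runs, e.g. 'cron(0 2 * * ? *)'."""
    glue.update_crawler_schedule(
        CrawlerName=crawler_name,
        Schedule=cron_expression,
    )


def create_scheduled_job_trigger(glue, trigger_name, job_name, cron_expression):
    """Create a trigger that starts a job on a schedule."""
    return glue.create_trigger(
        Name=trigger_name,
        Type="SCHEDULED",
        Schedule=cron_expression,
        Actions=[{"JobName": job_name}],
        StartOnCreation=True,
    )


def search_catalog_tables(glue, text):
    """Return (database, table) pairs whose metadata matches the search text.

    Only the first page of results is read; a fuller version would
    follow the NextToken value in the response.
    """
    response = glue.search_tables(SearchText=text)
    return [(t["DatabaseName"], t["Name"]) for t in response.get("TableList", [])]
```

Passing the client in as a parameter keeps the helpers independent of how credentials and the AWS Region are configured, and makes them easy to exercise against a stub.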
Data Catalog
The Data Catalog is your persistent metadata store. It is a managed service that lets you store, annotate, and share
metadata in the AWS Cloud in the same way you would in an Apache Hive metastore. Each AWS account has one Data
Catalog per AWS Region. The Data Catalog provides a uniform repository where disparate systems can store and find
metadata to keep track of data in data silos, and then use that metadata to query and transform the data.
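To make the idea of a shared metadata repository concrete, here is a minimal sketch that reads the databases and tables recorded in a Data Catalog. It assumes `glue` is a boto3 Glue client created for the Region whose catalog you want to inspect; the function name `list_catalog_tables` is hypothetical.

```python
def list_catalog_tables(glue):
    """Map each database name in the Data Catalog to its table names."""
    tables_by_database = {}
    database_pages = glue.get_paginator("get_databases").paginate()
    for database_page in database_pages:
        for database in database_page["DatabaseList"]:
            name = database["Name"]
            table_names = []
            # Tables are listed per database, so page through each one.
            table_pages = glue.get_paginator("get_tables").paginate(DatabaseName=name)
            for table_page in table_pages:
                table_names.extend(t["Name"] for t in table_page["TableList"])
            tables_by_database[name] = table_names
    return tables_by_database
```

Because no catalog ID is passed, the calls fall back to the catalog of the calling account, which matches the one-catalog-per-account-per-Region model described above.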
Lake Formation provides a hierarchy of permissions to control access to databases and tables in a Data Catalog. You
manage access by granting and revoking permissions on these resources.
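As an illustration of granting and revoking access, the sketch below wraps the Lake Formation `GrantPermissions` and `RevokePermissions` operations for a single table. It assumes `lakeformation` is a boto3 Lake Formation client (for example, `boto3.client("lakeformation")`). The helper names and the choice of the `SELECT` permission are illustrative and do not come from this guide.

```python
def table_resource(database, table):
    """Build the resource structure that identifies one catalog table."""
    return {"Table": {"DatabaseName": database, "Name": table}}


def grant_select(lakeformation, principal_arn, database, table):
    """Allow a principal (an IAM user or role ARN) to query a table."""
    lakeformation.grant_permissions(
        Principal={"DataLakePrincipalIdentifier": principal_arn},
        Resource=table_resource(database, table),
        Permissions=["SELECT"],
    )


def revoke_select(lakeformation, principal_arn, database, table):
    """Remove a previously granted SELECT permission on a table."""
    lakeformation.revoke_permissions(
        Principal={"DataLakePrincipalIdentifier": principal_arn},
        Resource=table_resource(database, table),
        Permissions=["SELECT"],
    )
```

A database-level grant would use a `Database` resource in place of `Table`; the sketch covers only the table case to stay short.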