About Ingestion
Ingestion is the process of transferring data from an on-premises database server to the cloud. There are two types of ingestion:
- Initial Ingestion: When Proficy MDC is installed for the first time, data from selected tables in the Plant Applications database server is transferred to the MDC cloud. This is called initial ingestion. Initial ingestion is a manual process. After initial ingestion, Proficy MDC provides tools to validate that the ingested data matches the data stored in the Plant Applications database server.
- Incremental Ingestion: After the initial ingestion, only the updated data is ingested at regular intervals from the Plant Applications database server to the MDC cloud and Redshift. This is called incremental ingestion. Incremental ingestion is an automated process. You can specify the frequency of incremental ingestion.
Workflow for Initial Ingestion
Proficy MDC provides REST APIs and other software tools to perform initial ingestion using the following process:
- Data from the selected tables in the Plant Applications database server is transferred to the MDC cloud.
- The data in the MDC cloud can be validated to ensure that it matches the data in the Plant Applications database server.
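The validation step above can be illustrated with a minimal sketch. It is not the Proficy MDC validation tool. It only shows one common way to check that two copies of a table match: compare the row count and an order-independent checksum. The function names (`table_fingerprint`, `validate_tables`), the table names, and the sample rows are all hypothetical.

```python
import hashlib


def table_fingerprint(rows):
    """Return (row_count, digest) for an iterable of row tuples.

    The per-row digests are sorted first, so the result does not depend
    on the order in which the rows were read.
    """
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("utf-8")).hexdigest()
    return len(digests), combined


def validate_tables(source, target):
    """Compare two {table_name: rows} mappings.

    Returns the names of tables whose fingerprint differs between the
    source copy and the target copy (a missing table counts as empty).
    """
    mismatched = []
    for name, rows in source.items():
        if table_fingerprint(rows) != table_fingerprint(target.get(name, [])):
            mismatched.append(name)
    return mismatched


# Hypothetical sample data standing in for rows read from each side.
source = {"Events": [(1, "start"), (2, "stop")], "Units": [(10, "Line A")]}
target = {"Events": [(2, "stop"), (1, "start")], "Units": [(10, "Line B")]}

print(validate_tables(source, target))  # prints ['Units']
```

In this sample, the Events table matches even though its rows arrive in a different order, while the Units table is reported because one value differs.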
Workflow for Incremental Ingestion
Proficy MDC performs incremental ingestion using the following process:
- The Plant Applications database server is configured to generate transaction logs whenever data is updated.
- Listener services monitor these transaction logs and communicate with AWS DMS, which is hosted on the AWS Cloud.
- The AWS DMS service captures the updated transaction logs and writes them to a file in Parquet format in an S3 bucket.
- The AWS DMS service transfers each data file from the Plant Applications database to the S3 bucket.
- The Cloud Gateway service then transfers the data file to the MDC cloud and Redshift, which is accessible via REST APIs.
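The idea behind incremental ingestion, applying only the changes instead of copying everything again, can be sketched as follows. This is a simplified illustration and not how the Proficy MDC services or AWS DMS are implemented. The record layout (`op`, `key`, `row`) and the function name `apply_changes` are assumptions made for the example.

```python
def apply_changes(target, changes):
    """Apply change records, in order, to a {primary_key: row} mapping.

    Each change is a dict with a hypothetical layout:
      {"op": "insert" | "update" | "delete", "key": <primary key>, "row": {...}}
    The "row" entry is not needed for deletes.
    """
    for change in changes:
        op, key = change["op"], change["key"]
        if op in ("insert", "update"):
            target[key] = change["row"]
        elif op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            target.pop(key, None)
        else:
            raise ValueError(f"unknown operation: {op!r}")
    return target


# Hypothetical state of the cloud copy before the batch arrives.
cloud_copy = {1: {"status": "running"}}

# Hypothetical batch of changes captured since the previous run.
batch = [
    {"op": "update", "key": 1, "row": {"status": "stopped"}},
    {"op": "insert", "key": 2, "row": {"status": "running"}},
    {"op": "delete", "key": 1},
]

apply_changes(cloud_copy, batch)
print(cloud_copy)  # prints {2: {'status': 'running'}}
```

Applying the records in the order they were captured matters: here the update to key 1 is followed by a delete of the same key, so only key 2 remains.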