Semistructured source data with automatic ETL schema generation at SQL destination
Problem:
A Software as a Service (SaaS) company stores OLTP data in a mixed format of relational and semi-structured data. This multi-tenant data consists of workflow, form data and documents. The company must export the data for their clients in a tabular form by pivoting the data. Business requirements change rapidly and data elements are created regularly triggering expensive code changes by the ETL development team. The business needed automatic ETL schema generation with metadata managed data transformation and data governance.
Solution:
Intelligent Integration eliminated 1000’s of manually-coded ETL packages, instead, new data elements are added automatically to the destination schema. Complete visibility of data mapping is now available without requiring code review or maintenance of separate documentation. In order to improve data quality, the company wants consistent business requirements applied to both the front end’s application form designer as well as the ETL export. This is being implemented as a data governance solution (a form of Master Data Management). Which is a term for a centralized repository of data requirements with source control. The application dev team’s form designer (in particular field validation) and the ETL team will now share data requirements. In the case of ETL, the schema changes are simply imported into Intelligent Integration’s metadata store without any coding required.
Automatic ETL schema generation has several other practical use cases. In development environments application developers make changes in OLTP database (either SQL or NoSQL) and data integration (i.e. reporting database etc) can be either automatically or semi-automatically updated. Intelligent Integration’s metadata can be deployed in the SDLC process. This reduces ETL LOE relatively to application development.