Generating ETL processes with BIML
// by Matthias Rensing
Data & Analytics, etl framework
noventum data warehouses purse a standardised layer architecture
The creation of data warehouses allows for relying on standardised elements at different spots. Depending on the project, there is a high degree of standardisation in some layers while others contain customer-specific business logic throughout. To implement the standardised layers as quickly as possible and in high quality means – for each project – a stable basis and motivation for the often difficult individual aspects that have to be worked on subsequently.
ETL process for the SQL Server are created as SSIS packages; these are XML files, but they were never designed for manual handling. Therefore, it is not efficiently possible to generate these files directly. To ease this process, there has been the Business Intelligence Modelling Language (BIML) for some time now.
BIML has been gaining momentum, especially this year. The SQL Server Conference in February, organised by PASS (Professional Association for SQL Server), made this clear. With a contribution of its own, „meta data driven development with BIML“, noventum was able to move the topic forward with an accent of its own at the PASS meeting in Hanover in May 2015. This year‘s SQL Saturday in St. Augustin is promising four presentations on the topic of BIML.
What is BIML?
BIML is a description of SSIS package in legible XML source text. This provides for initial advantages with respect to versioning and cooperation. But the whole potential of the BIML idea can only exploited with BIMLScript. With BIMLScript, BIML files can be generated dynamically. E.g., the developer can say „generate five packages for me that each look completely identical and are merely based on a different source table.“
What can be achieved by this?
From a source/target mapping, meta data can be derived that allows for generating therefrom a whole ETL process for loading a DWH. Since it is generated, the latter does – of course – exactly adhere to the project‘s naming conventions. A steadily growing toolbox of stored best practices regarding package creation supplements these options. This special noventum know-how does encompass, for example, the loading from the source system through delta methods or different historisation methods.
BIML allows for shorter project duration while opening-up new sources. It establishes a degree of standardisation that massively improves the maintainability of the DWH. Added value can be achieved for the customers through the automatic creation, which, in a manual implementation, would have exceeded the project budget, for instance in the area of historisation. The legible XML source text makes a versioning and therewith cooperation easier. BIML generates normal SSIS packages so that these later can be reviewed and executed with the SQL Server Data Tools (SSDT BI). If manual changes are desired, they can therefore also be implemented. BIML is part of the free SSDT BI plug-in BIDShelper that is being utilised on all noventum projects. With its help, packages can be generated for SQL Server versions from 2005 to 2014. A technical insight into the topic is afforded by the PASS presentation. To the present day, the methodology described has been used in corresponding customer projects with great success. In this, the custom part repeatedly turned out to be surprisingly small and was realisable at reasonable expenditure.
Road to Azure Data Platform
Sicherheit und eine solide Grundlage
für zukünftige Analytics-Architekturen
Die Fülle an neuen Möglichkeiten in der Azure Data Platform und der stetige Wandel werfen mit Blick auf die eigene Analytics Infrastruktur für viele Unternehmen Fragen auf: Wo lohnt sich ein genauerer Blick? Welcher Trend könnte echten Mehrwert bieten? Welche Technologien würde das Team zukünftig begeistern und uns für neue Bewerber attraktiv machen?
Vielleicht sehen Sie schon Engpässe der eigenen Infrastruktur?
- Fehlende Zugriffsmöglichkeiten auf webbasierte Datenquellen
- Performance Probleme der Beladeroutinen oder Antwortzeiten der Berichte
- Zu hoher Workload im Betrieb und fehlendes Know-how bei veralteter Technologie
Dann können Sie mit der Einführung der Azure Data Platform Abhilfe schaffen.
Mit dem bewährten Workshop-Modell "Road to Azure Data Platform" beantwortet noventum diese Fragen und ordnet kundenindividuell aktuelle Azure Services, vor dem Hintergrund der existierenden Analytics Infrastruktur, den erkannten Engpässen und der zukünftigen IT-, bzw. Datenstrategie ein.
Power BI | Fabric | Azure Analysis Services | Azure Data Factory | Azure SQL Managed Instance | Azure SQL Database | Synapse | Azure Data Factory | Azure SSIS | Azure DevOps | Key Vault | oder Databricks
noventum consulting GmbH