Fully automated deployment processes for Data & AI solutions

Challenge

Incorporate CI/CD into the development lifecycle

Beerwulf unites beer drinkers with beer brewers through a user-centric, data-driven storytelling platform dedicated to selling, sharing, and exploring the world of beer. To achieve its mission, it needed a dedicated team of data scientists and data engineers to incorporate continuous integration and continuous delivery (CI/CD) into the development lifecycle. Beerwulf called on Xccelerated for assistance.

Solution

Azure Databricks workspace or Azure Data Factory

Although technologies like Azure Data Factory and Azure Databricks help data engineers develop data pipelines quickly and easily, delivering code changes more frequently is still a challenge. Each change has to be validated and tested automatically. This process ultimately results in an artefact that is deployed to a target environment: in Beerwulf’s case, an Azure Databricks workspace or Azure Data Factory.

To integrate the deployment of an entire cloud environment (containing data factories, storage accounts, and databases for different types of pipeline architectures), fully provisioned environments for any kind of source needed to be prepared. This process required the management of template files, parameter template files, and pre- and post-deployment scripts.
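As an illustration of what managing parameter template files per environment can look like, here is a minimal Python sketch that overlays environment-specific values on shared defaults and emits a document in the Azure Resource Manager parameter-file layout. The parameter names, values, and the `build_parameter_file` helper are hypothetical and do not come from the Beerwulf project.

```python
import json

# Hypothetical shared defaults and per-environment overrides.
# None of these names or values come from the Beerwulf project.
DEFAULTS = {"location": "westeurope", "storageSku": "Standard_LRS"}
OVERRIDES = {
    "dev": {"factoryName": "adf-example-dev"},
    "prod": {"factoryName": "adf-example-prod", "storageSku": "Standard_GRS"},
}


def build_parameter_file(environment: str) -> dict:
    """Return an ARM-style parameter document for one environment."""
    if environment not in OVERRIDES:
        raise ValueError(f"unknown environment: {environment}")
    # Environment-specific values win over the shared defaults.
    merged = {**DEFAULTS, **OVERRIDES[environment]}
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentParameters.json#",
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in merged.items()},
    }


if __name__ == "__main__":
    print(json.dumps(build_parameter_file("prod"), indent=2))
```

Keeping the defaults in one place and only the differences per environment means a new environment is added by listing what is different about it, rather than by copying a whole parameter file.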

Incorporating CI/CD into the development process of the Data & AI solutions meant implementing all of the following: building libraries and non-notebook Apache Spark code, running automated tests and sanity checks, and scheduling data pipelines and machine learning workflows.
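A sanity check in this context can be as simple as verifying that a pipeline output is non-empty and that key fields are populated before it is promoted. The following is a minimal sketch in plain Python, assuming rows arrive as dictionaries; the `sanity_check` function, the column names, and the rules are hypothetical and not taken from the project's actual test scenarios.

```python
from typing import Iterable, List, Mapping, Sequence


def sanity_check(
    rows: Iterable[Mapping[str, object]],
    required: Sequence[str],
    key: str,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means the data passed."""
    materialized = list(rows)
    if not materialized:
        return ["dataset is empty"]

    problems: List[str] = []
    for index, row in enumerate(materialized):
        missing = [column for column in required if column not in row]
        if missing:
            problems.append(f"row {index}: missing columns {missing}")
        elif row.get(key) is None:
            problems.append(f"row {index}: null value in '{key}'")
    return problems


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    sample = [
        {"order_id": 1, "amount": 12.5},
        {"order_id": None, "amount": 3.0},
        {"amount": 7.0},
    ]
    for problem in sanity_check(sample, required=["order_id", "amount"], key="order_id"):
        print(problem)
```

Returning a list of problems instead of raising on the first one lets a CI job report every failing row in a single run.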

Result

Orchestrate deployments, provisioning, and staging of entire environments

Beerwulf can now orchestrate deployments, provisioning, and staging of entire environments, including compiled code, Azure Databricks, and other cloud-native tools. This makes it easy for the team to manage deployments, run tests, and adapt data pipelines to different kinds of sources or environments. The project delivered three different fully automated deployment processes, each with its own monitoring and approval system, more than 20 test scenarios, and a centralized secret management system.

Other references

Heineken
KLM
ProRail
Alliander
Schiphol Group
Fedex
Randstad Group
Nationale Nederlanden
Mollie
Vattenfall