Fully automated deployment processes for Data & AI solutions
Challenge
Incorporate CI/CD into the development lifecycle
Beerwulf unites beer drinkers with beer brewers through a user-centric, data-driven storytelling platform dedicated to selling, sharing, and exploring the world of beer. To achieve its mission, the company needed a dedicated team of data scientists and data engineers to incorporate continuous integration and continuous delivery (CI/CD) into the development lifecycle. Beerwulf called on Xccelerated for assistance.
Solution
Azure Databricks workspace or Azure Data Factory
Although technologies like Azure Data Factory and Azure Databricks help data engineers develop data pipelines quickly and easily, delivering code changes more frequently is still a challenge. Each change has to be validated and tested automatically. This process ultimately produces an artefact that is deployed to a target environment: in Beerwulf’s case, an Azure Databricks workspace or Azure Data Factory.
Integrating the deployment of an entire cloud environment (data factories, storage accounts, and databases for different types of pipeline architectures) meant that fully provisioned environments had to be prepared for any kind of source. This process required the management of template files, parameter template files, and pre- and post-deployment scripts.
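To illustrate how parameter template files can be managed per environment, here is a minimal Python sketch. It assumes the template files are Azure Resource Manager (ARM) templates, which the case study does not state explicitly. The parameter names, resource names, and environment names are hypothetical and are not taken from Beerwulf's setup.

```python
import json

# Schema identifier used by ARM deployment parameter files.
ARM_PARAMETERS_SCHEMA = (
    "https://schema.management.azure.com/schemas/2019-04-01/"
    "deploymentParameters.json#"
)

# Hypothetical defaults shared by every environment.
BASE_PARAMETERS = {
    "location": "westeurope",
    "storageAccountSku": "Standard_LRS",
}

# Hypothetical per-environment overrides.
ENVIRONMENT_OVERRIDES = {
    "staging": {"factoryName": "adf-example-staging"},
    "production": {
        "factoryName": "adf-example-prod",
        "storageAccountSku": "Standard_GRS",
    },
}


def render_parameter_file(environment: str) -> dict:
    """Merge shared defaults with one environment's overrides and
    wrap the result in the ARM parameter-file structure."""
    if environment not in ENVIRONMENT_OVERRIDES:
        raise ValueError(f"unknown environment: {environment}")
    merged = {**BASE_PARAMETERS, **ENVIRONMENT_OVERRIDES[environment]}
    return {
        "$schema": ARM_PARAMETERS_SCHEMA,
        "contentVersion": "1.0.0.0",
        "parameters": {name: {"value": value} for name, value in merged.items()},
    }


if __name__ == "__main__":
    print(json.dumps(render_parameter_file("staging"), indent=2))
```

Keeping the shared values in one place and the differences per environment small is one way to make the same template deployable to any target; a pre-deployment script could write the rendered file to disk before the deployment step runs.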
To incorporate CI/CD into the development process of the Data & AI solutions, several pieces had to be implemented: building libraries and non-notebook Apache Spark code, running automated tests and sanity checks, and scheduling data pipelines and machine learning workflows.
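As a rough idea of what a sanity check in such a pipeline might look like, the sketch below validates a batch of records before it moves to the next stage. It is only an illustration under stated assumptions: the function name, the column names, and the rules are hypothetical, and plain dictionaries stand in for a Spark DataFrame so the example has no dependencies.

```python
from typing import Iterable, List, Mapping, Sequence


def sanity_check(
    rows: Iterable[Mapping[str, object]],
    required_columns: Sequence[str],
    min_rows: int = 1,
) -> List[str]:
    """Return a list of human-readable problems; an empty list means
    the batch passed every check."""
    problems: List[str] = []
    rows = list(rows)

    # Check 1: the batch must not be (nearly) empty.
    if len(rows) < min_rows:
        problems.append(f"expected at least {min_rows} row(s), got {len(rows)}")

    # Check 2: required columns must be present and non-null in every row.
    for index, row in enumerate(rows):
        for column in required_columns:
            if row.get(column) is None:
                problems.append(f"row {index}: missing value for '{column}'")

    return problems


if __name__ == "__main__":
    # Hypothetical batch with one bad record.
    batch = [
        {"order_id": 1, "sku": "A"},
        {"order_id": None, "sku": "B"},
    ]
    for problem in sanity_check(batch, ["order_id", "sku"]):
        print(problem)
```

Because the check is an ordinary function rather than notebook code, it can be packaged in a library and exercised by automated tests in the CI stage before any deployment happens.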
Result
Orchestrate deployments, provisioning, and staging of entire environments
Beerwulf can now orchestrate deployments, provisioning, and staging of entire environments, including compiled code, Azure Databricks, and other cloud-native tools. This makes it easy for the team to manage deployments, run tests, and adapt data pipelines to different kinds of sources or environments. The project delivered three fully automated deployment processes, each with its own monitoring and approval system, more than 20 test scenarios, and a centralized secret management system.