
Are you configuring your Azure App Service to use a VNet? The regional VNet integration for an Azure App Service is a preview feature, and so comes with some quirks. One of the documented limitations of this preview feature is that “The feature is only available from newer App Service scale units that support PremiumV2 App Service plans.” This has interesting implications for how you need to deploy your App Service, otherwise you could end up with a rather perplexing pattern of errors. Read about how we ran into this error, and how we ended up fixing it.


C#, Span and async

by Ian Griffiths

The addition of ref struct types, most notably Span, opened C# to a range of high-performance scenarios that were impractical to tackle with earlier versions of the language. However, they introduce some challenges. For example, they do not mix very well with async methods. This article shows some techniques for mitigating this.
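To give a flavour of one such mitigation (a minimal sketch of our own rather than code from the article): because a ref struct cannot be kept alive across an await, you can do the asynchronous I/O first, then hand the buffer to a synchronous helper that does all the Span work.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class ChecksumExample
{
    // The async method owns the awaiting; no Span is in scope here.
    public static async Task<int> ChecksumAsync(Stream stream)
    {
        byte[] buffer = new byte[4096];
        int bytesRead = await stream.ReadAsync(buffer, 0, buffer.Length);
        return Checksum(buffer.AsSpan(0, bytesRead));
    }

    // All the Span handling is confined to this synchronous method,
    // so it never needs to survive across an await.
    private static int Checksum(ReadOnlySpan<byte> data)
    {
        int total = 0;
        foreach (byte b in data)
        {
            total += b;
        }

        return total;
    }
}
```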


GitHub Actions is GitHub’s new CI/CD platform. It is comparable with Azure Pipelines, which forms part of the Azure DevOps suite. In this post, Mike Larah looks at the similarities and differences in the high-level concepts and terminology between the two platforms.


Long Running Functions in Azure Data Factory

by Jess Panni

Azure Functions are powerful and convenient extension points for your Azure Data Factory pipelines. Put your custom processing logic behind an HTTP-triggered Azure Function and you are good to go. Unfortunately, many people read the Azure documentation and assume they can merrily run a Function for up to 10 minutes on a consumption plan […]
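One common way out of that trap is the asynchronous HTTP pattern, sketched below with Durable Functions; the function names and the trivial activity are illustrative, not taken from the post. The starter returns 202 Accepted with status URLs the caller (such as a Data Factory pipeline) can poll, so no HTTP connection has to stay open while the work runs.

```csharp
using System.Net.Http;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Extensions.DurableTask;
using Microsoft.Azure.WebJobs.Extensions.Http;

public static class LongRunningExample
{
    // Kicks off the orchestration and immediately returns 202 Accepted
    // with URLs the caller can poll for status and output.
    [FunctionName("HttpStart")]
    public static async Task<HttpResponseMessage> HttpStart(
        [HttpTrigger(AuthorizationLevel.Function, "post")] HttpRequestMessage req,
        [DurableClient] IDurableOrchestrationClient starter)
    {
        string instanceId = await starter.StartNewAsync("RunProcessing", null);
        return starter.CreateCheckStatusResponse(req, instanceId);
    }

    [FunctionName("RunProcessing")]
    public static async Task RunProcessing(
        [OrchestrationTrigger] IDurableOrchestrationContext context)
    {
        // The long-running work is delegated to activity functions,
        // which are not subject to the HTTP request timeout.
        await context.CallActivityAsync("DoTheWork", null);
    }

    // Placeholder for the custom processing logic.
    [FunctionName("DoTheWork")]
    public static Task DoTheWork([ActivityTrigger] IDurableActivityContext context)
        => Task.CompletedTask;
}
```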


How Azure DevTestLabs is helping me climb Everest

by Carmel Eve

Remote working allows us to work from anywhere we want. This brings a huge amount of flexibility and freedom; however, we do need the help of a working laptop! When Carmel’s laptop gave out just before a trip, she used Azure DevTestLabs to allow her to continue to work using a 10 year old Mac that probably wouldn’t have been up to the task alone…


We worked on a project recently which required us to build a highly performant system for processing vast quantities of messages in real time. We made the decision to run this processing using Azure Functions with C#. This post runs through some of the techniques we used for writing highly performant, low-allocation code, including data streaming, list preallocation, and the relatively new C# feature: Span.
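As an illustration of just one of those techniques, list preallocation (a minimal sketch, not code from the project): when you know roughly how many items you will add, sizing the List<T> once avoids the repeated grow-and-copy of its internal array, and the garbage those intermediate arrays create.

```csharp
using System.Collections.Generic;

public static class PreallocationExample
{
    public static List<int> BuildSquares(int count)
    {
        // Capacity is set up front, so no internal array resizes
        // (and no abandoned arrays for the GC) while we add items.
        var squares = new List<int>(capacity: count);
        for (int i = 0; i < count; i++)
        {
            squares.Add(i * i);
        }

        return squares;
    }
}
```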


Running Azure functions in Docker on a Raspberry Pi 4

by Jonathan George

For one of my first experiments with the Raspberry Pi 4, I decided to get an Azure Function running in a Docker container. This post gives a step-by-step guide on how to do it, as well as providing code you can use as a starting point for your own experiments.


Import and export notebooks in Databricks

by Ed Freeman

Sometimes it’s necessary to import and export notebooks from a Databricks workspace. This might be because you have some generic notebooks that can be useful across numerous workspaces, or it could be that you’re having to delete your current workspace for some reason and therefore need to transfer content over to a new workspace. Importing and exporting can be done either manually or programmatically. In this blog, we outline a way to recursively export/import a directory and its files from/to a Databricks workspace.
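To make the programmatic route concrete, here is a minimal single-notebook export against the Databricks workspace REST API (a sketch of ours, not code from the post, which handles whole directories recursively; the parameter values shown are hypothetical).

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public static class DatabricksExportExample
{
    // Returns the raw JSON response, whose "content" field holds the
    // notebook source as a base64-encoded string.
    public static async Task<string> ExportNotebookAsync(
        string workspaceUrl,   // e.g. the workspace's regional URL (hypothetical)
        string token,          // a Databricks personal access token
        string notebookPath)   // e.g. "/Shared/my-notebook" (hypothetical)
    {
        using var http = new HttpClient { BaseAddress = new Uri(workspaceUrl) };
        http.DefaultRequestHeaders.Authorization =
            new AuthenticationHeaderValue("Bearer", token);

        string requestUri =
            $"api/2.0/workspace/export?path={Uri.EscapeDataString(notebookPath)}&format=SOURCE";
        return await http.GetStringAsync(requestUri);
    }
}
```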


Machine learning often seems like a black box. This post walks through what’s actually happening under the covers, in an attempt to de-mystify the process!

Neural networks are built up of neurons. In a shallow neural network we have an input layer, a “hidden” layer of neurons, and an output layer. For deep learning, there are simply more hidden layers, which allows neurons’ inputs and outputs to be combined to build up a more detailed picture.
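As a taste of what a single neuron actually computes, the standard formulation (stated here for context, not quoted from the post) is a weighted sum of the inputs plus a bias, passed through a non-linear activation function $\sigma$:

$$a = \sigma\left(\sum_i w_i x_i + b\right)$$

Stacking layers of these simple units is what lets a deep network build up that more detailed picture.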

If you have an interest in Machine Learning and what is really happening, definitely give this a read (WARNING: Some algebra ahead…)!


Quite often it’s beneficial to work with pre-built CLIs/SDKs to interact with your favourite tools, instead of making requests to the underlying REST API. Much of the complexity around constructing requests has been abstracted, and authentication is often easier. The Databricks CLI makes it easier to interact with your Databricks instance, but sometimes you can run into strange errors when constructing the values passed in as arguments. In this blog, we take a look at a JsonDecodeError that can occur when speaking to the Clusters CLI, and look at a way we can avoid this error.


Building a secure solution on Azure can be a daunting task. Using Azure Functions and Managed Identities, we have built up a pattern for giving services access to one another, without the need to store credentials. These managed identities can be given access to necessary resources. For example, they can be granted roles and added to access control lists in ADLS Gen2 accounts, or the ability to access keys in key vault. This means that data can be securely accessed without needing to store connection strings or app passwords.
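As a minimal sketch of what that pattern looks like from C# (the vault URL and secret name below are hypothetical, and the function’s managed identity is assumed to have already been granted access to the vault):

```csharp
using System;
using System.Threading.Tasks;
using Azure.Identity;
using Azure.Security.KeyVault.Secrets;

public static class ManagedIdentityExample
{
    public static async Task<string> ReadSecretAsync()
    {
        // DefaultAzureCredential picks up the Function's managed identity
        // when deployed, so no credential is stored in configuration.
        var client = new SecretClient(
            new Uri("https://my-vault.vault.azure.net/"), // hypothetical vault
            new DefaultAzureCredential());

        KeyVaultSecret secret = await client.GetSecretAsync("my-secret"); // hypothetical name
        return secret.Value;
    }
}
```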


Whilst some of the Azure Active Directory PowerShell for Graph module (AzureAD) functionality has been rolled into the new Azure PowerShell Az module, it’s not currently (and might never be) a replacement for the full power of what you can achieve with AzureAD. So, there’s every chance you’ll find yourself needing to use both side-by-side. This post explains how to do that using the new cross-platform PowerShell Core.


A Power BI based solution typically consists of a variety of technologies – for example Azure data platform services containing source data. As such, automation of Power BI resources needs to be considered as part of a wider DevOps strategy. This post describes the specific steps needed in order to fully automate the creation and security of Power BI workspaces using PowerShell and Azure DevOps pipelines.


Here at endjin we’ve done a lot of work around data analysis and ETL. As part of this we have done some work with Databricks Notebooks on Microsoft Azure. Notebooks can be used for complex and powerful data analysis using Spark. Spark is a “unified analytics engine for big data and machine learning”. It allows you to run data analysis workloads, and can be accessed via many APIs. This means that you can build up data processes and models using a language you feel comfortable with. They can also be run as an activity in an ADF pipeline, and combined with Mapping Data Flows to build up a complex ETL process which can be run via ADF.


Endjin is a Snowflake Partner

by Howard van Rooijen

I’m very pleased to announce that endjin has become a Snowflake partner. This fantastic “designed for the cloud” data platform redefines what a data warehouse can be in the age of cloud. With features such as data sharing, usage based billing, and availability on Microsoft Azure, it has won our hearts. Over the last three years, we’ve […]


Mapping Data Flows are a relatively new feature of ADF. They allow you to visually build up complex data transformation sequences. This can aid in the streamlining of data manipulation and ETL processes, without the need to write any code! This post gives a brief introduction to the technology, and what this could enable!


In the last post I explained how to create a set of Azure Functions that could load data into Snowflake as well as execute Snowflake queries and export the results into your favourite cloud storage solution. In this post I will show how we can use these functions in Azure Data Factory to plug Snowflake […]


If, like me, you are a fan of Azure Data Factory and love Snowflake then you are probably disappointed that there isn’t a native Data Factory connector for Snowflake. While we wait for an official connector from Microsoft we have no alternative but to roll our own. In this blog post I will walk you through […]


Enforce resource tagging with Azure Policy

by Mike Larah

This blog post details how we used Azure Policy to ensure Azure resources were tagged with appropriate tags, and that tags were inherited from parent resource groups where possible.


This post walks through the fix for DLL locking errors when trying to deploy an Azure Function. The solution was to switch over to the new “deploy from package” option when deploying the functions. This fixes the file locking problem because instead of deploying the DLLs, the function will run from a package file added to its directory.