
Cloud

GitHub Actions is GitHub’s new CI/CD platform (currently in open beta, at the time of writing). It is comparable with Microsoft’s other CI/CD offering, Azure Pipelines, which forms part of the Azure DevOps suite. Being fairly well acquainted with Azure Pipelines, I found myself looking for comparisons when getting started with GitHub Actions. It became […]


Long Running Functions in Azure Data Factory

by Jess Panni

Azure Functions are powerful and convenient extension points for your Azure Data Factory pipelines. Put your custom processing logic behind an HTTP-triggered Azure Function and you are good to go. Unfortunately, many people read the Azure documentation and assume they can merrily run a Function for up to 10 minutes on a consumption plan […]
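For a flavour of the extension point being described, here is a minimal sketch of an HTTP-triggered Azure Function (Python programming model) of the kind a Data Factory pipeline can call; the payload shape and names are illustrative, not taken from the post, which goes on to cover what to do when the work outlasts the plan's timeout.

```python
# A minimal, illustrative HTTP-triggered Azure Function that ADF could call.
# The "dataset" payload field is a made-up example, not the post's API.
import json
import logging

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    """Accept a processing request and return the result as JSON."""
    payload = req.get_json()
    logging.info("Processing request for %s", payload.get("dataset"))

    # ... custom processing logic would go here ...
    result = {"dataset": payload.get("dataset"), "status": "processed"}

    return func.HttpResponse(
        json.dumps(result),
        status_code=200,
        mimetype="application/json",
    )
```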


How Azure DevTestLabs is helping me climb Everest

by Carmel Eve

So, the title may be somewhat misleading… But I’ve got you here now… For those who don’t know, endjin is a fully remote company. This is brilliant in a lot of ways (which I won’t list now), but one of the main bonuses is that it gives us the freedom to work from wherever we happen […]


Over the past few months I’ve worked on multiple projects involving reactive processing of large amounts of data. In a world where the volume of data is increasing at an almost inconceivable rate, being able to process this data efficiently and cheaply is vital. A large part of the work involved in these projects was […]


Running Azure functions in Docker on a Raspberry Pi 4

by Jonathan George

At our endjin team meet-up this week, we were all presented with Raspberry Pi 4Bs and told to go away and think of something good to do with them. I first bought a Raspberry Pi back in 2012 and have to admit that, beyond installing XBMC and playing around with it, I haven’t done a […]


Import and export notebooks in Databricks

by Ed Freeman

Sometimes we need to import and export notebooks from a Databricks workspace. This might be because you have a bunch of generic notebooks that can be useful across numerous workspaces, or it could be that you’re having to delete your current workspace for some reason and therefore need to transfer content over to a new […]
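The Databricks CLI wraps the Workspace REST API, so one way to picture the export/import round trip is a short script against those endpoints. This is a rough sketch only; hostnames, tokens and notebook paths below are placeholders, and the post itself covers the full details.

```python
# Move a single notebook between two workspaces using the Workspace REST API.
# All hosts, tokens and paths are placeholders.
import requests

SOURCE = {"host": "https://adb-1111111111111111.1.azuredatabricks.net", "token": "<source-pat>"}
TARGET = {"host": "https://adb-2222222222222222.2.azuredatabricks.net", "token": "<target-pat>"}


def export_notebook(ws: dict, path: str) -> str:
    """Export a notebook as base64-encoded source from one workspace."""
    resp = requests.get(
        f"{ws['host']}/api/2.0/workspace/export",
        headers={"Authorization": f"Bearer {ws['token']}"},
        params={"path": path, "format": "SOURCE"},
    )
    resp.raise_for_status()
    return resp.json()["content"]


def import_notebook(ws: dict, path: str, content: str) -> None:
    """Import the base64 content into another workspace, overwriting if present."""
    resp = requests.post(
        f"{ws['host']}/api/2.0/workspace/import",
        headers={"Authorization": f"Bearer {ws['token']}"},
        json={"path": path, "format": "SOURCE", "language": "PYTHON",
              "content": content, "overwrite": True},
    )
    resp.raise_for_status()


content = export_notebook(SOURCE, "/Shared/etl/clean_data")
import_notebook(TARGET, "/Shared/etl/clean_data", content)
```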


A couple of weeks ago I had the opportunity to attend NDC in Oslo. It was an absolutely brilliant experience, and my head is still reeling a bit from everything I learnt! The focus of a lot of the talks was around neural networks and machine learning – something which we have explored quite a […]


Have you been trying to create a Databricks cluster using the CLI? Have you been getting infuriated by something seemingly so trivial? Well, join the club. Although, get ready to depart the club because I may have the solution you need. When creating a cluster using the CLI command databricks clusters create, you’re required to […]
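For context, `databricks clusters create` expects a JSON cluster specification, and the same specification can be POSTed directly to the Clusters REST API, which sidesteps shell-quoting problems. A hedged sketch follows; the workspace URL, token and cluster values are placeholders rather than anything from the post.

```python
# Create a cluster by POSTing the spec to the Clusters REST API.
# Host, token and cluster settings are placeholders.
import requests

HOST = "https://adb-1111111111111111.1.azuredatabricks.net"
TOKEN = "<personal-access-token>"

cluster_spec = {
    "cluster_name": "demo-cluster",
    "spark_version": "5.5.x-scala2.11",   # pick a runtime version listed by the workspace
    "node_type_id": "Standard_DS3_v2",
    "num_workers": 2,
    "autotermination_minutes": 30,
}

resp = requests.post(
    f"{HOST}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {TOKEN}"},
    json=cluster_spec,
)
resp.raise_for_status()
print(resp.json()["cluster_id"])
```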


Here at endjin we spend a lot of time working with data, and securing that data is at the top of our list of priorities. Therefore, anything we can do to reduce the need for storing access keys is a huge win! (Here is a guest blog from Barry Smart at Hymans Robertson which details our […]
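One common way to avoid storing access keys is to authenticate with Azure AD (for example, a managed identity) instead of a shared key. The sketch below shows that general idea against Blob Storage; it is not necessarily the approach the post describes, and the storage account name is a placeholder.

```python
# Authenticate with Azure AD rather than a storage access key.
# The account URL is a placeholder.
from azure.identity import DefaultAzureCredential
from azure.storage.blob import BlobServiceClient

credential = DefaultAzureCredential()  # picks up managed identity, env vars, or az login
blob_service = BlobServiceClient(
    account_url="https://examplestorage.blob.core.windows.net",
    credential=credential,
)

for container in blob_service.list_containers():
    print(container.name)
```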


I recently wrote a blog on using ADF Mapping Data Flow for data manipulation. As part of the same project, we also ported some of an existing ETL Jupyter notebook, written using the Python Pandas library, into a Databricks Notebook. This notebook could then be run as an activity in an ADF pipeline, and combined […]
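To give a sense of the kind of translation involved when porting a Pandas ETL step into a Databricks (PySpark) notebook, here is a small, made-up example; the column names and paths are illustrative, not from the project in question.

```python
# Pandas version (original notebook):
import pandas as pd

df = pd.read_csv("/dbfs/mnt/raw/sales.csv")
summary = df[df["amount"] > 0].groupby("region")["amount"].sum().reset_index()

# PySpark version (Databricks notebook, where `spark` is provided by the runtime):
from pyspark.sql import functions as F

sdf = spark.read.csv("/mnt/raw/sales.csv", header=True, inferSchema=True)
summary_sdf = (sdf.filter(F.col("amount") > 0)
                  .groupBy("region")
                  .agg(F.sum("amount").alias("amount")))
```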


Endjin is a Snowflake Partner

by Howard van Rooijen

I’m very pleased to announce that endjin has become a Snowflake partner. This fantastic “designed for the cloud” data platform redefines what a data warehouse can be in the age of cloud. With features such as data sharing, usage-based billing, and availability on Microsoft Azure, it has won our hearts. Over the last three years, we’ve […]


As part of a recent project we did a lot of experimentation with the new Azure Data Factory feature: Mapping Data Flows. The tool is still in preview, and more functionality is sure to be in the pipeline, but I think it opens up a lot of really exciting possibilities for visualising and building up […]


In the last post I explained how to create a set of Azure Functions that could load data into Snowflake as well as execute Snowflake queries and export the results into your favorite cloud storage solution. In this post I will show how we can use these functions in Azure Data Factory to plug Snowflake […]


If, like me, you are a fan of Azure Data Factory and love Snowflake then you are probably disappointed that there isn’t a native Data Factory connector for Snowflake. While we wait for an official connector from Microsoft we have no alternative but to roll our own. In this blog post I will walk you through […]
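The building block behind a hand-rolled connector is simply talking to Snowflake from code. Here is a minimal sketch using the official snowflake-connector-python package, the sort of call a custom Azure Function could wrap; the account, credentials, stage and table names are placeholders.

```python
# Run a COPY INTO and a follow-up query against Snowflake.
# Connection values, stage and table names are placeholders.
import snowflake.connector

conn = snowflake.connector.connect(
    account="xy12345.west-europe.azure",   # Snowflake account identifier
    user="etl_user",
    password="<secret>",
    warehouse="LOAD_WH",
    database="ANALYTICS",
    schema="STAGING",
)

try:
    cur = conn.cursor()
    cur.execute("COPY INTO staging.sales FROM @my_stage/sales/")
    cur.execute("SELECT COUNT(*) FROM staging.sales")
    print(cur.fetchone()[0])
finally:
    conn.close()
```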


Enforce resource tagging with Azure Policy

by Mike Larah

We recently had a requirement from a client that all of their Azure resources must be tagged with a specific set of tags, which were ultimately to be used for cost accounting when the bill came rolling in. For the sake of simplicity in this blog post, let’s assume the client just required that all resources had to […]
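The heart of such a policy is a small JSON rule. As a sketch, the rule below denies any resource that is missing a "costCentre" tag; the tag name is just an example, not the client's actual set, and the rule would sit in the policyRule section of a policy definition assigned to a subscription or resource group.

```python
# An illustrative Azure Policy rule, expressed as a Python dict and printed as JSON.
# "costCentre" is an example tag name only.
import json

policy_rule = {
    "if": {
        "field": "tags['costCentre']",
        "exists": "false",
    },
    "then": {
        "effect": "deny",
    },
}

print(json.dumps(policy_rule, indent=2))
```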


We’ve had ongoing issues with DLLs being locked during deployment when deploying web apps and function apps. The specific case I’m going to talk about focuses on Azure Functions, but you can also run Web Apps from a package (though the Azure Pipelines tooling currently only works for functions, so you would need to […]


We were recently looking for a way to run a script on an Azure Virtual Machine that already existed (i.e. not executing it at provisioning time). Whilst there are ways to do this remotely (using PowerShell remoting, for example), these tend to require updating the VM’s networking configuration to open up ports or allow traffic […]
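One option in this space is the VM Run Command feature, which goes through the Azure management plane rather than the VM's own network, so no ports need opening. Whether this matches the post's eventual approach is an assumption on my part; the sketch below uses the azure-mgmt-compute SDK and all resource names are placeholders.

```python
# Run a script on an existing VM via the Run Command feature.
# Subscription, resource group and VM names are placeholders.
from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient
from azure.mgmt.compute.models import RunCommandInput

subscription_id = "<subscription-id>"
client = ComputeManagementClient(DefaultAzureCredential(), subscription_id)

poller = client.virtual_machines.begin_run_command(
    resource_group_name="my-rg",
    vm_name="my-vm",
    parameters=RunCommandInput(
        command_id="RunPowerShellScript",   # use "RunShellScript" for Linux VMs
        script=["Get-Service | Where-Object Status -eq 'Running'"],
    ),
)
result = poller.result()
print(result.value[0].message)   # script output
```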


So, another year, another random blog topic change! This time we’ve left the world of Rx, and done a hop, skip and leap into Azure! Specifically, Azure AD, permissions and all things service principal. As part of a recent project we needed an Azure Functions App to have access to various Azure resources, including CosmosDB […]
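As a generic illustration of the service principal flow: an application authenticates with its own identity and is granted access to a resource, instead of shipping keys around in config. The sketch below uses Key Vault purely as an example target; the tenant/client IDs, secret and vault URL are placeholders, and this is a common pattern rather than necessarily the one the post lands on.

```python
# Authenticate as a service principal and read a secret from Key Vault.
# All identifiers and URLs are placeholders.
from azure.identity import ClientSecretCredential
from azure.keyvault.secrets import SecretClient

credential = ClientSecretCredential(
    tenant_id="<tenant-id>",
    client_id="<app-registration-client-id>",
    client_secret="<client-secret>",
)

secrets = SecretClient(vault_url="https://example-vault.vault.azure.net", credential=credential)
cosmos_key = secrets.get_secret("cosmosdb-primary-key").value
print("Retrieved Cosmos DB key of length", len(cosmos_key))
```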


In September I joined endjin as a Technical Fellow (an entirely new branch in endjin’s career pathway to accommodate me – more on that later). I’ve been involved with endjin since 2011, as an Associate, helping to deliver some of our most technically challenging projects (and if you go even further back, I attended Cambridge University with endjin co-founder […]


We’re currently building a Data Governance Platform product that enables UK Financial Services organisations to discover and manage the life-cycle, usage, risk and compliance requirements of data assets across the organisation. Much of the core functionality is delivered using Cosmos DB’s Gremlin API to model data lineage and other relationships best represented by a graph […]
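To make the graph idea concrete, here is a minimal sketch of talking to a Cosmos DB Gremlin endpoint from Python with the gremlinpython driver, adding two dataset vertices and a lineage edge. The vertex labels, edge label, partition key property and connection details are illustrative, not the product's actual model.

```python
# Connect to a Cosmos DB Gremlin graph and add a simple lineage relationship.
# Account, database, graph, key, labels and the "pk" property are placeholders.
from gremlin_python.driver import client, serializer

gremlin_client = client.Client(
    "wss://<account>.gremlin.cosmos.azure.com:443/",
    "g",
    username="/dbs/<database>/colls/<graph>",
    password="<primary-key>",
    message_serializer=serializer.GraphSONSerializersV2d0(),
)

# Two datasets and a "feeds" edge between them.
gremlin_client.submit(
    "g.addV('dataset').property('id', 'raw-sales').property('pk', 'raw-sales')"
).all().result()
gremlin_client.submit(
    "g.addV('dataset').property('id', 'clean-sales').property('pk', 'clean-sales')"
).all().result()
gremlin_client.submit(
    "g.V('raw-sales').addE('feeds').to(g.V('clean-sales'))"
).all().result()

# Walk the lineage: what does raw-sales feed?
downstream = gremlin_client.submit("g.V('raw-sales').out('feeds').values('id')").all().result()
print(downstream)
```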

