Big Compute

A couple of weeks ago I had the opportunity to attend NDC in Oslo. It was an absolutely brilliant experience, and my head is still reeling a bit from everything I learnt! The focus of a lot of the talks was around neural networks and machine learning – something which we have explored quite a […]


I recently wrote a blog on using ADF Mapping Data Flow for data manipulation. As part of the same project, we also ported some of an existing ETL Jupyter notebook, written using the Python Pandas library, into a Databricks Notebook. This notebook could then be run as an activity in an ADF pipeline, and combined […]


As part of a recent project we did a lot of experimentation with the new Azure Data Factory feature: Mapping Data Flows. The tool is still in preview, and more functionality is sure to be in the pipeline, but I think it opens up a lot of really exciting possibilities for visualising and building up […]


We’ve had ongoing issues when deploying web and functions apps involving the locking of DLLs during the deployment. The specific case I’m going to talk about focuses on Azure Functions, but you can also run Web Apps from a package (though the Azure Pipelines tooling currently only works for functions, so you would need to […]


In September I joined endjin as a Technical Fellow (an entirely new branch in endjin’s career pathway to accommodate me – more on that later). I’ve been involved with endjin since 2011, as an Associate, helping to deliver some of our most technically challenging projects (and if you go even further back, I attended Cambridge University with endjin co-founder […]


Using Python inside SQL Server

by Ed Freeman

Hello everyone. Before Christmas I played around with SQL Server 2017’s inline Python integration capability. This capability was announced early last year, with the corresponding integration with R already being possible for a number of months. The main benefits from this are the abilities to: Eliminate data movement (having to transfer data samples from a database to […]


How to plan your cloud transformation journey

by Howard van Rooijen

This week I received an email from someone who asked how they could use our free Thought Leadership content to help their organisation move to the cloud. I realised that although we’ve released a lot of content, we’d never talked publicly about the rationale behind it and how the pieces are all interconnected. Our Thought Leadership […]


Choosing the right cloud platform provider can be a daunting task. Take the big three, AWS, Azure, and Google Cloud Platform: each offers a huge number of products and services, but understanding how they enable your specific needs is not easy. Since most organisations plan to migrate existing applications, it is important to understand how […]


In this series, we’re comparing cloud services from AWS, Azure and Google Cloud Platform. A full breakdown and comparison of cloud providers and their services are available in this handy poster. We have assessed services across three typical migration strategies: Lift and shift – the cloud service can support running legacy systems with minimal change […]


We produced a booklet to coincide with our Future Decoded talk “The 100 Year Start-up: Embracing Disruption in Financial Services”, where we examine the challenges and opportunities in the Microsoft Cloud for the Financial Services Industry, covering the following topics: Security, Privacy & Data Sovereignty; Data Ingestion, Transformation & Enrichment; Big Compute; Big Data – […]


Azure Batch – Time is Money in Big Compute

by James Broome

Earlier in the year, endjin worked with the Azure Batch Product Team to run a series of experiments against the Azure Batch service using a framework we developed for performing scale, soak and performance tests. We’ve had conversations with a number of organisations over the last five years who have scaled their compute-intensive workloads (SAS, […]


A short while ago, I was trying to classify some data using Azure Machine Learning, but the training data was very imbalanced. In the attempt to build a useful model from this data, I came across the Synthetic Minority Oversampling Technique (SMOTE), an approach to dealing with imbalanced training data. This blog describes what I […]


Spinning up 16,000 A1 Virtual Machines on Azure Batch

by Howard van Rooijen

Big Compute, like Big Data, has a different meaning for every organisation; for Big Data this generally tends to be when data grows to a point where it can no longer be stored, queried, backed up, restored or processed easily on traditional database architectures. For Big Compute this tends to be when computation grows to […]