Endjin - Home

Big Data

Integrating Azure Analysis Services into custom applications means more than just querying the data. By surfacing the metadata in your models, you can build dynamic and customisable UIs and APIs, tailored to the needs of the client application. This post explains how easy it is to query model metadata from .NET, so you can create deeper integrations between your data insights and your custom applications.

One of the first steps in integrating Azure Analysis Services into your applications is creating and opening a connection to the server – just like any other database technology. This post explains the ins and outs of creating Azure Analysis Services connections, including code samples for each of the key scenarios. 

With a variety of integration support through client SDKs, PowerShell cmdlets and REST APIs, it can be hard to know where to start with integrating Azure Analysis Services into your custom applications. This posts walks through the options, and lays out a simple guide to choosing the right framework.

We’ve done a lot of work at endjin with Azure Analysis Services over the last couple of years – but none of it has been what you’d call “traditional BI”. We’ve pulled, twisted and bent it in all sorts of directions, using it’s raw analytical processing power to underpin bespoke analysis products and processes. This post explains some of the common (and not-so-common) reasons why you might want to do similar things, and how Azure Analysis Services might be the key to unlocking your data insights.

AI for Good Hackathon

by Ian Griffiths

Towards the end of last year, Microsoft invited endjin along to a hackathon session they hosted at the IET in London as part of their AI for Good initiative. I’ve been thinking about the event and the broader work Microsoft is doing here a lot lately, because it gets to the heart of what I love about working in this industry: computers can magnify our power to do to good.

In this blog from the Azure Advent Calendar 2019 we discuss building a secure data solution using Azure Data Lake. Data Lake has many features which enable fine grained security and data separation. It is also built on Azure Storage which enables us to take advantage of all of those features and means that ADLS is still a cost effective storage option!

This post runs through some of the great features of ADLS and runs through an example of how we build our solutions using this technology!

Very excited to be speaking at NDC in London in January! The talk is focused on “Combatting illegal fishing with Machine Learning and Azure” and will focus on the recent work we did with OceanMind. OceanMind are a not-for-profit who are working on cleaning up the world’s oceans with the help of Microsoft’s cloud technologies. […]

C#, Span and async

by Ian Griffiths

The addition of ref struct types, most notably Span, opened C# to a range of high performance scenarios that were impractical to tackle with earlier versions of the language. However, they introduce some challenges. For example, they do not mix very well with async methods. This article shows some techniques for mitigating this.

We worked on a project recently which required us to build a highly performant system for processing vast quantities of messages in real time. We had made the decision to run this processing using Azure Functions with C#. This post runs through some of the techniques we used for writing highly performant, low allocation code, including data streaming, list preallocation and the relatively new C# feature: Span.

Import and export notebooks in Databricks

by Ed Freeman

Sometimes we need to import and export notebooks from a Databricks workspace. This might be because you have a bunch of generic notebooks that can be useful across numerous workspaces, or it could be that you’re having to delete your current workspace for some reason and therefore need to transfer content over to a new […]

Machine learning often seems like a black box. This post walks through what’s actually happening under the covers, in an attempt to de-mystify the process!

Neural networks are built up of neurons. In a shallow neural network we have an input layer, a “hidden” layer of neurons, and an output layer. For deep learning, there is simply more hidden layers which allows for combining neuron’s inputs and outputs to build up a more detailed picture.

If you have an interest in Machine Learning and what is really happening, definitely give this a read (WARNING: Some algebra ahead…)!

Have you been trying to create a Databricks cluster using the CLI? Have you been getting infuriated by something seemingly so trivial? Well, join the club. Although, get ready to depart the club because I may have the solution you need. When creating a cluster using the CLI command databricks clusters create, you’re required to […]

Here at endjin we’ve done a lot of work around data analysis and ETL. As part of this we have done some work with Databricks Notebooks on Microsoft Azure. Notebooks can be used for complex and powerful data analysis using Spark. Spark is a “unified analytics engine for big data and machine learning”. It allows you to run data analysis workloads, and can be accessed via many APIs. This means that you can build up data processes and models using a language you feel comfortable with. They can also be run as an activity in a ADF pipeline, and combined with Mapping Data Flows to build up a complex ETL process which can be run via ADF.

Endjin is a Snowflake Partner

by Howard van Rooijen

I’ve very pleased to announce that endjin has become a Snowflake partner. This fantastic “designed for the cloud” data platform redefines what a data warehouse can be in the age of cloud. With features such as data sharing, usage based billing, and availability on Microsoft Azure, it has won our hearts. Over the last three years, we’ve […]

Mapping Data Flows are a relatively new feature of ADF. They allow you to visually build up complex data transformation sequences. This can aid in the streamlining of data manipulation and ETL processes, without the need to write any code! This post gives a brief introduction to the technology, and what this could enable!

In the last post I explained how to create a set of Azure Functions that could load data into Snowflake as well as execute Snowflake queries and export the results into your favorite cloud storage solution. In this post I will show how we can use these functions in Azure Data Factory to plug Snowflake […]

If, like me, you are a fan of Azure Data Factory and love Snowflake then you are probably disappointed that there isn’t a native Data Factory connector for Snowflake. While we wait for an official connector from Microsoft we have no alternative but to roll our own. In this blog post I will walk you through […]

When he joined endjin, Technical Fellow Ian sat down with founder Howard for a Q&A session. This was originally published on LinkedIn in 5 parts, but is republished here, in full. Ian talks about his path into computing, some highlights of his career, the evolution of the .NET ecosystem, AI, and the software engineering life.

We’re currently building a Data Governance Platform product that enables UK Financial Services organisations to discover and manage the life-cycle, usage, risk and compliance requirements of data assets across the organisation. Much of the core functionality is delivered using Cosmos DB’s Gremlin API to model data lineage and other relationships best represented by a graph […]

Overflowing with dataflow part 1: An overview

by Carmel Eve

This is the first blog in a series about dataflow. The series focuses on TPL dataflow, but this post gives an overview of dataflow as a whole.

The crucial thing to understand when using dataflow is that the data is in control. In most conventional programming languages, the programmer determines how and when the code will run. In dataflow, it is the data that drives how the program executes. The movement of data controls the flow of the program.