Batch Computing on Azure
Want to feel like a Jedi master without the years of training? Get a scaled, multi-machine, multi-core parallel processing job spooled up in Azure Batch. The computing power at your fingertips is
Plenty of scenarios call for batch computing: image processing, file ETL/ELT, risk modeling, payroll processing, and so on. The basic idea of Batch is that we occasionally need to spool up a lot of machines, but we don’t want to keep them running when we don’t need them. Idle machines cost money, so Azure lets us auto-provision and deprovision VMs as they are needed. A compute job might take days, or it might last only a few minutes.
How Batch Works
There is typically a head controller server running an application that orchestrates the Batch run. It does things like spool up the VMs to run as worker nodes, install the application the worker nodes will run, and deprovision the machines when they are no longer needed.
Then there is a worker application. This is the application that does the processing in parallel with other worker machines. Typically, this application runs in isolation and does not communicate with other nodes in the cluster, although there are some complex use cases for that. Often this application will write output files or other data into Azure Storage (many options here) for use in later processing.
For example, I might spool up a bunch of workers to perform file processing. The controller queues up jobs for the workers and they write their output to an instance of the document-based database Cosmos DB. The documents may then get picked up later for further processing in HDInsight pipelines or other applications.
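To make the controller and worker split concrete, here is a minimal local sketch in Python. It does not call Azure at all: a thread pool stands in for the worker nodes, a plain dictionary stands in for the output store (Azure Storage or Cosmos DB in the example above), and the names process_file and run_batch are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def process_file(name: str, text: str) -> dict:
    """Worker: handle one input in isolation and return an output document."""
    words = text.split()
    return {
        "id": name,
        "word_count": len(words),
        "longest": max(words, key=len, default=""),
    }


def run_batch(inputs: dict, worker_count: int = 4) -> dict:
    """Controller: fan the inputs out to workers, then collect their output."""
    output_store = {}  # hypothetical stand-in for Azure Storage or Cosmos DB
    with ThreadPoolExecutor(max_workers=worker_count) as workers:
        # Queue one unit of work per input; workers never talk to each other.
        futures = [workers.submit(process_file, n, t) for n, t in inputs.items()]
        for future in futures:
            doc = future.result()
            output_store[doc["id"]] = doc
    # Leaving the "with" block shuts the workers down, loosely mirroring
    # the controller deprovisioning nodes once the run is over.
    return output_store


sample_inputs = {
    "a.txt": "batch jobs run in parallel",
    "b.txt": "workers do not talk to each other",
}
results = run_batch(sample_inputs)
```

The shape is the point here: the controller owns the queue and the lifecycle of the workers, and each worker only ever sees its own input.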
I can also write code that just runs from my laptop, making my laptop the controller node. Code you can try yourself is in this GitHub repo. It’s a big repo and the .NET solution is here, but other languages are supported. The Azure SDKs simply invoke Azure REST APIs behind the scenes, which makes it very easy for Microsoft to create SDKs for many languages.
In the model below, we can see the logical programming model for Batch. You have Pools, which spool up the worker nodes and hold the applications that will run on those worker nodes. Within Pools we have Jobs, and there may be more than one per Pool. Inside Jobs we have Tasks, which are the actual invocations of the application on the worker nodes. It’s really a pretty simple model and not hard to configure via the Azure Portal or through the Batch SDK.
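The Pool, Job, and Task hierarchy can be sketched with a few plain Python dataclasses. This is only a mental model, not the Batch SDK: the class names, fields, IDs, and command lines below are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """One invocation of the worker application."""
    task_id: str
    command_line: str  # hypothetical command a worker node would run


@dataclass
class Job:
    """A named group of tasks."""
    job_id: str
    tasks: list = field(default_factory=list)

    def add_task(self, task_id: str, command_line: str) -> Task:
        task = Task(task_id, command_line)
        self.tasks.append(task)
        return task


@dataclass
class Pool:
    """The worker nodes, plus the jobs that run on them."""
    pool_id: str
    node_count: int
    jobs: list = field(default_factory=list)

    def add_job(self, job_id: str) -> Job:
        job = Job(job_id)
        self.jobs.append(job)
        return job

    def all_tasks(self) -> list:
        """Flatten every task from every job in this pool."""
        return [task for job in self.jobs for task in job.tasks]


pool = Pool(pool_id="demo-pool", node_count=3)

resize = pool.add_job("resize-images")
for i in range(4):
    resize.add_task(f"resize-{i}", f"python resize.py image-{i}.png")

report = pool.add_job("build-report")
report.add_task("report-0", "python report.py")

task_ids = [t.task_id for t in pool.all_tasks()]
```

One pool here carries two jobs, and each task is nothing more than a command line for some worker node to execute.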
Microsoft provides a free 30-day trial so you can try this out. You are limited to $150, so you probably don’t want to go nuts by spooling up 1000 core worker VMs. Just saying.
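A quick back-of-the-envelope calculation shows how fast a fixed credit disappears as the fleet grows. The $150 figure comes from the trial limit above; the hourly rate of $0.10 per VM is a made-up placeholder, not a real Azure price, so substitute the current rate for whatever VM size you choose.

```python
def affordable_hours(budget_usd: float, vm_count: int, hourly_rate_usd: float) -> float:
    """Hours a fleet can run before the budget is spent."""
    if vm_count <= 0 or hourly_rate_usd <= 0:
        raise ValueError("vm_count and hourly_rate_usd must be positive")
    return budget_usd / (vm_count * hourly_rate_usd)


TRIAL_BUDGET_USD = 150.0       # trial credit mentioned above
ASSUMED_RATE_USD = 0.10        # hypothetical per-VM hourly rate, not a real price

small_fleet_hours = affordable_hours(TRIAL_BUDGET_USD, 10, ASSUMED_RATE_USD)
large_fleet_hours = affordable_hours(TRIAL_BUDGET_USD, 1000, ASSUMED_RATE_USD)

print(f"10 VMs:   {small_fleet_hours:.1f} hours")
print(f"1000 VMs: {large_fleet_hours:.1f} hours")
```

Whatever the real rate turns out to be, the relationship is linear: a hundred times the VMs means one hundredth of the running time for the same credit.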
There is also a lot of good documentation here, which provides starting points for using the Azure CLI, the Azure Portal, .NET, or Python to get Batch working for you.