Floating Over Redmond

July 19, 2016

A Blue Sky Strategy

The stars have aligned such that I find myself driven back to the cold embrace of the spawn of Gates and Allen.

Or a more positive perspective: I've been afforded the opportunity to expand my knowledge of IaaS cloud offerings beyond my native Amazon Web Services. For the last couple of weeks I have been (and for the foreseeable future, will be) stumbling through the thick fog of Microsoft Azure.

Seriously though, I've been impressed. Although there are many capabilities that exist in AWS that don't yet have parallels in Azure (e.g., security groups composed of instances rather than subnets, and the ability to refer to SGs from other SGs); there are others where Azure seems to be ahead of the curve (e.g., the ability to seamlessly deploy agents from the Azure platform - even on non-Windows instances - that perform syslog collection and security scanning natively to designated storage containers, and stateful network security groups that may be applied to both subnets and instance interfaces). I'm learning that although my prejudices against Azure may have been justified at one time ("Windows Azure"), the platform has grown more sophisticated than I expected.

Getting Started

Signing up for Azure was uneventful. As a long-time Xbox-owner and Xbox Live subscriber, I already had a Microsoft account. That account was quickly usable to access Azure services and sign up for the 30-day, $200 introductory credit.

Microsoft seems to be transitioning from a "classic" portal design to one based around Azure Resource Manager. This was disorienting for me; some functionality is not yet available in the new interface and deciphering the scores of service-offerings (similar to what you might see in Amazon's web services console) spread across disparate interfaces is a bit overwhelming. I knew that I didn't want to mess with the Azure Service Manager (legacy/classic/whatever) interface at all, but without context it was a little challenging to determine what was what. And the web interface itself is sometimes painfully slow, particularly when switching between the classic and updated interfaces. For the uninitiated, this is what the latest console looks like:

Regardless, I prefer using the CLI. Happily, azure-cli is available in brew:

brew install azure-cli

Issuing azure login and then azure config mode arm will get you going. From there, the CLI is pretty obliging. Executing a partial command will give you a list of available completions and details:

> azure vm
help:    Commands to manage your virtual machines
help:    The operation to redeploy a virtual machine.
help:      vm redeploy [options] <resource-group> <vm-name>
help:    Create a virtual machine in a resource group
help:      vm create [options] <resource-group> <name> <location> <os-type>
help:    Create a virtual machine with default resources in a resource group
help:      vm quick-create [options] <resource-group> <name> <location> <os-type> <image-urn> <admin-username> <admin-password>
help:    Get all virtual machines in a resource group
help:      vm list [options] <resource-group>
. . .

I've found few actions that I could not complete from the CLI. There was one command that tried to overload -f with multiple functions and caused problems - but using the longer option (--address-prefix IIRC) worked fine. Also, adding DNS records doesn't appear to work right now: zone and record-set creation work, but assigning the A/AAAA/CNAME/MX/TXT value results in an API error.

Network Security Groups

In comparison to AWS, Azure seems to be much more tightly constrained to traditional data center networking concepts of networks and subnets. While AWS instances may be added to a Security Group on spin-up and that SG may be configured with access to other instances, SGs or networks; Azure NSGs only understand subnets. Also, whereas AWS SGs can now control any IP protocol, Azure NSGs also only support TCP and UDP controls.

Azure NSGs include a few default rules that allow access from the VNET as a whole (i.e., all local subnets and VPN-connected networks) and any defined Azure Load Balancers before denying all other access. Rules within an NSG are numbered by priority (a unique value) and the default rules may be overridden by a higher priority (lower cardinal). While the default rules start at 65000, the lowest priority that may be set by the user is 4096. So if you want to override the default rules completely, a "deny all" at a priority of 4096 would get the job done.

Based on my observations, it appears that applying an NSG to a subnet can affect the flow of data between adjacent systems on that subnet. For example, if I have a system at and another at, applying an NSG at that prohibits connections from to will be effective. That's not something you would see in a traditional data center and is likely an effect of applying that control in the Azure network fabric shared between all nodes, rather than at a distinct gateway.

azure network nsg show jdt presentation-security
info:    Executing command network nsg show
+ Looking up the network security group "presentation-security"
data:    Id                              : /subscriptions/.../providers/Microsoft.Network/networkSecurityGroups/presentation-security
data:    Name                            : presentation-security
data:    Type                            : Microsoft.Network/networkSecurityGroups
data:    Location                        : westus
data:    Provisioning state              : Succeeded
data:    Security rules:
data:    Name                           Source IP          Source Port  Destination IP  Destination Port  Protocol  Direction  Access  Priority
data:    -----------------------------  -----------------  -----------  --------------  ----------------  --------  ---------  ------  --------
data:    AllowHTTP                      Internet           *            *               80                Tcp       Inbound    Allow   110
data:    AllowHTTPS                     Internet           *            *               443               Tcp       Inbound    Allow   120
data:    AllowVnetInBound               VirtualNetwork     *            VirtualNetwork  *                 *         Inbound    Allow   65000
data:    AllowAzureLoadBalancerInBound  AzureLoadBalancer  *            *               *                 *         Inbound    Allow   65001
data:    DenyAllInBound                 *                  *            *               *                 *         Inbound    Deny    65500
data:    AllowVnetOutBound              VirtualNetwork     *            VirtualNetwork  *                 *         Outbound   Allow   65000
data:    AllowInternetOutBound          *                  *            Internet        *                 *         Outbound   Allow   65001
data:    DenyAllOutBound                *                  *            *               *                 *         Outbound   Deny    65500
info:    network nsg show command OK

There are separate rule-sets for Inbound and Outbound connections; both are stateful and defined in terms of connection initiation.

I'm sure you'll notice that I chose not to override the default Inbound rules above. I would prefer to do so, but I encountered some very bizarre behavior (interactions with the local file-system on the VM started sputtering and blocking, preventing TLS certificates from being read to accommodate database connections) that causes me to think that adding that type of restriction may block some vital internal communication. I haven't identified the source of that problem yet.

Continuous Deployment

To customize my deployments, I choose to use packer and chef-solo for provisioning. Packer supports Azure by default and can be installed on a Mac OS X workstation using brew install packer. To make things operational, you'll need to identify a few authentication and identification strings available within the azure-cli interface.

ARM_TENANT_ID=`azure account show --json | jq ".[] | .tenantId" | tr -d '"'`
ARM_SUBSCRIPTION_ID=`azure account show --json | jq ".[] | .id" | tr -d '"'`

To create a client ID for packer, execute commands like these:

azure ad app create -n packer -i http://packer --home-page http://packer/home -p <your@wesom3passw0rd>
ARM_CLIENT_ID=`azure ad app list --json | jq '.[] | select(.displayName | contains("packer")) | .appId' | tr -d '"'`
azure ad sp create --applicationId $ARM_CLIENT_ID
azure role assignment create --spn http://packer -o "Owner" -c /subscriptions/$ARM_SUBSCRIPTION_ID

You can set your resource group to ARM_RESOURCE_GROUP and your storage account (I don't recall what step that's created in) to ARM_STORAGE_ACCOUNT.

Packer Template

With all that in place, I've been using this packer template to kick off my Azure builds. Obviously my paths are my own and my cookbook structure (e.g., my common role of jdt) is unique to my networks.

    "variables": {
        "client_id": "{{env `ARM_CLIENT_ID`}}",
        "client_secret": "{{env `ARM_CLIENT_SECRET`}}",
        "resource_group": "{{env `ARM_RESOURCE_GROUP`}}",
        "storage_account": "{{env `ARM_STORAGE_ACCOUNT`}}",
        "subscription_id": "{{env `ARM_SUBSCRIPTION_ID`}}",
        "tenant_id": "{{env `ARM_TENANT_ID`}}",
        "recipe": "{{env `RECIPE`}}",
        "chef_json": "{{env `CHEF_JSON`}}",
        "type": "{{env `TYPE`}}"
    "builders": [
            "type": "azure-arm",
            "client_id": "{{user `client_id`}}",
            "client_secret": "{{user `client_secret`}}",
            "resource_group_name": "{{user `resource_group`}}",
            "storage_account": "{{user `storage_account`}}",
            "subscription_id": "{{user `subscription_id`}}",
            "tenant_id": "{{user `tenant_id`}}",

            "capture_container_name": "images",
            "capture_name_prefix": "packer",

            "os_type": "Linux",
            "image_publisher": "Canonical",
            "image_offer": "UbuntuServer",
            "image_sku": "16.04.0-LTS",
            "location": "West US",
            "vm_size": "{{user `type`}}"
    "provisioners": [
            "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
            "inline": [
                "mkdir /data",
                "chown packer:packer /data",
                "apt-get update",
                "apt-get upgrade -y"
            "inline_shebang": "/bin/sh -x",
            "type": "shell"
            "type": "file",
            "source": "/vmstorage/{{user `recipe`}}/",
            "destination": "/data"
            "type": "file",
            "source": "{{user `chef_json`}}",
            "destination": "/data/attrs.json"
            "type": "chef-solo",
            "execute_command": "{{if .Sudo}}sudo {{end}}chef-solo --no-color -c {{.ConfigPath}} -j /data/attrs.json -o \"role[jdt],recipe[{{user `recipe`}}]\"",
            "cookbook_paths": ["/cookbooks"],
            "data_bags_path": "/cookbooks/data_bags",
            "roles_path": "/cookbooks/roles"
            "execute_command": "chmod +x {{ .Path }}; {{ .Vars }} sudo -E sh '{{ .Path }}'",
            "inline": [
                "chown root:root /data",
                "/usr/sbin/waagent -force -deprovision+user && export HISTSIZE=0 && sync"
            "inline_shebang": "/bin/sh -x",
            "type": "shell"

A typical build will be initiated like:

RECIPE=apache-proxy TYPE=Basic_A0 CHEF_JSON=./proxy.json packer build azure.json
  • RECIPE: The Chef recipe that builds this type of system.
  • TYPE: The instance type (Basic_A1, Standard_A1, etc.)
  • CHEF_JSON: A JSON file with parameters specific to this build (imported as Chef attributes).

Azure Deployment

The above packer output will end with a value for TemplateUriReadOnlySas. That value will be needed for this next step:

azure group deployment create -g jdt -n apache-proxy-deployment --template-uri <TEMPLATE_URI_SAS>

When that command begins, it will ask you for a vmName, adminUsername, adminPassword and networkInterfaceId. Oh yeah, you'll need to azure network nic create a virtual NIC to use here and have that ID (the long one that begins with /subscriptions/) ready.


If you've made it this far and have been following along, you'll have a 70% chance of having an instance running (there's a 30% chance I left out something crucial). I also spent some time setting up a site-to-site VPN to my home network from Azure and configuring the Security Center agent and extensions on my instances.

Regardless, hopefully you've gleaned some useful details from this post. If you have any questions, feel free to click over to About to find my contact information.