Naming Things is Hard: Spotlight Edition

Like most specialist industries, software is rife with mainstream English words that we’ve taken and misappropriated to mean something completely different. Show business is no different. The software team here at Spotlight sits smack-bang at the intersection of these two specialist fields, so when we’re talking to our customers and product owners about the systems we build, it’s very important to understand the difference between typecasting and type casting, and exactly what sort of actor model we’re talking about. We therefore present this delightful “double glossary” of everyday terms that you’ll hear here at Spotlight Towers. Because as we all know, there are only two hard problems in software: cache invalidation, naming things, and off-by-one errors.

Actor

Software: A mathematical model of concurrent computation that treats “actors” as the universal primitives of concurrent computation.

Showbiz: A person whose profession is acting on the stage, in films, or on television.

Agent

Software: A computer program that acts on behalf of a user or another program in a relationship of agency.

Showbiz: A person who finds jobs for actors, authors, film directors, musicians, models, professional athletes, writers, screenwriters, broadcast journalists, and other people in various entertainment or broadcast businesses.

Callback

Software: Any executable code that is passed as an argument to other code, which is expected to call back (execute) the argument at a given time.

Showbiz: A follow-up interview or audition

Casting

Software: Explicitly converting a variable from one type to another

Showbiz: Employing actors to play parts in a film, play or other production. Also the act of doing same.

Client

Software: The opposite of a server

Showbiz: An actor, specifically in the context of the actor’s relationship with their agent or manager. Internally at Spotlight, we also have both internal and external clients/customers.

Client Profile

Software: A subset of the .NET framework intended to run on mobile and low-powered devices

Showbiz: An actor’s professional CV, as it appears on their agent’s website or in various kinds of casting software and directories.

Double

Software: A primitive data type representing a double-precision floating-point number.

Showbiz: A performer who appears in place of another performer – a stunt double, for example.

Mirror

Software: A copy of a system that updates from the original in near to real time, often a database or file storage system

Showbiz: An optical device that helps a performer check they’ve applied their makeup correctly

Principal

Software: Used in database mirroring to refer to the primary instance of the database

Showbiz: A performer with lines.

Production

Software: The live infrastructure and code environment

Showbiz: A film, TV or stage show, such as a professional actor might list on their acting CV.

REST

Software: Representational State Transfer – an architectural style used when building hypermedia APIs

Showbiz: What actors do between jobs.

Script

Software: A computer program written in a scripting language

Showbiz: The written dialogue and directions for a play, film or show

“Sequel”

Software: The standard pronunciation of SQL, the database query language; also commonly used to refer to Microsoft’s SQL Server database product.

Showbiz: A published, broadcast, or recorded work that continues the story or develops the theme of an earlier one.

Server

Software: The opposite of a client

Showbiz: Someone working as waiting staff in a restaurant. Who is quite possibly an actor moonlighting as a server to pay the bills between acting jobs.

Spotlight

Software: The native MacOS search application

Showbiz: Our company – www.spotlight.com,  “The Home of Casting” – and the directories and services we have created since 1927. Not to be confused with the 20-odd different ‘Spotlight’ companies around the world, many of whose customers blindly email support@spotlight.com whenever they have a problem. 🙂

Staging

Software: A replica of a production hosting environment used to test new features and deployments.

Showbiz: The method of presenting a play or dramatic performance; also used to refer to the stage structure itself in theatre and live performance.

Spotlight at Monki Gras 2017

Monki Gras is RedMonk’s annual UK conference about Software, Craft and Experience. I was very happy that my application for their Diversity Scholarships Programme got accepted and I gladly attended the event on the 26th and 27th of January.

Besides providing a full two-day ticket, they also ran a Mentorship Programme for the scholars. Tracy Miranda (open source evangelist and veteran of the Eclipse community), Bryan Boreham (director of engineering at Weaveworks) and Rachel Stephens (associate analyst at RedMonk) made sure the scholars felt welcome, answered questions and facilitated networking.

[Photo: Monki Gras]

While some industry events have a very corporate feel about them these days, Monki Gras was like a breath of fresh air: less pitching and more content!

The theme for 2017 was “Packaging: convenience is the killer app for great developer and user experiences”, and the speakers included Arianna Aondio from Varnish Software Group, Gordon Haff from Red Hat, Alvaro Videla (a distributed systems engineer and former core developer for RabbitMQ) and Abby Kearns from the Cloud Foundry Foundation.

[Photo: the speaker panel]

My favourite talk was “Metaphors we compute by” by Alvaro Videla, who drew a very interesting connection between the worlds of linguistics and philosophy and the software industry. Communication is key even when you only ‘speak’ code!

The conference was held in quite an informal venue, The Bike Shed Motorcycle Club in Shoreditch, and the techies were kept focused with artisan coffee, tasty food and locally brewed beer.

Looking forward to their event next year!

Semantic Versioning with Powershell, TeamCity and GitHub

Here at Spotlight Towers, we’ve been using TeamCity as our main build server since version 6; it’s a fantastic tool and we love it dearly. It got even better a few years back when we paired it with the marvellous Octopus Deploy; TeamCity builds the code and creates a set of deployable packages known as Octopacks; Octopus deploys the packages, and everything works quite nicely. Well, almost everything. One of the few problems that TeamCity + Octopus doesn’t magically solve for us is versioning. In this post, we’re going to look at how we use Git and TeamCity to manage versioning for our individual packages.

If this sounds like your sort of thing, why not come and work for us? That’s right – Spotlight is hiring! We’re looking for developers, testers and a new UX/Web designer – check out jobs.spotlight.com and get in touch if you’re interested.

First, let’s establish some principles

  • We are going to respect the semantic versioning convention of MAJOR.MINOR.PATCH, as described at semver.org.
  • Major and minor versions will be incremented manually. We trust developers to know whether their latest commit should be a new major or minor release according to semantic versioning principles.
  • Building the same codebase from the same branch twice should produce the same semantic version number.
  • Packages created from the master branch are release packages.
  • Packages created from a merge head of an open pull request are pre-release packages.
  • Pre-release packages will use the version number that would be assigned if that branch was accepted for release at build time.

Now, here’s the part where we’re going to deviate from the semantic versioning specification, because our packages actually use a four-part version number. We want to include a build number in our package versions, but the official semver extension for doing this – MAJOR.MINOR.PATCH+BUILD – won’t work with NuGet, so we’re going to use a four-part version number MAJOR.MINOR.PATCH.BUILD. Pre-release packages will be appended with a suffix describing which branch they were built from – MAJOR.MINOR.PATCH.BUILD-BRANCH.

OK, here’s an illustrated example that demonstrates what we’re trying to achieve. Master branch is green. Two developers are working on feature branches – blue and red in this example. To create our pre-release builds, we’re using a little-documented but incredibly useful feature of GitHub known as ‘merge heads’. The idea is that if you have an open pull request, the merge head will give you a snapshot of the codebase that would be created by merging the open pull request into master – so you’re not just testing your new feature in isolation, you’re actually building and testing your new feature plus the current state of the master branch. There is one caveat to this, which I’ll explain below.

So, we’ve got TeamCity set up to build and publish packages every time there’s a commit to master or to the merge head of an open pull request, and we’re also occasionally triggering manual builds just to make sure everything’s hanging together properly. Here’s what happens: 

[Diagram: semantic merging across master and feature branches]

That line there that’s highlighted in yellow is a gotcha. At this point in our workflow, we’ve merged PR1 into our master branch, but because we haven’t pushed anything to the blue branch since this happened, the blue merge head is out of date. PR2 does NOT reflect the latest changes to master, and if we trigger a build manually, we’ll end up with a package that doesn’t actually reflect the latest state of the codebase. The workaround is pretty simple; if you’re creating pre-release builds from merge heads, never run these builds manually; make sure you always trigger the build by pushing a change to the branch.

Now let’s look at how we can get TeamCity to automatically calculate those semantic version numbers whenever a build is triggered. We’ll start with the major and minor version. We’re going to track these by creating a version.txt file in the root of the project codebase, which just contains the major and minor version numbers. If a developer decides that their feature branch represents a new major or minor version, it’s their responsibility to edit version.txt as part of implementing the feature. This also means that prerelease packages built from that branch will reflect the new version number, whilst the master branch will continue to use the old version until the branch gets merged, which I think is rather elegant.

For the patch version, we’re going to assume that every commit or merge to the master branch represents a new patch version, according to the following algorithm:

  • If the current version.txt represents a NEW major/minor version, the patch number is zero
  • Otherwise, the patch number is the patch number of the latest release, incremented by the number of commits to the master branch since that release.

So – how do we know how many commits there have been since the last release? First, each time we build a release branch, we’re going to use Git tags to tag the repository with the version number we’ve just built. TeamCity will do this for you automatically using a build feature called “VCS labeling”:

[Screenshot: the VCS labeling build feature in TeamCity]

Assuming every release has a corresponding tag, now we need to find the most recent release number, which we can do from the Git command line.

  git fetch --tags
  git tag --sort=v:refname

Git tags aren’t retrieved by default, so we need to explicitly fetch them before listing them. Then we list all the tags, specifying --sort=v:refname, which causes tag names to be treated as semantic versions when sorting. (Remember that semver sorting isn’t alphanumeric – in an alphanumeric sort, v9 is higher than v12.) Once we’ve got the latest tag, we need to count the number of revisions since that tag was created, which we can do using this syntax:

  git rev-list v1.2.3..HEAD --count

To use this in our TeamCity build, we’ll need to output the various different formats of that version so that TeamCity can use them. We want to do three things here:

  • Label the VCS root with the three-part semantic version number v1.2.3
  • Update the AssemblyInfo.cs files with the four-part version number 1.2.3.456 – note that we can’t put any prerelease suffix in the AssemblyInfo version.
  • Pass the full version – 1.2.3.456-pr789 – to Octopack when creating our deployable packages with Octopus.

I’ve wrapped the whole thing up in a Powershell script which runs as part of the TeamCity build process, which is on GitHub at https://gist.github.com/dylanbeattie/3a2fe5abca14600efee1c88009afc0f8
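
If you just want the shape of the thing, here’s a condensed sketch of the logic – the variable names, and the assumption that the TeamCity build counter arrives in a BUILD_COUNTER environment variable, are illustrative rather than lifted verbatim from the script:

# Sketch of the versioning logic - see versions.ps1 in the gist for the real thing.
# Assumes release tags look like "v1.2.3" and version.txt contains "1.2"-style major.minor.
git fetch --tags
$latestTag  = (git tag --sort=v:refname) | Select-Object -Last 1   # e.g. "v1.2.3"
$majorMinor = (Get-Content version.txt).Trim()                     # e.g. "1.2"

if ($latestTag -and $latestTag.TrimStart("v").StartsWith("$majorMinor.")) {
    # Same major.minor as the latest release: new patch = last patch + commits since that tag.
    $lastPatch    = [int]$latestTag.TrimStart("v").Split(".")[2]
    $commitsSince = [int](git rev-list "$latestTag..HEAD" --count)
    $patch = $lastPatch + $commitsSince
} else {
    # version.txt declares a new major/minor, so the patch number resets to zero.
    $patch = 0
}

$semVer      = "$majorMinor.$patch"      # e.g. "1.2.3"
$buildNumber = $env:BUILD_COUNTER        # assumption: handed in by TeamCity
$fullVersion = "$semVer.$buildNumber"    # e.g. "1.2.3.456"; merge-head builds get a "-pr789"-style suffix appended

# Hand the result back to TeamCity as the build number via a service message:
Write-Host "##teamcity[buildNumber '$fullVersion']"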

To use it in your project, add versions.ps1 to the root of your project repo; create a text file called version.txt which contains your major.minor version, and then add a TeamCity build step at the beginning of your build process that looks like this:

[Screenshot: the TeamCity build step configuration]

Finally, it’s worth mentioning that to use command-line git from Powershell, I had to set up TeamCity to use an SSH VCS root rather than HTTP, and install the appropriate SSH keys on the TeamCity build agent. I don’t know whether this is a genuine requirement or a quirk of our configuration; your mileage may vary. And I still find Powershell infuriatingly idiosyncratic, but hey – you probably knew that already. 🙂

Happy versioning! And like I said, if this sort of thing sounds like something you’d like to work on, awesome – we’re hiring! Check out jobs.spotlight.com for more details and get in touch if you’re interested.

IdentityServer, OpenID Connect and Microsoft CRM Portals

As readers of this blog will know, here at Spotlight we’re in the process of moving nine decades’ worth of legacy business process onto Microsoft Dynamics CRM, aka CRM Online, which I gather is now called Dynamics 365 (because hey, it’s not like naming things was hard enough already, right?)

We’re also investigating a couple of options for building customer-facing systems that integrate with Dynamics. Until last year, there were really three options for this – a product called Adxstudio, a free Microsoft component called the CRM Portal Accelerator, or rolling your own solution using the CRM SDK. Around this time last year, Microsoft quietly retired the Portal Accelerator component and acquired Adxstudio, and since then, they’ve been in the process of assimilating it into the Dynamics product family – which has meant it’s been something of a moving target, both in terms of the supported features and in terms of the quality of documentation and examples.

I’ve previously blogged about one way to integrate Adxstudio with your existing authentication system, but that approach relied completely on running Adxstudio on-premise so you could run your own code as part of the request lifecycle – and as you may have noticed, there’s a bit of a trend in IT at the moment away from running your own servers and towards using hosted managed services, so that patching and backups are somebody else’s problem. Since Microsoft acquired Adxstudio, there’s been a lot of churn around what’s supported and what’s not – I’m guessing that behind the scenes they’re going through the Adxstudio codebase feature-by-feature and making sure it lines up with their plans for the Dynamics 365 platform, but that’s just guesswork on my part.

One of the main integration points I’ve been waiting for is the ability for a Microsoft-hosted Portal solution to use a third-party OpenID Connect endpoint to authenticate users, and it appears in the latest update this is finally supported – albeit with a couple of bumps along the way. Here’s what I’ve had to do to get a proof-of-concept up and running.

Setting up Dynamics CRM Portals

First, you’ll need to set up a Dynamics Portal trial. You can get a 30-day hosted trial of Dynamics CRM Online by signing up here – this actually gives you a full Office 365 organization including things like hosted Active Directory, as well as the Dynamics CRM Online instance we’re using in this example. Next, you’ll need to ask nicely for a trial of the portal add-on – which you can do by filling out the form at crmmanagedtrials.dynamics.com.

Setting up IdentityServer and configuring an ngrok tunnel

Whilst you’re waiting for the nice Microsoft people to send you your trial license, get up and running with IdentityServer. For this prototype, I’m using the MVC Authentication example from the IdentityServer3.Samples project – clone it to your workstation, open the MVC Authentication solution, hit F5, verify you can get up and running on localhost.

Next – in order for Dynamics CRM Online to talk to your IdentityServer instance, you’ll need to make your IdentityServer endpoints visible to the internet. You could do this by deploying your IdentityServer sample to Azure or AWS, but for experiments like this, I like to use a tool called ngrok, which will create temporary, secure tunnels from the internet to your workstation. Download ngrok, unzip it somewhere sensible.

Pick a tunnel name. I’m using authdemo in this example but any valid DNS host name will do. Next, create a local IIS application pointing to the EmbeddedMvc folder in your samples directory, and set the host name to <your tunnel name>.ngrok.io

[Screenshot: the IIS application settings]
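
If you’d rather script that IIS step than click through the GUI, something like this should do it – the physical path here is an assumption about where you’ve cloned the samples:

# Sketch: create an IIS site bound to the tunnel host name.
# Adjust -PhysicalPath to wherever your clone of IdentityServer3.Samples lives.
Import-Module WebAdministration
New-Website -Name "authdemo" `
            -PhysicalPath "C:\src\IdentityServer3.Samples\source\MVC Authentication\EmbeddedMvc" `
            -HostHeader "authdemo.ngrok.io" `
            -Port 80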

Now run ngrok.exe to create a tunnel from the internet to your new IIS application:

C:\tools\ngrok> ngrok.exe http --subdomain=authdemo 80

ngrok by @inconshreveable

Session Status        online
Version               2.1.18
Region                United States (us)
Web Interface         http://127.0.0.1:4040
Forwarding            http://authdemo.ngrok.io -> localhost:80
Forwarding            https://authdemo.ngrok.io -> localhost:80

Connections           ttl     opn     rt1     rt5     p50     p90
                      0       0       0.00    0.00    0.00    0.00

All being well, you should now be able to fire up a browser, go to http://authdemo.ngrok.io/ – replacing ‘authdemo’ with your own tunnel name – and see the IdentityServer3 sample landing page:

[Screenshot: the IdentityServer3 sample landing page]

Configuring IdentityServer

Right. The next thing we need to do is make a couple of changes to the IdentityServer configuration, so that it’ll run happily on authdemo.ngrok.io instead of on localhost.

First, enable logging. Just do it. Use the package manager console to install the Serilog.Sinks.Trace package. Then add this to the top of your Configuration() method inside Startup:

// requires the Serilog and Serilog.Sinks.Trace packages, plus 'using Serilog;'
Log.Logger = new LoggerConfiguration()
    .MinimumLevel.Debug()
    .WriteTo.Trace()
    .CreateLogger();

and add this to your web.config, specifying a path that’s writable by the application pool:

<system.diagnostics>
  <trace autoflush="true"
         indentsize="4">
    <listeners>
      <add name="myListener"
           type="System.Diagnostics.TextWriterTraceListener"
           initializeData="C:\logfiles\identityserver.log" />
      <remove name="Default" />
    </listeners>
  </trace>
</system.diagnostics>

Next, do a global search and replace, replacing any occurrence of localhost:44319 with authdemo.ngrok.io – again, substituting your own tunnel name as required.

Next, add a new client to the static EmbeddedMvc.IdentityServer.Clients class in the IdentityServer sample project – changing the highlighted values to your own client ID, client secret, and portal instance URL:


new Client {
    ClientName = "Dynamics CRM Online",
    ClientId = "crm",
    Flow = Flows.Hybrid,
    ClientSecrets = new List<Secret> { new Secret("secret01".Sha256()) },
    RedirectUris = new List<string> {
        "https://my-portal-instance.microsoftcrmportals.com"
    },
    PostLogoutRedirectUris = new List<string> {
        "https://my-portal-instance.microsoftcrmportals.com"
    },
    AllowedScopes = new List<string> { "openid" }
},

Adding IdentityServer as an endpoint in CRM Portals

Finally, you need to add your new IdentityServer as an identity provider. CRM Portals uses the Dynamics CRM platform for all its configuration and data storage, so to add new settings you’ll need to log into your Dynamics CRM Online instance, go into Portals > Site Settings, and add the following values:

Name | Value | Website
Authentication/OpenIdConnect/AuthDemo/Authority | http://authdemo.ngrok.io/identity/ | Customer Self-Service
Authentication/OpenIdConnect/AuthDemo/Caption | IdentityServer OpenID Connect Demo | Customer Self-Service
Authentication/OpenIdConnect/AuthDemo/ClientId | crm | Customer Self-Service
Authentication/OpenIdConnect/AuthDemo/ClientSecret | secret01 | Customer Self-Service
Authentication/OpenIdConnect/AuthDemo/MetadataAddress | http://authdemo.ngrok.io/identity/.well-known/openid-configuration | Customer Self-Service
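
Before restarting anything, it’s worth sanity-checking that the metadata endpoint you’ve just configured is actually reachable through the tunnel – a quick check from PowerShell:

# Fetch the OpenID Connect discovery document through the ngrok tunnel
# and pick out the endpoints CRM Portals will care about:
Invoke-RestMethod "http://authdemo.ngrok.io/identity/.well-known/openid-configuration" |
    Select-Object issuer, authorization_endpoint, token_endpoint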

Finally, it looks like you’ll need to restart the portal instance to get it to pick up the updated values – which you can do by logging into the Office 365 Admin Center, Admin Centers, CRM, Applications, Portal Add-On, clicking ‘MANAGE’, and pressing the nice big RESTART button on the Portal Actions page:

[Screenshot: the Portal Actions page with the RESTART button]

And – assuming everything lines up exactly right – you should now see an additional login button on your CRM Portals instance:

[Screenshot: the portal sign-in page showing the new login button]

Clicking on it will bounce you across to your ngrok-tunnelled IdentityServer MVC app running on localhost:

[Screenshot: the IdentityServer login page]

Log in as bob / secret, and you’ll get the OpenID permissions check:

[Screenshot: the OpenID permissions check]

…and when you hit ‘Yes, Allow’, you’ll be redirected back to the CRM Portals instance, which will create a new CRM Contact linked to your OpenID Connect identity, and log you in to the portal.

Conclusions

Of course, in the real world there’s a lot more to it than this – there is a huge difference between a proof of concept like this and a production system. These sorts of user journeys form such a key part of delivering great user experience, and integrating multiple systems into your login and authentication/authorization journeys only makes this harder. But it did work, and it wasn’t actually all that complicated to get it up and running. It’s also interesting to see how something like OpenID Connect can be used to integrate a powerful open-source solution like IdentityServer with a heavyweight hosted platform service like CRM Portals.

Whether we end up adopting a hosted solution like CRM Portals – as opposed to just building our own apps that connect to CRM via the SDK or the new OData API – remains to be seen, but it’s nice to see solutions from two radically different sources playing nicely together thanks to the joy of open protocols like OpenID Connect. Long may it continue.

The Mystery of the Chinese Junk

You know it’s going to be one of those days when, just as you’re about to put on your headphones and get into ‘the zone’, you overhear somebody saying the fateful words ‘ok, then maybe we’ll need to get Dylan to look at it.’

See, amongst the many hats I wear in the course of a given week, there’s one that’s probably labelled ‘dungeon master’: I’m the one who remembers where all the bodies are buried, because – for all sorts of reasons that made very good sense at the time – I probably helped bury most of them. And on this particular day, the source of so much excitement was our venerable Microsoft Dynamics CRM v4 server. It started out with a sort of general grumbling on the support channel about CRM4 being slow… but by the time it was handed over to me to look into, it was beautifully summarised as ‘dude… there’s Chinese in the Windows event log’.

And, sure enough, there is – complete with the lovely Courier typeface that Windows Event Viewer kicks into when you get errors so weird that good old Microsoft Sans Serif can’t even display them:

[Screenshot: the Windows event log full of Chinese characters]

Now, whilst it’s been a while since I’ve done any serious work on our old CRM system, I’m pretty sure it’s not supposed to do that – so we start investigating. Working theory #1: some sort of vulnerability has resulted in attackers injecting Chinese characters into our database – whilst CRM4 is generally pretty well insulated from any public-facing code, there’s one or two places where signup forms would generate CRM Leads, that sort of thing. So we start grepping the entire database for one of the Chinese strings we’ve found in the event log.

Whilst this is going on – and trust me, it takes a while – I decide to share my excitement via the wonder of social media. This turns out to be a Really Good idea, because… well, here’s what happened…

“The incoming tabular data stream TDS RPC protocol stream is incorrect. Parameter (“䐀攀氀攀琀椀漀渀匀琀愀琀攀…” Oh. It’s gonna be one of THOSE days.

— Dylan Beattie (@dylanbeattie) October 18, 2016

@dwm @dylanbeattie The low bits are all null, so this probably UTF-16LE being mistaken for UTF-16BE (or vice versa…?). pic.twitter.com/D0IO4px9YE

— Fake Unicode ⁰ ⁧ (@FakeUnicode) October 18, 2016

You see in @FakeUnicode’s screenshot there, the words ‘DeletionState’ appear quite clearly at the bottom of the message?

Whilst this is going on, our database search comes back reporting that there’s no mysterious Chinese characters in any of our CRM database tables. Which is good, since it means we probably haven’t been compromised. So, next step is to work through that Unicode lead, see if that gets us anywhere. Because .NET has a built-in encoding for big-endian Unicode, this is pretty simple:

// take the 'Chinese' string from the event log...
var source = "䐀攀氀攀琀椀漀渀匀琀愀琀攀";
// ...get its bytes as big-endian UTF-16...
var bytes = Encoding.BigEndianUnicode.GetBytes(source);
// ...and decode those same bytes as little-endian UTF-16:
var result = Encoding.Unicode.GetString(bytes);
Console.WriteLine(result); // "DeletionState"

Turns out – just as in FakeUnicode’s screenshot – that’s the text “DeletionState” with the byte order flipped. We grabbed a few examples of the ‘Chinese’ text from the event log and ran them through this – sure enough, in every single case it’s a valid CRM database query that’s somehow been flipped into wrong-endian Unicode. At this point we start suspecting some sort of latent bug – this is old software, running on an old operating system, talking to an old database server – and sure enough, a bit of googling turns up a couple of likely-looking issues, most of which are addressed in various updates to SQL Server 2008. We take a VM snapshot in case everything goes horribly wrong, and one of the Ops gang volunteers to work late to get the server patched.

Next morning, turns out the server hasn’t been patched – because every single download of the relevant service pack has been corrupted. At which point all bets are off, because chances are the problem is actually network-related – which also explains where the ‘Chinese’ is coming from.

OK, let’s capture a stream of bytes from somewhere. Like, say, from the TDS data stream used by the MSCRMAsyncService:

[Screenshot: the captured TDS byte stream]

What does that say? If you think you know the answer, you’re wrong. Pop off and read The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) – done? Awesome. NOW what do you think it says?

See, we have no idea. It’s a stream of bytes. Without some indication of how we’re supposed to interpret those bytes, it’s meaningless. OK, I’ll give you a clue – it’s UTF-16. Now can you tell what it says? No, you can’t – because (1) you don’t know whether it’s big-endian or little-endian, and (2) you don’t know where it started.

If we assume it’s big-endian, then the first byte pair – 00 48 – would encode the character ‘H’, the second byte pair – 00 65 – would encode ‘e’, and so on. If we assume it’s little-endian, then the first byte pair – 00 48 – encodes the character 䠀 – and suddenly the mysterious Chinese characters in the event log start to make sense.

[Diagram: the same byte stream decoded as big-endian and as little-endian UTF-16]
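
You can try both readings of that first byte pair for yourself in PowerShell:

[char]0x0048   # the big-endian reading of 00 48: 'H'
[char]0x4800   # the little-endian reading of 00 48: '䠀'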

Of course, the data stream between the MSCRMAsyncService and the SQL server hasn’t actually flipped from little-endian UTF-16 to big-endian – what’s happened is that the network connection between them is dropping bytes. And if you drop a single byte – or any odd number of bytes – from a little-endian Unicode stream, you get a sort of off-by-one error right along the rest of the data stream, resulting in all sorts of weirdness – including Chinese in the event logs.
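
Here’s a little PowerShell sketch of that off-by-one effect – drop a single byte from a UTF-16LE stream and watch the rest of it turn into CJK:

$enc   = [System.Text.Encoding]::Unicode   # UTF-16LE
$bytes = $enc.GetBytes("Hello")            # 48 00 65 00 6C 00 6C 00 6F 00
$enc.GetString($bytes)                     # "Hello"

# Drop the first byte: every remaining pair is misaligned, so the low and
# high bytes swap roles and the text decodes as CJK characters.
[byte[]]$shifted = $bytes[1..($bytes.Length - 1)]
$enc.GetString($shifted)                   # "攀氀氀漀" plus a stray replacement character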

Turns out there was a problem with the virtual network interface on the SQL Server box – which was causing poor performance, timeouts, bizarre query syntax errors, Chinese in the event logs, and corrupted service pack downloads. Fortunately the databases themselves were intact, so we offlined them, cloned the virtual disk they were sitting on, attached that to a different server and brought them back online.

Every once in a while, you get a weird problem like this. I’ve seen maybe half-a-dozen problems in my entire career that made absolutely no sense until they turned out to be a faulty network connection, at which point generally you not only solve the problem, but explain a whole load of other weirdness that you hadn’t got round to investigating yet. The only thing more fun than dodgy networks is dodgy memory – but that’s a post for another day.

Oh, and if you’re wondering about the title of this post, you clearly haven’t studied the classics.

The Next Big(int) Thing

One of our systems here uses a bigint identity column as a database primary key – because we knew when we built it, back in 2010, that we were going to end up with more than 2,147,483,647 records.

Well, that happened at 12:02 today, and a couple of systems promptly failed – because, despite the underlying database being designed to handle 2^63 records, the POCOs those records were mapped to were using a regular C# int to store the record ID, and so as soon as they got an ID from the database that was bigger than Int32.MaxValue, they blew up. Thanks to the underlying DB schema already supporting 64-bit IDs, the fix was pretty simple – just change int to long in a few carefully-selected places and redeploy the applications – but it’s still annoying that something we knew about, and planned for, still came back to bite us. So I started thinking – how could we stop this happening?

The problem is that, despite being a bigint column, we just accepted SQL Server’s default identity setting of (1,1) – i.e. start counting at 1, and increment by 1 each time. Which means that until you hit 2-billion-and-something records, it doesn’t actually make any difference – and that takes a while. In our case, it took 5 years, 8 months and 26 days. During that time we’ve made hundreds of changes to our code, and in a handful of those cases, we’ve mapped that bigint ID onto a regular C# Int32 – and so inadvertently planted a little time-bomb in our production code. Tick, tick, tick…

So here’s a nice neat solution that I wish I’d thought of five years ago. Anytime you create a bigint identity, seed it with (2147483648, 1) – so that right from day one, it’s already too big to fit in an Int32. Any system that tries to store an ID in a regular int variable will fail immediately, not in five years when someone creates that magic 2.14-billion-and-somethingth record. Even though you’ve effectively thrown away 2^31 possible values, you still have another (2^63 – 2^31) to play with, so you’ve lost a tiny, tiny fraction of the available keyspace in exchange for immediate feedback if any of your client apps can’t cope with 64-bit ID values.
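
By way of illustration – this is just a sketch, not production code – the very first ID that a (2147483648, 1) identity column hands out already refuses to fit into a 32-bit integer:

$firstId = 2147483648   # the first ID issued by an identity seeded at 2^31
[int]$firstId           # fails immediately, with something like:
# Cannot convert value "2147483648" to type "System.Int32".
# Error: "Value was either too large or too small for an Int32."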
