Separate, support, serve

Yesterday, Microsoft continued down a path that they’ve been pursuing for a while by providing even tighter ties between Windows and Linux–including allowing unmodified Ubuntu binaries to run directly in Windows. Reactions were, to say the least, varied; many people were preparing for the apocalypse, others were excited about being able to use Unix tools more easily at work, and still others were just fascinated by how this was technically accomplished. These reactions mostly made sense to me.

One did not. Especially on sites like Hacker News, many responses were screaming that people needed to be scared, to remember Embrace, Extend, Extinguish, to run for the exits as quickly as possible.

I find this reaction frustrating and depressing, not because it’s offensive, but because it’s so obviously incorrect and intellectually lazy that it gives me a headache.

I want to do two things in this blog post: convince you that Embrace, Extend, Extinguish is a grossly invalid understanding of Microsoft’s strategy; and convince you that an alternative strategy of Separate, Support, Serve provides a much better lens to view the modern Microsoft.

The Death of the Three Es

I’m not going to try to persuade you that Microsoft isn’t evil–if you believe they are, you’re wrong, but I don’t honestly care–but I am going to explain to you that, even if Microsoft were still evil, they would still not be doing Embrace, Extend, Extinguish.

First, I want to quickly remind you what the computing landscape looked like when Microsoft was using that strategy. Windows ruled everywhere, in a way that’s almost impossible to imagine today. Virtually all desktops everywhere ran some flavor of Windows. Mac OS, while arguably more usable than Windows, was technically inferior, and had such an app shortage (especially in niche spaces) that it was largely irrelevant. This in turn meant that Windows also ruled most of the back office. Paired with the Office monopoly, Microsoft really and truly had a total lock on the personal computing space. It was basically impossible to use a computer without interacting with at least one Windows device in the process.

In that epoch, Embrace, Extend, Extinguish made a hell of a lot of sense. The idea was simple: if Microsoft saw a technology that threatened Windows, they’d embrace it (make it available on Windows), extend it in such a way that the best way to use the threat was Windows-specific, and, once most uses of the technology were sufficiently tied exclusively to Windows, extinguish it.

When Microsoft was a monopoly, this was a superb strategy to protect that monopoly. If they saw a threat, then bringing the threat in-house and tying it to the Windows platform was a great way to ensure people couldn’t leave, even if they wanted to. In effect, your alternatives had a tendency to evaporate before you had a chance to use them.

But Microsoft is no longer a monopoly. Hell, in many key areas, they’re effectively a non-player. While they maintain a plurality in old-school personal computers, Windows Phone is basically a failed project, the cloud is all but Linux-only, and even the entire existence of the back office has been threatened by tools like Google Apps and other hosted solutions. They’ve even lost most kiosks to custom Android variants, and most developers to OS X. It’s now surprisingly rare that I do interact with a Microsoft system on a normal day, and I’m hardly unique in that.

This leaves us with two conclusions. First, empirically, Embrace, Extend, Extinguish failed; if it hadn’t, Windows would still be a monopoly. For Microsoft to be continuing this strategy, you have to believe they were not merely evil, but also unrecoverably stupid.

Second, it can’t work in an environment where Microsoft is an underdog. For nearly all shops out there, leaving Windows is honestly pretty trivial at this point; it’s adopting it that’d be an uphill battle. If I pick “Linux”, I can trivially integrate OpenBSD, Illumos, OS X, and any other Unix-like environment into my workflow with few or no issues. I can pick amongst AWS, GCE, Digital Ocean, and others for my hosting. I can pick virtually any language and database I want, use virtually any deployment tool, and migrate amongst all of these options with relative ease.

Windows is the odd one out. Adopting it not only means getting into a single-vendor solution, but also dealing with writing two sets of most deployment pieces, and dealing with licensing, and dealing with training my ops and IT teams with two radically different technology stacks. I’m going to need one hell of a value proposition to even think about it, and I still would likely turn it down to keep my ongoing maintenance costs sane.

Further, it’s a surprisingly hard environment for me to use as a developer these days even if I want to. If I grab a MacBook, I can write apps for iOS, Android, and Unix, all natively. If I grab a Windows laptop, I can’t target iOS at all, and I have to do any Unix development in a VM. This means that at Khan Academy, for example, I’d have to be insane to buy a Surface, even though I love the device; I’d end up spending all day in a virtual machine running Ubuntu. It’s not impossible to use Windows, but honestly, if I have to spend all day in a full-screen VMware session, why bother?

In that environment, the old Three Es just don’t apply. They were about locking me into Windows, but we’ve long since passed that point. The problem Microsoft now faces is one of staunching the bleeding, and that requires a radically different strategy.

The Facts on the Ground

So: if you’re Microsoft, and you’re facing a world where you’ve largely lost all the current fights; where you’re losing developers left and right; where the challenge isn’t keeping people from leaving, but getting them to knock on the door in the first place; what do you do?

There are a couple of strategies that Microsoft could take in this environment, but I want to assert two key facts before we get going.

First, it’s very unlikely that Microsoft can stage a meaningful comeback at the OS layer in mobile, cloud, or server rooms at this point. We’re all now at least as entrenched in iOS and Android on mobile, and Linux on servers, as we ever were in Microsoft PCs. So if Microsoft is going to remain relevant, they’re going to have to do it in a way that meaningfully separates going Microsoft from going Windows.

Second, even if somehow they gained a meaningful foothold in those markets, it’s very unlikely they’ll be anywhere near a monopoly player in the space. iOS, Android, and Linux are so firmly established, and so pervasive, that any conceivable world for now is one where Microsoft has to get along with the other players. In other words, Microsoft-specific solutions are going to be punished; they’ll need technologies common to everyone.

If you agree with those two facts, their current strategy falls out pretty cleanly.

A Way Forward

First, Microsoft has to enable me to even use Microsoft technology in the first place. If Microsoft keeps tying all Microsoft technology to Windows, then they lose. If I have to use Windows to use SQL Server, then I’ll go with PostgreSQL. If I have to use Windows to have a sane .NET server environment, then I’ll pick Java. To fix that, Microsoft needs to let me make those decisions separately.

That’s indeed the first phase of their strategy: separating Windows from the rest of their technologies. SQL Server is available on Linux not to encourage lock-in, but because they need you to be able to choose SQL Server even though you’ve got a Docker-based deployment infrastructure running on Red Hat. .NET is getting great runtimes and development environments (Visual Studio Code) for Unix so that I can more reasonably look at Azure’s .NET offerings without also forcing my entire dev team to work on Windows. This strategy dramatically increases the chance of me paying Microsoft money, even though it won’t increase the chance I’ll use Windows.

Next, Microsoft needs to do the reverse: make it feasible for me to use Windows as a development environment again. That’s where the dramatically improved Unix support comes from: by building in an entire natively supported Ubuntu environment, by having Visual Studio be able to make native Linux binaries, they’re making it feasible for me to realistically pick Windows even in a typical Unix-centric cloud-focused development shop. Likewise, Visual Studio’s improved support for targeting iOS and Android, and Microsoft’s acquisition of Xamarin, are going to go far to enabling me to do something similar on the mobile front.

In both of these cases, while there may be an “embrace” component, the “extend” part is notably missing from here–and it should be. Microsoft can’t meaningfully extend iOS, Android, or Linux in a way that’d actually matter to anyone at this point; it has to just support them on their own terms. And in that environment, it’s not possible to extinguish things; if Microsoft woke up one day and announced Xamarin was dead and gone, people would grumpily rewrite their stuff in Swift and Java, not suddenly announce that they were Windows-exclusive.

Finally, Microsoft still needs to make money, and they can do that by selling software as a service (Azure, Office365, and so on), rather than off-the-shelf. That not only gives them a steady revenue stream independent of their Windows installed base–after all, a person using Office365 pays the same whether they’re on Windows, OS X, or a Chromebook. It also provides insulation for them from any future platform changes. Does HoloLens take off? PlayStationVR? Oculus Rift? Will Microsoft catch the next wave? Who cares. As long as you’re using Microsoft products somewhere in your stack, they’ll be fine.

Separate, Support, Serve

It’s not as catchy as the original, and it certainly sounds a lot less ominous, but I think this can be summarized as the Three Ss: separate all of Microsoft’s offerings from Windows itself; support the reality of this heterogeneous world when on Windows; and be the company that serves as much content as possible from its data centers.

I’m not saying that Microsoft can’t still lock people in some way. Apple definitely tries to lock in its customers with iCloud and iOS, and its developers with Swift, for example. But I do hope that this has convinced you that Embrace, Extend, Extinguish is dead–and, with it, at least some of the FUD about Microsoft’s software.

Jobs once famously said that Microsoft didn’t need to lose for Apple to win. Today, I think it’s worth realizing the reverse: Microsoft doesn’t require you to lose for it to succeed.

Android, Project Fi, and Updates

Edit: Mere days after posting this (and unrelated to this post), Google publicly apologized for the Android 6 roll-out delay and pushed out Android 6.0.0 to Nexus 6 devices. They then followed that up extremely rapidly with the Android 6.0.1 update. I think this bodes incredibly well. Project Fi is still a very new service, and I’ve little doubt that Google has to work out some kinks on their end. For the moment, I’m going to take a step back, watch, and see if this new rapid update cycle is the new norm. If it is, I think I’ve found my ideal carrier and platform. But I still think that encouraging new users to stick to iOS until this update cycle is proven is probably the best course of action.

I want to make clear, right up front, that I am absolutely not an iOS apologist. I couldn’t wait for the first Android phones to come out, and I bought a Motorola Droid on launch day. I was excited about its better multitasking, about the keyboard, about the better integration with Google services, about the fact that I could use Java instead of Objective-C,1 about the much more open platform that wouldn’t restrict what I wanted to do. I was very sincerely excited.

But neither the hardware nor the software were quite ready at the time. I went through three Droids, suffering one (thankfully warranty-covered) hardware failure after another. After an initially promising update cycle (the Droid was upgraded to what I believe was Android 2.1 very quickly), I began to see that Google was having issues getting new versions of Android out on a sane schedule. So, after a couple of years of living on the Android train, I hopped off and grabbed an iPhone.

That didn’t mean I gave up on Android. If anything, I was pretty confident that Android, not iOS, would be the winner in the end anyway. Google would figure things out—and it wasn’t even just Google, after all, but a huge chunk of the telecom industry, all of whom had a vested interest in keeping Apple from dominating, helping them out. We’d seen this play out already with Microsoft and the PC makers versus Apple in the 90s; we knew how it would end, that Android would close the gaps and take over the industry. It was just a matter of time while Google got their operation running smoothly.

While that’s obviously not what happened, both Android software and hardware did markedly improve. There were even a lot of things that Android got first that were genuine usability wins: instant replies from notifications, assistants (Google Now), turn-by-turn directions, cross-application communication, automatic app updates, and more. I ended up buying a Nexus 7 as a tablet, and found that, at least as a developer, it fit my needs a lot better than an iPad ever did.

There was, however, one caveat: Android’s security story. Because Google couldn’t get updates out to its phones on a sane schedule, most Android phones had long-running unpatched security issues. If there’s one thing I think we’ve learned about security over the last few years, it’s that a team that patches early and often is going to be vastly better protected than one that doesn’t. This didn’t bother me too much on my Nexus 7—Google was better about pushing out updates for its tablets than its phones, and at any rate, side-loading the OS didn’t pose any major problems for me on a non-mission-critical tablet—but it kept me from returning to Android phones.

So when Project Fi was released, I signed up immediately. I figured I could finally, finally have my cake and eat it, too: Google generally kept Nexus devices up-to-date, and the Fi pricing model seemed like a huge improvement to me over what I’d been forced to do on the major carriers. What wasn’t to love? I could go back to Android and bid Verizon adieu at the same moment, a great double-win.

That is emphatically not what happened. First, security updates were slow to come out: whereas Apple virtually always has security issues patched well ahead of any disclosure window,2 Google seemed to struggle. When Stagefright came out, I had to wait, just like everyone else, for my patch. And when that patch happened, it was woefully incomplete, so then I got to wait again for a patch to the patch. And then when Android M shipped a month ago, Google left Nexus 6 users—all of whom own a phone that is just barely over a year old at this point—running Android 5.1.1. Yes, you can get M on Project Fi, but you have to side-load (which their support representatives are loudly and actively discouraging in their support forums), or you have to buy a new phone—the exact situation that exists on other carriers, and the exact situation I was trying to avoid.

This is ridiculous. Apple manages to push out updates to all carriers on the same day. Microsoft, which generally brings a vaguely Scooby Doo-like quality of competition to the smartphone landscape, manages to get updates out to all Lumia devices within at most a few days of each other, and also has a very simple system in which any Windows Phone user can opt in to get Windows Update-style updates ahead of general availability. Meanwhile, on its own cell network, Google has…side-loading, which it’s discouraging.

This just shouldn’t be that hard. And yet, for Google, it clearly is.

So I give up. Apple can keep their products up-to-date across dozens of carriers; Google can’t even keep their own products up-to-date on their own cellular network. If they can’t even make that work, then I throw in the towel.

I suppose it’s possible that my next phone won’t run iOS, but the one thing I can guarantee you is that it’s not going to run Android.


  1. I am not trolling. Java, at the time, was a much more pleasant language to work in than Objective-C. You had garbage collection, a better dependency management story, better support resources, and a much larger collection of third-party libraries, and to top it all off, you had Eclipse or IntelliJ instead of a fairly early version of Xcode. Even if Android’s APIs might not’ve been the best I’d ever used, they were, at least in my opinion, just fine.
  2. “Virtually” is a key word there; they had a couple minor vulnerabilities that were disclosed prior to patch. But we’re talking a couple issues patched after disclosure date versus a spate of major Android ones that stay unpatched for literally weeks or months after disclosure. There’s no contest.

Genuine opinions, thoughtfully presented

When I was in high school, I used to do competitive speech.1 I didn’t really want to do competitive speech as such; what I wanted to do was competitive debate. After all, debate was way more fun: you got to argue, on purpose, about things with little actual consequence! And you got more points for being the best arguer! What’s not to love?

Sadly, my school didn’t have enough people to do both debate and speech; we had to pick one, and since the overwhelming majority of my fellow classmates wanted to do speech, we did speech. So I had to spend four lousy years getting good at public speaking, which hasn’t come up even a single time in real life when you exclude when I speak at events and conferences and at work and so on. Clearly I lost out.

There was, of course, an event that was kind of like debate at the speech meetups. The Internet is being amazingly unhelpful in answering this question, but I believe it was called “discussion.” To the best of my recollection, in discussion, you were given a topic to discuss, and then needed to demonstrate some finesse both in keeping the discussion going, and in having the group come to an agreement over the course of the event. You were actually penalized for arguing; you got points by drawing others into the discussion, helping them flesh out their points, and helping drive consensus.

Needless to say, I never did this event. Sounded like a thing for wimps.

Strong opinions, inconsistently supported

There’s a very, very old2 saying that many in our industry hold up on a regular basis as the ideal way for developers to have discussions: “strong opinions, weakly held.” It’s used almost as a mantra for how to be at once egoless and stay in motion. After all, if you have strong opinions, you’ll act on them, and if they’re weakly held, you’re highly amenable to changing your opinions in the face of more data. What’s not to like?

The truth is that I’ve been thinking about this a lot recently, and while I know that’s the spirit originally intended, and while I think there’s tons of value in what “strong opinions, weakly held” was originally trying to embody, I am increasingly convinced that its meaning has been coöpted by people who, like a teenage version of me, are really just pining for a good old round of high-school debate.

The original thought is one that I can pretty solidly get behind, at least in theory. As far as I can tell, the expression comes from Bob Johansen, via a post from Bob Sutton in 2006. According to that version,

Bob [Johansen] explained that weak opinions are problematic because people aren’t inspired to develop the best arguments possible for them, or to put forth the energy required to test them. Bob explained that it was just as important, however, to not be too attached to what you believe because, otherwise, it undermines your ability to “see” and “hear” evidence that clashes with your opinions

On the surface, this makes complete sense—and if this were how people actually took the expression, I doubt I’d have anything to add.

But in practice, that’s not what I actually see from people who claim to apply this rule. Instead, I see something that looks a lot more like this:

Developer A: This color is clearly a brownish gray, and everyone who disagrees with me is an idiot. Behold my strong opinion.

Developer B: It’s not a brownish gray; it’s taupe. Here’s the definition of taupe. You’re wrong. Repent before thou dost descend unto the Hellfire.

Developer A: Alas, I have seen the error of my ways. The color is clearly taupe, and everyone who disagrees with me is an idiot, including me from 15 seconds ago. Behold, ye, that my strong opinion was weakly held, and that I am now perfect in judgment.

This isn’t a “strong opinion, weakly held.” This is someone conveying their weak opinion in a forceful manner to the point of being an annoying little snot.3

And while this example is obviously hyperbole, I’ve seen real fights that are ever so barely removed from this extreme. Bower or Webpack? Gulp or Grunt? Go or Python? Java or Kotlin or Ceylon? I’ve seen all of these fought out tooth-and-nail by people who will, in a calm one-on-one discussion, prove surprisingly flexible, but who, in the heat of the moment, get so wrapped up in the “strong opinion” part of the quote that they forget they’re supposed to also be willing to change their minds.

In the most extreme version of this kind of attitude, you get debates over things like editors and personal workflows that don’t require buy-in from the entire team in the first place. I’d find these funnier if they weren’t so divisive, but you have whole friendships destroyed over things like Emacs v. Vim,4 or Firefox Developer Edition v. Chrome,5 or Windows v. OS X.6 It doesn’t even matter what your colleague picks in these situations, because you can still use whatever you want with no repercussions whatsoever. Their decision literally just does not impact you. And yet, these can be some of the most heated debates of any I’ve ever heard.

It’s not that these are invalid discussions to have as such, but rather that you’re likely to see them very forcefully carried out, by people who, to any onlookers, sure seem to be utterly convinced that they’re correct, and who seem to be allowing no margin for error. Even if they secretly, in their heart of hearts, are still holding their strong opinion weakly, the outward appearance they project is one of a strong opinion strongly held…on arguably flimsy data.

And that’s exactly not what you want.

Sincere opinions, frightfully restrained

To borrow a phrase from a different part of our industry, this kind of dialog has a chilling effect: when developers see people stridently defending fairly inconsequential opinions, they’re going to be very, very reticent to take a stand on something that actually matters.

Most people, after all, do not enjoy conflict, and will avoid it if they have the chance. If you’re willing to go all-in on a fight over how many spaces go in a tab,7 then heaven help me if I try to have a debate with you on whether using Docker images or a custom apt server is the best way to handle our infrastructure deployment. I’d rather just keep my opinion to myself and watch what happens, because I already know you’re going to fight me forcefully with a random assortment of arbitrary facts. So I’ll just be flexible and do whatever you want. That’s the “weakly held” bit, at least, right? I’m halfway there.

I’ve seen brilliant developers, who have great ideas when I speak to them quietly one-on-one, completely clam up in meetings when they’re in environments that espouse this mantra. They’re genuinely afraid to speak up, not because they don’t believe in what they want to say, and not because they can’t defend it, but because they know that anyone who disagrees with them will do so with violent rhetorical force. They will have to very literally fight, even if “only” verbally, for what they believe in. At the end of the day, whether consciously or subconsciously, they conclude that they’d simply rather not fight.

And now you’ve lost a valuable opinion, in a context where the outcome—unlike the conclusion to how many spaces go in a tab—may actually make a difference on whether your team can be successful.

Genuine opinions, thoughtfully presented

So let’s revisit what the original intention was. We want people to actually form a real opinion, because if they’re just trying to look at every possible side of something, they’ll never actually do anything. But we want them to be flexible, because even a well-researched, sincere opinion might be surprisingly incorrect when additional facts are presented.

I’d therefore like to propose a new version of this classic quote: genuine opinions, thoughtfully presented.

The opinions should be genuine because you should actually believe them, not just parrot them for the sake of having an opinion. While there are times and places for a tornado of possibilities, you should never take a side of an argument that you don’t believe in.8 You should have thought through things at least long enough that you do actually believe what you’re about to say.

But as soon as you reach that place, you should thoughtfully present that opinion. It’s an opinion, after all, but it’s one you took time to create. Everyone makes mistakes, including you, but also (and this is important) the person you’re talking to. Even I, on rare occasions, have been known to make a mistake or two. If you don’t present your opinion, I’ll never have a chance to realize I was wrong, you’ll never have a chance to validate you’re right, and we both lose out. And as long as you do it thoughtfully, I’ll be able to calmly discuss your opinion with you so that we can resolve our differences, regardless of which of us (if, indeed, either of us) is correct at the outset.

Calm discussion is the best form of debate

In the rush to have strong opinions, we lose track of the “weakly held” part. Even if we might genuinely be willing to change our opinions, presenting our own views so strongly that we appear to have our minds fully made up can prevent those with excellent, valuable opinions from speaking up.

I might or might not have enjoyed debate more than speech; I don’t know. What I do know is that I really badly wish I’d done discussion, even if only once, because magnifying those who are just learning how to present their genuine opinions is one of the best things I can do, both as a mentor, and as a colleague.

That’s my genuine opinion, thoughtfully presented. What’s yours?


  1. The organization that puts this on is the National Forensic League, which helpfully abbreviates its name to the NFL in most documents. So if you ever meet me and I casually mention I’ve won NFL meetups, don’t ask to see my Super Bowl ring, because I’ve lent it out for the time being.
  2. I mean, in industry terms. So, older than last week.
  3. I’ve been told I name-call too much in my blog posts. Anyone who says that is a big fat nincompoop.
  4. Emacs.
  5. Firefox.
  6. OS/2 Warp!, for PowerPC, little-endian, and anyone who says otherwise is a pinko commie. There. I think I’ve now fully offended everyone I can.9
  7. Seven.
  8. Except when it comes to clothing fashions. I give up. If we all want to pretend that an un-tucked shirt with corduroys, a fedora, a used blazer, and US Keds is the apex of hip, rather than the completely expected outcome of a bunch of affluent extraterrestrials concluding that Goodwill represented our species’ best unified clothing line, then so be it.
  9. No no, wait, hang on, I got it: and stick to just using WinOS2, since there are no good native apps for OS/2. There. Now I think I’m good.

The More Things Change

React, if you’ve somehow missed it, is the new hotness in web programming. The idea is simple: each React component describes its view idempotently, in JavaScript. The view is rendered entirely based on a small amount of state the component keeps internally. Given the same state, a given component will always render identically. This in turn means that when data changes, React can apply just what changed to the browser’s DOM, saving it from having to re-render the entire page. In fact, the determination of whether to change anything at all can be made purely by consulting the component’s internal state. At its core, that’s why React is very fast.
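To make that a little more concrete, here is a minimal sketch in TypeScript. It is emphatically not React’s real reconciler, and the names `renderList` and `changedRows` are ones I made up for illustration. It only shows the two properties described above: the render function depends on nothing but state, so two renders can be compared to find the rows that actually need touching.

```typescript
// Hypothetical component state: just a list of strings.
interface ListState {
  readonly items: readonly string[];
}

// Pure render: the output depends only on the state passed in,
// so calling it twice with equal state yields equal output.
function renderList(state: ListState): string[] {
  return state.items.map((item, i) => `<li>${i + 1}. ${item}</li>`);
}

// Compare two renders and return the indices of the rows that differ.
// A real framework would patch only these rows in the DOM.
function changedRows(prev: string[], next: string[]): number[] {
  const changed: number[] = [];
  const length = Math.max(prev.length, next.length);
  for (let i = 0; i < length; i++) {
    if (prev[i] !== next[i]) {
      changed.push(i);
    }
  }
  return changed;
}

const firstRender = renderList({ items: ["milk", "eggs"] });
const sameRender = renderList({ items: ["milk", "eggs"] });
const editedRender = renderList({ items: ["milk", "bread"] });

const untouched = changedRows(firstRender, sameRender);  // no rows differ
const patched = changedRows(firstRender, editedRender);  // only row 1 differs
```

Because rendering never mutates anything, it is always safe to render again and diff; the cost of a change is proportional to what changed, not to the size of the page.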

React by itself doesn’t actually solve how to propagate changes, though. For that, the most popular solution is another Facebook framework called Flux. In Flux, you have stores that contain data, and dispatchers that process actions and notify parties appropriately when a relevant action has been performed. This flow is unidirectional: user actions trigger a dispatcher action, the dispatcher updates the stores, the stores update the views, potentially causing them to re-render. You can see a nice diagram that probably conveys it better than my description. To build a real application, you usually combine lots of React views and lots of Flux dispatchers into a coherent whole, so each piece composes nicely with all the others.
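For readers who haven’t seen that flow in code, here is a hand-rolled sketch in TypeScript. It does not use the real Flux library; `SimpleDispatcher`, `TodoStore`, and the two action types are hypothetical stand-ins I wrote to show the one-way path from action to dispatcher to store to view.

```typescript
// Hypothetical actions, tagged by an actionType field.
type TodoAction =
  | { actionType: "ADD_TODO"; text: string }
  | { actionType: "CLEAR_TODOS" };

type ActionCallback = (action: TodoAction) => void;
type ChangeListener = () => void;

// The dispatcher just forwards every action to every registered callback.
class SimpleDispatcher {
  private callbacks: ActionCallback[] = [];

  register(callback: ActionCallback): void {
    this.callbacks.push(callback);
  }

  dispatch(action: TodoAction): void {
    this.callbacks.forEach((callback) => callback(action));
  }
}

// The store owns the data and only changes it in response to actions.
class TodoStore {
  private todos: string[] = [];
  private listeners: ChangeListener[] = [];

  constructor(dispatcher: SimpleDispatcher) {
    dispatcher.register((action) => this.handle(action));
  }

  getAll(): readonly string[] {
    return this.todos;
  }

  subscribe(listener: ChangeListener): void {
    this.listeners.push(listener);
  }

  private handle(action: TodoAction): void {
    switch (action.actionType) {
      case "ADD_TODO":
        this.todos = [...this.todos, action.text];
        break;
      case "CLEAR_TODOS":
        this.todos = [];
        break;
    }
    // Tell every view that the data changed so it can re-render.
    this.listeners.forEach((listener) => listener());
  }
}

// A "view" that re-renders into an array of strings on every change.
const todoDispatcher = new SimpleDispatcher();
const todoStore = new TodoStore(todoDispatcher);
const renderedViews: string[] = [];
todoStore.subscribe(() => renderedViews.push(todoStore.getAll().join(", ")));

todoDispatcher.dispatch({ actionType: "ADD_TODO", text: "write post" });
todoDispatcher.dispatch({ actionType: "ADD_TODO", text: "ship it" });
```

Note that the view never writes to the store directly; the only way in is to dispatch an action, which is what keeps the flow unidirectional.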

This sounds nice, but whenever I’ve looked at the code for websites that use Flux, I’ve just felt…well, weird. I felt like I’d seen this pattern before, that I had stories of “that way be dragons”1 even though it was all “new,” but I couldn’t quite put my finger on why I felt that way.

Until today.

Base Principles

Indulge me in a thought experiment. Let’s say we’re not writing a program for the web, but rather for a resource-strapped graphical computing environment. Maybe one of those embedded microcontrollers that are so popular these days, something even less powerful than a Raspberry Pi. How would we design such a framework?

Well, we mentioned resource-strapped, so instead of keeping an off-screen buffer for every single widget like we do on OS X, we’ll instead just have widgets redraw themselves when we need them to. Because of that, we’ll need to mandate that a widget’s drawing code be idempotent. Further, because (again) we’re resource-strapped, we don’t want to lug around a big runtime or anything. To keep things simple, we’ll say that each widget has a single procedure associated with it, and we’ll give that procedure two arguments: an integer representing what action the user just did, and a pointer to some additional action-specific data (if applicable). That way, each widget can handle each message in the most efficient way possible. Finally, because a user action might require us to redraw a widget, we’ll allow the widget to tell us if it needs to be changed. Because drawing is idempotent, we can minimize CPU usage by only redrawing whatever it says we have to redraw. Finally, for programmer sanity, we’ll allow these widgets to nest, and to pass custom messages to each other.
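Here is roughly what that design looks like as a sketch, written in TypeScript purely for readability (a real resource-strapped system would obviously use something closer to the metal). Every name in it, including `MSG_DRAW`, `MSG_CLICK`, `makeCounter`, `send`, and `repaint`, is hypothetical; the point is only the shape: one procedure per widget, an integer message plus optional data, and a return value saying whether a redraw is needed.

```typescript
// Hypothetical message codes: plain integers, as in the thought experiment.
const MSG_DRAW = 1;
const MSG_CLICK = 2;

// One procedure per widget: (message code, action-specific data) -> needs redraw?
type WidgetProc = (msg: number, data: unknown) => boolean;

interface Widget {
  proc: WidgetProc;
  children: Widget[];
  dirty: boolean;
}

// A counter widget. Drawing only reads state, so repeating it is harmless.
function makeCounter(label: string, out: string[]): Widget {
  let clicks = 0;
  const proc: WidgetProc = (msg, _data) => {
    switch (msg) {
      case MSG_DRAW:
        out.push(`${label}: ${clicks}`); // read-only: no state change here
        return false;
      case MSG_CLICK:
        clicks += 1;
        return true; // state changed, so ask to be redrawn
      default:
        return false; // unknown messages are ignored
    }
  };
  return { proc, children: [], dirty: true };
}

// Deliver a message and record whether the widget asked for a redraw.
function send(widget: Widget, msg: number, data: unknown = null): void {
  if (widget.proc(msg, data)) {
    widget.dirty = true;
  }
}

// Redraw only the widgets that said they changed, then visit their children.
function repaint(widget: Widget): void {
  if (widget.dirty) {
    widget.proc(MSG_DRAW, null);
    widget.dirty = false;
  }
  widget.children.forEach(repaint);
}

const drawLog: string[] = [];
const counterA = makeCounter("A", drawLog);
const counterB = makeCounter("B", drawLog);
// A container widget that draws nothing itself and just nests the two counters.
const rootPanel: Widget = { proc: () => false, children: [counterA, counterB], dirty: false };

repaint(rootPanel);        // first paint: both counters draw
send(counterB, MSG_CLICK); // only B reports a change
repaint(rootPanel);        // only B redraws
```

Custom messages between widgets would just be more integer codes handled in the same `switch`; I left them out to keep the sketch short.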

Congratulations! We’ve just designed Windows.

Specifically, Windows 1.0. Circa 1985.

Blasts from the Past

This is, really and truly, exactly how Windows used to be programmed all the way through at least Windows 7, and many modern Windows programs still work this way under the hood.2 Views would be drawn whenever asked via a WM_PAINT message. To be fast, WM_PAINT messages generally only used state stored locally in the view,3 and to keep them repeatable, they were forbidden from manipulating state. Because painting was separated from state changes, Windows could redraw only the part of the screen that actually needed to be redrawn. Each view had a function associated with it, called its WndProc,4 that took four parameters: the actual view getting updated; uMsg, which was the message type as an integer; and two parameters called wParam and lParam that contained data specific to the message.5 The WndProc would update any data stores in response to that message (frequently by sending off additional application-specific messages), and then, if applicable, the data stores could mark the relevant part of the screen as invalid, triggering a new paint cycle. Finally, for programmer sanity, you can combine lots of views inside other views. Virtually every widget you see in Windows — every list, every checkbox, every button, every menu — is actually its own little reusable view, each with its own little WndProc and its own messages.

This is exactly what Flux does. You can see this design very clearly in the official Flux repository’s example app. We can see the messages getting defined as an enum,6 just like you’d do in old-school Windows programming. We can see the giant switch statement over all the messages, and note that different data is extracted based on the message’s actionType. And, of course, you can see components rendering idempotently based off their small amount of internal state.
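To make the shared shape concrete, here is a minimal sketch of that message-procedure pattern in Python. It is neither Win32 nor Flux code: `Msg`, `Widget`, `proc`, and `repaint` are hypothetical names invented for illustration, and the "painting" just returns a string.

```python
from enum import IntEnum
from typing import Any, List, Optional


class Msg(IntEnum):
    """Hypothetical message types, standing in for WM_* constants or Flux action types."""
    PAINT = 1
    CLICK = 2
    SET_LABEL = 3


class Widget:
    def __init__(self, label: str) -> None:
        self.label = label
        self.clicks = 0
        self.needs_redraw = True  # a new widget has never been drawn

    def proc(self, msg: Msg, data: Optional[dict] = None) -> Any:
        """Single entry point: an integer message plus optional message-specific data."""
        if msg == Msg.PAINT:
            # Painting only reads state, so asking twice gives the same result.
            return f"[{self.label}: {self.clicks}]"
        if msg == Msg.CLICK:
            self.clicks += 1
            self.needs_redraw = True  # mark ourselves invalid; do not draw here
        elif msg == Msg.SET_LABEL:
            self.label = data["label"]
            self.needs_redraw = True
        return None


def repaint(widgets: List[Widget]) -> List[str]:
    """Redraw only the widgets that marked themselves invalid."""
    drawn = []
    for widget in widgets:
        if widget.needs_redraw:
            drawn.append(widget.proc(Msg.PAINT))
            widget.needs_redraw = False
    return drawn


ok, cancel = Widget("OK"), Widget("Cancel")
print(repaint([ok, cancel]))   # first pass draws everything
ok.proc(Msg.CLICK)
print(repaint([ok, cancel]))   # second pass draws only the widget that changed
```

The giant `if` chain over the message type is the point: swap the integers for action-type strings and the dirty flag for a store's change event, and you are most of the way to the Flux example.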

Everything old is new again.

When the hurly-burly’s done

Flux clearly works. Not just that; it clearly scales: people and companies, not least among them Facebook itself, are writing huge, functioning applications in this style.

But that shouldn’t be surprising; after all, many companies wrote huge, functioning applications for old versions of Windows, too. Just because it works—and moreover, even if I grant that it works and it’s better than what we had before—doesn’t mean it’s the be-all, end-all of web development.

Much as Windows wasn’t locked into WndProc-oriented code forever, I’m sure we’re not going to stop here on web development. We’ll rediscover Interface Builder and Morphic. I’ll get my web-based take on VisualAge, Delphi, and Visual Basic. Even if Flux lives on under the hood, I won’t have to think in its terms day-to-day. I know that will happen, because it has happened before, many times, on many platforms. There’s a ton of historical precedent; I just need to wait.

But maybe, with the benefit of hindsight this time, and recognizing that we have just brought the mid-eighties to web development, we can get through this phase a little faster than the last time.

Here’s to hoping.


  1. Or daemons. [return]
  2. WinUI programs probably work this way technically at some level, but the lowest level of abstraction for WinUI is thankfully much higher. [return]
  3. Via GWLP_USERDATA, if you’re curious. [return]
  4. It’s called WndProc because Windows calls views windows. Indeed, Windows’ concept of “window” is just a little chunk of screen with its own WndProc managing it. OS X’s Quartz actually used to make the same call deep under the hood, though Cocoa obviously exposes a much higher level of abstraction. [return]
  5. In practice, for all but the simplest of messages, wParam was ignored, and lParam pointed at a struct that contained data specific to that message. This actually makes it even closer to the Flux model. [return]
  6. Or JavaScript’s version of them, anyway. [return]

Remember to Reevaluate

I really dislike MySQL. I haven’t used it in a long time, but I remember that it basically just stores all of its data in a flat text file. No transactions, no write-ahead log. In fact, there’s barely even any real data integrity. You have to run a repair process on boot to fix table corruption in the case of hard shutdown. There was an alternative thing you could do that fixed that, but it was an option, and no one had it enabled by default. You should just use Oracle.

I don’t care what database you prefer; if you know MySQL, that block quote should have you cringing. It’s not that those complaints were never true—there was a time when MySQL only spoke MyISAM, and MyISAM did remain the default for a while after InnoDB landed as a supported database engine. But InnoDB has been the default since MySQL 5.5 shipped, in 2010. That’s around the same time that Windows 7 shipped. Docker did not exist. MongoDB hadn’t yet had its first birthday. Backbone, let alone things like Angular and React, hadn’t even been invented, and jQuery was pretty much your only sane bet for a cross-browser JavaScript library.

In tech, things change quickly. Yet I routinely come across people who hold opinions based on the state of a given tool as they encountered it years ago. They hate IE because it doesn’t do sane CSS. They hate Vim because you have to script it in Vimscript. They hate Java because it lacks generics, .NET because it doesn’t run natively on Linux, and Macs because you can’t right-click.1

Thing is, none of these things are true today. All of these things were true, once upon a time, but they’ve all been fixed for years. They only keep coming up because people love quick jabs and assume everything they don’t follow hasn’t changed since they last looked at it.

I can’t fault people for having silos of knowledge. Honestly, that just makes sense: if you’re a Java developer, you shouldn’t have to spend time watching what .NET is doing, because you need to spend that time keeping track of what Java is doing. If you wrote off Firefox’s dev tools for Chrome’s years ago, of course it makes sense to spend your learning time focused on just what Chrome’s doing. And if you’re in an all-Linux shop, it really doesn’t begin to matter if PowerShell is or is not the bee’s knees, because you’re never going to use it, and the few times you have to deal with Windows, your stale cmd.exe knowledge will get by just fine. So far, so good.

The problem comes when people get into debates over which of two competing solutions is best. There is an overwhelming tendency to assume that, because technology $X worked in one way when you first evaluated it, then it still works that way today, while $Y, which you follow closely, has evolved considerably. Thus, in debates, people are comparing some kind of Luddite time-locked version of $X against contemporary $Y, pointing out that $Y is demonstrably superior, and calling it a day.

There are many reasons to favor one technology over another. “I had to make a decision five years ago, and the decision was right then, and it hasn’t led me wrong since then, so I’m sticking to it” is a completely fine way to make certain classes of technology decisions. I’m not arguing against that. But you can’t seriously argue for or against something in the general case based on the state it was in years ago.

I was just in a discussion in a forum2 where someone was comparing the current version of Git to an eight-year-old version of Mercurial, and finding that Mercurial was lacking.

I hope to heavens I do not need to explain on this blog that I’m a Mercurial apologist, but I will readily admit that Git 2.5.2 is better in almost every single way than Mercurial 1.4. Thing is, I could say the same about Mercurial 3.5.1 against (roughly) Git 1.7. Git eight years ago wasn’t the Git of today: newly created local branches didn’t automatically try to track remotes of the same name. Git subtrees did not exist. Many widely used flags to git grep and git log did not exist. Git didn’t do automatic garbage collection and packing. And the Mercurial of eight years ago wasn’t the Mercurial of today, either: there were no revsets, no bookmarks, no histedit, and no phases.

If you’re looking to transition, this history is irrelevant; what you care about is the current version of Git versus the current version of Mercurial—just like you should be choosing between the current versions of PostgreSQL and MySQL, not PostgreSQL 9 and MySQL 5.1. And that’s going to be a very different discussion.

I’m not saying you shouldn’t engage when people talk about what tools are best. I am asking you to think about whether you actually have recent, pertinent information to contribute to the discussion. If you used a tool eight years ago, your experience is likely no longer valid. If you used it last month, it probably is. If in doubt, reevaluate your position by trying the tool and seeing if things are still the way you remember them.

And remember: if you don’t have time to reevaluate, then you probably don’t have time to contribute a (useful) opinion, either.


  1. Okay, I admit it’s been a while since I actually heard this one. [return]

Thoughts on Entitlement and Pricing

Yesterday, JetBrains announced new pricing for their line of developer tooling. Previously, you could buy their products for anything from $50 (for WebStorm) to $675 (for ReSharper Ultimate), with lower prices in most cases for yearly upgrades. Yesterday, JetBrains changed that and announced JetBrains Toolbox. For $12/month, you can get access to one of their products, or for less than double that, $20/month (discounted to $150/year for current customers), you can get access to all of their developer tools.

The reaction from developers has been consistent: viscerally negative. The pricing is too high and unfair, they complain. The tools will stop working if you stop paying for them, which is obviously insane, because what if you need to edit things later on? Quite a few even are whining about how any self-respecting developer should be using open-source tools,1 which in this context seems more about implying that any cost for tooling is too high rather than having a stance on libre software. As of this writing, one of the top stories on several news aggregators is even titled “How JetBrains lost years of customer loyalty in just a few hours.”

These people are overreacting to the point of being ridiculous.

I want you to stop and think for a second. The average developer in the US, according to Google, makes $85,000 per year. For the cost of literally less than one dollar per work day, or 0.3% of the average developer income—less than one full day’s work—you can have access to every single developer desktop tool JetBrains makes. These tools cover Ruby, Python, Node, Java, C#, C++, and more. They include several full-blown IDEs, plus a couple of plugins for Visual Studio. And yet this is somehow far too expensive—so egregious that it permanently destroys people’s loyalty for JetBrains products.

Apparently because that “loyalty” was coming exclusively from entitled users.
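If you want to check the math, here is a quick sketch. The salary and the $20/month price are the figures quoted above; the 260 work days per year (52 weeks of 5 days) is my own assumption.

```python
ANNUAL_SALARY = 85_000     # average US developer salary quoted above
MONTHLY_PRICE = 20         # all-products price quoted above
WORK_DAYS_PER_YEAR = 260   # assumption: 52 weeks x 5 days, ignoring holidays

annual_cost = MONTHLY_PRICE * 12
cost_per_work_day = annual_cost / WORK_DAYS_PER_YEAR
share_of_income = annual_cost / ANNUAL_SALARY
income_per_work_day = ANNUAL_SALARY / WORK_DAYS_PER_YEAR

print(f"annual cost:         ${annual_cost}")
print(f"cost per work day:   ${cost_per_work_day:.2f}")
print(f"share of income:     {share_of_income:.2%}")
print(f"income per work day: ${income_per_work_day:.2f}")
```

Under those assumptions a year of every tool costs less than one day's pay, which is the whole argument.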

The reason why JetBrains had so much loyalty to lose in the first place is that their tools are freaking amazing. They are fast, reliable, run on everything, have a bazillion plugins, are easy-to-use2, and have feature polish that is utterly lacking in almost every other peer in the market. I don’t think it’s an exaggeration to say that IntelliJ and its derivatives save me at least several hours of time every single year—and it’s very possible that “a week” is an even better estimate. They make it much, much easier for me to earn as an individual a salary that is over 50% higher than the average US household income. Yet developers have become so addicted to free-as-in-beer software that even this cost, even in this context, strikes them as ridiculous.

The thing is, developers are not alone in having that addiction. It’s why so many companies are forced to give their software away for free, while making sure that the user experience is so awful that most normal people have to pay for support.3 It’s why the app stores are increasingly struggling to charge meaningful amounts for software, and therefore why freemium is a thing in the first place. It’s why ads are becoming so awful and so pervasive, and why services like Amazon Underground are making such a big splash in the news. All of us, developers and normal users alike, have been conditioned to expect that software is now cheap or free.

*When it comes to their own products,* developers do recognize this as a problem, and they have a pretty consistent solution: switch to software-as-a-service. If you’re getting an ongoing benefit from the product (so the logic goes), then I’d like a cut of that ongoing benefit. It’s better for me (I have steady revenue) and it’s better for you (I have a strong incentive now to respond to your requests, and to focus on long-term quality instead of useless headline-grabbing features). And that’s exactly why JetBrains is making this change too, I’m sure: they want to make amazing products; they want to do so even if the best path forward isn’t amazingly weird features, but rather steady and determined improvements; and they want a more stable revenue stream. Software as a service is one of the few sane options left.

There is only one argument I’ve seen against this plan that has any real coherency other than “but I don’t wanna spend money!”, but I think it’s specious when examined. Specifically, some people are pointing out that IntelliJ-as-a-service shuts off if you quit paying, which is a step back from the existing permanent licenses.

You know what? That’s true. But of all the software to worry about going offline, JetBrains’ IDEs are probably the least of my concerns, vastly better than almost every other software-as-a-service I can think of. If I quit paying Amazon, my DNS goes down. If I quit paying Vultr, my blog goes offline. If I quit paying Dropbox, I suddenly lose the ability to keep working with my files. On the desktop side, if I quit paying Adobe, I may literally not even be able to meaningfully view some of my old .psds, and don’t even get me started on what happens if I quit paying for Office365.

But if I quit paying JetBrains, I don’t suddenly lose the ability to change and build my software. tox and mvn still work just fine. I can still build my projects with cmake. Neovim doesn’t suddenly lose the ability to edit my Python. Visual Studio can still do its built-in refactorings. And if I suddenly do find myself in a position where that productivity boost I’d get from, say, WebStorm would actually save me a lot of time, I can pay $20 for a single month4 to speed up that nasty refactoring or debugging session.

I see developers whine all the time about how freemium products and ad-supported software are destroying the industry, and claim that they would be more than happy to pay for software if only offered the chance. Well, here’s your chance: great products at a very reasonable price that you’ll make up for in boosted productivity, almost guaranteed. But instead of putting your money where your mouth is, you’re all whining about how it’s just too damn expensive and unreasonable.

If you don’t want to pay for software anymore, fine. Just don’t be surprised when no one else does, either.


  1. The people saying this apparently being unaware that large chunks of IntelliJ are, in fact, open source (pretty much everything except web components and SQL integration, in fact). That’s how Android Studio can exist in the first place. [return]
  2. By the relatively low standards of IDEs, anyway. [return]
  3. I mean, how else do you explain MongoDB? [return]
  4. Thirty minutes of income if we’re going off that earlier $85,000/year figure. [return]

Unorthodocs: Abandon your DVCS and Return to Sanity

Hi. My name is Benjamin, and I’m a DVCS apologist.

I’ve pretty much always been a DVCS apologist. I know quite a few people who’ve been using DVCSes since Mercurial and Git, and a few who go back to BitKeeper, but I can totally out-hipster you. I was there for Monotone. I actually remember struggling to grok tla, and being happy that someone took the time to write baz. I remember the promise and the failure that was Darcs. I remember thinking that even Darcs was comically primitive, because it was nothing but a poor imitation of Smalltalk DVCSes like Monticello, themselves mere iterative improvements on classic Smalltalk changesets.1

You merely adopted the DVCS. I was born into it, molded by it. I didn’t see CVS until I was already a man2; by then it was nothing to me but a cause for self-inflicted blunt force trauma to the head.3

To me, the arrival of Git and Mercurial was a godsend, not because it was radically different than what I was used to, but because it finally meant that I had a sufficiently advanced general-purpose DVCS that I at last could ~~cleanse the ground with the blood of my enemies~~ obliterate the use of Subversion, CVS, that thingie that Microsoft made before TFS4, and anything else that might cross my path, and replace them with Mercurial (or, in a truly desperate situation, Git) goodness. Hell, I built a whole product centralized around making DVCSes easy and simple to use, and then went on the lecture circuit explaining how to best use them in your workflow.

These aren’t the droids you’re looking for

Somewhere along the way, I think I lost sight of the forest for the trees. I was so actively trying to argue why DVCSes were so superior to the existing centralized source control systems that we had that I never really stopped that long to think about if maybe, just maybe, they in fact weren’t well suited to every single situation that involved source code.

And as a result of the efforts of people like me, we’re now seeing some truly insane “best practices” in the name of adopting Git.5 And mind you, we insist they’re “best practices”; they’re not workarounds, oh no, they’re what you should have been doing since the beginning.

And that’s bullshit.

Today, I’m putting my foot down. I helped start this nonsense, so I’m going to help stop it. If a DVCS is great for your workflow, fine. If the trade-offs it imposes are good for you, great. But let’s stop claiming that they’re free, because they have a cost, and the cost is sometimes not worth it.

Why we fell in love with DVCSes

The actual reason is because GitHub let coders cleanly publish code portfolios, and GitHub happened to use Git as the underlying technology to enable that, but that’s a blog post for another time, so let’s instead pretend that this was purely due to technical reasons.6

It came down to atomicity. CVS treated the atom as the file: a given file had a history, and a file had versions, but the repository really didn’t. Oh sure, there were tags and branches, and those did really operate at the repository level, but those were the tools you reached for around release time—not part of your daily flow.

Subversion treated the atom as the entire repository…almost. I mean, Subversion claimed that’s what it was doing, and the entire repository had one single monotonically increasing version number, but Subversion went at once too far and not far enough: it went too far by saying that *every*thing is really just a convention-over-configuration use of a directory, and that in turn meant it didn’t go far enough because it forced Subversion to think of merging and branching at the file level.

Git and Mercurial (and, for that matter, most other DVCSes) fix that problem by actually treating the whole state of the repository as the atom. As fall-out, they treat merging and branching as repository-level operations. In this, they were a massive improvement over Subversion: by treating the whole repo state as the atom for real, they could do things like sanely track renames, replay merge resolutions, realize when deleting and recreating a file was an actual conflict, and more. This suddenly meant that long-lived branches could be sane, which in turn meant that feature flags could finally die,7 and all was good in the world, amen.

But here’s the rub

Note: we fell in love with DVCSes because they got branching right, but there’s nothing inherently distributed about a VCS with sane branching and merging. You could absolutely make a centralized VCS that got merging just as right as any of these others—and, indeed, long after the horses bolted, Subversion is closing the barn doors by adding better merge tracking, which should land literally any year now. No, the only thing that a DVCS gets you, by definition, is that everyone gets a copy of the full offline history of the entire repository to do with as you please.

Let me tell you something. Of all the time I have ever used DVCSes, over the last twenty years if we count Smalltalk changesets and twelve or so if we don’t, I have wanted to have the full history while offline a grand total of maybe about six times. And this is merely going down over time as bandwidth gets ever more readily available. If you work as a field tech, or on a space station, or in a submarine or something, then okay, sure, this is a huge feature for you. But for the rest of us, I am going to assert that this is not the use case you need to optimize for when choosing a VCS. And don’t even get me started on how many developers seem to assume that they can’t get work done if GitHub goes down. Suffice it to say that I think most developers are pretty fundamentally unaware of how to use their DVCS in a distributed manner in the first place.

That’d be fine if you got the distributed part “for free”, but you don’t. Because until Pied Piper makes a source control system, part of saying that you have the whole history means that you have an awful lot of data, and that causes Problems™. Let’s explore them, shall we?

Say goodbye to blobs (or your sanity (and sometimes both))

Take blobs (a.k.a. binary assets). Blobs are a part of most programs. You need images. You need audio. You need 3D meshes. You need to bundle a ton of fonts, because Android has like three of them, none of which happen to be Wingdings. These are large, opaque files that, while not code, are nevertheless an integral part of your program, and need to be versioned alongside the code if you want to have a meaningful representation of what it takes to build your program at any given point.

That’s fine for a centralized system. You only have one copy at any given point, so the amount of disk space you need is basically the size of the current (or target) version of the repository—not the whole thing.8 We don’t really need to care about how the history is stored.

With a DVCS, though, we have a problem. You do have the whole history at any given point, which in turn means you need to have every version of every blob. But because blobs are usually compressed already, they usually don’t compress or diff well at all, and because they tend to be large, this means that your repository can bloat to really huge sizes really fast.
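If you want to see the "already compressed data doesn't compress" part for yourself, here is a small sketch using Python's zlib. The input is made-up, source-like text rather than a real repository, and a second pass of zlib stands in for a VCS trying to squeeze an already-compressed asset such as a PNG or an MP3.

```python
import random
import zlib

# Made-up "source-like" text: a small vocabulary repeated in random order.
rng = random.Random(0)
VOCAB = ["commit", "branch", "merge", "tree", "blob", "tag", "push", "pull"]
source_like = " ".join(rng.choice(VOCAB) for _ in range(20_000)).encode()

once = zlib.compress(source_like)   # text shrinks a lot
twice = zlib.compress(once)         # compressed bytes barely shrink, if at all

print(f"raw text:         {len(source_like)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The second pass gets essentially nothing, which is why every new revision of a blob costs you roughly its full size again in every clone.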

A sane engineer at this point would say, “Well, if you’re doing that kind of thing, maybe you should use a centralized system,” but we’re in the DVCS cult now, so we don’t do that. Oh no, not us. We instead tell you that blobs are totally different from source code, and they really should be versioned in their own store, and then we invent insanity like git-annex so that you can have a non-distributed separately-configured non-Git-like store alongside your source code.9

You know what kind of external binary asset management system works really well? Subversion. Turns out it works tolerably for source control, too. Maybe you should take a look at it.

Say goodbye to sane version synchronization

It’s not just blobs that DVCSes don’t handle well. It’s also repositories with really big histories and/or tons of files. That’s both because the repository can again get extremely large—Emacs, a single application with no bundled dependencies, clocks in with a 313 MB repository, for example, and Linux itself happily runs along at about 1 GB—but also because the structures traditionally used by Git simply don’t scale well to very large repositories. In Git, for example, directories (which Git, in what I assume is a nod to ecoterrorists, calls trees) are identified by their SHAs, which are in turn determined by their contents, which in turn is defined by the SHAs of any trees they contain, recursively. This means that changing a file in a deep directory will require generating new trees for every directory up the chain—and, of course, that figuring out what changed in a given directory requires loading up all the trees going down the chain. Wondered why git blame runs slow as hell on big repos? Now you know.
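Here is a rough sketch of that recursive hashing. It is deliberately simplified and is not Git's real object format (real trees carry file modes and binary headers); `blob_id`, `tree_id`, and `all_tree_ids` are hypothetical helpers, and the tiny `repo` dictionary is invented for illustration.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Identify a file purely by its contents."""
    return hashlib.sha1(content).hexdigest()


def tree_id(tree: dict) -> str:
    """Identify a directory by the names and ids of everything inside it."""
    entries = []
    for name in sorted(tree):
        child = tree[name]
        if isinstance(child, dict):
            entries.append(f"tree {tree_id(child)} {name}")
        else:
            entries.append(f"blob {blob_id(child)} {name}")
    return hashlib.sha1("\n".join(entries).encode()).hexdigest()


def all_tree_ids(tree: dict, path: str = "") -> dict:
    """Map every directory path to its id, walking the whole hierarchy."""
    ids = {path or "/": tree_id(tree)}
    for name, child in tree.items():
        if isinstance(child, dict):
            ids.update(all_tree_ids(child, f"{path}/{name}"))
    return ids


repo = {
    "README": b"hello",
    "docs": {"index.md": b"docs"},
    "src": {"util.c": b"/* util */", "core": {"main.c": b"int main() {}"}},
}

before = all_tree_ids(repo)
repo["src"]["core"]["main.c"] = b"int main() { return 0; }"  # one deep edit
after = all_tree_ids(repo)

changed = sorted(path for path in before if before[path] != after[path])
print(changed)
```

One edit to a deeply nested file gives new ids to its directory and every ancestor up to the root, while sibling directories like docs keep theirs. Now imagine that chain in a tree with hundreds of thousands of directories.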

But this isn’t a problem to us DVCS apologists. Nosiree! We instead told you to chunk up your project into a gazillion repos, and that this was Better™, because it forced you to break up your code base into tiny pieces. That’s right: you should be happy that DVCSes can’t scale to this size, because it made you code better.

This is stupid. This is like saying that chopping off your arms is good because it forces you to get really good at tying your shoes with your teeth. As much as I am a big fan of the Zen saying about the sound of no hands clapping,10 this argument is specious at best, justifying why a weakness is acceptable by claiming it’s superior.

But what makes it more absurd is that not even DVCS proponents really believe themselves when you get down to it. Instead, they design all kinds of tools to navigate around this issue. Google wrote repo. Git gained submodules, and Mercurial gained subrepositories. Build systems suddenly learned how to speak Git and Mercurial due to the inevitable repository explosion that would accompany any decent-sized project. All to work around a nominal design improvement.

Facebook and Google know that keeping all your source code in one single repository is good, because it turns out that using an SCM to manage your source dependencies by this concept of, you know, versioning the source code, is kind of the whole point of using them in the first place. Now of course sometimes you do genuinely want separate repositories, and these align with exactly when you wanted them in Subversion and CVS, too. But let’s admit that saying “small is good” is a complete misfeature. You’re giving something up by going that route, not gaining something in the form of BDSM dependency management.

Say what again, I dare you, I double dare you

All of the above might be almost tolerable if DVCSes were easier to use than traditional SCMs, but they aren’t.

You needn’t look further than how many books and websites exist on Git to realize that you are looking at something deeply mucked up. Back in ye days O Subversionne, we just had the Red Book, and it was good, and the people were happy.11 Sure, there were other tutorials that covered some esoteric things that you never needed to do,12 but those were few and far between. The story with CVS was largely the same.

And then Git happened. Git is so amazingly simple to use that APress, a single publisher, needs to have three different books on how to use it. It’s so simple that Atlassian and GitHub both felt a need to write their own online tutorials to try to clarify the main Git tutorial on the actual Git website. It’s so transparent that developers routinely tell me that the easiest way to learn Git is to start with its file formats and work up to the commands. And yet, when someone dares to say that Git is harder than other SCMs, they inevitably get yelled at, in what I can only assume is a combination of Stockholm syndrome and groupthink run amok by overdosing on five-hour energy buckets.

Here’s a tip: if you say something’s hard, and everyone starts screaming at you—sometimes literally—that it’s easy, then it’s really hard. The people yelling at you are trying desperately to pretend like it was easy so they don’t feel like an idiot for how long it took them to figure things out. This in turn makes you feel like an idiot for taking so long to grasp the “easy” concept, so you happily pay it forward, and we come to one of the two great Emperor Has No Clothes moments in computing.13

Do you hear the people sing, singing the song of angry men

There is one last major argument that I hear all the time about why DVCSes are superior, which is that they somehow enable more democratic code development, which is code for “I think GitHub pull requests are the Alpha and the Omega of software development, all hail Xenu.” Indeed, to listen to this, you would believe that open-source development was impossible, or at least absolutely horrible, before GitHub sprang into melodious existence.

This is like someone with hardcore Stockholm syndrome complimenting their kidnapper’s pancakes. Prior to GitHub, to send a patch to a project, you needed to

  1. Get a copy of the source code
  2. Make your change
  3. Generate a patch with diff
  4. Email it to the mailing list
  5. Watch it get ignored

Whereas with the GitHub pull-request model, you instead need to

  1. Fork the repository on GitHub
  2. Clone your fork of the source code
  3. Make sure you’re on the right branch that upstream expects your patch to be based on, because they totally won’t take patches on master if they expect them on dev or vice-versa.
  4. Make a new local branch for your patch
  5. Go ahead and make the patch
  6. Do a commit
  7. Push to a new branch on your GitHub fork
  8. Go to the GitHub UI and create a pull request
  9. Watch it get ignored

I see this workflow done for the tiniest of tiny projects on the grounds that “it makes things easier”. Yet I watch OpenBSD, an entire freaking operating system, get by just fine with CVS—CVS—and patch bombs. Hell, Mercurial itself, which is, you know, a DVCS, does development via patchbomb emails, not pull requests.

And this extra complication doesn’t really get you anything. You still frequently need to rebase patches when they don’t merge cleanly, just like you used to have to tinker with patch fuzz factors. You still get ignored, as 250,000+ open PRs that haven’t been touched in at least two months can testify. In fact, the only actual advantage I can see is that you get your name explicitly in the commit history instead of in a THANKS or a Changelog or a README. I mean, good job if that’s what you want, but maybe admit that it’s about vanity and not about tooling.

Oh let’s go back to the start

We aren’t going to abandon DVCSes. And honestly, at the end of the day, I don’t know if I want you to. I am, after all, still a DVCS apologist, and I still want to use DVCSes, because I happen to like them a ton. But I do think it’s time all of us apologists take a step back, put down the crazy juice, and admit, if only for a moment, that we have made things horrendously more complicated to achieve ends that could have been met in many other ways.

Thankfully, this whole point may be moot soon. Facebook and Google are putting in a tremendous amount of effort to make Mercurial scale to gargantuan code bases with ease—which, while it largely removes the D from DVCS, also means that you will be able to use at least Mercurial in a completely sane way with the completely reasonable workflows you want to manage code. Git may get there someday, too.14 So at the end of the day, we’ll end up, a decade later, right back where we could have been at the beginning: having a centralized SCM that can scale to insane sizes, and that also has sane branching and merging.

Long live the centralized SCM.


  1. Whether Smalltalk changesets count as a DVCS, or as patches, is a bit complicated. They’re technically instructions that will be executed by the VM, and therefore they’re almost scripts, but they usually contained the whole method-by-method history for how the image came to be, which meant that they could function similarly to a DVCS.
  2. This is actually not true.
  3. This is actually also not true, but very, very truthy.
  4. The storage format was that of Mordor, which we will not utter here.
  5. And it is Git, because Git “won”, which it did because GitHub “won”, because coding is and always has been a popularity contest, and we are very concerned about either being on the most popular side, or about claiming that the only reason literally everyone doesn’t agree with us is because they’re stupid idiots who don’t know better. For a nominally nuanced profession, we sure do see things in binary.
  6. It was also due to technical reasons, mind, but claiming that DVCSes arose purely due to technical concerns would be like claiming we have web servers running Windows because Linux servers didn’t have virus scanners we could trust at the time.
  7. And subsequently get resurrected when people remembered that they actually serve a purpose in real life where ideological purity gets in the way of making kosher turkey bacon.
  8. This isn’t quite true. Subversion and CVS actually both store a complete, redundant, uncompressed copy of everything in your source tree to allow you to revert without talking to the server. This is why Subversion checkouts can actually be larger than Mercurial and Git versions of a given repository. That said, this distinction quits mattering in repositories with reasonable numbers of blobs, for reasons we’ll see in a second.
  9. I’m picking on Git here because it’s more popular, but Mercurial’s equivalent largefiles extension makes basically exactly the same trade-offs, albeit in my opinion with a better UI.
  10. Close enough.
  11. For being programmers, anyway.
  12. Like cloning a full repository for offline use.
  13. We’ll save the other one for another day.
  14. Sike.

Don’t Forget to Take Vacation

Hello, world!

A lot of you are on the last bits of your vacation this week. That is awesome. There is likely no better time you can take vacation. Your team has hopefully shipped all deliverables for 2014 Q4. You have likely planned out Q1. You almost certainly have no real bugs in production. Cthulhu willing, you have automatic regression and integration tests so that you can rest assured knowing that The Person Who Does Not Vacation can safely fix anything that does come up. You’re in really good shape. This is an insanely good time to unplug your computer, toss it out the window, and then pour yourself more eggnog and rum as you realize that your employer owns that laptop and doesn’t have it insured against metaphoric defenestration. Point is, this is a good time to vacation.

Which is why I am disappointed that most of you on vacation apparently do not know how to vacation.

I have therefore written a helpful guide.

How to Vacation

  • DO spend time face-to-face with your family. If you are unsure how to do this, pretend you are texting with them via Siri, but omit the “Hey Siri, text X that…” part. Note that this only works over local-area network and does not generally work well through doors.
  • DO NOT check work email. Your employer has everyone’s cell phone. If it’s seriously that important, don’t call them, they’ll call you. I promise.
  • DO go sledding if you are in a snowy region, because it’s awesome to be an adult and act like a kid, and besides, it’s been way too long since you broke your arm and the story started with “okay so there was this awesome hill and all I had was a cafeteria tray…”
  • DO NOT do code reviews. There is not one single year-end feature blocked by code reviews. You are procrastinating from relaxing by doing work. Think about that. How insane is that? I mean seriously, right?
  • DO go swimming if you are in a swimming region. Santa may be drowning right now, what with all that velvet coat situation and stuff. Only you can save him. And if you can’t find him, hey, your skin won’t tan itself. Pass a mimosa.
  • DO NOT fix bugs in your product. Chances are ludicrously high you are at what Mozilla lovingly calls zarro boogs: there are obviously bugs, but you know them and they all have workarounds. This is an excellent place to be. You want to stay here.
  • DO do things you don’t normally have time for. Start the push up challenge. Reread Harry Potter (again (again)). Write a spec for Mean Girls 3: Clique Ahoy. Play Candy Land with your kids/nieces/nephews “because they insist,” even though you’re the one who braved an Indiana-Jones–esque level of spider webs to dig it out of the moldy chest in the basement. Get that space station built in Kerbal just in time to realize you ran out of money to get kerbals to the space station. The sky and/or your checkbook is/are the limit.
  • DO NOT write new features. You see how you’re at zarro boogs? Do you want actual bugs? Because that’s how you get actual bugs.

That said, some of you, including me, like to code to relax. That’s cool. But do something not related to your day job to get your head out of the space. Learn Kotlin or OCaml. Write a plugin to Slack for MegaHAL. Rewrite your blog in Pharo. Play with OvertureJS from FastMail. Spend four hours customizing Vim or Emacs and don’t feel bad about procrastinating. Do something that is as far away from work as possible.

Use your vacation to take an actual vacation. You’re going to have an intense start to Q1. You’re in software; I know. Everyone wants you relaxed and refreshed and excited.

Please vacation. Vacation like there’s no tomorrow.

Happy holidays,
—Benjamin

Having Fun: Python and Elasticsearch, Part 3

Welcome back to having fun with Elasticsearch and Python. In the first part of this series, we learned the basics of setting up and running with Elasticsearch, and wrote just enough code to cover basic indexing and searching of Gmail metadata. In the second part, we extended the search and querying to cover the full text of the emails as well.

That theoretically got us most of what we wanted, but there’s still work to be done. Even for a toy, this isn’t doing quite what I want yet, so let’s see what we can do with another thirty minutes of work.

Improving the Query Tool

There are two big sticking points. First, right now, we’re just passing the raw search queries to Elasticsearch and relying on Lucene’s search syntax to take care of things. That’s kind of okay, but it means we can’t easily do something I really care about, which is saying that labels must match, while everything else can be best-effort. Second, we’re not printing out all the data I want; while we did a basic extension of the query tool last time, that data is still kind of disgusting and annoying to read through.

Fixing the first part isn’t too bad. Remember earlier how I said that Elasticsearch provides a structured query language you can use? Let’s use it to solve our problem.

The structured query language is really just a JSON document describing what you’re looking for. The JSON document has a single top level key, query, which then has sub-elements describing exactly what we’re trying to query and how. For example, if you wanted to look at all the documents in a given index, that’s just

{
  "query": {
    "match_all": {}
  }
}

This, of course, is a bit silly; you usually want to look for something. For example, to explicitly match all subjects that contain go, we could do something like

{
  "query": {
    "match": {
      "subject": "go"
    }
  }
}

match is a simple query operator that runs your text through the field’s analyzer before matching, so, depending on how that analyzer is configured (one that stems words, for example), “go” can also match “going”, “goes”, and the like. Using the query DSL from Python is really simple. Give it a shot in your Python prompt by passing it as the body parameter to es.search(). E.g.,

es.search('mail', 'message', body={
    'query': {
        'match': {
            'subject': 'go',
        }
    }
})

Of course, what we want to do here is to leverage the real power of the DSL search syntax. We want to be able to combine queries in specific ways. For example, I mentioned earlier that I wanted to require labels, while treating everything else as requested but optional.

Thankfully, Elasticsearch provides the bool query operator to allow just this:

{
  "query": {
    "bool": {
      "must": [{"match": {"labels": "camlistore"}}],
      "should": [{"match": {"subject": "go"}}]
    }
  }
}

bool takes a dictionary containing at least one of must, should, and must_not, each of which takes a list of matches or other further search operators. In this case, we only care about must versus should: while our labels must match, the text in general should match if it can, but it’s okay if we don’t have an exact correspondence.
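
To get a feel for how the three clauses fit together, here is a rough sketch that builds the dictionary by hand. The bool_query helper is a name I made up for illustration; it is not part of elasticsearch-py, and it leaves out any clause you don't supply. It's plain Python, so you can poke at it without a running cluster.

```python
import json


def bool_query(must=None, should=None, must_not=None):
    """Build a bool query body from {field: text} dicts.

    Hypothetical helper for illustration; not part of elasticsearch-py.
    """
    clauses = {}
    for name, fields in (('must', must), ('should', should), ('must_not', must_not)):
        if fields:
            # Each field/text pair becomes its own match clause.
            clauses[name] = [{'match': {k: v}} for k, v in sorted(fields.items())]
    return {'query': {'bool': clauses}}


q = bool_query(must={'labels': 'camlistore'}, should={'subject': 'go'})
print(json.dumps(q, indent=2, sort_keys=True))
```

The printed document has the same shape as the JSON above, which is all es.search() needs as its body.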

So let’s put everything together here.

First, we unfortunately have to replicate the Gmail-style query axes that Lucene gave us for free. Doing this properly would require writing a legitimate (if tiny) parser. While that can be done pretty easily, it’s a bit out of scope for this series, so we’ll cheat: since we know our keys can’t contain colons, we’ll say that everything must be in the rigid format header:value. If you want to specify that multiple values can match, we’ll allow you to specify a given header multiple times. If a given token has no : in it, then we’ll assume it’s part of the full-body search. That would leave us with code that looks something like this:

import io
from collections import defaultdict
fulltext = io.StringIO()
keywords = defaultdict(str)
for token in query.split():
    idx = token.find(':')
    if 0 <= idx < len(token):
        key, value = token.split(':', 1)
        keywords[key] += ' ' + value
    else:
        fulltext.write(' ' + token)

That will allow us to search to, from, labels, and so on using exact matching, while still keeping track of anything that isn’t a key/value field so we can use it for a fuzzy body search.

We also introduced a new class from Python’s standard library, defaultdict. defaultdict is one of those tools that, once you learn about, you can’t put down. defaultdict takes a function that returns a default value to be used when you attempt to access a key that doesn’t exist. Since str() returns an empty string (''), we can avoid having to do a bunch of checks for existing keys, and instead simply directly concatenate to the default value.
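
If you haven't used defaultdict before, a quick side-by-side may help. This is just a sketch with made-up sample pairs, showing the key check a plain dict needs and how defaultdict(str) makes it disappear:

```python
from collections import defaultdict

pairs = [('to', 'alice'), ('to', 'bob'), ('labels', 'work')]

# With a plain dict, we have to check for the key before appending.
plain = {}
for key, value in pairs:
    if key not in plain:
        plain[key] = ''
    plain[key] += ' ' + value

# With defaultdict(str), a missing key is created on first access as str(), i.e. ''.
keywords = defaultdict(str)
for key, value in pairs:
    keywords[key] += ' ' + value

print(dict(keywords))
```

One thing to watch: merely reading a missing key, as in keywords['cc'], also creates it. Use in or .get() if you only want to peek.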

Next, we just need to take the keywords dict we built above and put that into the must field, and then take the fulltext string we built up and use that for the body match. This is really straightforward to do in Python:

q = {
    'query': {
        'bool': {
            'must': [{'match': {k: v}} for k, v in keywords.items()]
        }
    }
}

fulltext = fulltext.getvalue()
if fulltext:
    q['query']['bool']['should'] = [{'match': {'contents': fulltext}}]

That’s it; we’ve got our must and our should, all combined into an Elasticsearch DSL search query.
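
If you'd like to see what comes out for a concrete query without touching a cluster, one option is to wrap the two steps in a function. This is a sketch rather than the code we'll actually ship below: build_query is a name I'm inventing here, it collects the free text in a list instead of a StringIO, it trims the leading space from each value, and it sorts the keys so the output is predictable.

```python
import json
from collections import defaultdict


def build_query(query):
    """Turn 'header:value free text' into a bool query dict (illustrative sketch)."""
    keywords = defaultdict(str)
    fulltext = []
    for token in query.split():
        if ':' in token:
            # header:value tokens become required matches.
            key, value = token.split(':', 1)
            keywords[key] += ' ' + value
        else:
            # Everything else is saved for the best-effort body search.
            fulltext.append(token)

    q = {'query': {'bool': {
        'must': [{'match': {k: v.strip()}} for k, v in sorted(keywords.items())],
    }}}
    if fulltext:
        q['query']['bool']['should'] = [{'match': {'contents': ' '.join(fulltext)}}]
    return q


print(json.dumps(build_query('labels:camlistore from:brad go bug'), indent=2))
```

Repeating a header, as in labels:a labels:b, folds both values into a single match clause for that field.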

There’s only one piece missing at this point: enhancing the query output itself. While we took a passing stab at this last time, we really want to do a bit better. In particular, since we’re indexing the entire message, it’d sure be nice if we showed part of the entire message.

Doing this properly gets complicated, but we can make use of two easy tricks in Python to get something that works “well-enough” for most of our use cases. First, we can use the re module to write a regex that will replace all instances of annoying whitespace characters (tabs, newlines, carriage returns, and so on) with single spaces. Such a regex is simply [\r\n\t], so that’s easy enough. Second, Python allows us to trivially truncate a string using list slicing syntax, so s[:80] will return up to the first 80 characters in s.
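
Here are those two tricks on their own, as a small sketch. The preview function and the sample message body are mine, purely for illustration:

```python
import re


def preview(text, width=80):
    """Swap tabs and line breaks for spaces, then keep at most `width` characters."""
    return re.sub(r'[\r\n\t]', ' ', text)[:width]


body = 'Hi all,\r\n\r\n\tThe build is green again.\nThanks!'
print(preview(body))
print(preview(body, width=10))
```

Note that each matched character is replaced individually, so a \r\n pair turns into two spaces. If you'd rather collapse whole runs of whitespace into one space, a pattern like r'\s+' is a different technique that does that, at the cost of also squeezing ordinary runs of spaces.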

Finally, we want to pull in one more special function from the Python standard library, textwrap.dedent. textwrap is a module that contains all sorts of useful utility functions for wrapping text. A shocker, I know. dedent is an incredibly handy function that simply removes the leading white space common to every line of a string. This is incredibly useful when writing strings inline in a Python file, because you can keep the string itself properly indented with the rest of the code, but have it output to the screen flush at the left margin. We can use this to make writing our template string a lot cleaner than last time.
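
A tiny sketch of dedent in isolation, using a made-up describe function and sample values. The template sits at the same indentation as the code around it, but comes out flush left:

```python
import textwrap


def describe(subject, sender):
    # The template is indented to match the code around it; dedent strips
    # the whitespace shared by every line before we fill in the values.
    template = textwrap.dedent('''\
        Subject: {}
        From: {}
        ''')
    return template.format(subject, sender)


print(describe('Lunch?', 'alice@example.com'))
```

In this sketch I dedent the template before calling format, a small variation that keeps a value containing its own line breaks from changing what counts as the common margin.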

Putting this all together, our display code would look like this:

es = elasticsearch.Elasticsearch()
matches = es.search('mail', 'message', body=q)
hits = matches['hits']['hits']
if not hits:
    click.echo('No matches found')
else:
    if raw_result:
        click.echo(json.dumps(matches, indent=4))
    for hit in hits:
        click.echo(textwrap.dedent('''\
            Subject: {}
            From: {}
            To: {}
            Content: {}...
            Path: {}
            '''.format(
            hit['_source']['subject'],
            hit['_source']['from'],
            hit['_source']['to'],
            re.sub(r'[\r\n\t]', ' ', hit['_source']['contents'])[:80],
            hit['_source']['path']
        )))

The only tricky part is that we’ve combined the regex substitution and the truncation together. Otherwise, this is a very straightforward modification of what we already had.

That’s it. Here’s the full version, including all the imports and the initial #! line, in case you don’t want to perform all of the edits by hand:

#!/usr/bin/env python
    
import io
import json
import re
import textwrap
from collections import defaultdict
    
import click
import elasticsearch
    
    
@click.command()
@click.argument('query', required=True)
@click.option('--raw-result/--no-raw-result', default=False)
def search(query, raw_result):
    fulltext = io.StringIO()
    keywords = defaultdict(str)
    for token in query.split():
        idx = token.find(':')
        if 0 <= idx < len(token):
            key, value = token.split(':', 1)
            keywords[key] += ' ' + value
        else:
            fulltext.write(' ' + token)
    
    q = {
        'query': {
            'bool': {
                'must': [{'match': {k: v}} for k, v in keywords.items()]
            }
        }
    }
    
    fulltext = fulltext.getvalue()
    if fulltext:
        q['query']['bool']['should'] = [{'match': {'contents': fulltext}}]
    
    es = elasticsearch.Elasticsearch()
    matches = es.search('mail', 'message', body=q)
    hits = matches['hits']['hits']
    if not hits:
        click.echo('No matches found')
    else:
        if raw_result:
            click.echo(json.dumps(matches, indent=4))
        for hit in hits:
            click.echo(textwrap.dedent('''\
                Subject: {}
                From: {}
                To: {}
                Content: {}...
                Path: {}
                '''.format(
                hit['_source']['subject'],
                hit['_source']['from'],
                hit['_source']['to'],
                re.sub(r'[\r\n\t]', ' ', hit['_source']['contents'])[:80],
                hit['_source']['path']
            )))

if __name__ == '__main__':
    search()

That’s it. We now have a full command-line search utility that can look through all of our Gmail messages in mere moments, thanks to the power of Python and Elasticsearch. Easy as pie.

Of course, command-line tools are cool, but it’d be really nice if we had a more friendly, graphical interface to use our tool. Thankfully, as we’ll see next time, it’s incredibly easy to extend our tool and start making it into something a bit more friendly for the casual end-user. For now, we’ve demonstrated how we can really trivially get a lot done in very little time with Python and Elasticsearch.

C++ Programming and Brain RAM

I have a tricky relationship with C++. There is a narrow subset of the language that, when properly used, I find to be a strict improvement over C. Specifically, careful use of namespaces, RAII, some pieces of the STL (such as std::string and std::unique_ptr), and a very small bit of light templating can actually simplify a lot of common C patterns, while making it a lot harder to shoot yourself in the foot via macros and memory leaks.

That said, C++ faces a choking combination of wanting to simultaneously maintain backwards compatibility and also extend the language to be more powerful and flexible. I’ve recently been reading the final draft of Effective Modern C++ by Scott Meyers. It is an excellently written book, and it does a superb job covering what new features have been introduced in the last couple of C++ versions, and how to make your code base properly take advantage of them.

And a lot of the new stuff in C++ is awesome. I had a chance to start taking advantage of the new features when I was working at Fog Creek on [MESSAGE REDACTED], and I was actually really pleasantly surprised by how much of an improvement the additions made in my day-to-day coding. In fact, I was so pleasantly surprised that I actually gave a whole presentation on how C++ didn’t have to be awful.

But reading through Scott’s book the last few days has also reminded me why I was somewhat relieved to effectively abandon C++ when I joined Knewton.

Take move semantics, one of the hallmark features of C++11/14. Previously, in C++, you always had to either pass around pointers to objects, or copy them around.1 This is a problem, because pointers aren’t easily amenable to C++-style RAII-based garbage collection, yet copies are very expensive. In order to get your memory safety and your speed, you end up having a lot of code where you semantically want something like

std::vector<something> generate_somethings();

...

std::vector<something> foo = generate_somethings();

but, for performance reasons, you have to actually write something closer to

void generate_somethings(std::vector<something> &empty_vector);

...

std::vector<something> foo;
generate_somethings(foo);

As much as C++ developers rapidly acclimate to this pattern, I think we can safely agree that it’s much less clear than the far-less-efficient first variant. You can’t even tell, simply by looking at the call site, that foo is mutated. You can infer it, certainly, but you have to actually find the prototype to be sure.

In theory, move semantics (also known as rvalue references) allow C++ to explicitly acknowledge when a value is “dead” in a specific context, which allows for much greater efficiency and clarity. The reason it’s called “move semantics” comes from the idea that you can move the contents of the old object to the new one, rather than copying them, since you know that the old object can no longer be referenced. For example, if you’re moving a std::string from one variable to another, you could simply assign the underlying char * and length, rather than making a full-blown copy of the underlying buffer, even if neither string is a const. The original can’t be accessed anymore, so it’s fine if you mutate memory that the original owned to your heart’s content.

In practice, though, things aren’t that simple. Scott Meyers helpfully notes that

std::move doesn’t move anything, for example […]. Move operations aren’t always cheaper than copying; when they are, they’re not always as cheap as you’d expect; and they’re not always called in a context where moving is valid. The construct type&& doesn’t always represent an rvalue reference.2

Got that?

In fact, Scott’s point is obvious if you understand how C++ gets realized under the hood. For example, when returning from a function, anything on the stack that’s returned via rvalue reference is going to have to be copied, so you’re only going to win if the object has enough data on the heap that moving actually saves copies. But understanding that requires you already bring a lot of C++ knowledge to the table.

This is a fractal issue with modern C++. Congratulations, you get type inference via auto! auto type inference works via template type inference, so make sure you understand that first.3 This comes up in especially fun situations, like Foo &&bar = quux() being an rvalue reference, but auto&& bar = quux() not necessarily being one.

Or to quit picking on rvalue references, how about special member generation, meaning those freebies like default constructors and copy constructors that the compiler will write on your behalf if you don’t write them? There are two new ones in C++11, the move constructor and the move assignment operator, and the compiler will write them for you!…unless you wrote one of the two, in which case, unlike all the other special members, you have to write both. But at least that’ll be a compile-time issue, whereas, if you have an explicit copy constructor, you actually won’t get either of the move-related special members autogenerated; you’ll have to write both yourself if you want them. This isn’t purely academic: if you add a copy constructor and forget this fact, you may get a chance to enjoy an “unexplained” slowdown in your code when your silently generated move constructor vanishes.

To be clear again, these rules are emphatically not arbitrary. They make complete sense if you take a step back and think about why the standard would have mandated things work this way. But it’s not immediately transparent; you have to think.

And this is why I find it so amazingly hard to write code productively in C++. My brain has a limited amount of working memory. When I’m writing in a language with a simple runtime and syntax, such as C, Go, Python, Smalltalk, or (to an arguably slightly lesser extent) OCaml, then I need to dedicate relatively little space in my brain to nuances of the language. I can spend nearly all of my working space on solving the actual problem at hand.

When I write in C++, by contrast, I find that I’m constantly having to dedicate a large amount of thought to what the underlying C++ is actually going to do. Is this a template, a macro, or an inline function? Was that the right choice? How many copies of this templated class am I actually generating in the compiled code? If I switch this container to have const members, is that going to speed things up, or slow them down? Is this class used in a DLL for some silly reason? If so, how can I make this change without altering the vtable? Is this function supposed to be called from C? Do I even need to care in this instance?

It’s not that I can’t do this. I did it for years, and, as I noted, I was voluntarily, intentionally working in C++ for the last couple of months I was at Fog Creek. Sometimes, at least for now, C++ is unquestionably the right tool, and that project was one of those times. But as happy as I am that C++ is getting a lot of love, and that working with it is increasingly less painful, I can’t help but feel that the amount of baggage it’s dragging around at this point means that I have to spend far too much of my brain on the language, not the problem at hand. My brain RAM ends up being all about C++; most of the problem gets swapped to disk.

C++ still has a place in my toolbox, but I’m very, very glad that improvements elsewhere in the ecosystem are ever shrinking the tasks that require it. I’m optimistic that languages like Rust may shrink its uses even further, and that I may live to see when the answer to “when is C++ the best tool for the job?” can finally genuinely be “never.” In the meantime, if you have to write C++, go buy Effective Modern C++.


  1. I’m oversimplifying slightly, mostly by omitting things like std::auto_ptr and boost::scoped_ptr, but they don’t really change my point.
  2. Effective Modern C++, p. 355.
  3. Did you know templates had type inference? No? Me neither. I somehow was able to work in C++ for several years without learning this fact, and am now scared to look back at my old code and figure out how dumb some of it is.