The “Hole in the middle” pattern
July 10, 2007 | 06:56 pm

Surfing on Chia’s Functional Longing article, I wanted to post an experience I had recently, working on some C# code. The point of this blog entry is that it’s not what a programming language makes possible, it’s what a programming language makes easy which determines what patterns are common and what patterns aren’t.

What the code was doing was making a bunch of COM calls to another program, which wasn’t ours- specifically, it was a plugin to Excel. Thus I was coding this in C#, because of the three possible languages, VB.NET, C#, and C++, I disliked C# the least. Now, the problem is that this particular plugin was not a model for robust, well-designed code. Actually, it’s pretty close to being a poster child for bad code, and the developers are obviously well into the active/stupid quadrant (among other things, they have reimplemented SQL, badly…). In any case, talking COM to this combination has a bad habit of throwing COMExceptions.

Now, sometimes, when it throws a COMException, what it really means is “I’m busy right now- try again later”, and backing off for a while and trying again will allow the call to succeed. Sometimes, what it really means is “I’m totally screwed- kill Excel, run this other program to clean up the debris, and start the whole process over again.” Unfortunately, there’s no easy way to tell which is which- except that the latter is a permanent state.

So I had this whole block of code which caught the thrown COMException exceptions, and repeatedly backed off and tried again, until it gave up. And this block of code was cut & pasted everywhere I needed to make a COM call. There is a reason this style of programming is considered bad, as we discovered in the Y2K fiasco. The problem is that when you need to fix the cut and pasted code, you need to track down everywhere it was cut and pasted, and fix it everywhere. Which was the problem with Y2K- not fixing the problem, but finding all the damned places that needed to be fixed. The inevitable happened, of course- the inevitable has a distressing way of always happening. I needed to fix all my cut and pasted code. While fixing the problem, I also wanted to refactor the code so that the next time I had to fix this problem, I’d only have to fix it in one place.

And ran smack dab into a functional/object oriented impedance mismatch.

See, there’s a pattern I’ve gotten used to in Ocaml (and have even used in Perl), which I think of as the “hole in the middle” pattern. The basic idea is that you have two pieces of code that are almost exactly identical, except for that little middle part there. You factor out the common code into a single function, which takes a function pointer as an argument. The middle part in the shared code is replaced by a call to the function pointer, and the two places which are being combined simply call the combined function with a pointer to a function that contains the unique part.

In my COMException example, the code I wanted to write would have looked something like this in Ocaml:

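exception COMException of string  (* stand-in for the .NET exception *)

let run_com_command f =
    try
        f ()
    with
    | COMException e ->
        (* fancy backoff/restart logic here; re-raising is only a placeholder *)
        raise (COMException e)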
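
To give the “fancy backoff/restart logic” some shape, here is a minimal sketch of what it might look like. Everything in it is an assumption made for illustration: the COMException declaration is a stand-in for the .NET exception, and max_tries, delay, and sleep are hypothetical parameters, not anything from the real code. It retries with a doubling delay and lets the exception escape once it runs out of tries (the restart-everything case is left out).

```ocaml
(* Hypothetical stand-in for the .NET COMException. *)
exception COMException of string

(* Run [f]; if it raises COMException, back off and try again, up to
   [max_tries] attempts in total. [sleep] is passed in so the sketch
   needs no extra libraries; real code might pass Unix.sleepf.
   The delay doubles after every failed attempt. *)
let run_com_command ?(max_tries = 5) ?(delay = 0.1) ?(sleep = fun _ -> ()) f =
  let rec attempt n delay =
    try f ()
    with COMException _ when n < max_tries ->
      sleep delay;
      attempt (n + 1) (delay *. 2.0)
  in
  attempt 1 delay

(* Usage: the "busy" failures are absorbed, the final answer comes back. *)
let answer =
  let calls = ref 0 in
  run_com_command (fun () ->
      incr calls;
      if !calls < 3 then raise (COMException "busy") else "ok")
```

The guard on the handler is what separates “try again later” from “give up”: on the last attempt the pattern no longer matches, so the exception propagates to whoever is in a position to kill Excel and start over.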

Most of the places where I was doing COM commands were, excepting the infrastructure of backing off and/or restarting, one liners. Unfortunately, as we shall see, they were one-liners that used and set local variables in non-static context. For Ocaml, no problem. Just use fun:


    let local_var = some value in
    let set_value = run_com_command (fun () -> my com command) in
    …

Because it’s so easy, and keeps things simple, Ocaml encourages the use of this pattern. There is very little overhead to factoring out code in this way, so even if the commonly factored code is small, it’s still useful. Standard Ocaml functions like map, fold, and iter can be seen as examples of this hole in the middle pattern. You’re running down a list, doing something to every element of the list- all code doing that is exactly the same, except for that middle part- what exactly it is you’re doing with every element. It’s not like the Ocaml code for running down a list is all that complicated- it’s about three lines of code, not counting the middle bit. But as we’re replacing three lines of code with a single line, it’s still worth it, even in marginal cases like this. And note that the my com command code above can simply access the local_var variable without doing anything special (Ocaml simply turns it into a hidden extra parameter to the created function, and then partially applies it).
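
For the curious, here is roughly what those “about three lines” look like when written out by hand. This is a sketch, not the standard library’s source; my_map and my_iter are names I made up so they don’t collide with List.map and List.iter.

```ocaml
(* The shared "run down the list" code; [f] is the hole in the middle. *)
let rec my_map f = function
  | [] -> []
  | x :: rest ->
    let y = f x in
    y :: my_map f rest

let rec my_iter f = function
  | [] -> ()
  | x :: rest -> f x; my_iter f rest

(* The middle bit reads the local [offset] with no extra plumbing. *)
let shifted =
  let offset = 10 in
  my_map (fun x -> x + offset) [1; 2; 3]

(* The middle bit can also write to local state. *)
let total =
  let sum = ref 0 in
  my_iter (fun x -> sum := !sum + x) [1; 2; 3];
  !sum
```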

And the number of “middle bits” is often more than one. These pieces of code are exactly identical except for this bit here, and that bit over there. OK, just pass in two functions, one for each bit. Or three functions, if there are three places of difference. Surprisingly dissimilar code can be factored into a common base and three replaceable bits. For example, one tree implementation I wrote had insert, delete, and find all sharing the same tree walking code.
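
A sketch of what that can look like, using a plain unbalanced binary search tree. This is not the tree implementation mentioned above, just an illustration I’m assuming is close enough in spirit; walk, at_leaf, at_node, and rebuild are invented names, and delete is left out to keep it short.

```ocaml
type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree

(* Shared walk: descend to where [key] lives, or would live.
   Three holes: [at_leaf] when the key is missing, [at_node] when it is
   found, and [rebuild] to combine a visited node with the result that
   came back from the subtree below it. *)
let rec walk ~at_leaf ~at_node ~rebuild key = function
  | Leaf -> at_leaf ()
  | Node (l, v, r) as node ->
    if key = v then at_node node
    else if key < v then
      rebuild (fun l' -> Node (l', v, r)) (walk ~at_leaf ~at_node ~rebuild key l)
    else
      rebuild (fun r' -> Node (l, v, r')) (walk ~at_leaf ~at_node ~rebuild key r)

(* find: ignore the rebuilding, just pass the answer back up. *)
let find key t =
  walk
    ~at_leaf:(fun () -> false)
    ~at_node:(fun _ -> true)
    ~rebuild:(fun _ found -> found)
    key t

(* insert: plug the new subtree back into each node on the way up. *)
let insert key t =
  walk
    ~at_leaf:(fun () -> Node (Leaf, key, Leaf))
    ~at_node:(fun node -> node)
    ~rebuild:(fun plug subtree -> plug subtree)
    key t
```

The tree walking code is written once; find and insert differ only in the three bits they pass in.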

Unfortunately for me, I wasn’t programming in Ocaml, I was programming in C#.

The fact that I was needing to call this common code from non-static environments and access local variables meant that I couldn’t use C# delegates. Now, it’s possible (probable) that I’m using an older version of C#, and that delegate usefulness has improved. But believe me, I tried delegates. I spent the better part of a day trying to get delegates to work, including emailing Chia for help- he in turn bounced me to the guy he asks C# questions of, who in turn bounced me to his guru, who bounced me to his guru. None of them could help me out (although to be honest, I’m not sure how many really tried).

Which meant that I was forced back to the old tried and true method- the “doit” object. This is a pattern that I see as very common in classic OO programming circles- the class with a single interesting function, the “doit” function. Generally the “doit” function has a slightly better name, like MouseClickEventAction or some such, but that’s what it amounts to. Then, rather than just passing the function around, you now pass an object of the Doit class around instead.

Note that a doit class does have one advantage over an unadorned function reference, such as in Perl or C/C++, in that it allows you to pass state around with the function. The general pattern I’ve fallen into with both Perl and C is to pass a context variable around with function references, with some suitable generic type (void * in C) to pass in to the function when I call it, fulfilling the same role as the object does. In Ocaml, this is handled by partial function application or simply directly accessing local state.
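
Here are the three ways of getting state to the function side by side, as a small sketch. The names (call_with_context, call, scale) are made up for the example; the first version imitates the C/Perl habit of passing a context along with the function, the other two are the usual Ocaml ways.

```ocaml
(* C/Perl style: the shared code gets a function plus an explicit context
   and hands the context back to the function when it calls it. *)
let call_with_context f ctx = f ctx

(* Closure style: the function already carries whatever state it needs. *)
let call f = f ()

let scale factor x = factor * x

(* 1. Explicit context, here a pair standing in for the void pointer. *)
let a = call_with_context (fun (factor, x) -> scale factor x) (3, 7)

(* 2. Partial application: [scale 3] is a function that remembers 3. *)
let triple = scale 3
let b = call (fun () -> triple 7)

(* 3. Directly accessing local state from inside the closure. *)
let c =
  let factor = 3 in
  call (fun () -> factor * 7)
```

All three compute the same value; the difference is how much the shared code has to know about the state being carried.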

The problem with this is code verbosity. Notice that in C#, at the place where I’m implementing the shared code, I have to:

  1. define the doit class (probably as an interface, definitely as an abstract base class),
  2. declare that the function takes an argument of the doit class type, and
  3. call the doit function at the correct point.

So far, so not too bad. OK, I have to define an interface, while in Ocaml I can just rely on type inference, but it’s not that big of a deal. But now consider what I need to do at the point where I’m using (calling) the shared code:

  1. Define a class that implements the correct interface,
  2. define member variables to hold copies of the local variables I need access (read or write) to, so I can store them within the object, and initialize those member variables that I am writing,
  3. define a constructor for that class that allows me to copy in the relevant local variables I need to read into the member variables,
  4. define the doit function, the one real line of code, which sets member variables of the objects for those local variables I am writing,
  5. allocate a new object of the local doit class, copying the read variables in,
  6. call the common function with the new object just allocated, and
  7. copy the written variables out of the doit object and into their correct local variables.
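
To make the contrast concrete without switching languages, here is a sketch of the same steps using Ocaml’s own object system as a stand-in for the C# classes, next to the closure version. This is an assumption-laden illustration, not the C# code I actually wrote: add_command, run_with_object, and run_with_closure are invented names, and because Ocaml objects are structurally typed the interface declaration step disappears, so if anything this understates the C# overhead.

```ocaml
(* Shared code, doit-object style: the hole is an object with a [doit] method. *)
let run_with_object (cmd : < doit : unit; .. >) = cmd#doit

(* Call site: a class that exists only to carry locals in and out. *)
class add_command (a_in : int) (delta : int) =
  object
    val mutable a = a_in          (* copy of the local we write *)
    method doit = a <- a + delta  (* the one real line *)
    method a = a                  (* lets the caller copy the result out *)
  end

let with_object () =
  let a = 10 in
  let cmd = new add_command a 10 in  (* allocate, copying locals in *)
  run_with_object cmd;               (* call the shared code *)
  cmd#a                              (* copy the written value out *)

(* The same call site with a closure over a mutable local. *)
let run_with_closure f = f ()

let with_closure () =
  let a = ref 10 in
  run_with_closure (fun () -> a := !a + 10);
  !a
```

Both versions end up with 20; only one of them needs a class declaration to get there.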

That’s 7-10 lines of code every time I want to call this function. It’s still a win in this case, as I’m replacing 50 lines of code with 10. But of those 10 lines of code, all but one are infrastructure- code that simply gets in the way of understanding what is really happening. All the real action is packed up into that one line. This is one of the reasons Ocaml code can be both significantly shorter and significantly more readable- the code that Ocaml tends to abbreviate heavily is the infrastructure and overhead- Ocaml drops the noise code, not the signal code.

This is also why higher-order functions like map, fold, and iter are huge in Ocaml (and similar functional languages like Haskell and SML), but virtually unknown in languages like C# and Java. The Ocaml programmer looks at it as replacing three lines of code with one, and thus a win. The C# programmer looks at it as replacing three lines of code with ten, and thus not a win. It’s possible to do this in C#, it’s just not easy.

This “minor” change in code factorization has a significant conceptual impact, however. The other way to look at map, fold, and iter is as operations on whole data structures. This makes the relational calculus seem somehow more “natural” to functional programmers, as the relational calculus is, at heart, about operations on whole data structures (tables aka lists/arrays aka relations). Object Oriented programmers, conditioned as they are to treating all data structures as simple data stores, and to not operating on whole data structures but instead simply accessing, inserting, and removing elements, quickly run into a Viet Nam-like quagmire. You get a quagmire when you ignore how the locals think and want things to work, and instead insist on imposing your own paradigm. This is what I was trying to say with this blog post: small changes in what a programming language makes easy can lead to large changes in how programmers think about programming, and these changes can have consequences far out of proportion to the size of the change being made.

  • http://billmill.org Bill Mill

    I was complaining about almost exactly the same problem in C# just a few days ago: link.

  • Andy

    Yeah, the “reimplementing closures/first-class-functions as function objects” pattern is kind of ridiculous in languages without them. The one upshot of C++ is that with template wizardry, it can actually do some pretty interesting things (e.g. Boost.Lambda), but the lack of syntactic abstraction mechanisms in other popular OO languages makes hacking in functional idioms unpleasant.

    I’ll admit I don’t know enough about C# to really comment on it, but I thought I read that C# 3.0 lambda was just syntactic sugar for delegates, so I don’t see why something higher order couldn’t be done in it currently. The problem is that the idiom is (I bet) so far out of the norm for C# programmers that I’d bet you’d get funny looks for using it. I suppose it really does all come back to how easy things are to do in a language that determines how it is used. :-D

  • http://dotnet.org.za/codingsanity Sean Hederman

    Ummm, actually C# anonymous delegates are closures, so you can access local vars. You’d need at least C# 2.0 for this:

    int a = 10;
    RunComCommand(delegate {
        // Do COM command here or whatever
        // you can also access a:
        a += 10;
    });

    // a is now equal to 20

  • Ossi Herrala

    This “Hole in the middle” pattern is actually called Higher-order function or functional for short.

    See for example the Wikipedia article:

    http://en.wikipedia.org/wiki/Higher-order_function

  • http://www.kirit.com/ Kirit

    Looks like you should have chosen C++ to use. The thing you’re trying to do is trivial in it.

    There’s a full explanation on my web site.

  • Rytmis

    Any chance you’d post an example of what you’re doing in C#? I’ve used delegates to kill off repetitive NHibernate plumbing and it *seems* to fit this “hole in the middle” pattern you’re talking about (but then there’s always the possibility that I’m just being dumb and not seeing the problem).

  • http://freeshells.ch/~revence Revence

    @Rytmis: Delegates kind of model the higher-order attributes of OCaml, so they fit, I think.
    Still, they are a tack-on.

    For one, I use F#, so that I don’t have to contend with my compiler when doing .NET programming.

  • Mark P Sullivan

    If I understand your problem correctly, I claim that Perl also allows a low-noise implementation. (If I’ve misunderstood, feel free to block the post.)

    You only need to give “run_com_command” its “f” parameter as a closure, an anonymous sub that has access to its surrounding local (lexical) variables:


    my $local_var = &some(value);
    my $set_value = run_com_command( sub { com(); command($local_var); } );

    Of course, if your “com command” is so complex (or compound) that your aesthetic complains about including it in the single line, you can put pointers to the closures into variables:


    my $local_var = &some(value);
    my $com = sub {
        do();
        com();
        stuff();
    };
    my $command = sub {
        my $no_longer_local = shift; # sorry, explicit unpacking in Perl
        do();
        command($no_longer_local);
        stuff();
    };

    my $set_value = run_com_command( $com, $command );

  • http://www.linkedin.com/in/robertfischer Robert

    @Mark P Sullivan

    That’s one of the reasons I used to love Perl. Waaaaaaaaaay back in the day, I gave a presentation to the Twin Cities Perl Mongers group called “Let Funky Functional Infect Your Code” talking about Perl’s nifty closure/function passing capabilities.

    Unfortunately, my examples used prototypes, so that started an ungodly flame war which overshadowed the original purpose of my conversation.

  • http://www.linkedin.com/in/robertfischer Robert

    BTW, my defense of prototypes (which came out of that ungodly flame war) is here: A Defense of Prototypes, or, Why Does Tom Christiansen Hate Perl?

  • http://www.linkedin.com/in/robertfischer Robert Fischer

    Oh, and in case that wasn’t enough functional Perl fun…consider my Regexp::Tr class (source).

    That class blesses a function reference into an object, which means that you have access to additional (static) state.

    As an aside, that class contains my favorite line of all the Perl code I ever wrote: &self($ref);.

  • Orion Edwards

    C# 2.0 delegates are “proper” closures at least as far as I’ve been able to push them, but if you’re using C# 1.0 or 1.1 then I feel your pain, and making a ‘function class’ is the only way to go.

    One way of making these less painful is with C#’s “using” keyword and the IDisposable pattern, so you can translate the following code as such:

    OLD CODE:

    DoitClass donut = new DoitClass( ref local1, ref local2 );
    try {
        ... bunch of code ...
    } finally {
        donut.Cleanup();
    }

    NEW CODE:

    using( DoitClass donut = new DoitClass( ref local1, ref local2 ) ) {
        ... bunch of code ...
    }

    This is a small win, but adds up :-)

  • http://luke.breuer.com Luke Breuer

    “The fact that I was needing to call this common code from non-static environments and access local variables meant that I couldn’t use C# delegates. Now, it’s possible (probable) that I’m using an older version of C#, and that delegate usefulness has improved. But believe me, I tried delegates. I spent the better part of a day trying to get delegates to work, including emailing Chia for help- he in turn bounced me to the guy he asks C# questions of, who in turn bounced me to his guru, who bounced me to his guru. None of them could help me out (although to be honest, I’m not sure how many really tried).”

    This worries me. Nobody in your list knew about C# 2.0’s anonymous delegates that are [pretty close] to true closures? (The compiler will create an on-the-fly class containing members for lexically scoped variables as well as the delegate.) I have saved several links on anonymous methods which you may find useful. If you are not using C# 2.0, you must be feeling a considerable amount of pain, with no generics, no anonymous methods, etc. If you are using C# 2.0, you are missing much of its goodness. For example, check out the methods on Array and List that take delegates (Predicate, Converter).

    If your gurus and gurus’ gurus don’t know the answer to these questions, I question their guru-ness. There is a small chance you are doing something incompatible with C# 2.0’s anonymous methods, but I doubt it. You will probably be shocked to learn about C# 3.0 (your post makes me think you do not know [much] about it), with its type-inference, lambda functions, expression trees, LINQ, etc.

  • Brian

    For what it’s worth, I’m pretty sure I’m still in C# 1.X land, as I did run across several examples showing how to use anonymous delegates and none of the code worked for me.

    As for using C++ instead of C#, that’s like curing a head cold by causing cancer- you’re replacing a small problem (no closures) with a much larger problem (no GC). Most of the time the program spends is firing off and waiting for COM requests- but most of the logic of the program is deciding what COM requests to fire off and dealing with the results. GC makes the logic part of the program much, much simpler.

    As for using F# for this program, I’d dearly love to. I work at an Ocaml shop (possibly THE Ocaml shop)- as odd as this may sound, F# would be less weird and more familiar of a language in that environment than C# would. The problem with F# is its license- at any point Microsoft can come along and say you can’t use it any more. If it were a true open source project, that wouldn’t be a problem. The most Microsoft could say is that they weren’t maintaining it any more. If it were a true product, that wouldn’t be a problem either, we’d just purchase a license agreement. But the “neither fish nor fowl” status of the current license means we can’t use it.

    And you can’t assure me they won’t pull that sort of stunt- especially not if F# starts becoming popular. Consider the implications of this controversy. Actually, a very strong case can be made that Microsoft was in the right in that case (they’re still evil, though!). If F# started pulling too much demand from more profitable mainstream products, it would be Microsoft’s fiduciary duty to correct that problem- and the easiest way to do that is to pull the trigger on the shotgun wired to F#’s forehead.

    But until this issue is resolved, I don’t get to use F#. Pity- it looks like a real nice language.

  • http://strangelights.com Robert Pickering

    I assure you, you don’t work in _the_ Ocaml shop, as I know of at least one other – where I work :) Although you’re right – they’re quite rare beasts.

    I doubt that F# would ever pose a serious threat to the C#/VB.NET user base as it generally appeals to a different type of user, so serves to attract new users to the platform rather than taking away paying customers. Even if it did start to threaten profits from C# or VB.NET, surely the sensible thing would be to adopt the C# model of giving away the compiler and charging for the VS integration?

    Anyway I know the F# team are aware of the licensing issue and are looking into the matter. I think we’ll see a positive change on the licensing front soon, although wheels turn slowly at Microsoft, especially on this sort of thing.

  • http://www.ffconsultancy.com Jon Harrop

    Well, I was going to ask why you’re still using C# when F# is available but I see Brian (Hurt?) and Robert Pickering have beaten me to it.

    I will just add that injecting closures into the message loop of a GUI thread from another thread is one of the rather nifty tricks that F#’s closures facilitate:

    form.Invoke(fun () -> widget.Text
