Friday, April 11, 2014

Why the proposed null-propagating operator in C# may be bad for the .NET community.

As one of many developers who is excited about the new open-source "Roslyn" compiler in .NET, I can't help but look forward to the new language features that could land in C# in the near future. However, there is one feature I have mixed feelings about: the proposed "null propagation operator". It is meant to address the problem of "deep null checking", which programmers work around in all sorts of elaborate ways, and which has also generated plenty of discussion of the "Law of Demeter". While it is the most requested feature for the language, it strikes me as one of those conveniences added for programmers that may instead be an indication of a code smell. In other words, if you need this feature, you may be "doing it wrong".
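For context, the operator (usually written "?.") would collapse a chain of manual null checks into a single expression that short-circuits to null. A rough sketch of the idea, borrowing names from the cake example later in this post:

 // With the proposed null-propagating operator:
 // the whole expression is null if cake, cake.Frosting, or cake.Frosting.Berries is null.
 var berries = cake?.Frosting?.Berries;

 // Roughly what that stands in for today:
 var berriesTheOldWay = (cake != null && cake.Frosting != null)
     ? cake.Frosting.Berries
     : null;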

Pardon me while I tell a story to lay the foundation of my argument. The story is true, although I may be embellishing some of the details. Back in the mid-90's I was a humble C.S. undergrad new to the ways of programming. In my junior year, I had a class called "Object Oriented Programming". I had already learned C and C++ and was not unfamiliar with classes and objects. However, the class was focused on using the Smalltalk language, where everything is an object. What better way to learn OOP correctly? The professor was a Hagrid-like man, big and burly with a long beard. In our first lecture he gave a powerful demonstration of what OOP should be about. He got a volunteer to stand with him in the front of the class. He then demanded of the student, "Give me a dollar!" The student eyed him carefully, then finally pulled out his wallet and gave him a dollar. The professor beamed and said, "How would you feel if instead I had just reached into your back pocket, pulled out your wallet and taken out a dollar?" "Not good." "That is the difference between OOP and the kind of programming you are probably used to. In this class, we will learn to stop picking people's pockets."

The point of this story is that many programmers don't understand what "encapsulation" means. They treat objects as a convenient data store and nothing more. Almost every property is public, and callers reach in and pick the object's pockets without any regard for encapsulation.

So when I see something like this...

 if (cake != null && cake.Frosting != null && cake.Frosting.Berries != null)

...I can't help but cringe. The programmer is peeking into the object's pockets to see what it has or doesn't have, and whether it can safely be used to do something important. In this case, the programmer is using the properties of the cake to determine what kind of cake it is instead of asking, "What kind of cake are you?"

Let me say first that I understand and appreciate sanity checks for null objects, especially when using inversion-of-control techniques like dependency injection. However, it is my opinion that you should never have null as a valid return value unless you specifically use a nullable type. If your source for an object gives you null when you aren't expecting it, you should simply let things fail ("fail early, fail often") and find out why. The biggest code smell here is that we are forcing ourselves to check for null when null should be an exceptional outcome, not an expected one. Programmers who do OOP this way are willingly paying a null-check penalty every time they need to inspect a property. But that doesn't have to be the case.
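One way to keep null exceptional is to validate at the boundaries and refuse to continue, rather than tolerating null deeper in the object graph. A minimal sketch of that idea (the CakeService and ICakeRepository names here are hypothetical, purely for illustration):

 using System;

 public interface ICakeRepository { }   // stand-in for whatever dependency the service needs

 public class CakeService
 {
     private readonly ICakeRepository _repository;

     public CakeService(ICakeRepository repository)
     {
         // Fail early: a missing dependency is a programming error, not a normal state,
         // so we refuse to construct a half-valid object instead of null-checking later.
         if (repository == null)
             throw new ArgumentNullException("repository");

         _repository = repository;
     }
 }

Once the constructor has run, every method on the service can use _repository without ever checking it for null again.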

Let us say, for argument's sake, that we are trying to display the contents of a cake on a web page. Wouldn't you like your view to be able to assume that the cake exists, for starters? The view should be "dumb" in that it shouldn't require any special knowledge of cake objects in order to display a description of the cake. Should we have to check for the existence of berries (and frosting, and the cake itself!) just to decide whether to display berries and which kind to display? Or can we design a class hierarchy that can be used easily without null checking? In other words, can we design objects that tell us what they are without having to put our greasy mitts in their pockets?

What if a Cake object consisted of a collection of Ingredient objects? Each Ingredient object could have an IngredientType containing an enum and/or string description of the ingredient. We could further fill out this tree of objects to include a Topping class and ultimately a BerryTopping class. By the time we're done, our new code for displaying berry toppings looks something like this:

 // Requires: using System.Linq;
 var berryToppings = cake.Ingredients.OfType<BerryTopping>();
 foreach (var berryTopping in berryToppings)
 {
    // display each berry topping
 }

No null checking needed! We simply ask the cake, "Do you have berry toppings?" And it replies with a collection of objects which we can enumerate and display. Now if some property in an ingredient is unexpectedly null, we should stop and examine the code that allowed such a condition to exist rather than adding another null check.
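For completeness, here is one way the supporting classes could be shaped. This is only a sketch under the assumptions above; the names and exact design are up to you:

 using System.Collections.Generic;

 public enum IngredientType { Base, Frosting, Topping }

 public class Ingredient
 {
     public IngredientType Type { get; set; }
     public string Description { get; set; }
 }

 public class Topping : Ingredient { }

 public class BerryTopping : Topping
 {
     public string BerryKind { get; set; }   // e.g. "raspberry"
 }

 public class Cake
 {
     private readonly List<Ingredient> _ingredients = new List<Ingredient>();

     // Never null: a plain cake simply has an empty ingredient list.
     public IEnumerable<Ingredient> Ingredients { get { return _ingredients; } }

     public void Add(Ingredient ingredient) { _ingredients.Add(ingredient); }
 }

Because Ingredients always returns a (possibly empty) collection, the view can enumerate it blindly; there is no null to propagate in the first place.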

What is my point? It is that if you take the time to properly design your objects so that null references are truly unexpected, then you don't need deep null checking, or a special language feature to make deep null checking convenient. I worry that if the Roslyn team moves forward with this feature, they will just be enabling bad OOP practices. In other words, they'll just be helping .NET programmers become better pick-pockets!

Edit: Many thanks to sharp redditors for improving my example!