Wednesday, November 15, 2017

Using the MemoryCache class can save your application a great deal of time by saving the results of expensive operations such as a database query or a web service requeste. However, great care must be taken to ensure that managing the scope of the cache and key management does not lead to headaches.
The best recommendation is to let the MemoryCache class handle its own scope by using the Default cache item. 
var memoryCache = MemoryCache.Default;
The class automatically lazily loads and references a singleton cache of objects regardless of when the class is referenced in your application. This is a good thing for your application because it means the cache is always there for you and you don't have to worry about the scope of the cache or the objects using it.
Using the cache can be as simple as the example below shows.
    public class FooProvider
    {
        private readonly MemoryCache _defaultCache = MemoryCache.Default;
        private readonly FooExpensiveProvider _fooExpensiveProvider = new FooExpensiveProvider();

        public Foo GetFooItem(string name)
        {
            var fooItem = _defaultCache.Get(name) as Foo;
            if (fooItem == null)
            {
                fooItem = _fooExpensiveProvider.GetFooItem(name);
                _defaultCache.Add(name, fooItem, DateTimeOffset.UtcNow.AddHours(1));
            }

            return fooItem;
        }
    }


    public class FooExpensiveProvider
    {
        public Foo GetFooItem(string name)
        {
            return new Foo
            {
                Name = Guid.NewGuid().ToString(),
                Content = "Content"
            };
        }
    }

With this in place, we can write a simple test to see it working.
    class Program
    {
        static void Main(string[] args)
        {
            var fooProvider = new FooProvider();

            Parallel.For(0, 1000, i =>
            {
                var fooItem = fooProvider.GetFooItem("test");
            });

            Console.WriteLine("CacheMissCount = " + fooProvider.CacheMissCount);
        }
    }

Whoops! We got back more than 1 cache miss! What happened? In our case, there are multiple requests to the FooProvider's GetItem method that are running concurrently. Multiple calls entered the code at the same time, found the cache empty and performed the expensive operation then updated the cache. 
We can prevent concurrent requests from performing the expensive operation by first locking the code  around the desired area.
    public class FooProvider
    {
        private readonly MemoryCache _defaultCache = MemoryCache.Default;
        private readonly FooExpensiveProvider _fooExpensiveProvider = new FooExpensiveProvider();
        private static readonly Object _cacheLock = new object(); // must be static!

        public int CacheMissCount = 0;

        public Foo GetFooItem(string name)
        {
            var fooItem = _defaultCache.Get(name) as Foo;
            if (fooItem == null)
            {
                lock (_cacheLock)
                {
                    fooItem = _defaultCache.Get(name) as Foo;
                    if (fooItem == null)
                    {
                        fooItem = _fooExpensiveProvider.GetFooItem(name);
                        _defaultCache.Add(name, fooItem, DateTimeOffset.UtcNow.AddHours(1));
                        Interlocked.Increment(ref CacheMissCount);
                    }
                }
            }
            return fooItem;
        }
    }

Now when we run the test program, we get back the expected single cache miss. Notice also, that it is necessary to again check the cache after locking the code area, because it is possible another request ahead of the current one updated the cache between the first cache check and locking the code area.
We can also demonstrate that by using MemoryCache.Default, multiple instances of the FooProvider class still use the same singleton cache and the cache item is only missed once.
    class Program
    {
        static void Main(string[] args)
        {
            Task[] tasks = new Task[3];

            tasks[0] = Task.Run(() =>
            {
                var fooProvider = new FooProvider();
                for (int i = 0; i < 100; i++)
                {
                    var fooItem = fooProvider.GetFooItem("test");
                }
                Console.WriteLine("FooProvider.CacheMissCount = " + FooProvider.CacheMissCount);
            });

            tasks[1] = Task.Run(() =>
            {
                var fooProvider = new FooProvider();
                for (int i = 0; i < 100; i++)
                {
                    var fooItem = fooProvider.GetFooItem("test");
                }
                Console.WriteLine("FooProvider.CacheMissCount = " + FooProvider.CacheMissCount);
            });

            tasks[2] = Task.Run(() =>
            {
                var fooProvider = new FooProvider();
                for (int i = 0; i < 100; i++)
                {
                    var fooItem = fooProvider.GetFooItem("test");
                }
                Console.WriteLine("FooProvider.CacheMissCount = " + FooProvider.CacheMissCount);
            });


            Task.WaitAll(tasks);
        }
    }

Using Named Caches

It is possible to used a named cache instead of the Default. It is strong recommended that you do not do this because it requires you to carefully manage the lifetime and scope of the cache item. This could lead to situations where more than one cache is created and cache related objects are not released causing memory leaks.
Do not create MemoryCache instances unless it is required. If you create cache instances in client and Web applications, the MemoryCacheinstances should be created early in the application life cycle. You must create only the number of cache instances that will be used in your application, and store references to the cache instances in variables that can be accessed globally. For example, in ASP.NET applications, you can store the references in application state. If you create only a single cache instance in your application, use the default cache and get a reference to it from the Default property when you need to access the cache.


Caching Multiple Types of Objects

Since the recommended strategy for caching objects is to use only the Default, what if we wish to cache multiple object types? Since the cache can store any type of object, we can simply differentiate the cached items by prefixing the object type to the cache key. Notice in the example below that the only difference between this example and the previous examples is that the cache key is different - it uses a prefix to denote what kind of object we're after. Note that this technique could also be used to cache the same object types by different cache keys.
        public Foo GetFooItem(string name)
        {
            string cacheKey = string.Format("Foo.ByName.{0}", name);
            var fooItem = _defaultCache.Get(cacheKey) as Foo;
            if (fooItem == null)
            {
                lock (_cacheLock)
                {
                    fooItem = _defaultCache.Get(cacheKey) as Foo;
                    if (fooItem == null)
                    {
                        fooItem = _fooExpensiveProvider.GetFooItem(name);
                        _defaultCache.Add(cacheKey, fooItem, DateTimeOffset.UtcNow.AddHours(1));
                        Interlocked.Increment(ref CacheMissCount);
                    }
                }
            }
            return fooItem;
        }

Monday, November 6, 2017

The Good, the Bad, and the Ugly of the static keyword

I've been working in a lot of legacy code lately. One of the issues that has come up repeatedly is the misuse and abuse of the "static" keyword in C#. It would be easy for me to knee-jerk and say "static" is just evil, but that would be a disservice to a useful tool that can totally make sense in the right situations. So before I talk about the negatives, let's look at the good stuff!

The Good

Pure Methods

One of the best uses of static is the ability to create pure methods (functions). A pure function meets these requirements (totally stolen from Wikipedia):

  1. The function always evaluates the same result value given the same argument value(s). The function result value cannot depend on any hidden information or state that may change while program execution proceeds or between different executions of the program, nor can it depend on any external input from I/O devices.
  2. Evaluation of the result does not cause any semantically observable side effect or output, such as mutation of mutable objects or output to I/O devices.

Pure functions or methods have several good benefits, including making the program easier to read and reason about, as well as preventing unwanted side-effects. Another tangible in .NET is that static methods can potentially be faster since the run time does not require an object instance to be referenced to run the method. However, I would caution against using static for performance gains unless you can actually measure the difference between using it and not using it. It's too easy to prematurely optimize code that doesn't need it at the cost of readability, or worse, creating something that commits some of the sins I'll list later on. 


Extension Methods

Another great use of static methods is extension methods. The ability to write fluent statements that alter an object, even when you don't own the code for that object, is an incredibly useful feature. While at first glance it does seem to violate the idea of a pure function, extension methods use the "this" keyword to explicitly indicate that a side effect is intended and desired. It's a contract between the caller and the method to do something to the input, so it's all good.

The Bad

Impure Functions with Side Effects

One of the easiest ways to misuse static is to have a function that alters its input arguments. One of the biggest code smells for me is seeing a method that starts with "static void". It means the author is intentionally creating a method full of side effects on objects that the method doesn't own! This is a clear violation of SOLID in terms of encapsulation and the single responsibility principle. It usually indicates some bit of logic that needs to belong to the object itself or some kind of factory or provider class. The best way to address this kind of flaw is to move the logic to a public method of the targeted class or create an extension method. If the logic is creating a series or collection of objects, consider a factory class instead.

The Ugly

Singletons and Static Properties

OK, there may be a few times in your programming career where you find that a singleton of something makes sense. The really important thing is how you implement it, and using the static keyword should not be it. Using static to create a singleton is literally just creating a giant global variable for the scope of the entire application and the programmer using the class can do squat about it. Instead you should use a IoC framework to control the scope of the object as necessary. The application should be able to choose how many of something it needs, and it shouldn't be the decision of the class itself. There are other downsides, not the least of which is that this often makes the class not thread safe. In addition, it is often difficult to unit test singleton classes because unit tests run concurrently may yield unpredictable results (i.e. a result of not being thread safe).

Static properties are another land mine of this idea of using a keyword to create global variables in a class. Not only are they not thread safe, they can result in race conditions where shared state is involved.

Static Dependencies

Probably the worst abuse in static methods (in my opinion) is the inclusion of other static classes and methods being referenced in the static method. This is a clear violation of SOLID and literally hard-wires external dependencies to your class & method, not to mention making them potential global variables. This is also most likely not thread safe (unless you happen to reference only pure functions), so in general is likely to cause you all sorts of headaches at design time and run time. 

Conclusion

There are cases where the static method could genuinely help your code. But in most cases, I see so much confusion and abuse of the keyword that I recommend avoiding it unless you can prove to yourself that a pure function genuinely helps the readability and performance of your code.