Monday, October 26, 2009

New features in C# 4.0 - Variance


An aspect of generics that often comes across as surprising is that the following is illegal:


IList<string> strings = new List<string>();
IList<object> objects = strings;

The second assignment is disallowed because strings does not have the same element type as objects. There is a perfectly good reason for this. If it were allowed you could write:
objects[0] = 5;
string s = strings[0];

This would allow an int to be inserted into a list of strings and subsequently extracted as a string, which would be a breach of type safety.
However, there are certain interfaces where the above cannot occur, notably where there is no way to insert an object into the collection. Such an interface is IEnumerable<T>. If instead you say:
IEnumerable<object> objects = strings;

There is no way we can put the wrong kind of thing into strings through objects, because objects doesn’t have a method that takes an element in. Variance is about allowing assignments such as this in cases where it is safe. The result is that a lot of situations that were previously surprising now just work.
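
A minimal sketch of the safe assignment, assuming a compiler and framework in which IEnumerable<T> is already declared covariant (the released C# 4.0 with .NET 4.0, or later). The class and method names are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

static class SafeAssignmentDemo
{
    // Hypothetical helper: views a list of strings as a sequence of objects.
    public static IEnumerable<object> AsObjects(IList<string> strings)
    {
        // Legal because IEnumerable<T> only hands elements out.
        // No cast and no copy is involved; it is the same list instance.
        IEnumerable<object> objects = strings;
        return objects;
    }

    static void Main()
    {
        IList<string> strings = new List<string> { "a", "b" };
        foreach (object o in AsObjects(strings))
        {
            Console.WriteLine(o);
        }
    }
}
```

Nothing reachable through the returned reference can add an element, so the list of strings stays a list of strings.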

Covariance

In .NET 4.0 the IEnumerable<T> interface will be declared in the following way:

public interface IEnumerable<out T> : IEnumerable
{
        IEnumerator<T> GetEnumerator();
}
public interface IEnumerator<out T> : IEnumerator
{
        bool MoveNext();
        T Current { get; }
}

The “out” in these declarations signifies that the T can only occur in output position in the interface; the compiler will complain otherwise. In return for this restriction, the interface becomes “covariant” in T, which means that an IEnumerable<A> is considered an IEnumerable<B> if A has a reference conversion to B.

As a result, any sequence of strings is also e.g. a sequence of objects.
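
The same rule applies to interfaces you declare yourself. Here is a small sketch with a made-up IProducer<out T> interface; the interface, the ConstantProducer class and the demo class are all hypothetical names, not framework types:

```csharp
using System;

// Hypothetical read-only source: T appears only in output position.
public interface IProducer<out T>
{
    T Produce();
}

public sealed class ConstantProducer<T> : IProducer<T>
{
    private readonly T value;

    public ConstantProducer(T value) { this.value = value; }

    public T Produce() { return value; }
}

public static class CovarianceDemo
{
    public static object ProduceAsObject(IProducer<string> source)
    {
        // string has a reference conversion to object, so this assignment compiles.
        IProducer<object> widened = source;
        return widened.Produce();
    }

    public static void Main()
    {
        Console.WriteLine(ProduceAsObject(new ConstantProducer<string>("hello")));
    }
}
```

If IProducer<out T> also declared a member that takes a T as a parameter, T would occur in input position and the compiler would reject the declaration.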

This is useful e.g. in many LINQ methods. Using the declarations above:
 
var result = strings.Union(objects); // succeeds with an IEnumerable<object>

This would previously have been disallowed, and you would have had to do some cumbersome wrapping to get the two sequences to have the same element type.
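
For a complete program around that one-liner, here is a sketch that again assumes the covariant IEnumerable<T> of the released .NET 4.0 or later. UnionDemo and Combine are illustrative names only:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class UnionDemo
{
    public static List<object> Combine(IEnumerable<string> strings, IEnumerable<object> objects)
    {
        // Type inference settles on object as the element type;
        // the sequence of strings is converted covariantly.
        return strings.Union(objects).ToList();
    }

    static void Main()
    {
        var strings = new List<string> { "a", "b" };
        var objects = new List<object> { "b", 1 };

        // Union drops the duplicate "b" and keeps elements in first-seen order.
        foreach (object o in Combine(strings, objects))
        {
            Console.WriteLine(o);
        }
    }
}
```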

Contravariance

Type parameters can also have an “in” modifier, restricting them to occur only in input positions. An example is IComparer<T>:

public interface IComparer<in T>
{
        int Compare(T left, T right);
}
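
To see what the “in” modifier buys you in practice, here is a sketch that assumes the contravariant System.Collections.Generic.IComparer<T> of the released .NET 4.0 or later. TextLengthComparer and ContravarianceDemo are hypothetical names, and the comparer assumes non-null arguments:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical comparer: orders any two non-null objects by the length of their ToString() text.
sealed class TextLengthComparer : IComparer<object>
{
    public int Compare(object left, object right)
    {
        return left.ToString().Length.CompareTo(right.ToString().Length);
    }
}

static class ContravarianceDemo
{
    public static List<string> SortByLength(IEnumerable<string> items)
    {
        // A comparer of objects is accepted where a comparer of strings is expected.
        IComparer<string> comparer = new TextLengthComparer();

        var sorted = new List<string>(items);
        sorted.Sort(comparer);
        return sorted;
    }

    static void Main()
    {
        foreach (string s in SortByLength(new[] { "ccc", "a", "bb" }))
        {
            Console.WriteLine(s);
        }
    }
}
```

The conversion runs in the opposite direction to the covariant case: the type argument gets more specific, not more general.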

The somewhat baffling result is that an IComparer<object> can in fact be considered an IComparer<string>! It makes sense when you think about it: if a comparer can compare any two objects, it can certainly also compare two strings. This property is referred to as contravariance.

A generic type can have both in and out modifiers on its type parameters, as is the case with the Func<…> delegate types:

public delegate TResult Func<in TArg, out TResult>(TArg arg);

Obviously the argument only ever comes in, and the result only ever comes out. Therefore a Func<object, string> can in fact be used as a Func<string, object>.

Limitations

Variant type parameters can only be declared on interfaces and delegate types, due to a restriction in the CLR. Variance only applies when there is a reference conversion between the type arguments. For instance, an IEnumerable<int> is not an IEnumerable<object> because the conversion from int to object is a boxing conversion, not a reference conversion.

Also please note that the CTP does not contain the new versions of the .NET types mentioned above. In order to experiment with variance you have to declare your own variant interfaces and delegate types.

COM Example

Here is a larger Office automation example that shows many of the new C# features in action.
using System;
using System.Diagnostics;
using System.Linq;
using Excel = Microsoft.Office.Interop.Excel;
using Word = Microsoft.Office.Interop.Word;
class Program
{
    static void Main(string[] args) {
        var excel = new Excel.Application();
        excel.Visible = true;
        excel.Workbooks.Add();                    // optional arguments omitted
        excel.Cells[1, 1].Value = "Process Name"; // no casts; Value dynamically  
        excel.Cells[1, 2].Value = "Memory Usage"; // accessed
        var processes = Process.GetProcesses()
            .OrderByDescending(p => p.WorkingSet)
            .Take(10);
        int i = 2;
        foreach (var p in processes) {
            excel.Cells[i, 1].Value = p.ProcessName; // no casts
            excel.Cells[i, 2].Value = p.WorkingSet;  // no casts
            i++;
        }
        Excel.Range range = excel.Cells[1, 1];       // no casts
        Excel.Chart chart = excel.ActiveWorkbook.Charts.
            Add(After: excel.ActiveSheet);         // named and optional arguments
        chart.ChartWizard(
            Source: range.CurrentRegion,
            Title: "Memory Usage in " + Environment.MachineName); //named+optional
        chart.ChartStyle = 45;
        chart.CopyPicture(Excel.XlPictureAppearance.xlScreen,
            Excel.XlCopyPictureFormat.xlBitmap,
            Excel.XlPictureAppearance.xlScreen);
        var word = new Word.Application();
        word.Visible = true;
        word.Documents.Add();          // optional arguments
        word.Selection.Paste();
    }
}
The code is much more terse and readable than the C# 3.0 counterpart. Note especially how the Value property is accessed dynamically. This is actually an indexed property, i.e. a property that takes an argument; something which C# does not understand. However the argument is optional. Since the access is dynamic, it goes through the runtime COM binder which knows to substitute the default value and call the indexed property. Thus, dynamic COM allows you to avoid accesses to the puzzling Value2 property of Excel ranges.
