Friday, February 24, 2017

Reflection vs Compiled Expression Performance

This post compares the performance of reflection and compiled expressions.

ObjectListView is a nice library that has lots of features and is also easy to use, because there is no need to fill ListViewItem objects manually.

For a User class:

class User
{
    public int Id;
    public string Name;
    public DateTime BirthDate;
}

instead of this code:

var lvis = new List<ListViewItem>();
foreach (var user in users)
{
    lvis.Add(new ListViewItem(new[]
    {
        user.Id.ToString(),
        user.Name,
        user.BirthDate.ToString(),
    }));
}

you can simply pass a collection of your own objects:

objectListView.Objects = users;

This library is an example of where reflection can be used.

But what should be used: reflection, compiled expressions, or emit? The following tests will show. Emit won't be tested because it is difficult to use, but its behavior can be estimated from the manual test (speed) and the compiled expression test (startup overhead).

Three tests will be made:

  1. Manual.
  2. Reflection.
  3. Compiled expression.

Each test consists of 200 iterations for warmup and 200 iterations for the test itself.

Every test creates a list of ListViewItem objects for the specified object type, except the manual test, which works only for the User type.
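The warmup-then-measure scheme can be sketched roughly as follows. This is only an illustration and not the harness from the project: the Benchmark class and the Measure method are hypothetical names, and Stopwatch is assumed as the timer.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: runs an action `warmup` times without timing it,
// then `iterations` more times, recording each run in milliseconds.
public static class Benchmark
{
    public static double[] Measure(Action action, int warmup = 200, int iterations = 200)
    {
        // Warmup: lets the JIT compile the code under test before timing starts.
        for (int i = 0; i < warmup; i++)
        {
            action();
        }

        var timings = new double[iterations];
        var stopwatch = new Stopwatch();
        for (int i = 0; i < iterations; i++)
        {
            stopwatch.Restart();
            action();
            stopwatch.Stop();
            timings[i] = stopwatch.Elapsed.TotalMilliseconds;
        }
        return timings;
    }
}
```

Usage could look like Benchmark.Measure(() => CreateListItemsManual(users)), which returns 200 raw timings for further analysis.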

Hardware: i5-4200H, DDR3-1600, Win 10 x64 1607. Software: VS 2015, .NET 4.6.1.

Code for manual test:

public static List<ListViewItem> CreateListItemsManual(List<User> users)
{
    var items = new List<ListViewItem>();
    foreach (var user in users)
    {
        var subitems = new[]
        {
            user.Id.ToString(),
            user.Name,
            user.BirthDate.ToString("dd.MM.yyyy (ddd)"),
        };
        var lvi = new ListViewItem(subitems);
        items.Add(lvi);
    }
    return items;
}

Code for reflection test:

public static List<ListViewItem> CreateListItemsReflection(Type type, IEnumerable<object> users)
{
    var items = new List<ListViewItem>();
    var fields = type.GetFields();
    foreach (var user in users)
    {
        var subitems = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            string value;
            var field = fields[i];
            if (field.FieldType == typeof(string))
            {
                value = (string)field.GetValue(user);
            }
            else if (field.FieldType == typeof(int))
            {
                value = ((int)field.GetValue(user)).ToString();
            }
            else if (field.FieldType == typeof(DateTime))
            {
                value = ((DateTime)field.GetValue(user)).ToString("dd.MM.yyyy (ddd)");
            }
            else
            {
                value = field.GetValue(user).ToString();
            }
            subitems[i] = value;
        }
        var lvi = new ListViewItem(subitems);
        items.Add(lvi);
    }
    return items;
}

Code for compiled expression test:

public static List<ListViewItem> CreateListItemsCompiledExpression(Type type, IEnumerable<object> users)
{
    var items = new List<ListViewItem>();
    var fields = type.GetFields();
    Func<object, string>[] fieldGetters = new Func<object, string>[fields.Length];
    for (int i = 0; i < fields.Length; i++)
    {
        Func<object, string> fieldGetter;
        Expression<Func<object, string>> lambda;
        var field = fields[i];
        // user => 
        var userObject = Expression.Parameter(typeof(object), "user");
        // user => (User)user
        var user = Expression.Convert(userObject, type);
        // user => ((User)user)."Field"
        var fld = Expression.Field(user, field);
        if (field.FieldType == typeof(string))
        {
            // user => ((User)user)."Field"
            lambda = Expression.Lambda<Func<object, string>>(fld, userObject);
        }
        else if (field.FieldType == typeof(int))
        {
            // user => ((User)user)."Field".ToString() // int.ToString()
            var toString = Expression.Call(fld, typeof(int).GetMethod("ToString", new Type[0]));
            lambda = Expression.Lambda<Func<object, string>>(toString, userObject);
        }
        else if (field.FieldType == typeof(DateTime))
        {
            // user => ((User)user)."Field".ToString("dd.MM.yyyy (ddd)")
            var toString = Expression.Call(
                fld,
                typeof(DateTime).GetMethod("ToString", new Type[] { typeof(string) }),
                Expression.Constant("dd.MM.yyyy (ddd)"));
            lambda = Expression.Lambda<Func<object, string>>(toString, userObject);
        }
        else
        {
            // user => ((User)user)."Field".ToString() // object.ToString()
            var toString = Expression.Call(fld, typeof(object).GetMethod("ToString", new Type[0]));
            lambda = Expression.Lambda<Func<object, string>>(toString, userObject);
        }
        fieldGetter = lambda.Compile();
        fieldGetters[i] = fieldGetter;
    }
    foreach (var user in users)
    {
        var subitems = new string[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            subitems[i] = fieldGetters[i](user);
        }
        var lvi = new ListViewItem(subitems);
        items.Add(lvi);
    }
    return items;
}

Results

For a small number of items the difference in absolute time is small: ~0.5 ms. This time is the startup overhead of compiling the expressions. It doesn't matter for UI, because nobody can notice a 0.5 ms difference.

Let's see the whole graph below.

Reflection is about 6-7 ms slower for 20,000 elements. Again, this is not a delay that anyone can notice in UI.

But what should be used in real-life projects? Is it worth writing universal, easy-to-use code based on reflection or expressions, or is it better to spend the time writing specific code for every type manually to get the best performance for both small and large numbers of elements?

For UI components, if it's definitely known that there won't be many elements, reflection can be used.

But what if it is a server application, or the number of elements can be both small and large, or performance is required? Already at 100-200 elements the first graph shows a ~1.5x performance difference between the manual and reflection methods.

Fortunately, in real applications the types in use do not change all the time while the program runs. This means that once expressions are compiled, they can be cached.

This makes it possible to use compiled expressions without paying the startup overhead on every call.
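A minimal sketch of such a cache, assuming a ConcurrentDictionary keyed by type. The FieldGetterCache class and its method names are hypothetical and are not taken from the project source. For brevity, every non-string field goes through the plain ToString() call, so the custom DateTime format from the test code is omitted here.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq.Expressions;

// Hypothetical cache: compiles the field getters for a type once
// and reuses them on every later call.
public static class FieldGetterCache
{
    private static readonly ConcurrentDictionary<Type, Func<object, string>[]> Cache =
        new ConcurrentDictionary<Type, Func<object, string>[]>();

    public static Func<object, string>[] GetGetters(Type type)
    {
        // BuildGetters runs only when the type is not in the cache yet.
        return Cache.GetOrAdd(type, t => BuildGetters(t));
    }

    private static Func<object, string>[] BuildGetters(Type type)
    {
        var fields = type.GetFields();
        var getters = new Func<object, string>[fields.Length];
        for (int i = 0; i < fields.Length; i++)
        {
            var field = fields[i];
            // obj => ((T)obj).Field
            var obj = Expression.Parameter(typeof(object), "obj");
            Expression body = Expression.Field(Expression.Convert(obj, type), field);
            if (field.FieldType != typeof(string))
            {
                // obj => ((T)obj).Field.ToString() // object.ToString()
                body = Expression.Call(body, typeof(object).GetMethod("ToString", Type.EmptyTypes));
            }
            getters[i] = Expression.Lambda<Func<object, string>>(body, obj).Compile();
        }
        return getters;
    }
}

// Sample type used only for illustration.
public class CachedUser
{
    public int Id;
    public string Name;
}
```

With this in place, the list-building method would call FieldGetterCache.GetGetters(type) instead of compiling the expressions itself, so only the first call for each type pays the compilation cost.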

Script with raw (200 iterations) results (R).

View project source code at GitHub.

Saturday, February 11, 2017

EF Core vs LINQ2DB

Entity Framework Core recently reached v1.1.0. Although it still lacks some critical features such as "GROUP BY" SQL translation (see its roadmap), it's time to test it.

The following frameworks will be tested:

  1. Entity Framework CodeFirst (LINQ query, models generated from DB)
  2. Entity Framework (raw SQL query)
  3. ADO.NET
  4. LINQ to DB (LINQ query, model entities generated from DB)
  5. LINQ to DB (raw SQL query)
  6. Entity Framework Core (doesn't support raw SQL execution at this moment)

Hardware used: i5-4200H, DDR3-1600, Win 10 x64 1607.

Software used: SQL Server 2016 SP1, VS 2015, .NET 4.6.1, EF 6.1.3, LINQ to DB 1.7.5, EF Core 1.1.0.

And default Northwind database.

The tests are the same as in one of the previous articles.

Note: EF Core doesn't use "GROUP BY" in the generated SQL; instead, it does the grouping in memory. This can lead to high load on the database in production.

Context Initialization

EF Core's context initialization is twice as fast as EF 6's. This matters for simple and fast queries.

Simple TOP 10 query

Here and below, the grey part of each bar is context initialization.

We can see that EF Core is faster than EF 6 when running simple queries. Although it is faster than EF 6 both in context initialization and in everything else, it is still about twice as slow as LINQ2DB overall.

Depending on the usage it might not be so bad, because absolute time is low.

Simple TOP 500 query

Results are almost the same, but now EF Core is not too far from ADO.NET and LINQ2DB.

Complex TOP 10 query

There is almost no difference between the frameworks, except EF 6, which is 2x slower than the others.

Complex TOP 500 query

With the complex query and many result rows, all frameworks perform nearly the same (again except EF 6, which is 2x slower than the others).

Conclusions

EF Core is faster than EF 6, which is good. But it still can't use the "GROUP BY" clause in SQL even though version 1.1.0 has been released, which is bad.

Another bad thing about EF Core is that it doesn't support raw SQL execution. This hardly matters for complex queries, but applications usually have many simple queries, and here EF Core is weak because it can't be optimized further. Change tracking doesn't affect selects, so the only way to optimize is raw SQL.

So, if performance is not critical, EF Core can be chosen. Otherwise, even EF 6 might be preferable, because it supports raw SQL execution, which helps with heavy queries.

And if performance is important, or if change tracking is not required, then LINQ2DB may be the best choice. LINQ2DB's LINQ queries are not much slower than raw ADO.NET even for simple queries. And if that is not enough, raw SQL can be used. LINQ2DB is not new, so it doesn't have as many bugs as EF Core has now.

Raw results (Excel).

View project source code at GitHub.