I did something and suddenly my API was not run in IIS Express anymore, but in Kestrel. I reviewed code changes, configuration changes, all to now avail. I don't want Kestrel, I want IIS Express!!! Why did Visual Studio suddenly decided to switch the development server?

Solution: it is your Visual Studio. The button you use to start debugging has a little dropdown that allows you to choose which server to use. I had probably pressed the wrong one at some point.

Update for .NET Core 3.0:

Seems for .NET Core 3.0 the solution is much simpler:
  • install the Microsoft.AspNetCore.Authentication.Negotiate NuGet package
  • add authentication in ConfigureServices like this:
    services
    .AddAuthentication(NegotiateDefaults.AuthenticationScheme)
    .AddNegotiate();
  • use the authentication in Configure (above app.UseAuthorization();)
    app.UseAuthentication();

No need to UseIISIntegration, UseHttpSys or anything.

Original post:

If you get the System.InvalidOperationException "No authenticationScheme was specified, and there was no DefaultChallengeScheme found." it means that ... err... you don't have a default authentication scheme. Solution:
  • Install NuGet package Microsoft.AspNetCore.Authentication in your project
  • add
    services.AddAuthentication(Microsoft.AspNetCore.Server.IISIntegration.IISDefaults.AuthenticationScheme);
    to the ConfigureServices method.

Update: Note that this is for IIS integration. If you want to use self hosted or Kestrel in debug, you should use HttpSysDefaults.AuthenticationScheme. Funny though, it's the same string value for both constants: "Windows".

Oh, and if you enter the credentials badly when prompted and you can't reenter them, try to restart Chrome (as in this answer)

and has 2 comments
I was working on this web ASP.Net Core 2.0 project that was spewing a lot of "Application Insights Telemetry (unconfigured): ..." messages in the Debug Output window. At first I thought I should just remove the Microsoft Application Insights NuGet package, but it didn't work. By default, it will still use insights even if you don't have it referenced anywhere in your code.

The solution is to do have installed Microsoft Application Insights NuGet package, but then set
Microsoft.ApplicationInsights.Extensibility.TelemetryConfiguration.Active.DisableTelemetry = true;
somewhere in Startup.cs (the constructor is fine).

Apparently, in the preview versions of Visual Studio 2017 there is an option under Options → Projects and Solutions → Web Projects → Disable local Application Insights for Asp.net Core web projects, too.

Update: A comment suggested you need to Disable the automatic loading of hosting startup assemblies which can be done in two ways:
  1. setting ASPNETCORE_preventHostingStartup to True or 1 in the project properties → Debug → Environment variables
  2. Doing something like
    WebHost.CreateDefaultBuilder(args)
    .UseSetting(WebHostDefaults.PreventHostingStartupKey, "true")
    ...
    - available from .NET Core 2.0

Either of these works, and although I agree with Andrei that the underlying issue for the unwanted telemetry is the automatic loading of hosting assemblies, I feel like the first option, the one that actually contains the word Telemetry in it is better for reasons of readability. But it's good there are three options to choose from.

and has 8 comments
I've learned something new today. It all starts with an innocuous question: Given the following struct, tell me what is its size:
    public struct MyStruct
{
public int i1;
public char c1;
public long l1;
public char c2;
public short s1;
public char c3;
}
Let's assume that this is in 32bit C++ or C#.

The first answer is 4+1+8+1+2+1 = 17. Nope! It's 24.

Well, it is called memory alignment and it has to do with the way CPUs work. They have memory registers of fixed size, various caches with different sizes and speeds, etc. Basically, when you ask for a 4 byte int, it needs to be "aligned" so that you get 4 bytes from the correct position into a single register. Otherwise the CPU needs to take two registers (let's say 1 byte in one and 3 bytes in another) then mask and shift both and add them into another register. That is unbelievably expensive at that level.

So, why 24? i1 is an int, it needs to be aligned on positions that are multiple of 4 bytes. 0 qualifies, so it takes 4 bytes. Then there is a char. Chars are one byte, can be put anywhere, so the size becomes 5 bytes. However, a long is 8 bytes, so it needs to be on a position that is a multiple of 8. That is why we add 3 bytes as padding, then we add the long in. Now the size is 16. One more char → 17. Shorts are 2 bytes, so we add one more padding byte to get to 18, then the short is added. The size is 20. And in the end you get the last char in, getting to 21. But now, the struct needs to be aligned with itself, meaning with the largest primitive used inside it, in our case the long with 8 bytes. That is why we add 3 more bytes so that the struct has a size that is a multiple of 8.

Note that a struct containing a struct will align it to its largest primitive element, not the actual size of the child struct. It's a recursive process.

Can we do something about it? What if I want to spend speed on memory or disk space? We can use directives such as StructLayout. It receives a LayoutKind - which defaults to Sequential, but can also be Auto or Explicit - and a numeric Pack parameter. Auto rearranges the order of the members of the class, so it takes the least amount of space. However, this has some side effects, like getting errors when you want to use Marshal.SizeOf. With Explicit, each field needs to be adorned with a FieldOffset attribute to determine the exact position in memory; that also means you can use several fields on the same position, like in:
    [StructLayout(LayoutKind.Explicit)]
public struct MyStruct
{
[FieldOffset(0)]
public int i1;
[FieldOffset(4)]
public int i2;
[FieldOffset(0)]
public long l1;
}
The Pack parameter tells the system on how to align the fields. 0 is the default, but 1 will make the size of the first struct above to actually be 17.
    [StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MyStruct
{
public int i1;
public char c1;
public long l1;
public char c2;
public short s1;
public char c3;
}
Other values can be 2,4,8,16,32,64 or 128. You can test on how the performance is affected by this, as an exercise.

More information here: Advanced c# programming 6: Everything about memory allocation in .NET

Update: I've created a piece of code to actually test for this:
unsafe static void Main(string[] args)
{
var st = new MyStruct();
Console.WriteLine($"sizeof:{sizeof(MyStruct)} Marshal.sizeof:{Marshal.SizeOf(st)} custom sizeof:{MySizeof(st)}");
Console.ReadKey();
}
 
private static long MySizeof(MyStruct st)
{
long before = GC.GetTotalMemory(true);
MyStruct[] array = new MyStruct[100000];
long after = GC.GetTotalMemory(true);
var size = (after - before) / array.Length;
return size;
}

Considering the original MyStruct, the size reported by all three ways of computing size is 24. I had to test the idea that the maximum byte padding is 4, so I used this structure:
public struct MyStruct
{
public long l;
public byte b;
}
Since long is 8 bytes and byte is 1, I expected the size to be 16 and it was, not 12. However, I decided to also try with a decimal instead of the long. Decimal values have 16 bytes, so if my interpretation was correct, 17 bytes should be aligned with the size of the biggest struct primitive field: a multiple of 16, so 32. The result was weirdly inconsistent: sizeof:20 Marshal.sizeof:24 custom sizeof:20, which suggests an alignment to 4 or 8 bytes, not 16. So I started playing with the StructLayoutAttribute:
[StructLayout(LayoutKind.Sequential, Pack = 1)]
public struct MyStruct
{
public decimal d;
public byte b;
}

For Pack = 1, I got the consistent 17 bytes. For Pack=4, I got consistent values of 20. For Pack=8 or higher, I got the weird 20-24-20 result, which suggests packing works differently for decimals than for other values. I've replaced the decimal with a struct containing two long values and the consistent result was back to 24, but then again, that's expected. Funny thing is that Guid is also a 16 byte value, although it is itself a struct, and the resulting size was 20. Guid is not a value type, though.

The only conclusion I can draw is that what I wrote in this post is true. Also, StructLayout Pack does not work as I had expected, instead it provides a minimum packing size, not a maximum one. If the biggest element in the struct is 8 bytes, then the minimum between the Pack value and 8 will be used for alignment. The alignment of the type is the size of its largest element (1, 2, 4, 8, etc., bytes) or the specified packing size, whichever is smaller.

All this if you are not using decimals... then all bets are off! From my discussions with Filip B. Vondrášek in the comments of this post, I've reached the conclusion that decimals are internally structs that are aligned to their largest element, an int, so to 4 bytes. However, it seems Marshal.sizeof misreports the size of structs containing decimals, for some reason.

In fact, all "simple" types are structs internally, as described by the C# language specification, but the Decimal struct also implements IDeserializationEventListener, but I don't see how this would influence things. Certainly the compilers have optimizations for working with primitive types. This is as deep as I want to go with this, anyway.

and has 0 comments
Just a heads up on a technology than I had no idea existed. To get the details read this 2009 (!!! :( ) article.

Basically you define a MemoryMappedFile instance from a path or a file reader, then create one or more MemoryMappedViewAccessors, then read or write binary data. The data can be structures, by using the generic Read/Write<[type]> methods.

Drawbacks: The size of the file has to be fixed, it cannot be increased or decreased. Also the path of the file needs to be on a local drive, it can't be on a network path.
Advantages: Fast access, built in persistency, the most efficient method to share data between processes.

and has 2 comments
A reader asked me how to work with multiple projects in Visual Studio Code and after fumbling a little I realized I had no idea. So I started trying out things.

First I created a folder in which I created two other folders ingeniously named Proj1 and Proj2. I went in both and ran dotnet new console and dotnet new classlib respectively. I moved the Console.WriteLine("Hello World!"); code from the console Program.cs in a static method in the Class1 class, then called the method from Program.cs Main, then tried to find ways of referencing Proj2 from Proj1 in Visual Studio Code.

And here I got stuck. I tried the smart solutions VS Code recommended, but none of them included adding a reference. I right clicked on everything to no avail. I wrote using Proj2; by hand hoping that Code will magically understand I need a project reference. I googled, only to find old articles that discussed project.json, not .csproj type of .NET projects.

In the end I was resigned to write the reference by hand. I opened Proj1.csproj and added
<ItemGroup>
<ProjectReference Include="..\Proj2\Proj2.csproj"/>
</ItemGroup>

After saving the file and going to the unresolved Class1 reference, I now got using Proj2; as an option to fix it. And now I got to the problem my reader was having. When trying to run Proj1, I got Unhandled Exception: System.IO.FileNotFoundException: Could not load file or assembly 'Proj2, Version=1.0.0.0, Culture=neutral, PublicKeyToken=null'. The system cannot find the file specified. at Proj1.Program.Main(String[] args).

It's disgustingly easy to solve, you just need to know what to do. Either Ctrl-Shit-P and type restore, then select restoring Proj1 or do it manually by going to the Proj1 folder and running dotnet restore by hand. After that the project is compiled and runs.

Summary:
  1. add project reference by hand to .csproj file
  2. resolve whatever compilation errors you have by specifying the correct usings or inlining namespaces
  3. dotnet restore the project you added references to

Learning ASP.Net MVC series:
  1. Setup
  2. MVC Concepts
  3. Authentication
  4. Entity Framework Fundamentals
  5. Upgrading project to .NET Core 1.1
  6. Dependency Injection and Services

Previously on Learning ASP.Net MVC...


Started with the idea of a project that would use user configurable queries to do Google searches, store the information in the results and then perform various data analysis on them and display them based on what the user wants. However, I first implemented a Google authentication and then went to write some theoretical posts. Lastly, I've upgraded the project from .NET Core 1.0 to version 1.1.

Well, it took me a while to get here because I was at a crossroads. I like the idea of dependency injectable services to do data access. At the same time there is the entire Entity Framework tutorial path that kind of wants to strongly integrate EF with my projects. I mean, if I have a service that gives me the list of all items in the database and then I want to get only a few items, it would be bad design to filter the entire list. As such, I would have to write a different method that allows me to get the items based on some kind of filters. On the other hand, Entity Framework code looks just like that "give me all you have, filtered by this" which is then translated into an efficient query to the database. One possibility would be to have my service return IQueryable <T>, so I could also use the system to generate the database code on the fly.

The Design


I've decided on the service architecture, against an EF type IQueryable way, because I want to be able to replace that service with anything, including something that doesn't work with a database or something that doesn't know how to dynamically create queries. Also, the idea that the service methods will describe exactly what I want appeals to me more than avoiding a bit of duplicated code.

Another thing to define now is the method through which I will implement the dependency injection. Being the control freak that I am, I would go with installing my own library, something like SimpleInjector, and configure it myself and use it explicitly. However, ASP.Net Core has dependency injection included out of the box, so I will use that.

As defined, the project needs queries to pass on to Google and a storage service for the results. It needs data services to manage these entities, as well as a service to abstract Google itself. The data gathering operation itself cannot be a simple REST call, since it might take a while, it must be a background task. The data analysis as well. So we need a sort of job manager.

As per a good structured design, the data objects will be stored in a separate project, as well as the interfaces for the services we will be using.

Some code, please!


Well, start with the code of the project so far: GitHub and let's get coding.

Before finding a solution to actually run the background code in the context of ASP.Net, let's write it inside a class. I am going to add a folder called Jobs and add a class in it called QueryProcessor with a method ProcessQueries. The code will be self explanatory, I hope.
public void ProcessQueries()
{
var now = _timeService.Now;
var queries = _queryDataService.GetUnprocessed(now);
var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
.SelectMany(q => _contentService.Query(q.Text));
_contentDataService.Update(contentItems);
}

So we get the time - from a service, of course - and request the unprocessed queries for that time, then we extract the content items for each query, which then are updated in the database. The idea here is that, for the first time a query is defined or when the interval from the last time the query was processed, the query will be sent to the content service from which content items will be received. These items will be stored in the database.

Now, I've kept the code as concise as possible: there is no indication yet of any implementation detail and I've written as little code as I need to express my intention. Yet, what are all these services? What is a time service? what is a content service? Where are they defined? In order to enable dependency injection, we will populate all of these fields from the constructor of the query processor. Here is how the class would look in its entirety:
using ContentAggregator.Interfaces;
using System.Linq;

namespace ContentAggregator.Jobs
{
public class QueryProcessor
{
private readonly IContentDataService _contentDataService;
private readonly IContentService _contentService;
private readonly IQueryDataService _queryDataService;
private readonly ITimeService _timeService;

public QueryProcessor(ITimeService timeService, IQueryDataService queryDataService, IContentDataService contentDataService, IContentService contentService)
{
_timeService = timeService;
_queryDataService = queryDataService;
_contentDataService = contentDataService;
_contentService = contentService;
}

public void ProcessQueries()
{
var now = _timeService.Now;
var queries = _queryDataService.GetUnprocessed(now);
var contentItems = queries.AsParallel().WithDegreeOfParallelism(3)
.SelectMany(q => _contentService.Query(q.Text));
_contentDataService.Update(contentItems);
}
}
}

Note that the services are only defined as interfaces which we declare in a separate project called ContentAggregator.Interfaces, referred above in the usings block.

Let's ignore the job processor mechanism for a moment and just run ProcessQueries in a test method in the main controller. For this I will have to make dependency injection work and implement the interfaces. For brevity I will do so in the main project, although it would probably be a good idea to do it in a separate ContentAggregator.Implementations project. But let's not get ahead of ourselves. First make the code work, then arrange it all nice, in the refactoring phase.

Implementing the services


I will create mock services first, in order to test the code as it is, so the following implementations just do as little as possible while still following the interface signature.
public class ContentDataService : IContentDataService
{
private readonly static StringBuilder _sb;

static ContentDataService()
{
_sb = new StringBuilder();
}

public void Update(IEnumerable<ContentItem> contentItems)
{
foreach (var contentItem in contentItems)
{
_sb.AppendLine($"{contentItem.FinalUrl}:{contentItem.Title}");
}
}

public static string Output
{
get { return _sb.ToString(); }
}
}

public class ContentService : IContentService
{
private readonly ITimeService _timeService;

public ContentService(ITimeService timeService)
{
_timeService = timeService;
}

public IEnumerable<ContentItem> Query(string text)
{
yield return
new ContentItem
{
OriginalUrl = "http://original.url",
FinalUrl = "https://final.url",
Title = "Mock Title",
Description = "Mock Description",
CreationTime = _timeService.Now,
Time = new DateTime(2017, 03, 26),
ContentType = "text/html",
Error = null,
Content = "Mock Content"
};
}
}

public class QueryDataService : IQueryDataService
{
public IEnumerable<Query> GetUnprocessed(DateTime now)
{
yield return new Query
{
Text="Some query"
};
}
}

public class TimeService : ITimeService
{
public DateTime Now
{
get
{
return DateTime.UtcNow;
}
}
}

Now all I have to do is declare the binding between interface and implementation. The magic happens in ConfigureServices, in Startup.cs:
services.AddTransient<ITimeService, TimeService>();
services.AddTransient<IContentDataService, ContentDataService>();
services.AddTransient<IContentService, ContentService>();
services.AddTransient<IQueryDataService, QueryDataService>();

They are all transient, meaning that for each request of an implementation the system will just create a new instance. Another popular method is AddSingleton.

Using dependency injection


So, now I have to instantiate my query processor and run ProcessQueries.

One way is to set QueryProcessor as a service. I extract an interface, I add a new binding and then I give an interface as a parameter of my controller constructor:
[Authorize]
public class HomeController : Controller
{
private readonly IQueryProcessor _queryProcessor;

public HomeController(IQueryProcessor queryProcessor)
{
_queryProcessor = queryProcessor;
}

public IActionResult Index()
{
return View();
}

[HttpGet("/test")]
public string Test()
{
_queryProcessor.ProcessQueries();
return ContentDataService.Output;
}
}
In fact, I don't even have to declare an interface. I can just use services.AddTransient<QueryProcessor>(); in ConfigureServices and it works as a parameter to the controller.

But what if I want to use it directly, resolve it manually, without injecting it in the controller? One can use the injection of a IServiceProvider instead. Here is an example:
[Authorize]
public class HomeController : Controller
{
private readonly IServiceProvider _serviceProvider;

public HomeController(IServiceProvider serviceProvider)
{
_serviceProvider = serviceProvider;
}

public IActionResult Index()
{
return View();
}

[HttpGet("/test")]
public string Test()
{
var queryProcessor = _serviceProvider.GetService<QueryProcessor>();
queryProcessor.ProcessQueries();
return ContentDataService.Output;
}
}
Yet you still need to use services.Add... in ConfigureServices and inject the service provider in the constructor of the controller.

There is a way of doing it completely separately like this:
var serviceProvider = new ServiceCollection()
.AddTransient<ITimeService, TimeService>()
.AddTransient<IContentDataService, ContentDataService>()
.AddTransient<IContentService, ContentService>()
.AddTransient<IQueryDataService, QueryDataService>()
.AddTransient<QueryProcessor>()
.BuildServiceProvider();
var queryProcessor = serviceProvider.GetService<QueryProcessor>();

This would be the way to encapsulate the ASP.Net Dependency Injection in another object, maybe in a console application, but clearly it would be pointless in our application.

The complete source code after these modifications can be found here. Test the functionality by going to /test on your local server after you start the app.

It is about time to revisit my series on ASP.Net MVC Core. From the time of my last blog post the .Net Core version has changed to 1.1, so just installing the SDK and running the project was not going to work. This post explains how to upgrade a .Net project to the latest version.

Learning ASP.Net MVC series:
  1. Setup
  2. MVC Concepts
  3. Authentication
  4. Entity Framework Fundamentals
  5. Upgrading project to .NET Core 1.1
  6. Dependency Injection and Services

Short version


Pressing the batch Update button for NuGet packages corrupted project.json. Here are the steps to successfully migrate a .Net Core project to a higher version.

  1. Download and install the .NET Core 1.1 SDK
  2. Change the version of the SDK in global.json - you can find out the SDK version by creating a new .Net Core project and checking what it uses
  3. Change "netcoreapp1.0" to "netcoreapp1.1" in project.json
  4. Change Microsoft.NETCore.App version from "1.0.0" to "1.1.0" in project.json
  5. Add
    "runtimes": {
    "win10-x64": { }
    },
    to project.json
  6. Go to "Manage NuGet packages for the solution", to the Update tab, and update projects one by one. Do not press the batch Update button for selected packages
  7. Some packages will restore, but remain in the list. Skip them for now
  8. Whenever you see a "downgrade" warning when restoring, go to those packages and restore them next
  9. For packages that tell you to upgrade NuGet, ignore them, it's an error that probably happens because you restore a package while the previous package restoring was not completed
  10. For the remaining packages that just won't update, write down their names, uninstall them and reinstall them

Code after changes can be found on GitHub

That should do it. For detailed steps of what I actually did to get to this concise list, read on.

Long version


Step 0 - I don't care, just load the damn project!


Downloaded the source code from GitHub, loaded the .sln with Visual Studio 2015. Got a nice blocking alert, because this was a .NET Core virgin computer:
Of course, I could have tried to install that version, but I wanted to upgrade to the latest Core.

Step 1 - read the Microsoft documentation


And here I went to Announcing the Fastest ASP.NET Yet, ASP.NET Core 1.1 RTM. I followed the instructions there, made Visual Studio 2015 load my project and automatically restore packages:
  1. Download and install the .NET Core 1.1 SDK
  2. If your application is referencing the .NET Core framework, your should update the references in your project.json file for netcoreapp1.0 or Microsoft.NetCore.App version 1.0 to version 1.1. In the default project.json file for an ASP.NET Core project running on the .NET Core framework, these two updates are located as follows:

    Two places to update project.json to .NET Core 1.1

  3. to be continued...

I got to the second step, but still got the alert...

Step 2 - fumble around


... so I commented out the sdk property in global.json. I got another alert:


This answer recommended uninstalling old versions of SDKs, in my case "Microsoft .NET Core 1.0.1 - SDK 1.0.0 Preview 2-003131 (x64)". Don't worry, it didn't work. More below:

TL;DR; version: do not uninstall the Visual Studio .NET Core Tooling


And then... got the same No executable found matching command "dotnet=projectmodel-server" error again.

I created a new .NET core project, just to see the version of SDK it uses: 1.0.0-preview2-003131 and I added it to global.json and reopened the project. It restored packages and didn't throw any errors! Dude, it even compiled and ran! But now I got a System.ArgumentException: The 'ClientId' option must be provided. Probably it had something to do with the Secret Manager. Follow the steps in the link to store your secrets in the app. It then worked.

Step 1.1 (see what I did there?) - continue to read the Microsoft documentation


The third step in the Microsoft instructions was removed by me because it caused some problems to me. So don't do it, yet. It was
  1. Update your ASP.NET Core packages dependencies to use the new 1.1.0 versions. You can do this by navigating to the NuGet package manager window and inspecting the “Updates” tab for the list of packages that you can update.

    Updating Packages using the NuGet package manager UI with the last pre-release build of ASP.NET Core 1.1


Since I had not upgraded the packages, as in the Microsoft third step, I decided to do it. 26 updates waited for me, so I optimistically selected them all and clicked Update. Of course, errors! One popped up as more interesting: Package 'Microsoft.Extensions.SecretManager.Tools 1.0.0' uses features that are not supported by the current version of NuGet. To upgrade NuGet, see http://docs.nuget.org/consume/installing-nuget. Another was even more worrisome: Unexpected end of content while loading JObject. Path 'dependencies', line 68, position 0 in project.json. Somehow the updating operation for the packages corrupted project.json! From a 3050 byte file, it now was 1617.

Step 3 - repair what the Microsoft instructions broke


Suspecting it was a problem with the NuGet package manager, I went to the link in the first error. But in Visual Studio 2015 NuGet is included and it was clearly the latest version. So the only solution was to go through each package and see which causes the problem. And I went to 26 packages and pressed Install on each and it worked. Apparently, the batch Update button is causing the issue. Weirdly enough there are two packages that were installed, but remained in the Update tab and also appeared in the Consolidate tab: BundleMinifier.Core and Microsoft.EntityFrameworkCore.Tools, although I can't to anything with them there.

Another package (Microsoft.VisualStudio.Web.CodeGeneration.Tools 1.0.0) caused another confusing error: Package 'Microsoft.VisualStudio.Web.CodeGeneration.Tools 1.0.0' uses features that are not supported by the current version of NuGet. To upgrade NuGet, see http://docs.nuget.org/consume/installing-nuget. Yet restarting Visual Studio led to the disappearance of the CodeGeneration.Tools error.

So I tried to build the project only to be met with yet another project.json corruption error: Can not find runtime target for framework '.NETCoreAPP, Version=v1.0' compatible with one of the target runtimes: 'win10-x64, win81-x64, win8-x64, win7-x64'. Possible causes: [blah blah] The project does not list one of 'win10-x64, win81-x64, win7-x64' in the 'runtimes' [blah blah]. I found the fix here, which was to add
"runtimes": {
"win10-x64": { }
},
to project.json.

It compiled. It worked.

Sometimes we want to serialize and deserialize objects that are dynamic in nature, like having different types of content based on a property. However, the strong typed nature of C# and the limitations of serializers make this frustrating.


In this post I will be discussing several things:
  • How to deserialize a JSON property that can have a value of multiple types that are not declared
  • How to serialize and deserialize a property declared as a base type in WCF
  • How to serialize it back using JSON.Net

The code can all be downloaded from GitHub. The first phase of the project can be found here.

Well, the scenario was this: I had a string in the database that was in JSON format. I would deserialize it using JSON.Net and use the resulting object. The object had a property that could be one of several types, but no special notation to describe its concrete type (like the $type notation for JSON.Net or the __type notation for DataContractSerializer). The only way I knew what type it was supposed to be was a type integer value.

My solution was to declare the property as JToken, meaning anything goes there, then deserialize it manually when I have both the JToken value and the integer type value. However, passing the object through a WCF service, which uses DataContractSerializer and which would throw the exception System.Runtime.Serialization.InvalidDataContractException: Type 'Newtonsoft.Json.Linq.JToken' is a recursive collection data contract which is not supported. Consider modifying the definition of collection 'Newtonsoft.Json.Linq.JToken' to remove references to itself..

I thought I could use two properties that pointed at the same data: one for JSON.Net, which would use JToken, and one for WCF, which would use a base type. It should have been easy, but it was not. People have tried this on the Internet and declared it impossible. Well, it's not!

Let's work with some real data. Database JSON:
{
"MyStuff": [
{
"Configuration": {
"Wheels": 2
},
"ConfigurationType": 1
},
{
"Configuration": {
"Legs": 4
},
"ConfigurationType": 2
}
]
}

Here is the code that works:
[DataContract]
[JsonObject(MemberSerialization.OptOut)]
public class Stuff
{
[DataMember]
public List<Thing> MyStuff;
}

[DataContract]
[KnownType(typeof(Vehicle))]
[KnownType(typeof(Animal))]
public class Configuration
{
}

[DataContract]
public class Vehicle : Configuration
{
[DataMember]
public int Wheels;
}

[DataContract]
public class Animal : Configuration
{
[DataMember]
public int Legs;
}

[DataContract]
public class Thing
{
private Configuration _configuration;
private JToken _jtoken;

[DataMember(Name = "ConfigurationType")]
public int ConfigurationType;

[IgnoreDataMember]
public JToken Configuration
{
get
{
return _jtoken;
}
set
{
_jtoken = value;
_configuration = null;
}
}

[JsonIgnore]
[DataMember(Name = "Configuration")]
public Configuration ConfigurationObject
{
get
{
if (_configuration == null)
{
switch (ConfigurationType)
{
case 1: _configuration = Configuration.ToObject<Vehicle>(); break;
case 2: _configuration = Configuration.ToObject<Animal>(); break;
}
}
return _configuration;
}
set
{
_configuration = value;
_jtoken = JRaw.FromObject(value);
}
}
}

public class ThingContractResolver : DefaultContractResolver
{
protected override JsonProperty CreateProperty(System.Reflection.MemberInfo member,
Newtonsoft.Json.MemberSerialization memberSerialization)
{
var prop = base.CreateProperty(member, memberSerialization);
if (member.Name == "Configuration") prop.Ignored = false;
return prop;
}
}

So now this is what happens:
  • DataContractSerializer ignores the JToken property, and doesn't throw an exception
  • Instead it takes ConfigurationObject and serializes/deserializes it as "Configuration", thanks to the KnownTypeAttributes decorating the Configuration class and the DataMemberAttribute that sets the property's serialization name (in WCF only)
  • JSON.Net ignores DataContract as much as possible (JsonObject(MemberSerialization.OptOut)) and both properties, but then un-ignores the Configuration property when we use the ThingContractResolver
  • DataContractSerializer will add a property called __type to the resulting JSON, so that it knows how to deserialize it.
  • In order for this to work, the JSON deserialization of the original text needs to be done with JSON.Net (no __type property) and the JSON coming to the WCF service to be correctly decorated with __type when it comes to the server to be saved in the database

If you need to remove the __type property from the JSON, try the solution here: JSON.NET how to remove nodes. I didn't actually need to do it.

Of course, it would have been simpler to just replace the default serializer of the WCF service to Json.Net, but I didn't want to introduce other problems to an already complex system.


A More General Solution


To summarize, it would have been grand if I could have used IgnoreDataMemberAttribute and JSON.Net would just ignore it. To do that, I could use another ContractResolver:
public class JsonWCFContractResolver : DefaultContractResolver
{
protected override JsonProperty CreateProperty(System.Reflection.MemberInfo member,
Newtonsoft.Json.MemberSerialization memberSerialization)
{
var prop = base.CreateProperty(member, memberSerialization);
var hasIgnoreDataMember = member.IsDefined(typeof(IgnoreDataMemberAttribute), false);
var hasJsonIgnore = member.IsDefined(typeof(JsonIgnoreAttribute), false);
if (hasIgnoreDataMember && !hasJsonIgnore)
{
prop.Ignored = false;
}
return prop;
}
}
This class now automatically un-ignores properties declared with IgnoreDataMember that are also not declared with JsonIgnore. You might want to create your own custom attribute so that you don't modify JSON.Net's default behavior.

Now, while I thought it would be easy to implement a JsonConverter that works just like the default one, only it uses this contract resolver by default, it was actually a pain in the ass, mainly because there is no default JsonConverter. Instead, I declared the default contract resolver in a static constructor like this:
JsonConvert.DefaultSettings = () => new JsonSerializerSettings
{
ContractResolver=new JsonWCFContractResolver()
};

Now the JSON.Net operations don't need an explicitly declared contract resolver. Updated source code on GitHub


Links that helped

Configure JSON.NET to ignore DataContract/DataMember attributes
replace WCF built-in JavascriptSerializer with Newtonsoft Json.Net json serializer
Cannot return JToken from WCF Web Service
Serialize and deserialize part of JSON object as string with WCF

I am mentally preparing for giving a talk about dependency injection and inversion of control and how are they important, so I intend to clarify my thoughts on the blog first. This has been spurred by seeing how so many talented and even experienced programmers don't really understand the concepts and why they should use them. I also intend to briefly explore these concepts in the context of programming languages other than C#.

And yes, I know I've started an ASP.Net MVC exploration series and stopped midway, and I truly intend to continue it, it's just that this is more urgent.

Head on intro


So, instead of going to the definitions, let me give you some examples, instead.
public class MyClass {
public IEnumerable<string> GetData() {
var provider=new StringDataProvider();
var data=provider.GetStringsNewerThan(DateTime.Now-TimeSpan.FromHours(1));
return data;
}
}
In this piece of code I create a class that has a method that gets some text. That's why I use a StringDataProvider, because I want to be provided with string data. I named my class so that it describes as best as possible what it intends to do, yet that descriptiveness is getting lost up the chain when my method is called just GetData. It is called so because it is the data that I need in the context of MyClass, which may not care, for example, that it is in string format. Maybe MyClass just displays enumerations of objects. Another issue with this is that it hides the date and time parameter that I pass in the method. I am getting string data, but not all of it, just for the last hour. Functionally, this will work fine: task complete, you can move to the next. Yet it has some nagging issues.

Dependency Injection


Let me show you the same piece of code, written with dependency injection in mind:
public class MyClass {
private IDataProvider _dataProvider;
private IDateTimeProvider _dateTimeProvider;

public void MyClass(IDataProvider dataProvider, IDateTimeProvider dateTimeProvider) {
this._dataProvider=dataProvider;
this._dateTimeProvider=dateTimeProvider;
}

public IEnumerable<string> GetData() {
var oneHourBefore=_dateTimeProvider.Now-TimeSpan.FromHours(1);
var data=_dataProvider.GetDataNewerThan(oneHourBefore);
return data;
}
}
A lot more code, but it solves several issues while introducing so many benefits that I wonder why people don't code like this from the get go.

Let's analyse this for a bit. First of all I introduce a constructor to MyClass, one that accepts and caches two parameters. They are not class types, but interfaces, which declare the intention for any class implementing them. The method then does the same thing as in the original example, using the providers it cached. Now, when I write the code of the class I don't actually need to have any provider implementation. I just declare what I need and worry about it later. I also don't need to inject real providers, I can mock them so that I can test my class as standalone. Note that the previous implementation of the class would have returned different data based on the system time and I had no way to control that behavior. The best benefit, for me, is that now the class is really descriptive. It almost reads like English: "Hi, folks, I am a class that needs someone to give me some data and the time of day and I will give you some processed data in return!". The rule of thumb is that for each method, external factors that may influence its behavior must be abstracted away. In our case if the date time provider provides the same time and the data provider the same data, the effect of the method is always the same.

Note that the interface I used was not IStringDataProvider, but IDataProvider. I don't really care, in my class, that the data is a bunch of strings. There is something called the Single Responsibility Principle, which says that a class or a method or some sort of unit of computation should try to only have one responsibility. If you change that code, it should only affect one area. Now, real life is a little different and classes do many things in many directions, yet they can implement any number of interfaces. The interfaces themselves can declare only one responsibility, which is why this is so nice. I don't actually have to have a class that is only a data provider, but in the context of my class, I only need that part and I am clearly declaring my intent in the code.

This here is called dependency injection, which is a fancy expression for saying "my code receives all third party instances as parameters". It is also in line with the Single Responsibility Principle, as now your class doesn't have to carry the responsibility of knowing how to instantiate the classes it needs. It makes the code more modular, easier to test, more legible and more maintainable.

But there is a problem. While before I was using something like new MyClass().GetData(), now I have to push the instantiation of the providers somewhere up the stream and do maybe something like this:
var dataProvider=new StringDataProvider();
var dateTimeProvider=new DateTimeProvider();
var myClass=new MyClass(dataProvider,dateTimeProvider);
myClass.GetData();
The apparent gains were all for naught! I just pushed the same ugly code somewhere else. But here is where Inversion of Control comes in. What if you never need to instantiate anything again? What it you never actually had to write any new Something() code?

Inversion of Control


Inversion of Control actually takes over the responsibility of creating instances from you. With it, you might get this code instead:
public interface IMyClass {
IEnumerable<string> GetData();
}

public class MyClass:IMyClass {
private IDataProvider _dataProvider;
private IDateTimeProvider _dateTimeProvider;

public void MyClass(IDataProvider dataProvider, IDateTimeProvider dateTimeProvider) {
this._dataProvider=dataProvider;
this._dateTimeProvider=dateTimeProvider;
}

public IEnumerable<string> GetData() {
var oneHourBefore=_dateTimeProvider.Now-TimeSpan.FromHours(1);
var data=_dataProvider.GetDataNewerThan(oneHourBefore);
return data;
}
}
Note that I created an interface for MyClass to implement, one that declares my GetData method. Now, to use it, I could write something like this:
var myClass=Dependency.Get<IMyClass>();
myClass.GetData();

Wow! What happened here? I just used a magical class called Dependency that gets me an instance of IMyClass. And I really don't care how it does it. It can discover implementations by itself or maybe I am manually binding interfaces to implementations when the application starts (for example Dependency.Bind<IMyClass,MyClass>();). When it needs to create a new MyClass it automatically sees that it needs two other interfaces as parameters, so it gets implementations for those first and continues up the chain. It is called a dependency chain and the container will go through it all to simply "Get" you what you need. There are many inversion of control frameworks out there, but the concept is so simple that one can make their own easily.

And I get another benefit: if I want to display some other type of data, all I have to do is instruct the dependency container that I want another implementation for the interface. I can even think about versioning: take a class that I know does the job and compare it with a new implementation of the same interface. I can tell it to use different versions based on the client used. And all of this in exactly one place: the dependency container bindings. You may want to plug different implementations provided by third parties and all they have to care about is respecting the contract in your interface.



Solution structure


This way of writing code forces some changes in the structure of your projects. If all you have is written in a single project, you don't care, but if you want to split your work in several libraries, you have to take into account that interfaces need to be referenced by almost everything, including third party modules that you want to plug. That means the interfaces need their own library. Yet in order to declare the interfaces, you need access to all the data objects that their members need, so your Interfaces project needs to reference all the projects with data objects in them. And that means that your logic will be separated from your data objects in order to avoid circular dependencies. The only project that will probably need to go deeper will be the unit and integration test project.

Bottom line: in order to implement this painlessly, you need an Entities library, containing data objects, then an Interfaces library, containing the interfaces you need and, maybe, the dependency container mechanism, if you don't put it in yet another library. All the logic needs to be in other projects. And that brings us to a nice side effect: the only connection between logic modules is done via abstractions like interfaces and simple data containers. You can now substitute one library with another without actually caring about the rest. The unit tests will work just the same, the application will function just the same and functionality can be both encapsulated and programatically described.

There is a drawback to this. Whenever you need to see how some method is implemented and you navigate to definition, you will often reach the interface declaration, which tells you nothing. You then need to find classes that implement the interface or to search for uses of the interface method to find implementations. Even so, I would say that this is an IDE problem, not a dependency injection issue.

Other points of view


Now, the intro above describes what I understand by dependency injection and inversion of control. The official definition of Dependency Injection claims it is a subset of Inversion of Control, not a separate thing.

For example, Martin Fowler says that when he and his fellow software pattern creators thought of it, they called it Inversion of Control, but they decided that it was too broad a term, so they moved to calling it Dependency Injection. That seems strange to me, since I can describe situations where dependencies are injected, or at least passed around, but they are manually instantiated, or situations where the creation of instances is out of the control of the developer, but no dependencies are passed around. He seems to see both as one thing. On the other hand, the pattern where dependencies are injected by constructor, property setters or weird implementation of yet another set of interfaces (which he calls Dependency Injection) is different from Service Locator, where you specifically ask for a type of service.

Wikipedia says that Dependency Injection is a software pattern which implements Inversion of Control to resolve dependencies, while it calls Inversion of Control a design principle (so, not a pattern?) in which custom-written portions of a computer program receive the flow of control from a generic framework. It even goes so far as to say Dependency Injection is a specific type of Inversion of Control. Anyway, the pages there seem to follow the general definitions that Martin Fowler does, which pits Dependency Injection versus Service Locator.

On StackOverflow a very well viewed answer sees dependency injection as "giving an object its instance variables". I tend to agree. I also liked another answer below that said "DI is very much like the classic avoiding of hardcoded constants in the code." It makes one think of a variable as an abstraction for values of a certain type. Same page holds another interesting view: "Dependency Injection and dependency Injection Containers are different things: Dependency Injection is a method for writing better code, a DI Container is a tool to help injecting dependencies. You don't need a container to do dependency injection. However a container can help you."

Another StackOverflow question has tons of answers explaining how Dependency Injection is a particular case of Inversion of Control. They all seem to have read Fowler before answering, though.

A CodeProject article explains how Dependency Injection is just a flavor of Inversion of Control, others being Service Locator, Events, Delegates, etc.

Composition over inheritance, convention over configuration


An interesting side effect of this drastic decoupling of code is that it promotes composition over inheritance. Let's face it: inheritance was supposed to solve all of humanity's problems and it failed. You either have an endless chain of classes inheriting from each other from which you usually use only one or two or you get misguided attempts to allow inheritance from multiple sources which complicates understanding of what does what. Instead interfaces have become more widespread, as declarations of intent, while composition has provided more of what inheritance started off as promising. And what is dependency injection if not a sort of composition? In the intro example we compose a date time provider and a data provider into a time aware data provider, all the time while the actors in this composition need to know nothing else than the contracts each part must abide by. Do that same thing with other implementations and you get a different result. I will go as far as to say that inheritance defines what classes are, while composition defines what classes do, which is what matters in the end.

Another interesting effect is the wider adoption of convention over configuration. For example you can find the default implementation of an interface as the class that implements it and has the same name minus the preceding "I". Rather than explicitly tell the framework that we want to use the Manager class each time someone needs an IManager implementation, it can figure it out for itself by naming alone. This would never work if the responsibility of getting class instances resided with each method using them.

Real life examples


Simple Injector


If you look on the Internet, one of the first dependency injection frameworks you find for .Net is Simple Injector, which works on every flavor of .Net including Mono and Core. It's as easy to use as installing the NuGet package and doing something like this:
// 1. Create a new Simple Injector container
var container = new Container();

// 2. Configure the container (register)
container.Register<IUserRepository, SqlUserRepository>(Lifestyle.Transient);
container.Register<ILogger, MailLogger>(Lifestyle.Singleton);

// 3. Optionally verify the container's configuration.
container.Verify();

// 4. Get the implementation by type
IUserService service = container.GetInstance<IUserService>();

ASP.Net Core


ASP.Net Core has dependency injection built in. You configure your bindings in ConfigureServices:
public void ConfigureServices(IServiceCollection svcs)
{
svcs.AddSingleton(_config);

if (_env.IsDevelopment())
{
svcs.AddTransient<IMailService, LoggingMailService>();
}
else
{
svcs.AddTransient<IMailService, MailService>();
}

svcs.AddDbContext<WilderContext>(ServiceLifetime.Scoped);

// ...
}
then you use any of the registered classes and interfaces as constructor parameters for controllers or even using them as method parameters (see FromServicesAttribute)

Managed Extensibility Framework


MEF is a big beast of a framework, but it can simplify a lot of work you would have to do to glue things together, especially in extensibility scenarios. Typically one would use attributes to declare which interface something "exports" and then use other attributes to "import" implementations in properties and values. All you need to do is put them in the same place. Something like this:
[Export(typeof(ICalculator))]
class SimpleCalculator : ICalculator {
//...
}

class Program {

[Import(typeof(ICalculator))]
public ICalculator calculator;

// do something with calculator
}
Of course, in order for this to work seamlessly you need stuff like this, as well:
private Program()
{
//An aggregate catalog that combines multiple catalogs
var catalog = new AggregateCatalog();
//Adds all the parts found in the same assembly as the Program class
catalog.Catalogs.Add(new AssemblyCatalog(typeof(Program).Assembly));
catalog.Catalogs.Add(new DirectoryCatalog("C:\\Users\\SomeUser\\Documents\\Visual Studio 2010\\Projects\\SimpleCalculator3\\SimpleCalculator3\\Extensions"));


//Create the CompositionContainer with the parts in the catalog
_container = new CompositionContainer(catalog);

//Fill the imports of this object
try
{
this._container.ComposeParts(this);
}
catch (CompositionException compositionException)
{
Console.WriteLine(compositionException.ToString());
}
}

Dependency Injection in other languages


Admit it, C# is great, but it is not by far the most used computer language. That place is reserved, at least for now, for Javascript. Not only is it untyped and dynamic, but Javascript isn't even a class inheritance language. It uses the so called prototype inheritance, which uses an instance of an object attached to a type to provide default values for the instance of said type. I know, it sounds confusing and it is, but what is important is that it has no concept of interfaces or reflection. So while it is trivial to create a dictionary of instances (or functions that create instances) of objects which you could then use to get what you need by using a string key (something like var manager=Dependency.Get('IManager');, for example) it is difficult to imagine how one could go through the entire chain of dependencies to create objects that need other objects.

And yet this is done, by AngularJs, RequireJs or any number of modern Javascript frameworks. The secret? Using regular expressions to determine the parameters needed for a constructor function after turning it to string. It's complicated and beyond the scope of this blog post, but take a look at this StackOverflow question and its answers to understand how it's done.

Let me show you an example from AngularJs:
angular.module('myModule', [])
.directive('directiveName', ['depService', function(depService) {
// ...
}])
In this case the key/type of the service is explicit using an array notation that says "this is the list of parameters that the dependency injector needs to give to the function", but this might be have been written just as the function:
angular.module('myModule', [])
.directive('directiveName', function(depService) {
// ...
})
In this case Angular would use the regular expression approach on the function string.


What about other languages? Java is very much like C# and the concepts there are similar. Even if all are flavors of C, C++ is very different, yet Dependency Injection can be achieved. I am not a C++ developer, so I can't tell you much about that, but take a look at this StackOverflow question and answers; it is claimed that there is no one method, but many that can be used to do dependency injection in C++.

In fact, the only languages I can think of that can't do dependency injection are silly ones like SQL. Since you cannot (reasonably) define your own types or pass functions along, the concept makes no sense. Even so, one can imagine creating dummy stored procedures that other stored procedures would use in order to be tested. There is no reason why you wouldn't use dependency injection if the language allows for it.

Testability


I mentioned briefly unit testing. Dependency Injection works hand in hand with automated testing. Given that the practice creates modules of software that give reproducible results for the same inputs and account for all the inputs, testing becomes a breeze. Let me give you some examples using Moq, a mocking library for .Net:
var dateTimeMock=new Mock<IDateTimeProvider>();
dateTimeMock
.Setup(m=>m.Now)
.Returns(new DateTime(2016,12,03));

var dataMock=new Mock<IDataProvider>();
dataMock
.Setup(m=>m.GetDataNewerThan(It.IsAny<DateTime>()))
.Returns(new[] { "test","data" });

var testClass=new MyClass(dateTimeMock.Object, dataMock.Object);

var result=testClass.GetData();
AssertDeepEqual(result,new[] { "test","data" });

First of all, I take care of all dependencies. I create a "mock" for each of them and I "set up" the methods or property setters/getters that interest me. I don't really need to set up the date time mock for Now, since the data from the data provider is always the same no matter the parameter, but it's there for you to see how it's done. Second, I instantiate the class I want to test using the Object property of my mocks, which returns an object that implements the type given as a generic parameter in the constructor. Third I assert that the side effects of my call are the ones I expect. The mocks need to be as dumb as possible. If you feel you need to write code to define your mocks you are probably doing something wrong.

The type of the tests, for people who are not familiar with this concept, is usually a fully positive one - that is give full valid data and expect the correct result - followed by many negative ones, where the correct data is made incorrect in all possible ways and it is tested that the method fails. If there are many types of combinations of data that would be considered valid, you need a test for as many of them.

Note that the test is instantiating the test class directly, using the constructor. We are not testing the injector here, but the actual class.

Conclusions


What I appreciate most with Dependency Injection is that it forces you to write code that has clear boundaries defined by interfaces. Once this is achieved, you can go write your own stuff and not care about what other people do with theirs. You can test your modules without even caring if the rest of the project even exists. It allows to refactor code in steps and with a lot more confidence since you are covered by unit tests.

While some people work on fire-and-forget projects, like small games or utilities, and they don't care about maintainability, one of the most touted reasons for using unit tests and dependency injection, these practices bring so many other benefits that are almost impossible to get otherwise.

The entire point of this is reducing the complexity of dependencies, which include not only the modules in your application, but also the support frame for them, like people working on them. While some managers might not see the wisdom of reducing friction between software components, surely they can see the positive value of reducing friction between people.

There was one other topic that I wanted to touch, but it is both vast and I have not enough experience with it, however it feels very attractive to me: refactoring old code in order to use dependency injection. Best practices, how to make it safe enough and fast enough to make managers approve it and so on. Perhaps another post later on. I was thinking of a combination of static analysis and automated methods, like replacing all usages of "new" with a single point of instantiation, warning about static methods and properties, automatically replacing known bad practices like DateTime.Now and so on. It might be interesting, right?

I hope I wasn't too confusing and I appreciate any feedback you have. I will be working on a presentation file with similar content, so any help will go into doing a better job explaining it to others.

and has 0 comments
A colleague of mine hit a strange bug today. It so happened that we use a bastardized dependency injection method that takes into account the WCF session before returning an implementation of an interface. In a piece of code the injection failed and we couldn't see why for a while. Let me give you a simplified version:
var someManager=Package.Get<IManager>();
var someDTOs=Cache.GetDatabaseObjects().Select(x=>x.Pack());

public class DataObject {
public string Data {get;set;}
public DataObjectDTO Pack() {
var anotherManager=Package.Get<IAnother>();
return new DataObjectDTO {
Data=anotherManager.Process(Data)
};
}
}

Package.Get will attempt to find a session object and if not it will use another mechanism, but if it finds one, it will only use it if it is not expired or invalid, else throwing an exception. This code failed in the Pack method, when trying to get an instance of IAnother. Please take a few moments to reflect on why (and no, it's not that between calls the session expired).

Show explanation

and has 0 comments
I've stumbled upon a very funny exception today. Basically I was creating a constant string from adding some other constant strings to each other. And it worked. The moment I added an integer, though, I got The expression being assigned to 'Program.x2' must be constant. The code that generated this error is simple:
const string x2 = "string" + 2;
Note that
const string x2 = "string" + "2";
is perfectly valid. Got the same result when using VS2010 and VS2015, so it's not a compiler bug, it's intended behavior.

So, what's going on? Well, my code transforms behind the scenes into
const string x2 = "string" + 2.ToString();
which is not constant because of ToString!

The only way to solve it was to declare the numeric constant as string as well.

.Net Core Web API uses Newtonsoft's Json.NET to do JSON serialization and for other cases where you wanted to control Json.NET options you would do something like
JsonConvert.DefaultSettings = (() =>
{
var settings = new JsonSerializerSettings();
// do something with settings
return settings;
});
, but in this case it doesn't work. The way to do it is to use the fluent interface method and hook yourself in the ConfigureServices(IServiceCollection services) method, after the call to .AddMvc(), like this:
services
.AddMvc()
.AddJsonOptions(options =>
{
var settings=options.SerializerSettings;
// do something with settings
});

In my particular case I wanted to serialize enums as strings, not as integers. To do that, you need to use the StringEnumConverter class. For example if you wanted to serialize the Gender property of a person as a string you could have defined the entity like this:
public class Person
{
public string Name { get; set; }
[JsonConverter(typeof(StringEnumConverter))]
public GenderEnum Gender { get; set; }
}

In order to do this globally, add the converter to the settings converter list:
services
.AddMvc()
.AddJsonOptions(options =>
{
options.SerializerSettings.Converters.Add(new StringEnumConverter {
CamelCaseText = true
});
});

Note that in this case, I also instructed the converter to use camel case. The result of the serialization ends up as:
{"name":"James Carpenter","age":51,"gender":"male"}

and has 0 comments
I was doing this silly HackerRank algorithm challenge and I got the solution correctly, but it would always time out on test 7. I wracked my brain on all sorts of different ideas but to no avail. I was ready to throw in the towel and check out other people solutions, only they were all in C++ and seemed pretty similar to my own. And then I've made a code change and the test passed. I had replaced LINQ's OrderBy with Array.Sort.

Intrigued, I started investigating. The idea was creating a sorted integer array from a space delimited string of values. I had used Console.ReadLine().Split(' ').Select(s=>int.Parse(s)).OrderBy(v=>v); and it consumed above 7% of the total CPU of the test. Now I was using var arr=Console.ReadLine().Split(' ').Select(s=>int.Parse(s)).ToArray(); Array.Sort(arr); and the CPU usage for that piece of the code was 1.5%. So it was almost five times slower. How do the two implementations differ?

Array.Sort should be simple: an in place quicksort, the best general solution for this sort (heh heh heh) of problem. How about Enumerable.OrderBy? It returns an OrderedEnumerable which internally uses a Buffer<T> to get all the values in a container, then uses an EnumerableSorter to ... quicksort the values. Hmm...

Let's get back to Array.Sort. It's not as straightforward as it seems. First of all it "tries" a SZSort. If it works, fine, return that. This is an external native code implementation of QuickSort on several native value types. (More on that here) Then it goes to a SorterObjectArray that chooses, based on framework target, to use either an IntrospectiveSort or a DepthLimitedQuickSort. Even the implementation of this DepthLimitedQuickSort is much, much more complex than the quicksort used by OrderBy. IntrospectiveSort seems to be the one preferred for the future and is also heavily optimized, but less complex and easier to understand, perhaps. It uses quicksort, heapsort and insertionsort together.

Now, before you go all "OrderBy sucks!", read more about it. This StackOverflow list of answers seems to indicate that in case of strings, at least, the performance is similar. A lot of other interesting things there, as well. OrderBy uses a "stable" QuickSort, meaning that two items that are compared as equal will appear in their original order. Array.Sort does not guarantee that.

Anyway, the performance difference in my particular case seems to come from the native code implementation of the sort for integers, rather than algorithmic improvements, although I don't have the time right now to grab the various implementations and test them properly. However, just from the way the code reads, I would bet the IntrospectiveSort will compare favorably to the simple Quicksort implementation used in OrderBy.

Update 12 August 2018: I've updates the project to Core 2.1 and changed the Add method to treat OutOfMemoryExceptions in case of memory fragmentation.

I have been looking for a job lately and so I've run into these obnoxious interviews when someone asks you to find the optimal algorithm for some task. And as much as I hate those questions in an interview, they got me thinking about all kinds of normal situations where the implementation is not optimal. I believe it will be hard to find something less optimal than the List<T> implementation in .Net.

Reference


For reference, look at the implementation code in its entirety.

Story


I will be discussing some of the glaring issues with the code, but before that, let's talk about the "business case". You have a C-like programming language, armed with all the default types for such, like bool, int, array, etc. and you need a dynamically sized container, one where you can add, insert and remove elements. The first idea is to use an array, then resize it accordingly. Only you can't know how large the array will need to be and you can't just allocate memory and then resize that allocation, as other variables have already occupied the next blocks of memory. The solution might be to allocate an initial array, then - when its size is no longer sufficient - create a larger one, copy what you need and make the changes in it.

A comment in the source tells us what the developers meant to do: The list is initially empty and has a capacity of zero. Upon adding the first element to the list the capacity is increased to 16, and then increased in multiples of two as required. So, an empty list is just a wrapper for an empty array. A list with one element occupies the size of a 16 element array, but if you add up to 17 elements, the array will double and continue to double.

Implementation


Let's check the algorithmic complexity for the add operation. For the first 16 elements you just add elements in the internal array, n operations for n elements. Once you get to 17, the complexity increases, as you need to copy all previous 16 values first. Now it's 16+16+1, which continues up to 33, where you have 16+16+16+32+1. At 65 it's 16+16+16+32+32+64+1. So while we are adding element after element the operational complexity is on average twice as much as using an array. Meanwhile, the space occupied is half more than you actually need.

Insertion is similarly tricky, but even more inefficient. Basically, when you insert a value or a range, the first operation is EnsureCapacity, which may copy the entire array in a new array. Only afterward the insert algorithm is run and it again copies the part of the array found after the index for the insert.

Removal works in the opposite direction with the caveat that it never decreases the size of the array. If you added 10 million records in your list, then deleted them, your list now contains an internal array that is 10 million elements in size. There is a method called TrimExcess that tries to solve this, but you must call it manually. At least RemoveAll is using an O(n) algorithm instead of calling Remove repeatedly, which would have been a disaster.

The piece of code that sets the internal dimension of the list is actually in the setter of the Capacity property, and it dumbly creates an array and copies the values from the current one to the new one.

A lot of the other operations are implemented by calling the static methods on the Array class: Array.IndexOf, Array.Sort, Array.BinarySearch, Array.Reverse, etc.

The last issue that List has is that, as an array wrapper, it needs contiguous memory space. There will be times where your code will fail not because there is not enough free memory, but because it is fragmented and the runtime cannot find a free block that is large enough for your data.

Better solutions


Before I start spouting all kinds of stupid things, I will direct you to the venerable C5 collection library, which is very well designed, documented and tested. It contains all kinds of containers to optimize whichever scenario you might have been thinking about.

Let's think solutions now. The major problem of this implementation is the need of a continuous array. Why not solve it by adding more arrays instead of replacing the one with another twice as large? When the capacity is exceeded, we create a new array of similar size, then link it to our list. And since we want to have index access, not linked list access, we need to add this array into an array of arrays.

What would that mean? Memory doesn't need to be contiguous. Adding is twice as fast. Inserting is fast, also, as you only need to insert a new array between existing arrays and move around the data in a single inner array. Accessing by index is a bit more complicated, but not by much. Removal is just as simple, with the added bonus that some inner arrays might become empty and be removed directly.

This is by far not "the best" option, but just one level of optimization that tries to fix the biggest single problem with the current implementation. As one of my friends used to say: first collect data about your program, see where the bottlenecks are, then proceed to fix them in decreasing order of importance. The source can be found on GitHub.

Notes


A tester program of sorts is showing the Count, Capacity and time for random operations for a normal list and the FragmentList<T>. The next line shows the individual lengths of the inner arrays. Note that I cheated by using Lists instead of arrays. It only makes sense, as the simplistic operations of List<T> now have a limited negative impact on individual fragments. Take note of AutoDefragmentThreshold, which is 0.8 by default. It replaces all the internal lists with a single contiguous one when there are more than 80% of internal lists that are smaller than a tenth of the total count. To disable the feature you need to set the value to more than 1, not 0. I also implemented all the public methods of List<T>, although you might only need to implement IList<T> and the *Range methods.

Well, enjoy! Hope it helps.