Channel: Andrew Lock | .NET Escapades

Thoughts about primary constructors: 3 pros and 5 cons


In my previous post I provided an introduction to primary constructors in C# 12. In this post I describe some of the ways I like to use primary constructors and also some of the things I don't like about them.

The main approaches to primary constructors

As I described in my previous post, there are two main ways to use primary constructors:

  • Using the parameters to initialize fields and properties.
  • Using the parameters directly in members, implicitly capturing the parameters as fields.

In the first case, you use the constructor parameters in initialization code for fields and properties:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = firstName; // 👈 initialized
    private readonly string _lastName = lastName;
}

and the compiler synthesizes a constructor that does the assignment for you:

public class Person
{
    private readonly string _firstName;
    private readonly string _lastName;

    // This constructor is synthesized by the compiler
    public Person(string firstName, string lastName)
    {
        _firstName = firstName;
        _lastName = lastName;
    }
}

The other approach is to reference the parameters directly in members:

public class Person(string firstName, string lastName)
{
    // Directly referenced       👇          👇
    public string FullName => $"{firstName} {lastName}";
}

In this case, the compiler also adds mutable fields to the class to capture the variables:

public class Person
{
    private string <firstName>P; // Generated by the compiler
    private string <lastName>P; // Generated by the compiler

    public string FullName =>  string.Concat(<firstName>P, " ", <lastName>P);

    // The generated constructor sets the values of the generated fields
    public Person(string firstName, string lastName)
    {
        <firstName>P = firstName;
        <lastName>P = lastName;
    }
}
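
If you want to confirm the capture behaviour for yourself without a decompiler, reflection is one way to do it. The following is a minimal sketch that lists the private instance fields the compiler generated for the implicit-capture version of Person; the exact field names are a compiler implementation detail, so treat whatever it prints as informational only:

```csharp
using System;
using System.Reflection;

// List every private instance field on Person, including compiler-generated ones
foreach (FieldInfo field in typeof(Person).GetFields(BindingFlags.Instance | BindingFlags.NonPublic))
{
    // IsInitOnly is true for readonly fields, so it shows whether the capture is mutable
    Console.WriteLine($"{field.Name}: {field.FieldType.Name}, readonly = {field.IsInitOnly}");
}

public class Person(string firstName, string lastName)
{
    // Both parameters are referenced from a member, so both are captured
    public string FullName => $"{firstName} {lastName}";
}
```

Running this should list two string fields, neither of them readonly, which matches the lowered code shown above.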

I go into more detail in my previous post, so I'll leave it there for now. For the remainder of the post I describe what I consider to be the sweet spot for primary constructors, as well as some things to watch out for and that I don't like!

The best use cases for primary constructors

In this section I provide an opinionated list of cases where I think primary constructors are a good fit. I don't think there's anything revolutionary or controversial in this section, but you never know!

Basic field initialization

One of the primary use cases for primary constructors is initialization of fields and properties, and this is where it really shines in my opinion. Primary constructors strictly reduce the amount of code you have to write, and instead the compiler generates that code for you. What's not to like!?

The example I showed at the start of this post, in which we're simply assigning to a couple of fields, is a prime example of a good use case:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = firstName;
    private readonly string _lastName = lastName;
}

Where things start to get more complex is if you want to add validation. Adding a throw expression for null isn't too bad:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = firstName ?? throw new ArgumentNullException(nameof(firstName));
    private readonly string _lastName = lastName ?? throw new ArgumentNullException(nameof(lastName));
}

But we've already partially obscured the simple fields. Once we start needing to use ternaries, things get very borderline in my opinion:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = !string.IsNullOrEmpty(firstName) ? firstName : throw new ArgumentException("Must not be null or empty", nameof(firstName));

    private readonly string _lastName
        = !string.IsNullOrEmpty(lastName)
            ? lastName
            : throw new ArgumentException("Must not be null or empty", nameof(lastName));
}

Whether you choose the firstName or lastName formatting shown above (they're equivalent), it's all a bit much in my opinion. You also can't use the handy guard methods introduced in .NET 7+ when initializing fields like this, unless you call out to a separate function. All of that feels like bending over backwards to use primary constructors, where a standard constructor would actually be less code and easier to read:

public class Person
{
    private readonly string _firstName;
    private readonly string _lastName;

    public Person(string firstName, string lastName)
    {
        ArgumentException.ThrowIfNullOrEmpty(firstName);
        ArgumentException.ThrowIfNullOrEmpty(lastName);
        _firstName = firstName;
        _lastName = lastName;
    }
}

Obviously you could create your own helpers to work around these limitations and reduce the verbosity while sticking to primary constructors. Or, you could just use standard constructors in those cases instead of reinventing the wheel.
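
For completeness, this is roughly what such a helper could look like. It's a sketch only: Guard and NotNullOrEmpty are hypothetical names I've made up for illustration (they're not part of .NET), and it assumes .NET 7 or later so that ArgumentException.ThrowIfNullOrEmpty is available. The trick is simply that the helper returns the validated value, so it can be used in a field initializer:

```csharp
using System;
using System.Runtime.CompilerServices;

var person = new Person("Andrew", "Lock");
Console.WriteLine(person.FullName);

try
{
    _ = new Person("Andrew", "");
}
catch (ArgumentException ex)
{
    // ParamName is the text of the argument expression that failed validation
    Console.WriteLine($"Invalid argument: {ex.ParamName}");
}

// Hypothetical helper, not part of .NET
public static class Guard
{
    public static string NotNullOrEmpty(
        string? value,
        [CallerArgumentExpression(nameof(value))] string? paramName = null)
    {
        // Reuse the built-in guard, then hand the validated value back to the caller
        ArgumentException.ThrowIfNullOrEmpty(value, paramName);
        return value;
    }
}

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = Guard.NotNullOrEmpty(firstName);
    private readonly string _lastName = Guard.NotNullOrEmpty(lastName);

    public string FullName => $"{_firstName} {_lastName}";
}
```

That keeps the initializers to one line each, at the cost of an extra type that every reader of your code has to go and look up.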

So, to reiterate, I think primary constructors are great when you just need to do simple initialization of fields or properties. But if you need validation of the parameters, I have a very low threshold for converting to a standard constructor.

Initialization in test code

One prime example of code that doesn't need to do any validation of the constructor parameters is your test classes. Take this very simple example of an xunit test for the Person type:


public class PersonTests(ITestOutputHelper output)
{
    [Theory]
    [InlineData("Andrew", "")]
    public void Person_throws_if_values_are_null_or_empty(string firstName, string lastName)
    {
        output.WriteLine($"Testing '{firstName}' and '{lastName}'");
        Assert.Throws<ArgumentException>(() => new Person(firstName, lastName));
    }
}

Ignore the question of whether you think this test adds value, or if you really need to use the ITestOutputHelper in the method. The point of interest here is that adding ITestOutputHelper as a primary constructor parameter simplifies the test class in general. Compare it to the version without a primary constructor:

public class PersonTests
{
    private readonly ITestOutputHelper _output;

    public PersonTests(ITestOutputHelper output)
    {
        _output = output;
    }

    [Theory]
    [InlineData("Andrew", "")]
    public void Person_throws_if_values_are_null_or_empty(string firstName, string lastName)
    {
        _output.WriteLine($"Testing '{firstName}' and '{lastName}'");
        Assert.Throws<ArgumentException>(() => new Person(firstName, lastName));
    }
}

This practically doubles the length of the test code, without adding any extra value. It's completely irrelevant to the test whether or not ITestOutputHelper is obviously stored in a field, and it doesn't matter whether it's mutable or not. Using primary constructors here is unequivocally a net win.

Dependency injection in MVC controllers

If you're using ASP.NET Core, you're presumably also using dependency injection. Most of the time, you're probably not doing much or any validation of the dependencies that are injected into your classes. As per my original suggestion, that makes them a good candidate for primary constructors. For example:

public class MyController(ILogger<MyController> logger, IService service) : ControllerBase
{
    private readonly ILogger<MyController> _logger = logger;
    private readonly IService _service = service;

    [HttpGet]
    public ActionResult<MyThing> Get()
    {
        _logger.LogInformation("Getting the thing");
        return _service.DoThing();
    }
}

I think it's safe to say a "regular" constructor doesn't add anything over a primary constructor in this case; it would simply be extra duplication. In fact, this is a case where you don't gain much from initializing fields over using the values directly:

public class MyController(ILogger<MyController> logger, IService service) : ControllerBase
{
    [HttpGet]
    public ActionResult<MyThing> Get()
    {
        logger.LogInformation("Getting the thing");
        return service.DoThing();
    }
}

This obviously looks quite a lot nicer, but just be aware that you're now creating the dependencies as mutable fields behind-the-scenes. That means you can assign to them, even though you probably don't ever want to do that.

If your controller class really looks this simple, you could also just inject the dependencies directly into the Get() method instead of using a constructor.

That said, MVC controllers are probably a good candidate for this style of primary constructor: they likely won't have any validation requirements for the injected dependencies; they shouldn't have any complex logic in them which would mean we want to make sure the fields are readonly; and you could even name the parameters using "field" conventions to make the source more apparent (more on that later).

In fact, you'll probably find that a lot of day-to-day code you write in your application could use primary constructors without any significant impact other than reducing the lines of code.

Nevertheless there are things to watch out for, over-and-above the "overly complex initializer" that I've already flagged. In many ways I'm actually not a fan of some aspects of primary constructors, particularly when they're used with implicit capture. In the next section I describe some of the things I don't like.

Gotchas, or, "things I don't like"

Before I start, I'll just point out that none of these complaints are deal-breakers, or mean that you shouldn't necessarily use them. They're just things that I don't like about some usages of primary constructors, or issues I've run into.

Also bear in mind my current perspective as a library-author (the Datadog .NET client library); I may well be concerned about different things to you! For what it's worth, while writing this post, I found a post on NDepend's blog from a year ago, and they highlighted many of the same gripes!

Duplicate capture (minor)

This complaint is a relatively obvious and minor one: if you initialize a field with a primary constructor parameter and you implicitly capture the value in a member, then you'll have two fields storing the same value:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = firstName; // 👈 initialized
    private readonly string _lastName = lastName;

    public string FullName => $"{firstName} {lastName}"; // 👈 implicitly captured as fields
}

This issue is obvious enough that the compiler automatically adds warnings if you do this by accident:

Warnings generated when primary constructors are captured more than once

Given there are warnings about this case, this generally isn't something you'll need to worry about too much, but it's definitely something to be aware of (i.e. don't ignore the warnings!), and it makes sense when you think about how the capture is implemented.

Implicit fields can't be readonly

I showed previously that if you implicitly capture a primary constructor parameter, the compiler generates a field to store it:

public class Person
{
    // implicit capture fields
    private string <firstName>P;
    private string <lastName>P;
    // ...
}

An important feature of those fields is that they're mutable, i.e. they aren't marked readonly. That means you're free to set these fields outside of the constructor body, as I showed in my previous post.

If that's what you need then no problem. But if a field shouldn't be changed, I would normally religiously mark it as readonly. Marking the field as readonly makes it easier to reason about the intent of the field and it removes a whole class of bugs related to accidental modification of the field.

This again is a relatively minor point, and it's also something that I wouldn't be surprised to see addressed in a future version of C#, but right now, it irks me a bit.

The "answer" here is to not use implicit capture if you want readonly fields. Instead, create the fields manually, mark them readonly, and use the primary constructor to initialize the fields instead.
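
To make the difference concrete, here's a small side-by-side sketch. The type names CapturedPerson and ReadonlyPerson are made up for this comparison; the first relies on implicit capture, the second uses the parameter only to initialize a readonly field:

```csharp
using System;

var captured = new CapturedPerson("Andrew");
captured.Rename("Andy");
Console.WriteLine(captured.Name); // prints "Andy": the captured parameter was reassigned

var safe = new ReadonlyPerson("Andrew");
Console.WriteLine(safe.Name); // prints "Andrew"

// Implicit capture: 'name' is backed by a mutable, compiler-generated field
public class CapturedPerson(string name)
{
    public string Name => name;

    // Compiles without complaint: nothing stops reassignment of the captured parameter
    public void Rename(string newName) => name = newName;
}

// Explicit readonly field initialized from the parameter: nothing is captured
public class ReadonlyPerson(string name)
{
    private readonly string _name = name;

    public string Name => _name;

    // Uncommenting the next line fails to compile with error CS0191
    // public void Rename(string newName) => _name = newName;
}
```

The second version is a few characters longer, but the compiler now enforces the intent for you.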

Implicit fields change the struct layout

This is a relatively niche issue, but if you're working with structs, high-performance code, and/or native interop, then it's sometimes important to know exactly how big a type is, and how its contents are laid out. That involves knowing which fields exist, as well as the order they're declared.

If I have a struct like this:

public struct Person(string firstName, string lastName, int age)
{
    public readonly int Age = age;
    public string FullName => $"{firstName} {lastName}";
}

Then the size of the struct is the sum of the size of the fields (plus any padding). How many fields are there? Based on what we've seen already, there are 3:

  • int Age
  • string <firstName>P
  • string <lastName>P

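On a 64-bit runtime each string reference is 8 bytes, and the 4-byte int is typically padded to 8 bytes for alignment, which gives a size of 8×3=24 bytes. Ok, all well and good. But what if in a later refactoring we realise we don't want to split firstName and lastName, and instead just use name: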

public struct Person(string name, int age)
{
    public readonly int Age = age;
    public string FullName => name;
}

Suddenly we've gone from 3 fields to 2, but without obviously changing the fields (unless you know how primary constructors are implemented!). This could have very bad consequences if we're relying on the size of Person anywhere!

Similarly, if we care about the order of the fields, we're a bit stuck. How are the implicit fields created relative to the "real" fields? We can check the decompiled code, but fundamentally we're at the whim of the compiler here, and the details could change in the future.

This is obviously a niche use case for most people and the simple answer is just don't use primary constructors in this situation. Or, again, use primary constructors to initialize existing fields, and don't rely on the implicit capture. Are you noticing a pattern here? 😉
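
If you want to see the size change for yourself, one option is to ask the runtime directly with Unsafe.SizeOf<T>(). In this sketch I've renamed the two versions of the struct to PersonV1 and PersonV2 (names made up here) so they can sit side by side. I'd expect it to report 24 and 16 on a typical 64-bit runtime, but the exact numbers depend on the platform and on layout decisions made by the runtime, so treat them as an assumption rather than a guarantee:

```csharp
using System;
using System.Runtime.CompilerServices;

// Unsafe.SizeOf<T>() reports the size the runtime uses for T, including padding
Console.WriteLine($"Before refactoring: {Unsafe.SizeOf<PersonV1>()} bytes");
Console.WriteLine($"After refactoring:  {Unsafe.SizeOf<PersonV2>()} bytes");

// One explicit field plus two implicitly captured parameters
public struct PersonV1(string firstName, string lastName, int age)
{
    public readonly int Age = age;
    public string FullName => $"{firstName} {lastName}";
}

// One explicit field plus one implicitly captured parameter
public struct PersonV2(string name, int age)
{
    public readonly int Age = age;
    public string FullName => name;
}
```

The declared fields are identical in both versions; only the constructor parameters changed, and yet the struct shrank.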

Naming convention confusion

The next point is again a minor one, but it's one that you inevitably have to face early on: what conventions should you use for the primary constructor parameters?

Whether or not this is a problem for you may depend on the naming conventions you use. In this post I'm assuming you're using the default Visual Studio-style conventions used by the .NET runtime team. In these conventions: classes, properties, and methods use PascalCase; parameters use camelCase; and private instance fields use camelCase with an _ prefix.

Now, let's consider an example you've seen previously:

public class Person(string firstName, string lastName)
{
    public string FullName => $"{firstName} {lastName}";

    public void SetName(string first, string last)
    {
        firstName = first;
        lastName = last;
    }
}

In this example, the primary constructor parameters are acting much more like fields than like parameters, so does it really make sense to use camelCase for the parameters here? 🤔 Maybe the following would make more sense:

public class Person(string _firstName, string _lastName)
{
    public string FullName => $"{_firstName} {_lastName}";

    public void SetName(string first, string last)
    {
        _firstName = first;
        _lastName = last;
    }
}

Internal to the type, this feels much better. You get better glanceability that the variables being set in SetName are effectively fields.

However, now the constructor parameter names themselves are a bit weird…

var p = new Person(_firstName: "Andrew", _lastName: "Lock");

This isn't really an issue if you're never going to directly create an instance of the type, because it's an MVC controller or an xunit test class for example. These are created by the framework directly, so you never have the weird code above. But it would also seem strange to have different conventions for these special cases.

The problem obviously goes away if you only use primary constructors for initialization, as now they really are only parameters and there aren't any implicit fields:

public class Person(string firstName, string lastName)
{
    private readonly string _firstName = firstName;
    private readonly string _lastName = lastName;

    public string FullName => $"{_firstName} {_lastName}";
}

Now, even though I said I'm focusing on the current .NET runtime conventions, I'll point out that using camelCase for private fields and referencing them with this. doesn't solve the problem for you: your code won't even compile if you try to use this.:

You can't use this. to reference implicit primary constructor fields

For what it's worth, Microsoft suggests using the "parameter" naming convention. This is probably because they say:

It's important to view primary constructor parameters as parameters

Which I kind of disagree with. I think it's important to understand how they're implemented (as fields if you use implicit capture), but to each their own.

Record confusion

OK, this final one is definitely overly pedantic, but it caught me the other day and made me grumble. I was working on some code that was using a record initially, something a bit like this:

var p = new Person("Andrew", "Lock");
var i = new Invoice(p);

public record Person(string FirstName, string LastName);

public class Invoice
{
    private readonly string _customer;
    public Invoice(Person p)
    {
        _customer = $"{p.FirstName} {p.LastName}";
    }
}

At some point, I realised I didn't want the structural equality record brings for Person, so I changed the Person definition to be a class instead of record:

public class Person(string FirstName, string LastName);

And then, my code stopped compiling.

Primary constructors don't automatically generate properties other than in records

And it took me far too long to realise that despite the essentially identical syntax, it was because record converts the primary constructor parameters into properties, whereas class doesn't. And I obviously know that, but when syntax that is identical has fundamentally different behaviour in slightly different circumstances, and you're in the flow, sometimes it catches you out. Grumble grumble.
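
For reference, one way to get the class version compiling again is to declare the properties yourself and initialize them from the parameters. This is a sketch rather than a drop-in replacement for the record: the properties here are get-only, and the Customer property on Invoice is something I've added purely so the example has something to print:

```csharp
using System;

var p = new Person("Andrew", "Lock");
var i = new Invoice(p);
Console.WriteLine(i.Customer); // prints "Andrew Lock"

// A class gets no generated properties, so declare them and initialize from the parameters
public class Person(string firstName, string lastName)
{
    public string FirstName { get; } = firstName;
    public string LastName { get; } = lastName;
}

public class Invoice
{
    private readonly string _customer;

    public Invoice(Person p)
    {
        _customer = $"{p.FirstName} {p.LastName}";
    }

    // Hypothetical addition for the demo, not in the original Invoice
    public string Customer => _customer;
}
```

Note that this only uses the primary constructor for initialization, so no implicit fields are generated.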

You probably noticed a recurrent theme throughout my griping: using primary constructors for initialization is fine; all the sharp edges are around implicit capture. This made me wonder: is there a way to use only the initialization features of primary constructors? And it turns out, yes! In the next post I'll show how you can use a Roslyn analyzer to enforce that behaviour!

Summary

In my previous post I provided an introduction to primary constructors and how they work behind-the-scenes. In this post I expanded on that post to describe some of the cases I think primary constructors work well for. I then discussed some of the various gripes I have with primary constructors, mostly stemming from the implicit capture usage. In the next post I'll show an analyzer you can use to ensure primary constructors are only used for initialization.

