Channel: Andrew Lock | .NET Escapades

Exploring the generated code: T[], Span, and Immutable collections: Behind the scenes of collection expressions - Part 3

This series takes an in-depth look at collection expressions, which were introduced with C# 12. In the first post I provided an introduction to collection expressions, and in the previous post we looked at the code generated when you use List<T> with collection expressions. In this post we look at the code that collection expressions generate for arrays, ReadOnlySpan<T>/Span<T>, and immutable collections.

Note that by design the code produced by the compiler may change in future versions of .NET and C#. Also, in many situations the compiler-generated code uses low-level APIs that you shouldn't need to use in general, so don't worry if it all looks a bit complicated. That's kind of the point—the compiler is doing all this work so you don't have to!

Optimized collection expressions for arrays

Arrays are where we start to see some big differences in the generated code, depending on what types you're using. For most reference types, the collection expression code for arrays will be pretty much what you'd expect. For example, if you have this:

string[] array = [ "1", "2", "3", "4", "5" ];

then the generated code is the same as that generated using traditional array initializers:

string[] array = new string[5];
array[0] = "1";
array[1] = "2";
array[2] = "3";
array[3] = "4";
array[4] = "5";

Things get more interesting if T is a simple primitive type like an int, double, or bool, for example:

int[] array = [1, 2, 3, 4, 5];

Suddenly the generated code looks significantly different:

RuntimeHelpers.InitializeArray(new int[5], (RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/);

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    [StructLayout(LayoutKind.Explicit, Pack = 1, Size = 20)]
    private struct __StaticArrayInitTypeSize=20
    {
    }

    internal static readonly __StaticArrayInitTypeSize=20 4F6ADDC9659D6FB90FE94B6688A79F2A1FA8D36EC43F8F3E1D9B6528C448A384/* Not supported: data(01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00) */;
}

You'll notice all the "not supported" comments in there. That's because the generated code is valid IL but isn't valid C#, so SharpLab is just doing its best here. You might be able to infer what's going on though:

  • The representation of the final array is converted to a series of bytes and stored as a static readonly field in a class called <PrivateImplementationDetails>
  • The RuntimeHelpers.InitializeArray() method takes an array and a reference to a field, and directly overwrites the contents of the array with the field.

That's very powerful; there's no iterating through each element of the array, it's literally copying memory from a static field (embedded in the assembly) to another memory location (the array).
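As a rough analogy for that bulk copy, the sketch below fills an int[] from a block of bytes in a single call rather than element by element. This is not what the compiler emits (that is the RuntimeHelpers.InitializeArray() call shown above); the blob variable is a hypothetical stand-in for the data embedded in the assembly, and the byte layout assumes a little-endian machine:

```csharp
using System;

// Hypothetical stand-in for the data the compiler embeds in the assembly:
// the little-endian bytes of the ints 1, 2, 3, 4, 5.
byte[] blob =
[
    1, 0, 0, 0,
    2, 0, 0, 0,
    3, 0, 0, 0,
    4, 0, 0, 0,
    5, 0, 0, 0
];

int[] array = new int[5];

// One bulk copy of 20 bytes instead of five separate element assignments.
Buffer.BlockCopy(blob, 0, array, 0, blob.Length);

Console.WriteLine(string.Join(", ", array)); // 1, 2, 3, 4, 5 on little-endian hardware
```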

<PrivateImplementationDetails> is a class that the C# compiler emits to hold compiler-generated helpers and data that are used by other generated code.

If you squint at the data() comment on the 4F6ADDC9659D6... field, and remember that each int is 4 bytes, you can see that this looks like the 1, 2, 3, 4, 5 of the array. Play with the number of values in the collection and you'll see the StructLayout.Size and __StaticArrayInitTypeSize values changing accordingly.
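If you'd rather not squint, you can print the raw bytes of an int[] yourself and compare them with the data() comment. This is only a small sketch for inspecting memory, not part of the generated code, and the exact byte order shown assumes little-endian hardware:

```csharp
using System;
using System.Runtime.InteropServices;

int[] array = [1, 2, 3, 4, 5];

// Reinterpret the int elements as raw bytes (no copy is made).
ReadOnlySpan<int> ints = array;
ReadOnlySpan<byte> bytes = MemoryMarshal.AsBytes(ints);

Console.WriteLine(bytes.Length);               // 20, i.e. 5 elements * 4 bytes
Console.WriteLine(Convert.ToHexString(bytes)); // 0100000002000000030000000400000005000000 on little-endian hardware
```

The length of 20 matches the Size = 20 in the StructLayout attribute and the __StaticArrayInitTypeSize=20 name.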

You can also try with other primitive types like double and bool and see that you get the same behaviour. You need to use arrays where 3 or more elements have a non-default value; no matter the size of the array, if two or fewer values are non-default, the compiler switches to simply assigning the elements directly.

I feel it would be slightly remiss if I didn't show the IL that's actually generated when the array is initialized directly from a field, so here's the IL generated for the 5-element array collection expression shown previously:

IL_0000: ldc.i4.5 // Push '5' onto the stack as an int32
IL_0001: newarr [System.Runtime]System.Int32 // create an int[5]
IL_0006: dup // duplicate the reference (it will be "consumed" by a later call)
// 👇 Converts a referenced metadata token to a `RuntimeFieldHandle`
IL_0007: ldtoken field valuetype '<PrivateImplementationDetails>'/'__StaticArrayInitTypeSize=20' '<PrivateImplementationDetails>'::4F6ADDC9659D6FB90FE94B6688A79F2A1FA8D36EC43F8F3E1D9B6528C448A384
// 👇 Call RuntimeHelpers::InitializeArray, passing in the array and the RuntimeFieldHandle
IL_000c: call void [System.Runtime]System.Runtime.CompilerServices.RuntimeHelpers::InitializeArray(class [System.Runtime]System.Array, valuetype [System.Runtime]System.RuntimeFieldHandle)

There's still some hand waving around the ldtoken etc but conceptually it hopefully makes sense: we're copying a fixed, compile-time chunk of memory into the raw array memory. So it's fast. 😃

IEnumerable<T>, IReadOnlyCollection<T>, IReadOnlyList<T>

IEnumerable<T>, IReadOnlyCollection<T>, and IReadOnlyList<T> typed collection expressions are backed by a T[] and mostly behave like the array equivalents. So for primitive blittable types like int you get the optimised InitializeArray() code, whereas for other types you get the "naïve" initialization of each element.

"Blittable" technically refers to whether a type has the same representation in managed and unmanaged memory, and includes (among others) primitive types like int, long, and double. Types like string and bool (perhaps surprisingly) are not blittable.

However, for the purposes of this post I'm slightly abusing the term blittable to mean any type for which the compiler can store the array of values in a field as a contiguous block of memory.

Interestingly, the interface implementations aren't as straightforward as the List<T>/IList<T> case we saw in the previous post, where creating an IList<T> collection expression returns a List<T> behind the scenes. That becomes apparent if we use each of the array interfaces with collection expressions:

using System;
using System.Collections.Generic;

IEnumerable<int> enumerable = [1, 2, 3, 4, 5];
IReadOnlyCollection<int> collection = [1, 2, 3, 4, 5];
IReadOnlyList<int> list = [1, 2, 3, 4, 5];

Console.WriteLine(enumerable is int[]); // False
Console.WriteLine(collection is int[]); // False
Console.WriteLine(list is int[]); // False

As you can see, the IEnumerable<T> etc are not simply implemented by int[]. Let's take a look at the generated code, taking the non-blittable version for example (as the generated code is easier to read):

using System.Collections.Generic;

IEnumerable<string> array = [ "1", "2", "3", "4", "5" ];

The generated code for this example shows that we're wrapping the array in another type.

string[] array = new string[5];
array[0] = "1";
array[1] = "2";
array[2] = "3";
array[3] = "4";
array[4] = "5";
return new <>z__ReadOnlyArray<string>(array);

The unpronounceable type <>z__ReadOnlyArray<T> is another compiler-emitted type, and it behaves pretty much as you might expect. It's a simple wrapper around a T[], implementing many of the same interfaces as arrays, but as a truly readonly version:

internal sealed class <>z__ReadOnlyArray<T> : IEnumerable, IEnumerable<T>, IReadOnlyCollection<T>, IReadOnlyList<T>, ICollection<T>, IList<T>
{
    private readonly T[] _items;

    public <>z__ReadOnlyArray(T[] items) => _items = items;

    int IReadOnlyCollection<T>.Count => _items.Length;
    T IReadOnlyList<T>.this[int index] => _items[index];
    int ICollection<T>.Count => _items.Length;
    bool ICollection<T>.IsReadOnly => true;

    T IList<T>.this[int index]
    {
        get => _items[index];
        set => throw new NotSupportedException();
    }

    IEnumerator IEnumerable.GetEnumerator() => ((IEnumerable)_items).GetEnumerator();
    IEnumerator<T> IEnumerable<T>.GetEnumerator() => ((IEnumerable<T>)_items).GetEnumerator();

    bool ICollection<T>.Contains(T item) => ((ICollection<T>)_items).Contains(item);
    void ICollection<T>.CopyTo(T[] array, int arrayIndex) => ((ICollection<T>)_items).CopyTo(array, arrayIndex);

    int IList<T>.IndexOf(T item) => ((IList<T>)_items).IndexOf(item);

    void ICollection<T>.Add(T item) => throw new NotSupportedException();
    void ICollection<T>.Clear() => throw new NotSupportedException();
    bool ICollection<T>.Remove(T item) => throw new NotSupportedException();
    void IList<T>.Insert(int index, T item) => throw new NotSupportedException();
    void IList<T>.RemoveAt(int index) => throw new NotSupportedException();
}

Interestingly, <>z__ReadOnlyArray<T> only partially implements the non-readonly collection interfaces, and throws if you try to invoke any of the mutation methods. I'm not entirely sure of the logic of that approach; it seems a little dangerous to me but 🤷‍♂️ My assumption is that it enables other optimisations in the compiler thanks to the presence of ICollection<T>.CopyTo() etc.
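You can observe this behaviour at runtime with a small experiment. The sketch below assumes the wrapper type described above is what the compiler emits; since that's an implementation detail that may change between compiler versions, the code checks for IList<T> rather than assuming it, and I've deliberately not annotated the output of the type name:

```csharp
using System;
using System.Collections.Generic;

IEnumerable<string> values = ["1", "2", "3"];

// The concrete type is a compiler implementation detail and may change.
Console.WriteLine(values.GetType().Name);

bool threw = false;
if (values is IList<string> list)
{
    // Reports whether the wrapper describes itself as read-only.
    Console.WriteLine(list.IsReadOnly);

    try
    {
        list.Add("4"); // attempt a mutation through the interface
    }
    catch (NotSupportedException)
    {
        threw = true;
    }
}

Console.WriteLine(threw);
Console.WriteLine(string.Join(", ", values)); // 1, 2, 3
```

Whichever concrete type you get, the contents stay the same: either the cast doesn't apply, or the mutation is rejected.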

Creating ReadOnlySpan<T> collection expressions

For ReadOnlySpan<T> we have to think about two different implementations again: one for blittable values like int, and one for other types. We'll start with the int version. The code samples have to "use" the collection in some way, otherwise the compiler might omit it completely, as creating a ReadOnlySpan<T> is side-effect free:

using System;

ReadOnlySpan<int> A()
{
    ReadOnlySpan<int> array = [1, 2, 3, 4, 5];
    return array;
}

This generates code that looks like the following:

return RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/);

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    [StructLayout(LayoutKind.Explicit, Pack = 4, Size = 20)]
    private struct __StaticArrayInitTypeSize=20_Align=4
    {
    }

    internal static readonly __StaticArrayInitTypeSize=20_Align=4 4F6ADDC9659D6FB90FE94B6688A79F2A1FA8D36EC43F8F3E1D9B6528C448A3844/* Not supported: data(01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00) */;
}

That looks remarkably familiar! It's almost identical to the int[] code, but instead of calling RuntimeHelpers.InitializeArray() it's calling RuntimeHelpers.CreateSpan(), which in turn calls RuntimeHelpers.GetSpanDataFrom(). As Stephen Toub describes in his epic performance improvements in .NET 8 post:

It blits the data for the array into the assembly, and then constructing the span isn’t via an array allocation, but rather just wrapping the span around a pointer directly into the assembly’s data. This not only avoids the startup overhead and the extra object on the heap, it also better enables various JIT optimizations, especially when the JIT is able to see what offset is being accessed.
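One hedged way to check the "no array allocation" claim yourself is to compare the thread's allocation counter before and after creating and reading the span. This is only an informal experiment, not a benchmark: I'd expect the difference to be zero on a runtime that supports RuntimeHelpers.CreateSpan(), but the result depends on your runtime and compiler versions, so I haven't annotated it:

```csharp
using System;

long before = GC.GetAllocatedBytesForCurrentThread();

ReadOnlySpan<int> span = [1, 2, 3, 4, 5];

// Read every element so the span is definitely "used".
int sum = 0;
foreach (int value in span)
{
    sum += value;
}

long after = GC.GetAllocatedBytesForCurrentThread();

Console.WriteLine(sum);            // 15
Console.WriteLine(after - before); // bytes allocated while creating and reading the span
```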

So that covers the int case. If you remember, int[] had similar optimisations to these, but for string[] the compiler fell back to "basic" initialization of each of the elements. So what happens for ReadOnlySpan<string>?

using System;

ReadOnlySpan<string> array = [ "1", "2", "3", "4", "5" ];

The generated code is definitely not a naïve implementation!

<>y__InlineArray5<string> buffer = default(<>y__InlineArray5<string>);
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray5<string>, string>(ref buffer, 0) = "1";
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray5<string>, string>(ref buffer, 1) = "2";
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray5<string>, string>(ref buffer, 2) = "3";
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray5<string>, string>(ref buffer, 3) = "4";
<PrivateImplementationDetails>.InlineArrayElementRef<<>y__InlineArray5<string>, string>(ref buffer, 4) = "5";
<PrivateImplementationDetails>.InlineArrayAsReadOnlySpan<<>y__InlineArray5<string>, string>(ref buffer, 5);
    

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static ReadOnlySpan<TElement> InlineArrayAsReadOnlySpan<TBuffer, TElement>([In][IsReadOnly] ref TBuffer buffer, int length)
    {
        return MemoryMarshal.CreateReadOnlySpan(ref Unsafe.As<TBuffer, TElement>(ref Unsafe.AsRef(ref buffer)), length);
    }

    internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
    {
        return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
    }
}

[StructLayout(LayoutKind.Auto)]
[InlineArray(5)]
internal struct <>y__InlineArray5<T>
{
    [CompilerGenerated]
    private T _element0;
}

It's kind of hard to follow all the unspeakable names in the above example, so we'll break it down piece by piece. We'll start with <>y__InlineArray5<T>:

[StructLayout(LayoutKind.Auto)]
[InlineArray(5)]
internal struct <>y__InlineArray5<T>
{
    [CompilerGenerated]
    private T _element0;
}

This defines a new struct type as an inline array. The syntax for inline arrays is rather unintuitive, but you can think of it as a way of creating a T[] but without having to allocate a new array on the heap; instead the elements are embedded directly in the <>y__InlineArray5<T> struct instance.

That's a very vague description, and I may well write a dedicated post about them at a later date. In the meantime I recommend reading this great explainer by Lazlo.
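To make that slightly less vague, here's a minimal sketch of declaring and using an inline array by hand. Buffer5<T> is a hypothetical name I've picked as a pronounceable stand-in for <>y__InlineArray5<T>; the element access and the conversion to ReadOnlySpan<T> rely on the C# 12 language support for inline arrays rather than on the compiler-generated helpers:

```csharp
using System;
using System.Runtime.CompilerServices;

// The five elements live inside the struct itself, not in a separate heap array.
Buffer5<string> buffer = default;
for (int i = 0; i < 5; i++)
{
    buffer[i] = (i + 1).ToString();
}

// C# 12 lets an inline array convert implicitly to a span over its elements.
ReadOnlySpan<string> span = buffer;

Console.WriteLine(span.Length);                       // 5
Console.WriteLine(string.Join(", ", span.ToArray())); // 1, 2, 3, 4, 5

// Hypothetical hand-written equivalent of the generated <>y__InlineArray5<T>.
[InlineArray(5)]
struct Buffer5<T>
{
    private T _element0;
}
```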

Next we have the <PrivateImplementationDetails>.InlineArrayElementRef<TBuffer, TElement>() implementation. This takes the reference to an inline array, and returns a reference to a given index into the array:

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
    {
        return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
    }
}
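If Unsafe.Add() is unfamiliar, the following sketch shows the same "reference plus offset" idea against an ordinary array, which is easier to experiment with than an inline array. It isn't the generated code, just the same primitive in isolation:

```csharp
using System;
using System.Runtime.CompilerServices;

int[] data = [10, 20, 30];

// Take a reference to the first element, then offset it by two elements.
ref int first = ref data[0];
ref int third = ref Unsafe.Add(ref first, 2);

// Writing through the returned reference updates the underlying storage.
third = 99;

Console.WriteLine(data[2]); // 99
```

Note that Unsafe.Add() performs no bounds checking, which is fine for the compiler (it knows the length of the inline array) but is something to be careful with in hand-written code.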

Finally, we have the <PrivateImplementationDetails>.InlineArrayAsReadOnlySpan<TBuffer, TElement>() method which does what it says on the tin: it takes an inline array as a parameter, and returns a ReadOnlySpan<TElement> reference to the inline array contents:

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static ReadOnlySpan<TElement> InlineArrayAsReadOnlySpan<TBuffer, TElement>([In][IsReadOnly] ref TBuffer buffer, int length)
    {
        return MemoryMarshal.CreateReadOnlySpan(ref Unsafe.As<TBuffer, TElement>(ref Unsafe.AsRef(ref buffer)), length);
    }
}

Now we can put it all together to understand the code the collection expression generates. To make it a little easier to read, I've simplified the unspeakable names by essentially providing type aliases:

// simplify some names
using Private = <PrivateImplementationDetails>;
using StringInlineArray = <>y__InlineArray5<string>;

// Create an instance of the inline array
StringInlineArray buffer = default(StringInlineArray);

// Set each of the elements of the inline array
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 0) = "1";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 1) = "2";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 2) = "3";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 3) = "4";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 4) = "5";

// Wrap the contents of the inline array in a ReadOnlySpan<string>
Private.InlineArrayAsReadOnlySpan<StringInlineArray, string>(ref buffer, 5);

By using inline arrays, the compiler is able to avoid allocating a whole new array (string[]), which would always be allocated on the heap. An inline array is a struct, so it can be allocated on the stack or directly embedded into other types, which can reduce allocations (depending on the details, as always!)

Note that only the array itself is "inline" and embedded. Each of the string instances would be allocated as normal, and each array element contains a reference to such an instance.

After ReadOnlySpan<T> the next obvious choice to consider is Span<T>.

Span<T>: inline arrays all the way

In the previous section we saw that the collection expression implementation for ReadOnlySpan<T> when T is blittable (e.g. int) is heavily optimised by storing a static readonly field in the assembly, and then "wrapping" a ReadOnlySpan<T> around this area of memory.

Unfortunately, Span<T> requires that you can mutate the elements it wraps, so that optimized approach won't work here. However, the non-blittable approach which uses inline arrays is possible. In fact Span<T> collection expressions use the same inline array implementation whatever the T. This code uses both blittable and non-blittable arrays:

using System;

Span<string> array = [ "1", "2", "3", "4", "5" ];
Span<int> array2 = [ 1, 2, 3, 4, 5 ];

The generated code looks a little like the following (I added type aliases for readability again), and shows that the implementations are essentially the same in both cases:

// simplify some names
using Private = <PrivateImplementationDetails>;
using StringInlineArray = <>y__InlineArray5<string>;
using IntInlineArray = <>y__InlineArray5<int>;

// Span<string> array = [ "1", "2", "3", "4", "5" ];
StringInlineArray buffer = default(StringInlineArray);
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 0) = "1";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 1) = "2";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 2) = "3";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 3) = "4";
Private.InlineArrayElementRef<StringInlineArray, string>(ref buffer, 4) = "5";
Private.InlineArrayAsSpan<StringInlineArray, string>(ref buffer, 5);

// Span<int> array2 = [1, 2, 3, 4, 5];
IntInlineArray buffer2 = default(IntInlineArray);
Private.InlineArrayElementRef<IntInlineArray, int>(ref buffer2, 0) = 1;
Private.InlineArrayElementRef<IntInlineArray, int>(ref buffer2, 1) = 2;
Private.InlineArrayElementRef<IntInlineArray, int>(ref buffer2, 2) = 3;
Private.InlineArrayElementRef<IntInlineArray, int>(ref buffer2, 3) = 4;
Private.InlineArrayElementRef<IntInlineArray, int>(ref buffer2, 4) = 5;
Private.InlineArrayAsSpan<IntInlineArray, int>(ref buffer2, 5);

[CompilerGenerated]
internal sealed class <PrivateImplementationDetails>
{
    internal static Span<TElement> InlineArrayAsSpan<TBuffer, TElement>(ref TBuffer buffer, int length)
    {
        return MemoryMarshal.CreateSpan(ref Unsafe.As<TBuffer, TElement>(ref buffer), length);
    }

    internal static ref TElement InlineArrayElementRef<TBuffer, TElement>(ref TBuffer buffer, int index)
    {
        return ref Unsafe.Add(ref Unsafe.As<TBuffer, TElement>(ref buffer), index);
    }
}

[StructLayout(LayoutKind.Auto)]
[InlineArray(5)]
internal struct <>y__InlineArray5<T>
{
    [CompilerGenerated]
    private T _element0;
}

Note that it's the same <>y__InlineArray5<T> being reused in both cases here, just with a different T. If we had a 5-element ReadOnlySpan<T> it would also re-use this implementation. Similarly, the InlineArrayElementRef() implementation is the same as for ReadOnlySpan<T>, while InlineArrayAsSpan() is directly analogous to the InlineArrayAsReadOnlySpan version.
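The reason Span<T> can't share the static data is easy to demonstrate. In the sketch below, two spans are created from identical collection expressions, and mutating one has no effect on the other, which is only possible because each gets its own backing storage:

```csharp
using System;

Span<int> first = [1, 2, 3, 4, 5];
Span<int> second = [1, 2, 3, 4, 5];

// Mutate one span; the other must be unaffected.
first[0] = 100;

Console.WriteLine(first[0]);  // 100
Console.WriteLine(second[0]); // 1
```

If both spans pointed at the same block of assembly data, as the ReadOnlySpan<int> versions can, that write would be visible through both.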

ImmutableArray<T>, ImmutableList<T>, and the other immutable collections

We've almost finished this extensive look at all the types you can use with collection expressions. Our final set of types are:

  • ImmutableArray<T>
  • ImmutableList<T>/IImmutableList<T>
  • ImmutableQueue<T>/IImmutableQueue<T>
  • ImmutableStack<T>/IImmutableStack<T>
  • ImmutableHashSet<T>/IImmutableSet<T>
  • ImmutableSortedSet<T>

We'll start with ImmutableArray<T>, as the generated code there is slightly different to the other immutable collections. Taking a simple int example:

using System.Collections.Immutable;

ImmutableArray<int> array = [1, 2, 3, 4, 5];

The generated code has two steps:

  • Initialize an array
  • Call ImmutableCollectionsMarshal.AsImmutableArray(array) to create the ImmutableArray

int[] array = new int[5];
RuntimeHelpers.InitializeArray(array, (RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/);

ImmutableCollectionsMarshal.AsImmutableArray(array);

If you remember back to the previous int[] analysis, you'll see that this is literally using the same collection-expression-generated-code for creating an array, and then converting the array to the immutable array type.

The ImmutableCollectionsMarshal.AsImmutableArray() implementation simply returns an ImmutableArray that wraps the provided instance. This is faster (and has fewer allocations) than calling ImmutableArray.Create(array), which would be the usual approach but which creates a copy of the array for safety. The compiler can safely use ImmutableCollectionsMarshal here, because it knows no one else can have a reference to the generated array.
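The difference between wrapping and copying is visible if you hold on to the source array and mutate it afterwards, which is exactly what the compiler-generated code never does. A minimal sketch, assuming .NET 8 or later (where ImmutableCollectionsMarshal is available):

```csharp
using System;
using System.Collections.Immutable;
using System.Runtime.InteropServices;

int[] source = [1, 2, 3];

// Wraps the existing array without copying it.
ImmutableArray<int> wrapped = ImmutableCollectionsMarshal.AsImmutableArray(source);

// Copies the array into a new one owned by the ImmutableArray.
ImmutableArray<int> copied = ImmutableArray.Create(source);

// Mutating the source "leaks" into the wrapped instance, but not the copy.
source[0] = 42;

Console.WriteLine(wrapped[0]); // 42
Console.WriteLine(copied[0]);  // 1
```

This is why AsImmutableArray() lives in an "unsafe-sounding" marshal class: it's only safe when nothing else holds a reference to the array.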

The generated code is essentially the same if you're using a non-blittable type like string. It still creates a T[] and then calls ImmutableCollectionsMarshal.AsImmutableArray(). The only difference is that the generated code uses the "simple" initialization code for the T[] for non-blittable types.

The remaining immutable implementations all use essentially the same implementation. They each create a ReadOnlySpan<T> (using whichever generated code is appropriate for a given T) and then call their respective Create() methods. So for the following code:

using System.Collections.Immutable;

ImmutableList<int> list = [1, 2, 3, 4, 5];
IImmutableList<int> ilist = [1, 2, 3, 4, 5];

ImmutableQueue<int> queue = [1, 2, 3, 4, 5];
IImmutableQueue<int> iqueue = [1, 2, 3, 4, 5];

ImmutableStack<int> stack = [1, 2, 3, 4, 5];
IImmutableStack<int> istack = [1, 2, 3, 4, 5];

ImmutableHashSet<int> set = [1, 2, 3, 4, 5];
IImmutableSet<int> iset = [1, 2, 3, 4, 5];

ImmutableSortedSet<int> sortedSet = [1, 2, 3, 4, 5];

we end up with the following generated code. I've included the <PrivateImplementationDetails> code in this case as it shows that each call to RuntimeHelpers.CreateSpan() can wrap the same constant data in memory, because it creates a ReadOnlySpan<T>:

ImmutableList.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));
ImmutableList.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));

ImmutableQueue.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));
ImmutableQueue.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));

ImmutableStack.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));
ImmutableStack.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));

ImmutableHashSet.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));
ImmutableHashSet.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));

ImmutableSortedSet.Create(RuntimeHelpers.CreateSpan<int>((RuntimeFieldHandle)/*OpCode not supported: LdMemberToken*/));

internal sealed class <PrivateImplementationDetails>
{
    [StructLayout(LayoutKind.Explicit, Pack = 4, Size = 20)]
    private struct __StaticArrayInitTypeSize=20_Align=4
    {
    }

    internal static readonly __StaticArrayInitTypeSize=20_Align=4 4F6ADDC9659D6FB90FE94B6688A79F2A1FA8D36EC43F8F3E1D9B6528C448A3844/* Not supported: data(01 00 00 00 02 00 00 00 03 00 00 00 04 00 00 00 05 00 00 00) */;
}
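You can write roughly the same thing by hand, which may make the generated code easier to picture. This sketch assumes .NET 8 or later, where ImmutableList.Create() has an overload that accepts a ReadOnlySpan<T>; I've specified the type argument explicitly so it's obvious which overload is being called:

```csharp
using System;
using System.Collections.Immutable;
using System.Linq;

// What you write with collection expressions.
ImmutableList<int> viaExpression = [1, 2, 3, 4, 5];

// Roughly what the compiler does on your behalf:
// build a ReadOnlySpan<int>, then pass it to the Create() method.
ReadOnlySpan<int> items = [1, 2, 3, 4, 5];
ImmutableList<int> viaCreate = ImmutableList.Create<int>(items);

Console.WriteLine(viaExpression.SequenceEqual(viaCreate)); // True
```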

And with that we've covered all the built-in types that I'm going to look into in this series. In the next post we'll look at how the generated code changes when you use the spread element .. in your collection expressions.

Summary

In this post I looked at the code generated for collection expressions when you're generating arrays, Span<T>, ReadOnlySpan<T>, and immutable collections. I showed that the array code is highly optimised when the data is a primitive like int or double, as the whole array can be copied from a static readonly field. Similarly, I showed that a ReadOnlySpan<T> can be created directly from that field. Span<T> uses inline arrays as the main optimization, while the immutable types build on top of the ReadOnlySpan<T> implementation.

