
Writing a .NET Garbage Collector in C#

If you read my articles, you may know that I like playing with NativeAOT a lot, especially to use C# in places where it wasn't possible before. I have already written a simple profiler. This time we go one step further and try to write a garbage collector in C#.

Of course, this won't result in anything usable in production. Building a performant and fully featured GC would take hundreds of hours of work, and using a managed language for it is a questionable choice (can you imagine your GC being interrupted by its own garbage collection?). Still, it's a great excuse to learn more about the internals of the .NET garbage collector.

Also, you may wonder why NativeAOT is necessary for this. As with the profiler, there are two reasons why "vanilla" .NET can't be used. First, the GC is loaded at a point where the runtime is not yet ready to run managed code. Second, the GC would end up depending on itself, which is a bit of a chicken-and-egg problem. NativeAOT embeds its own runtime and does not interact with the system CLR, so it can be used to bypass these limitations.

.NET added support for loading an external GC (called a "standalone GC") in .NET Core 2.1. To use it, you need to build a DLL that exposes two methods:

  • GC_Initialize: called to initialize the GC
  • GC_VersionInfo: called to query the API version supported by the GC

The first step is to create a new .NET 9 library project and add <PublishAot>true</PublishAot> to the csproj to enable NativeAOT. Then we can add a DllMain class (the name doesn't matter) and expose the two required methods:

using System;
using System.Runtime.InteropServices;

public class DllMain
{
    // Exported entry point: called by the runtime to initialize the GC.
    [UnmanagedCallersOnly(EntryPoint = "GC_Initialize")]
    public static unsafe uint GC_Initialize(IntPtr clrToGC, IntPtr* gcHeap, IntPtr* gcHandleManager, GcDacVars* gcDacVars)
    {
        Console.WriteLine("GC_Initialize");
        return 0x80004005; /* E_FAIL */
    }

    // Exported entry point: called by the runtime to query the supported API version.
    [UnmanagedCallersOnly(EntryPoint = "GC_VersionInfo")]
    public static unsafe void GC_VersionInfo(VersionInfo* versionInfo)
    {
        // On entry, versionInfo holds the version provided by the runtime.
        Console.WriteLine($"GC_VersionInfo {versionInfo->MajorVersion}.{versionInfo->MinorVersion}.{versionInfo->BuildVersion}");

        versionInfo->MajorVersion = 5;
        versionInfo->MinorVersion = 3;
    }

    [StructLayout(LayoutKind.Sequential)]
    public unsafe struct VersionInfo
    {
        public int MajorVersion;
        public int MinorVersion;
        public int BuildVersion;
        public byte* Name;
    }

    [StructLayout(LayoutKind.Sequential)]
    public readonly struct GcDacVars
    {
        public readonly byte Major_version_number;
        public readonly byte Minor_version_number;
        public readonly nint Generation_size;
        public readonly nint Total_generation_count;
    }
}

For now we don't do much: we just write a message to the console to confirm that the functions are called correctly. In GC_VersionInfo, we also set the version of the API that we support. The GC API is not documented as far as I know, so you have to dig into the CLR source code to find the right version numbers. Note that the versionInfo argument initially contains the version of the Execution Engine API provided by the CLR. This is useful if you want to write a GC that supports multiple versions of .NET.

The DLL can be published with NativeAOT by simply running:

dotnet publish -r win-x64

To load the custom GC into a .NET application, we need to copy it to the same folder as the application and set the DOTNET_GCName environment variable to the name of the DLL. Alternatively, you can use DOTNET_GCPath, which accepts a full path and therefore lets you load the GC from another folder.

set DOTNET_GCName=ManagedDotnetGC.dll

Then we can run the application, and we are immediately greeted by our first crash.

Sure, our "GC" isn't functional yet, but we would at least expect to see the messages we write to the console.

If we attach a debugger, we can see that the crash is an access violation in the GC_Initialize function, while trying to write to the console.

However, if you look closely at the callstack, you can see something odd:

ManagedDotnetGC.dll!S_P_CoreLib_System_Threading_Volatile__Read_12()
ManagedDotnetGC.dll!System_Console_System_Console__get_Out()
ManagedDotnetGC.dll!System_Console_System_Console__WriteLine_12()
ManagedDotnetGC.dll!ManagedDotnetGC_ManagedDotnetGC_DllMain__GC_Initialize()
(Inline Frame) ManagedDotnetGC.dll!GCHeapUtilities::InitializeDefaultGC()
ManagedDotnetGC.dll!InitializeDefaultGC()
ManagedDotnetGC.dll!InitializeGC()
(Inline Frame) ManagedDotnetGC.dll!InitDLL(void * hPalInstance)
ManagedDotnetGC.dll!RhInitialize(bool isDll)
ManagedDotnetGC.dll!InitializeRuntime()
(Inline Frame) ManagedDotnetGC.dll!Thread::EnsureRuntimeInitialized()
(Inline Frame) ManagedDotnetGC.dll!Thread::ReversePInvokeAttachOrTrapThread(ReversePInvokeFrame *)
ManagedDotnetGC.dll!RhpReversePInvokeAttachOrTrapThread2(ReversePInvokeFrame * pFrame)
ManagedDotnetGC.dll!ManagedDotnetGC_ManagedDotnetGC_DllMain__GC_VersionInfo()
coreclr.dll!`anonymous namespace'::LoadAndInitializeGC(const wchar_t * standaloneGCName, const wchar_t * standaloneGCPath)
coreclr.dll!InitializeGarbageCollector()
coreclr.dll!EEStartupHelper()
coreclr.dll!EEStartup()
coreclr.dll!EnsureEEStarted()
coreclr.dll!CorHost2::Start()
coreclr.dll!coreclr_initialize(const char * exePath, const char * appDomainFriendlyName, int propertyCount, const char * * propertyKeys, const char * * propertyValues, void * * hostHandle, unsigned int * domainId)
hostpolicy.dll!coreclr_t::create(const std::wstring & libcoreclr_path, const char * exe_path, const char * app_domain_friendly_name, const coreclr_property_bag_t & properties, std::unique_ptr<coreclr_t> & inst)
hostpolicy.dll!`anonymous namespace'::create_coreclr()
hostpolicy.dll!corehost_main(const int argc, const wchar_t * * argv)
hostfxr.dll!execute_app(const std::wstring & impl_dll_dir, corehost_init_t * init, const int argc, const wchar_t * * argv)
hostfxr.dll!`anonymous namespace'::read_config_and_execute(const std::wstring & host_command, const host_startup_info_t & host_info, const std::wstring & app_candidate, const std::unordered_map<known_options,std::vector<std::wstring,std::allocator<std::wstring>>,known_options_hash,std::equal_to<known_options>,std::allocator<std::pair<known_options const,std::vector<std::wstring,std::allocator<std::wstring>>>>> & opts, int new_argc, const wchar_t * * new_argv, host_mode_t mode, const bool is_sdk_command, wchar_t * out_buffer, int buffer_size, int * required_buffer_size)
hostfxr.dll!fx_muxer_t::handle_exec_host_command(const std::wstring & host_command, const host_startup_info_t & host_info, const std::wstring & app_candidate, const std::unordered_map<known_options,std::vector<std::wstring,std::allocator<std::wstring>>,known_options_hash,std::equal_to<known_options>,std::allocator<std::pair<known_options const,std::vector<std::wstring,std::allocator<std::wstring>>>>> & opts, int argc, const wchar_t * * argv, int argoff, host_mode_t mode, const bool is_sdk_command, wchar_t * result_buffer, int buffer_size, int * required_buffer_size)
hostfxr.dll!fx_muxer_t::execute(const std::wstring host_command, const int argc, const wchar_t * * argv, const host_startup_info_t & host_info, wchar_t * result_buffer, int buffer_size, int * required_buffer_size)
hostfxr.dll!hostfxr_main_startupinfo(const int argc, const wchar_t * * argv, const wchar_t * host_path, const wchar_t * dotnet_root, const wchar_t * app_path)
TestApp.exe!exe_start(const int argc, const wchar_t * * argv)
TestApp.exe!wmain(const int argc, const wchar_t * * argv)
(Inline Frame) TestApp.exe!invoke_main()
TestApp.exe!__scrt_common_main_seh()
kernel32.dll!BaseThreadInitThunk()
ntdll.dll!RtlUserThreadStart()

Starting from the bottom, we can see a bunch of frames related to the initialization of the test application, then a call to ManagedDotnetGC.dll!ManagedDotnetGC_ManagedDotnetGC_DllMain__GC_VersionInfo(). So the runtime calls GC_VersionInfo… but we crashed in GC_Initialize? Looking further up, we can see that GC_VersionInfo triggers the initialization of the NativeAOT runtime (ManagedDotnetGC.dll!InitializeRuntime), which in turn initializes its own GC (ManagedDotnetGC.dll!InitializeDefaultGC). Somehow, this ends up calling our GC_Initialize function, which shouldn't be called at this point. Why does the initialization of the NativeAOT runtime trigger a call into our GC?

My initial theory was that the NativeAOT runtime picked up our DOTNET_GCName environment variable and tried to use our custom GC instead of its own. However, while NativeAOT has experimental support for standalone GCs, it is currently disabled by default (and must be enabled with a special flag). So that's not the reason for our crash.

After further investigation, the problem turns out to be related to how symbols are resolved at link time. In the NativeAOT runtime, the GC_Initialize function is declared as an external symbol:

// GC entrypoints for the linked-in GC. These symbols are invoked
// directly if we are not using a standalone GC.
extern "C" HRESULT LOCALGC_CALLCONV GC_Initialize(
    /* In  */ IGCToCLR* clrToGC,
    /* Out */ IGCHeap** gcHeap,
    /* Out */ IGCHandleManager** gcHandleManager,
    /* Out */ GcDacVars* gcDacVars
);

Because we also export our own GC_Initialize function, the linker gets confused and links the wrong one. This can be confirmed by writing a simple console application that exports a GC_Initialize function, and simply running it:

using System;
using System.Runtime.InteropServices;

internal class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("Hello, World!");
    }

    // Never called explicitly: exporting the symbol is enough to trigger the crash.
    [UnmanagedCallersOnly(EntryPoint = "GC_Initialize")]
    public static int GC_Initialize()
    {
        return 0;
    }
}

If we publish it with NativeAOT and then run it, it crashes with the same access violation, even though we neither set the DOTNET_GCName environment variable nor try to call the GC_Initialize function.

Unfortunately, I couldn't find a workaround at the level of the NativeAOT compiler, so I had to find another way.

In my first attempt, I used a simple workaround for the GC_Initialize issue. Since the runtime crashes when you export a function with that name, the fix is simply to rename it to something else. I named it Custom_GC_Initialize, then wrote a small C++ wrapper that exposes it under the expected name:

#include "pch.h"

#include <windows.h> // HRESULT, LoadLibraryA, GetProcAddress

typedef HRESULT(__stdcall* f_GC_Initialize)(void*, void*, void*, void*);
typedef HRESULT(__stdcall* f_GC_VersionInfo)(void*);

static f_GC_Initialize s_gcInitialize;
static f_GC_VersionInfo s_gcVersionInfo;

BOOL APIENTRY DllMain(HMODULE hModule,
	DWORD  ul_reason_for_call,
	LPVOID lpReserved
)
{
	if (ul_reason_for_call == DLL_PROCESS_ATTACH)
	{
		auto module = LoadLibraryA("ManagedDotnetGC.dll");

		if (module == 0)
		{
			return FALSE;
		}

		s_gcInitialize = (f_GC_Initialize)GetProcAddress(module, "Custom_GC_Initialize");
		s_gcVersionInfo = (f_GC_VersionInfo)GetProcAddress(module, "Custom_GC_VersionInfo");
	}

	return TRUE;
}

extern "C" __declspec(dllexport) HRESULT GC_Initialize(
	void* clrToGC,
	void** gcHeap,
	void** gcHandleManager,
	void* gcDacVars
)
{
	return s_gcInitialize(clrToGC, gcHeap, gcHandleManager, gcDacVars);
}

extern "C" __declspec(dllexport) void GC_VersionInfo(void* result)
{
	s_gcVersionInfo(result);
}

This code is a simple DLL that loads the original ManagedDotnetGC.dll, finds the Custom_GC_Initialize and Custom_GC_VersionInfo functions, then exposes them as GC_Initialize and GC_VersionInfo. This means that the .NET application has to use the loader as its standalone GC, and the loader forwards the calls to the actual custom GC.

It works, and it's honestly clean, but it bothered me because the goal was to write a custom GC entirely in C#. So I looked closely at the standalone GC loading code and found a loophole I could exploit.

To fix the problem, we have to prevent NativeAOT from calling our GC_Initialize function. The only way I found is to enable the aforementioned standalone GC support of NativeAOT by adding a property to the csproj.

This causes NativeAOT to read the DOTNET_GCName/DOTNET_GCPath environment variables and load the specified GC. Conveniently, the standard .NET GC is itself compatible with the standalone GC API, so we can set the environment variable to point to the original GC (DOTNET_GCName=clrgc.dll).

This way, NativeAOT calls GC_Initialize in that GC instead of ours, which avoids the crash. However, environment variables are set at the process level, so this also causes the application to try to load the original GC instead of our custom one. So we need a way to tell the NativeAOT runtime to load the original GC, and the .NET runtime to load our custom GC.

I got closer to that goal thanks to an interesting quirk I knew about: .NET reads environment variables prefixed with either DOTNET_ or COMPlus_, while NativeAOT only supports DOTNET_. So if we set COMPlus_GCName=ManagedDotnetGC.dll, only the .NET runtime will pick it up, and the NativeAOT runtime will ignore it. So I thought of this:

set COMPlus_GCName=ManagedDotnetGC.dll
set DOTNET_GCName=clrgc.dll

Unfortunately, DOTNET_ has a higher priority than COMPlus_, so setting both DOTNET_GCName and COMPlus_GCName causes the .NET runtime to ignore COMPlus_GCName. I needed one last trick.

The last trick is that GCPath has a higher priority than GCName. So if you set, for example:

set DOTNET_GCPath=gc1.dll
set DOTNET_GCName=gc2.dll

Then .NET loads gc1.dll and ignores gc2.dll. Putting it all together, I ended up with:

set DOTNET_GCName=clrgc.dll
set COMPlus_GCPath=ManagedDotnetGC.dll

The .NET runtime ignores DOTNET_GCName because GCPath (and therefore COMPlus_GCPath) has a higher priority. And the NativeAOT runtime ignores COMPlus_GCPath because it only supports the DOTNET_ prefix. We end up with the result we wanted: the .NET runtime uses our custom GC, and the NativeAOT runtime uses the standard GC.
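The precedence rules described above can be checked with a small model. This is only a sketch of the behavior as this article describes it, not the runtime's actual lookup code: GcConfigModel, ResolveGc and the supportsComPlus flag are hypothetical names introduced for the illustration, and the dictionary stands in for the process environment (it ignores details such as case-insensitive variable names on Windows).

```csharp
#nullable enable
using System;
using System.Collections.Generic;

// The final combination of variables from the article.
var env = new Dictionary<string, string>
{
    ["DOTNET_GCName"] = "clrgc.dll",
    ["COMPlus_GCPath"] = "ManagedDotnetGC.dll",
};

// .NET runtime: reads both prefixes.
Console.WriteLine(GcConfigModel.ResolveGc(env, supportsComPlus: true));  // ManagedDotnetGC.dll
// NativeAOT runtime: only reads DOTNET_.
Console.WriteLine(GcConfigModel.ResolveGc(env, supportsComPlus: false)); // clrgc.dll

public static class GcConfigModel
{
    // Looks up a single setting: DOTNET_ wins over COMPlus_,
    // and COMPlus_ is only consulted when the runtime supports it.
    private static string? Lookup(IReadOnlyDictionary<string, string> env, string setting, bool supportsComPlus)
    {
        if (env.TryGetValue("DOTNET_" + setting, out var value)) return value;
        if (supportsComPlus && env.TryGetValue("COMPlus_" + setting, out value)) return value;
        return null;
    }

    // GCPath takes priority over GCName, whatever prefix each one was found under.
    public static string? ResolveGc(IReadOnlyDictionary<string, string> env, bool supportsComPlus)
        => Lookup(env, "GCPath", supportsComPlus) ?? Lookup(env, "GCName", supportsComPlus);
}
```

Feeding the model the earlier attempt (DOTNET_GCName=clrgc.dll with COMPlus_GCName=ManagedDotnetGC.dll) returns clrgc.dll for both runtimes, which matches the failure described above.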

For this to work, remember that we need to ship clrgc.dll alongside our custom GC. The file can be found in the .NET installation folder, in shared\Microsoft.NETCore.App\9.x.x.

Shortly after I published the first version of this article, Michal Strehovský reached out on Bluesky and came up with a better way to fix this problem.

The idea is to export the functions under a different name, as in the C++ wrapper solution:

    [UnmanagedCallersOnly(EntryPoint = "_GC_Initialize")]
    public static unsafe uint GC_Initialize(IntPtr clrToGC, IntPtr* gcHeap, IntPtr* gcHandleManager, GcDacVars* gcDacVars)
    {
        Console.WriteLine("GC_Initialize");
        return 0x80004005; /* E_FAIL */
    }

    [UnmanagedCallersOnly(EntryPoint = "_GC_VersionInfo")]
    public static unsafe void GC_VersionInfo(VersionInfo* versionInfo)
    {
        Console.WriteLine($"GC_VersionInfo {versionInfo->MajorVersion}.{versionInfo->MinorVersion}.{versionInfo->BuildVersion}");

        versionInfo->MajorVersion = 5;
        versionInfo->MinorVersion = 3;
    }

Here I prefixed them with an underscore. After that, it's a matter of adding an MSBuild target (in the csproj file) that edits the exports definition file before linking, so that the functions are exported under their expected names:

  <UsingTask TaskName="RegexReplaceFile" TaskFactory="RoslynCodeTaskFactory" AssemblyFile="$(MSBuildToolsPath)\Microsoft.Build.Tasks.Core.dll">
    <ParameterGroup>
      <FilePath ParameterType="System.String" Required="true" />
    </ParameterGroup>
    <Task>
      <Using Namespace="System.IO" />
      <Using Namespace="System.Text.RegularExpressions" />
      <Code Type="Fragment" Language="cs">
        <![CDATA[
        var lines = File.ReadAllLines(FilePath);
        for (int i = 0; i < lines.Length; i++)
        {
            lines[i] = Regex.Replace(lines[i], @"_GC_(\w+)", "GC_$1=_GC_$1");
        }
        File.WriteAllLines(FilePath, lines);
        ]]>
      </Code>
    </Task>
  </UsingTask>

  <Target Name="TransformExportsFile" BeforeTargets="LinkNative">
    <Message Importance="high" Text="Transforming exports file: $(ExportsFile)" />
    <RegexReplaceFile FilePath="$(ExportsFile)" />
  </Target>

This works well, so I will use this solution going forward. Thanks Michal!
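To see what that replacement does, the same regular expression can be run on a few sample lines. This is a minimal sketch: the sample lines are a hypothetical excerpt, since the real exports file is generated by the NativeAOT build and its exact layout is an assumption here, and ExportsRewriter is a name made up for the example. In a module-definition file, an entry of the form name=internal_name exports the symbol internal_name under the name name.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical excerpt of an exports (.def) file before the transformation.
string[] lines =
{
    "EXPORTS",
    "    _GC_Initialize",
    "    _GC_VersionInfo",
};

foreach (var line in lines)
{
    Console.WriteLine(ExportsRewriter.Rewrite(line));
}
// Prints:
// EXPORTS
//     GC_Initialize=_GC_Initialize
//     GC_VersionInfo=_GC_VersionInfo

public static class ExportsRewriter
{
    // Same pattern and replacement as the MSBuild task: the symbol keeps its
    // underscore-prefixed internal name but is exported as GC_<name>.
    public static string Rewrite(string line)
        => Regex.Replace(line, @"_GC_(\w+)", "GC_$1=_GC_$1");
}
```

The managed code never defines a symbol literally named GC_Initialize, so the NativeAOT runtime's own reference to that symbol no longer resolves to our function, while the DLL still exposes the names the .NET runtime looks for.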

After applying either workaround, we can see that the GC is correctly loaded when running the test application.

The application never actually starts because we haven't implemented the initialization of our custom GC yet, but that's something we will tackle next time.

The code for this article is available on GitHub.

Liked this article? Don't hesitate to check out the 2nd edition of Pro .NET Memory Management for more insights into the internals of the .NET garbage collector!




2025-02-24 16:28:00
