Plotted Pixels


Adventures in Demangling

I recently spent time improving symbol demangling in DFHack, which is frequently used to determine the name of the active UI screen. This initially seemed like a simple task, but it ended up revealing a lot about the internals of MSVC mangling, and even some bugs in Wine’s demangling implementation.

The demangling implementation originally trimmed the mangled names down, removing certain leading numbers and postfix symbols. This worked for basic non-namespaced types, but could produce non-portable names as soon as a namespace was involved.

Demangling with gcc and clang

Implementing this on Unix platforms that use gcc ended up being relatively simple. Both gcc and clang provide a header for this, cxxabi.h. The abi::__cxa_demangle function is able to demangle function symbols as well as RTTI mangled names, without any special flags. Further documentation on this function can be found in the gcc documentation.

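#include <cxxabi.h>
#include <stdexcept>

// Returns a malloc()-allocated string; the caller must release it with free().
char* demangle_name(const char* mangled) {
  int status = 0;
  char* demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
  if (status != 0 || demangled == nullptr) {
    throw std::runtime_error("Failed to demangle symbol.");
  }
  return demangled;
}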
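Since abi::__cxa_demangle returns a buffer allocated with malloc, the result is easy to leak. As a minimal sketch (not the DFHack implementation), the call can be wrapped so that the buffer is released automatically and the original name is returned on failure instead of throwing. The helper name demangle_or_original is hypothetical and only used for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <cxxabi.h>
#include <memory>
#include <string>

// Hypothetical helper: returns the demangled name, or the input unchanged
// when __cxa_demangle reports a failure.
std::string demangle_or_original(const char* mangled) {
  assert(mangled != nullptr);
  int status = 0;
  // __cxa_demangle allocates its result with malloc, so free() releases it.
  std::unique_ptr<char, decltype(&std::free)> demangled(
      abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
  if (status != 0 || !demangled) {
    return mangled;
  }
  // Copy into a std::string before the buffer is freed.
  return demangled.get();
}
```

This accepts both function symbols and the strings returned by typeid(T).name(). Falling back to the mangled name is a design choice for this sketch; throwing, as in the example above, is just as valid when a failure should not go unnoticed.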

Demangling with MSVC

On Windows, demangling can be performed using UnDecorateSymbolName from dbghelp.dll. This function requires some flags to get specific results, and in most cases the documentation will point you in the correct direction.

#include <Windows.h>
#include <DbgHelp.h>

const char* mangled = ""; // The mangled name to demangle.

char demangled[MAX_SYM_NAME];
DWORD flags = UNDNAME_COMPLETE;

// Returns the number of characters written, or 0 on failure.
DWORD res = UnDecorateSymbolName(mangled, demangled, MAX_SYM_NAME, flags);

Mangled C++ names from MSVC usually begin with a ?, but this is not the case for names found in the RTTI typeinfo structures. Mangled type names stored in the RTTI table start with .? instead: the leading . should always be skipped before demangling, and a special flag is required to demangle the rest.

Misleading Documentation

The documentation for UnDecorateSymbolName in the Microsoft documentation provides a good starting point for usage, but falls apart in the flag documentation. The particular flag that must be set in order to demangle the RTTI table’s mangled names is UNDNAME_NO_ARGUMENTS, which is not immediately apparent from the flag’s description.

The UNDNAME_NO_ARGUMENTS flag actually behaves as UNDNAME_TYPE_ONLY, which is the name given to the flag with the same value in the Debugging Interface Access SDK. I was hard pressed to find this information, and only discovered it by reading implementations of std::type_info::name from Wine.

Using this knowledge, an implementation of demangling that supports types and methods would be as follows:

#include <Windows.h>
#include <DbgHelp.h>

const char* mangled = ".?AV_RefCounter@details@Concurrency@@";

char demangled[MAX_SYM_NAME];
DWORD flags = UNDNAME_COMPLETE;

if (mangled[0] == '.') {
  // The mangled name is a type, and requires a special flag.
  // UNDNAME_NO_ARGUMENTS is better referred to as "UNDNAME_TYPE_ONLY".
  flags |= UNDNAME_NO_ARGUMENTS;
  mangled++; // Skip the leading . as it is only a marker.
}

// Returns the number of characters written, or 0 on failure.
DWORD res = UnDecorateSymbolName(
  mangled,
  demangled,
  MAX_SYM_NAME,
  flags
);

Implementation issues in Wine

While the function works without issue on native Windows, running under Wine requires a number of additional workarounds.

Failures result in success

Wine’s internal implementation will always return a “demangled” string, regardless of whether an error was encountered or not. The problematic line can be seen in undname.c.

Always returning a “demangled” string lets programs keep working with minimal issues when the name is only used for debugging output or for matching typenames. When the correctness of the human-readable name matters, however, these silent failures can be a headache.

Thankfully, the demangling failures are easy to detect by inspecting the resulting name. On failure, the demangled name always matches the mangled name exactly. This can be checked more cheaply by looking at the first character of the result, which is ? for a C++ name that is still mangled. This character does not typically appear at the start of a symbol or type name, and thus shouldn’t produce false positives.
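The check described above can be sketched as a small helper. This is only an illustration under the assumptions stated in this section, not the code used in DFHack: the function name demangle_failed is hypothetical, and mangled is expected to be the name as it was passed to UnDecorateSymbolName, with any leading . already skipped.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns true when the "demangled" output still looks
// like a mangled name, which is how a Wine failure shows up.
bool demangle_failed(const char* mangled, const char* demangled) {
  assert(mangled != nullptr && demangled != nullptr);
  // Cheap check: a successfully demangled name should not start with '?'.
  if (demangled[0] == '?') {
    return true;
  }
  // Thorough check: on failure the output is identical to the input.
  return std::strcmp(mangled, demangled) == 0;
}
```

The full comparison is kept as a fallback for inputs that do not start with ?, where the prefix check alone cannot tell a failure apart from a success.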

Faulty demangling implementation

Wine’s reimplementation of demangling has some faults that show up on more complex types. One example of this is the following mangled/demangled pair:

// Mangled:
.?AV?$_Func_impl_no_alloc@V<lambda_1>@?1??set_custom_render@widget@widgets@@QEAAXV?$function@$$A6AXPEAVwidget@widgets@@@Z@std@@@Z@XPEAV34@I@std@@
// Wine demangled:
std::I::XPEAV34::Z::_Func_impl_no_alloc<widgets::widget::et_custom_render::`2'::`2'::<lambda_1>, & * __ptr64 const,std::function<void __cdecl(widgets::widget * __ptr64)> >
// Windows demangled:
// TODO

Sadly, there does not appear to be an easy way to detect these errors automatically. This did not cause any major issues in DFHack, as all the types that hit this problem were from the std or Concurrency namespaces, and could be filtered out by checking for a trailing std@@ or Concurrency@@ in the mangled name.
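A suffix filter of this kind might look like the following. It is a minimal sketch of the idea rather than the DFHack code, and both helper names (ends_with and is_unreliable_under_wine) are made up for this example.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: true when `name` ends with `suffix`.
static bool ends_with(const char* name, const char* suffix) {
  const std::size_t name_len = std::strlen(name);
  const std::size_t suffix_len = std::strlen(suffix);
  return name_len >= suffix_len &&
         std::strcmp(name + (name_len - suffix_len), suffix) == 0;
}

// Hypothetical helper: true when the outermost namespace of the mangled name
// is std or Concurrency, the two namespaces where the bad output was seen.
// Note: a plain suffix match would also accept a name such as "mystd@@".
bool is_unreliable_under_wine(const char* mangled) {
  assert(mangled != nullptr);
  return ends_with(mangled, "std@@") || ends_with(mangled, "Concurrency@@");
}
```

The suffix works because MSVC writes the enclosing scopes innermost first, so the outermost namespace ends up at the end of the mangled name, right before the closing @@.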

Result

In the end, the failures encountered under Wine could be worked around safely by any code that relies on the demangled names, even though the problematic behavior could not be eliminated entirely.

The resulting PR implementing these changes can be found on GitHub.