Plotted Pixels
Adventures in Demangling
I have recently put time into improving symbol demangling in DFHack, which is frequently used to determine the name of the active UI screen. This initially seemed like a simple task, but ended up revealing a lot about the internals of MSVC mangling, and even some bugs in Wine’s demangling implementation.
The demangling implementation originally trimmed the mangled names down, removing certain leading numbers and postfix symbols. This worked for basic non-namespaced types, but quickly produced non-portable names as soon as a namespace was involved.
Demangling with gcc and clang
Implementing this on Unix platforms which use gcc ended up being relatively simple. Both gcc and clang provide a library to do this called cxxabi. The abi::__cxa_demangle function is able to demangle function symbols, as well as RTTI mangled names, without any special flags. Further documentation on this function can be found in the gcc documentation.
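#include <cxxabi.h>
#include <stdexcept>

// Returns a malloc()-allocated string; the caller must free() it.
char* demangle_name(const char* mangled) {
    int status = 0;
    char* demangled = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || demangled == nullptr) {
        throw std::runtime_error("Failed to demangle symbol.");
    }
    return demangled;
}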
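The buffer returned by abi::__cxa_demangle is allocated with malloc, so whoever calls it is responsible for freeing it. As a minimal sketch of one way to handle that, the wrapper below copies the result into a std::string and releases the buffer automatically. The name demangle_to_string and the choice to fall back to the mangled input on failure are my own assumptions for illustration, not something taken from DFHack.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <memory>
#include <string>

// Hypothetical helper (not from DFHack): demangles an Itanium ABI name and
// returns the input unchanged when demangling fails.
std::string demangle_to_string(const char* mangled) {
    int status = 0;
    // __cxa_demangle returns a malloc()-allocated buffer, or nullptr on
    // failure, so the buffer is handed to a unique_ptr that calls free().
    std::unique_ptr<char, decltype(&std::free)> buffer(
        abi::__cxa_demangle(mangled, nullptr, nullptr, &status), &std::free);
    if (status != 0 || !buffer) {
        return mangled;  // Assumption: callers prefer the raw name to an error.
    }
    return buffer.get();
}
```

This accepts both function symbols and the names returned by typeid(...).name(), since neither needs special flags with gcc or clang.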
Demangling with MSVC
On Windows, demangling can be performed using UnDecorateSymbolName from dbghelp.dll. This function requires some flags to get specific results, and in most cases the documentation will point you in the correct direction.
#include <Windows.h>
#include <DbgHelp.h>
const char* mangled = "";
char demangled[MAX_SYM_NAME];
DWORD flags = UNDNAME_COMPLETE;
// Returns the number of characters written, or 0 on failure.
DWORD res = UnDecorateSymbolName(mangled, demangled, MAX_SYM_NAME, flags);
Mangled C++ names from MSVC begin with a ?, but this is not the case for names found in the RTTI typeinfo structures. The mangled type names stored in the RTTI table begin with .? instead. The leading . should always be skipped before demangling, and a special flag is required to demangle the rest.
Misleading Documentation
The documentation for UnDecorateSymbolName, as seen in the Microsoft documentation, provides a good starting point for usage, but falls apart in the flag documentation. The particular flag that must be set in order to demangle the RTTI table’s mangled names is UNDNAME_NO_ARGUMENTS, which is not immediately apparent from the flag’s description.
The UNDNAME_NO_ARGUMENTS flag actually behaves as UNDNAME_TYPE_ONLY, which is the name of the flag of the same value in the Debugging Interface Access SDK. This information was hard to find; I only discovered it by reading Wine’s implementation of std::type_info::name.
Using this knowledge, an implementation of demangling that supports both types and methods would be as follows:
#include <Windows.h>
#include <DbgHelp.h>
const char* mangled = ".?AV_RefCounter@details@Concurrency@@";
char demangled[MAX_SYM_NAME];
// UNDNAME_NO_ARGUMENTS is better referred to as "UNDNAME_TYPE_ONLY"
DWORD flags = UNDNAME_COMPLETE;
if (mangled[0] == '.') {
    // The mangled name is a type, and requires a special flag.
    flags |= UNDNAME_NO_ARGUMENTS;
    mangled++; // Skip the leading . as it is only a marker.
}
// Returns the number of characters written, or 0 on failure.
DWORD res = UnDecorateSymbolName(
    mangled,
    demangled,
    MAX_SYM_NAME,
    flags
);
Implementation issues in Wine
While the function works without issue on native Windows, Wine introduces a need for a number of new workarounds.
Failures result in success
Wine’s internal implementation will always return a “demangled” string, regardless of whether an error was encountered or not. The problematic line can be seen in undname.c.
Always returning a “demangled” string lets programs keep functioning with minimal issues when the name is only used in debugging information or for matching typenames. In situations where the correctness of the human-readable name is important, however, these silent failures can be a headache.
Thankfully, the demangling failures are easily detectable by inspecting the resultant demangled name. The mangled name and the demangled name will always match exactly upon failure, which can be checked more efficiently by comparing the prefix character, which is ? for C++. This symbol does not typically appear at the start of a symbol or type name, and thus shouldn’t produce false positives.
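The prefix check described above can be sketched in a few lines. This is only an illustration: the helper name looks_undemangled is hypothetical, and it assumes the output buffer has already been copied into a std::string.

```cpp
#include <string>

// Hypothetical helper: reports whether the output of a demangling call still
// looks like an MSVC mangled name, which is how a Wine failure shows up.
bool looks_undemangled(const std::string& demangled) {
    // On failure Wine hands back the mangled input unchanged, so the result
    // still starts with the '?' prefix used by MSVC C++ mangling.
    return !demangled.empty() && demangled[0] == '?';
}
```

Note that for RTTI type names the leading . has already been skipped before demangling, so a failed result begins with ? in that case as well.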
Faulty demangling implementation
Wine’s reimplementation of demangling has some faults that show up on more complex types. One example of this is the following mangled/demangled pair:
// Mangled:
.?AV?$_Func_impl_no_alloc@V<lambda_1>@?1??set_custom_render@widget@widgets@@QEAAXV?$function@$$A6AXPEAVwidget@widgets@@@Z@std@@@Z@XPEAV34@I@std@@
// Wine demangled:
std::I::XPEAV34::Z::_Func_impl_no_alloc<widgets::widget::et_custom_render::`2'::`2'::<lambda_1>, & * __ptr64 const,std::function<void __cdecl(widgets::widget * __ptr64)> >
// Windows demangled:
// TODO
There sadly does not appear to be an easy way of automatically detecting when these errors occur. This did not cause any major issues when used in DFHack, as all the types that encountered this problem were from the std or Concurrency namespace, and could be filtered out by checking for a trailing std@@ or Concurrency@@.
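As a rough sketch of that filter, the suffix check might look like the following. The function names are hypothetical and this is not the code from the DFHack PR; it only illustrates the trailing-namespace test on the mangled name.

```cpp
#include <string>

// Hypothetical helper: true when `text` ends with `suffix`.
bool has_suffix(const std::string& text, const std::string& suffix) {
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Hypothetical filter: flags mangled type names whose outermost namespace is
// std or Concurrency, the two namespaces where Wine's output was unreliable.
bool is_in_filtered_namespace(const std::string& mangled) {
    // The outermost namespace is the last component before the closing "@@".
    // This matches the plain suffixes named in the text, so a namespace whose
    // name merely ends in "std" would be caught too.
    return has_suffix(mangled, "std@@") || has_suffix(mangled, "Concurrency@@");
}
```

Because the check runs on the mangled name rather than the demangled one, it does not depend on Wine having produced a correct result.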
Result
In the end, the failures encountered under Wine could be worked around safely by any code that relies on demangling, even though the problematic behavior could not be eliminated entirely.
The resultant PR implementing these changes can be found on GitHub.