For Developers Working With Unreal Engine

The Sadness of the Ignored Null Pointer

by

You can view and download the code for this post from our Github Pull Request: https://github.com/EpicGames/UnrealEngine/pull/3728

Take a look at the following function (as always, I looked at this one due to it showing up in performance tests on our project)… see if you can see anything suspicious?

FORCEINLINE bool IsPendingKill() const
{
  return GUObjectArray.IndexToObject(InternalIndex)->IsPendingKill();
}

Now take a look at IndexToObject(), used in the above:-

FORCEINLINE FUObjectItem* IndexToObject(int32 Index)
{
  check(Index >= 0);
  if (Index < ObjObjects.Num())
  {
    return const_cast<FUObjectItem*>(&ObjObjects[Index]);
  }
  return nullptr;
}

Seen anything yet..? There’s something here which, assuming the code above isn’t wasting precious CPU cycles, isn’t quite right.

Yep, if Index is too large, we’re going to be returning nullptr. And IsPendingKill() isn’t picking that up, it’s jumping straight into FUObjectItem::IsPendingKill()… which would give an access violation. Here’s the assembly just to show that:-

0x1413738e6	movsxd rax, dword ptr [rbx+0xc]
0x1413738ea	cmp eax, dword ptr [rip+0x19c2334]
0x1413738f0	jnl 0x1413738ff
0x1413738f2	shl rax, 0x4
0x1413738f6	add rax, qword ptr [rip+0x19c231b]
0x1413738fd	jmp 0x141373901
0x1413738ff	xor eax, eax
0x141373901	mov eax, dword ptr [rax+0x8]
0x141373904	shr eax, 0x1d
0x141373907	test al, 0x1

A quick explanation of this:-

  • Lines 1-3 are dealing with  “if (Index < ObjObjects.Num())”, jumping through to line 7 if Index is too large (jnl is “jump if not less”)
  • Lines 4-5 work out the FUobjectItem pointer
  • Lines 6 jumps over line 7 to line 8
  • Line 7 sets eax to zero (xor’ing a register with itself is a short/fast way of doing that)
  • Lines 8-9 grab the IsPendingKill value
  • Line 10 tests it

Considering all of this, one of these 2 things must be true:-

  1. IsPendingKill() has a bug .. it should be checking for nullptr … we’d need to investigate whether the function should then return true or false in that case…;
  2. Index is never too big – so we could optimize the code perhaps?

For now, since the code has been this way for 15 months, and as far as I know no problems have surfaced as a result of it potentially being wrong, I’m going to assume that case (2) is the correct one… at the very least, if we assume that, we’re no worse off than we were before – if we ignore Index being too large, we’d likely just have an access violation or just a read from somewhere beyond the ObjObjects array.

While looking more into this, I found the following function close to IndexToObject():-

FORCEINLINE FUObjectItem* IndexToObjectUnsafeForGC(int32 Index)
{
  return const_cast<FUObjectItem*>(&ObjObjects[Index]);
}

This does exactly what we want, removing the test on Index… so we can go ahead and replace IsPendingKill() with a call to this instead:-

FORCEINLINE bool IsPendingKill() const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->IsPendingKill();
}

And, while we’re at it, the same changes can be made in several other functions… these are all in Engine\Source\Runtime\CoreUObject\Public\UObject\UObjectBaseUtility.h:-

FORCEINLINE void MarkPendingKill()
{
  check(!IsRooted());
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->SetPendingKill();
}
FORCEINLINE void ClearPendingKill()
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->ClearPendingKill();
}
FORCEINLINE void AddToRoot()
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->SetRootSet();
}
FORCEINLINE void RemoveFromRoot()
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->ClearRootSet();
}
FORCEINLINE bool IsRooted()
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->IsRootSet();
}
FORCEINLINE bool ThisThreadAtomicallyClearedRFUnreachable()
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->ThisThreadAtomicallyClearedRFUnreachable();
}
FORCEINLINE bool IsUnreachable() const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->IsUnreachable();
}
FORCEINLINE bool IsPendingKillOrUnreachable() const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->HasAnyFlags(EInternalObjectFlags::PendingKill | EInternalObjectFlags::Unreachable);
}
FORCEINLINE bool IsNative() const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->HasAnyFlags(EInternalObjectFlags::Native);
}
FORCEINLINE void SetInternalFlags(EInternalObjectFlags FlagsToSet) const
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->SetFlags(FlagsToSet);
}
FORCEINLINE EInternalObjectFlags GetInternalFlags() const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->GetFlags();
}
FORCEINLINE bool HasAnyInternalFlags(EInternalObjectFlags FlagsToCheck) const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->HasAnyFlags(FlagsToCheck);
}
FORCEINLINE void ClearInternalFlags(EInternalObjectFlags FlagsToClear) const
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->ClearFlags(FlagsToClear);
}
FORCEINLINE bool AtomicallyClearInternalFlags(EInternalObjectFlags FlagsToClear) const
{
  return GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->ThisThreadAtomicallyClearedFlag(FlagsToClear);
}

Ok, sorry, when I started copy/pasting those, I honestly didn’t realize there were so many of them! But there you go…

As well as those, there are 2 more in Engine\Source\Runtime\CoreUObject\Private\UObject\UObjectBase.cpp… the first at the end of UObjectBase::DeferredRegister():-

check(!GUObjectArray.IsDisregardForGC(this) || GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->IsRootSet());

And the next close to the end of UObjectBase::AddObject():-

if (InternalFlagsToSet != EInternalObjectFlags::None)
{
  GUObjectArray.IndexToObjectUnsafeForGC(InternalIndex)->SetFlags(InternalFlagsToSet);

}	

And that’s it..! Here’s how the new code looks when disassembled – feel free to compare this to the 10-line version above:-

00007FF6A79A28F6  movsxd      rax,dword ptr [rbp+0Ch]  
00007FF6A79A28FA  mov         rdx,qword ptr [GUObjectArray+10h (07FF6A9684288h)]  
00007FF6A79A2901  add         rax,rax  
00007FF6A79A2904  mov         ecx,dword ptr [rdx+rax*8+8]  
00007FF6A79A2908  shr         ecx,1Dh  
00007FF6A79A290B  test        cl,1  

We’re down to 6 simple lines of assembly – with no branching logic at all. That’s a definite win.

As I say, I’m not 100% sure whether this is really the “right” thing to do … it would be great to get some input from someone at Epic (perhaps in the comments below)… but, as it stands, if there’s a bug here then there was a bug in the previous code too … that seems unlikely, given the amount of time that’s now passed since the “bug” would’ve been introduced.

Anyway, with the new code, we see a significant reduction in time spent in IsPendingKill() (since it’s now much simpler)… and with the other functions, too, of course – though they weren’t actually causing us any great performance concerns at the time that we tested (pre-emptive optimizations are cool too, though, right..?).


Credit(s): Robert Troughton (Coconut Lizard)
Status: Currently unimplemented in 4.12


1 Comment

Leave a Reply

Your email address will not be published.

*

Latest from ALL

Placating The Natives

In this article we delve into Blueprint Nativization, a relatively new feature
Go to Top
%d bloggers like this: