So Long MAX_PATH… And Thanks For All The Fish!

You can view and download the code for this post from our Github Pull Request: github.com/EpicGames/UnrealEngine/pull/3732

Summary

Almost every developer we’ve spoken to has, at some point, run into the MAX_PATH limit. It is a particular problem on Windows, where MAX_PATH is defined as just 260 characters, and it usually crops up when you’re cooking data. We previously looked at this problem and fixed “some” of the errors in The Long And Winding Path Of The Cook. That only pushed the problem a little further along, though; it didn’t fix it completely.

Extending MAX_PATH

You’ll be pleased to know that, starting in Windows 10 (version 1607), Microsoft have removed MAX_PATH restrictions from a number of file and directory management functions in the Windows API (MSDN link).

In today’s blog, and accompanying Github Pull Request, we’re going to show you how it’s possible to fix the problem completely (assuming you’re on a newer version of Windows). We’ll warn you in advance, though: the changes are quite deep into the engine… so you need to make the call about whether or not it’s right for your project. We’re hoping that Epic will take the changes – or simply implement the changes in their own way after seeing that it’s possible – so that developers don’t need to worry about integration conflicts later on.

Enabling Long Paths

Nominally, the developer can opt their program in to allow long path support by adding the following to the manifest file:-

<application xmlns="urn:schemas-microsoft-com:asm.v3">
  <windowsSettings xmlns:ws2="http://schemas.microsoft.com/SMI/2016/WindowsSettings">
    <ws2:longPathAware>true</ws2:longPathAware>
  </windowsSettings>
</application>

The user must also set the following registry key to true (1):-

  • HKLM\SYSTEM\CurrentControlSet\Control\FileSystem LongPathsEnabled (Type: REG_DWORD)

Or via a group policy:

  • Computer Configuration > Administrative Templates > System > Filesystem > Enable NTFS long paths

(Note that Microsoft’s documentation implies that either manifesting an application OR setting the registry key / group policy will enable long path support. At the time of writing, you definitely need to do both. If you want to try it for yourself: PowerShell is manifested for long path support, but won’t create über-deep directories without the correct registry settings!)
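If you’d rather have your program check the registry side of this for itself (to warn the user, say), something along these lines should work. It’s a minimal sketch under a few assumptions: the function names are made up for this post, it only reads the registry value listed above, and on non-Windows builds it simply reports that the value wasn’t found.

```cpp
#include <cassert>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper: reads the LongPathsEnabled DWORD from
// HKLM\SYSTEM\CurrentControlSet\Control\FileSystem.
// Returns false if the value is missing, unreadable, or we're not on Windows.
bool TryReadLongPathsEnabled(unsigned long& OutValue)
{
    OutValue = 0;
#ifdef _WIN32
    DWORD Value = 0;
    DWORD ValueSize = sizeof(Value);
    const LSTATUS Status = ::RegGetValueW(
        HKEY_LOCAL_MACHINE,
        L"SYSTEM\\CurrentControlSet\\Control\\FileSystem",
        L"LongPathsEnabled",
        RRF_RT_REG_DWORD, // fail if the value isn't a REG_DWORD
        nullptr,
        &Value,
        &ValueSize);
    if (Status == ERROR_SUCCESS)
    {
        assert(ValueSize == sizeof(DWORD));
        OutValue = Value;
        return true;
    }
#endif
    return false;
}

// Hypothetical convenience wrapper: true only if the value exists and is non-zero.
bool IsLongPathRegistrySwitchOn()
{
    unsigned long Value = 0;
    return TryReadLongPathsEnabled(Value) && Value != 0;
}
```

Remember that this only tells you about the registry half of the story; the executable still needs the longPathAware entry in its manifest.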

Required Code Changes

The settings above aren’t magic bullets, however… you’ll also have to make sure your program isn’t limiting itself. Many Windows API functions require the programmer using them to supply a pointer to an array that is large enough to take the data which they will return and to supply the size of that array. For example, GetCurrentDirectory() is often used as follows:-

TCHAR PathIWantToFind[MAX_PATH];
GetCurrentDirectory(MAX_PATH, PathIWantToFind);

Two parameters are needed: the size of a suitably sized buffer (in TCHARs) and a pointer to it. If the function succeeds, the current directory of the calling process is written to the provided buffer.

This was all well and good while Microsoft guaranteed that no file path would exceed 260 characters. But, of course, we want to move beyond that. If we had no intention of supporting Windows versions lower than 10, we could replace that #defined MAX_PATH with the maximum length a path can have, ie. 32,767 characters on Windows. But then we’d have to create a 32,767-character array (64KB with 16-bit TCHARs) every time we want to call this function, and most of the time we’d use less than 1% of the memory we just allocated. So really we need to determine the required array size at runtime… Fortunately (with a few exceptions…) Microsoft does provide solutions for this. For example, calling GetCurrentDirectory() with a zero-length buffer returns the size of buffer required. So… we could simply do something like this:-

// Get the buffer length (in TCHARs) needed - includes the terminating '\0'
DWORD RequiredBufferSize = GetCurrentDirectory(0, nullptr);

// Create the buffer
TCHAR* PathIWantToFind = new TCHAR[RequiredBufferSize];

// Get the directory path
GetCurrentDirectory(RequiredBufferSize, PathIWantToFind);

...

// Cleanup
delete[] PathIWantToFind;
PathIWantToFind = nullptr;

Obviously, it’s possible to take advantage of the dynamic resizability, underlying arrays and guarantee of contiguous memory that Unreal’s FString or the STL’s std::vector provide if you want to avoid the danger of calling new and delete all over the place. But the above code provides a demonstration of the usage style.
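As a rough illustration of that idea, the sketch below wraps the “call for size, then call for data” pattern in a small helper that hands back a std::wstring. The names QuerySizedString and GetCurrentDirectoryString are our own inventions for this example, not part of the Windows API or Unreal. The helper assumes the wrapped function follows GetCurrentDirectory()’s convention, and it retries if the value grows between the two calls.

```cpp
#include <cassert>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not a Windows or Unreal API). 'Query' must behave like
// GetCurrentDirectoryW(Size, Buffer):
//   - buffer too small -> returns the required size INCLUDING the '\0'
//   - success          -> returns the characters written EXCLUDING the '\0'
//   - failure          -> returns 0
template <typename QueryFn>
std::wstring QuerySizedString(QueryFn Query)
{
    // First call: ask how big the buffer needs to be.
    unsigned long Required = Query(0, nullptr);

    while (Required != 0)
    {
        std::wstring Buffer(Required, L'\0');
        const unsigned long Written = Query(Required, &Buffer[0]);

        if (Written == 0)
        {
            break; // the API reported an error
        }
        if (Written < Required)
        {
            assert(Buffer[Written] == L'\0'); // the API promised a terminator
            Buffer.resize(Written);           // drop the terminator and any slack
            return Buffer;
        }
        // The value grew between the two calls; retry with a bigger buffer.
        Required = Written + 1;
    }
    return std::wstring();
}

#ifdef _WIN32
// Example use with the real API.
std::wstring GetCurrentDirectoryString()
{
    return QuerySizedString([](unsigned long Size, wchar_t* Buffer) {
        return ::GetCurrentDirectoryW(Size, Buffer);
    });
}
#endif
```

The string owns its memory, so there’s no delete[] to forget, and the same helper can be pointed at any other API function that follows the same size convention.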

Another, similar style is used elsewhere in the Windows API: you pass the function a pointer to a variable holding the buffer length and, when the function returns, that variable contains the size of buffer needed for the requested data. Sometimes there’s a pair of functions instead, one to query the data length / size and one to retrieve the data. These can start to look a little weird, and exactly how each method is implemented, and whether it accounts for the terminating ‘\0’ character, varies… For example:-

DWORD MaxSizeOfRegistryNameBuffer = 0;

// Get the length (in characters, not bytes) of the longest value name under Key
RegQueryInfoKey(Key, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, nullptr, &MaxSizeOfRegistryNameBuffer, nullptr, nullptr, nullptr);

// Account for the terminating '\0', which RegQueryInfoKey() doesn't count
MaxSizeOfRegistryNameBuffer++;

That’s a bit of an extreme example – but if you’re only after that one piece of data (the Name in this case – which could be a path) from the registry then this is how you need to retrieve the size of the data. Without, that is, resorting to defining an array based on MAX_PATH or another magic number (eg. 4096, 8192, 16384, 260, 1024, 32767 and 512 were all spotted in the UE4 codebase for calling these API functions!).

Yes, we do need to make more function calls and do some expensive heap allocation (even if we hide it under the hood using a dynamic array like std::vector), but using the Windows API functions in this way should ensure your program can handle any length of filename (or other parameter) that Windows might pass to it, even if filename length limits ever increase. That being said, that action adventure game set in Llanfairpwllgwyngyllgogerychwyrndrobwllllantysiliogogogoch (or Taumatawhakatangihangakoauauotamateaturipukakapikimaungahoronukupokaiwhenuakitanatahu for those of you in the southern hemisphere) you’re planning is still going to have a hard time hitting 32,767 character path lengths!

Generally, newer Windows API functions are moving towards returning pointers to strings that the API allocates itself. The caller declares a pointer, passes its address to the API and must manually free the memory when finished with it (eg. SHGetKnownFolderPath(), whose result is released with CoTaskMemFree()). However, there are older functions lurking in the API which require a correctly sized array and don’t provide any way of finding the buffer size beforehand. Fortunately they’re mostly deprecated and aren’t used in Unreal Engine. That said, one of the least user-friendly API functions according to StackOverflow (link), GetModuleFileName(Ex), is still kicking around in BootstrapPackagedGame.cpp.
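For functions like GetModuleFileName() that can’t tell you the required size up front, a common workaround is to keep growing the buffer until the result fits. The sketch below shows one way that might look. FillByGrowing and GetExecutablePath are hypothetical names for this post, and the helper assumes the wrapped function behaves like GetModuleFileNameW() on current Windows versions: it returns the number of characters copied (excluding the terminator), returns the full buffer size if the result was truncated, and returns 0 on failure.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical helper (not a Windows or Unreal API). 'Fill(Buffer, Size)' must
// return the characters copied excluding the '\0', 'Size' itself if the result
// had to be truncated, or 0 on failure.
template <typename FillFn>
std::wstring FillByGrowing(FillFn Fill, std::size_t InitialCapacity = 260)
{
    assert(InitialCapacity > 0);

    // Windows paths top out at 32,767 characters, so give up a little past that.
    const std::size_t MaxCapacity = 65536;

    for (std::size_t Capacity = InitialCapacity; Capacity <= MaxCapacity; Capacity *= 2)
    {
        std::wstring Buffer(Capacity, L'\0');
        const unsigned long Written =
            Fill(&Buffer[0], static_cast<unsigned long>(Capacity));

        if (Written == 0)
        {
            break; // the API reported an error
        }
        if (Written < Capacity)
        {
            Buffer.resize(Written); // fits: trim to the real length
            return Buffer;
        }
        // Written == Capacity means truncation; loop round with double the space.
    }
    return std::wstring();
}

#ifdef _WIN32
// Example use with the real API: the path of the current executable.
std::wstring GetExecutablePath()
{
    return FillByGrowing([](wchar_t* Buffer, unsigned long Size) {
        return ::GetModuleFileNameW(nullptr, Buffer, Size);
    });
}
#endif
```

Starting at 260 means the common case still costs a single call, and only genuinely long paths pay for the extra attempts.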

Fixing GetEnvironmentVariable()

There is a slight cross-platform problem that we need to look at. While the MAX_PATH problem is mostly specific to Windows, UE4 is written to be very Windows-centric, so the problem hampers other platforms as well. In particular, wrapper implementations of platform API functions tend to mimic the Windows API rather closely. One that’s particularly problematic is F<platform>PlatformMisc::GetEnvironmentVariable() (ie. FWindowsPlatformMisc etc).

//- FWindowsPlatformMisc.cpp
void FWindowsPlatformMisc::GetEnvironmentVariable(const TCHAR* VariableName, TCHAR* Result, int32 ResultLength)
{
  uint32 Error = ::GetEnvironmentVariableW(VariableName, Result, ResultLength);
  if (Error <= 0)
  {		
    *Result = 0;
  }
}

//- FLinuxPlatformMisc.cpp
void FLinuxPlatformMisc::GetEnvironmentVariable(const TCHAR* InVariableName, TCHAR* Result, int32 ResultLength)
{
  FString VariableName = InVariableName;
  VariableName.ReplaceInline(TEXT("-"), TEXT("_"));
  ANSICHAR *AnsiResult = secure_getenv(TCHAR_TO_ANSI(*VariableName));
  if (AnsiResult)
  {
    wcsncpy(Result, UTF8_TO_TCHAR(AnsiResult), ResultLength);
  }
  else
  {
    *Result = 0;
  }
}

In both of the cases above the user has to create an array of defined size outside the function, passing a pointer to that array and its size to the GetEnvironmentVariable() function – which copies the data into it.

Our suggestion is that the GetEnvironmentVariable() functions should be re-written to return an FString as follows:-

FString FWindowsPlatformMisc::GetEnvironmentVariable(const TCHAR* VariableName)
{
    // Returns the length of the environment variable's VALUE (in characters) INCLUDING the terminating character, or 0 if it isn't set
    uint32 ResultLength = ::GetEnvironmentVariableW(VariableName, nullptr, 0);

    FString ResultString;
    ResultString.GetCharArray().SetNumUninitialized(ResultLength);
    TCHAR* Result = ResultString.GetCharArray().GetData();

    if (::GetEnvironmentVariableW(VariableName, Result, ResultLength) > 0)
    {
        ResultString.TrimToNullTerminator();
        return ResultString;
    }
    else
    {
        return FString();
    }
}

The Unix-like platforms supported by UE4 get environment variables via getenv(), which returns a pointer to an existing string, so this style would be relatively simple to implement there too. This would do away with much of the need for fixed-size arrays (and hence MAX_PATH and magic numbers). But wholesale cross-platform engine changes are a little outside our bailiwick…
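To give a flavour of what the Unix-like side might look like, here’s a standalone sketch. It is not engine code: std::string stands in for FString, plain std::getenv() stands in for the secure_getenv() call used by the current Linux wrapper, and the function name is made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <string>

// Hypothetical stand-in for an FString-returning GetEnvironmentVariable()
// on a Unix-like platform. Returns an empty string if the variable isn't set.
std::string GetEnvironmentVariableString(const char* InVariableName)
{
    assert(InVariableName != nullptr);

    // Mirror the existing Linux wrapper, which swaps '-' for '_' in the name.
    std::string VariableName = InVariableName;
    std::replace(VariableName.begin(), VariableName.end(), '-', '_');

    // getenv() returns a pointer to existing storage (or nullptr), so the
    // caller doesn't need to supply a buffer or guess at a length.
    const char* Value = std::getenv(VariableName.c_str());
    return Value ? std::string(Value) : std::string();
}
```

The caller never sees a buffer or a length at all, which is exactly what lets the MAX_PATH-sized arrays disappear from the calling code.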

Conclusion

We’ve run through the UE4 codebase and removed references to MAX_PATH. The affected calls either use the same functions in the ‘call for size, then call for data’ style described above, or have been replaced with the newer Windows API functions. With our changes, on Windows at least, UE4 should now be able to handle any path the OS throws at it.

*nb. Due to the nature of our changes, some platforms may need additional modifications. We’ve only made changes for the platforms that we have access to through Github.

 

Please, as always, let us know if you have any ideas or questions in the comments below!

 


Credits:
Programming/Research: Josef Gluyas (Coconut Lizard)
Mentoring: Joe Tidmarsh, Gareth Martin, Robert Troughton, Lee Clark (Coconut Lizard)

