Sniper Warrior 3 game crash: CryEngine error memory allocation

Closed

doitsujin opened this issue

Jun 19, 2019

· 244 comments

Comments

For some reason it looks like DXVK's device memory allocation strategy does not work reliably on Nvidia GPUs. This leads to game crashes with the characteristic DxvkMemoryAllocator: Memory allocation failed error in the log files.

This issue has been reported in the following games:

#1099 (Bloodstained: Ritual of the Moon)

#1087 (World of Warcraft)

If you run into this problem, please do not open a new issue. Instead, post a comment here, including the full DXVK logs, your hardware and driver information, and information about the game you're having problems with.

Update: Please check for further information on how to get useful debugging info. Update 2: Please also see . Update 3: Please update to driver version 440.59.

The same error with World of Tanks. Hardware: 6700K, 16GB RAM, GTX 780 3GB. Both with and without __GL_SHADER_DISK_CACHE_PATH=~/.nv set, Nvidia doesn't allocate a cache. Maybe that is the issue? GPU drivers: 418.52.10 or 430.26

@Sandok4n

> post a comment here, including the full DXVK logs

The shader cache should have nothing to do with memory allocation issues.

Ok. I'll try to reproduce this error, but when it appeared I downgraded the kernel, dxvk and wine, and the problem was the same. The only thing that was not changed was the NV driver (only the new version was available for installation in the AUR). The problem has been there for about two weeks.

> The shader cache should have nothing to do with memory allocation issues.

Maybe not, but as i have pointed out in another thread ("dirty shader cache"), it seems to me i have fewer crashes with a fresh .nv cache (delete the GLCache folder AND the WoW/Retail/Cache folder). If i keep clearing it regularly there are fewer crashes, but more stuttering at the start. Crashing while zoning COULD perhaps mean something weird happens when DXVK shader compilation is done?

I assume that the shader compilation business with WoW goes something along the lines of: WoW (Cache .WDB) -> DXVK -> .nv (driver cache)? Could the WoW cache folder contain some weird shaders that make DXVK use too much memory to compile/read somehow?

Joshua-Ashton/d9vk#170 - possibly connected.

From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory. https://gist.github.com/pchome/fb43b3752b878501757bdad571473a4e - mem data during such crash (from D9VK issue 170).

103 - I was happy with this fix, some "heavy" games were able to use my whole VRAM, then RAM, swap, ... and still be alive 😄 . Or REISUB sysrq sometimes.

Because of current issue I definitely want more "magically created RAM".

Test cache behaviour:

  • drop whole caches (not recommended): `sync && echo 3 | sudo tee /proc/sys/vm/drop_caches`. Result: more free ram, longer game sessions.
  • fill caches: search/copy/... large amount of files. Result: less free ram, shorter game sessions.

p.s. 418.52.10

If you can grab /proc/slabinfo or slabtop output that would be helpful. As is the output from grep . /proc/sys/vm/* preferably before and after though that understandably might be hard.

You could for example bump /proc/sys/vm/swappiness as a test, it would tell the kernel to be more active in freeing memory. Your gist doesn't show any swap at all which is odd.

> From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.

The average application doesn't even know or care about how much RAM you have at all.

Someone on the VKx discord found that if VRAM is full, vkAllocateMemory fails even on a memory type that is not device local. This would also explain why #1099 crashes even though memory utilization is very low. This does include VRAM allocated by other applications (window manager, browser, ...), which DXVK has no control over.

@doitsujin

> From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.
>
> The average application doesn't even know or care about how much RAM you have at all.

Yes, it was not a technical description.

@h1z1

> Your gist doesn't show any swap at all which is odd.

swap:512MiB, swappiness:10, swap in my system used only as "fallback", it rarely filled and used as indicator "be ready". Also it's zram.


Well, the Superposition test is still the thing: I am able to reproduce the issue running the "1080p" profile. It quits immediately when VRAM gets filled. The "720p" profile is fine with ~1200/1300MB used/allocated.

I installed 418.49.04, the lowest (IIRC) driver version for my current kernel (5.0.21), and was able to fill the whole VRAM (1900+) and have ~2700/2800MB used/allocated during the benchmark. Well, it's a freshly booted system, so I'm going to stay on the 418.49.04 driver for a while and perform more tests later, to be sure.

This is also an issue with Borderlands GOTY Enhanced. It seems to occur when loading new map areas or the title sequence. Once a map has loaded successfully, it does not happen until I have been playing for around 15-20 minutes. For example, right after loading in, traveling between separate map areas (loading sequence) does not produce a crash no matter how many times you travel. But trying to load a new area after ~10 minutes crashes the game.

Regarding […], clearing an already built cache makes the game crash on launch with the same errors nearly every single time until the 3rd or 4th launch. Very strange.

At first I thought this was an issue with Reshade, however it appears that this happens less often with Reshade active. Perhaps this is just placebo.

d3d11.log (note: I removed a few thousand lines of compiling shader outputs, above paste limit)

dxgi.log

lutris/wine/dxvk_debug.log

Specs: i7-4770 GTX 980 Ti Kernel: 5.1.11-arch Driver: 430.26.0 DXVK: 1.2.2 Wine: ge-protonified-4.10 (tested Proton 4.2-7 & Wine 4.9 Staging)

Cheers (side question: Is this a recent development? I've never noticed this with any other games before, although previous DXVK versions have the same error)

@telans @Rugaliz Can you test setting the environment variable […]?

Note that performance will most likely be poor, but this should hopefully work around the crashes.

Still crashing, and performance appears to remain the same.

lutris.log

Is Borderlands a 32-bit game? In that case your issue is most likely something else; on Proton you can try […]. Some wine builds in Lutris may also support this (it would be […] there).

The Enhanced version (remastered/released a couple months ago) I'm playing is 64bit, the remastered versions are also updated to DX11 from DX9.

update: ge-wine does support […], but this didn't change anything.

> From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.
>
> The average application doesn't even know or care about how much RAM you have at all.

I had the error in D9VK on my system with 32 GB on a 2080 Ti. Both RAM and VRAM were barely 25% used when I got this error. It has nothing to do with availability.

Also interesting is that I can hit the error with BL2 in a couple of minutes, but I've been playing Bloodstained Ritual of the Night for much longer without a problem. Could it be something new that is not included in Proton yet? The errors are also relatively new to D9VK (as in, builds older than Monday 10 June were fine).

> From my observations, crashes are more often if "free" host memory is low. But IMHO app should use "available" memory.
>
> The average application doesn't even know or care about how much RAM you have at all.
>
> I had the error in D9VK on my system with 32 GB on a 2080 Ti. Both RAM and VRAM were barely 25% used when I got this error. It has nothing to do with availability.

The couple of times i have actually had any monitoring up while this crash happened with World of Warcraft and DXVK, the dxvk HUD had a bump in allocated up around 3.6GB-4GB, and nVidia SMI was barely 2GB'ish. This is with RTX2070 8GB card. So yeah, it does not really seem to be ACTUAL resource starvation, but some imaginary problem possibly from the driver perhaps.

> Could it be something new that is not included in Proton yet? The errors are also relatively new to D9VK (as in, builds older than Monday 10 June were fine).

There have been no memory allocation changes at all for several months. Only 138dde6 (from today) changes things a bit, but most likely won't affect this issue at all.

I also somehow doubt that this can be fixed within DXVK since it's the vkAllocateMemory calls that are failing for no apparent reason, no matter which memory type we're trying to allocate from.

> There have been no memory allocation changes at all for several months. Only 138dde6 (from today) changes things a bit, but most likely won't affect this issue at all.

Yeah, just tried it and still crashed unfortunately.

New lines in log:

[log lines not recoverable]

> I also somehow doubt that this can be fixed within DXVK since it's the vkAllocateMemory calls that are failing for no apparent reason, no matter which memory type we're trying to allocate from.

As mentioned, I haven't actually run into this one myself with DXVK in Proton 4.2-7. But assuming that D9VK still shares the same memory allocation code, something changed in the last 10 days that made it highly sensitive. Maybe there is a hint there.

Well, found a little snippet to allocate VRAM via CUDA. https://devtalk.nvidia.com/default/topic/726765/need-a-little-tool-to-adjust-the-vram-size/


#include <stdio.h>
#include <cuda_runtime.h>  // cudaMalloc, cudaFree, cudaError_t

int main(int argc, char *argv[])
{
     unsigned long long mem_size = 0;
     void *gpu_mem = NULL;
     cudaError_t err;
     // get amount of memory to allocate in MB, default to 256
     if(argc < 2 || sscanf(argv[1], " %llu", &mem_size) != 1) {
        mem_size = 256;
     }
     mem_size *= 1024*1024; // convert MB to bytes
     // allocate GPU memory
     err = cudaMalloc(&gpu_mem, mem_size);
     if(err != cudaSuccess) {
        printf("Error, could not allocate %llu bytes.\n", mem_size);
        return 1;
     }
     // hold the allocation until the user presses return
     printf("Press return to exit...\n");
     getchar();
     // free GPU memory and exit
     cudaFree(gpu_mem);
     return 0;
}

Needs cuda-dev-kit from nVidia (or distro). Compile with: […]

That way you can allocate and "spend" vram without actually spending it.. What happened if i spend 6GB vram, was that WoW started as normal, and did not crash even tho after running around a bit and zoning++ vram was topped out at 7.9GB+ on my 8GB card. Did not crash, not notice any huge issues, but did not test more than maybe 10-15 minutes.

However, using "gpufill" ([…]) to spend 7GB vram BEFORE starting WoW, something was clearly pushed to system ram instead, cos the performance was horrible. But i still did not crash from that. Screenshot:

Closing "gpufill" by pressing enter did release 7GB of vram according to nVidia-smi, but there was no change in WoW performance. This atleast indicates that allocated vram -> system ram does not "transfer" back to actual vram even if its freed later. That may well be intended tho, but from what i gather even this experiment did not immediately crash WoW, so the crashing might not REALLY be actual memory allocation problems due to memory starvation.

The "shared memory" thing between vram<->sysram probably does not work the same way that swap does i guess? Ie. in a memory starving situation things gets put to swap on disk, but once memory gets freed, it does not continue to be used from swap. I have no clue what is supposed to happen in a situation like that tho?

Will do some more testing with this, and with the latest 138dde6

138dde6 seems an improvement so far.

Doing the same test as above with 7GB memory allocated with "gpufill", WoW loaded and had a lot higher fps, although some stuttering and framespikes.. closing "gpufill" to release 7GB vram brought the frametimes down, and fps up. Fairly playable, but i noticed GPU load was still 90%+ vs if normally where i was standing it usually is 45-50% with 30+ more fps.

So for the little testing i did, 138dde6 did help on performance when in a out of vram situation. EDIT: Clearing the .nv/GLCache folder and WoW/retail/Cache folder brought back the same "issues" as it seems..

One other thing i noticed was nVidia-smi seemed to indicate less vram usage from WoW. Is this due to "reusing chunks" so that "actual" vram is not so much?

Since i am an incredibly slow learner, and a n00b.. Let me just ask this to TRY to get my head around this "allocated" thing. The Cuda app i posted above "allocates" vram from "actual" vram. If i have 7800MB free vram, i can allocate 7800MB, but if i try to allocate 7900MB i get "Error, could not.." So, when i open eg. firefox, it uses (according to nVidia SMI) 79MB. When i play WoW at my current resolution/settings, the app uses 1880'ish MB. This does not vary much, but may vary with spell effects, and possibly when changing "worlds" (ref. expansions and different texture details and whatnot). Simple math according again to nVidia SMI, 1880 (wow) + 79 (firefox) = 1959mb. This means i can allocate 6GB (well.. i could allocate 5960MB with the cuda app).

Reading from DXVK HUD, the "allocation" is 4500+ MB. What is this "allocation", and is this "unlimited"? Is the allocation limited by vram + system ram? (in my case 8 + 16 = 24GB) From the little tests i have done, it is atleast clear that the "allocated" and "used" listed on dxvk hud does not in any way limit me allocating vram with the cuda app, or starting chrome or whatnot. The only thing that actually spew an error message is if i try to use the cuda app to allocate > available vram.

What i don't know is supposed to happen with this "dxvk allocation" is what happens if physical vram is full. From the tests it SEEMS as it will happily use system ram (as i guess this is the intended function). The "allocation" and "used" does not change, but WoW (according to nVidia SMI) uses less physical vram if the game is started in a vram starved situation vs. not. What was rather clear tho, is that it can seem as if once any actual data (textures and whatnot) is put in the system ram, it stays there for some reason. The tests with really starved vram makes the GPU usage 99%, and fps.. a LOT less even after i kill the cuda app, even if i then get 5GB free physical vram. Would it not be ideal if allocation blocks could be freed or moved to vram once vram is free? Or is that not a feature available to vulkan.. or perhaps a driver thing that things dont get "transfered"?

> Would it not be ideal if allocation blocks could be freed or moved to vram once vram is free?

Indeed, but that would require recreating all Vulkan resources that are in system memory, as well as all views for those resources. This is an absolute nightmare, and I have no plans to do that.

DXVK can let the driver do the paging so that it doesn't have to recreate any resources, however that only works on drivers which support […] and allow over-subscribing the device-local memory heap. On Linux, this currently only works on AMD and possibly Intel drivers.

SveSop, have you tried completely disabling GLCache with […]?

> DXVK can let the driver do the paging so that it doesn't have to recreate any resources, however that only works on drivers which support […] and allow over-subscribing the device-local memory heap. On Linux, this currently only works on AMD and possibly Intel drivers.

Since this extension IS available for Windows and nVidia, hopefully this COULD be a thing for Linux aswell. IF this happens, would this help in situations like this? Cos to me it kinda seems like somewhat of a drawback if resources ever get put in system ram and never moved back. I wonder if this is somewhat related to what i have tried to describe before - After playing a while (2-3hours +), the performance is worse (less fps) standing at the same spot, but restarting the game will gain back the same performance i had earlier. Maybe over time some stuff gets bumped to sysmem due to the "allocated memory" actually allocating memory outside of vram and decides to put some shit there? Cos as i have kinda proven above - allocation does not seem to have anything AT ALL to do with available vram.

Is it up to the driver not to mess this up? If i have 2GB physical vram, and DXVK allocates 4.5GB, it is feasible to think 2.5GB of that is allocated in system ram, but if i have 8GB vram, it "should" be allocated in vram... but that does not seem to be the way things actually work i guess. Can one blame the driver for putting stuff "where it sees fit", assuming the […] extension is not available?

Getting same crash on work computer, with Radeon HD 8570 / R7 240/340 OEM

440.26 was released. Wonder if this is related?

[changelog entry not recoverable]

Would be nice for NVIDIA to provide at least some context.

I don't see much of a point buying nvidia hardware anytime soon.

> Wonder if this is related?

I believe it is, but there's another issue regarding sysmem allocations. There's a patch floating around somewhere, but I don't know if it made its way into any official driver release yet.

As the context provided in the changelog entry implies, this fixes a different issue with different symptoms. That was #1169.

For the issue here, there has been a patch floating around that we were waiting for feedback on, and we didn't really get a lot of testing data from end users. It has now been added to our trunk, and will show up in the next release in our Vulkan beta sidebranch, as well as in an unspecified future official release.

@h1z1

> 440.26 was released. Wonder if this is related?
>
> [changelog entry not recoverable]
>
> Would be nice for NVIDIA to provide at least some context.

It seems this patch was introduced in the vulkan beta branch with 435.19.03 a while back, but its nice to have it for a release driver.

@ahuillet Are you talking about the […] patch? Due to the somewhat random nature of this allocation failure it is near impossible to pin a certain patch as a fix. As i posted either up the thread, or someplace else, i have had streaks of several days playing 3-4 hours a day without ANY issues, and suddenly i can experience a failure 2-3 times in a row.

This means adding this patch and play for 3-4 hours a couple of times is not enough to say "it fixes things" sadly. Perhaps with a large enough test-base it could indicate something, and implementing this is the only way to go, as most ppl do not patch and compile their own drivers.

Is there an extension one can easily use to check allocations in native vulkan apps? Like the DXVK HUD outputs "allocated" and "used". It could perhaps be an interesting experiment to see how allocations are used in games like The Talos Principle or similar when compared to DXVK/Wine games.

> As the context provided in the changelog entry implies, this fixes a different issue with different symptoms. That was #1169.

There was no context, unless you're privy to something else. Nowhere was Squad even mentioned, let alone the bug.

There are several reports of memory allocations failing. 1169 may be triggered by some other specific code path within the driver but the end result is the same as this one. Memory is not allocated. The crashes likely are a result of where the allocation happened.

When the errors are being reported by the hardware it's pretty far outside anything end users would understand. What else is NVIDIA expecting? It's literally impossible to diagnose a binary blob. Maybe it's a specific bios vendor or some combination of hardware.

Could someone please summarize which 64 bit applications definitely are affected?

I updated nvidia/dxvk 2 days ago, to the versions current on that day. Now E:D may still fail to start 1-2 times, but not as often as before (previously I could need Alt+SysRq+REISUB 7 times before getting the game going). In game I had no hangs at all during these 2 days (there were hangs before, which I solved with SysRq+B). The game has also started to lag heavily at times, as it did in May-June (sysmem allocation?). It seems to me some regression was introduced in the driver / dxvk.

@alexzk1 Yes, the effect of fallback allocation from sysmem can result in reduced performance. I can confirm that with the driver fixes, games are more likely to lag instead of crashing. Tho, I never saw hard/complete system freezes as the result of that. In KDE there's a keyboard shortcut to invoke a window kill mouse pointer (Ctrl+Alt+Esc by default). It always worked, tho, it could take 20-30 seconds to kill the game window. Usually, this leaves processes lingering which should then be killed with […] but it gets you back to the desktop.

What always helped the lags (after the driver fixes) in my case was lowering the texture quality. In many games, reducing the quality just one step usually helps a lot with only barely visible loss of render quality, especially if you're going from "ultra" to "very high", or from "very high" to "high".

There's a patch floating around which patches the open source interface of the driver to use a different allocation strategy. It instructs the kernel to not immediately fail a kernel memory request but let the driver retry while the kernel tries to free some memory for the allocation. It will still be in the fallback path, of course (allocating from sysmem).

The patch is here: https://github.com/Tk-Glitch/PKGBUILDS/blob/master/nvidia-all/patches/GFP_RETRY_MAYFAIL-test.diff

@wgpierce is asking for trying the patch and reporting back the results.

I tried this patch but I found it introduces lags in normal desktop usage in render-heavy browser tabs, and also in games (other types of lags not seen before), without really improving the situation of the existing problems. Not sure if this was coincidence or really a side-effect of the patch. I'll try to reproduce that later.

This bug has always been about sysmem allocation failures. This isn't a fallback path. This bug isn't about video memory being full.

This bug has always been about sysmem allocation failures. This isn't a fallback path. This bug isn't about video memory being full.

Yes, I know that. You probably wrote that because I wrote "what helped [...] in my case was lowering the texture quality". While this would indeed sound like a video memory allocation issue, it really isn't in my case: VRAM always had free space left (1GB+). But still, reducing the texture quality reduced crashes. There may be some interaction between handling texture uploads and sysmem allocations. With current drivers, it now doesn't crash but starts to lag/stutter at the same instant in the game. If that's not a result of the latest driver changes, what is it then? Why would this bug be affected by texture usage if it shouldn't be?

Texture memory was never really an issue yet. With too high texture settings, games would either crash very early (during load), or have low FPS right from the start because texture reads are going to sysmem.

Borderlands: The Pre-Sequel via forced Proton with the UHD textures pack installed crashes with D9VK....

[…]

With D9Vk disabled, the game doesn't crash, but the performance is horrible.

The GPU is an Nvidia 2060 SUPER The CPU is an intel 4690K with 4x4GB DDR3 2400 CL11 All SSD's

This is the Steam log... steam-261640.log

I don't know how to create a full DXVK log.

Borderlands: The Pre-Sequel is a 32bit game so it probably runs out of address space just like Borderlands 2.

> Borderlands: The Pre-Sequel is a 32bit game so it probably runs out of address space just like Borderlands 2.

I know, that's why I have PROTON_FORCE_LARGE_ADDRESS_AWARE=1 on both games.. I don't have that crash on Borderlands 2 and, like The Pre-Sequel, it has the UHD textures pack installed.

Remember that if I use the default Proton OpenGL, The Pre-Sequel works without any crash...

> I know, that's why I have PROTON_FORCE_LARGE_ADDRESS_AWARE=1 on both games..

The problem still happens with LAA.

> Remember that if I use the default Proton OpenGL, The Pre-Sequel works without any crash...

Unrelated.

This is a DXVK/D9VK issue (higher memory usage). It's not the Nvidia driver issue.

@CSahajdacny @K0bin Yes The Pre-Sequel runs out of address space with UHD textures. That game blows out the address space even with PROTON_FORCE_LARGE_ADDRESS_AWARE=1. That issue is the same as here (BL2 also suffers from it) and is not what the Nvidia devs are trying to fix in this issue.

Weird. I can play Borderlands 2 with UHD textures without any problem with everything at maximum... Is there a specific part where the game crashes? I am on the second map of the game, where Hammerlock is located.

It can also be exacerbated by screen resolution. Discussion of it would be better suited in Joshua-Ashton/d9vk#170.

Not sure guys what u did there, but I got banned for client modification of E:D. That's sad to say least.

@doitsujin I can reliably reproduce this with d9vk and Heroes of Might and Magic 5 (in the main campaign and when starting the Dark Messiah addon). In the main campaign it crashes when starting a cutscene (the game automatically saves right before it, which means that loading the save triggers the crash) and the addon dies when starting. Is there something I can provide you with or is this purely a driver issue (skimmed the comments, but have a hard time extracting meaningful information)?

Again, crashes with 32-bit games running out of memory are unrelated.

@doitsujin Unless I misinterpreted what I'm seeing and what this is about, I do believe that I see the same problem. dxvk reports that the memory allocation failed, but the stats on the next line show that there is enough memory available. Also this doesn't happen without d9vk as far as I can tell (but I'd have to test that again).

Again, you are testing 32 bit games which have only 2GB of virtual memory available, which isn't always enough for d9vk (or dxvk in general). This has nothing to do with this issue.