Even the fastest GPU can stall if it runs out of memory. CAD, BIM, visualisation, and AI workflows often demand more than you think, and it all adds up when multi-tasking, writes Greg Corke
When people talk about GPUs, they usually focus on cores, clock speeds, or ray-tracing performance. But if you’ve ever watched your 3D model or architectural scene grind to a crawl — or crash mid-render — the real culprit is often GPU memory, or more specifically, running out of it.
GPU memory is where your graphics card stores all the geometry, textures, lighting, and other data it needs to display or compute your 3D model or scene. If it runs out, your workstation must start paging data to system RAM, which is far slower and can turn an otherwise smooth workflow into a frustrating slog.
This is why professional GPUs usually come with more memory than consumer cards. Real-world CAD, BIM, visualisation, and AI workflows demand it. Large assemblies, high-resolution textures, and complex lighting can quickly fill memory. Once GPU memory is exhausted, frame rates can collapse, and renders lag. Extra GPU memory ensures everything stays on the card, keeping workflows smooth and responsive.
This article is part of AEC Magazine’s 2026 Workstation Special report
GPU memory isn’t a luxury — it can make or break a workflow. A fast GPU may crunch geometry or trace light rays quickly, but if it can’t hold everything it needs, that speed is wasted. Even the most powerful GPU can feel practically useless if it’s constantly starved for memory.
CAD and BIM: quiet consumers
In CAD software like Solidworks or BIM software such as Autodesk Revit, GPU memory is rarely a major bottleneck. Most 3D models, particularly when viewed in the standard shaded display mode, will comfortably fit within a modest 8 GB professional GPU, such as the Nvidia RTX A1000. However, it’s still important to understand how CAD and BIM workflows impact overall GPU memory — each application and dataset contributes to the total, and it soon adds up.
Memory demands rise with model complexity and display resolution. The same model viewed on a 4K (3,840 × 2,160) display uses more memory than when viewed on FHD (1,920 × 1,080). Realism also has an impact: enabling RealView in Solidworks or turning on realistic mode in Revit consumes more memory than a simple shaded view.
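As a rough illustration of the resolution effect, the sketch below estimates raw render-target memory at FHD and 4K. The figures of four bytes per pixel and six full-resolution buffers are assumptions for illustration only; real engines allocate different buffer sets, but the scaling with pixel count holds.

```python
# Rough, illustrative estimate of render-target memory at different resolutions.
# Assumes 4 bytes per pixel (RGBA8) and six full-resolution buffers
# (colour, depth, a few G-buffers) -- real engines differ, but the scaling holds.

def render_target_mb(width, height, bytes_per_pixel=4, buffer_count=6):
    return width * height * bytes_per_pixel * buffer_count / (1024 ** 2)

for name, (w, h) in {"FHD": (1920, 1080), "4K": (3840, 2160)}.items():
    print(f"{name}: ~{render_target_mb(w, h):.0f} MB of render targets")

# FHD: ~47 MB, 4K: ~190 MB. Four times the pixels means four times the memory,
# before any geometry or textures are counted.
```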
Looking ahead, as CAD and BIM software evolves with the addition of modern graphics APIs, and viewport realism goes up with more advanced materials, lighting, and even ray tracing, memory requirements will increase. At that point, 8 GB GPUs will probably start to show their limitations, so when considering any purchase it’s prudent to plan for the future.
Visualisation: memory can explode
GPU-accelerated visualisation tools like Twinmotion, D5 Render, Enscape, KeyShot, Lumion, and Unreal Engine are where memory demands really spike. Every texture, vertex, and light source must reside in GPU memory for optimal performance. High-resolution materials, dynamic shadows, reflections, and complex vegetation can quickly push memory usage upward. As with CAD, display resolution also has a significant impact on GPU memory load.
Running out of GPU memory in real-time visualisation software can be brutal. Frame rates don’t gradually decline — they plummet. A smooth 30–60 frames per second (FPS) viewport can drop to 1–2 FPS, making navigation impossible, and in the worst cases, the software may crash entirely. This is why professional GPUs aimed at design visualisation, such as the RTX Pro 2000 or 4000 Blackwell series, come with 16 GB, 24 GB, or even more memory. Having a cushion of memory allows designers to push realism without worrying about performance cliffs.
In real-world projects, memory usage scales with scene complexity. A small residential building might need 4–6 GB, but a large urban environment with trees, vehicles, and complex lighting can easily consume 20 GB or more.
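To give a feel for how geometry alone scales, here is a hedged back-of-the-envelope sketch. The 32 bytes per vertex, 12 bytes per triangle of index data and the two triangle counts are illustrative assumptions, not measurements from any particular project; real engines add LOD copies, instancing data and ray-tracing acceleration structures on top.

```python
# Illustrative estimate of raw geometry memory: 32 bytes per vertex
# (position + normal + UV) plus 12 bytes per triangle for indices.
# Triangle counts below are made-up examples, not benchmark figures.

def geometry_gb(triangles, vertices_per_triangle=0.5):
    # In a well-shared mesh, unique vertices are roughly half the triangle count.
    vertices = triangles * vertices_per_triangle
    bytes_total = vertices * 32 + triangles * 12
    return bytes_total / (1024 ** 3)

for label, tris in {"small residential model (20M triangles)": 20e6,
                    "large urban scene (500M triangles)": 500e6}.items():
    print(f"{label}: ~{geometry_gb(tris):.1f} GB of geometry alone")

# Roughly 0.5 GB and 13 GB respectively -- and that is before textures,
# lighting data and render targets claim their share.
```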
Exporting final stills and videos pushes memory demands even higher. A scene that loads and navigates smoothly can still exceed the GPU’s capacity once rendering begins. Often there’s no obvious warning — renders may just take much longer as data is offloaded to system RAM. The more memory that’s borrowed, the slower the process becomes, and by the time you notice, it may already be too late: the software has crashed.
AI: memory gets even hotter
AI image generators, such as Stable Diffusion and Flux, place a completely new kind of demand on GPU memory. The models themselves, along with the data they generate during inferencing, all need to live on the graphics card. Larger models, higher-resolution outputs, or batch processing require even more memory.
If the GPU runs out, AI workloads either crash, or slow dramatically as data is paged to system RAM. Even small amounts of paging can cause significant slowdowns — which can be more severe than running out of memory in a ray-trace renderer. According to Nvidia, the Flux.dev AI image-generation model requires over 23 GB to run fully in GPU memory.
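One practical check before launching a big AI job is to query the card directly. The minimal PyTorch sketch below reads free and total device memory and compares it against an expected model footprint; the 23 GB threshold simply reuses the Flux.dev figure quoted above and is not something the library reports.

```python
# Minimal sketch: check whether a model of a given size is likely to fit in
# dedicated GPU memory before running inference (requires PyTorch with CUDA).
import torch

MODEL_FOOTPRINT_GB = 23  # illustrative figure, e.g. Flux.dev as quoted by Nvidia

if torch.cuda.is_available():
    free_bytes, total_bytes = torch.cuda.mem_get_info()  # free/total on device 0
    allocated = torch.cuda.memory_allocated()             # held by this process
    print(f"Total VRAM:   {total_bytes / 1e9:.1f} GB")
    print(f"Free VRAM:    {free_bytes / 1e9:.1f} GB")
    print(f"This process: {allocated / 1e9:.2f} GB allocated")
    if free_bytes / 1e9 < MODEL_FOOTPRINT_GB:
        print("Model is unlikely to fit: expect paging to system RAM or a crash.")
else:
    print("No CUDA-capable GPU detected.")
```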
Everything adds up
The biggest drain on GPU memory comes when multi-tasking. Few designers work in a single application in isolation — CAD, BIM, visualisation, and simulation tools all compete for memory. Even lighter apps, like browsers or Microsoft Teams, contribute. Everything adds up. The GPU doesn’t necessarily need to have everything loaded at once, but even when it appears to have a little spare capacity, you can notice brief slowdowns as data is shuffled in and out of memory. When bringing a new app to the foreground, the viewport can initially feel laggy, only reaching full performance seconds later.
Modern GPUs handle multi-tasking better than older cards, but if you’re running a GPU render in the background while modelling in CAD, you definitely need enough memory to handle both. Otherwise, frame rates drop, viewports freeze and rendering pipelines choke.
Different graphics APIs can complicate matters further. OpenGL-based CAD programs and DirectX-based visualisation tools don’t always share memory efficiently.
Keeping your memory in check
You can take several steps to help avoid running out of GPU memory. Close any applications you’re not actively using, and reboot occasionally — memory isn’t always fully freed up when datasets or programs are closed.
Understanding how different workflows and applications impact memory usage helps, too. You can track this in Windows Task Manager or with a dedicated tool like GPU-Z.
Practical strategies also help reduce memory load. In visualisation software, avoid high-polygon assets where they add little visual value, use optimised textures appropriate to the resolution of the scene, and take advantage of level-of-detail technologies such as Nanite in Twinmotion and Unreal Engine. Even in CAD and BIM software, limiting unnecessary realism during navigation can help keep memory usage within bounds. And do you really need to model every nut and bolt?
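To put ‘optimised textures’ into numbers, the sketch below estimates the footprint of uncompressed RGBA textures with full mip chains. The one-third mipmap overhead is standard, but the block compression most engines apply can cut these figures by roughly four to eight times, so treat them as upper bounds.

```python
# Illustrative texture memory estimate: uncompressed RGBA8 with a full mip chain
# adds roughly one third on top of the base level. Block compression (BC1/BC7),
# which most engines use, reduces these figures by roughly 4-8x.

def texture_mb(size_px, bytes_per_pixel=4, mipmaps=True):
    base = size_px * size_px * bytes_per_pixel
    return base * (4 / 3 if mipmaps else 1) / (1024 ** 2)

for size in (1024, 2048, 4096, 8192):
    print(f"{size} x {size}: ~{texture_mb(size):.0f} MB uncompressed")

# 1K: ~5 MB, 2K: ~21 MB, 4K: ~85 MB, 8K: ~341 MB. A scene full of 8K textures
# fills an 8 GB card far faster than one authored at 2K.
```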
The bottom line
GPU memory is just as important as cores or clocks in professional workflows, and unlike CPU memory, it’s unforgiving — there’s no graceful degradation. CAD and BIM may not be massive memory hogs, but they all contribute to the load. Visualisation demands far more, AI workflows can push requirements even higher, and multi-tasking compounds the problem.
Professional add-in graphics cards with large memory pools give designers, engineers, and visualisation professionals the headroom needed to work without hitting sudden performance cliffs.
Meanwhile, new-generation processors with advanced integrated graphics offer a different proposition. The AMD Ryzen AI Max Pro, for example, gives the GPU fast, direct access to a large pool of system memory — up to 96 GB. This allows very large datasets to be loaded, and renders to be attempted, that would be impossible on a GPU with limited fixed memory.
However, as datasets grow, don’t expect performance to scale in the same way. One must not forget that the GPUs in these new all-in-one processors are still very much entry-level, so renders and AI tasks will take longer and navigating large, complex viz models can quickly become impractical due to low frame rates.
Ultimately, understanding how GPU memory is consumed — and planning for it — will help you avoid slowdowns, crashes, and frustration, ensuring workflows stay fast, responsive, and predictable.
Keeping an eye on GPU memory
Keeping an eye on GPU memory usage is important as it lets you see exactly how much your applications and active datasets are consuming at any given moment, rather than relying on guesswork or system slowdowns as a warning sign. It also makes it possible to see the immediate impact of closing an application, unloading a large model, or switching projects, helping you understand which tasks are placing the greatest demands on your hardware.
This insight allows you to plan your workflow more effectively, avoiding situations where memory pressure leads to reduced performance, stuttering viewports, or crashes. It can also inform purchasing and configuration decisions, such as whether you need a higher-end GPU with more memory, or simply better task management.
Monitoring can be done through a dedicated app like GPU-Z or simply through Windows Task Manager. To access it, right-click on the Windows taskbar, launch Task Manager, then select the Performance tab at the top. Finally, click GPU in the left-hand column and you’ll see all the important stats at the bottom (a scripted alternative is sketched after the callouts below).

1 GPU Utilisation: When rendering or navigating complex viz models, this will often sit close to 100%. In CAD or BIM workflows it is typically much lower, as these applications are less GPU-intensive
2 GPU memory: The total memory available to the GPU (dedicated GPU memory + shared GPU memory)
3 Dedicated GPU memory: the amount of memory physically on your graphics card. Ideally, you want to maintain some headroom here, as once it becomes full, performance can drop sharply
4 Shared GPU memory: a portion of system RAM that Windows makes available to the GPU. A small amount is always reserved, but usage increases when dedicated GPU memory is exhausted. As reliance on shared memory grows, GPU performance degrades and applications may become unstable or crash.
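If you want a log of memory use over a working session rather than a live graph, the dedicated-memory figure can also be polled from a script. The sketch below shells out to Nvidia’s nvidia-smi tool, which is installed with the driver; the five-second interval and plain-text output are arbitrary choices, and note that the shared GPU memory shown by Task Manager is a Windows figure that nvidia-smi does not report.

```python
# Minimal sketch: log dedicated GPU memory use over time via nvidia-smi
# (installed with the Nvidia driver). Press Ctrl+C to stop.
import subprocess
import time

QUERY = ["nvidia-smi",
         "--query-gpu=timestamp,memory.used,memory.total",
         "--format=csv,noheader"]

while True:
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    print(result.stdout.strip())  # e.g. "2025/11/03 10:15:02, 6512 MiB, 8192 MiB"
    time.sleep(5)                 # poll every five seconds
```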
What happens when you run out of GPU memory?
Running out of GPU memory can be catastrophic, and the impact is often far more severe than many users expect. Our testing highlights just how dramatic the consequences can be across two different workflows – AI image generation and real-time visualisation.
In the Procyon AI Image Generator benchmark, based on Stable Diffusion XL, we compared an 8 GB Nvidia RTX A1000 with several GPUs offering larger memory capacities. On paper, the RTX A1000 is only around 2 GB short of the 10 GB required for the benchmark’s dataset to reside entirely in GPU memory. In practice, that small deficit caused performance to fall off a cliff. The RTX A1000 took a staggering 23.5 times longer to generate a single image than the RTX A4000 — far beyond what its relative compute specifications would suggest.
With 16 GB of memory, the RTX A4000 can keep the entire AI model resident in GPU memory, avoiding costly paging to system RAM and delivering consistent performance.
A similar pattern emerged in Twinmotion. Using an older Nvidia Quadro RTX 4000 with 8 GB, we loaded the Snowdon Tower Sample project, which requires around 7.2 GB at 4K resolution. When run in isolation, the scene fit comfortably in GPU memory, delivering smooth real-time performance at around 20 frames per second (FPS). However, by simultaneously loading a complex 7 GB CAD model in Solidworks, we forced the GPU into a memory-constrained state. Twinmotion’s viewport performance collapsed to just 4 FPS, before recovering to 20 FPS once memory was eventually reclaimed.

Autodesk Revit 2026: GPU memory utilisation
Revit is the most widely used BIM tool. For our tests, we used the small-to-medium-sized Snowdon sample model that comes with the software.
Display resolution has a significant impact on GPU memory usage, and enabling realistic mode more than doubles the amount of memory required.
Revit 2026 also allows models to be displayed using the Accelerated Graphics Tech Preview, a new graphics technology currently in development designed to improve performance. Interestingly, in shaded view, this tech preview appears to consume less GPU memory than the standard graphics engine.
D5 Render 2.9: GPU memory utilisation
D5 Render is a real-time architectural visualisation tool built on Unreal Engine, capable of producing still images, videos, and 360° panoramas. For our test scene, we used a colossal lakeside model containing 2,177,739,558 faces. While navigating the model in the viewport, GPU memory usage increased modestly as display resolution rose. Final renders placed a heavier load on memory overall, although output resolution itself appeared to have little influence. The heaviest memory demands occurred when rendering a final 4K video, requiring at least a 20 GB GPU for best performance.
KeyShot Studio 2025: GPU memory utilisation
KeyShot Studio is a powerful 3D rendering and animation tool that can render with either the CPU or GPU. When GPU mode is enabled, the entire scene must fit into GPU memory. If it doesn’t, KeyShot automatically falls back to CPU rendering. For our testing, we used a very large supermarket model from Kesseböhmer Ladenbau, containing approximately 447 million triangles. Despite the complexity, GPU memory usage scaled predictably. Increasing viewport resolution, final render resolution, and enabling denoising all raised memory requirements, but none had a dramatic impact.
Twinmotion 2024: GPU memory utilisation
Twinmotion is a popular GPU-accelerated real-time viz tool. For our testing, we used the medium-sized Snowdon Towers dataset. GPU memory usage increased most significantly when enabling path tracing, exporting video at higher resolutions, and batch rendering multiple raster images — but, interestingly, not when batch rendering multiple path traced images.
Lumion 2025: GPU memory utilisation
Lumion is a real-time visualisation tool focused on AEC workflows. For testing, we used the medium-sized Streetscape scene. GPU memory usage increased noticeably with higher viewport resolution, but output resolution had little effect on final raster exports. Ray-traced rendering had the largest impact, with memory use jumping sharply from FHD to 4K, likely explaining why 8K ray-tracing is disabled.