The virgl driver only queried the host resource layout (which includes
the real modifier) for PIPE_BIND_SHARED resources. Resources created
via gbm_bo_create() with GBM_BO_USE_RENDERING have PIPE_BIND_RENDER_TARGET
but not PIPE_BIND_SHARED, so the layout query never fired and
gbm_bo_get_modifier() always returned LINEAR (0). When the host GPU
allocates with non-linear tiling (e.g. NVIDIA block-linear), this
caused the host compositor to receive the wrong modifier in Wayland
dmabuf add() calls, producing GL_INVALID_OPERATION.
Broaden the trigger for virgl_resource_async_query_gbm_layout() to also
fire for PIPE_BIND_SCANOUT and PIPE_BIND_RENDER_TARGET resources. The
existing async query and sync mechanisms are unchanged.
Also relax virgl_resource_create_with_modifiers() to accept modifier
lists that don't include LINEAR, since the host always decides the
actual layout. This prevents resource creation failures when clients
request host-native modifiers from the compositor's format table.
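
As a rough illustration of the broadened trigger, the check can be thought of as a bind-flag predicate. This is a minimal sketch, not the driver code: the SKETCH_BIND_* values and both function names are hypothetical, and the real PIPE_BIND_* flags live in Gallium's headers with different values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative bit values only; these are NOT the real PIPE_BIND_* values. */
#define SKETCH_BIND_RENDER_TARGET (1u << 0)
#define SKETCH_BIND_SCANOUT       (1u << 1)
#define SKETCH_BIND_SHARED        (1u << 2)
#define SKETCH_BIND_SAMPLER_VIEW  (1u << 3)

/* Old trigger: only shared resources caused a host layout query. */
static bool sketch_wants_layout_query_old(uint32_t bind)
{
   return (bind & SKETCH_BIND_SHARED) != 0;
}

/* New trigger: scanout and render-target resources also query the host
 * layout, so the real modifier becomes visible to the guest. */
static bool sketch_wants_layout_query_new(uint32_t bind)
{
   const uint32_t mask = SKETCH_BIND_SHARED |
                         SKETCH_BIND_SCANOUT |
                         SKETCH_BIND_RENDER_TARGET;
   return (bind & mask) != 0;
}
```

With these definitions, a render-target-only resource (the GBM_BO_USE_RENDERING case described above) is missed by the old predicate and caught by the new one.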
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
move_rt_instructions() only makes sense for CPS recursive shaders, where
later rt_trace_ray calls can overwrite the current shader's RT system
values.
Running it on the function-call path can hoist load_hit_attrib_amd
above merged intersection writes, which corrupts any-hit
hitAttributeEXT. Move the pass into the existing CPS-only
non-intersection branch before nir_lower_shader_calls().
Fixes: c5d796c902 ("radv/rt: Use function call structure in NIR lowering")
Closes: #15074
Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40531>
When vdrm_handle_to_res_id fails in virtio_bo_init_dmabuf, the handle
obtained from vdrm_dmabuf_to_handle was leaked.
Closing the handle is safe despite the lack of vdrm refcounting
because dma_bo_lock is held and already-imported BOs return early.
At this point, we are the sole holder of the handle.
While here, use the local vdrm variable consistently.
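
The shape of the fix can be sketched with stand-in helpers. Everything prefixed sketch_ is hypothetical (including the convention that a res_id of 0 signals failure); the snippet only shows the handle being released on the failing path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts handles that are currently open, so a leak is observable. */
static int sketch_open_handles;

/* Hypothetical stand-in for the dmabuf-to-handle import. */
static uint32_t sketch_dmabuf_to_handle(int fd)
{
   (void)fd;
   sketch_open_handles++;
   return 42;
}

/* Hypothetical stand-in for the res_id lookup; 0 means failure here. */
static uint32_t sketch_handle_to_res_id(uint32_t handle, bool fail)
{
   (void)handle;
   return fail ? 0 : 7;
}

static void sketch_close_handle(uint32_t handle)
{
   (void)handle;
   sketch_open_handles--;
}

/* Returns 0 on success, -1 on failure. */
static int sketch_init_dmabuf(int fd, bool make_lookup_fail)
{
   uint32_t handle = sketch_dmabuf_to_handle(fd);
   uint32_t res_id = sketch_handle_to_res_id(handle, make_lookup_fail);

   if (!res_id) {
      /* The fix: release the handle obtained above before bailing out. */
      sketch_close_handle(handle);
      return -1;
   }
   return 0;
}
```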
Fixes: 6ca192f586 ("turnip: virtio: fix iova leak upon found already imported dmabuf")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
In tu_bo_init, if growing the submit BO list fails, the GEM handle
must be closed. However, bo->gem_handle is only populated later
via compound literal assignment. Use the gem_handle parameter directly
to ensure the correct handle is closed and not leaked.
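
A minimal sketch of why the parameter has to be used: on the early failure path the struct has not been written yet, so its gem_handle field still holds whatever it had before. All sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct sketch_bo {
   uint32_t gem_handle;
};

/* Records the last handle passed to the close helper. */
static uint32_t sketch_last_closed;

static void sketch_gem_close(uint32_t handle)
{
   sketch_last_closed = handle;
}

/* Returns 0 on success, -1 if growing the (imaginary) BO list fails. */
static int sketch_tu_bo_init(struct sketch_bo *bo, uint32_t gem_handle,
                             bool grow_fails)
{
   if (grow_fails) {
      /* bo->gem_handle is not populated yet; close the parameter instead. */
      sketch_gem_close(gem_handle);
      return -1;
   }

   /* The struct is only filled in here, after the failure path above. */
   *bo = (struct sketch_bo){ .gem_handle = gem_handle };
   return 0;
}
```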
Fixes: d67d501af4 ("tu/drm/virtio: Switch to vdrm helper")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
When initializing a BO using a lazy VMA, the iova is provided by
the sparse VMA and was not allocated from the device's VMA heap.
Avoid calling util_vma_heap_free in the error path for such BOs
to prevent heap corruption and potential double-frees.
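
In sketch form, the error path only returns an address range to the heap when the heap handed it out. The helper names and the boolean parameter are hypothetical; the real code decides this from the BO's own state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts calls to the stand-in heap free, to make the behaviour testable. */
static int sketch_heap_frees;

static void sketch_heap_free(uint64_t iova, uint64_t size)
{
   (void)iova;
   (void)size;
   sketch_heap_frees++;
}

/* Error-path cleanup for a BO whose initialization failed. */
static void sketch_bo_init_cleanup(uint64_t iova, uint64_t size,
                                   bool iova_from_lazy_vma)
{
   /* A lazy-VMA iova belongs to the sparse VMA, not to the device heap,
    * so freeing it into the heap would corrupt the heap's bookkeeping. */
   if (!iova_from_lazy_vma)
      sketch_heap_free(iova, size);
}
```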
Fixes: 88d001383a ("tu: Add support for a "lazy" sparse VMA")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
set_iova() was called unconditionally after tu_bo_init(), even on the
failure path where the BO has been zeroed. This would call set_iova()
with res_id 0 and a stale iova, corrupting the iova mapping.
Move set_iova() into the success branch so it is only called when
tu_bo_init() succeeds.
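
The control-flow change amounts to the following sketch, where the sketch_ helpers are hypothetical stand-ins for tu_bo_init() and set_iova():

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Counts calls to the stand-in set_iova. */
static int sketch_set_iova_calls;

static void sketch_set_iova(uint32_t res_id, uint64_t iova)
{
   (void)res_id;
   (void)iova;
   sketch_set_iova_calls++;
}

/* Stand-in for the BO init step; returns 0 on success, -1 on failure. */
static int sketch_init(bool fail)
{
   return fail ? -1 : 0;
}

static int sketch_bo_alloc(uint32_t res_id, uint64_t iova, bool init_fails)
{
   int ret = sketch_init(init_fails);

   /* Only publish the iova when init succeeded; on failure the BO has
    * been zeroed and there is nothing valid to map. */
   if (ret == 0)
      sketch_set_iova(res_id, iova);

   return ret;
}
```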
Fixes: db88a490b8 ("tu: Avoid extraneous set_iova")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
All BOs allocated from vkAllocateMemory are either local BOs or added
to the global BO list. Only BOs allocated internally should be added
to the per-cmdbuf list.
Verified this by doing a full CTS run with amdgpu.debug=0x1.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
The global BO list for app allocations has been enabled by default
since Mesa 25.3 and we didn't find any blockers, so let's make it the
default for real. Note that vkd3d-proton and Zink always used that
path and DXVK started to use it in August 2025 after requiring BDA.
This removes RADV_DEBUG=nobolist which was added only for debugging
purposes since the global BO list was enabled by default for app
allocations.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
It's supported on GFX9+ and on GFX8+ with a specific firmware version.
It's more correct with preemption.
Also rewrite the comment now that we have more information from Marek.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
These are already implemented by common code, so there's nothing to be
done here, really.
A few tests fail due to timeouts. But this seems no different from
other drivers; we just skip fewer WSI tests than most drivers do. Skip
those for now.
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
On U85, both NPU_SET_IFM_BROADCAST and NPU_SET_IFM2_BROADCAST must be
emitted for elementwise operations, matching Vela's GenerateInputBroadcast.
Add calc_broadcast_mode() matching Vela's CalculateBroadcast(): broadcasts
a dimension of shape1 when it is 1 and shape2 is larger, producing a
broadcast_mode bitmask (H=1, W=2, C=4, SCALAR=8).
Split emit_ifm2_broadcast into U65 (legacy bitfields) and U85 paths.
The U85 path emits both IFM_BROADCAST and IFM2_BROADCAST using
calc_broadcast_mode in each direction.
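
The broadcast rule described above can be sketched as follows. This is an illustration under assumptions: the struct layout and all sketch_ names are hypothetical, and the condition for the SCALAR bit is not spelled out in this message, so the sketch never sets it.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as listed in the message: H=1, W=2, C=4, SCALAR=8. */
enum {
   SKETCH_BCAST_H      = 1,
   SKETCH_BCAST_W      = 2,
   SKETCH_BCAST_C      = 4,
   SKETCH_BCAST_SCALAR = 8, /* rule not described here; left unset below */
};

struct sketch_shape {
   uint32_t height, width, depth;
};

/* Set a bit for each dimension of s1 that is 1 while s2 is larger. */
static uint32_t sketch_calc_broadcast_mode(struct sketch_shape s1,
                                           struct sketch_shape s2)
{
   uint32_t mode = 0;

   if (s1.height == 1 && s2.height > 1)
      mode |= SKETCH_BCAST_H;
   if (s1.width == 1 && s2.width > 1)
      mode |= SKETCH_BCAST_W;
   if (s1.depth == 1 && s2.depth > 1)
      mode |= SKETCH_BCAST_C;

   return mode;
}
```

Calling it once as (ifm, ifm2) and once as (ifm2, ifm) corresponds to the "each direction" use for the two broadcast registers.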
Also fix emit_eltwise to call emit_ifm2_precision instead of
emit_ifm_broadcast for U85, which was emitting 0 instead of the
required IFM2_PRECISION register.
Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
Map DRM buffer objects once at resource_create and unmap at
resource_destroy, instead of mapping them in buffer_map where they
were never unmapped. This fixes a virtual memory leak that caused
SIGBUS under heavy workloads by exhausting CMA.
Also remove unused phys_addr and obj_addr fields from ethosu_resource,
and add asserts on pipe_buffer_create return values.
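
The lifetime change can be sketched like this, with malloc/free standing in for the DRM map/unmap calls. All sketch_ names are hypothetical; the point is only that mapping happens once per resource and is paired with an unmap at destroy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Number of mappings currently alive, to make leaks observable. */
static int sketch_live_mappings;

struct sketch_resource {
   void *map;
   size_t size;
};

/* Stand-in for mapping a BO; malloc plays the role of the real mapping. */
static void *sketch_bo_map(size_t size)
{
   void *p = malloc(size);
   if (p)
      sketch_live_mappings++;
   return p;
}

static void sketch_bo_unmap(void *p)
{
   free(p);
   sketch_live_mappings--;
}

/* Map once at creation time. Returns 0 on success, -1 on failure. */
static int sketch_resource_create(struct sketch_resource *r, size_t size)
{
   r->size = size;
   r->map = sketch_bo_map(size);
   return r->map ? 0 : -1;
}

/* buffer_map just hands out the existing mapping; no new mapping is made. */
static void *sketch_buffer_map(struct sketch_resource *r)
{
   return r->map;
}

/* Unmap once at destruction time. */
static void sketch_resource_destroy(struct sketch_resource *r)
{
   sketch_bo_unmap(r->map);
   r->map = NULL;
}
```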
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
For U85-256 with 8-bit IFM, Vela's _uBlockToOpTable restricts which
microblocks are valid per operation type:
{2,2,8} and {4,1,8}: conv, matmul, vectorprod, reducesum, eltwise, resize
{2,1,16}: depthwise, pool, eltwise, reduceminmax, argmax, resize
Mesa's find_ublock() was not enforcing these constraints, allowing
{4,1,8} or {2,2,8} to be selected for depthwise/pooling based on
minimum waste. For depthwise ops with OFM shapes that aligned better
to {4,1,8}, the wrong ublock was chosen, causing incorrect weight
encoding and NPU hangs.
Fix by skipping {4,1,8} and {2,2,8} for depthwise/pooling operations,
matching Vela's operation-validity table.
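
A minimal sketch of the selection with the new restriction follows. The waste metric (padded volume minus actual volume), the axis order of the triples, and every sketch_ name are assumptions made for illustration; only the rule "skip {4,1,8} and {2,2,8} for depthwise/pooling" comes from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum sketch_op {
   SKETCH_OP_CONV,
   SKETCH_OP_DEPTHWISE,
   SKETCH_OP_POOL,
};

/* Candidate microblocks for U85-256 with 8-bit IFM, as listed above. */
static const uint32_t sketch_ublocks[3][3] = {
   { 2, 2, 8 },
   { 4, 1, 8 },
   { 2, 1, 16 },
};

static uint64_t sketch_round_up(uint64_t v, uint64_t m)
{
   return (v + m - 1) / m * m;
}

/* Hypothetical waste metric: elements added by padding to the ublock. */
static uint64_t sketch_waste(const uint32_t shape[3], const uint32_t ub[3])
{
   uint64_t padded = 1, actual = 1;
   for (int i = 0; i < 3; i++) {
      padded *= sketch_round_up(shape[i], ub[i]);
      actual *= shape[i];
   }
   return padded - actual;
}

/* The restriction from the message: depthwise/pooling may only use {2,1,16}. */
static bool sketch_ublock_allowed(enum sketch_op op, const uint32_t ub[3])
{
   bool depthwise_or_pool = op == SKETCH_OP_DEPTHWISE || op == SKETCH_OP_POOL;
   return !(depthwise_or_pool && ub[2] == 8);
}

/* Returns the index of the allowed ublock with the least waste, or -1. */
static int sketch_find_ublock(enum sketch_op op, const uint32_t shape[3])
{
   int best = -1;
   uint64_t best_waste = UINT64_MAX;

   for (int i = 0; i < 3; i++) {
      if (!sketch_ublock_allowed(op, sketch_ublocks[i]))
         continue;
      uint64_t w = sketch_waste(shape, sketch_ublocks[i]);
      if (w < best_waste) {
         best_waste = w;
         best = i;
      }
   }
   return best;
}
```

For a shape that aligns perfectly to {4,1,8}, a convolution still picks that block, while a depthwise or pooling operation is forced onto {2,1,16} even though its waste is higher.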
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>