Commit graph

220224 commits

Author SHA1 Message Date
fed6ed3ab7 Build fixes for vmsilo
2026-03-23 13:52:04 +00:00
fc5f045e1c virgl: query host layout for render-target and scanout resources
The virgl driver only queried the host resource layout (which includes
the real modifier) for PIPE_BIND_SHARED resources. Resources created
via gbm_bo_create() with GBM_BO_USE_RENDERING have PIPE_BIND_RENDER_TARGET
but not PIPE_BIND_SHARED, so the layout query never fired and
gbm_bo_get_modifier() always returned LINEAR (0). When the host GPU
allocates with non-linear tiling (e.g. NVIDIA block-linear), this
caused the host compositor to receive the wrong modifier in Wayland
dmabuf add() calls, producing GL_INVALID_OPERATION.

Broaden the trigger for virgl_resource_async_query_gbm_layout() to also
fire for PIPE_BIND_SCANOUT and PIPE_BIND_RENDER_TARGET resources. The
existing async query and sync mechanisms are unchanged.

Also relax virgl_resource_create_with_modifiers() to accept modifier
lists that don't include LINEAR, since the host always decides the
actual layout. This prevents resource creation failures when clients
request host-native modifiers from the compositor's format table.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 13:32:22 +00:00
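As a rough illustration of the trigger change in fc5f045e1c, the sketch below contrasts the old and new bind-flag checks. The SKETCH_BIND_* values and both function names are placeholders invented for this example; they are not Gallium's PIPE_BIND_* definitions or the virgl driver's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only (assumption). */
enum {
   SKETCH_BIND_RENDER_TARGET = 1u << 0,
   SKETCH_BIND_SCANOUT       = 1u << 1,
   SKETCH_BIND_SHARED        = 1u << 2,
   SKETCH_BIND_SAMPLER_VIEW  = 1u << 3,
   SKETCH_BIND_ALL           = (1u << 4) - 1,
};

/* Old behaviour as described: only shared resources query the host layout. */
static bool sketch_wants_layout_query_old(uint32_t bind)
{
   assert((bind & ~(uint32_t)SKETCH_BIND_ALL) == 0);
   return (bind & SKETCH_BIND_SHARED) != 0;
}

/* New behaviour as described: scanout and render-target resources also
 * trigger the query, so the real modifier can be reported. */
static bool sketch_wants_layout_query_new(uint32_t bind)
{
   assert((bind & ~(uint32_t)SKETCH_BIND_ALL) == 0);
   return (bind & (SKETCH_BIND_SHARED |
                   SKETCH_BIND_SCANOUT |
                   SKETCH_BIND_RENDER_TARGET)) != 0;
}
```

Under these assumptions, a resource carrying only the render-target flag is skipped by the old check and picked up by the new one, which matches the gbm_bo_create() case in the commit message.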
Ryan Zhang
760ac320be panvk: trivial fix to remove repeated assignment
Fixes: c0d9827 ("panvk: Use WB mappings for the global RW and executable memory pools")

Signed-off-by: Ryan Zhang <ryan.zhang@nxp.com>
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40564>
2026-03-23 11:47:39 +00:00
kingstom.chen
5a7f4c62d8 radv/rt: only run move_rt_instructions() for CPS shaders
move_rt_instructions() only makes sense for CPS recursive shaders, where
later rt_trace_ray calls can overwrite the current shader's RT system
values.

Running it on the function-call path can hoist load_hit_attrib_amd
above merged intersection writes, which corrupts any-hit
hitAttributeEXT. Move the pass into the existing CPS-only
non-intersection branch before nir_lower_shader_calls().

Fixes: c5d796c902 ("radv/rt: Use function call structure in NIR lowering")
Closes: #15074

Reviewed-by: Konstantin Seurer <konstantin.seurer@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40531>
2026-03-23 11:24:07 +00:00
Emre Cecanpunar
c60e5df798 aco: drop optimizer peephole TODO comment
The remaining items are either handled elsewhere or unlikely to be
implemented in the optimizer.

Signed-off-by: Emre Cecanpunar <emreleno@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40497>
2026-03-23 11:03:59 +00:00
Valentine Burley
f2c89f0188 tu/drm/virtio: Fix GEM handle leak on failed dmabuf res_id lookup
When vdrm_handle_to_res_id fails in virtio_bo_init_dmabuf, the handle
obtained from vdrm_dmabuf_to_handle was leaked.
Closing the handle is safe despite the lack of vdrm refcounting
because dma_bo_lock is held and already-imported BOs return early.
At this point, we are the sole holder of the handle.

While here, use the local vdrm variable consistently.

Fixes: 6ca192f586 ("turnip: virtio: fix iova leak upon found already imported dmabuf")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:31 +00:00
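The error-path pattern behind f2c89f0188 can be sketched as follows. Everything prefixed with sketch_ is a hypothetical stand-in written for this example, not the vdrm API; only the control flow (close the handle when the res_id lookup fails) reflects the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical device that counts open handles so leaks are observable. */
struct sketch_dev {
   int open_handles;
   bool lookup_fails;
};

static uint32_t sketch_dmabuf_to_handle(struct sketch_dev *dev)
{
   dev->open_handles++;
   return 42; /* arbitrary non-zero handle */
}

static uint32_t sketch_handle_to_res_id(struct sketch_dev *dev, uint32_t handle)
{
   (void)handle;
   return dev->lookup_fails ? 0 : 7; /* 0 signals failure in this sketch */
}

static void sketch_handle_close(struct sketch_dev *dev, uint32_t handle)
{
   (void)handle;
   assert(dev->open_handles > 0);
   dev->open_handles--;
}

static int sketch_import_dmabuf(struct sketch_dev *dev, uint32_t *res_id_out)
{
   uint32_t handle = sketch_dmabuf_to_handle(dev);
   uint32_t res_id = sketch_handle_to_res_id(dev, handle);

   if (!res_id) {
      /* Without this close, the handle obtained above would leak. */
      sketch_handle_close(dev, handle);
      return -1;
   }

   /* On success the handle stays open and is owned by the imported BO. */
   *res_id_out = res_id;
   return 0;
}
```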
Valentine Burley
316d9b0209 tu/drm/virtio: Fix GEM handle leak in tu_bo_init error path
In tu_bo_init, if growing the submit BO list fails, the GEM handle
must be closed. However, bo->gem_handle is only populated later
via compound assignment. Use the gem_handle parameter directly
to ensure the correct handle is closed and not leaked.

Fixes: d67d501af4 ("tu/drm/virtio: Switch to vdrm helper")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:31 +00:00
Valentine Burley
eb7897f57b tu/drm/virtio: Do not free iova from heap for lazy BOs
When initializing a BO using a lazy VMA, the iova is provided by
the sparse VMA and was not allocated from the device's VMA heap.
Avoid calling util_vma_heap_free in the error path for such BOs
to prevent heap corruption and potential double-frees.

Fixes: 88d001383a ("tu: Add support for a "lazy" sparse VMA")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:31 +00:00
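A minimal sketch of the ownership rule in eb7897f57b: the error path should return an iova to the heap only if the heap handed it out. The types and function names are hypothetical and do not mirror util_vma_heap or the turnip code.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical heap that only counts live allocations. */
struct sketch_heap {
   int live_allocs;
};

static void sketch_heap_free(struct sketch_heap *heap)
{
   /* Freeing something the heap never allocated would trip this. */
   assert(heap->live_allocs > 0);
   heap->live_allocs--;
}

/* Error-path cleanup: an iova that came from a lazy sparse VMA was never
 * allocated from the heap, so it must not be freed into it. */
static void sketch_bo_init_cleanup(struct sketch_heap *heap, bool iova_from_lazy_vma)
{
   if (!iova_from_lazy_vma)
      sketch_heap_free(heap);
}
```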
Valentine Burley
f1366ca144 tu/drm/virtio: Avoid freeing zombified tu_sparse_vma
This is d3cedd2fa5 ("tu/drm: msm's has_set_iova codepath should avoid
freeing zombified tu_sparse_vma") but for virtio.

Fixes: 764b3d9161 ("tu: Implement transient attachments and lazily allocated memory")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:30 +00:00
Valentine Burley
7a96bc3187 tu/drm/virtio: Move set_iova into success path of virtio_bo_init_dmabuf
set_iova() was called unconditionally after tu_bo_init(), even on the
failure path where the BO has been zeroed. This would call set_iova()
with res_id 0 and a stale iova, corrupting the iova mapping.

Move set_iova() into the success branch so it is only called when
tu_bo_init() succeeds.

Fixes: db88a490b8 ("tu: Avoid extraneous set_iova")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:30 +00:00
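The reordering described in 7a96bc3187 might look roughly like this. All sketch_ names are invented for the example; the only behaviour taken from the commit message is that the BO is zeroed on failure and that set_iova() should run only when initialization succeeds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

struct sketch_bo {
   uint32_t res_id;
   uint64_t iova;
};

/* Records set_iova calls so the test can observe them. */
struct sketch_log {
   int set_iova_calls;
   uint32_t last_res_id;
};

static int sketch_bo_init(struct sketch_bo *bo, uint32_t res_id,
                          uint64_t iova, bool fail)
{
   if (fail) {
      memset(bo, 0, sizeof(*bo)); /* failure leaves a zeroed BO */
      return -1;
   }
   bo->res_id = res_id;
   bo->iova = iova;
   return 0;
}

static void sketch_set_iova(struct sketch_log *log, uint32_t res_id, uint64_t iova)
{
   (void)iova;
   assert(res_id != 0); /* res_id 0 would be the bug from the commit */
   log->set_iova_calls++;
   log->last_res_id = res_id;
}

static int sketch_import(struct sketch_log *log, struct sketch_bo *bo,
                         uint32_t res_id, uint64_t iova, bool fail)
{
   int ret = sketch_bo_init(bo, res_id, iova, fail);

   /* Only touch the iova mapping when the BO was initialized. */
   if (ret == 0)
      sketch_set_iova(log, bo->res_id, bo->iova);

   return ret;
}
```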
Valentine Burley
28e3fb7052 tu/drm/virtio: Add missing lock to virtio_bo_init_dmabuf
Lock vma mutex when freeing iova in virtio_bo_init_dmabuf.

Fixes: f17c5297d7 ("tu: Add virtgpu support")
Signed-off-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40507>
2026-03-23 10:17:30 +00:00
Samuel Pitoiset
9f224289b0 radv: remove adding a BO to the per-cmdbuf list when unnecessary
All BOs allocated from vkAllocateMemory are either local BOs or added
to the global BO list. Only BOs allocated internally should be added
to the per-cmdbuf list.

Verified this by doing a full CTS run with amdgpu.debug=0x1.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:41 +00:00
Samuel Pitoiset
fb195bd6bd radv/amdgpu: remove the virtual BOs tracking logic
All BOs allocated by applications are local BOs, so this can be
removed completely.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:41 +00:00
Samuel Pitoiset
2cf84eedb9 radv: stop allocating an array of BOs for descriptors
They are no longer added to the per cmdbuf BO list.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:41 +00:00
Samuel Pitoiset
375c82a27e radv: clean up functions that write descriptors
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:41 +00:00
Samuel Pitoiset
b24c18667d radv: remove radv_device::use_global_bo_list
This is always TRUE now.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:40 +00:00
Samuel Pitoiset
09f83982e2 radv: stop allowing users to disable the global BO list
The global BO list for app allocations has been enabled by default
since Mesa 25.3 and we didn't find any blockers, so let's make it the
default for real. Note that vkd3d-proton and Zink always used that
path and DXVK started to use it in August 2025 after requiring BDA.

This removes RADV_DEBUG=nobolist which was added only for debugging
purposes since the global BO list was enabled by default for app
allocations.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40466>
2026-03-23 09:50:40 +00:00
Georg Lehmann
559a35dcb3 aco: skip fract for sin/cos on gfx6-8 if the src is already in range
Foz-DB Polaris10:
Totals from 1301 (1.86% of 69950) affected shaders:
Instrs: 1447217 -> 1445610 (-0.11%); split: -0.11%, +0.00%
CodeSize: 7775988 -> 7769588 (-0.08%); split: -0.08%, +0.00%
SGPRs: 101712 -> 101776 (+0.06%)
SpillSGPRs: 931 -> 927 (-0.43%)
Latency: 16119433 -> 16115293 (-0.03%); split: -0.03%, +0.01%
InvThroughput: 9605952 -> 9577042 (-0.30%); split: -0.31%, +0.01%
VClause: 24591 -> 24593 (+0.01%); split: -0.01%, +0.02%
SClause: 29656 -> 29655 (-0.00%)
Copies: 133968 -> 134001 (+0.02%); split: -0.01%, +0.03%
VALU: 1157855 -> 1156235 (-0.14%)
SALU: 124626 -> 124639 (+0.01%); split: -0.00%, +0.01%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40545>
2026-03-23 09:27:32 +00:00
Samuel Pitoiset
57e2b272d5 radv: emit PFP_SYNC_ME right after STRMOUT_BUFFER_UPDATE is emitted
This is likely less frequent than the draws, and it's only needed
when the VA is used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:23 +00:00
Samuel Pitoiset
a0471ddad8 radv: update color/ds clear metadata in ME
It's probably slightly better because loading the clear registers
is likely more frequent than updating them.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:23 +00:00
Samuel Pitoiset
fd019c24e9 radv: remove useless PFP_SYNC_ME when loading color/ds metadata on GFX6-7
WRITE_DATA is emitted in PFP and the COPY_DATA in ME, so this shouldn't
be necessary.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:23 +00:00
Samuel Pitoiset
2751a427e1 radv: use LOAD_CONTEXT_REG_INDEX when supported for streamout
It's supported on GFX9+ and on GFX8+ with a specific fw version. It's
more correct with preemption.

Also rewrite the comment now that we got more information from Marek.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:22 +00:00
Samuel Pitoiset
dfbed0d016 ac/cmdbuf: add an assertion for COPY_DATA+PFP with registers
This shouldn't be used.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:22 +00:00
Samuel Pitoiset
42de2fc38e ac/gpu_info: remove a TODO about LOAD_CONTEXT_REG on GFX6-7
This doesn't seem supported at all.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40341>
2026-03-23 08:40:22 +00:00
Erik Faye-Lund
8e274e5105 pan/ci: move flake from fails to flakes file
This test passed in this job:

https://gitlab.freedesktop.org/mesa/mesa/-/jobs/95625942

...but then flaked in this job, from the same pipeline:

https://gitlab.freedesktop.org/mesa/mesa/-/jobs/95634562

So, it seems like this is a flake, and not a normal fail. Let's mark it
as such.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
2026-03-23 08:00:39 +00:00
Erik Faye-Lund
9ec387efb1 panvk: advertise wsi maintenance extensions
These are already implemented by common code, so there's nothing to be
done here, really.

A few tests fail due to timeouts, but this seems no different from
other drivers; we just skip fewer WSI tests than most drivers do. Skip
those for now.

Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
2026-03-23 08:00:39 +00:00
Erik Faye-Lund
59c1fb8284 panvk: fix incorrect sorting
Reviewed-by: Christian Gmeiner <cgmeiner@igalia.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40502>
2026-03-23 08:00:38 +00:00
Tomeu Vizoso
15f0c245c8 ethosu: Set test baseline for the Corstone 1000 (U85)
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
Tomeu Vizoso
ac0d6e7b7c ethosu: Properly emit IFM_BROADCAST and IFM2_BROADCAST on U85
On U85, both NPU_SET_IFM_BROADCAST and NPU_SET_IFM2_BROADCAST must be
emitted for elementwise operations, matching Vela's GenerateInputBroadcast.

Add calc_broadcast_mode() matching Vela's CalculateBroadcast(): broadcasts
a dimension of shape1 when it is 1 and shape2 is larger, producing a
broadcast_mode bitmask (H=1, W=2, C=4, SCALAR=8).

Split emit_ifm2_broadcast into U65 (legacy bitfields) and U85 paths.
The U85 path emits both IFM_BROADCAST and IFM2_BROADCAST using
calc_broadcast_mode in each direction.

Also fix emit_eltwise to call emit_ifm2_precision instead of
emit_ifm_broadcast for U85, which was emitting 0 instead of the
required IFM2_PRECISION register.

Signed-off-by: Tomeu Vizoso <tomeu@tomeuvizoso.net>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
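The broadcast rule stated in ac0d6e7b7c (a dimension of shape1 is broadcast when it is 1 and the matching dimension of shape2 is larger) can be sketched as below. The struct and function names are made up for this example, and the sketch deliberately never sets the SCALAR bit because the commit message does not say when it applies.

```c
#include <assert.h>
#include <stdint.h>

/* Bit values as listed in the commit message. */
enum {
   SKETCH_BCAST_H      = 1,
   SKETCH_BCAST_W      = 2,
   SKETCH_BCAST_C      = 4,
   SKETCH_BCAST_SCALAR = 8, /* defined for completeness, unused here */
};

struct sketch_shape {
   uint32_t h, w, c;
};

/* Returns the mask of dimensions in which s1 must be broadcast to match s2. */
static uint32_t sketch_broadcast_mode(struct sketch_shape s1, struct sketch_shape s2)
{
   uint32_t mode = 0;

   assert(s1.h > 0 && s1.w > 0 && s1.c > 0);

   if (s1.h == 1 && s2.h > 1)
      mode |= SKETCH_BCAST_H;
   if (s1.w == 1 && s2.w > 1)
      mode |= SKETCH_BCAST_W;
   if (s1.c == 1 && s2.c > 1)
      mode |= SKETCH_BCAST_C;

   return mode;
}
```

Calling the function once as (ifm, ifm2) and once as (ifm2, ifm) corresponds to the "in each direction" wording for the two registers.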
Tomeu Vizoso
2a6d181bc6 ethosu: Fix scalar ADD on U85
They added new registers to the command stream, with a new bitfield
layout.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:59 +00:00
Tomeu Vizoso
818e1835d7 ethosu: map BOs at creation time and unmap at destruction
Map DRM buffer objects once at resource_create and unmap at
resource_destroy, instead of mapping them in buffer_map where they
were never unmapped. This fixes a virtual memory leak that caused
SIGBUS under heavy workloads by exhausting CMA.

Also remove unused phys_addr and obj_addr fields from ethosu_resource,
and add asserts on pipe_buffer_create return values.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
Tomeu Vizoso
f9cd399eb0 ethosu: Fix ublock selection for 8-bit depthwise/pooling on U85-256
For U85-256 with 8-bit IFM, Vela's _uBlockToOpTable restricts which
microblocks are valid per operation type:

  {2,2,8}  and {4,1,8}:  conv, matmul, vectorprod, reducesum, eltwise, resize
  {2,1,16}:              depthwise, pool, eltwise, reduceminmax, argmax, resize

Mesa's find_ublock() was not enforcing these constraints, allowing
{4,1,8} or {2,2,8} to be selected for depthwise/pooling based on
minimum waste. For depthwise ops with OFM shapes that aligned better
to {4,1,8}, the wrong ublock was chosen, causing incorrect weight
encoding and NPU hangs.

Fix by skipping {4,1,8} and {2,2,8} for depthwise/pooling operations,
matching Vela's operation-validity table.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
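The constraint from f9cd399eb0 amounts to filtering the candidate microblocks by operation type before minimizing waste. The sketch below assumes the triples are ordered (width, height, depth) and uses padded volume minus actual volume as the waste metric; both are assumptions for illustration, as are all the sketch_ names, and only three operation kinds from the table are modelled.

```c
#include <assert.h>
#include <stddef.h>

enum sketch_op { SKETCH_OP_CONV, SKETCH_OP_DEPTHWISE, SKETCH_OP_POOL };

struct sketch_ublock {
   int w, h, d;        /* assumed component order */
   unsigned valid_ops; /* bitmask of 1u << enum sketch_op */
};

/* U85-256, 8-bit IFM entries from the commit message's table. */
static const struct sketch_ublock sketch_ublocks[] = {
   { 2, 2, 8,  1u << SKETCH_OP_CONV },
   { 4, 1, 8,  1u << SKETCH_OP_CONV },
   { 2, 1, 16, (1u << SKETCH_OP_DEPTHWISE) | (1u << SKETCH_OP_POOL) },
};

static int sketch_round_up(int x, int m)
{
   return (x + m - 1) / m * m;
}

static long sketch_waste(const struct sketch_ublock *ub, int w, int h, int d)
{
   long padded = (long)sketch_round_up(w, ub->w) *
                 sketch_round_up(h, ub->h) *
                 sketch_round_up(d, ub->d);
   return padded - (long)w * h * d;
}

/* Returns the index of the least wasteful valid ublock, or -1 if none. */
static int sketch_find_ublock(enum sketch_op op, int w, int h, int d)
{
   int best = -1;
   long best_waste = 0;

   assert(w > 0 && h > 0 && d > 0);

   for (size_t i = 0; i < sizeof(sketch_ublocks) / sizeof(sketch_ublocks[0]); i++) {
      if (!(sketch_ublocks[i].valid_ops & (1u << op)))
         continue; /* skip ublocks that are invalid for this operation */

      long waste = sketch_waste(&sketch_ublocks[i], w, h, d);
      if (best < 0 || waste < best_waste) {
         best = (int)i;
         best_waste = waste;
      }
   }
   return best;
}
```

With a 4x1x8 output, {4,1,8} has zero waste, yet the depthwise case still lands on {2,1,16} because the filter runs first.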
Tomeu Vizoso
dc36a32214 ethosu: Implement simplified scaling for U85
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
Tomeu Vizoso
dbfbc6eff4 ethosu: Emission changes for U85
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
Tomeu Vizoso
42082266f0 ethosu: Refactor ethosu_allocate_feature_map to return the new offset
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:58 +00:00
Tomeu Vizoso
fc70406bdd ethosu: Expand pooling to U85
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:57 +00:00
Tomeu Vizoso
a735fe040b ethosu: Improve parallelism by detecting overlaps for BLOCKDEP
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:57 +00:00
Tomeu Vizoso
2cf3d0b273 ethosu: Add a separate scheduler for the U85
As the performance details have changed quite a bit.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:57 +00:00
Tomeu Vizoso
82d4f21106 ethosu: Don't emit redundant state changes
Keep track of the state and only emit meaningful changes.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
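One common way to avoid redundant state changes, in the spirit of 82d4f21106, is to shadow the last value written to each register and skip writes that would not change it. This is a generic sketch with invented names and an arbitrary register count, not the ethosu driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_REGS 8 /* arbitrary size for the example */

struct sketch_emitter {
   uint32_t shadow[SKETCH_NUM_REGS]; /* last value emitted per register */
   bool valid[SKETCH_NUM_REGS];      /* whether shadow[] holds a real value */
   unsigned emitted;                 /* number of writes actually emitted */
};

/* Emits a register write only if it changes the tracked state.
 * Returns true when a command was emitted. */
static bool sketch_emit_reg(struct sketch_emitter *e, unsigned reg, uint32_t value)
{
   assert(reg < SKETCH_NUM_REGS);

   if (e->valid[reg] && e->shadow[reg] == value)
      return false; /* state unchanged, nothing to emit */

   e->shadow[reg] = value;
   e->valid[reg] = true;
   e->emitted++;
   return true;
}
```

The valid[] array ensures the first write of a register is always emitted, even when the value happens to be zero.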
Tomeu Vizoso
8872f5eea4 ethosu: Add debug option for forcing U85 generation
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
Tomeu Vizoso
45fb8b99df ethosu: Invert lowering order of concatenation suboperations
Just so we match the order in which Vela assigns offsets to the FMs so
it's easier to diff cmdstream dumps.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:56 +00:00
Tomeu Vizoso
d66d2c05d3 ethosu: Switch to the weight encoder from Regor
We vendor the encoder used in the Regor compiler in Vela, and replace
the previous one that was used by the Python compiler and doesn't
support U85.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
410d74e078 ethosu: Compute is_partkernel during scheduling
As we need it for encoding the weights.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
3ade0a4dd6 ethosu: Make the UBlock sizes arch-specific
As U85 has a different configuration.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:55 +00:00
Tomeu Vizoso
91137a9327 ethosu: Let maxblockdeps be arch-specific
As U85 can have up to 7.

Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:54 +00:00
Tomeu Vizoso
0af37552a7 ethosu: Add U85 fields, these are compatible with the U65
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:54 +00:00
Tomeu Vizoso
4388f602ed teflon: Fix leak of tensor structs
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:53 +00:00
Tomeu Vizoso
47aa30276e ethosu: Update test expectations
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39611>
2026-03-23 07:45:53 +00:00
Marek Olšák
533b962b29 driconf: rename sha1 option to blake3
It's already blake3 except for the name.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00
Marek Olšák
94bcf968f4 driconf: unbreak profiles for "runner" by merging them and ignoring sha1s
SHA1 is no longer supported.

Acked-by: Pierre-Eric Pelloux-Prayer <pierre-eric.pelloux-prayer@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/40383>
2026-03-23 07:03:28 +00:00