Commit graph

217438 commits

Author SHA1 Message Date
Samuel Pitoiset
3e7f38efa8 radv: always fast-clear color image with comp-to-single on GFX11-11.5
This is possible because there is no comp-to-reg and no FCE. This probably
helps a bunch on GFX11+ if GENERAL is widely used with color images, which
is likely the case on GFX11-11.5 since VK_KHR_unified_image_layout.

GFX10-10.3 could also benefit from this but some MSAA with DCC
fast-clears are currently broken and they need to be fixed first.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39396>
2026-01-20 15:15:35 +00:00
Georg Lehmann
711598982a ac/nir,radv: remove ac_nir_opt_pack_half
Foz-DB Navi21:
Totals from 2937 (3.01% of 97591) affected shaders:
Instrs: 1908695 -> 1908291 (-0.02%); split: -0.02%, +0.00%
CodeSize: 10232148 -> 10229224 (-0.03%); split: -0.03%, +0.01%
VGPRs: 142168 -> 142080 (-0.06%)
Latency: 8052895 -> 8052622 (-0.00%); split: -0.01%, +0.01%
InvThroughput: 2550330 -> 2549602 (-0.03%); split: -0.03%, +0.01%
VClause: 32601 -> 32603 (+0.01%); split: -0.01%, +0.02%
Copies: 118570 -> 118587 (+0.01%); split: -0.04%, +0.05%
PreVGPRs: 110090 -> 110082 (-0.01%)
VALU: 1468422 -> 1468043 (-0.03%); split: -0.03%, +0.00%
SALU: 173858 -> 173828 (-0.02%)

Foz-DB Navi48:
Totals from 4196 (4.30% of 97637) affected shaders:
MaxWaves: 118678 -> 118680 (+0.00%); split: +0.01%, -0.01%
Instrs: 3627604 -> 3624093 (-0.10%); split: -0.10%, +0.00%
CodeSize: 18956684 -> 18939824 (-0.09%); split: -0.09%, +0.01%
VGPRs: 225624 -> 225060 (-0.25%); split: -0.26%, +0.01%
Latency: 11856204 -> 11857280 (+0.01%); split: -0.01%, +0.02%
InvThroughput: 2388584 -> 2389178 (+0.02%); split: -0.01%, +0.03%
VClause: 50409 -> 50410 (+0.00%)
SClause: 64701 -> 64699 (-0.00%)
Copies: 208353 -> 207522 (-0.40%); split: -0.43%, +0.03%
PreVGPRs: 161314 -> 161306 (-0.00%)
VALU: 2345604 -> 2345172 (-0.02%); split: -0.02%, +0.00%
SALU: 391466 -> 388723 (-0.70%)
VOPD: 1788 -> 1806 (+1.01%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
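Note on reading the Foz-DB totals above: each line has the form "Name: before -> after (change%)", where the change is relative to the before value. The following is a minimal sketch for re-deriving that percentage from a pasted line. The regular expression and the helper name check_stat are illustrative assumptions, not part of Mesa or of the fossil-db report tooling, and the "split:" suffix is ignored here.

```python
import re

# Matches the leading "Name: before -> after (+x.xx%)" part of a totals line.
# Anything after the closing parenthesis (e.g. "; split: ...") is ignored.
STAT_RE = re.compile(r"^(\w+): (\d+) -> (\d+) \(([+-]\d+\.\d+)%\)")


def check_stat(line):
    """Return (name, percent_change, matches_reported) for one totals line."""
    m = STAT_RE.match(line)
    if m is None:
        raise ValueError(f"not a totals line: {line!r}")
    name = m.group(1)
    before = int(m.group(2))
    after = int(m.group(3))
    reported = m.group(4)
    # Relative change in percent, measured against the "before" value.
    change = (after - before) / before * 100.0
    return name, change, f"{change:+.2f}" == reported


if __name__ == "__main__":
    for line in (
        "Instrs: 1908695 -> 1908291 (-0.02%); split: -0.02%, +0.00%",
        "SALU: 391466 -> 388723 (-0.70%)",
        "VOPD: 1788 -> 1806 (+1.01%)",
    ):
        print(check_stat(line))
```

A negative change is an improvement for counters such as Instrs, CodeSize or VGPRs, while for counters such as MaxWaves or VOPD a higher value is generally the desirable direction.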
Georg Lehmann
939b4a6476 aco/optimizer: apply v_cvt_pkrtz_f16_f32 as fma_mix to operands
Foz-DB Navi21:
Totals from 2085 (2.14% of 97591) affected shaders:
Instrs: 4880879 -> 4882355 (+0.03%); split: -0.04%, +0.07%
CodeSize: 26869332 -> 26881744 (+0.05%); split: -0.02%, +0.06%
VGPRs: 93944 -> 94160 (+0.23%); split: -0.06%, +0.29%
Latency: 40035558 -> 40035595 (+0.00%); split: -0.02%, +0.02%
InvThroughput: 10333800 -> 10329093 (-0.05%); split: -0.06%, +0.01%
VClause: 139147 -> 139148 (+0.00%)
Copies: 454527 -> 454656 (+0.03%); split: -0.00%, +0.03%
VALU: 3214838 -> 3211105 (-0.12%)

Foz-DB Navi48:
Totals from 2349 (2.41% of 97637) affected shaders:
Instrs: 6471998 -> 6471817 (-0.00%); split: -0.05%, +0.05%
CodeSize: 34793372 -> 34808748 (+0.04%); split: -0.02%, +0.06%
VGPRs: 141804 -> 142560 (+0.53%)
Latency: 45225910 -> 45226000 (+0.00%); split: -0.01%, +0.01%
InvThroughput: 9152634 -> 9149850 (-0.03%); split: -0.04%, +0.01%
VClause: 148536 -> 148537 (+0.00%)
Copies: 527206 -> 527336 (+0.02%); split: -0.01%, +0.03%
VALU: 3491701 -> 3487347 (-0.12%); split: -0.12%, +0.00%
VOPD: 669 -> 683 (+2.09%); split: +2.69%, -0.60%

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
Georg Lehmann
c6b74705dd aco/optimizer: support fma_mix with rtz
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:23 +00:00
Georg Lehmann
6b9d28ab9b aco/insert_fp_mode: insert fp mode in reverse
This allows us to skip the mode set by changing
the initial mode in the command stream.

Foz-DB Navi48:
Totals from 14 (0.02% of 82405) affected shaders:
Latency: 79417 -> 79438 (+0.03%)

Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:22 +00:00
Georg Lehmann
7212a75c5e aco/insert_fp_mode: exclude some instructions that will never round
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:22 +00:00
Georg Lehmann
d6356191b9 aco: add fma_mix opcodes with rtz fp16 rounding
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:22 +00:00
Georg Lehmann
af68c08e88 radeonsi: only override float_mode for llvm
aco implements the same logic, and in the future it will make changes to
config->float_mode to avoid unnecessary s_setreg.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
2026-01-20 14:48:21 +00:00
Rhys Perry
f0f53e624c aco/tests: remove vcc definitions from p_call
The version of instruction selection that got merged doesn't have vcc
definitions, so this shouldn't either.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
2026-01-20 13:33:16 +00:00
Rhys Perry
adf5c7cba4 aco: remove dead p_call code in live_var_analysis
The version of instruction selection that got merged doesn't have vcc
definitions.

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
2026-01-20 13:33:16 +00:00
Rhys Perry
ba798120c6 aco/ra: split blocking vectors if needed when handling fixed operands
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
2026-01-20 13:33:16 +00:00
Rhys Perry
5ebefceb42 aco/ra: move split_blocking_vectors higher
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
2026-01-20 13:33:16 +00:00
Lionel Landwerlin
a7d7492f10 anv: enable debug printfs on internal shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39399>
2026-01-20 12:19:41 +00:00
Lionel Landwerlin
61b35c9d2b anv: remove all kinds of useless info for internal shaders
Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
Reviewed-by: Tapani Pälli <tapani.palli@intel.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39399>
2026-01-20 12:19:41 +00:00
Pohsiang (John) Hsu
487da8f248 mediafoundation: set rc mode in GetCodecPrivateData for 2 pass rc mode
Reviewed-by: Yubo Xie <yuboxie@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39402>
2026-01-20 12:02:03 +00:00
Pohsiang (John) Hsu
581ffd1450 d3d12: fix slice support for setting number of coding units per slice
Reviewed-by: Sil Vilerino <sivileri@microsoft.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39402>
2026-01-20 12:02:03 +00:00
Icenowy Zheng
b61dbc98fd nir/algebraic: fix Python-3.10-incompatible syntax
Using a string literal enclosed in the same type of quotation marks
as the outer f-string isn't supported on Python 3.10, which is
currently still under security maintenance.

This leads to a syntax error when building Mesa with Python 3.10.

Fix this by changing these string literals' quotation marks to '' (as
the outer f-string uses "").

Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14673
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39372>
2026-01-20 11:14:41 +00:00
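Note on the f-string fix in b61dbc98fd: the sketch below illustrates the quoting rule the commit message describes. It is not the actual nir/algebraic code; the opcode dictionary and the parses helper are hypothetical names made up for this example. Reusing the outer quote type inside a replacement field is only accepted from Python 3.12 onward (PEP 701), so the problematic form is kept as source text and checked with compile() to keep the example loadable on older interpreters.

```python
import sys

# Hypothetical data, only for illustration.
opcode = {"name": "fadd", "bits": 32}

# Portable form: the inner literals use '' while the outer f-string uses "".
portable = f"{opcode['name']}{opcode['bits']}"

# Same quote type inside and outside. Stored as text so that this file
# itself still parses on interpreters that reject the construct.
reused_quotes_src = 'f"{opcode["name"]}{opcode["bits"]}"'


def parses(src):
    """Return True if the running interpreter accepts src as an expression."""
    try:
        compile(src, "<example>", "eval")
        return True
    except SyntaxError:
        return False


if __name__ == "__main__":
    print(portable)
    print(sys.version_info[:2], parses(reused_quotes_src))
```

On Python 3.10 the second print reports False for the reused-quotes form, which corresponds to the build failure described in the commit; the alternating-quotes form works on every Python 3 version that supports f-strings.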
Rhys Perry
24fe4a5b58 aco/ra: copy precolor affinities to p_create_vector/p_split_vector
fossil-db (navi31):
Totals from 7 (0.01% of 84369) affected shaders:
Instrs: 2742 -> 2704 (-1.39%); split: -1.82%, +0.44%
CodeSize: 15300 -> 15052 (-1.62%); split: -1.93%, +0.31%
VGPRs: 516 -> 504 (-2.33%)
Latency: 12478 -> 12504 (+0.21%); split: -0.24%, +0.45%
InvThroughput: 2350 -> 2300 (-2.13%)
Copies: 350 -> 272 (-22.29%)
VALU: 1626 -> 1592 (-2.09%)
VOPD: 280 -> 236 (-15.71%)

Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39387>
2026-01-20 10:53:18 +00:00
Pierre-Eric Pelloux-Prayer
f5f84e6739 radeonsi: add asserts to validate emit functions use of atoms
emit functions shouldn't dirty any atom.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
Pierre-Eric Pelloux-Prayer
0efe11e84e radeonsi/sqtt: restore barrier_flags in si_sqtt_init_cs
si_sqtt_start / si_sqtt_stop use emit_barrier which clears barrier_flags.
Since these functions are used to build an auxiliary cs which will only
be emitted later (on sqtt enablement/disablement) it shouldn't clear
the global barrier_flags value.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
Pierre-Eric Pelloux-Prayer
3bc60e1bb0 radeonsi: add extra flags param to si_emit_barrier_direct
Most callers want to add new flags to barrier_flags, so add
a parameter.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:28 +00:00
Pierre-Eric Pelloux-Prayer
9175388740 radeonsi: add a si_clear_and_set_barrier_flags helper
Same as si_set_barrier_flags except it can be used to clear
some barriers first.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:27 +00:00
Pierre-Eric Pelloux-Prayer
db4b1cdb3b radeonsi: fix references to sctx->flags in documentation
It was renamed barrier_flags.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:26 +00:00
Pierre-Eric Pelloux-Prayer
c77828c8e9 radeonsi: add a si_set_barrier_flags helper
The pattern:

  sctx->barrier_flags |= ...;
  si_mark_atom_dirty(sctx, &sctx->atoms.s.barrier);

is used a lot, let's add an inline helper. This prevents
forgetting the call to si_mark_atom_dirty.

si_upload_bindless_descriptors is special because we're
already in the emit phase so we shouldn't dirty barrier
again.

Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
2026-01-20 09:56:26 +00:00
Christian Gmeiner
ef860bcaa1 pvr/ci: Add dEQP-VK testing for BXS-4-64 on TI AM68 SK
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Christian Gmeiner
2386770815 ci: Build imagination vulkan driver
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Christian Gmeiner
a0a87eb88e ci: Describe imagination farm
Signed-off-by: Christian Gmeiner <cgmeiner@igalia.com>
Acked-by: Frank Binns <frank.binns@imgtec.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39356>
2026-01-20 09:19:16 +00:00
Rob Clark
a9f05399ae tu: gen8 support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
77e83d1449 tu: gen8 sampler support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
5b40d98388 tu: Add helper to set render mode
Make it less awkward to deal with gen6/7 vs gen8 differences.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:33 +00:00
Rob Clark
eee7a6fb35 tu: gen8 descriptor support
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
039e21fde8 tu: Support acceleration_structure for wave64
Gen8 replaces wave128 with double dispatch wave64, and so will need
smaller subgroup sizes.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
380c79c923 ir3: Limit 64b atomic 16b offset quirk to a7xx
This was fixed in gen8.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
1dc7d0ade9 ir3: Skip shading_rate lowering when unneeded
Some newer gen8 devices (like a840/kaanapali, but not x2-85 which is
otherwise similar) flip the hw shading rate value around to match
vulkan/gl instead of DX.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:32 +00:00
Rob Clark
6f1faceb6a ir3: Avoid narrowing int conversions from GPR on SALU
Narrowing integer conversions on SALU with GPR src do not behave as one
would expect on gen8, so avoid them.  This does not apply to uGPR srcs
or float conversions.

See, for example:
dEQP-VK.glsl.builtin.function.integer.bitcount.int_highp_compute

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
7cb890fe1b freedreno/registers: Update gen8 VRS registers
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
74484da82f freedreno/registers: Update gen8 FDM regs
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:31 +00:00
Rob Clark
635410f749 freedreno/registers: Fix py array reg offsets
Otherwise the first reg in the array ends up with offset=0, which
signals the end of parsing the magic_raw regs.  Which is not the
desired outcome :-)

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
Rob Clark
f0ada848e5 freedreno/registers: Fix GRAS_LRZ_CB_CNTL
The field name needs to match between variants.  In this case, the
register is the same, just with different offset.  So use a bitset.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
Rob Clark
49f2545de6 freedreno/registers: Fix gen8 TPL1_A2D_BLT_CNTL
START_OFFSET_TEXELS is removed.  Instead TPL1_A2D_SRC_TEXTURE_BASE can
take an unaligned address for IMG_BUFFER.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:30 +00:00
Rob Clark
f5f9fecfc3 freedreno/registers: Fix gen8 TPL1_MODE_CNTL
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:29 +00:00
Rob Clark
ff034b5aef freedreno/registers: Fix gen8 GRAS_SU_STEREO_CNTL
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:29 +00:00
Rob Clark
a2dc77323d freedreno/registers: Add subpass fence events
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
5546654104 freedreno/registers: Fix gen8 UV_PITCH
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
c9a0b1d6f1 freedreno/fdl: Fix gen8 sRGB buffers
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
bd00d86bd7 freedreno/fdl: Fix gen8 MUTABLEEN
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:28 +00:00
Rob Clark
eba06c5e5b tu: Convert foveat state to CRB
The GRAS regs are no longer consecutive in gen8.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:27 +00:00
Rob Clark
eb43e95d61 freedreno: Disable supports_double_threadsize for gen8
Gone is thread128.  Instead the hw can co-dispatch thread64.

Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:27 +00:00
Rob Clark
7958a19ee9 freedreno: Disable has_rt_workaround for gen8
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
2026-01-20 02:27:26 +00:00
Faith Ekstrand
13926b3492 panfrost: Lower pixel-local storage to load/store_tile in NIR
Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it.  This simplifies the back-end
and is less code, if you ignore the new copyright header.

Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>
2026-01-19 21:33:14 +00:00