This is possible because there is no comp-to-reg and no FCE. This
probably helps a bunch on GFX11+ if GENERAL is widely used with color
images, and since VK_KHR_unified_image_layout that's likely the case
on GFX11-11.5.
GFX10-10.3 could also benefit from this, but some MSAA fast-clears with
DCC are currently broken and need to be fixed first.
Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39396>
This allows us to skip the mode set by changing
the initial mode in the command stream.
Foz-DB Navi48:
Totals from 14 (0.02% of 82405) affected shaders:
Latency: 79417 -> 79438 (+0.03%)
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
aco implements the same logic, and in the future it will make changes to
config->float_mode to avoid unnecessary s_setreg.
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/38815>
The version of instruction selection that got merged doesn't have vcc
definitions, so this shouldn't either.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
The version of instruction selection that got merged doesn't have vcc
definitions.
Signed-off-by: Rhys Perry <pendingchaos02@gmail.com>
Reviewed-by: Daniel Schürmann <daniel@schuermann.dev>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39390>
Reusing the outer f-string's quotation marks for a string literal
nested inside it isn't supported on Python 3.10, which still receives
security maintenance.
This leads to a syntax error when building Mesa with Python 3.10.
Fix this by switching these string literals' quotation marks to '' (as
the outer f-string uses "").
Signed-off-by: Icenowy Zheng <zhengxingda@iscas.ac.cn>
Closes: https://gitlab.freedesktop.org/mesa/mesa/-/issues/14673
Reviewed-by: Valentine Burley <valentine.burley@collabora.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39372>
si_sqtt_start / si_sqtt_stop use emit_barrier, which clears barrier_flags.
Since these functions are used to build an auxiliary cs which will only
be emitted later (on sqtt enablement/disablement), they shouldn't clear
the global barrier_flags value.
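
One way to keep the global value intact while building the auxiliary cs
is to save and restore it around the emit. This is only a sketch of the
idea, not necessarily what this change does: the struct and both
function names below are hypothetical stand-ins, and the real
emit_barrier does far more than clear a field.

```c
#include <stdint.h>

/* Hypothetical stand-in for the context; only the field we care about. */
struct sketch_sqtt_ctx {
   uint32_t barrier_flags;
};

/* Stand-in for emit_barrier: consumes the pending flags and clears them. */
static void sketch_emit_barrier(struct sketch_sqtt_ctx *ctx)
{
   ctx->barrier_flags = 0;
}

/* Builds the auxiliary cs without disturbing the caller's pending flags. */
static void sketch_build_aux_cs(struct sketch_sqtt_ctx *ctx, uint32_t aux_flags)
{
   uint32_t saved = ctx->barrier_flags; /* flags pending for the main cs */

   ctx->barrier_flags = aux_flags;      /* flags wanted in the auxiliary cs */
   sketch_emit_barrier(ctx);            /* clears barrier_flags */
   ctx->barrier_flags = saved;          /* restore the global value */
}
```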
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
The pattern:
   sctx->barrier_flags |= ...;
   si_mark_atom_dirty(sctx, &sctx->atoms.s.barrier);
is used a lot, so let's add an inline helper. This prevents
forgetting the call to si_mark_atom_dirty.
si_upload_bindless_descriptors is special because we're
already in the emit phase so we shouldn't dirty barrier
again.
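
A minimal sketch of what such a helper could look like. The types and
names here (sketch_context, sketch_set_barrier_flags, and so on) are
hypothetical and reduced to the two pieces of state involved; the real
si_context and atom tracking are much larger.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in types, reduced to the state the pattern touches. */
struct sketch_atom {
   bool dirty;
};

struct sketch_context {
   uint32_t barrier_flags;
   struct sketch_atom barrier_atom;
};

/* Models si_mark_atom_dirty for a single atom. */
static inline void sketch_mark_atom_dirty(struct sketch_atom *atom)
{
   atom->dirty = true;
}

/* Hypothetical helper: OR in the flags and dirty the barrier atom in one
 * call, so callers cannot forget the second step. */
static inline void sketch_set_barrier_flags(struct sketch_context *ctx,
                                            uint32_t flags)
{
   ctx->barrier_flags |= flags;
   sketch_mark_atom_dirty(&ctx->barrier_atom);
}
```

A caller that is already in the emit phase would keep writing to
barrier_flags directly rather than going through the helper.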
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39308>
Some newer gen8 devices (like a840/kaanapali, but not x2-85 which is
otherwise similar) flip the hw shading rate value around to match
vulkan/gl instead of DX.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
Narrowing integer conversions on SALU with GPR src do not behave as one
would expect on gen8, so avoid them. This does not apply to uGPR srcs
or float conversions.
See, for example:
dEQP-VK.glsl.builtin.function.integer.bitcount.int_highp_compute
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
Otherwise the first reg in the array ends up with offset=0, which
signals the end of parsing the magic_raw regs. Which is not the
desired outcome :-)
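
To illustrate the failure mode, here is a small sketch of a table walk
that stops at the first entry with offset=0. The entry layout and the
function name are assumptions made for the example; the actual
magic_raw parsing code may look different.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical entry layout; offset == 0 marks the end of the table. */
struct sketch_reg {
   uint32_t offset;
   uint32_t value;
};

/* Counts entries up to the first offset == 0 terminator (or max entries). */
static size_t sketch_count_regs(const struct sketch_reg *regs, size_t max)
{
   size_t n = 0;

   while (n < max && regs[n].offset != 0)
      n++;
   return n;
}
```

With this kind of terminator, a first entry whose offset is 0 makes the
walk stop immediately, so every register after it is silently ignored.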
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
The field name needs to match between variants. In this case, the
register is the same, just with a different offset. So use a bitset.
Signed-off-by: Rob Clark <rob.clark@oss.qualcomm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39167>
Now that we have intrinsics which map directly to the hardware opcodes,
we can lower PLS inside the gallium driver instead of the back-end
compiler having to know anything about it. This simplifies the back-end
and is less code, if you ignore the new copyright header.
Reviewed-by: Christoph Pillmayer <christoph.pillmayer@arm.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39367>