radeonsi: Limit variable workgroup size to 256 for CS regalloc bug

Even though radeonsi may not use compute queues itself, other processes
might run compute jobs in the background, so radeonsi must make
sure not to use workgroups larger than 256 threads on GPUs that
are affected by the regalloc hang.

Unfortunately, this means that for now radeonsi won't be able to
support ARB_compute_variable_group_size on these GPUs.
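
For illustration only, the selection logic described above can be sketched
as a standalone program. This is not the driver code: the struct, the
function names, the 1024 default, and the 512 minimum for the extension are
assumptions made for the example (the 512 figure is my reading of the
ARB_compute_variable_group_size requirements and should be checked against
the spec).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed value, standing in for SI_MAX_VARIABLE_THREADS_PER_BLOCK. */
#define SKETCH_MAX_VARIABLE_THREADS_PER_BLOCK 1024u
/* Workgroup size limit from the commit message for affected GPUs. */
#define SKETCH_REGALLOC_BUG_LIMIT 256u
/* Assumed minimum the extension requires for variable group invocations. */
#define SKETCH_ARB_MIN_VARIABLE_INVOCATIONS 512u

/* Hypothetical stand-in for the relevant part of the GPU info struct. */
struct gpu_info_sketch {
   bool has_cs_regalloc_hang_bug;
};

/* Pick the variable workgroup size cap: 256 on affected GPUs,
 * the normal maximum everywhere else. */
static unsigned
max_variable_threads_per_block_sketch(const struct gpu_info_sketch *info)
{
   assert(info != NULL);
   return info->has_cs_regalloc_hang_bug ? SKETCH_REGALLOC_BUG_LIMIT
                                         : SKETCH_MAX_VARIABLE_THREADS_PER_BLOCK;
}

/* With the cap below the assumed extension minimum, the extension
 * cannot be exposed on affected GPUs. */
static bool
can_expose_variable_group_size_sketch(const struct gpu_info_sketch *info)
{
   return max_variable_threads_per_block_sketch(info) >=
          SKETCH_ARB_MIN_VARIABLE_INVOCATIONS;
}
```

Under these assumptions, an affected GPU gets a cap of 256 and fails the
extension check, while an unaffected GPU keeps the full limit.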

Signed-off-by: Timur Kristóf <timur.kristof@gmail.com>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/39288>
Timur Kristóf 2025-10-21 17:09:59 +02:00 committed by Marge Bot
parent 3d934c7951
commit dc41023510
2 changed files with 3 additions and 2 deletions


@@ -1022,7 +1022,8 @@ void si_init_compute_caps(struct si_screen *sscreen)
    else
       caps->subgroup_sizes = sscreen->info.gfx_level < GFX10 ? 64 : 64 | 32;
-   caps->max_variable_threads_per_block = SI_MAX_VARIABLE_THREADS_PER_BLOCK;
+   caps->max_variable_threads_per_block =
+      sscreen->info.has_cs_regalloc_hang_bug ? 256 : SI_MAX_VARIABLE_THREADS_PER_BLOCK;
 }
 static void si_init_mesh_caps(struct si_screen *sscreen)


@@ -149,7 +149,7 @@ unsigned si_get_max_workgroup_size(const struct si_shader *shader)
    /* Compile a variable block size using the maximum variable size. */
    if (shader->selector->info.base.workgroup_size_variable)
-      return SI_MAX_VARIABLE_THREADS_PER_BLOCK;
+      return sscreen->b.compute_caps.max_variable_threads_per_block;
    uint16_t *local_size = shader->selector->info.base.workgroup_size;
    unsigned max_work_group_size = (uint32_t)local_size[0] *