sdlgpu.c's update_uniform_buffer function stands out in profiles of FNA3D trace data. For OpenGL we were able to reduce glUniform's impact on CPU performance by memcmp'ing the buffer first, and only pushing when a change actually happened...
https://github.com/icculus/mojoshader/blob/main/mojoshader_opengl.c#L2813
... but sdlgpu.c currently allocs a buffer for every constant buffer push:
https://github.com/icculus/mojoshader/blob/main/mojoshader_sdlgpu.c#L195
We should try to remove this intermediate buffer and shadow the buffer state instead, as the OpenGL path does. This could improve performance quite a bit!
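To make the idea concrete, here's a rough sketch of the shadow-and-memcmp approach. This is not the actual mojoshader code: `ShadowedUniforms`, `backend_push`, `update_uniforms` and `SHADOW_CAPACITY` are all made-up names for illustration, and the real push call is replaced by a counter. It assumes a fixed-size shadow copy kept next to the shader state, so no allocation happens per update.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SHADOW_CAPACITY 256  /* hypothetical fixed size, just for this sketch */

typedef struct ShadowedUniforms {
    unsigned char shadow[SHADOW_CAPACITY]; /* copy of the data last pushed */
    size_t shadow_size;                    /* 0 until the first push */
    int push_count;                        /* counts pushes, for illustration only */
} ShadowedUniforms;

/* Hypothetical stand-in for the real backend push call. */
static void backend_push(ShadowedUniforms *u, const void *data, size_t size)
{
    (void) data;
    (void) size;
    u->push_count++;
}

/* Returns 1 if the data was pushed, 0 if the push was skipped. */
static int update_uniforms(ShadowedUniforms *u, const void *data, size_t size)
{
    assert(size <= SHADOW_CAPACITY);

    /* Same size and same bytes as last time: nothing to do. */
    if (u->shadow_size == size && memcmp(u->shadow, data, size) == 0)
        return 0;

    /* Remember what we are about to push, then push it. No malloc needed. */
    memcpy(u->shadow, data, size);
    u->shadow_size = size;
    backend_push(u, data, size);
    return 1;
}
```

A real implementation would also have to decide when to invalidate the shadow copy, for example if pushed uniform state doesn't persist across command buffers. The sketch doesn't address that.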