Let the LightGBM OpenCL module support the "quantized gradient" function #7154
Summary
It is hoped that the "quantized gradient" feature can be supported in the OpenCL-related code.
Motivation
I am using an RTX 4060 Ti 16 GB graphics card and trained a model with LightGBM 4.6 on a set of private data on Windows. Repeating the training with the same parameters yields different results each time. XGBoost, by contrast, quantizes FP32 gradients into integer (int64_t) accumulators by default; integer addition is order-independent, so it does not exhibit this non-determinism. LightGBM is an important library for me, so I hope it can provide a similar feature.
Description
The LightGBM project offers the "quantized gradient" feature in both the CPU and CUDA modules, but the OpenCL module does not provide it. Reviewing the relevant material shows that OpenCL 3.0 already offers primitives comparable to CUDA's, so a similar effect can be achieved to a reasonable extent.
On RTX-series graphics cards, FP64 throughput is only 1/64 of FP32, which makes an integer-based quantized-gradient path especially attractive.
It is hoped that the corresponding "quantized gradient" function can be implemented in the OpenCL module.
References
| CUDA primitive | OpenCL equivalent | Availability |
|---|---|---|
| `__shfl_down_sync(mask, val, offset)` | `sub_group_shuffle_down(val, offset)` | Requires `cl_khr_subgroup_shuffle_relative` |
| `atomicAdd(addr, val)` (int32) | `atom_add(addr, val)` | Core feature |
| `__syncthreads()` | `barrier(CLK_LOCAL_MEM_FENCE)` | Core feature |
| `__shared__` | `__local` | Core feature |
| `blockIdx.x` / `threadIdx.x` | `get_group_id(0)` / `get_local_id(0)` | Core feature |
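As a rough illustration (a hypothetical sketch, not LightGBM's actual kernel), the OpenCL primitives above are enough to express the core of a quantized-histogram kernel: pre-quantized integer gradients are accumulated with integer atomics into `__local` memory, then flushed to the global histogram after a barrier. All names below (`quantized_hist`, `bin_idx`, `quant_grad`) are made up for the example.

```c
/* Sketch only: hypothetical OpenCL C kernel accumulating pre-quantized
 * int gradients into a work-group-local histogram with integer atomics.
 * Integer atomic_add is order-independent, so results are reproducible. */
__kernel void quantized_hist(__global const uchar* bin_idx,     /* bin per sample   */
                             __global const int*   quant_grad,  /* quantized grads  */
                             __global int*         global_hist, /* 256-bin output   */
                             const int n) {
    __local int local_hist[256];              /* CUDA __shared__  -> __local       */
    int lid = get_local_id(0);                /* CUDA threadIdx.x -> get_local_id  */
    int gid = get_global_id(0);
    for (int b = lid; b < 256; b += get_local_size(0))
        local_hist[b] = 0;
    barrier(CLK_LOCAL_MEM_FENCE);             /* CUDA __syncthreads() -> barrier   */
    if (gid < n)
        atomic_add(&local_hist[bin_idx[gid]], quant_grad[gid]);
    barrier(CLK_LOCAL_MEM_FENCE);
    for (int b = lid; b < 256; b += get_local_size(0))
        atomic_add(&global_hist[b], local_hist[b]);
}
```

A sub-group reduction via `sub_group_shuffle_down` could further reduce atomic traffic where `cl_khr_subgroup_shuffle_relative` is available, analogous to CUDA's `__shfl_down_sync` warp reductions.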