
Let lightgbm opencl module support "quantization gradient" function #7154

@shengyanli1982

Summary

It would be helpful if the "quantized gradient" feature were supported in the OpenCL-related code.

Motivation

I am using an RTX 4060 Ti 16GB graphics card. On Windows, I trained a model with LightGBM 4.6 on a private dataset; repeating the training with identical parameters yielded different results each time. XGBoost, by contrast, has a default "quantized gradient" feature that accumulates gradients as integers (int64_t) instead of FP32, so it does not exhibit this non-determinism. LightGBM is an important development library for me, so I hope it can provide a similar feature.

Description

The LightGBM project offers the "quantized gradient" feature in both the CPU and CUDA modules, but the OpenCL module does not provide it. A review of the relevant materials shows that OpenCL 3.0 already provides primitives similar to CUDA's, so a comparable effect can be achieved to a certain extent.

On RTX-series graphics cards, FP64 throughput is only 1/64 of FP32 throughput.

It is hoped that a corresponding "quantized gradient" implementation can be added to the OpenCL module.

References

| CUDA primitive | OpenCL equivalent | Availability |
| --- | --- | --- |
| `__shfl_down_sync(mask, val, offset)` | `sub_group_shuffle_down(val, offset)` | Requires `cl_khr_subgroup_shuffle_relative` |
| `atomicAdd(addr, val)` (int32) | `atom_add(addr, val)` | Core feature |
| `__syncthreads()` | `barrier(CLK_LOCAL_MEM_FENCE)` | Core feature |
| `__shared__` | `__local` | Core feature |
| `blockIdx.x` / `threadIdx.x` | `get_group_id(0)` / `get_local_id(0)` | Core feature |
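
As a rough illustration of how the mapping above could be applied, here is a minimal OpenCL C kernel sketch (hypothetical, not taken from the LightGBM codebase; the kernel and parameter names are illustrative) that reduces quantized per-work-item integer gradient contributions within a sub-group via `sub_group_shuffle_down`, then commits each sub-group's partial sum with an integer atomic, mirroring the CUDA `__shfl_down_sync` + `atomicAdd` pattern:

```c
// Illustrative OpenCL C kernel sketch (assumed names, not LightGBM code).
// sub_group_shuffle_down requires cl_khr_subgroup_shuffle_relative;
// 32-bit integer atomic_add is a core feature.
#pragma OPENCL EXTENSION cl_khr_subgroup_shuffle_relative : enable

__kernel void reduce_quantized_grad(__global const int *grad, // quantized per-sample gradients
                                    __global int *hist_bin,   // integer histogram accumulator
                                    const int n) {
    const int gid = get_global_id(0); // CUDA: blockIdx.x * blockDim.x + threadIdx.x
    int v = (gid < n) ? grad[gid] : 0;

    // Sub-group tree reduction, analogous to __shfl_down_sync in CUDA.
    for (uint offset = get_sub_group_size() / 2; offset > 0; offset /= 2) {
        v += sub_group_shuffle_down(v, offset);
    }

    // Lane 0 of each sub-group commits its partial sum. Integer atomics
    // are associative, so the accumulated result is bit-reproducible
    // across runs, unlike floating-point atomic accumulation.
    if (get_sub_group_local_id() == 0) {
        atomic_add(hist_bin, v);
    }
}
```

The reproducibility benefit comes from the integer accumulation: with FP32 atomics, the non-deterministic ordering of additions changes rounding and hence the result, while integer sums are order-independent.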
