Memory usage is not released and eventually drives the CPU to 100% #57
Description
1. Generating one image raises memory usage from 16 GB to 37 GB, where it stays without being released; repeating the generation many times does not increase memory further.
2. Switching to a workflow that loads the depth-related model releases part of the memory, dropping usage to 34 GB.
3. Loading the depth model and generating one image brings usage to 49 GB; repeated generations keep it stable there.
4. Switching to another workflow that also uses depth, memory first drops to 47 GB, climbs back to 48 GB after one generation, and then stays stable.
5. Switching to a workflow that uses the flux1-dev model (not depth), memory usage jumps straight to 68 GB and ComfyUI reports an OOM error.
6. After restarting ComfyUI, memory usage settles at 55 GB and all VRAM is released.
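
For reference, the host-memory numbers above can be tracked automatically between prompts. A minimal monitoring sketch (assuming `psutil` is installed; the PID is a placeholder):

```python
# Rough monitoring sketch: poll the ComfyUI process RSS so the per-prompt
# growth described above can be logged over time. Assumes psutil is
# installed; COMFYUI_PID is a placeholder and must be replaced with the
# real PID of the running ComfyUI process.
import time
import psutil

COMFYUI_PID = 12345  # placeholder: replace with the actual ComfyUI PID

proc = psutil.Process(COMFYUI_PID)
while True:
    rss_gib = proc.memory_info().rss / 1024 ** 3
    print(f"{time.strftime('%H:%M:%S')}  RSS: {rss_gib:.1f} GiB")
    time.sleep(10)
```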
ComfyUI log from the model-loading test:
```
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 Ti : cudaMallocAsync
Using pytorch attention
ComfyUI version: 0.3.27
ComfyUI frontend version: 1.14.6
[Prompt Server] web root: /home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/comfyui_frontend_package/static
[Crystools INFO] Crystools version: 1.22.1
[Crystools INFO] CPU: Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz - Arch: x86_64 - OS: Linux 6.11.0-21-generic
[Crystools INFO] Pynvml (Nvidia) initialized.
[Crystools INFO] GPU/s:
[Crystools INFO] 0) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 1) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 2) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 3) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] NVIDIA Driver: 570.133.07
Total VRAM 22002 MB, total RAM 128706 MB
pytorch version: 2.6.0+cu124
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 Ti : cudaMallocAsync
[rgthree-comfy] Loaded 42 epic nodes. 🎉
[comfyui_controlnet_aux] | INFO -> Using ckpts path: /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts
[comfyui_controlnet_aux] | INFO -> Using symlinks: False
[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
/home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/node_wrappers/dwpose.py:26: UserWarning: DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly
warnings.warn("DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly")
Loading: ComfyUI-Impact-Pack (V8.8)
Loading: ComfyUI-Impact-Subpack (V1.2.7)
[Impact Pack] Wildcards loading done.
[Impact Subpack] ultralytics_bbox: /home/hucd/ComfyUI/models/ultralytics/bbox
[Impact Subpack] ultralytics_segm: /home/hucd/ComfyUI/models/ultralytics/segm
Loading: ComfyUI-Manager (V3.31.9)
[ComfyUI-Manager] network_mode: public
ComfyUI Revision: 3288 [75c1c757] *DETACHED | Released on '2025-03-21'
[ComfyUI-Easy-Use] server: v1.2.8 Loaded
[ComfyUI-Easy-Use] web root: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Easy-Use/web_version/v2 Loaded
Import times for custom nodes:
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/websocket_image_save.py
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/cg-use-everywhere
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-FluxExt-MZ
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyBootlegOffload.py
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/fairy-root_ComfyUI-Show-Text
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-HFRemoteVae
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui_llm_api
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-impact-subpack
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-easycontrol
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI_UltimateSDUpscale
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/rgthree-comfy
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-kjnodes
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-impact-pack
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-manager
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-WanVideoWrapper
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Easy-Use
0.5 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
0.6 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-LLMs
0.6 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Crystools
0.7 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-nunchaku
2.6 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-advancedliveportrait
Starting server
To see the GUI go to: http://0.0.0.0:8188
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
FETCH ComfyRegistry Data: 5/81
FETCH ComfyRegistry Data: 10/81
FETCH ComfyRegistry Data: 15/81
FETCH ComfyRegistry Data: 20/81
FETCH ComfyRegistry Data: 25/81
got prompt
Using pytorch attention in VAE
Using pytorch attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
FETCH ComfyRegistry Data: 30/81
[2025-04-07 20:23:09.315] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:23:09.315] [info] Use FP16 model
[2025-04-07 20:23:09.377] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/transformer_blocks.safetensors
[2025-04-07 20:23:44.455] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:23:44.892] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:23:47.747] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:23:50.881] [warning] Memory not pinned
[2025-04-07 20:23:51.711] [info] Done.
Injecting quantized module
Cannot connect to comfyregistry.
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json[2025-04-07 20:23:52.442] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/comfy_config.json
model_type FLUX
[DONE]
[ComfyUI-Manager] All startup tasks have been completed.
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors
Requested to load FluxClipModel_
loaded completely 13991.570050048827 9319.23095703125 True
Requested to load Flux
loaded completely 4484.035085388184 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 60.57it/s]
[2025-04-07 20:24:24.967] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 55.72it/s]
[2025-04-07 20:24:25.130] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:09<00:00, 1.05s/it]
Requested to load AutoencodingEngine
loaded completely 420.18880462646484 319.7467155456543 True
Prompt executed in 89.90 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:05<00:00, 1.51it/s]
Prompt executed in 6.93 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.49it/s]
Prompt executed in 7.00 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.49it/s]
Prompt executed in 7.00 seconds
got prompt
Requested to load Flux
loaded completely 7943.976236053467 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 72.85it/s]
[2025-04-07 20:25:57.223] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████ | 50/57 [00:00<00:00, 70.22it/s]
[2025-04-07 20:25:57.366] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.28it/s]
Prompt executed in 8.04 seconds
got prompt
Requested to load Flux
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 65.36it/s]
[2025-04-07 20:26:13.295] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 57.33it/s]
[2025-04-07 20:26:13.448] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.22it/s]
Prompt executed in 8.54 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.48it/s]
Prompt executed in 7.10 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.47it/s]
Prompt executed in 7.09 seconds
got prompt
Requested to load Flux
loaded completely 7943.976236053467 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 100.34it/s]
[2025-04-07 20:26:52.928] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 92.58it/s]
[2025-04-07 20:26:52.998] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.23it/s]
Prompt executed in 8.44 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.37it/s]
Prompt executed in 7.58 seconds
got prompt
Requested to load Flux
loaded completely 8791.023111053466 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 67.57it/s]
[2025-04-07 20:27:16.413] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████████▊ | 56/57 [00:00<00:00, 58.43it/s]
[2025-04-07 20:27:16.578] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.19it/s]
Prompt executed in 8.71 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.47it/s]
Prompt executed in 7.11 seconds
got prompt
Failed to validate prompt for output 44:
- NunchakuFluxDiTLoader 49:
- Value not in list: data_type: 'bfloat16' not in ['float16']
Output will be ignored
model_path is /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LiheYoung/Depth-Anything/checkpoints/depth_anything_vitl14.pth
using MLP layer as FFN
Prompt executed in 8.24 seconds
got prompt
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Requested to load FluxClipModel_
loaded completely 20027.178284454345 9319.23095703125 True
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
[2025-04-07 20:29:00.077] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:29:00.077] [info] Use FP16 model
[2025-04-07 20:29:00.129] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-depth-dev/transformer_blocks.safetensors
[2025-04-07 20:29:36.534] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:29:36.985] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:29:39.885] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:29:42.969] [warning] Memory not pinned
[2025-04-07 20:29:43.727] [info] Done.
Injecting quantized module
[2025-04-07 20:29:44.424] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-depth-dev/comfy_config.json
model_type FLUX
Requested to load Flux
loaded completely 4157.913289733887 122.6837158203125 True
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.42it/s]
Prompt executed in 66.82 seconds
got prompt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.40it/s]
Prompt executed in 15.31 seconds
got prompt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.38it/s]
Prompt executed in 15.49 seconds
got prompt
model_path is /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LiheYoung/Depth-Anything/checkpoints/depth_anything_vitl14.pth
using MLP layer as FFN
HTTP Request: POST http://192.168.2.192:8001/v1/chat/completions "HTTP/1.1 200 OK"
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Requested to load FluxClipModel_
loaded completely 13575.410187530517 4777.53759765625 True
Token indices sequence length is longer than the specified maximum sequence length for this model (84 > 77). Running this sequence through the model will result in indexing errors
Requested to load Flux
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 99.45it/s]
[2025-04-07 20:31:25.595] [info] Loading partial weights from pytorch██████████████████████████████████████████████████████████████████████████████████████████▋ | 55/57 [00:00<00:00, 90.23it/s]
[2025-04-07 20:31:25.661] [info] Done.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:27<00:00, 1.15s/it]
Prompt executed in 53.46 seconds
got prompt
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
[2025-04-07 20:32:07.231] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:32:07.231] [info] Use FP16 model
[2025-04-07 20:32:07.250] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/transformer_blocks.safetensors
[2025-04-07 20:32:09.739] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:32:10.146] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:32:12.940] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:32:15.963] [warning] Memory not pinned
[2025-04-07 20:32:16.706] [info] Done.
Injecting quantized module
[2025-04-07 20:32:17.027] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/comfy_config.json
model_type FLUX
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors
Requested to load FluxClipModel_
loaded partially 7290.652600097656 7290.47705078125 0
Requested to load Flux
0 models unloaded.
loaded completely 128.0 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 70.02it/s]
[2025-04-07 20:32:21.198] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████████▊ | 56/57 [00:00<00:00, 60.35it/s]
[2025-04-07 20:32:21.356] [info] Done.
0%| | 0/9 [00:01<?, ?it/s]
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
File "/home/hucd/ComfyUI/execution.py", line 327, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/home/hucd/ComfyUI/execution.py", line 202, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/home/hucd/ComfyUI/execution.py", line 174, in _map_node_over_list
process_inputs(input_dict, i)
File "/home/hucd/ComfyUI/execution.py", line 163, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/home/hucd/ComfyUI/comfy_extras/nodes_custom_sampler.py", line 657, in sample
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 1008, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 976, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 959, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 738, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/k_diffusion/sampling.py", line 161, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 390, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 939, in call
return self.predict_noise(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 942, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 370, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 206, in calc_cond_batch
return executor.execute(model, conds, x_in, timestep, model_options)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 319, in calc_cond_batch
output = model.apply_model(input_x, timestep, **c).chunk(batch_chunks)
File "/home/hucd/ComfyUI/comfy/model_base.py", line 137, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/model_base.py", line 170, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/ComfyUI/custom_nodes/ComfyUI-nunchaku/nodes/models/flux.py", line 90, in forward
out = model(
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/caching/diffusers_adapters/flux.py", line 33, in new_forward
return original_forward(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 578, in forward
image_rotary_emb = self.pos_embed(ids)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 217, in forward
emb = torch.cat([rope(ids[..., i], self.axes_dim[i], self.theta) for i in range(n_axes)], dim=-3)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 217, in
emb = torch.cat([rope(ids[..., i], self.axes_dim[i], self.theta) for i in range(n_axes)], dim=-3)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 198, in rope
stacked_out = torch.stack([sin_out, cos_out], dim=-1)
torch.OutOfMemoryError: Allocation on device
- Value not in list: data_type: 'bfloat16' not in ['float16']
Got an OOM, unloading all loaded models.
Prompt executed in 18.27 seconds
```
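
For anyone trying to reproduce this: as a diagnostic, one can also ask ComfyUI to unload models and free cached memory between workflow switches and check whether host RSS actually drops. A minimal request sketch, assuming the server from the log above (port 8188) and that this ComfyUI version exposes the /free endpoint:

```python
# Rough sketch: request that ComfyUI unload models and free cached memory,
# then check (e.g. with the RSS monitor above) whether host memory drops.
# Assumes the server from the log (port 8188) and that this ComfyUI version
# (0.3.27) exposes the /free endpoint.
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8188/free",
    data=json.dumps({"unload_models": True, "free_memory": True}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print("free request status:", resp.status)
```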
