Memory usage is not released and eventually drives the CPU to 100% #57
Description
1. Generating one image raises memory usage from 16 GB to 37 GB, where it stays without being released; repeating the generation many times does not increase memory further.
2. Switching to a workflow that loads the depth-related model releases part of the memory, dropping usage to 34 GB.
3. Loading the depth model and generating one image brings usage to 49 GB; repeated generations keep it stable there.
4. Switching to another workflow that also uses depth, memory first drops to 47 GB, climbs back to 48 GB after one generation, and then stays stable.
5. Switching to a workflow that uses the flux1-dev model (not depth), memory usage jumps straight to 68 GB and ComfyUI reports an OOM error.
6. After restarting ComfyUI, memory usage settles at 55 GB and all VRAM is released.
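
For reference, the host-memory numbers above can be tracked automatically between prompts. A minimal monitoring sketch (assuming `psutil` is installed; the PID is a placeholder):

```python
# Rough monitoring sketch: poll the ComfyUI process RSS so the per-prompt
# growth described above can be logged over time. Assumes psutil is
# installed; COMFYUI_PID is a placeholder and must be replaced with the
# real PID of the running ComfyUI process.
import time
import psutil

COMFYUI_PID = 12345  # placeholder: replace with the actual ComfyUI PID

proc = psutil.Process(COMFYUI_PID)
while True:
    rss_gib = proc.memory_info().rss / 1024 ** 3
    print(f"{time.strftime('%H:%M:%S')}  RSS: {rss_gib:.1f} GiB")
    time.sleep(10)
```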
ComfyUI log from the model-loading test:
```
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 Ti : cudaMallocAsync
Using pytorch attention
ComfyUI version: 0.3.27
ComfyUI frontend version: 1.14.6
[Prompt Server] web root: /home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/comfyui_frontend_package/static
[Crystools INFO] Crystools version: 1.22.1
[Crystools INFO] CPU: Intel(R) Xeon(R) CPU E5-2697 v4 @ 2.30GHz - Arch: x86_64 - OS: Linux 6.11.0-21-generic
[Crystools INFO] Pynvml (Nvidia) initialized.
[Crystools INFO] GPU/s:
[Crystools INFO] 0) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 1) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 2) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] 3) NVIDIA GeForce RTX 2080 Ti
[Crystools INFO] NVIDIA Driver: 570.133.07
Total VRAM 22002 MB, total RAM 128706 MB
pytorch version: 2.6.0+cu124
Set vram state to: NORMAL_VRAM
Device: cuda:0 NVIDIA GeForce RTX 2080 Ti : cudaMallocAsync
[rgthree-comfy] Loaded 42 epic nodes. 🎉
[comfyui_controlnet_aux] | INFO -> Using ckpts path: /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts
[comfyui_controlnet_aux] | INFO -> Using symlinks: False
[comfyui_controlnet_aux] | INFO -> Using ort providers: ['CUDAExecutionProvider', 'DirectMLExecutionProvider', 'OpenVINOExecutionProvider', 'ROCMExecutionProvider', 'CPUExecutionProvider', 'CoreMLExecutionProvider']
/home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/node_wrappers/dwpose.py:26: UserWarning: DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly
warnings.warn("DWPose: Onnxruntime not found or doesn't come with acceleration providers, switch to OpenCV with CPU device. DWPose might run very slowly")
Loading: ComfyUI-Impact-Pack (V8.8)
Loading: ComfyUI-Impact-Subpack (V1.2.7)
[Impact Pack] Wildcards loading done.
[Impact Subpack] ultralytics_bbox: /home/hucd/ComfyUI/models/ultralytics/bbox
[Impact Subpack] ultralytics_segm: /home/hucd/ComfyUI/models/ultralytics/segm
Loading: ComfyUI-Manager (V3.31.9)
[ComfyUI-Manager] network_mode: public
ComfyUI Revision: 3288 [75c1c757] *DETACHED | Released on '2025-03-21'
[ComfyUI-Easy-Use] server: v1.2.8 Loaded
[ComfyUI-Easy-Use] web root: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Easy-Use/web_version/v2 Loaded
Import times for custom nodes:
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/websocket_image_save.py
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/cg-use-everywhere
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-FluxExt-MZ
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyBootlegOffload.py
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/fairy-root_ComfyUI-Show-Text
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-HFRemoteVae
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui_llm_api
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-impact-subpack
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-easycontrol
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI_UltimateSDUpscale
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/rgthree-comfy
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-kjnodes
0.0 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-impact-pack
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-manager
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-WanVideoWrapper
0.1 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Easy-Use
0.5 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-VideoHelperSuite
0.6 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-LLMs
0.6 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-Crystools
0.7 seconds: /home/hucd/ComfyUI/custom_nodes/ComfyUI-nunchaku
2.6 seconds: /home/hucd/ComfyUI/custom_nodes/comfyui-advancedliveportrait
Starting server
To see the GUI go to: http://0.0.0.0:8188
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/alter-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/model-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/github-stats.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json
[ComfyUI-Manager] default cache updated: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/extension-node-map.json
FETCH ComfyRegistry Data: 5/81
FETCH ComfyRegistry Data: 10/81
FETCH ComfyRegistry Data: 15/81
FETCH ComfyRegistry Data: 20/81
FETCH ComfyRegistry Data: 25/81
got prompt
Using pytorch attention in VAE
Using pytorch attention in VAE
VAE load device: cuda:0, offload device: cpu, dtype: torch.float32
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
FETCH ComfyRegistry Data: 30/81
[2025-04-07 20:23:09.315] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:23:09.315] [info] Use FP16 model
[2025-04-07 20:23:09.377] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/transformer_blocks.safetensors
[2025-04-07 20:23:44.455] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:23:44.892] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:23:47.747] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:23:50.881] [warning] Memory not pinned
[2025-04-07 20:23:51.711] [info] Done.
Injecting quantized module
Cannot connect to comfyregistry.
FETCH DATA from: https://raw.githubusercontent.com/ltdrdata/ComfyUI-Manager/main/custom-node-list.json[2025-04-07 20:23:52.442] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/comfy_config.json
model_type FLUX
[DONE]
[ComfyUI-Manager] All startup tasks have been completed.
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors
Requested to load FluxClipModel_
loaded completely 13991.570050048827 9319.23095703125 True
Requested to load Flux
loaded completely 4484.035085388184 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 60.57it/s]
[2025-04-07 20:24:24.967] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 55.72it/s]
[2025-04-07 20:24:25.130] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:09<00:00, 1.05s/it]
Requested to load AutoencodingEngine
loaded completely 420.18880462646484 319.7467155456543 True
Prompt executed in 89.90 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:05<00:00, 1.51it/s]
Prompt executed in 6.93 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.49it/s]
Prompt executed in 7.00 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.49it/s]
Prompt executed in 7.00 seconds
got prompt
Requested to load Flux
loaded completely 7943.976236053467 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 72.85it/s]
[2025-04-07 20:25:57.223] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████ | 50/57 [00:00<00:00, 70.22it/s]
[2025-04-07 20:25:57.366] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.28it/s]
Prompt executed in 8.04 seconds
got prompt
Requested to load Flux
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 65.36it/s]
[2025-04-07 20:26:13.295] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 57.33it/s]
[2025-04-07 20:26:13.448] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.22it/s]
Prompt executed in 8.54 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.48it/s]
Prompt executed in 7.10 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.47it/s]
Prompt executed in 7.09 seconds
got prompt
Requested to load Flux
loaded completely 7943.976236053467 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 100.34it/s]
[2025-04-07 20:26:52.928] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████▌ | 54/57 [00:00<00:00, 92.58it/s]
[2025-04-07 20:26:52.998] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.23it/s]
Prompt executed in 8.44 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.37it/s]
Prompt executed in 7.58 seconds
got prompt
Requested to load Flux
loaded completely 8791.023111053466 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 67.57it/s]
[2025-04-07 20:27:16.413] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████████▊ | 56/57 [00:00<00:00, 58.43it/s]
[2025-04-07 20:27:16.578] [info] Done.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:07<00:00, 1.19it/s]
Prompt executed in 8.71 seconds
got prompt
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 9/9 [00:06<00:00, 1.47it/s]
Prompt executed in 7.11 seconds
got prompt
Failed to validate prompt for output 44:
- NunchakuFluxDiTLoader 49:
- Value not in list: data_type: 'bfloat16' not in ['float16']
Output will be ignored
model_path is /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LiheYoung/Depth-Anything/checkpoints/depth_anything_vitl14.pth
using MLP layer as FFN
Prompt executed in 8.24 seconds
got prompt
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Requested to load FluxClipModel_
loaded completely 20027.178284454345 9319.23095703125 True
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
[2025-04-07 20:29:00.077] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:29:00.077] [info] Use FP16 model
[2025-04-07 20:29:00.129] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-depth-dev/transformer_blocks.safetensors
[2025-04-07 20:29:36.534] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:29:36.985] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:29:39.885] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:29:42.969] [warning] Memory not pinned
[2025-04-07 20:29:43.727] [info] Done.
Injecting quantized module
[2025-04-07 20:29:44.424] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-depth-dev/comfy_config.json
model_type FLUX
Requested to load Flux
loaded completely 4157.913289733887 122.6837158203125 True
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.42it/s]
Prompt executed in 66.82 seconds
got prompt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.40it/s]
Prompt executed in 15.31 seconds
got prompt
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 20/20 [00:14<00:00, 1.38it/s]
Prompt executed in 15.49 seconds
got prompt
model_path is /home/hucd/ComfyUI/custom_nodes/comfyui_controlnet_aux/ckpts/LiheYoung/Depth-Anything/checkpoints/depth_anything_vitl14.pth
using MLP layer as FFN
HTTP Request: POST http://192.168.2.192:8001/v1/chat/completions "HTTP/1.1 200 OK"
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Requested to load FluxClipModel_
loaded completely 13575.410187530517 4777.53759765625 True
Token indices sequence length is longer than the specified maximum sequence length for this model (84 > 77). Running this sequence through the model will result in indexing errors
Requested to load Flux
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 99.45it/s]
[2025-04-07 20:31:25.595] [info] Loading partial weights from pytorch██████████████████████████████████████████████████████████████████████████████████████████▋ | 55/57 [00:00<00:00, 90.23it/s]
[2025-04-07 20:31:25.661] [info] Done.
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 24/24 [00:27<00:00, 1.15s/it]
Prompt executed in 53.46 seconds
got prompt
GPU 0 (NVIDIA GeForce RTX 2080 Ti) Memory: 22002.1875 MiB
VRAM > 14GiB,disable CPU offload
[2025-04-07 20:32:07.231] [info] Initializing QuantizedFluxModel on device 0
[2025-04-07 20:32:07.231] [info] Use FP16 model
[2025-04-07 20:32:07.250] [info] Loading weights from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/transformer_blocks.safetensors
[2025-04-07 20:32:09.739] [warning] Failed to load safetensors using method PRIVATE: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:124)
[2025-04-07 20:32:10.146] [warning] Failed to load safetensors using method MIO: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Serialization.cpp:130)
[2025-04-07 20:32:12.940] [warning] Failed to load safetensors using method READ: CUDA error: invalid argument (at /home/hucd/nunchaku/src/Tensor.h:72)
[2025-04-07 20:32:15.963] [warning] Memory not pinned
[2025-04-07 20:32:16.706] [info] Done.
Injecting quantized module
[2025-04-07 20:32:17.027] [info] Set attention implementation to nunchaku-fp16
Loading configuration from /home/hucd/ComfyUI/models/diffusion_models/svdq-int4-flux.1-dev/comfy_config.json
model_type FLUX
CLIP/text encoder model load device: cuda:0, offload device: cpu, current: cpu, dtype: torch.float16
clip missing: ['text_projection.weight']
Token indices sequence length is longer than the specified maximum sequence length for this model (80 > 77). Running this sequence through the model will result in indexing errors
Requested to load FluxClipModel_
loaded partially 7290.652600097656 7290.47705078125 0
Requested to load Flux
0 models unloaded.
loaded completely 128.0 122.3087158203125 True
Converting LoRAs to nunchaku format: 100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 57/57 [00:00<00:00, 70.02it/s]
[2025-04-07 20:32:21.198] [info] Loading partial weights from pytorch████████████████████████████████████████████████████████████████████████████████████████████▊ | 56/57 [00:00<00:00, 60.35it/s]
[2025-04-07 20:32:21.356] [info] Done.
0%| | 0/9 [00:01<?, ?it/s]
!!! Exception during processing !!! Allocation on device
Traceback (most recent call last):
File "/home/hucd/ComfyUI/execution.py", line 327, in execute
output_data, output_ui, has_subgraph = get_output_data(obj, input_data_all, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/home/hucd/ComfyUI/execution.py", line 202, in get_output_data
return_values = _map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True, execution_block_cb=execution_block_cb, pre_execute_cb=pre_execute_cb)
File "/home/hucd/ComfyUI/execution.py", line 174, in _map_node_over_list
process_inputs(input_dict, i)
File "/home/hucd/ComfyUI/execution.py", line 163, in process_inputs
results.append(getattr(obj, func)(**inputs))
File "/home/hucd/ComfyUI/comfy_extras/nodes_custom_sampler.py", line 657, in sample
samples = guider.sample(noise.generate_noise(latent), latent_image, sampler, sigmas, denoise_mask=noise_mask, callback=callback, disable_pbar=disable_pbar, seed=noise.seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 1008, in sample
output = executor.execute(noise, latent_image, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 976, in outer_sample
output = self.inner_sample(noise, latent_image, device, sampler, sigmas, denoise_mask, callback, disable_pbar, seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 959, in inner_sample
samples = executor.execute(self, sigmas, extra_args, callback, noise, latent_image, denoise_mask, disable_pbar)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 738, in sample
samples = self.sampler_function(model_k, noise, sigmas, extra_args=extra_args, callback=k_callback, disable=disable_pbar, **self.extra_options)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/k_diffusion/sampling.py", line 161, in sample_euler
denoised = model(x, sigma_hat * s_in, **extra_args)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 390, in call
out = self.inner_model(x, sigma, model_options=model_options, seed=seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 939, in call
return self.predict_noise(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 942, in predict_noise
return sampling_function(self.inner_model, x, timestep, self.conds.get("negative", None), self.conds.get("positive", None), self.cfg, model_options=model_options, seed=seed)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 370, in sampling_function
out = calc_cond_batch(model, conds, x, timestep, model_options)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 206, in calc_cond_batch
return executor.execute(model, conds, x_in, timestep, model_options)
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/samplers.py", line 319, in calc_cond_batch
output = model.apply_model(input_x, timestep, **c).chunk(batch_chunks)
File "/home/hucd/ComfyUI/comfy/model_base.py", line 137, in apply_model
return comfy.patcher_extension.WrapperExecutor.new_class_executor(
File "/home/hucd/ComfyUI/comfy/patcher_extension.py", line 110, in execute
return self.original(*args, **kwargs)
File "/home/hucd/ComfyUI/comfy/model_base.py", line 170, in _apply_model
model_output = self.diffusion_model(xc, t, context=context, control=control, transformer_options=transformer_options, **extra_conds).float()
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/ComfyUI/custom_nodes/ComfyUI-nunchaku/nodes/models/flux.py", line 90, in forward
out = model(
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/caching/diffusers_adapters/flux.py", line 33, in new_forward
return original_forward(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 578, in forward
image_rotary_emb = self.pos_embed(ids)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/hucd/miniconda3/envs/comfyui/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 217, in forward
emb = torch.cat([rope(ids[..., i], self.axes_dim[i], self.theta) for i in range(n_axes)], dim=-3)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 217, in
emb = torch.cat([rope(ids[..., i], self.axes_dim[i], self.theta) for i in range(n_axes)], dim=-3)
File "/home/hucd/nunchaku/nunchaku/models/transformers/transformer_flux.py", line 198, in rope
stacked_out = torch.stack([sin_out, cos_out], dim=-1)
torch.OutOfMemoryError: Allocation on device
- Value not in list: data_type: 'bfloat16' not in ['float16']
Got an OOM, unloading all loaded models.
Prompt executed in 18.27 seconds
```
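
For anyone trying to reproduce this: as a diagnostic, one can also ask ComfyUI to unload models and free cached memory between workflow switches and check whether host RSS actually drops. A minimal request sketch, assuming the server from the log above (port 8188) and that this ComfyUI version exposes the /free endpoint:

```python
# Rough sketch: request that ComfyUI unload models and free cached memory,
# then check (e.g. with the RSS monitor above) whether host memory drops.
# Assumes the server from the log (port 8188) and that this ComfyUI version
# (0.3.27) exposes the /free endpoint.
import json
import urllib.request

req = urllib.request.Request(
    "http://127.0.0.1:8188/free",
    data=json.dumps({"unload_models": True, "free_memory": True}).encode(),
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(req) as resp:
    print("free request status:", resp.status)
```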
