bug: inference of fine-tuning model is not working #143

Open
KayMKM opened this issue Jan 9, 2025 · 1 comment

KayMKM commented Jan 9, 2025

I followed this document to fine-tune a model and deploy an inference endpoint: https://github.com/microsoft/vscode-ai-toolkit/blob/main/doc/finetune.md

But the inference endpoint fails with an error:

2025-01-09T06:46:43.857955385Z Traceback (most recent call last):
2025-01-09T06:46:43.857984860Z   File "/mount/inference/utils.py", line 55, in load_model
2025-01-09T06:46:43.875525329Z     model = AutoModelForCausalLM.from_pretrained(
2025-01-09T06:46:43.875556908Z             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-01-09T06:46:43.875562920Z   File "/opt/conda/lib/python3.11/site-packages/transformers/models/auto/auto_factory.py", line 559, in from_pretrained
2025-01-09T06:46:43.875625156Z     return model_class.from_pretrained(
2025-01-09T06:46:43.875632990Z            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-01-09T06:46:43.875660385Z   File "/opt/conda/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4000, in from_pretrained
2025-01-09T06:46:43.876266321Z     dispatch_model(model, **device_map_kwargs)
2025-01-09T06:46:43.876299283Z   File "/opt/conda/lib/python3.11/site-packages/accelerate/big_modeling.py", line 498, in dispatch_model
2025-01-09T06:46:43.876418477Z     model.to(device)
2025-01-09T06:46:43.876484914Z   File "/opt/conda/lib/python3.11/site-packages/transformers/modeling_utils.py", line 2849, in to
2025-01-09T06:46:43.876775446Z     raise ValueError(
2025-01-09T06:46:43.876788781Z ValueError: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.
2025-01-09T06:46:43.876792589Z 
2025-01-09T06:46:43.876795624Z During handling of the above exception, another exception occurred:
2025-01-09T06:46:43.876797928Z 
2025-01-09T06:46:43.876801214Z Traceback (most recent call last):
2025-01-09T06:46:43.876829868Z   File "/mount/inference/./gradio_chat.py", line 42, in <module>
2025-01-09T06:46:43.895541690Z     model = load_model(model_name, torch_dtype, quant_type)
2025-01-09T06:46:43.895572127Z             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2025-01-09T06:46:43.895577717Z   File "/mount/inference/utils.py", line 70, in load_model
2025-01-09T06:46:43.908149431Z     raise RuntimeError(f"Error loading model: {e}")
2025-01-09T06:46:43.908180659Z RuntimeError: Error loading model: `.to` is not supported for `4-bit` or `8-bit` bitsandbytes models. Please use the model as it is, since the model has already been set to the correct devices and casted to the correct `dtype`.

I tried upgrading the transformers package to version 4.47.1, but it did not help.
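
For context, this ValueError means `.to(device)` ended up being called on a model loaded in 4-bit/8-bit, either directly or (as in the traceback, via accelerate's `dispatch_model`) from inside `from_pretrained`. Below is a minimal sketch of the load pattern that avoids the error; the model name and quantization settings are illustrative assumptions, not the actual values in the toolkit's utils.py:

```python
# Sketch only: assumes a hypothetical model name and 4-bit settings.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

model_name = "microsoft/Phi-3-mini-4k-instruct"  # hypothetical example

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# With a quantization config, from_pretrained + device_map handles device
# placement itself; the weights come back already on the right GPU(s).
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
)

# model.to("cuda")  # calling .to() on a 4-/8-bit model raises the same ValueError
```

If the toolkit's utils.py calls `.to(device)` unconditionally after loading, guarding that call when `quant_type` is 4-bit or 8-bit would avoid the crash.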

microsoft-github-policy-service bot added the needs attention label Jan 9, 2025
swatDong added the needs more info label Jan 9, 2025
swatDong (Contributor) commented Jan 9, 2025

Could you please share more info, such as which model and which parameters you used?
