

Huggingface Transformers deberta-v3-base Installation Troubleshooting Notes

When downloading pretrained models with transformers, checkpoints such as bert-base-cased load through AutoTokenizer and AutoModel without much trouble. Downloading deberta-v3-base, however, can produce a string of errors.

First,

from transformers import AutoTokenizer, AutoModel, AutoConfig

checkpoint = 'microsoft/deberta-v3-base'

tokenizer = AutoTokenizer.from_pretrained(checkpoint)

At this point an error is raised:

ValueError: Couldn't instantiate the backend tokenizer from one of: 
(1) a `tokenizers` library serialization file, 
(2) a slow tokenizer instance to convert or 
(3) an equivalent slow tokenizer class to instantiate and convert. 
You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

The fix is:

pip install transformers sentencepiece
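Before retrying, it can help to confirm that the extra dependencies are actually visible to your interpreter. A minimal sketch (the helper `missing_deps` is my own, not part of transformers):

```python
# Pre-flight check (my own sketch, not part of transformers): verify that the
# extra packages the fast DeBERTa tokenizer needs can actually be imported.
from importlib.util import find_spec

def missing_deps(packages=("sentencepiece", "google.protobuf")):
    """Return the subset of `packages` that is not importable."""
    missing = []
    for name in packages:
        try:
            spec = find_spec(name)
        except ModuleNotFoundError:  # parent package absent (e.g. no `google`)
            spec = None
        if spec is None:
            missing.append(name)
    return missing

print(missing_deps())  # an empty list means both dependencies are present
```

If the list is non-empty, install the missing packages before calling `AutoTokenizer.from_pretrained` again.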

Importing the tokenizer again, you then hit the following error:

ImportError: 
DebertaV2Converter requires the protobuf library but it was not found in your environment. Checkout the instructions on the
installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
that match your environment.

If you open that link and follow its instructions at this point, you may well spend an entire afternoon without getting anywhere (I did exactly that).

In fact, all you need to do is run, in your environment,

pip install --no-binary=protobuf protobuf

and you're done.
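Both fixes can be captured in a single requirements file, so a fresh environment installs the working combination in one step. A sketch (the filename is an assumption; pip requirements files accept `--no-binary` as a global option line):

```text
# requirements.txt (assumed name) -- force protobuf to be built from source
--no-binary protobuf
transformers
sentencepiece
protobuf
```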

Note: whatever you do, do not simply run

pip install protobuf

otherwise you may get the following error:

couldn't build proto file into descriptor pool: duplicate file name (sentencepiece_model.proto)
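That duplicate-descriptor error generally points to a prebuilt (binary) protobuf backend that has already registered sentencepiece_model.proto; installing protobuf with `--no-binary` switches you to the pure-Python backend. A quick way to see which backend is active (my own debugging aid; `api_implementation` is an internal protobuf module, so treat this as a sketch only):

```python
# Report which protobuf backend is active. After installing protobuf with
# --no-binary it should be the pure-Python implementation rather than a
# prebuilt binary one.
def protobuf_impl():
    """Return the active protobuf backend name, or None if protobuf is absent."""
    try:
        from google.protobuf.internal import api_implementation
    except ImportError:
        return None
    return api_implementation.Type()  # e.g. 'python', 'cpp', or 'upb'

print(protobuf_impl())
```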



Reposted from: https://blog.csdn.net/weixin_43160040/article/details/127909459
Copyright belongs to the original author, Guti Haz. If there is any infringement, please contact us for removal.
