

Huggingface Transformers: Pitfalls When Setting Up Deberta-v3-base

When downloading pretrained models with transformers, models such as bert-base-cased load through AutoTokenizer and AutoModel without much trouble. Downloading deberta-v3-base, however, can trigger a series of errors.

First, run:

  from transformers import AutoTokenizer, AutoModel, AutoConfig
  checkpoint = 'microsoft/deberta-v3-base'
  tokenizer = AutoTokenizer.from_pretrained(checkpoint)

This raises the following error:

  ValueError: Couldn't instantiate the backend tokenizer from one of:
  (1) a `tokenizers` library serialization file,
  (2) a slow tokenizer instance to convert or
  (3) an equivalent slow tokenizer class to instantiate and convert.
  You need to have sentencepiece installed to convert a slow tokenizer to a fast one.

The fix is:

  pip install transformers sentencepiece
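Before retrying the import, it can save time to confirm that the packages are actually visible to the interpreter you are running. A minimal sketch (the `has_module` helper is my own name, not part of transformers):

```python
import importlib.util

def has_module(name: str) -> bool:
    """Return True if `name` can be imported in the current environment."""
    return importlib.util.find_spec(name) is not None

# The fast-tokenizer conversion path needs sentencepiece; check both packages.
for mod in ("transformers", "sentencepiece"):
    print(mod, "OK" if has_module(mod) else "MISSING")
```

If either prints MISSING, the pip you installed with and the Python you are running are likely from different environments.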

Importing the tokenizer again then raises:

  ImportError:
  DebertaV2Converter requires the protobuf library but it was not found in your environment. Checkout the instructions on the
  installation page of its repo: https://github.com/protocolbuffers/protobuf/tree/master/python#installation and follow the ones
  that match your environment.

If you open that link and follow the instructions there, you may well burn an entire afternoon without getting anywhere (I did).

In fact, all you need to do in your environment is:

  pip install --no-binary=protobuf protobuf

and you're done.

Note: do NOT simply run

  pip install protobuf

or you may hit the following error:

  couldn't build proto file into descriptor pool: duplicate file name (sentencepiece_model.proto)
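The duplicate-descriptor error tends to appear when a binary protobuf wheel (the C++/upb backend) and sentencepiece both register sentencepiece_model.proto; building from source with `--no-binary` gives you the pure-Python backend, which avoids the clash. You can check which backend is active with the sketch below. Note that `api_implementation` is an internal protobuf module, so treat this as a diagnostic aid, not a stable API:

```python
def protobuf_backend() -> str:
    """Report which protobuf implementation is active, or 'missing' if none."""
    try:
        from google.protobuf.internal import api_implementation
        # Type() returns "python" for the pure-Python backend (what
        # `pip install --no-binary=protobuf protobuf` builds), and
        # "cpp" or "upb" for the prebuilt binary wheels.
        return api_implementation.Type()
    except ImportError:
        return "missing"

print(protobuf_backend())
```

If this prints anything other than "python" and you are still seeing the descriptor-pool error, uninstall protobuf and reinstall it with the `--no-binary` flag.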

References:

  1. 抱抱脸transformers 报错Couldn't instantiate the backend tokenizer - 简书
  2. Couldn't build proto file into descriptor pool_小米爱大豆的博客-CSDN博客
  3. Protobuf · Issue #10020 · huggingface/transformers · GitHub


Reposted from: https://blog.csdn.net/weixin_43160040/article/details/127909459
Copyright belongs to the original author, Guti Haz. If there is any infringement, please contact us for removal.
