Sentiment Analysis of Text and Audio with Open-Source Models

Use cases

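Crawl product reviews from product detail pages and run public-opinion analysis on them;
Analyze recordings of customer-service phone calls for public-opinion analysis;
Use the results of review analysis as one input for supplier scoring, personalized product ranking, and similar decisions.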
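The third use case, turning review sentiment into a supplier score, could look roughly like the sketch below. It is an illustration only, not the author's method: the function name `supplier_scores`, the `(supplier_id, label)` input layout, and the smoothing with `prior_weight` are all assumptions. The label string "正面" is the positive label that appears in the article's output.

```python
from collections import defaultdict

POSITIVE_LABEL = "正面"  # positive label as it appears in the article's output


def supplier_scores(reviews, prior_weight=1.0):
    """Return {supplier_id: smoothed share of positive reviews}.

    reviews: iterable of (supplier_id, label) pairs (hypothetical layout).
    prior_weight: pseudo-count added to each class so that suppliers with
    very few reviews are pulled toward 0.5 (an assumed smoothing choice).
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for supplier_id, label in reviews:
        total[supplier_id] += 1
        if label == POSITIVE_LABEL:
            positive[supplier_id] += 1
    return {
        s: (positive[s] + prior_weight) / (total[s] + 2 * prior_weight)
        for s in total
    }


# Made-up sample: supplier A has 2 positive and 1 negative review, B has 1 negative.
sample = [("A", "正面"), ("A", "正面"), ("A", "负面"), ("B", "负面")]
scores = supplier_scores(sample)
```

The smoothing keeps a supplier with a single bad review from dropping straight to zero, which matters when the score feeds a ranking.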

Model selection

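Text: we chose the StructBERT model fine-tuned by Tongyi Lab. It was trained on review data from Dianping and similar platforms, so inference can use the pretrained checkpoint directly. It runs on CPU and can be deployed locally. Fine-tuning is supported but is rarely needed, because the training data already covers the e-commerce review domain.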

Reference paper:

title: StructBERT: Incorporating Language Structures into Pre-training for Deep Language Understanding
author: Wang, Wei and Bi, Bin and Yan, Ming and Wu, Chen and Bao, Zuyi and Xia, Jiangnan and Peng, Liwei and Si, Luo
journal: arXiv preprint arXiv:1908.04577
year: 2019

Dependencies:

modelscope-lib (latest version)

Inference code:

from modelscope.pipelines import pipeline
from modelscope.utils.constant import Tasks

# Load the StructBERT sentiment model (downloaded from ModelScope on first use).
semantic_cls = pipeline(Tasks.text_classification, 'damo/nlp_structbert_sentiment-classification_chinese-base')

# Sample reviews, kept in Chinese because the model targets Chinese text.
comments = [
    # Rice from Panjin: good product, attentive seller, fast logistics.
    '非常厚实的一包大米，来自遥远的东北，盘锦大米，应该不错的，密封性很好。卖家的服务真是贴心周到！他们提供了专业的建议，帮助我选择了合适的商品。物流速度也很快，让我顺利收到了商品。',
    # Food tastes fine, but the staff's attitude could improve.
    '食物的口感还不错，不过店员的服务态度可以进一步改善一下。',
    # Size fits, color could be brighter, customer service response is average.
    '衣服尺码合适，色彩可以再鲜艳一些，客服响应速度一般。',
    # Slow shipping, poor after-sales service, poor quality.
    '物流慢，售后不好，货品质量差。',
    # Packaging damaged in transit, but customer service compensated quickly.
    '物流包装顺坏，不过客服处理速度比较快，也给了比较满意的赔偿。',
    # The fridge is noisy and cools slowly.
    '冰箱制冷噪声较大，制冷慢。',
    # Sarcasm: Andy Lau-style shoes that make the buyer look like a street sweeper.
    '买了一件刘德华同款鞋，穿在自己脚上不像刘德华，像扫大街的。',
]

for comment in comments:
    result = semantic_cls(input=comment)
    # 'scores' and 'labels' are parallel lists; report the label with the higher score.
    if result['scores'][0] > result['scores'][1]:
        label = result['labels'][0]
    else:
        label = result['labels'][1]
    # Prints: '<review>'，属于<label>评价 (meaning "'<review>' is a <label> review").
    print("'" + comment + "'，属于" + label + "评价")

Output:

'非常厚实的一包大米，来自遥远的东北，盘锦大米，应该不错的，密封性很好。卖家的服务真是贴心周到！他们提供了专业的建议，帮助我选择了合适的商品。物流速度也很快，让我顺利收到了商品。'，属于正面评价
'食物的口感还不错，不过店员的服务态度可以进一步改善一下。'，属于正面评价
'衣服尺码合适，色彩可以再鲜艳一些，客服响应速度一般。'，属于正面评价
'物流慢，售后不好，货品质量差。'，属于负面评价
'物流包装顺坏，不过客服处理速度比较快，也给了比较满意的赔偿。'，属于正面评价
'冰箱制冷噪声较大，制冷慢。'，属于负面评价
'买了一件刘德华同款鞋，穿在自己脚上不像刘德华，像扫大街的。'，属于负面评价
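In the output above, the mixed reviews (praise plus a complaint) are all labeled positive, because the model only chooses between two classes. One way to combine the model with a product rule is to send close calls to manual review instead of trusting the winning label. The sketch below assumes the result dictionary has the `scores` and `labels` lists used in the inference code; the function name `classify_with_margin`, the `needs_review` marker, the 0.3 margin, and the sample scores are all hypothetical.

```python
def classify_with_margin(result, min_margin=0.3):
    """Return the winning label, or 'needs_review' when the top two scores are close.

    result: dict with parallel 'scores' and 'labels' lists.
    min_margin: minimum gap between the best and second-best score
    (an assumed threshold; tune it on your own labeled reviews).
    """
    ranked = sorted(zip(result["scores"], result["labels"]), reverse=True)
    (top_score, top_label), (second_score, _) = ranked[0], ranked[1]
    if top_score - second_score < min_margin:
        return "needs_review"
    return top_label


# Made-up scores for illustration; real values come from the pipeline.
clear = {"scores": [0.95, 0.05], "labels": ["负面", "正面"]}
mixed = {"scores": [0.58, 0.42], "labels": ["正面", "负面"]}

clear_label = classify_with_margin(clear)
mixed_label = classify_with_margin(mixed)
```

With these sample values the first review keeps its negative label, while the second is flagged for review because its two scores differ by less than the margin.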

Audio: we chose the fine-tuned emotion2vec model from Tongyi Lab. It also runs on CPU and can be deployed locally.

Reference paper:

title: emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation
author: Ma, Ziyang and Zheng, Zhisheng and Ye, Jiaxin and Li, Jinchao and Gao, Zhifu and Zhang, Shiliang and Chen, Xie
journal: arXiv preprint arXiv:2312.15185
year: 2023

Open-source repository:

Official PyTorch code for extracting features and training downstream models with emotion2vec: Self-Supervised Pre-Training for Speech Emotion Representation

Dependencies:

modelscope >= 1.11.1

funasr>=1.0.5
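Before running the audio example, it can help to confirm that the installed packages meet these minimums. The following is a minimal sketch using only the standard library; the helper names (`parse_version`, `meets_minimum`, `check_requirements`) are made up for illustration, and the simple numeric comparison ignores pre-release suffixes, which is an assumption that may not suit every version scheme.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Minimum versions listed in the article.
REQUIRED = {"modelscope": "1.11.1", "funasr": "1.0.5"}


def parse_version(text):
    """Turn '1.11.1' into (1, 11, 1); anything after the leading digits is ignored."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that 1.9.5 counts as older than 1.11.1."""
    return parse_version(installed) >= parse_version(minimum)


def check_requirements(required=REQUIRED):
    """Return {package: 'ok' | 'missing' | 'too old (x.y.z)'}."""
    report = {}
    for name, minimum in required.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report
```

Calling `check_requirements()` returns a small report per package, which is easier to read in a deployment log than a stack trace from a failed import.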

Inference code:

from funasr import AutoModel

# Load the fine-tuned emotion2vec model (downloaded on first use).
model = AutoModel(model="iic/emotion2vec_base_finetuned", model_revision="v2.0.4")

# Use the example recording that ships with the model.
wav_file = f"{model.model_path}/example/test.wav"
res = model.generate(wav_file, output_dir="./outputs", granularity="utterance", extract_embedding=False)
print(res)

# 'labels' and 'scores' are parallel lists; pick the label with the highest score.
scores = res[0]["scores"]
max_index = max(range(len(scores)), key=lambda i: scores[i])

print("Dominant emotion in the audio: " + res[0]["labels"][max_index])

Output:

rtf_avg: 0.263: 100%|██████████| 1/1 [00:02<00:00, 2.64s/it]
[{'key': 'rand_key_2yW4Acq9GFz6Y', 'labels': ['生气/angry', '厌恶/disgusted', '恐惧/fearful', '开心/happy', '中立/neutral', '其他/other', '难过/sad', '吃惊/surprised', '<unk>'], 'scores': [0.06824027001857758, 0.030794354155659676, 0.20301730930805206, 0.09666425734758377, 0.12219445407390594, 0.06753909587860107, 0.13648174703121185, 0.11873088777065277, 0.1563376784324646]}]

Dominant emotion in the audio: 恐惧/fearful
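The label list in this output includes a `<unk>` entry and uses bilingual names such as `恐惧/fearful`. For downstream use it may be convenient to skip the placeholder label and keep only the English name. The sketch below is one way to do that; the function name `top_emotion` and the `ignore` parameter are hypothetical, and the scores are the article's values rounded to three decimals.

```python
def top_emotion(labels, scores, ignore=("<unk>",)):
    """Return (english_name, score) for the best label that is not ignored.

    labels, scores: parallel lists as found in res[0] of the output above.
    ignore: labels to skip (assumed here to be just the '<unk>' placeholder).
    """
    candidates = [(s, l) for l, s in zip(labels, scores) if l not in ignore]
    score, label = max(candidates)
    # Labels look like '恐惧/fearful'; keep the part after the slash.
    return label.split("/")[-1], score


labels = ['生气/angry', '厌恶/disgusted', '恐惧/fearful', '开心/happy', '中立/neutral',
          '其他/other', '难过/sad', '吃惊/surprised', '<unk>']
# Scores from the article's output, rounded to three decimals.
scores = [0.068, 0.031, 0.203, 0.097, 0.122, 0.068, 0.136, 0.119, 0.156]

emotion, confidence = top_emotion(labels, scores)
```

Returning the score alongside the label lets a business rule decide what to do with weak predictions; here the winning score is only about 0.20.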

Summary

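For pretrained models built for a specific industry, using the model for a first pass and then applying product rules for finer filtering is generally enough for small and medium-sized teams. If the model can run inference on CPU, cost is even easier to control. The model can then be wrapped in a service interface for business systems to call.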
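As a rough idea of what such a service interface might look like, the sketch below keeps the request handling separate from the model so it can sit behind any web framework. Everything here is an assumption for illustration: the function name `handle_sentiment_request`, the `{"text": ...}` request format, and the `fake_classifier` stand-in, which mimics the `labels`/`scores` result shape so the example runs without downloading a model.

```python
import json


def handle_sentiment_request(body, classifier):
    """Map a JSON request body to (status_code, JSON response body).

    body: JSON text such as '{"text": "..."}' (hypothetical request format).
    classifier: any callable that takes a string and returns a dict with
    parallel 'labels' and 'scores' lists.
    """
    error = json.dumps({"error": "expected a JSON object with a non-empty 'text' field"})
    try:
        text = json.loads(body)["text"]
    except (ValueError, KeyError, TypeError):
        return 400, error
    if not isinstance(text, str) or not text.strip():
        return 400, error
    result = classifier(text)
    # Pick the index of the highest score and report its label.
    best = max(range(len(result["scores"])), key=lambda i: result["scores"][i])
    response = {"label": result["labels"][best], "score": result["scores"][best]}
    return 200, json.dumps(response, ensure_ascii=False)


def fake_classifier(text):
    """Stand-in for the real pipeline; always returns the same made-up scores."""
    return {"labels": ["正面", "负面"], "scores": [0.9, 0.1]}


status, payload = handle_sentiment_request('{"text": "物流很快"}', fake_classifier)
```

In a real deployment the stand-in would be replaced by a callable that wraps the loaded model, created once at startup rather than per request.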


Reposted from: https://blog.csdn.net/tq02h2a/article/details/136224596
Copyright belongs to the original author, 清风1981. In case of infringement, please contact us for removal.
