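Project name: AI Xiuxian (AI Cultivation) series: saving one life is worth more than building a seven-story pagoda. A first-aid knowledge chatbot built with NIM + RAG + SLM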

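Report date: August 19, 2024

Project lead: Sophia

Project overview:

I have volunteered with a civilian rescue team in my spare time for many years. Civilian rescue teams have long worked on first-aid education, but a chatbot that can supply first-aid knowledge the moment an emergency happens might help save many lives. This project is a small exploration based on that simple idea.
The project is built on NVIDIA's NIM platform, RAG, and the phi-3 small language model. NIM provides convenient API access to a range of large and small models. RAG improves the accuracy of the model's answers within a specific domain. The phi-3 small model can be deployed on edge hardware such as in-vehicle devices and phones, which lowers the hardware requirements and broadens the range of devices the project can run on.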

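Technical approach and implementation steps

Model selection:

Embedding model: nvidia/nv-embed-v1, which performed reasonably well;

RAG uses a FAISS vector store, chosen for its high performance, easy integration, and efficient memory use;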
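To make the role of the vector store concrete, the sketch below shows in plain Python what an exhaustive nearest-neighbor search computes: the distance from the query vector to every stored vector, followed by a top-k selection. It is an illustration only and not FAISS's implementation. The names `l2_distance` and `search` and the toy vectors are assumptions made for this example.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def search(index_vectors, query, k=2):
    """Return (position, distance) pairs for the k stored vectors nearest to query."""
    scored = [(i, l2_distance(v, query)) for i, v in enumerate(index_vectors)]
    return sorted(scored, key=lambda pair: pair[1])[:k]


# Toy 2-dimensional "embeddings"; real embedding vectors have far more dimensions.
toy_index = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
nearest = search(toy_index, [0.9, 0.0], k=2)
```

A dedicated library such as FAISS performs this kind of search with optimized native code, which is what makes it practical for a large knowledge base.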

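SLM: microsoft/phi-3-small-128k-instruct. It is lightweight, fast, and low-latency, and it offers good value for the resources it uses. It is especially suited to edge computing and to low-cost, resource-constrained scenarios with simple tasks.

Building the data:

Build the knowledge-base documents from data in the relevant domain. After the data is read, it needs to be cleaned and split into chunks, then vectorized and stored with FAISS.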
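The cleaning and chunking steps can be sketched without any library. The version below is a minimal illustration, not the project's actual code: `clean_lines` and `chunk_text` are hypothetical helpers, and splitting after sentence-ending punctuation is an assumption that may suit Chinese text, which contains few spaces.

```python
import re


def clean_lines(raw_text):
    """Strip surrounding whitespace and drop empty lines."""
    return [line.strip() for line in raw_text.splitlines() if line.strip()]


def chunk_text(text, chunk_size=400):
    """Pack whole sentences into chunks of at most chunk_size characters.

    Sentences end at Chinese or Western sentence punctuation. A single
    sentence longer than chunk_size is hard-cut into chunk_size pieces.
    """
    sentences = [s for s in re.split(r"(?<=[。！？!?.])", text) if s]
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-cut any sentence that cannot fit into one chunk.
        while len(sentence) > chunk_size:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:chunk_size])
            sentence = sentence[chunk_size:]
        # Start a new chunk when the next sentence would overflow this one.
        if len(current) + len(sentence) > chunk_size:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk produced this way would then be embedded and stored in the vector store.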

Implementation steps:

Environment setup:

Python 3.8 or later

pip install langchain-nvidia-ai-endpoints

pip install langchain_core

pip install langchain

pip install faiss-cpu==1.7.2

pip install openai

pip install gradio

Code implementation:

Use NVIDIA's NIM platform with NVIDIA_API_KEY

# Use NVIDIA's NIM platform with NVIDIA_API_KEY
import getpass
import os
if os.environ.get("NVIDIA_API_KEY", "").startswith("nvapi-"):
    print("Valid NVIDIA_API_KEY already in environment. Delete to reset")
else:
    nvapi_key = getpass.getpass("NVAPI Key (starts with nvapi-): ")
    assert nvapi_key.startswith("nvapi-"), f"{nvapi_key[:5]}... is not a valid key"
    os.environ["NVIDIA_API_KEY"] = nvapi_key

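Initialize the phi-3 SLM

# Initialize the phi-3 SLM
from langchain_nvidia_ai_endpoints import ChatNVIDIA
# Read the key from the environment: nvapi_key is undefined when the key was already set
llm = ChatNVIDIA(model="microsoft/phi-3-small-128k-instruct", nvidia_api_key=os.environ["NVIDIA_API_KEY"], max_tokens=512)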

Initialize the embedding model

# Initialize the embedding model
from langchain_nvidia_ai_endpoints import NVIDIAEmbeddings
embedder = NVIDIAEmbeddings(model="nvidia/nv-embed-v1")

Load the expert knowledge-base data

# Load the expert knowledge-base data
import os
# Read every .txt file under ./data/, one knowledge entry per line
ps = os.listdir("./data/")
data = []
sources = []
for p in ps:
    if p.endswith('.txt'):
        path2file = "./data/" + p
        with open(path2file, encoding="utf-8") as f:
            for line in f:
                data.append(line)
                sources.append(path2file)

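Clean the data and remove blank lines

# Clean the data and remove blank lines, keeping sources aligned with documents
kept = [(d, s) for d, s in zip(data, sources) if d.strip()]
documents = [d for d, _ in kept]
sources = [s for _, s in kept]
len(data), len(documents)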

Process the documents into a FAISS vector store and save it to disk

# Process the documents into a FAISS vector store and save it to disk
from operator import itemgetter
from langchain.vectorstores import FAISS
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain.text_splitter import CharacterTextSplitter
from langchain_nvidia_ai_endpoints import ChatNVIDIA
import faiss
# This only needs to run once; later runs can reload the saved vector store
text_splitter = CharacterTextSplitter(chunk_size=400, separator=" ")
docs = []
metadatas = []
for i, d in enumerate(documents):
    splits = text_splitter.split_text(d)
    #print(len(splits))
    docs.extend(splits)
    metadatas.extend([{"source": sources[i]}] * len(splits))
store = FAISS.from_texts(docs, embedder , metadatas=metadatas)
store.save_local('./data/nv_embedding')

Load the previously processed and saved FAISS vector store

# Load the previously processed and saved FAISS vector store
store = FAISS.load_local("./data/nv_embedding", embedder,allow_dangerous_deserialization=True)
# Ask a question and run RAG retrieval with the phi-3-small-128k-instruct model
retriever = store.as_retriever()
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer solely based on the following context:\n<Documents>\n{context}\n</Documents>",
        ),
        ("user", "{question}"),
    ]
)
chain = (
    {"context": retriever, "question": RunnablePassthrough()}
    | prompt
    | llm
    | StrOutputParser()
)
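In the chain above, the retriever's output is passed into the `{context}` slot as it is. A common refinement is to join the retrieved chunks into one plain string first. The sketch below shows such a helper under stated assumptions: `Doc` is a hypothetical stand-in that mirrors the `page_content` and `metadata` attributes of retrieved documents, and `format_docs` is an illustrative name, not part of the original project.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """Hypothetical stand-in for a retrieved document."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_docs(docs):
    """Join retrieved chunks into one context string, tagging each with its source."""
    parts = []
    for doc in docs:
        source = doc.metadata.get("source", "unknown")
        parts.append(f"[{source}] {doc.page_content.strip()}")
    return "\n\n".join(parts)
```

With such a helper, the context entry of the chain could be written as `retriever | format_docs`, and a question is then asked with `chain.invoke("...")`. Whether this improves answers for this knowledge base has not been tested here.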

Chatbot conversation interface
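The environment setup installs gradio, but no interface code appears in this report. The sketch below is one possible way to wrap a RAG chain in a Gradio chat window and is not the author's original interface. `make_responder`, `launch_ui`, and `EMPTY_PROMPT` are hypothetical names, and the only assumption about the chain is that it has an `invoke` method that takes the question string.

```python
EMPTY_PROMPT = "Please describe the situation or ask a first-aid question."


def make_responder(chain):
    """Wrap a chain-like object (anything with .invoke) as a chat callback."""
    def respond(message, history):
        # gr.ChatInterface calls the callback with the new message and the history.
        if not message.strip():
            return EMPTY_PROMPT
        return chain.invoke(message)
    return respond


def launch_ui(chain):
    """Start a local Gradio chat window backed by the given chain."""
    import gradio as gr  # imported lazily so make_responder works without Gradio

    demo = gr.ChatInterface(
        fn=make_responder(chain),
        title="First-aid knowledge assistant",
    )
    demo.launch()
```

Calling `launch_ui(chain)` after the chain has been built would start the interface in the browser.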


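Testing and tuning:

Adjust the expert knowledge base so that it matches the questions better. For example, when a knowledge-base entry read [Deciding whether CPR is needed involves the following 3 steps: 1 ..., 2 ..., 3 ...], RAG matching worked poorly and the 3 steps could not be retrieved. After the entry was rewritten as [Deciding whether CPR is needed involves the following 3 steps: 1. to decide whether CPR is needed, first ...; 2. to decide whether CPR is needed, next ...; 3. to decide whether CPR is needed, also ...], the small model with RAG could match the knowledge-base content perfectly in conversation.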
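The rewriting trick described above can be automated. The sketch below restates the topic in every step so that each line can be matched on its own; one plausible reason this helps is that the question's wording then also appears in each step. `expand_steps` is a hypothetical helper, and the step texts are placeholders, not real first-aid content.

```python
def expand_steps(topic, steps):
    """Return one knowledge-base line per step, each restating the topic."""
    total = len(steps)
    return [
        f"{topic}, step {number} of {total}: {step}"
        for number, step in enumerate(steps, start=1)
    ]


# Placeholder step texts; the real steps come from the knowledge-base source.
lines = expand_steps(
    "Deciding whether CPR is needed",
    ["<step 1 text>", "<step 2 text>", "<step 3 text>"],
)
```

The resulting lines would replace the original single entry in the knowledge-base text file before the vector store is rebuilt.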

This is just a small trick I found while debugging. Anyone passing by is welcome to discuss it further.

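Project results and demonstration:

Application scenario:

This project is a small experiment in applying a chatbot to the first-aid domain.

Feature demo: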

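Problems and solutions:

Problem analysis:

Answers did not match the knowledge-base content well.

Solutions:

Tighten the connection between the knowledge-base content and the questions;

As a next step, try adjusting the token and chunk parameter settings.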
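One way to explore the chunk settings is to compare how many chunks different size and overlap combinations produce before rebuilding the vector store. The sketch below is only an illustration in plain Python: `sliding_chunks` and `sweep` are hypothetical helpers, and the fixed-size character window is an assumption, not the splitter used in this project (which sets `chunk_size=400`; the model call sets `max_tokens=512`).

```python
def sliding_chunks(text, chunk_size, overlap):
    """Cut text into windows of chunk_size characters that share overlap characters."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
        start += step
    return chunks


def sweep(text, settings):
    """Map each (chunk_size, overlap) pair to the number of chunks it yields."""
    return {
        (size, overlap): len(sliding_chunks(text, size, overlap))
        for size, overlap in settings
    }
```

For example, `sweep(document_text, [(200, 0), (400, 0), (400, 50)])` gives a quick view of how each setting changes the number of chunks; retrieval quality still has to be checked by asking real questions.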

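Project summary and outlook:

Project evaluation:

Successes: after the knowledge-base content was adjusted, answer accuracy in conversation improved substantially;

Shortcomings: content that depends on surrounding context is not retrieved and matched well from the knowledge base;

Future directions:

Try other models, for example models based on the Mamba architecture.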


Reposted from: https://blog.csdn.net/SophiaC2020/article/details/141298614
Copyright belongs to the original author, Sophia.Chen 小灰灰. In case of infringement, please contact us for removal.
