Offline Installation of Ollama (Pure CPU Version) on Ubuntu 24.04 and Integration with Spring AI

1. Download version 0.3.13 from the official website

Ollama offline installation package download address
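If you first need to fetch the bundle from a machine that does have internet access, a minimal sketch (the GitHub release asset name below is an assumption based on Ollama's release naming; use ollama-linux-arm64.tgz on ARM):

# On an internet-connected machine, download the v0.3.13 CPU bundle
curl -fL -o ollama-linux-amd64.tgz \
  https://github.com/ollama/ollama/releases/download/v0.3.13/ollama-linux-amd64.tgz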


2. Upload the package to the Ubuntu server

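For example, with scp (the user, host, and target directory are illustrative; the IP matches the base-url used later in this article):

# Copy the bundle, and later the edited install script, to the server
scp ollama-linux-amd64.tgz root@192.168.200.94:/opt/ollama/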


3. Download the installation script

# Save the official install script locally so it can be edited
curl -fsSL https://ollama.com/install.sh -o install.sh

Change the script so that instead of pulling the Ollama bundle remotely, it extracts the local archive.

The original script code that needs to be modified:
if curl -I --silent --fail --location "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" >/dev/null ; then
    status "Downloading Linux ${ARCH} bundle"
    curl --fail --show-error --location --progress-bar \
        "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \
        $SUDO tar -xzf - -C "$OLLAMA_INSTALL_DIR"
    BUNDLE=1
    if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
        status "Making ollama accessible in the PATH in $BINDIR"
        $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
    fi
else
    status "Downloading Linux ${ARCH} CLI"
    curl --fail --show-error --location --progress-bar -o "$TEMP_DIR/ollama" \
        "https://ollama.com/download/ollama-linux-${ARCH}${VER_PARAM}"
    $SUDO install -o0 -g0 -m755 $TEMP_DIR/ollama $OLLAMA_INSTALL_DIR/ollama
    BUNDLE=0
    if [ "$OLLAMA_INSTALL_DIR/ollama" != "$BINDIR/ollama" ] ; then
        status "Making ollama accessible in the PATH in $BINDIR"
        $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
    fi
fi
The modified code:
status "Downloading Linux ${ARCH} bundle"#    curl --fail --show-error --location --progress-bar \#        "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \$SUDOtar -xzf ./ollama-linux-amd64.tgz -C "$OLLAMA_INSTALL_DIR"BUNDLE=1if["$OLLAMA_INSTALL_DIR/bin/ollama"!="$BINDIR/ollama"];then
    status "Making ollama accessible in the PATH in $BINDIR"$SUDOln -sf "$OLLAMA_INSTALL_DIR/ollama""$BINDIR/ollama"fi

4. Remove the GPU-related downloads (ROCm, etc.) for a pure-CPU script

Building on step 3, the GPU-related part is removed as well: in the original script, everything from the WSL2 comment onward is deleted.
The complete offline, CPU-only installation script:
#!/bin/sh
# This script installs Ollama on Linux.
# It detects the current operating system architecture and installs the appropriate version of Ollama.

set -eu

status() { echo ">>> $*" >&2; }
error() { echo "ERROR $*"; exit 1; }
warning() { echo "WARNING: $*"; }

TEMP_DIR=$(mktemp -d)
cleanup() { rm -rf $TEMP_DIR; }
trap cleanup EXIT

available() { command -v $1 >/dev/null; }
require() {
    local MISSING=''
    for TOOL in $*; do
        if ! available $TOOL; then
            MISSING="$MISSING $TOOL"
        fi
    done

    echo $MISSING
}

[ "$(uname -s)" = "Linux" ] || error 'This script is intended to run on Linux only.'

ARCH=$(uname -m)
case "$ARCH" in
    x86_64) ARCH="amd64" ;;
    aarch64 | arm64) ARCH="arm64" ;;
    *) error "Unsupported architecture: $ARCH" ;;
esac

IS_WSL2=false

KERN=$(uname -r)
case "$KERN" in
    *icrosoft*WSL2 | *icrosoft*wsl2) IS_WSL2=true;;
    *icrosoft) error "Microsoft WSL1 is not currently supported. Please use WSL2 with 'wsl --set-version <distro> 2'" ;;
    *) ;;
esac

VER_PARAM="${OLLAMA_VERSION:+?version=$OLLAMA_VERSION}"

SUDO=
if [ "$(id -u)" -ne 0 ]; then
    # Running as root, no need for sudo
    if ! available sudo; then
        error "This script requires superuser permissions. Please re-run as root."
    fi

    SUDO="sudo"
fi

NEEDS=$(require curl awk grep sed tee xargs)
if [ -n "$NEEDS" ]; then
    status "ERROR: The following tools are required but missing:"
    for NEED in $NEEDS; do
        echo "  - $NEED"
    done
    exit 1
fi

for BINDIR in /usr/local/bin /usr/bin /bin; do
    echo $PATH | grep -q $BINDIR && break || continue
done
OLLAMA_INSTALL_DIR=$(dirname ${BINDIR})

status "Installing ollama to $OLLAMA_INSTALL_DIR"
$SUDO install -o0 -g0 -m755 -d $BINDIR
$SUDO install -o0 -g0 -m755 -d "$OLLAMA_INSTALL_DIR"
status "Downloading Linux ${ARCH} bundle"
#    curl --fail --show-error --location --progress-bar \
#        "https://ollama.com/download/ollama-linux-${ARCH}.tgz${VER_PARAM}" | \
$SUDO tar -xzf ./ollama-linux-amd64.tgz -C "$OLLAMA_INSTALL_DIR"
BUNDLE=1
if [ "$OLLAMA_INSTALL_DIR/bin/ollama" != "$BINDIR/ollama" ] ; then
    status "Making ollama accessible in the PATH in $BINDIR"
    $SUDO ln -sf "$OLLAMA_INSTALL_DIR/ollama" "$BINDIR/ollama"
fi

install_success() {
    status 'The Ollama API is now available at 127.0.0.1:11434.'
    status 'Install complete. Run "ollama" from the command line.'
}
trap install_success EXIT

# Everything from this point onwards is optional.

configure_systemd() {
    if ! id ollama >/dev/null 2>&1; then
        status "Creating ollama user..."
        $SUDO useradd -r -s /bin/false -U -m -d /usr/share/ollama ollama
    fi
    if getent group render >/dev/null 2>&1; then
        status "Adding ollama user to render group..."
        $SUDO usermod -a -G render ollama
    fi
    if getent group video >/dev/null 2>&1; then
        status "Adding ollama user to video group..."
        $SUDO usermod -a -G video ollama
    fi

    status "Adding current user to ollama group..."
    $SUDO usermod -a -G ollama $(whoami)

    status "Creating ollama systemd service..."
    cat <<EOF | $SUDO tee /etc/systemd/system/ollama.service >/dev/null
[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=$BINDIR/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=$PATH"

[Install]
WantedBy=default.target
EOF
    SYSTEMCTL_RUNNING="$(systemctl is-system-running || true)"
    case $SYSTEMCTL_RUNNING in
        running|degraded)
            status "Enabling and starting ollama service..."
            $SUDO systemctl daemon-reload
            $SUDO systemctl enable ollama

            start_service() { $SUDO systemctl restart ollama; }
            trap start_service EXIT
            ;;
    esac
}

if available systemctl; then
    configure_systemd
fi

install_success
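After the script finishes, a quick sanity check:

ollama --version         # prints the installed version
systemctl status ollama  # the service should be active (running)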


5. Common Ollama commands

# Stop the ollama service
service ollama stop

ollama serve  # Start ollama
ollama create  # Create a model from a Modelfile
ollama show  # Show model information
ollama run qwen2.5:3b-instruct-q4_K_M  # Run a model (downloads it automatically first if needed)
ollama pull  # Pull a model from the registry
ollama push  # Push a model to the registry
ollama list  # List downloaded models
ollama ps  # List running models
ollama cp  # Copy a model
ollama rm  # Remove a model
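For example, a one-off prompt without entering the interactive session:

ollama run qwen2.5:3b-instruct-q4_K_M "Hello, please introduce yourself."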

6. Remote access test

It is recommended not to enable this in production: there is no token-based or similar restriction, so pay close attention to API security.

1. First, stop the ollama service:

systemctl stop ollama

2. Edit the ollama service file:

vim /etc/systemd/system/ollama.service

3. Add the following line to the [Service] section:

Environment="OLLAMA_HOST=0.0.0.0:11434"
The complete service file then looks like this:

[Unit]
Description=Ollama Service
After=network-online.target

[Service]
ExecStart=/usr/local/bin/ollama serve
User=ollama
Group=ollama
Restart=always
RestartSec=3
Environment="PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin"
Environment="OLLAMA_HOST=0.0.0.0:11434"

[Install]
WantedBy=default.target
4. Start ollama:

systemctl daemon-reload
systemctl start ollama
# If startup fails, run `ollama serve` in the foreground to debug
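From another machine, you can then verify remote access against Ollama's HTTP API (the IP is illustrative and matches the base-url used in the next section):

curl http://192.168.200.94:11434/api/generate -d '{
  "model": "qwen2.5:3b-instruct-q4_K_M",
  "prompt": "Hello",
  "stream": false
}'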

7. Integrating with Spring AI

<dependencyManagement>
    <dependencies>
        <!-- Spring Boot dependencies -->
        <dependency>
            <groupId>org.springframework.boot</groupId>
            <artifactId>spring-boot-dependencies</artifactId>
            <version>${spring.boot.version}</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
        <dependency>
            <groupId>org.springframework.ai</groupId>
            <artifactId>spring-ai-bom</artifactId>
            <version>1.0.0-SNAPSHOT</version>
            <type>pom</type>
            <scope>import</scope>
        </dependency>
    </dependencies>
</dependencyManagement>

<dependencies>
    <dependency>
        <groupId>org.springframework.boot</groupId>
        <artifactId>spring-boot-starter-web</artifactId>
    </dependency>
    <!-- https://mvnrepository.com/artifact/org.springframework.ai/spring-ai-ollama-spring-boot-starter -->
    <dependency>
        <groupId>org.springframework.ai</groupId>
        <artifactId>spring-ai-ollama-spring-boot-starter</artifactId>
    </dependency>
</dependencies>

<repositories>
    <repository>
        <id>spring-milestones</id>
        <name>Spring Milestones</name>
        <url>https://repo.spring.io/milestone</url>
        <snapshots>
            <enabled>false</enabled>
        </snapshots>
    </repository>
    <repository>
        <id>spring-snapshots</id>
        <name>Spring Snapshots</name>
        <url>https://repo.spring.io/snapshot</url>
        <releases>
            <enabled>false</enabled>
        </releases>
    </repository>
</repositories>
If the dependencies above cannot be resolved, they may be intercepted globally by a mirror in settings.xml (e.g., let them through with something like `<mirrorOf>*,!spring-milestones,!spring-snapshots</mirrorOf>`). The Spring AI artifacts above have not yet been published to Maven Central.

See a Maven multi-repository / private-repository template configuration for reference.

spring:
  application:
    name: spring-ai-ollama
  ai:
    ollama:
      base-url: http://192.168.200.94:11434
      chat:
        # To make the model output more diverse or random, increase temperature.
        # With non-zero temperature, start from a top-p around 0.95 (or a top-k around 250) and lower temperature as needed.
        # If there is too much nonsense, garbage output, or hallucination, lower temperature and lower top-p/top-k.
        # If temperature is high but the output diversity is still low, increase top-p/top-k.
        # For more diverse topics, increase the presence penalty.
        # For more diverse and less repetitive output, increase the frequency penalty.
        options:
          # If a model is specified in code, it takes precedence; if the code does not specify one, the model configured here is used
#          model: qwen:0.5b-chat
          model: qwen2.5:3b-instruct-q4_K_M
          # Maximum number of tokens
          max_tokens: 2048
          # Higher temperature lowers accuracy; lower temperature raises it.
          # If you want a single answer per prompt: zero.
          # If you want multiple answers per prompt: non-zero.
          temperature: 0.4
          # Nucleus sampling; larger values mean more randomness.
          # With temperature at zero: output is unaffected.
          # With non-zero temperature: use a non-zero value.
          top-p: 0.2
          # Greedy-decoding cutoff; larger values mean more randomness
          top-k: 40
          # Frequency penalty: penalizes a token each time it appears in the text. This discourages repeating the same tokens/words/phrases and also makes the model vary its topics more and switch topics more often.
          # When the question has only one correct answer: zero.
          # When the question has multiple correct answers: your choice.
          frequency-penalty: 0
          # Presence penalty: penalizes a token once it has already appeared in the text, making topics more diverse and topic changes more frequent, without noticeably suppressing repeats of common words
          presence-penalty: 0
import jakarta.annotation.Resource;
import java.util.HashMap;
import java.util.Map;
import org.springframework.ai.chat.model.ChatResponse;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.http.MediaType;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@RestController
public class QianWenController {

    @Resource
    private OllamaChatModel ollamaChatModel;

    @RequestMapping(value = "/ai/ollama")
    public Object ollama(@RequestParam(value = "msg") String msg) {
        String called = ollamaChatModel.call(msg);
        System.out.println(called);
        return called;
    }

    @RequestMapping(value = "/ai/ollama2")
    public Map<String, Object> ollama2(@RequestParam(value = "msg") String msg) {
        Map<String, Object> map = new HashMap<String, Object>();
        long start = System.currentTimeMillis();
        ChatResponse chatResponse = ollamaChatModel.call(new Prompt(msg,
                OllamaOptions.create()
                        .withModel("qwen2.5:3b-instruct-q4_K_M") // which model to use
                        .withTemperature(0.4D))); // temperature: higher lowers accuracy, lower raises it
        String content = chatResponse.getResult().getOutput().getContent();
        long end = System.currentTimeMillis();
        map.put("content", content);
        map.put("time", (end - start) / 1000);
        return map;
    }

    @RequestMapping(value = "/ai/stream", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
    public Flux<String> stream(@RequestParam(value = "msg") String msg) {
        return ollamaChatModel
                .stream(new Prompt(msg))
                .flatMapSequential(chunk -> Flux.just(chunk.getResult().getOutput().getContent()));
    }
}
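Once the application is running, the endpoints can be exercised with curl (localhost:8080 assumes the default Spring Boot port):

curl "http://localhost:8080/ai/ollama?msg=hello"
curl "http://localhost:8080/ai/ollama2?msg=hello"
curl -N "http://localhost:8080/ai/stream?msg=hello"   # -N disables buffering so the SSE stream prints as it arrives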


Tags: Artificial Intelligence

Reprinted from: https://blog.csdn.net/m0_50913327/article/details/143000552
Copyright belongs to the original author 最难不过坚持丶渊洁. In case of infringement, please contact us for removal.
