ChatGLM2 Chatbot-TPU

Chatbot-TPU is an application that uses the Sophon SDK to port ChatGLM2, the open-source model from Tsinghua University's KEG Lab, to SG2300X-series products, enabling hardware-accelerated inference on the local TPU. It provides a Gradio-based chatbot interface through which users can ask the model everyday questions.

Clone the repository:

```
git clone https://github.com/zifeng-radxa/chatbot
```

Download a ChatGLM2 model. This example provides three: int8-2048, int8-1024, and int4-512.

The commands below use the int4-512 model (quantized to int4, with a maximum token length of 512). The commands for the other two models are commented out:

```
# chatglm-int4-512
wget https://github.com/radxa-edge/TPU-Edge-AI/releases/download/chatglm-int4-512/tar_downloader.sh
bash tar_downloader.sh
tar -xvf chatglm-int4-512.tar.gz

# chatglm-int8-1024
# wget https://github.com/radxa-edge/TPU-Edge-AI/releases/download/chatglm-int8-1024/tar_downloader.sh
# bash tar_downloader.sh
# tar -xvf chatglm-int8-1024.tar.gz

# chatglm-int8-2048
# wget https://github.com/radxa-edge/TPU-Edge-AI/releases/download/chatglm-int8-2048/tar_downloader.sh
# bash tar_downloader.sh
# tar -xvf chatglm-int8-2048.tar.gz
```
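
The three command groups above differ only in the model name, so they can be generated from one template. The following is a minimal sketch under that assumption; `download_commands` is a hypothetical helper that is not part of the repository, and the URL pattern is copied from the `wget` lines above.

```python
from typing import List

# Model names and URL prefix as they appear in the wget commands of this guide.
MODELS = ("chatglm-int4-512", "chatglm-int8-1024", "chatglm-int8-2048")
BASE_URL = "https://github.com/radxa-edge/TPU-Edge-AI/releases/download"


def download_commands(model: str) -> List[str]:
    """Return the three shell commands this guide uses for the given model."""
    if model not in MODELS:
        raise ValueError(f"unknown model {model!r}; expected one of {MODELS}")
    return [
        f"wget {BASE_URL}/{model}/tar_downloader.sh",
        "bash tar_downloader.sh",
        f"tar -xvf {model}.tar.gz",
    ]


if __name__ == "__main__":
    # Print the commands for one model so they can be pasted into a shell.
    for command in download_commands("chatglm-int8-1024"):
        print(command)
```

The helper only prints the commands; running them is still done by hand in the directory that contains `chatbot`.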

The resulting file structure will be as follows:

```
.
├── chatbot
└── chatglm-int4-512
```

Modify the config.ini configuration file according to the selected model:
```
cd chatbot
vim config.ini
```
```
[llm_model]
libtpuchat_path = ../chatglm-int4-512/libtpuchat.so
bmodel_path = ../chatglm-int4-512/chatglm2-6b_512_int4.bmodel
token_path = ../chatglm-int4-512/tokenizer.model
```
The paths in config.ini must point to the files of the model you downloaded. To switch to a different model, update the three paths in the configuration file accordingly.
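
A wrong path in config.ini is easy to catch before starting the application. The sketch below reads the `[llm_model]` section with Python's standard `configparser` and reports entries whose files do not exist. `missing_model_files` is a hypothetical helper, not part of the repository, and it assumes relative paths are resolved against the directory that contains config.ini (the `chatbot` directory in this guide).

```python
import configparser
from pathlib import Path
from typing import List

# Keys taken from the [llm_model] section shown above.
KEYS = ("libtpuchat_path", "bmodel_path", "token_path")


def missing_model_files(config_file: str) -> List[str]:
    """Return the [llm_model] entries whose target file does not exist."""
    config_path = Path(config_file).resolve()
    parser = configparser.ConfigParser()
    if not parser.read(config_path):
        raise FileNotFoundError(str(config_path))
    section = parser["llm_model"]

    missing = []
    for key in KEYS:
        value = section.get(key)
        if value is None:
            missing.append(f"{key} (not set)")
            continue
        target = Path(value)
        if not target.is_absolute():
            # Assumption: relative paths are relative to the config file's directory.
            target = config_path.parent / target
        if not target.is_file():
            missing.append(f"{key} = {value}")
    return missing


if __name__ == "__main__":
    # Only run the check when a config.ini is present in the current directory.
    if Path("config.ini").is_file():
        for entry in missing_model_files("config.ini"):
            print("missing:", entry)
```

Run from the `chatbot` directory, an empty output means all three files were found.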

Set up the environment:

Create a virtual environment so that the dependencies installed here do not interfere with other applications. For virtual environment usage, please refer to this guide.
```
python3 -m virtualenv .venv
source .venv/bin/activate
```

Install dependencies:

```
pip3 install --upgrade pip
pip3 install -r requirements.txt
```

Set environment variables:

```
export LD_LIBRARY_PATH=/opt/sophon/libsophon-current/lib:$LD_LIBRARY_PATH
```
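
Because `export` only affects the current shell session, it can help to confirm that the variable is set in the shell that will launch the application. This is a minimal sketch; `on_library_path` is a hypothetical helper, and the directory is the one from the `export` line above.

```python
import os
from typing import Optional

# Directory taken from the export command in this guide.
SOPHON_LIB_DIR = "/opt/sophon/libsophon-current/lib"


def on_library_path(lib_dir: str = SOPHON_LIB_DIR, value: Optional[str] = None) -> bool:
    """Return True if lib_dir is one of the colon-separated entries in value.

    When value is None, the current LD_LIBRARY_PATH environment variable is used.
    """
    if value is None:
        value = os.environ.get("LD_LIBRARY_PATH", "")
    wanted = os.path.normpath(lib_dir)
    entries = [entry for entry in value.split(":") if entry]
    return any(os.path.normpath(entry) == wanted for entry in entries)


if __name__ == "__main__":
    print("libsophon on LD_LIBRARY_PATH:", on_library_path())
```

If this prints `False`, rerun the `export` line in the same terminal before starting the web service.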

Start the web service:
```
python3 web_demo.py
```
Open a browser and go to port 7860 at the Airbox's IP address.
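
If the page does not load, a quick TCP check from another machine can tell whether the service is reachable at all. The sketch below uses only Python's standard `socket` module; `is_port_open` is a hypothetical helper, and the IP address in the usage example is a placeholder to replace with your Airbox's address.

```python
import socket


def is_port_open(host: str, port: int = 7860, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, `is_port_open("192.168.1.100")` returns `True` once `web_demo.py` is listening on port 7860 and the address is reachable from your machine.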

Application Display