Run an LLM and API server with ngrok on Google Colab

Nov 6, 2023

What is this?

This is a Jupyter notebook that serves an LLM through API endpoints. It uses llama.cpp, ngrok, and a model from TheBloke; the base notebook uses zephyr-7b.

How to use


step0. Create the accounts

You need a Google account and an ngrok account; create them if you don’t have them.

step1. Copy the Jupyter notebook

step2. Create a secret key

Click the key icon in Google Colab’s sidebar and add your ngrok token as a secret. In the notebook, I named it NGROK; you can change that to anything you want.
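For reference, reading the secret inside the notebook looks roughly like this. This is a sketch: `userdata.get` is Colab’s secrets API, and the environment-variable fallback is only an assumption for running the same code outside Colab.

```python
import os

def get_ngrok_token(name: str = "NGROK") -> str:
    """Return the ngrok token from Colab's secret store, or an env var as a fallback."""
    try:
        from google.colab import userdata  # only importable inside Colab
        return userdata.get(name)
    except ImportError:
        # Outside Colab (e.g. local testing), read from the environment instead.
        return os.environ.get(name, "")
```

If you renamed the secret in step 2, pass that name instead of the default `"NGROK"`.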

step3. Run the jupyter notebook

After setting the secret key, run the notebook to start the API server.
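Roughly, what the notebook does at this step is start a FastAPI server with uvicorn and tunnel it through ngrok. A hedged sketch of that flow; the app path `main:app`, port 8000, and the use of pyngrok are assumptions, so check the actual notebook cells:

```python
import subprocess

def uvicorn_command(port: int = 8000) -> list[str]:
    """Build the uvicorn launch command; 'main:app' is a placeholder app path."""
    return ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", str(port)]

# Inside the notebook you would run something like:
#   server = subprocess.Popen(uvicorn_command())   # start FastAPI in the background
#   from pyngrok import ngrok                      # if the notebook uses pyngrok
#   public_url = ngrok.connect(8000).public_url    # expose the server to the internet
```

ngrok prints a forwarding URL (the `public_url` above); that is the address you use in the next steps.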

step4. Check the API server

If everything works properly, you can access https://ngrok_address/docs and see FastAPI’s interactive docs page, which lists all the available endpoints.
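You can also check the server programmatically. A sketch: `/openapi.json` is the schema FastAPI builds the /docs page from, and the base URL is your ngrok forwarding address.

```python
import json
import urllib.request

def docs_url(base_url: str) -> str:
    """URL of the interactive docs page."""
    return base_url.rstrip("/") + "/docs"

def list_endpoints(base_url: str) -> list[str]:
    """Fetch the OpenAPI schema and return the paths the server exposes."""
    with urllib.request.urlopen(base_url.rstrip("/") + "/openapi.json") as resp:
        return sorted(json.load(resp).get("paths", {}))
```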

Do not run the last two lines until you want to shut everything down: the first kills the FastAPI server and the second kills ngrok.

!pkill uvicorn
!pkill ngrok

step5. Call the endpoint

Now you can call the endpoint from any language you like. In the repo, I included a Python sample; all you need to change is the endpoint URL.
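As an illustration in Python (a sketch, assuming the server exposes llama-cpp-python’s OpenAI-compatible `/v1/completions` endpoint; swap in your own ngrok forwarding URL):

```python
import json
import urllib.request

def build_request(base_url: str, prompt: str, max_tokens: int = 128) -> urllib.request.Request:
    """Build a POST request for the completions endpoint."""
    body = json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode()
    return urllib.request.Request(
        base_url.rstrip("/") + "/v1/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )

def complete(base_url: str, prompt: str) -> str:
    """Send the prompt to the server and return the generated text."""
    with urllib.request.urlopen(build_request(base_url, prompt)) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Usage would be something like `complete("https://your-ngrok-address", "Hello")`; the response schema above mirrors the OpenAI completions format that llama-cpp-python’s server follows.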

The response speed won’t be great, but this setup is useful for quickly prototyping ideas that combine an LLM with something else, without spending much time or money on an environment for running LLMs.




Software engineer at a biotechnology research startup in Brooklyn.