Public Open LLM API
This is a solution we built that hosts different versions of open source LLMs in the background, which you can test against.
Introduction
As we discussed before, self hosting LLMs is really a one-off effort: once you set up a self hosted LLM and deploy it on a server, everyone can use it.
There are several open source solutions for this that let you pull their code, run it on your local or cloud machines, and test from there.
However, this still requires some technical skill, and I found that the quality of and support for these codebases still need more work.
So I decided to build a central endpoint, deploy the Open Source LLM models requested by researchers, and let everyone with a valid token call the endpoint to evaluate model performance, which can contribute to the wider research community.
In this way:
You only need to deal with the endpoint interface we provide, which simplifies everything in the background.
You can contact us if you want us to deploy a new Open LLM model.
We will actively fix bugs in the background.
We have built a list of features specifically for researchers, which will benefit researchers.
For example, if you want to evaluate Open LLM performance on a list of prompts, then after you make all the calls via the API endpoints, you can use our admin page to download all the request content and results, even the time it took the Open LLM to finish each job. We will keep doing this actively.
It is hosted on a Pawsey machine and the data stays entirely local there, so you do not need to worry about data leaking; however, this will probably require further discussion based on your specific requirements.
If a lot of researchers use it, we may be able to contribute to the hardware together to make it faster for everyone.
....
It does have downsides:
It relies on us to keep it running smoothly; hopefully we have the energy to continue this.
And other downsides you can name, haha.
But anyway, we are nearly there now.
We plan to build:
An API endpoint, so you can easily make an HTTP request with a model_name and prompts
A web interface, so you can test it out via a browser
A database storing the transactional data, which you can download and keep for your research
So we are there: we have a frontend application interface where you can log in and upload CSV/JSON files with the LLM tasks you want to evaluate.
We also provide an API interface, so you can queue tasks with Python or any other language you want to use.
Access the Open Source LLM Evaluation Dashboard
Link: https://llm.nlp-tlp.org/
You will need an account to log in. If you are interested, feel free to contact us to set one up. We are in the testing stage now and will open public registration later.
JSON and CSV examples
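The exact upload schema is defined by the platform itself, so treat the following as a hedged illustration only: the field names model_name and prompt are assumptions based on the endpoint description in this document, not a confirmed spec.

```json
[
  {"model_name": "llama2-7b-chat", "prompt": "What is the capital of Australia?"},
  {"model_name": "llama2-7b-chat", "prompt": "List three uses of gallium."}
]
```

The same tasks as a CSV upload might look like:

```csv
model_name,prompt
llama2-7b-chat,What is the capital of Australia?
llama2-7b-chat,List three uses of gallium.
```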
Access the Public Open LLM API
The documentation URL is: https://api.nlp-tlp.org/redoc/
Under the LLM section
So there is one important endpoint now:
list all available llm models
Auth
To make a valid call to the endpoints, you will need a valid token. Feel free to contact us if you want one: pascal.sun@research.uwa.edu.au
Or, if you already have an account, you can generate one via the WA Data & LLM platform UI.
With the token, you can then set your HTTP request header accordingly, which will allow you to call the endpoints.
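As a minimal sketch using only the Python standard library: the header scheme ("Token &lt;token&gt;") and the base URL path are assumptions here, so confirm both against the ReDoc page before relying on them.

```python
import json
import urllib.request

API_BASE = "https://api.nlp-tlp.org"  # assumed base URL


def build_headers(token: str) -> dict:
    """Attach the token to every request via the Authorization header.

    The "Token <token>" scheme is an assumption; the ReDoc page documents
    the exact format the API expects.
    """
    return {
        "Authorization": f"Token {token}",
        "Content-Type": "application/json",
    }


def get_json(path: str, token: str) -> dict:
    """GET a JSON endpoint with the auth header set."""
    req = urllib.request.Request(API_BASE + path, headers=build_headers(token))
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())

# Example (requires a valid token and the real endpoint path):
# models = get_json("/llm/models", "your-token-here")
```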
List Available LLMs
This endpoint lists all the models in the system records. If a model is downloaded and ready to go, its available field will be marked as true. The model_name field is the one you care about, as it will be the input for your other requests later.
You can request to add models by contacting us: pascal.sun@research.uwa.edu.au
Name the model you want to add, its Hugging Face repo id, etc. We will try to add it as soon as possible.
One example return:
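The available and model_name fields come from the description above; the other fields shown here are assumptions about the response shape, so check a live response for the exact keys.

```json
[
  {"model_name": "llama2-7b-chat", "model_size": "7b", "available": true},
  {"model_name": "gemma-2b", "model_size": "2b", "available": false}
]
```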
Queue a task or tasks
If you want to queue a list of tasks, or because some models take a long time to finish a request, you can queue the task and then grab the result(s) later.
So the endpoint is
The data should look like
You will need to authenticate with your token in the same way.
It will return something like this:
Then you can use the task_id to track the progress via the status endpoint, which we describe below.
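The flow above can be sketched as follows; note that the endpoint path /queue_task and the payload field names model_name and prompt are assumptions for illustration, so verify them against the ReDoc page.

```python
import json
import urllib.request

API_BASE = "https://api.nlp-tlp.org"  # assumed base URL


def build_task_payload(model_name: str, prompt: str) -> bytes:
    """Serialise one task; the field names are assumptions, not a spec."""
    return json.dumps({"model_name": model_name, "prompt": prompt}).encode()


def queue_task(token: str, model_name: str, prompt: str) -> str:
    """POST one task and return its task_id (hypothetical path and shape)."""
    req = urllib.request.Request(
        API_BASE + "/queue_task",  # hypothetical path; see the ReDoc page
        data=build_task_payload(model_name, prompt),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed response shape: {"task_id": "..."}
        return json.loads(resp.read().decode())["task_id"]
```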
To queue a list of tasks, you can use this endpoint
Data will be
Code will be like
You will get a return like this
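A hedged batch sketch follows; the /queue_tasks path, the "tasks" wrapper key, and the "task_ids" response key are all assumptions for illustration, so confirm the real schema via the ReDoc page.

```python
import json
import urllib.request

API_BASE = "https://api.nlp-tlp.org"  # assumed base URL


def build_batch_payload(tasks: list) -> bytes:
    """Serialise a list of {"model_name": ..., "prompt": ...} task dicts.

    The "tasks" wrapper key is an assumption about the batch schema.
    """
    return json.dumps({"tasks": tasks}).encode()


def queue_tasks(token: str, tasks: list) -> list:
    """POST a batch of tasks and return their ids (hypothetical shape)."""
    req = urllib.request.Request(
        API_BASE + "/queue_tasks",  # hypothetical path
        data=build_batch_payload(tasks),
        headers={
            "Authorization": f"Token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        # Assumed response shape: {"task_ids": ["...", "..."]}
        return json.loads(resp.read().decode()).get("task_ids", [])
```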
Check task status
After you have the task_id, you can track the progress via this endpoint
Authenticate it with your token
And you should be able to get some results like
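A polling loop for long-running tasks can be sketched like this; the /task_status/{task_id} path and the "completed"/"failed" status values are assumptions, so adjust them to what the real endpoint returns.

```python
import json
import time
import urllib.request

API_BASE = "https://api.nlp-tlp.org"  # assumed base URL


def is_finished(status: dict) -> bool:
    """True once the task reports a terminal state (state names assumed)."""
    return status.get("status") in ("completed", "failed")


def fetch_status(token: str, task_id: str) -> dict:
    """GET the status record for one task (hypothetical path)."""
    req = urllib.request.Request(
        f"{API_BASE}/task_status/{task_id}",  # hypothetical path
        headers={"Authorization": f"Token {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read().decode())


def wait_for_result(token: str, task_id: str,
                    interval: float = 5.0, timeout: float = 600.0) -> dict:
    """Poll until the task finishes or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status(token, task_id)
        if is_finished(status):
            return status
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish within {timeout}s")
```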
Supported Models
This is not the latest list; use the endpoint to query the latest one.
| Model Name | Size | Family | Backend | Hugging Face Repo | Model File |
|---|---|---|---|---|---|
| chatglm3-6b | 6b | chatglm | chatglm.cpp | npc0/chatglm3-6b-int4 | chatglm3-ggml-q4_1.bin |
| internlm-20b | 20b | internlm | llama.cpp | intervitens/internlm-chat-20b-GGUF | internlm-chat-20b.Q4_K_M.gguf |
| gemma-7b-instruct | 7b | gemma | llama.cpp | brittlewis12/gemma-7b-it-GGUF | gemma-7b-it.Q4_K_M.gguf |
| gemma-7b | 7b | gemma | llama.cpp | brittlewis12/gemma-7b-GGUF | gemma-7b.Q4_K_M.gguf |
| gemma-2b-instruct | 2b | gemma | llama.cpp | brittlewis12/gemma-2b-it-GGUF | gemma-2b-it.Q4_K_M.gguf |
| gemma-2b | 2b | gemma | llama.cpp | brittlewis12/gemma-2b-GGUF | gemma-2b.Q4_K_M.gguf |
| llama2-13b-chat | 13b | llama2 | llama.cpp | TheBloke/Llama-2-13B-Chat-GGUF | llama-2-13b-chat.Q8_0.gguf |
| llama2-13b | 13b | llama2 | llama.cpp | TheBloke/Llama-2-13B-GGUF | llama-2-13b.Q4_K_M.gguf |
| llama2-7b-chat | 7b | llama2 | llama.cpp | TheBloke/Llama-2-7B-Chat-GGUF | llama-2-7b-chat.Q4_K_M.gguf |
| llama2-7b | 7b | llama2 | llama.cpp | TheBloke/Llama-2-7B-GGUF | llama-2-7b.Q4_K_M.gguf |
| medicine-chat | 13b | medicine-chat | llama.cpp | TheBloke/medicine-chat-GGUF | medicine-chat.Q8_0.gguf |
| medicine-llm-13b | 13b | medicine-llm | llama.cpp | TheBloke/medicine-LLM-13B-GGUF | medicine-llm-13b.Q8_0.gguf |
| dolphin-2.5-mixtral-7x7b | 8x7b | dolphin-2.5-mixtral | llama.cpp | TheBloke/dolphin-2.5-mixtral-8x7b-GGUF | dolphin-2.5-mixtral-8x7b.Q2_K.gguf |
Support
If you need any support, such as adding models or getting the code to work, or if you find problems, feel free to contact: pascal.sun@research.uwa.edu.au or reach out via LinkedIn: https://www.linkedin.com/in/pascalsun23/
We are actively developing the web interface now and will keep you updated. If you have any research ideas and are looking to collaborate, also feel free to contact us.
Last updated