Start a LlamaEdge API service
Since LlamaEdge provides an OpenAI-compatible API service, it can be a drop-in replacement for OpenAI in almost all LLM applications and frameworks. Check out the articles in this section for instructions and examples on how to use locally hosted LlamaEdge API services in popular LLM apps.
Start the API servers for multiple models
First, you will need to start an OpenAI-compatible API server.
- Start an OpenAI-compatible API server for Large Language Models (LLMs) ➔ Get Started with LLM
- Start an OpenAI-compatible API server for Whisper ➔ Get Started with Speech to Text
- Start an OpenAI-compatible API server for GPT-SoVITS and Piper ➔ Get Started with Text to Speech
- Start an OpenAI-compatible API server for Stable Diffusion and FLUX ➔ Get Started with Text-to-Image
- Start an OpenAI-compatible API server for Llava and Qwen-VL ➔ Get Started with Multimodal
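For reference, the LLM guide starts a server that serves both a chat model and an embedding model in a single command. The sketch below follows that pattern; the model file names, prompt templates, and context sizes are illustrative values, so substitute the ones from the guide you follow.

```bash
# A typical launch following the "Get Started with LLM" guide (illustrative
# file names and sizes). The first --nn-preload / --model-name entry is the
# chat model; the second is the embedding model.
wasmedge --dir .:. \
  --nn-preload default:GGML:AUTO:Meta-Llama-3-8B-Instruct-Q5_K_M.gguf \
  --nn-preload embedding:GGML:AUTO:nomic-embed-text-v1.5.f16.gguf \
  llama-api-server.wasm \
  --prompt-template llama-3-chat,embedding \
  --model-name llama-3-8b-chat,nomic-embed \
  --ctx-size 4096,8192 \
  --port 8080
```

Once the server reports that it is listening on port 8080, the OpenAI-compatible endpoints described below are live.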
OpenAI replacement
Now you are ready to use this API server in OpenAI ecosystem apps as a drop-in replacement for the OpenAI API. In general, for any OpenAI tool, you only need to replace the following configuration values.
| Config option | Value | Note |
|---|---|---|
| API endpoint URL | `http://localhost:8080/v1` | If the server is accessible from the web, you can use its public IP address and port |
| Model Name (for LLM) | `llama-3-8b-chat` | The first value specified in the `--model-name` option |
| Model Name (for text embedding) | `nomic-embed` | The second value specified in the `--model-name` option |
| API key | Empty | Or any value if the app does not permit an empty string |
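To verify this mapping before wiring up an app, you can hit the server directly with curl. The prompt text and the placeholder API key below are arbitrary; the model names are the `--model-name` values from the table.

```bash
# Chat completion against the local server. The Authorization header can be
# any placeholder value, since the server does not validate API keys.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer not-used" \
  -d '{
    "model": "llama-3-8b-chat",
    "messages": [{"role": "user", "content": "What is LlamaEdge?"}]
  }'

# Text embeddings use the second model name.
curl http://localhost:8080/v1/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed",
    "input": ["LlamaEdge is a lightweight LLM runtime."]
  }'
```

If both requests return JSON responses, any OpenAI ecosystem app configured with these same values should work against the local server.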