
Quick start with MCP support

One of the key features of Llama-Nexus is its built-in MCP client, which lets you use Llama-Nexus for MCP-related tasks just as you would with Claude Desktop or Cursor.

This tutorial shows how to set up real-time weather functionality using Llama-Nexus, a weather MCP server, and an LLM that supports tool calls. The MCP server must be up and running before you start Llama-Nexus.

Prerequisites

  • OpenWeather API key (obtain from openweathermap.org)
  • macOS or Linux environment
  • Network access for remote server setup (if using remote option)

1. Set Up Your MCP Server

curl -LO https://github.com/decentralized-mcp/gaia-mcp-servers/releases/download/0.6.0/gaia-mcp-servers-unknown-linux-gnu-x86_64.tar.gz
tar xvf gaia-mcp-servers-unknown-linux-gnu-x86_64.tar.gz

Downloads for other platforms are available at: https://github.com/decentralized-mcp/gaia-mcp-servers/releases/tag/0.6.0

Set the environment variables:

export OPENWEATHERMAP_API_KEY=YOUR_KEY
export RUST_LOG=debug
export LLAMA_LOG=debug

Run the MCP server (accessible from external connections):

./gaia-weather-mcp-server --transport stream-http --socket-addr 0.0.0.0:8002

Important: Ensure port 8002 is open in your firewall/security group settings if you're running on a cloud machine.
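
If the server is on a remote machine, you can verify that the port is reachable before moving on. For example, with netcat installed on your local machine:

# Check that port 8002 on the remote host accepts TCP connections.
nc -vz YOUR-IP-ADDRESS 8002

A "succeeded" or "open" result confirms the MCP server is accepting connections.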

2. Set Up the Inference Server

Install llama-nexus

Download and extract llama-nexus:

curl -LO https://github.com/LlamaEdge/llama-nexus/releases/download/0.5.0/llama-nexus-apple-darwin-aarch64.tar.gz
tar xvf llama-nexus-apple-darwin-aarch64.tar.gz

Downloads for other platforms are available at: https://github.com/LlamaEdge/llama-nexus/releases/tag/0.5.0

Configure llama-nexus

Edit the config.toml file to specify the gateway server port:

[server]
host = "0.0.0.0" # The host to listen on
port = 9095 # The port to listen on

Configure the Gaia Weather MCP server connection:


[[mcp.server.tool]]
name = "gaia-weather"
transport = "stream-http"
url = "http://YOUR-IP-ADDRESS:8002/mcp"
enable = true

You can configure multiple MCP servers in the config.toml file by adding additional [[mcp.server.tool]] sections.
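
For example, a second server can sit alongside the weather server. The calculator server below is a hypothetical placeholder; substitute your own name and URL:

[[mcp.server.tool]]
name = "gaia-weather"
transport = "stream-http"
url = "http://YOUR-IP-ADDRESS:8002/mcp"
enable = true

# Hypothetical second MCP server, shown for illustration only.
[[mcp.server.tool]]
name = "gaia-calculator"
transport = "stream-http"
url = "http://YOUR-IP-ADDRESS:8003/mcp"
enable = true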

Start llama-nexus

nohup ./llama-nexus --config config.toml &
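
Since nohup writes output to nohup.out by default, you can follow the gateway logs to confirm it started and is listening on port 9095:

tail -f nohup.out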

Register Downstream API Servers

Register an LLM chat API server for the /chat/completions endpoint:

curl --location 'http://localhost:9095/admin/servers/register' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://0xb2962131564bc854ece7b0f7c8c9a8345847abfb.gaia.domains",
  "kind": "chat"
}'

Register an embedding API server for the /embeddings endpoint:

curl --location 'http://localhost:9095/admin/servers/register' \
--header 'Content-Type: application/json' \
--data '{
  "url": "https://0x448f0405310a9258cd5eab5f25f15679808c5db2.gaia.domains",
  "kind": "embeddings"
}'
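
As a quick smoke test once llama-nexus is running, you can send a request through the gateway's embeddings route. This sketch assumes the gateway exposes an OpenAI-style /v1/embeddings endpoint, mirroring the /v1/chat/completions endpoint used in the next step:

# Assumes an OpenAI-style /v1/embeddings route on the gateway.
curl -X POST http://localhost:9095/v1/embeddings \
-H 'Content-Type: application/json' \
-d '{"input": "What is the weather in Singapore?"}'

A successful response should contain the embedding vector(s) for the input text.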

3. Test the Setup

Test the inference server by requesting the /chat/completions API endpoint:

curl -X POST http://localhost:9095/v1/chat/completions \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"messages":[{"role":"system", "content": "You are a helpful assistant. You will use the tool to solve user problems."},{"role":"user", "content": "What is the weather in Singapore?"}]}'

Expected output:

{
  "id": "chatcmpl-cf63660e-3494-472c-b4d0-6cda72e1f8e9",
  "object": "chat.completion",
  "created": 1751602566,
  "model": "Qwen3-4B",
  "choices": [
    {
      "index": 0,
      "message": {
        "content": "The current temperature in Singapore is 30.13°C. Would you like to know the weather forecast for the next few days as well?",
        "role": "assistant"
      },
      "finish_reason": "stop",
      "logprobs": null
    }
  ],
  "usage": {
    "prompt_tokens": 62,
    "completion_tokens": 32,
    "total_tokens": 94
  }
}
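
To continue the conversation, append the assistant's reply to the messages array and send a follow-up question, following the standard OpenAI-style multi-turn format that the /v1/chat/completions endpoint accepts:

curl -X POST http://localhost:9095/v1/chat/completions \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{"messages":[
  {"role":"system", "content": "You are a helpful assistant. You will use the tool to solve user problems."},
  {"role":"user", "content": "What is the weather in Singapore?"},
  {"role":"assistant", "content": "The current temperature in Singapore is 30.13°C. Would you like to know the weather forecast for the next few days as well?"},
  {"role":"user", "content": "How about London?"}
]}'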