nani-qwen-3.5-2B

A fine-tuned Qwen3.5-2B for crypto wallet tool calling. Trained with Unsloth LoRA on 2,029 examples covering 103 blockchain tools from agentek.

Quickstart

# Download the GGUF
huggingface-cli download NaniDAO/nani-qwen-3.5-2B-gguf-q4km \
  --local-dir ~/models/nani-2b-q4km

# Register with Ollama
cd ~/models/nani-2b-q4km
ollama create nani -f Modelfile

# Run
ollama run nani --system "$(cat system-prompt.txt)" "resolve vitalik.eth"

Model Details

Base model Qwen3.5-2B
Method LoRA (r=16, alpha=16, dropout=0.05)
Training data 2,029 examples, 103 tools
Epochs 3 (optimal — 4th showed no improvement)
Hardware 1x T4 GPU (Kaggle)
Quantization Q4_K_M (this repo)
Context 4096 tokens
License Same as Qwen3.5-2B

Eval Results

Evaluated on 50 held-out examples. Base = Qwen3.5-2B without fine-tuning.

Metric Base Nani Delta
Tool call accuracy 98.0% 96.0% -2.0%
Correct function 75.5% 81.6% +6.1%
Correct params 67.3% 71.4% +4.1%
Format valid 100.0% 100.0% 0.0%
Has <think> block 28.0% 100.0% +72.0%
No-tool correct 0.0% 100.0% +100.0%

The model learned to pick the right tool (+6.1%), use correct parameters (+4.1%), reason before acting (a <think> block in 100% of outputs), and recognize when NOT to call a tool (+100%). The small drop in tool call accuracy (-2%) is mostly sampling noise: 3 of the 4 "failures" succeed on rerun.

Tool Call Format

The model uses Qwen3.5's native XML tool calling format:

<think>
The user wants to resolve an ENS name. I'll use resolveENS.
</think>

<tool_call>
<function=resolveENS>
<parameter=name>vitalik.eth</parameter>
</function>
</tool_call>

Tool results are passed back as tool role messages, then the model generates a final response.

System Prompt Format

Tools must be defined in the system prompt as newline-separated JSON inside <tools> tags. This matches the training data format exactly:

# Tools

You have access to the following functions:

<tools>
{"type":"function","function":{"name":"resolveENS","description":"Resolves an ENS name to an Ethereum address","parameters":{"type":"object","properties":{"name":{"type":"string","description":"The ENS name to resolve"}},"required":["name"]}}}
{"type":"function","function":{"name":"getBalance","description":"Get the native token (ETH) balance for an address","parameters":{"type":"object","properties":{"address":{"type":"string","description":"The wallet address (0x...)"},"chainId":{"type":"number","description":"Chain ID (1=Ethereum, 8453=Base)"}},"required":["address"]}}}
</tools>

If you choose to call a function ONLY reply in the following format with NO suffix:

<tool_call>
<function=example_function_name>
<parameter=example_parameter_1>
value_1
</parameter>
</function>
</tool_call>

<IMPORTANT>
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- You may provide optional reasoning BEFORE the function call, but NOT after
- If there is no function call available, answer the question like normal
</IMPORTANT>

You are Nani, a crypto wallet assistant.

Keep tool schemas simple — use {"type": "string", "description": "..."} per property. Complex schemas with anyOf, $ref, or nested objects confuse the 2B model.
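The prompt above can be assembled programmatically. A minimal Python sketch (build_system_prompt is a hypothetical helper, not part of this repo; the reminder text is copied verbatim from the format above):

```python
import json

REMINDER = """If you choose to call a function ONLY reply in the following format with NO suffix:

<tool_call>
<function=example_function_name>
<parameter=example_parameter_1>
value_1
</parameter>
</function>
</tool_call>

<IMPORTANT>
Reminder:
- Function calls MUST follow the specified format
- Required parameters MUST be specified
- You may provide optional reasoning BEFORE the function call, but NOT after
- If there is no function call available, answer the question like normal
</IMPORTANT>
"""

def build_system_prompt(tools: list[dict]) -> str:
    # One compact JSON object per line inside <tools> tags,
    # matching the training data format described above.
    tool_lines = "\n".join(json.dumps(t, separators=(",", ":")) for t in tools)
    return (
        "# Tools\n\n"
        "You have access to the following functions:\n\n"
        f"<tools>\n{tool_lines}\n</tools>\n\n"
        f"{REMINDER}\n"
        "You are Nani, a crypto wallet assistant."
    )
```

Pass the same tool dicts you would send to any OpenAI-style API; json.dumps with compact separators keeps each schema on a single line as the training data expects.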

Supported Tools

Trained on 103 tools from agentek. Top tools by training examples:

Tool Examples Category
intentSwap 183 DEX trading
intentTransfer 157 Token transfers
resolveENS 71 ENS resolution
getBalance 64 Balance queries
getBalanceOf 59 ERC20 balances
resolveWNS 58 WNS resolution
getCryptoPrice 55 Price data
lookupENS 54 Reverse ENS
getQuote 53 Swap quotes
getFearAndGreedIndex 50 Market sentiment

Full coverage includes: ENS/WNS, ERC20, Uniswap V3, Aave, bridging (Across), security (ScamSniffer), Blockscout explorer, gas estimation, DefiLlama yields, NFTs, and more.

Ollama Modelfile

FROM ./nani-qwen-3.5-2B-Q4_K_M.gguf

PARAMETER temperature 0.7
PARAMETER top_p 0.8
PARAMETER top_k 20
PARAMETER stop "<|im_start|>"
PARAMETER stop "<|im_end|>"
PARAMETER num_ctx 4096

TEMPLATE """{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{- range .Messages }}{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ .Content }}<|im_end|>
{{ else if eq .Role "tool" }}<|im_start|>tool
{{ .Content }}<|im_end|>
{{ end }}{{- end }}<|im_start|>assistant
"""

SYSTEM """You are Nani, a crypto wallet assistant."""

Do NOT add </tool_call> as a stop token — it prevents the model from outputting the closing tag, breaking tool call parsing.

Ollama API Usage

Call Ollama directly and parse tool calls from the text output:

curl http://localhost:11434/api/chat -d '{
  "model": "nani",
  "messages": [
    {"role": "system", "content": "<your system prompt with tools>"},
    {"role": "user", "content": "resolve vitalik.eth"}
  ],
  "stream": false
}'

Parse <tool_call> XML from message.content, execute the tool, then send the result back:

curl http://localhost:11434/api/chat -d '{
  "model": "nani",
  "messages": [
    {"role": "system", "content": "<your system prompt with tools>"},
    {"role": "user", "content": "resolve vitalik.eth"},
    {"role": "assistant", "content": "<tool_call>\n<function=resolveENS>\n<parameter=name>vitalik.eth</parameter>\n</function>\n</tool_call>"},
    {"role": "tool", "content": "0xd8dA6BF26964aF9D7eEd9e03E53415D37aA96045"}
  ],
  "stream": false
}'
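The same two-step exchange can be sketched in Python with only the standard library (chat_once and append_tool_result are hypothetical helpers, not part of this repo; assumes Ollama is listening on localhost:11434):

```python
import json
import urllib.request

def chat_once(messages, url="http://localhost:11434/api/chat", model="nani"):
    # Single non-streaming round trip; Ollama returns the full reply text
    # in message.content (tool calls arrive as <tool_call> XML in that text).
    body = json.dumps({"model": model, "messages": messages, "stream": False}).encode()
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=120) as resp:
        return json.loads(resp.read())["message"]["content"]

def append_tool_result(messages, assistant_reply, tool_output):
    # Feed a tool result back exactly as in the curl example above:
    # the raw <tool_call> turn as "assistant", the result as "tool".
    return messages + [
        {"role": "assistant", "content": assistant_reply},
        {"role": "tool", "content": tool_output},
    ]
```

Call chat_once, execute the parsed tool yourself, then call chat_once again with the messages returned by append_tool_result to get the final natural-language answer.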

Tool Call Parsing

Regex to extract tool calls from model output:

// Matches <function=NAME>...</function>; the trailing </tool_call> is
// matched optionally so a missing closing tag still parses.
const regex = /<tool_call>\s*<function=(\w+)>([\s\S]*?)<\/function>\s*(?:<\/tool_call>)?/g;
// Extracts <parameter=NAME>value</parameter> pairs from a function body.
const paramRegex = /<parameter=(\w+)>([\s\S]*?)<\/parameter>/g;

All parameter values are strings — coerce to numbers/booleans based on the tool's schema before execution.
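A Python equivalent of the two regexes above, with schema-based coercion (parse_tool_calls and coerce are hypothetical helpers; assumes each tool's JSON schema properties are available by tool name):

```python
import re

TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=(\w+)>([\s\S]*?)</function>\s*(?:</tool_call>)?")
PARAM_RE = re.compile(r"<parameter=(\w+)>([\s\S]*?)</parameter>")

def coerce(value: str, schema: dict):
    # All captured values are strings; coerce per the parameter's JSON schema type.
    t = schema.get("type", "string")
    v = value.strip()
    if t == "integer":
        return int(v)
    if t == "number":
        return float(v) if "." in v else int(v)
    if t == "boolean":
        return v.lower() == "true"
    return v

def parse_tool_calls(text: str, schemas: dict) -> list[dict]:
    # schemas maps tool name -> {"properties": {...}} from its tool definition.
    calls = []
    for name, body in TOOL_CALL_RE.findall(text):
        props = schemas.get(name, {}).get("properties", {})
        params = {k: coerce(v, props.get(k, {}))
                  for k, v in PARAM_RE.findall(body)}
        calls.append({"name": name, "params": params})
    return calls
```

Unknown tool names or parameters fall back to plain strings, so a hallucinated call still parses and can be rejected by your own validation layer.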

Training Config

model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                     "gate_proj", "up_proj", "down_proj"],
)

# Tokenize with enable_thinking=True (critical)
text = tokenizer.apply_chat_template(messages, tokenize=False,
    add_generation_prompt=False, enable_thinking=True)

TrainingArguments(
    num_train_epochs=3,
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=16,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    fp16=True,
    optim="adamw_8bit",
)

enable_thinking=True is critical. With it disabled (v2), tool accuracy regressed by 64 points because the tokenizer mangled the <think> blocks present in 97% of the training data.

Local Testing

See nani-local for a Vite + React test app that connects to Ollama and executes real agentek tools with streaming, tool call visualization, and regex-based tool filtering (5 tools per message).

HuggingFace Repos

Repo Format Size
NaniDAO/nani-qwen-3.5-2B Merged fp16 ~5GB
NaniDAO/nani-qwen-3.5-2B-gguf-q4km GGUF Q4_K_M ~1.3GB

Vision Support

Qwen3.5-2B is a vision-language model (VLM), and the base weights include a vision encoder. However, this GGUF only contains the text/language model weights. The vision encoder (mmproj) was not included in the GGUF conversion.

Current state:

  • Text-based tool calling works fully via Ollama
  • Image input is NOT supported in this GGUF
  • Ollama does not yet support Qwen3.5's mmproj format
  • llama.cpp supports it with separate --mmproj flag, but requires reconversion

To add vision later: reconvert from the merged fp16 model using llama.cpp's convert_hf_to_gguf.py, which extracts the mmproj automatically, then run llama.cpp directly with --model nani.gguf --mmproj nani-mmproj-f16.gguf.

Known Limitations

  • Text-only — vision encoder not included in this GGUF (see above)
  • 2B model size — sometimes hallucinates tool names or picks wrong parameters. Works best with 5 or fewer tools in the system prompt.
  • Simple schemas only — complex JSON schemas with anyOf, $ref, or nested objects confuse the model. Keep tool definitions flat: {"type": "string", "description": "..."} per property.
  • Training data imbalance — 78 of 103 tools have fewer than 15 examples. Performance on underrepresented tools is weaker.

Future Work

The model has plateaued with the current ~2K dataset. Next improvements require more data:

  • Expand to 4,000-5,000 examples
  • Cover all 150+ agentek tools (currently 103)
  • Balance tool distribution (cap at 50, min 20-25 per tool)
  • More multi-tool chains (currently 47, target 150+)
  • More no-tool conversations (currently 94, target 250+)
  • Parameter edge cases (optional params, wei values, chain variants)
  • Vision encoder GGUF extraction + Ollama/llama.cpp support