Tokenizer problems, or just quants?
#105 opened about 15 hours ago
by
Nesy1
Random drops in performance with vLLM on H200 and B200 with 2 GPUs
#104 opened about 19 hours ago
by
jtvino
gemma4-31b-it cannot work on dual DGX Sparks now
#103 opened 3 days ago
by
kenleo
Gemma4 is super slow on vLLM Docker, at most 15 tokens per second (4 × L4 GPUs)
1
#102 opened 3 days ago
by
vi-ajayb
This model is so good, but the KV cache ruins it.
1
#100 opened 7 days ago
by
UniversalLove333
Update README.md
#99 opened 8 days ago
by
hectorruiz9
o-O too good to be true
#98 opened 8 days ago
by
mekasu
could ya' add GPT-J 6B to Google AI Studio? (completion mode)
#97 opened 9 days ago
by
Tralalabs
Weird token after underscore `_//`
3
#96 opened 10 days ago
by
yamikumods
CodeGemma 4 collection
➕ 2
1
#95 opened 10 days ago
by
cbrain
fix(chat_template): Emit multimodal placeholders in tool response content-parts
👍 4
#94 opened 11 days ago
by
harshaljanjani
[Feature request] Eliminate pre-attention RMSNorm in Gemma 4 via scale invariance + weight folding
#93 opened 11 days ago
by
graefics
Reviews of Gemma 4
1
#92 opened 13 days ago
by
Juanoto2012
Update chat_template.jinja to address JSON Schema shapes that do not expose their meaning through a direct top-level type
👍 6
#91 opened 13 days ago
by
sigjhl
Possible chat_template.jinja issue: nullable $ref tool schemas are rendered as empty types
1
#87 opened 13 days ago
by
sigjhl
When training models from the gemma4 series using GRPO, an abnormally high grad norm was observed
#84 opened 14 days ago
by
mamazi00
Add newlines and thinking tokens to template to avoid computing 3 extra tokens per generation in chat completion + reasoning
👍 2
2
#83 opened 15 days ago
by
quasar-of-mikus
Update README.md
#81 opened 17 days ago
by
hectorruiz9
Gemma 4 models are way too paranoid about dates, any tips?
🔥 3
3
#80 opened 17 days ago
by
Ahugm
Incorrect output in Gemma 4: seeking a solution to the problem ( la la la )
7
#79 opened 17 days ago
by
Lintrarius
Fix chat_template: emit empty <|channel>thought\n<channel|> wrapper for existing assistant turns
1
#78 opened 18 days ago
by
flotherxi
[Bug] chat_template: missing <|channel>thought\n<channel|> wrapper for non-thinking SFT / multi-turn
👍 1
2
#77 opened 18 days ago
by
flotherxi
Thinking becomes erratic at 30000+ context
1
#76 opened 19 days ago
by
JeslynMcKenzie
Multilingual Support List
➕ 1
#75 opened 19 days ago
by
abcdvzz
Will there be a small model like gemma-3-270m?
🔥 1
#74 opened 20 days ago
by
ymcki
Unexpected loss spikes and performance degradation when fine-tuning Gemma 4 (google/gemma-4-31B-it)
1
#73 opened 20 days ago
by
rstaruch
Add ParseBench evaluation results
4
#72 opened 25 days ago
by
boyang-runllama
Will there be a small model for speculative decoding?
3
#71 opened 25 days ago
by
Regrin
Imagen 1 (2022) Should Be Open Sourced
👍 5
#70 opened 25 days ago
by
Tralalabs
Question about tool-calling order in chat_template.jinja
1
#67 opened 26 days ago
by
json0
gemma-4-31b-it unable to execute tool calling
3
#66 opened 26 days ago
by
Naman2302
Do Gemma 4 models work well?
3
#65 opened 27 days ago
by
Regrin
fix: embed chat_template in tokenizer_config.json
#64 opened 29 days ago
by
NERDDISCO
Infinite loop is not fixed even with Google API
👀 1
2
#63 opened 29 days ago
by
alexcardo
Chat Template has a bug.
🤗 2
5
#62 opened 29 days ago
by
Reithan
Why does it print a right arrow?
❤️👀 3
5
#61 opened 30 days ago
by
wangtf-Kevin
Can anyone improve the model using the Rys methodology, i.e. by duplicating a block of layers?
11
#60 opened about 1 month ago
by
Regrin
Strange behavior of the tokenizer
2
#58 opened about 1 month ago
by
andercorral
Good Workflow
2
#57 opened about 1 month ago
by
anthoekfj
fix: function calling formatting in chat template
❤️ 2
1
#55 opened about 1 month ago
by
RyanMullins
Chat template is so complicated that even Gemma 4 itself has no idea how to parse it
1
#53 opened about 1 month ago
by
alexcardo
Hardware requirements
👍👀 3
13
#52 opened about 1 month ago
by
Charan01
Tokens per Image Parameter?
2
#51 opened about 1 month ago
by
buckeye17
Guys, please add MTP to this model
🔥 5
5
#50 opened about 1 month ago
by
Narutoouz
Will there be QAT models?
🤝👍 12
2
#49 opened about 1 month ago
by
Regrin
Will Gemma 4 E4B be as encyclopedically well-read as the 12B model?
3
#48 opened about 1 month ago
by
Regrin
Create BTS
#47 opened about 1 month ago
by deleted
brokersponsor
1
#46 opened about 1 month ago
by
Brokersponsor
Update README.md
#45 opened about 1 month ago
by
Brokersponsor