Qwen3 models (123M/300M/600M) trained from scratch on 2.47B Kazakh and Russian (kk+ru) tokens. Includes tokenizer, datasets, and checkpoints.
Saken Tukenov PRO
stukenov
AI & ML interests
None yet
Recent Activity
updated a Space about 2 hours ago: stukenov/omniff-runtime
published a Space about 2 hours ago: stukenov/omniff-runtime
updated a Space about 5 hours ago: stukenov/omniff-demo
Organizations
spaces 5
pinned
Sleeping
Agents
SozKZ -- Kazakh LLM Demo
💬
Chat in Kazakh with an AI language model
Running on Zero
Agents
OmniFF — FFmpeg for AI
🎬
Generate and understand text, images, audio, and video with AI
Running on Zero
Agents
OmniFF — FFmpeg for AI
🎬
Chat with AI that understands text, images, audio, video, and docs
Sleeping
Agents
SozKZ ASR -- Kazakh Speech Recognition
🎙
Transcribe Kazakh audio into text
Running
Kaz LLM Leaderboard
🏆
Evaluate LLMs on Kazakh benchmarks
models 75
stukenov/omniff
Text Generation • Updated
stukenov/sozkz-morphbpe-256k-kk-v1
Token Classification • Updated
stukenov/sozkz-fix-mt5-50m-kk-gec-v1
Text Generation • 50.6M • Updated • 19
stukenov/sozkz-nllb-1b-kk-pretrain-v1
Translation • 1B • Updated • 51
stukenov/sozkz-nllb-1b-kk-gec-v1
1B • Updated • 64
stukenov/sozkz-fix-mt5b-kk-gec-run13-v1
Text Generation • 0.6B • Updated • 3
stukenov/sozkz-fix-qwen-500m-kk-gec-v1
Text Generation • 0.4B • Updated • 3
stukenov/sozkz-fix-qwen-500m-kk-gec-v2
Text Generation • 0.4B • Updated • 25
stukenov/sozkz-fix-qwen-500m-kk-gec-v3
Text Generation • 0.4B • Updated • 298
stukenov/sozkz-fix-qwen-500m-kk-gec-v4
Text Generation • 0.4B • Updated • 299
datasets 54
stukenov/sozkz-corpus-tokenized-kk-morphbpe256k-v1
Viewer • Updated • 1.51M • 48
stukenov/sozkz-corpus-segmented-kk-v1
Viewer • Updated • 55.5M • 387
stukenov/sozkz-corpus-gec-benchmark-kk-v1
Viewer • Updated • 1.44k • 170
stukenov/sozkz-corpus-pretrain-gec-mix-v1
Viewer • Updated • 1.77M • 52
stukenov/sozkz-corpus-synthetic-kk-gec-rulebased-v1
Viewer • Updated • 1.06M • 15
stukenov/sozkz-corpus-synthetic-kk-gec-v1
Viewer • Updated • 19.3k • 27
stukenov/sozkz-gec-synthetic-gpt4o-v1
Viewer • Updated • 9.6k • 26
stukenov/sozkz-corpus-clean-v3
Viewer • Updated • 13.5M • 30
stukenov/sozkz-corpus-instruct-kk-alpaca-qwen35-v1
Viewer • Updated • 4.88k • 22 • 1
stukenov/kaznet-crawl-raw
Viewer • Updated • 1.55M • 3 • 1