RubiRLM-1B-Base
RubiRLM-1B-Base is a 1B-parameter base language model released by DevHunterAI.
Model size: 1B parameters
Training datasets: FineWeb, UltraChat-200k
Model type: Base / pretrained language model
Important: This release is a base model. It can be used for prompt-based generation and experimental chat-style interaction, but it is not an instruction-tuned chat assistant.
Architecture
RubiRLM-1B-Base uses a Recursive Language Model (RLM) architecture that combines recurrent state flow, Mixture-of-Experts (MoE) routing, and conditional block execution.
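To make the three ideas above concrete, here is a toy, pure-Python sketch of how they can fit together: a state vector is passed repeatedly through the same stack of blocks, each block sends its input to a single expert (top-1 routing), and a skip function can bypass a block entirely. This is not the code in RubiRLM.py. Every name here (`top1_route`, `moe_block`, `recursive_forward`, `should_skip`) is hypothetical, and the dot-product router and residual update are assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
Expert = Callable[[Vector], Vector]
Block = Tuple[List[Vector], List[Expert]]  # (router rows, experts)


def top1_route(scores: Sequence[float]) -> int:
    """Return the index of the highest-scoring expert (ties go to the first)."""
    return max(range(len(scores)), key=lambda i: scores[i])


def moe_block(x: Vector, router_rows: List[Vector], experts: List[Expert]):
    """Score every expert, run only the best one, and add a residual."""
    # Assumed router: one dot product per expert.
    scores = [sum(w * v for w, v in zip(row, x)) for row in router_rows]
    chosen = top1_route(scores)
    y = experts[chosen](x)
    return [a + b for a, b in zip(x, y)], chosen


def recursive_forward(
    x: Vector,
    blocks: List[Block],
    steps: int,
    should_skip: Callable[[int, int, Vector], bool],
):
    """Reuse the same blocks for `steps` passes, carrying the state forward."""
    state = list(x)
    trace = []  # (step, block index, chosen expert) for each executed block
    for step in range(steps):
        for idx, (router_rows, experts) in enumerate(blocks):
            if should_skip(step, idx, state):
                continue  # conditional execution: this block does no work
            state, chosen = moe_block(state, router_rows, experts)
            trace.append((step, idx, chosen))
    return state, trace


# Tiny demo: two experts per block, two blocks, three recursive steps.
demo_experts: List[Expert] = [
    lambda v: [0.1 * a for a in v],
    lambda v: [-0.1 * a for a in v],
]
demo_block: Block = ([[1.0, 0.0], [0.0, 1.0]], demo_experts)
final_state, demo_trace = recursive_forward(
    [1.0, 2.0],
    [demo_block, demo_block],
    steps=3,
    should_skip=lambda step, idx, state: step % 2 == 1 and idx == 1,
)
print(len(demo_trace), "block executions")
```

In the demo, the skip rule bypasses the second block on odd-numbered steps, so fewer blocks run than the nominal steps × blocks count. A real layer skip router would learn this decision from the hidden state rather than use a fixed rule.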
Key Features
- 1B parameters
- Recursive Language Model (RLM) architecture
- 10 recursive blocks
- d_model = 1024
- 16 attention heads
- max sequence length = 2048
- 6 recursive reasoning steps
- Mixture-of-Experts: 32 experts, top-1 routing
- Layer skip router for conditional execution
- Packed execution support
- Tied token embedding and LM head
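The hyperparameters listed above can be collected in one place for experiments. The sketch below is illustrative only: the class and field names are hypothetical and need not match the keys in rubirlm_config.json. The per-head dimension assumes the usual convention of splitting d_model evenly across attention heads, and the block-application count assumes every recursive step may visit every block once.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RubiRLMHyperparams:
    """Values copied from the Key Features list; field names are hypothetical."""

    d_model: int = 1024
    n_heads: int = 16
    n_blocks: int = 10
    recursive_steps: int = 6
    max_seq_len: int = 2048
    n_experts: int = 32
    experts_per_token: int = 1  # top-1 routing
    tie_embeddings: bool = True

    @property
    def head_dim(self) -> int:
        # Assumption: d_model is split evenly across heads.
        if self.d_model % self.n_heads != 0:
            raise ValueError("d_model must be divisible by n_heads")
        return self.d_model // self.n_heads

    @property
    def max_block_applications(self) -> int:
        # Assumption: each recursive step may run every block once, so this
        # is an upper bound that the layer skip router can reduce.
        return self.n_blocks * self.recursive_steps


cfg = RubiRLMHyperparams()
print(cfg.head_dim)                # 64
print(cfg.max_block_applications)  # 60
```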
Training Data
This model was trained using a mixture of:
- FineWeb
- UltraChat-200k
Intended Usage
This model is intended for:
- base language modeling research
- continued pretraining
- experimental prompt-based generation
- architecture experimentation around recursive and MoE-based language models
Not Intended As
This release should not be treated as:
- a fully aligned assistant
- a safety-tuned production chatbot
- an instruction-following model with guaranteed conversational quality
Loading
This repository ships custom model code (RubiRLM.py and its supporting modules), so loading it through Hugging Face Transformers may require trust_remote_code=True, depending on your workflow.
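A minimal loading sketch with Hugging Face Transformers follows. The repository id is an assumption inferred from the publisher and model name, and `load_rubirlm` is a hypothetical helper. Whether `AutoModelForCausalLM` can resolve the custom class depends on how config.json registers it, which this card does not state. No tokenizer files are listed below, so the sketch loads only the model.

```python
# Assumed repository id, inferred from the publisher and model name.
REPO_ID = "DevHunterAI/RubiRLM-1B-Base"


def load_rubirlm(repo_id: str = REPO_ID):
    """Load the model, allowing the repository's custom code to run."""
    # Imported lazily so this module can be imported without transformers.
    from transformers import AutoModelForCausalLM

    # trust_remote_code=True executes Python files from the repository,
    # so review RubiRLM.py and the supporting modules before enabling it.
    return AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
```

Calling `load_rubirlm()` downloads the weights on first use. If automatic class resolution fails, an alternative is to import the model class from RubiRLM.py directly and load pytorch_model.bin by hand.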
Files
- pytorch_model.bin: exported RubiRLM weights
- training_checkpoint.pt: original training checkpoint
- config.json: Hugging Face-facing config
- rubirlm_config.json: full RubiRLM architecture config
- RubiRLM.py: model implementation
- xqs_moe.py, xqs_stack.py, x_quantum_sparse_ops.py, rubi_train_stack.py: supporting code
Notes
The exported weights were produced from the final training checkpoint and packaged for Hugging Face publication.
