RubiRLM-1B-Base

RubiRLM-1B-Base is a 1B-parameter base language model released by DevHunterAI.

Model size: 1B parameters

Training datasets: FineWeb, UltraChat-200k

Model type: Base / pretrained language model

Important: This release is a base model. It can be used for prompt-based generation and experimental chat-style interaction, but it is not an instruction-tuned chat assistant.

Architecture

[Figure: RubiRLM 1B architecture diagram]

RubiRLM 1B uses a recursive language modeling architecture with recurrent state flow, Mixture-of-Experts routing, and conditional block execution.
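To make the recurrent-state / conditional-execution idea concrete, here is a minimal toy sketch of that control flow. It is illustrative only, not the repository's actual implementation (`RubiRLM.py`): the block update, the sigmoid skip gate, and all names here are assumptions.

```python
import numpy as np

def rlm_forward(x, blocks, skip_gates, steps=6):
    """Toy recursive-LM pass: the same stack of blocks is re-applied for
    several reasoning steps, carrying a recurrent state, and a per-block
    scalar skip gate can bypass a block entirely (conditional execution)."""
    state = x
    for _ in range(steps):
        for block_w, gate_w in zip(blocks, skip_gates):
            # Layer-skip router (illustrative): a sigmoid over a summary of
            # the state decides whether this block runs on this step.
            gate = 1.0 / (1.0 + np.exp(-(state.mean() * gate_w)))
            if gate > 0.5:
                # Residual update keeps the recurrent state well-conditioned.
                state = np.tanh(state @ block_w) + state
    return state
```

The point of the sketch is the weight sharing: unlike a plain Transformer, depth comes from re-applying the same blocks across recursive steps, with the skip gates pruning work per step.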

Key Features

  • 1B parameters
  • Recursive Language Model (RLM) architecture
  • 10 recursive blocks
  • d_model = 1024
  • 16 attention heads
  • max sequence length = 2048
  • 6 recursive reasoning steps
  • Mixture-of-Experts: 32 experts, top-1 routing
  • Layer skip router for conditional execution
  • Packed execution support
  • Tied token embedding and LM head
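The "32 experts, top-1 routing" feature above means each token is dispatched to exactly one expert per MoE layer. A minimal NumPy sketch of standard top-1 routing follows; the expert shapes and the softmax-scaled output are common conventions, assumed here rather than taken from the RubiRLM code.

```python
import numpy as np

def top1_moe(x, gate_w, expert_ws):
    """Top-1 Mixture-of-Experts routing.
    x: (tokens, d), gate_w: (d, n_experts), expert_ws: (n_experts, d, d)."""
    logits = x @ gate_w                      # router logits per token
    choice = logits.argmax(axis=-1)          # each token picks one expert
    # Softmax probability of the chosen expert scales its output,
    # so the router stays differentiable in the real (trained) setting.
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    out = np.empty_like(x)
    for e in range(gate_w.shape[1]):
        mask = choice == e
        if mask.any():
            out[mask] = (x[mask] @ expert_ws[e]) * probs[mask, e:e + 1]
    return out, choice
```

With top-1 routing, only 1/32 of the expert parameters are active per token, which is how a 1B-parameter model keeps per-token compute well below its parameter count.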

Training Data

This model was trained using a mixture of:

  • FineWeb
  • UltraChat-200k

Intended Usage

This model is intended for:

  • base language modeling research
  • continued pretraining
  • experimental prompt-based generation
  • architecture experimentation around recursive and MoE-based language models

Not Intended As

This release should not be treated as:

  • a fully aligned assistant
  • a safety-tuned production chatbot
  • an instruction-following model with guaranteed conversational quality

Loading

Because this repository includes custom model code, loading may require trust_remote_code=True depending on your workflow.
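A sketch of loading via transformers under that assumption; note the files listed below do not show tokenizer files, so whether a tokenizer is packaged with the repo is an assumption here.

```python
def load_model(repo_id="DevHunterAI/RubiRLM-1B-Base"):
    """Load RubiRLM with transformers. trust_remote_code=True is needed
    because the repo ships custom model code (RubiRLM.py).
    Assumes a tokenizer is packaged alongside the weights."""
    from transformers import AutoModelForCausalLM, AutoTokenizer
    tokenizer = AutoTokenizer.from_pretrained(repo_id, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(repo_id, trust_remote_code=True)
    return tokenizer, model
```

Loading the raw checkpoint directly with torch (e.g. for continued pretraining with the bundled training stack) would not require trust_remote_code.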

Files

  • pytorch_model.bin: exported RubiRLM weights
  • training_checkpoint.pt: original training checkpoint
  • config.json: Hugging Face-facing config
  • rubirlm_config.json: full RubiRLM architecture config
  • RubiRLM.py: model implementation
  • xqs_moe.py, xqs_stack.py, x_quantum_sparse_ops.py, rubi_train_stack.py: supporting code

Notes

The exported weights were produced from the final training checkpoint, stored in F32 precision, and packaged for Hugging Face publication.
