---
license: apache-2.0
language:
- en
library_name: transformers.js
tags:
- code
- python
- maincoder
- code-generation
- reinforcement-learning
- mcpo
- onnx
pipeline_tag: text-generation
base_model: Maincode/Maincoder-1B
---

<img src="https://huggingface.co/datasets/Maincode/assets/resolve/e51154e034201be1a5dad0e9c8de31d8b9f17643/maincoder_logo.png" alt="Maincoder logo" width="1250">

[**Maincoder-1B-ONNX**](https://maincode.com/maincoder/) is the ONNX version of [Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B), a language model built for code generation and completion. It runs with ONNX Runtime in Python and with Transformers.js in Node.js or directly in the browser.

# Key Features

- **ONNX Optimized**: Efficient inference with ONNX Runtime and KV-cache support.
- **Cross-Platform**: Runs in Python, Node.js, or directly in the browser.
- **Code Generation**: Optimized for Python code completion and generation tasks.
- **Compact Size**: 1 billion parameters, lightweight enough to run on consumer hardware.
- **SOTA Performance**: Highest HumanEval, HumanEval+, and MBPP+ scores among the 1B to 3B models compared under Benchmark Results.

# Benchmark Results

<img src="https://huggingface.co/datasets/Maincode/assets/resolve/main/performance_h.png" alt="Benchmark Performance Across Baseline LLMs" width="1050">

| Model | HumanEval | HumanEval+ | MBPP+ | MMLU | GSM8K |
|---|---:|---:|---:|---:|---:|
| [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B) | **0.7622** | **0.7256** | **0.7090** | 0.3054 | 0.2976 |
| [deepseek-ai/deepseek-coder-1.3b-instruct](https://huggingface.co/deepseek-ai/deepseek-coder-1.3b-instruct) | 0.5610 | 0.5305 | 0.6217 | 0.2705 | 0.0413 |
| [HuggingFaceTB/SmolLM3-3B](https://huggingface.co/HuggingFaceTB/SmolLM3-3B) | 0.5366 | 0.5000 | 0.6799 | **0.5928** | 0.5505 |
| [Qwen/Qwen2.5-Coder-1.5B-Instruct](https://huggingface.co/Qwen/Qwen2.5-Coder-1.5B-Instruct) | 0.4634 | 0.4451 | 0.6561 | 0.4984 | 0.4944 |
| [Qwen/Qwen3-1.7B](https://huggingface.co/Qwen/Qwen3-1.7B) | 0.4024 | 0.3780 | 0.5582 | 0.5571 | **0.6865** |

# Model Overview

Maincoder uses a modern transformer decoder architecture with:

- **Rotary Position Embeddings**: With theta of 1,000,000.
- **RMSNorm**: Pre-normalization for stable training.
- **Grouped Query Attention**: 4:1 ratio of query to key-value heads.
- **QK Normalization**: RMSNorm applied to attention queries and keys.
- **SwiGLU MLP**: Gated linear units with SiLU activation.
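
To make the components listed above more concrete, here is a minimal pure-Python sketch of RMSNorm, SwiGLU gating, and the rotary inverse frequencies. It is illustrative only and is not Maincoder's implementation: the function names, the `eps` default, and the toy inputs are assumptions, and only the theta of 1,000,000 and the head dimension of 96 are taken from this card.

```python
import math

ROPE_THETA = 1_000_000.0  # from the model card
HEAD_DIM = 96             # from the model card


def rms_norm(x, weight, eps=1e-6):
    """Scale x by the reciprocal of its root mean square, then by a learned weight.

    eps is an assumed value; the card does not state it.
    """
    mean_sq = sum(v * v for v in x) / len(x)
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv_rms * w for v, w in zip(x, weight)]


def silu(v):
    """SiLU (swish) activation: v * sigmoid(v)."""
    return v / (1.0 + math.exp(-v))


def swiglu(gate, up):
    """Element-wise SwiGLU gating: SiLU(gate) * up.

    In a full MLP, gate and up are two linear projections of the same input,
    and the gated result is passed through a down projection.
    """
    return [silu(g) * u for g, u in zip(gate, up)]


def rope_inv_freq(head_dim=HEAD_DIM, theta=ROPE_THETA):
    """Inverse frequencies for rotary embeddings: theta ** (-2i / head_dim)."""
    return [theta ** (-(2 * i) / head_dim) for i in range(head_dim // 2)]


normed = rms_norm([3.0, 4.0], weight=[1.0, 1.0], eps=0.0)
freqs = rope_inv_freq()
```

A larger theta stretches the slowest rotary wavelengths, which is a common choice for keeping distant positions distinguishable.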

| Attribute | Value |
|-----------|-------|
| Parameters | 1B |
| Hidden Size | 1536 |
| Layers | 32 |
| Attention Heads | 16 (4 KV heads) |
| Head Dimension | 96 |
| Vocabulary Size | 151,936 |
| Context Length | 2,048 |
| Format | ONNX |
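
The numbers in the table above can be cross-checked with a little arithmetic. The sketch below confirms that heads times head dimension gives the hidden size, shows how query heads map onto the shared KV heads, and estimates the KV-cache size. The contiguous head grouping and the float16 cache are assumptions; the card does not state the layout or precision of the exported model.

```python
HIDDEN_SIZE = 1536
NUM_LAYERS = 32
NUM_HEADS = 16
NUM_KV_HEADS = 4
HEAD_DIM = 96
CONTEXT_LENGTH = 2048

# Query heads times head dimension should reproduce the hidden size.
assert NUM_HEADS * HEAD_DIM == HIDDEN_SIZE

# Grouped query attention: each KV head is shared by this many query heads.
queries_per_kv = NUM_HEADS // NUM_KV_HEADS


def kv_head_for(query_head: int) -> int:
    """Index of the KV head used by a query head (contiguous grouping assumed)."""
    return query_head // queries_per_kv


def kv_cache_bytes(num_tokens: int, bytes_per_value: int = 2) -> int:
    """Keys plus values cached across all layers for num_tokens positions.

    bytes_per_value=2 assumes a float16 cache; use 4 for float32.
    """
    values_per_token = 2 * NUM_LAYERS * NUM_KV_HEADS * HEAD_DIM
    return num_tokens * values_per_token * bytes_per_value


full_context_mib = kv_cache_bytes(CONTEXT_LENGTH) / 2**20
print(f"{queries_per_kv} query heads per KV head")
print(f"KV cache at full context (float16, batch 1): {full_context_mib:.0f} MiB")
```

Under these assumptions the cache for a full 2,048-token context comes to 96 MiB. With one KV head per query head it would be four times larger.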

# Usage

## Python (ONNX Runtime)

### Installation

```bash
pip install "optimum[onnxruntime]" transformers
```

For GPU acceleration:

```bash
pip install "optimum[onnxruntime-gpu]"
```

### Quick Start

```python
from optimum.onnxruntime import ORTModelForCausalLM
from transformers import AutoTokenizer

# Load the ONNX model with KV-cache support
model = ORTModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B-ONNX",
    file_name="decoder_with_past_model.onnx",
    use_cache=True,
)

# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained("Maincode/Maincoder-1B-ONNX")

# Code completion example
prompt = '''def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
'''

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    temperature=0.2,
    do_sample=True,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

### GPU Acceleration

```python
from optimum.onnxruntime import ORTModelForCausalLM

model = ORTModelForCausalLM.from_pretrained(
    "Maincode/Maincoder-1B-ONNX",
    use_cache=True,
    file_name="decoder_with_past_model.onnx",
    provider="CUDAExecutionProvider",
)
```
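
If the same script has to run on machines with and without a GPU, the provider can be chosen at runtime. The sketch below is one way to do that: `pick_provider` is a hypothetical helper written for this example, not part of Optimum or ONNX Runtime, and the preference order is an assumption. `onnxruntime.get_available_providers()` is the ONNX Runtime call that lists what the installed build supports.

```python
def pick_provider(available, preferred=("CUDAExecutionProvider", "CPUExecutionProvider")):
    """Return the first preferred execution provider that is available."""
    for name in preferred:
        if name in available:
            return name
    raise RuntimeError(f"None of {preferred} found in {list(available)}")


# Typical use (requires onnxruntime to be installed):
#   import onnxruntime as ort
#   provider = pick_provider(ort.get_available_providers())
#   model = ORTModelForCausalLM.from_pretrained(..., provider=provider)
cpu_only = pick_provider(["CPUExecutionProvider"])
with_gpu = pick_provider(["CUDAExecutionProvider", "CPUExecutionProvider"])
```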

---

## JavaScript (Transformers.js)

### Installation

```bash
npm install @huggingface/transformers
```

### Node.js

```javascript
import { AutoModelForCausalLM, AutoTokenizer } from '@huggingface/transformers';

// Load the tokenizer and model
const tokenizer = await AutoTokenizer.from_pretrained('Maincode/Maincoder-1B-ONNX');
const model = await AutoModelForCausalLM.from_pretrained('Maincode/Maincoder-1B-ONNX', {
  subfolder: '.',
  model_file_name: 'decoder_with_past_model',
  use_external_data_format: true,
});

// Code completion example
const prompt = `def fibonacci(n: int) -> int:
    """Return the n-th Fibonacci number."""
`;

// Transformers.js tokenizers return tensors by default
const inputs = tokenizer(prompt);
const outputs = await model.generate({
  input_ids: inputs.input_ids,
  attention_mask: inputs.attention_mask,
  max_new_tokens: 128,
  temperature: 0.2,
  do_sample: true,
});

const decoded = tokenizer.decode(outputs[0], { skip_special_tokens: true });
console.log(decoded);
```

---

## Code Completion Examples

```python
# Function completion
prompt = '''def quicksort(arr: list) -> list:
    """Sort a list using the quicksort algorithm."""
'''

# Class completion
prompt = '''class BinarySearchTree:
    """A binary search tree implementation."""
    def __init__(self):
'''

# Algorithm implementation
prompt = '''def dijkstra(graph: dict, start: str, end: str) -> tuple:
    """Find the shortest path using Dijkstra's algorithm.
    Args:
        graph: Adjacency list representation of the graph
        start: Starting node
        end: Target node
    Returns:
        Tuple of (distance, path)
    """
'''
```
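
With completion-style prompts like these, a model often keeps generating past the end of the requested function or class. One common remedy is to cut the output at the next top-level statement. The sketch below shows that idea; `trim_completion` and its stop list are assumptions made for illustration, not part of the model or of any library.

```python
import re

# A new top-level statement marks the end of the function or class being completed.
# This stop list is an assumption for illustration, not part of the model's API.
STOP_PATTERN = re.compile(r"^(?:def |class |if __name__|print\(|@)", re.MULTILINE)


def trim_completion(completion: str) -> str:
    """Cut generated text at the first line that starts a new top-level statement."""
    match = STOP_PATTERN.search(completion)
    if match:
        completion = completion[: match.start()]
    return completion.rstrip() + "\n"


# With transformers-style outputs, decode only the newly generated tokens first, e.g.:
#   new_tokens = outputs[0][inputs["input_ids"].shape[1]:]
#   completion = tokenizer.decode(new_tokens, skip_special_tokens=True)
raw = "    return a + b\n\ndef unrelated():\n    pass\n"
trimmed = trim_completion(raw)
```

Indented `def` lines, such as methods inside a class body, do not match the pattern, so a class completion keeps all of its methods.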

# Additional Notes

## Limitations

- Context length is limited to 2,048 tokens.
- Primarily optimized for Python; performance may vary on other languages.
- May generate code with bugs or security issues, so always review generated code.
- Browser performance depends on device capabilities.

<div style="margin-left:14px; border-left:4px solid #3b82f6; background:rgba(59,130,246,0.08); padding:8px 10px; border-radius:8px; font-size:0.92em; margin:10px 0;">
<strong>Disclaimer</strong>: This model has <strong>not</strong> undergone any alignment or safety tuning (e.g., RLHF/RLAIF, DPO, or safety fine-tuning). Outputs may be unsafe or biased. Please use appropriate safeguards and evaluate carefully for your use case.
</div>
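
Because of the 2,048-token context limit noted above, the prompt and the requested new tokens have to fit in one shared budget. The sketch below is one possible way to enforce that by dropping the oldest prompt tokens. `fit_to_context` is a hypothetical helper, and truncating from the left is an assumption that suits code completion, where the text nearest the cursor usually matters most.

```python
CONTEXT_LENGTH = 2048  # from the model card


def fit_to_context(prompt_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Trim the oldest prompt tokens so prompt plus new tokens fit in the context.

    Returns the (possibly shortened) list of token ids.
    """
    if max_new_tokens >= context_length:
        raise ValueError("max_new_tokens must be smaller than the context length")
    budget = context_length - max_new_tokens
    # Keep the most recent tokens; anything older than the budget is dropped.
    return list(prompt_ids[-budget:])


long_prompt = list(range(3000))          # stand-in for real token ids
kept = fit_to_context(long_prompt, max_new_tokens=128)
```

Here `kept` holds the last 1,920 ids, leaving room for 128 generated tokens.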

## License

This model is released under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0).

## Citation

```bibtex
@misc{maincoder2025,
  title = {Maincoder-1B: A High-Performance 1B Parameter Coding Model},
  author = {Maincode Team},
  year = {2025},
  organization = {Maincode},
  howpublished = {\url{https://huggingface.co/Maincode/Maincoder-1B}}
}
```

## Related Models

- [Maincode/Maincoder-1B](https://huggingface.co/Maincode/Maincoder-1B) - Original PyTorch model

## Contact

For questions, issues, or collaboration inquiries, please visit [Maincode](https://maincode.com).