🚀 Indenta-13M-Python (GGUF)

A compact model trained from scratch with a custom tokenizer and the GPT-2 architecture. It is built specifically for fast Python code completion and basic code generation. At ~13M parameters, it is small enough to run with very low latency on modest, CPU-only hardware.


🛠️ Web UI & Multi-Turn Stability Note

With only ~13M parameters, this model is optimized for single-task completions that follow the structured prompt template it was trained on.

Because it lacks the capacity needed to track long conversational context, it can lose its output formatting when a graphical Web UI feeds a long, multi-message chat history into its context window.

For Best Performance in Web UIs:

  1. Use New Chat Threads: Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps the model's focus squarely on your active prompt.
  2. Use the Matching Template: This model card includes a prompt layout (## Instruction: and ### Input:) aligned with the training data, which helps keep tokens from earlier turns from bleeding into later ones.
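
The two tips above can be combined when calling the model from a script instead of a Web UI. The following is a minimal sketch, not the author's tooling: `build_prompt` and `fresh_prompt` are hypothetical helpers, the marker strings are copied as written in this card, and the newline placement around them is an assumption that may need adjusting to match the actual training format.

```python
# Marker strings copied verbatim from the model card. The newlines placed
# around them are an assumption, not the documented training layout.
INSTRUCTION_MARKER = "## Instruction:"
INPUT_MARKER = "### Input:"


def build_prompt(instruction: str, code_input: str = "") -> str:
    """Build one self-contained, single-turn prompt (hypothetical helper)."""
    parts = [INSTRUCTION_MARKER, instruction.strip()]
    if code_input.strip():
        # Only trailing whitespace is removed so Python indentation survives.
        parts += [INPUT_MARKER, code_input.rstrip()]
    return "\n".join(parts) + "\n"


def fresh_prompt(history: list[str], instruction: str, code_input: str = "") -> str:
    """Drop all earlier turns, mimicking a "New Chat" reset in a Web UI."""
    del history  # deliberately unused: no prior messages reach the model
    return build_prompt(instruction, code_input)


if __name__ == "__main__":
    old_turns = ["## Instruction:\nSort a list\n"]
    print(fresh_prompt(old_turns, "Write a function that adds two numbers",
                       "def add(a, b):"))
```

The resulting string can be passed as a plain completion prompt to whichever GGUF runtime you use. Building every request from scratch keeps the context short, which suits a model of this size.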



Dataset used to train Rohanify/Indenta-13M-Python