Rohanify committed on
Commit 050df37 · verified · 1 Parent(s): 34186d6

Update README.md

Files changed (1)
  1. README.md +17 -32
README.md CHANGED
@@ -13,18 +13,24 @@ tags:
13
  pipeline_tag: text-generation
14
  ollama:
15
  template: |
16
- ### Instruction:
17
  {{ .Prompt }}
18
 
19
  ### Response:
 
20
  params:
21
  temperature: 0.1
22
- top_p: 0.8
23
  repeat_penalty: 2.0
24
- repeat_last_n: 64
25
- num_ctx: 256
26
  stop:
27
- - "### Instruction:"
 
28
  - "### Response:"
29
  - "<|endoftext|>"
30
  ---
@@ -36,33 +42,12 @@ This model is built specifically for lightning-fast Python code completions and
36
 
37
  ---
38
 
39
- ## 🛠️ How to Run Instantly via Ollama
40
-
41
- Because the system configuration is baked directly into this Hugging Face repository card, nobody needs to manually create a local `Modelfile`.
42
-
43
- ### ⚠️ Critical Usage Note for 13M Parameters
44
- Because this model is highly optimized and ultra-lightweight (~13M parameters), it is architectural design-limited to **single-turn tasks**. It does not possess a multi-turn chat memory tracking mechanism.
45
- Allowing Ollama to stack message history will cause the attention layers to collapse.
46
-
47
- To ensure a perfect experience without crashes, use either of the two methods below:
48
-
49
- ### Method 1: Stateless Mode (Recommended 🚀)
50
- Pass your prompt directly inside quotes in your terminal. This forces Ollama to run a clean, stateless, single-turn generation that will **never crash**:
51
-
52
- ```bash
53
- ollama run hf.co/Rohanify/Indenta-13M-Python "write a function to reverse a list"
54
- ```
55
-
56
- ### Method 2: Interactive Terminal Mode
57
- If you are using the interactive chat loop (`ollama run hf.co/Rohanify/Indenta-13M-Python`), simply wipe the conversation memory before typing your next prompt by entering `/clear`:
58
 
59
- ```text
60
- >>> write a for loop
61
- [Model generates code safely]
62
 
63
- >>> /clear
64
- Cleared session context
65
 
66
- >>> reverse a linked list
67
- [Model generates next code safely]
68
- ```
 
13
  pipeline_tag: text-generation
14
  ollama:
15
  template: |
16
+ {{- if .Prompt }}## Instruction:
17
  {{ .Prompt }}
18
 
19
+ ### Input:
20
+ Processing your request...
21
+
22
+ {{ end -}}
23
  ### Response:
24
+ {{- if .Response }}{{ .Response }}{{ end -}}
25
  params:
26
  temperature: 0.1
27
+ top_p: 0.85
28
  repeat_penalty: 2.0
29
+ repeat_last_n: 32
30
+ num_ctx: 512
31
  stop:
32
+ - "## Instruction:"
33
+ - "### Input:"
34
  - "### Response:"
35
  - "<|endoftext|>"
36
  ---
 
42
 
43
  ---
44
 
45
+ ## 🛠️ Web UI & Multi-Turn Stability Note
46
 
47
+ This model features a hyper-lightweight 13M parameter footprint optimized for single-task completions based directly on structural templates.
 
 
48
 
49
+ Because it lacks the large capacity required for conversational context processing, it can drop formatting structure if a graphical Web UI forces a massive, multi-message chat stream into its context layers.
 
50
 
51
+ ### For Best Performance in Web UIs:
52
+ 1. **Use New Chat Threads:** Click the "New Chat" or "Clear" button in your user interface between coding tasks. This keeps the model's focus squarely on your active prompt.
53
+ 2. **Synchronized Template:** This model card includes an aligned layout format (`## Instruction:` and `### Input:`) matching the training data to stop token bleeding across chat iterations.
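
Reviewer note: the "New Chat Threads" advice above amounts to sending every coding task as an independent, single-turn request. Outside a Web UI, one way to get the same effect is to call Ollama's `/api/generate` endpoint once per task and never pass earlier turns back in. The sketch below is a minimal illustration under stated assumptions, not the author's method: it assumes a local Ollama server on the default port 11434, takes the model name from the previous README revision, and the helper names `build_payload` and `complete` are hypothetical.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default address.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model name as it appeared in the previous README revision.
MODEL = "hf.co/Rohanify/Indenta-13M-Python"


def build_payload(prompt: str) -> dict:
    """Build a single-turn request body with no history attached."""
    return {
        "model": MODEL,
        "prompt": prompt,
        "stream": False,  # request one JSON object instead of streamed chunks
    }


def complete(prompt: str, url: str = OLLAMA_URL) -> str:
    """Send one independent completion request and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


# Example usage (requires a running Ollama server with the model pulled):
#   for task in ("write a for loop", "reverse a linked list"):
#       print(complete(task))
```

Because each call builds a fresh payload containing only the current prompt, nothing from an earlier task reaches the model, which is the same isolation the "New Chat" button provides in a graphical client.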