ct2

ct-2

AI & ML interests

None yet

Recent Activity

upvoted a paper 9 days ago

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

updated a bucket 9 days ago

ct-2/Sherry-3B-1.25bit-per-channel-bucket

published a bucket 9 days ago

ct-2/Sherry-3B-1.25bit-per-channel-bucket

Organizations

None yet

upvoted a paper 9 days ago

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

Paper • 2605.23901 • Published 12 days ago • 13

upvoted a collection 11 days ago

BitCPM-CANN

Collection

Full-pipeline ternary quantized model trained on CANN. • 12 items • Updated 10 days ago • 27

upvoted 2 papers 12 days ago

Mix-Quant: Quantized Prefilling, Precise Decoding for Agentic LLMs

Paper • 2605.20315 • Published 15 days ago • 28

Enhancing Train-Free Infinite-Frame Generation for Consistent Long Videos

Paper • 2605.18233 • Published 16 days ago • 92

upvoted a paper 14 days ago

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Paper • 2605.18643 • Published 16 days ago • 30

upvoted 2 papers 22 days ago

Large Language Models Explore by Latent Distilling

Paper • 2604.24927 • Published Apr 27 • 74

StateSMix: Online Lossless Compression via Mamba State Space Models and Sparse N-gram Context Mixing

Paper • 2605.02904 • Published Apr 5 • 8

upvoted a paper about 2 months ago

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

upvoted a collection 2 months ago

Trinity-Large-Thinking

Collection

5 items • Updated Apr 10 • 31

upvoted 10 papers 3 months ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published Mar 17 • 311

FineRMoE: Dimension Expansion for Finer-Grained Expert with Its Upcycling Approach

Paper • 2603.13364 • Published Mar 9 • 9

Decoding as Optimisation on the Probability Simplex: From Top-K to Top-P (Nucleus) to Best-of-K Samplers

Paper • 2602.18292 • Published Feb 20 • 13

Arcee Trinity Large Technical Report

Paper • 2602.17004 • Published Feb 19 • 21

upvoted a collection 4 months ago

Hy Low-bit model

Collection

8 items • Updated 13 days ago • 11
