Questions regarding Stack v2 and StarCoder v2

#111

by aditya2211 - opened Mar 23, 2024

Hi BigCoders,

I have a few questions about Stack v2 and StarCoder v2:
(a) When can we expect the remaining Stack v2 data (documentation etc.) to be released?
(b) For StarCoder v2 pretraining, what policy was used for packing and chunking? Were documents split into multiple chunks during pretraining? If so, was any overlap maintained between consecutive chunks?
