PocketDoc Labs Blog

Primarily machine learning and adjacent work. Home of all the development notes that go into Dans-PersonalityEngine and related models.

Visually Adequate

Recovering Rampant Repetition

The Problem: Historically, open-weights models have struggled to recover from repetitive spirals, both in the immediately preceding tokens (i.e. repeating the same token endlessly) and, more structurally, in phrasing or general writing flow. Easy, we train the model on sections of low quality…

Visually Adequate

Lifting the Mask on Ghost Attention

One of the many gems from the Llama 2 paper was a technique the authors referred to as “Ghost Attention”. More generally, the idea is to synthetically create chat histories paired with system messages in which only the final message obeys or acknowledges the system message.
