A co-distillation framework iteratively adapts sequence-only protein language models for high-accuracy variant effect prediction, without requiring additional structural or genetic data.