• Researchers from top US universities warn that extending pre-training can be detrimental to performance
  • Too much pre-training can deliver worse performance due to something akin to the butterfly effect
  • The longer models are pre-trained, the more sensitive they become to small changes that can disrupt the end result

Researchers from Carnegie Mellon, Stanford, Harvard, and Princeton are challenging one of AI development's accepted core beliefs – that the more pre-training data, the better the performance.

As reported by HPCwire, a new paper discusses the concept of "catastrophic overtraining," whereby extended pre-training can harm a model's performance after fine-tuning.


