Cerebras Systems, maker of the world's largest chip, announced that its CS-2 system now supports PyTorch and TensorFlow, allowing researchers to train models with billions of parameters quickly and easily.
The CS-2, which Cerebras bills as the world's fastest AI system, is powered by the company's Wafer-Scale Engine 2 (WSE-2) processor. With the release of version 1.2 of the Cerebras Software Platform (CSoft), the CS-2 now supports additional machine learning frameworks, giving developers more choice in the types of models they want to run.
Emad Barsoum, Senior Director of AI Framework at Cerebras Systems, explained in a press release how CSoft now allows developers to express models written in TensorFlow or PyTorch, saying:
“From the beginning, our goal was to seamlessly support the machine learning frameworks our customers wanted to write in. Our customers write in TensorFlow and PyTorch, and our software stack, CSoft, makes it possible to quickly and easily express your models in the framework of your choice. In doing so, our customers have access to the Cerebras CS-2's 850,000 AI-optimized cores and 40 gigabytes of on-chip memory.”
Scaling large language models
CSoft version 1.2 lets developers write their models in the open source PyTorch or TensorFlow frameworks and run them on the Cerebras CS-2 without modification. Likewise, an AI model originally written for a GPU or a CPU can run on the CS-2 through CSoft without any changes.
With the combined power of CS-2 and CSoft, developers can scale from small models like BERT to larger existing models like GPT-3.
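To give a rough sense of the gap between those two model sizes, the sketch below estimates transformer weight counts with the common rule of thumb of about 12 x layers x width squared. The layer counts and hidden widths are taken from the published BERT-base and GPT-3 descriptions, not from Cerebras, and the helper name approx_params is purely illustrative. Embedding tables are left out, so the figures come in slightly under the headline 110 million and 175 billion parameter counts.

```python
# Rough, non-authoritative estimate of transformer parameter counts.
# Assumption: weights ~= 12 * n_layers * d_model**2 (attention + MLP blocks only,
# embeddings ignored). Model shapes come from the public model descriptions.

def approx_params(n_layers: int, d_model: int) -> int:
    """Approximate non-embedding parameter count of a transformer."""
    return 12 * n_layers * d_model ** 2

# (layers, hidden width) for the two models named in the article
MODELS = {
    "BERT-base": (12, 768),
    "GPT-3": (96, 12288),
}

estimates = {name: approx_params(*shape) for name, shape in MODELS.items()}

for name, count in estimates.items():
    print(f"{name}: ~{count / 1e9:.2f} billion parameters")

# How many times larger GPT-3 is than BERT-base under this estimate
ratio = estimates["GPT-3"] // estimates["BERT-base"]
print(f"GPT-3 is roughly {ratio}x the size of BERT-base")
```

Under this approximation the jump from BERT-base to GPT-3 is about three orders of magnitude, which is the scale range the announcement refers to.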
Training large models on GPUs is difficult and time consuming: training from scratch on a new dataset often takes weeks and tens of megawatts of power on large clusters of legacy equipment. And as the cluster grows, its power, cost, and complexity increase exponentially.
Cerebras Systems built the CS-2 to meet these challenges, and its AI system can set up even the largest models in just minutes. Because developers spend less time installing, configuring, and training their models on the CS-2, they can explore more ideas in less time.