Serverless for machine learning
While building ML systems for the Playbook platform, we also helped optimize their architecture, making it as serverless as possible.
We also took advantage of other cloud computing solutions. For example, when we needed system libraries that Cloud Functions does not provide, we implemented several services with the higher-level Google Cloud Run. Cloud Run let us build the on-demand container image ourselves, whereas in Cloud Functions the image is predefined and unchangeable: you can only load and run your code there. We also configured deployment to Cloud Run through GitHub Actions.
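A custom image like this is defined by a Dockerfile. The sketch below shows the general shape; the file names, the gunicorn entry point, and the libgomp1 system library are illustrative assumptions, not the project's actual dependencies.

```dockerfile
# Minimal sketch of a custom Cloud Run image. Names here are
# assumptions for illustration, not the real project's setup.
FROM python:3.11-slim

# Install system libraries that plain Cloud Functions cannot provide
# (libgomp1 is just an example of such a library).
RUN apt-get update && apt-get install -y --no-install-recommends \
        libgomp1 \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .

# Cloud Run sends requests to the port given in the PORT
# environment variable (8080 by default).
CMD exec gunicorn --bind :$PORT --workers 1 --threads 8 app:app
```

From a GitHub Actions workflow (or locally), such an image can be built and released with `gcloud run deploy SERVICE --source .`, which builds the container from the Dockerfile and deploys it in one step.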
However, neither Cloud Functions nor Cloud Run supports graphics accelerators, and their instance types are quite limited in resources. So, to work with heavier neural networks that require accelerators (GPUs), we stepped up to the next level of serverless architecture: Managed Instance Groups. We created an instance template that attaches an accelerator, set up a startup script for it, and built a Managed Instance Group on top of it that starts and shuts down additional instances depending on the load, in accordance with the auto-scaling settings.
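The three steps above (template, group, auto-scaling) can be sketched with gcloud. All names, the zone, the machine and accelerator types, and the scaling thresholds below are assumptions for illustration; note that GPU instances require the TERMINATE maintenance policy.

```shell
# 1. Instance template with a GPU attached and a startup script.
gcloud compute instance-templates create ml-gpu-template \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-t4,count=1 \
    --maintenance-policy=TERMINATE \
    --image-family=debian-11 --image-project=debian-cloud \
    --metadata-from-file=startup-script=startup.sh

# 2. Managed Instance Group built on top of the template.
gcloud compute instance-groups managed create ml-gpu-group \
    --zone=us-central1-a \
    --template=ml-gpu-template \
    --size=1

# 3. Auto-scaling: add instances under load, remove them when idle.
gcloud compute instance-groups managed set-autoscaling ml-gpu-group \
    --zone=us-central1-a \
    --min-num-replicas=1 \
    --max-num-replicas=4 \
    --target-cpu-utilization=0.6 \
    --cool-down-period=120
```

The cool-down period gives a freshly booted instance time to finish its startup script before its load is counted toward scaling decisions.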
By combining this variety of Google serverless tools, we completed all the tasks related to building machine learning features without wasting time on setup and infrastructure maintenance. Because all these tools auto-scale and are designed to optimize computing resources, they save money and absorb sudden influxes of users without any additional action on the client's side.