Model Warm-up
Model warm-up means running a few requests through an AI model before it serves production traffic, so that one-time initialization work is done in advance. This reduces latency on the first inferences and makes the model ready for immediate use.
Key Advantages:
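- Improved initial performance: the first real requests do not pay one-time initialization costs.
- Stable response times: latency is consistent from the first request onward instead of starting slow and speeding up later.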
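To make the idea concrete, here is a minimal Python sketch of warm-up in general. It is not Nitro's implementation: `ToyModel`, its lazy setup step, and `warm_up` are hypothetical names invented for illustration, and the "model" just upper-cases text. The point it shows is that a throwaway request triggers the one-time setup, so the first real request no longer does.

```python
class ToyModel:
    """Hypothetical stand-in for a model whose first call pays a one-time setup cost."""

    def __init__(self):
        self.initialized = False
        self.setup_runs = 0

    def _lazy_setup(self):
        # Placeholder for one-time work such as allocating buffers or filling caches.
        self.setup_runs += 1
        self.initialized = True

    def infer(self, prompt: str) -> str:
        if not self.initialized:
            self._lazy_setup()
        return prompt.upper()


def warm_up(model: ToyModel, dummy_prompts=("hello",)) -> None:
    """Send throwaway requests so setup happens before real traffic arrives."""
    for prompt in dummy_prompts:
        model.infer(prompt)  # the result is discarded on purpose


model = ToyModel()
warm_up(model)

# By the time a user sends a request, the setup has already run.
first_request_is_cold = not model.initialized
result = model.infer("first real request")
```

In this sketch the setup runs exactly once, during `warm_up`, which is the behavior both advantages above depend on.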
How to Enable Model Warm-up
On the Nitro server, model warm-up is enabled automatically whenever a new model is loaded. The server runs the warm-up behind the scenes, so the model is ready to respond efficiently from the first inference request.
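The "warm up on load" pattern described here can be sketched as follows. This is an illustrative assumption about how such a server could be structured, not Nitro's actual code or API: `ToyServer`, `load_model`, `infer`, and the warm-up prompt are all hypothetical names.

```python
from typing import Callable, Dict, List


class ToyServer:
    """Hypothetical server that warms up every model as part of loading it."""

    def __init__(self, warmup_prompt: str = "warm-up"):
        self.warmup_prompt = warmup_prompt
        self.models: Dict[str, Callable[[str], str]] = {}
        self.warmed: set = set()

    def load_model(self, name: str, model_fn: Callable[[str], str]) -> None:
        self.models[name] = model_fn
        # Warm-up is part of loading: the caller never has to request it.
        model_fn(self.warmup_prompt)  # output is discarded
        self.warmed.add(name)

    def infer(self, name: str, prompt: str) -> str:
        return self.models[name](prompt)


# A toy model function that records every prompt it receives.
seen: List[str] = []


def echo_model(prompt: str) -> str:
    seen.append(prompt)
    return f"echo: {prompt}"


server = ToyServer()
server.load_model("demo", echo_model)
reply = server.infer("demo", "hi")
```

Here the warm-up prompt reaches the model before any client request does, and the client only ever calls `load_model` and `infer`.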