Faster Inference
Model Gateway can be used with Azure OpenAI to provide load balancing and failover across multiple regions, keeping inference fast even when one region is throttled or unavailable.
The diagram below shows a typical load-balancing setup with Azure OpenAI and Model Gateway:
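The failover behavior in this setup can be sketched as follows. This is a minimal illustration, not Model Gateway's actual implementation: the endpoint URLs and the `with_failover` helper are hypothetical, and a real deployment would send authenticated HTTPS requests to each regional Azure OpenAI resource.

```python
# Hypothetical regional Azure OpenAI endpoints; the resource names are
# illustrative only.
ENDPOINTS = [
    "https://my-eastus.openai.azure.com",
    "https://my-westeurope.openai.azure.com",
    "https://my-japaneast.openai.azure.com",
]

class AllEndpointsFailed(Exception):
    """Raised when every region rejected or failed the request."""

def with_failover(endpoints, send):
    """Try each endpoint in order and return the first successful response.

    `send` is a callable that takes an endpoint URL and returns a response,
    raising an exception on failure (e.g. a 429 rate limit or a regional
    outage). A gateway applies the same logic transparently per request.
    """
    last_error = None
    for endpoint in endpoints:
        try:
            return send(endpoint)
        except Exception as exc:
            last_error = exc  # fall through to the next region
    raise AllEndpointsFailed(f"all regions failed: {last_error}")
```

For example, if the first region returns `429 Too Many Requests`, the request is retried against the next region and the caller never sees the error.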
Routing to the fastest region
Model Gateway also supports dynamic load balancing to the fastest available region of any cloud you use. To enable this feature, please contact us.
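Latency-based routing of this kind can be sketched as picking the region with the lowest recently measured latency. The function below is an illustrative assumption, not Model Gateway's routing algorithm; a production router would keep a moving average of per-region latencies from health probes or live traffic.

```python
def fastest_region(latencies_ms):
    """Return the region with the lowest measured latency.

    `latencies_ms` maps region name -> latency in milliseconds, e.g. as
    collected from periodic health-check probes.
    """
    if not latencies_ms:
        raise ValueError("no regions available")
    return min(latencies_ms, key=latencies_ms.get)

# Example probe results (hypothetical numbers):
probes = {"eastus": 120.0, "westeurope": 45.0, "japaneast": 210.0}
# fastest_region(probes) routes the next request to "westeurope".
```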