Queue API

For requests that take longer than several seconds, as is usually the case with generative AI models, we provide a queue system.

It gives you granular control over surges in traffic, lets you cancel requests and monitor a request's current position in the queue, and removes the need to keep long-running connections open.
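The request lifecycle described above (submit, check position, cancel, fetch the result later) can be illustrated with a small in-memory model. This is only a sketch of the pattern, not the service's client or API: the class `ToyQueue`, its method names, the status strings, and the zero-based position are all assumptions made for illustration.

```python
from collections import OrderedDict
from itertools import count


class ToyQueue:
    """In-memory stand-in for a request queue (hypothetical, not the real service)."""

    def __init__(self):
        self._pending = OrderedDict()  # request_id -> payload, in arrival order
        self._results = {}             # request_id -> finished result
        self._ids = count(1)

    def submit(self, payload):
        """Enqueue a request and return its id immediately, without waiting."""
        request_id = f"req-{next(self._ids)}"
        self._pending[request_id] = payload
        return request_id

    def status(self, request_id):
        """Report the state and, while queued, the zero-based queue position."""
        if request_id in self._results:
            return {"status": "COMPLETED"}
        if request_id in self._pending:
            position = list(self._pending).index(request_id)
            return {"status": "IN_QUEUE", "position": position}
        return {"status": "NOT_FOUND"}

    def cancel(self, request_id):
        """Remove a request that is still waiting; return True if it was removed."""
        if request_id in self._pending:
            del self._pending[request_id]
            return True
        return False

    def process_next(self, handler):
        """Simulate a worker taking the oldest request and storing its result."""
        if not self._pending:
            return None
        request_id, payload = self._pending.popitem(last=False)
        self._results[request_id] = handler(payload)
        return request_id

    def result(self, request_id):
        """Return the stored result, or None if the request has not finished."""
        return self._results.get(request_id)


queue = ToyQueue()
first = queue.submit({"prompt": "a cat"})
second = queue.submit({"prompt": "a dog"})
third = queue.submit({"prompt": "a bird"})

before = queue.status(third)   # two requests are ahead of this one
queue.cancel(second)           # the caller changes their mind
after = queue.status(third)    # only one request is ahead now
queue.process_next(lambda payload: payload["prompt"].upper())
print(before, after, queue.result(first))
```

Because `submit` returns an id right away, the caller holds no connection open while the work is pending; it comes back later with that id to ask for the status or the result.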
