Limits
During the open beta, the following limits are in place:
Inference requests per minute (per model)
- @cf/meta/llama-2-7b-chat-int8 - 50 reqs/min
- @cf/openai/whisper - 4000 reqs/min
- @cf/meta/m2m100-1.2b - 4000 reqs/min
- @cf/huggingface/distilbert-sst-2-int8 - 6000 reqs/min
- @cf/microsoft/resnet-50 - 6000 reqs/min
- @cf/baai/bge-base-en-v1.5 - 6000 reqs/min
Note that these limits are estimates, are subject to change, and may vary by location during the open beta.
Model inference requests made in local mode using Wrangler also count toward these limits.
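Because these per-model limits are enforced per minute, client code should be prepared for rate-limited responses. The sketch below shows one common approach, retrying with exponential backoff; it is an illustrative assumption, not an official API: the `callModel` callback and the use of HTTP 429 as the rate-limit signal are hypothetical placeholders for however your Worker invokes a model.

```typescript
// Delay schedule for exponential backoff: baseMs, 2*baseMs, 4*baseMs, ...
function backoffDelays(maxRetries: number, baseMs = 1000): number[] {
  return Array.from({ length: maxRetries }, (_, i) => baseMs * 2 ** i);
}

// Hypothetical wrapper: retry a model call while it returns HTTP 429
// (rate limited). `callModel` stands in for whatever function performs
// the inference request in your Worker.
async function runWithRetry(
  callModel: () => Promise<Response>,
  maxRetries = 3,
): Promise<Response> {
  let response = await callModel();
  for (const delayMs of backoffDelays(maxRetries)) {
    if (response.status !== 429) break; // not rate limited; stop retrying
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    response = await callModel();
  }
  return response;
}
```

With the defaults above, a request that keeps hitting the limit is retried after 1 s, 2 s, and 4 s before the last response is returned to the caller.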