Exceeded GPU quota via API, but fine interactively

I have a Pro account and have created a Space using Gradio with a GPU. It works fine interactively, but when I connect to my Space via the Gradio API, after a few requests I get:
You have exceeded your GPU quota (60s requested vs. 58s left).
Create a free account to get more usage quota.
Can I somehow pass my credentials via the API in order to resolve this? I need to use an API to test the space I’ve created.
Thanks in advance.

Update: I set the Space to private and used the hf_token to connect via the Gradio API. It works for about 3 API calls and then I get the exact same error.
Why is it asking me to create a free account when I am using my access token?
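For anyone testing a Space from a script, here is a minimal sketch of how a test harness might tell this quota failure apart from other errors. It assumes the error text reaches the caller in the exception message with the wording quoted above. The names `parse_quota_error` and `call_and_classify` are hypothetical helpers, not part of Gradio or any Hugging Face library, and `fn` stands in for whatever call you make to the Space.

```python
import re
from typing import Callable, Optional, Tuple

# Pattern based on the error text quoted in this thread. The exact wording is
# an assumption and may differ between Zero GPU versions.
_QUOTA_RE = re.compile(
    r"exceeded your GPU quota \((\d+)s requested vs\. (\d+)s left\)"
)


def parse_quota_error(message: str) -> Optional[Tuple[int, int]]:
    """Return (requested_seconds, remaining_seconds), or None if no match."""
    match = _QUOTA_RE.search(message)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def call_and_classify(fn: Callable[[], object]) -> dict:
    """Run fn() and report whether it failed because of the GPU quota.

    fn is a hypothetical zero-argument callable, for example a lambda that
    wraps one request to the Space.
    """
    try:
        return {"ok": True, "result": fn(), "quota": None}
    except Exception as exc:  # broad on purpose: the client's error type is not assumed
        quota = parse_quota_error(str(exc))
        if quota is None:
            raise  # not a quota problem, so let it propagate unchanged
        return {"ok": False, "result": None, "quota": quota}
```

Logging the parsed `(requested, remaining)` pair for each failed call should at least show how quickly the remaining quota drops across API requests, which is the part that does not seem to be documented.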

Ehh…
That Gradio error has been happening to me for months now, even when I use it interactively while logged into my Pro account…

I think the login part is simply not working.
It’s written down, but I don’t think it actually works.

P.S.

You can report any bugs or specification oddities in Zero GPU to the Discussion section of this group.

My report as an example.

Thanks a mill for that. Login does work; I just got about 20 good API requests before it started to fail. It’s still working interactively, so it looks like there are quotas on the API, and these are not documented anywhere as far as I can see. Looks like I need to go elsewhere for a solution, like Runpod or Beam or another crew that offers GPUs on a usage basis, as opposed to a per-hour basis. The A100 would cost US$3000 per month here. I’m surprised Hugging Face don’t offer access on a usage basis; after all they do know how to share GPUs - that’s what Spaces is all about!


After hours of googling, I have to say that the Hugging Face articles on how to host on Runpod are absolutely awful. This was posted 3 weeks ago but there is no Step 2, it doesn’t exist. These guys waste so much of our time:


but there is no Step 2, it doesn’t exist

Oh, come on, you’re kidding. :sweat_smile:
But it’s rather common at HF. I don’t know whether it existed when they wrote the article, or whether they wrote it because it’s theoretically possible in an ideal state…
Still, it’s a little unusual for an article to be wrong like that; it’s more common in a README.md, which tends to have a lot of copy and paste.

I’m surprised HuggingFace don’t offer access on a usage basis; after all they do know how to share GPUs - that’s what spaces is all about!

I think it’s by the hour, not by the amount of usage, since we all share the same GPU at HF. I understand that’s the only way to do it, especially with the Zero GPU structure.
But I can very well understand that a per-usage billing plan would be helpful, because right now you are charged even while the system is down due to errors. (I saw someone complaining about it on the forum.)

I think it would be good to submit a request if possible. Well, it won’t happen right now, but adding more plans is technically just writing a few lines. From a management standpoint, though, it would require a meeting or something.

Request Form: