[FEEDBACK] Inference Providers

#49
by julien-c - opened
Hugging Face org

Any inference provider you love, and that you'd like to be able to access directly from the Hub?

Love that I can call DeepSeek R1 directly from the Hub 🔥

from huggingface_hub import InferenceClient

# Route the request through the Together provider; authenticate with a
# Hugging Face token or a provider API key.
client = InferenceClient(
    provider="together",
    api_key="xxxxxxxxxxxxxxxxxxxxxxxx"
)

messages = [
    {
        "role": "user",
        "content": "What is the capital of France?"
    }
]

completion = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1",
    messages=messages,
    max_tokens=500
)

# Print the assistant's reply.
print(completion.choices[0].message)

Is it possible to set a monthly payment budget or rate limits for all the external providers? I don't see such options in the billing tab. In case a key or session token is stolen, it could be quite dangerous to my thin wallet :(

Hugging Face org

@benhaotang you already get spending notifications when crossing important thresholds ($10, $100, $1,000) but we'll add spending limits in the future

Thanks for your quick reply, good to know!
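Until built-in spending limits ship, one client-side workaround is to track estimated spend yourself and refuse calls past a cap. A minimal sketch (the `BudgetGuard` class, its thresholds, and the cost figures are hypothetical, not a Hugging Face API):

```python
from dataclasses import dataclass, field

@dataclass
class BudgetGuard:
    """Refuses further calls once an estimated monthly budget is exhausted."""
    monthly_budget_usd: float
    spent_usd: float = 0.0
    thresholds: tuple = (10.0, 100.0, 1000.0)
    _notified: set = field(default_factory=set)

    def allow(self, estimated_cost_usd: float) -> bool:
        # Check the estimate against the remaining budget before sending.
        return self.spent_usd + estimated_cost_usd <= self.monthly_budget_usd

    def record(self, cost_usd: float) -> None:
        # Record the actual billed cost and warn when thresholds are crossed.
        self.spent_usd += cost_usd
        for t in self.thresholds:
            if self.spent_usd >= t and t not in self._notified:
                self._notified.add(t)
                print(f"warning: spending crossed ${t:,.0f}")

guard = BudgetGuard(monthly_budget_usd=5.0)
if guard.allow(0.002):
    # ... call client.chat.completions.create(...) here ...
    guard.record(0.002)
```

This only guards calls made through your own code, of course; a stolen key used elsewhere still needs server-side limits.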

Would be great if you could add Nebius AI Studio to the list :) New inference provider on the market, with the absolute cheapest prices and the highest rate limits...

Could be good to add featherless.ai

TitanML !!

Would be great to have Simplismart on the list!

Hi!

Would be great if Snowcell could be added to the list. We build complete inference solutions from the ground up.

I couldn't find a specific contact point to reach about this, but for any questions we are available at [email protected].

Best Regards

Hello,

At FAIM, we are building an inference platform for time-series foundation models: https://faim.it.com/.
All models we currently support are available on Hugging Face.

I would like to clarify whether it’s possible for us to become an inference provider on Hugging Face for time-series models.

Thank you and best regards,
Andrei
[email protected]

Hello, Gatewayz is ready for integration. Please email me at [email protected]

Hi Hugging Face team!

We're gcube (https://gcube.ai), a GPU sharing platform from South Korea. We make AI inference super affordable by connecting idle GPUs from cloud providers and even PC cafes across Korea - basically turning unused computing power into a distributed GPU network.

Our customers are seeing 55-70% cost savings, and we work with major Korean cloud partners like Naver Cloud, NHN Cloud, and KT Cloud.

We'd love to become an official Inference Provider on Hugging Face. Would really appreciate any guidance on the next steps!

Our HF org: https://huggingface.co/gcube-ai (Team plan subscribed)

Thanks!

Best,
Koo
Data Alliance (gcube)
[email protected]

Hello Hugging Face team 👋

We’re from Simplismart.ai, a Series A startup backed by Accel, building a modular MLOps platform focused on high-performance inference. We’re currently exploring the process of listing our inference APIs as a provider on Hugging Face.

We’ve gone through the inference provider documentation and are preparing for the next steps, but before raising a PR, we’d appreciate some clarity around the billing flow, specifically:

Questions:

  1. What is the expected delay between a successful inference request and Hugging Face calling the billing endpoint?
  2. If we’re unable to return cost details within one minute when Hugging Face hits the billing endpoint, does Hugging Face retry the request? If so, what’s the retry behavior?

We want to ensure our implementation aligns closely with Hugging Face’s billing expectations, so any guidance on the above would be very helpful.
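Whatever the exact retry behavior turns out to be, one way to stay well under a response deadline is to record each request's cost at inference time, so the billing callback is answered from memory rather than recomputed. A minimal sketch (the `CostLedger` class and request IDs are hypothetical; the real billing contract is whatever Hugging Face's provider documentation specifies):

```python
import threading

class CostLedger:
    """Stores each request's cost as soon as inference completes, so a
    later billing callback can be served with an O(1) lookup."""

    def __init__(self):
        self._costs = {}
        self._lock = threading.Lock()

    def record(self, request_id: str, cost_usd: float) -> None:
        # Called right after the inference request finishes.
        with self._lock:
            self._costs[request_id] = cost_usd

    def lookup(self, request_id: str):
        # Called by the billing endpoint handler; returns None if unknown.
        with self._lock:
            return self._costs.get(request_id)

ledger = CostLedger()
ledger.record("req-123", 0.0042)
assert ledger.lookup("req-123") == 0.0042
assert ledger.lookup("unknown") is None
```

In production the dict would be a persistent store, but the shape is the same: compute once at inference time, serve cheaply at billing time.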

Thanks in advance for the support! 🤗

-- Pratik Parmar
Developer Advocate @ Simplismart.ai
[email protected]

Hello Hugging Face team 👋

We are from AlphaNeural AI, an AI infrastructure and inference platform built around one core idea:
Developers should be able to deploy their models and make money from them easily.

AlphaNeural allows model creators to deploy their models on our infrastructure, set licensing or access rules, and monetize usage directly. These deployed models can then be exposed on Hugging Face, while AlphaNeural handles the inference layer with a strong focus on affordability, scalability, performance, and reliability.

We provide inference for LLMs, text, image, video, audio, and multimodal models, optimized for high throughput and low latency, powered by GPU infrastructure that is up to 70% cheaper than traditional cloud providers. This makes it possible for developers to price their models competitively while still earning meaningful revenue.

Our platform is designed to be:
• Developer friendly with simple deployment and API access
• Cost efficient for both creators and users
• High performance and production ready
• Scalable from experimentation to enterprise workloads

AlphaNeural also bridges Web2 and Web3, enabling both traditional subscription based monetization and on chain licensing for models, opening new revenue paths for creators without forcing any Web3 complexity on end users.

We would love to become an official Inference Provider on Hugging Face, enabling model authors to deploy and monetize their models through AlphaNeural, while Hugging Face users access them with fast, affordable, and reliable inference. To maintain high quality, AlphaNeural applies strict technical and operational standards for all deployed models, ensuring strong performance, stability, and production readiness across the platform.

From a technical perspective, we have already completed the full API integration according to Hugging Face’s inference provider specifications, prepared the required interfaces, and have the PR ready for review. We are now simply waiting for approval to move forward.

Happy to share technical details and follow the onboarding process.

We have two PRs ready for review:

https://github.com/huggingface/huggingface.js/pull/1907
https://github.com/huggingface/huggingface_hub/pull/3678

Please reach out to:
[email protected]
[email protected]

Looking forward to collaborating 🤗

Team AlphaNeural AI
