Like many companies, we're starting to develop applications with generative AI, primarily using the RAG architecture because frequent data changes make fine-tuning less effective. We're considering GPT-3.5, GPT-4, and Gemini, but want options beyond these due to cloud dependency and cost. We're planning a multi-model approach, selecting LLMs based on context size and other factors. Can you suggest a decision-making framework for evaluating LLMs?
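One common decision-making framework is a weighted scoring matrix: define the criteria that matter for your RAG workload (context window, answer quality, cost, latency, deployment control), weight them, and score each candidate model. The sketch below is a minimal, hypothetical example; the criteria names, weights, and per-model scores are illustrative assumptions, not benchmark results.

```python
# Hypothetical weighted-scoring framework for LLM selection.
# All weights and scores below are illustrative assumptions, not benchmarks.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> score on a 1-5 scale

# Weights reflect what might matter for a RAG workload; tune to your priorities.
WEIGHTS = {
    "context_window": 0.25,
    "answer_quality": 0.25,
    "cost_per_1k_tokens": 0.20,   # higher score = cheaper
    "latency": 0.15,
    "deployment_control": 0.15,   # self-hosting / region options
}

def weighted_score(c: Candidate) -> float:
    """Sum of weight * score over all criteria (missing criteria score 0)."""
    return sum(WEIGHTS[k] * c.scores.get(k, 0) for k in WEIGHTS)

candidates = [
    Candidate("gpt-4", {"context_window": 4, "answer_quality": 5,
                        "cost_per_1k_tokens": 2, "latency": 3,
                        "deployment_control": 2}),
    Candidate("gpt-3.5-turbo", {"context_window": 3, "answer_quality": 3,
                                "cost_per_1k_tokens": 5, "latency": 5,
                                "deployment_control": 2}),
    Candidate("open-weights-llm", {"context_window": 3, "answer_quality": 3,
                                   "cost_per_1k_tokens": 4, "latency": 3,
                                   "deployment_control": 5}),
]

# Rank candidates from highest to lowest weighted score.
ranked = sorted(candidates, key=weighted_score, reverse=True)
for c in ranked:
    print(f"{c.name}: {weighted_score(c):.2f}")
```

The value of the exercise is less the final number than forcing the team to agree on weights; re-running the matrix per use case (e.g. long-document summarization vs. short chat) naturally produces the multi-model split you're planning.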

IT Manager in IT Services, a year ago

Hugging Face (https://huggingface.co) is something like GitHub for LLMs, datasets, etc. It will give your team the opportunity to test a variety of LLMs, both open source and proprietary, including models from OpenAI, Microsoft, and Google. In my research I have found that while some models may not currently be the best on the market, they show great potential to become very good in the near future, particularly in the healthtech and fintech markets.

Product Associate in Software, a year ago

Hi!
I cannot directly answer your question, as I have only used the GPT family from OpenAI myself. However, for completeness, I would add some good options to your list:
- GPT-family LLMs from Azure OpenAI Service: provides similar value to the direct OpenAI API, but with more control, e.g. over where the data is processed geographically;
- Claude API from Anthropic: I haven't used it yet but plan to evaluate it, as the quality seems to be on par with OpenAI's models.
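Since the original question mentions selecting LLMs by context size, one way to operationalize a multi-model approach is a small router that picks the cheapest model whose context window fits the request. This is a hypothetical sketch: the model names, orderings, and token limits below are illustrative assumptions, and real deployments should look up current limits from each provider.

```python
# Hypothetical multi-model router: choose the cheapest model whose context
# window can hold the prompt plus reserved output tokens.
# Model names and context sizes are illustrative assumptions.

MODELS = [  # ordered by approximate cost, cheapest first
    {"name": "gpt-3.5-turbo", "context_tokens": 16_385},
    {"name": "claude-3-haiku", "context_tokens": 200_000},
    {"name": "gpt-4-turbo", "context_tokens": 128_000},
]

def pick_model(prompt_tokens: int, reserve_for_output: int = 1_000) -> str:
    """Return the first (cheapest) model that fits prompt + output budget."""
    needed = prompt_tokens + reserve_for_output
    for m in MODELS:
        if m["context_tokens"] >= needed:
            return m["name"]
    raise ValueError(f"No configured model can fit {needed} tokens")
```

For example, a short RAG query would route to the cheapest model, while a request stuffed with many retrieved chunks would fall through to a long-context model; you can extend the same table with per-model cost and region fields as your criteria grow.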
