Like many companies, we're starting to develop applications with generative AI, primarily using the RAG architecture because frequent data changes make fine-tuning less effective. We're considering GPT-3.5, GPT-4, and Gemini, but want options beyond these due to cloud dependency and cost. We're planning a multi-model approach, selecting LLMs based on context size and other factors. Can you suggest a decision-making framework for evaluating LLMs?
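One common decision-making framework is a weighted scoring matrix: define the criteria that matter for your RAG workload (context window, answer quality, cost, latency, deployment control), weight them, and score each candidate model. The sketch below is a minimal, hypothetical example; the criteria names, weights, and per-model scores are illustrative assumptions, not benchmark results.

```python
# Hypothetical weighted-scoring framework for LLM selection.
# All weights and scores below are illustrative assumptions, not benchmarks.
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    scores: dict  # criterion -> score on a 1-5 scale

# Weights reflect what might matter for a RAG workload; tune to your priorities.
WEIGHTS = {
    "context_window": 0.25,
    "answer_quality": 0.25,
    "cost_per_1k_tokens": 0.20,   # higher score = cheaper
    "latency": 0.15,
    "deployment_control": 0.15,   # self-hosting / region options
}

def weighted_score(c: Candidate) -> float:
    """Sum of weight * score over all criteria (missing criteria score 0)."""
    return sum(WEIGHTS[k] * c.scores.get(k, 0) for k in WEIGHTS)

candidates = [
    Candidate("gpt-4", {"context_window": 4, "answer_quality": 5,
                        "cost_per_1k_tokens": 2, "latency": 3,
                        "deployment_control": 2}),
    Candidate("gpt-3.5-turbo", {"context_window": 3, "answer_quality": 3,
                                "cost_per_1k_tokens": 5, "latency": 5,
                                "deployment_control": 2}),
    Candidate("open-weights-llm", {"context_window": 3, "answer_quality": 3,
                                   "cost_per_1k_tokens": 4, "latency": 3,
                                   "deployment_control": 5}),
]

# Rank candidates from highest to lowest weighted score.
ranked = sorted(candidates, key=weighted_score, reverse=True)
for c in ranked:
    print(f"{c.name}: {weighted_score(c):.2f}")
```

The value of the exercise is less the final number than forcing the team to agree on weights; re-running the matrix per use case (e.g. long-document summarization vs. short chat) naturally produces the multi-model split you're planning.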

IT Manager in IT Services, a year ago

Hugging Face (https://huggingface.co) is something like GitHub for LLMs, datasets, etc. It will give your team the opportunity to test a variety of LLMs, both open source and proprietary, including models from OpenAI, Microsoft, and Google. In my research I have found that while some models may not currently be the best on the market, they show great potential to become very good in the near future, particularly in the healthtech and fintech markets.

Product Associate in Software, a year ago

Hi!
I cannot directly answer your question, as I have only used the GPT family from OpenAI myself. However, for completeness, I would add some good options to your list:
- GPT-family LLMs from Azure OpenAI Service: provides similar value to the direct OpenAI API, but with more control, e.g. over where the data is processed geographically;
- Claude API from Anthropic: I haven't used it yet but plan to evaluate it, as the quality seems to be on par with OpenAI's models.
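Since the original question mentions selecting LLMs by context size, one way to operationalize a multi-model approach is a small router that picks the cheapest model whose context window fits the request. This is a hypothetical sketch: the model names, orderings, and token limits below are illustrative assumptions, and real deployments should look up current limits from each provider.

```python
# Hypothetical multi-model router: choose the cheapest model whose context
# window can hold the prompt plus reserved output tokens.
# Model names and context sizes are illustrative assumptions.

MODELS = [  # ordered by approximate cost, cheapest first
    {"name": "gpt-3.5-turbo", "context_tokens": 16_385},
    {"name": "claude-3-haiku", "context_tokens": 200_000},
    {"name": "gpt-4-turbo", "context_tokens": 128_000},
]

def pick_model(prompt_tokens: int, reserve_for_output: int = 1_000) -> str:
    """Return the first (cheapest) model that fits prompt + output budget."""
    needed = prompt_tokens + reserve_for_output
    for m in MODELS:
        if m["context_tokens"] >= needed:
            return m["name"]
    raise ValueError(f"No configured model can fit {needed} tokens")
```

For example, a short RAG query would route to the cheapest model, while a request stuffed with many retrieved chunks would fall through to a long-context model; you can extend the same table with per-model cost and region fields as your criteria grow.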
