We are looking to implement a ChatGPT-like solution on-premises. The solution should provide a GUI and back-end components, plus the ability to connect to LLMs both on-prem and in the cloud (routing according to the sensitivity of the data). Feature examples: questions to a document/knowledge base, simple agent creation, RAG, and multimodal support.

The guy you should take seriously in Travel and Hospitality, 15 hours ago

If you are looking to roll out a ChatGPT-style solution on-prem with cloud smarts, I'd recommend the Microsoft solution stack, which IMHO meets all your needs and then some.

Architecture: 
Use a web-based GUI (React, Blazor, or Power Apps) with Azure Functions or AKS for orchestration. AKS now supports KAITO for deploying OSS LLMs like Phi-4 and Qwen locally—YAML your way to greatness.

LLMs:
On-prem: Use ONNX Runtime GenAI or vLLM for fast, efficient inference.
Cloud: Azure OpenAI Service gives you GPT-4, GPT-4o, and embeddings with enterprise-grade security.
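
A minimal sketch of the on-prem side, assuming a vLLM server exposing its OpenAI-compatible API on localhost (the model name and URL here are placeholders, not a prescription):

from openai import OpenAI

# Point the standard OpenAI client at the local vLLM server
# (vLLM serves an OpenAI-compatible API under /v1).
client = OpenAI(base_url="http://localhost:8000/v1", api_key="unused")

resp = client.chat.completions.create(
    model="Qwen/Qwen2.5-7B-Instruct",  # whichever model the server loaded
    messages=[{"role": "user", "content": "Summarize our VPN setup guide."}],
)
print(resp.choices[0].message.content)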

Hybrid Routing: 
Route sensitive data to on-prem LLMs and general queries to Azure OpenAI using Azure Arc and AI Studio. 
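
To make the routing concrete, here is a hedged Python sketch. The keyword filter is a naive stand-in for whatever sensitivity classifier or DLP service you actually use, and the endpoints, keys, and deployment names are placeholders:

import re
from openai import OpenAI, AzureOpenAI

# Placeholder endpoints: an internal vLLM/ONNX server and an Azure OpenAI deployment.
onprem = OpenAI(base_url="http://llm.internal:8000/v1", api_key="unused")
cloud = AzureOpenAI(azure_endpoint="https://myorg.openai.azure.com",
                    api_key="<key>", api_version="2024-06-01")

# Naive stand-in for a real sensitivity classifier or DLP check.
SENSITIVE = re.compile(r"\b(ssn|iban|salary|patient|contract)\b", re.I)

def ask(prompt: str) -> str:
    msgs = [{"role": "user", "content": prompt}]
    if SENSITIVE.search(prompt):   # sensitive -> stays on-prem
        r = onprem.chat.completions.create(model="phi-4", messages=msgs)
    else:                          # general -> Azure OpenAI
        r = cloud.chat.completions.create(model="gpt-4o", messages=msgs)
    return r.choices[0].message.content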

RAG & Multimodal: 
The Azure Multimodal AI & LLM Processing Accelerator supports RAG, document Q&A, multimodal inputs, and even confidence scoring.
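
Stripped of the tooling, the RAG core is retrieve-then-generate. A bare-bones sketch of that pattern (the chunks, model names, and top-1 retrieval are illustrative only, not how the accelerator is built):

import numpy as np
from openai import OpenAI

client = OpenAI()  # or point base_url at the on-prem server above

def embed(texts):
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in out.data])

chunks = ["Reset a VPN token via the self-service portal.",
          "Expense reports are due by the 5th of each month."]
index = embed(chunks)

def answer(question: str) -> str:
    q = embed([question])[0]
    scores = index @ q / (np.linalg.norm(index, axis=1) * np.linalg.norm(q))
    context = chunks[int(np.argmax(scores))]   # top-1 retrieval for brevity
    prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    r = client.chat.completions.create(
        model="gpt-4o-mini", messages=[{"role": "user", "content": prompt}])
    return r.choices[0].message.content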

Agent Creation: 
Copilot Studio and Azure AI Studio let you build and orchestrate agents with Prompt Flow and LLMOps best practices.

Governance:
Microsoft’s AI Governance framework ensures your solution is secure, compliant, and responsibly deployed.

So yes, you can have your AI cake on-prem, drizzle it with cloud, and serve it with a side of governance! Hit me up if you (or anyone reading this) would like links to the accelerators or GitHub repos.

VP of Engineering in Manufacturing, 19 hours ago

I assume that you are looking to create a domain-specific knowledge-base Q&A. We have found that just loading the documents into a RAG database and opening it up creates a very low-quality experience. Depending on the LLM used, we see a lot of hallucination; if we lower the temperature, we see unhelpful answers.

To fix this, we focused on a few things.
1) We really needed a Q&A test suite to verify we were getting accurate responses. We manually created about 200 question/answer pairs and ran the system through this test matrix to check its accuracy before going live in production (see the test-harness sketch after this list).
2) It turns out that just chopping up our documents resulted in poor retrieval relevance from the vector DB. We needed to look at both the structure and the content of the documents. Adding metadata about a paragraph or set of paragraphs, rather than blindly segmenting the input documents, helped.
3) Sometimes the content itself was not well structured for a RAG application; tables and charts, for example, are particularly difficult. Sometimes we added context text that explained the images, or we rewrote that part of the documentation to make it easier to parse. Going forward, I expect documentation teams and knowledge-base design to become constrained by requirements to enable LLMs, and that will change the style and structure of these repositories.
4) Look at agentic AI platforms (e.g., LangChain). Instead of just dumping everything into a single vector DB and summarizing it out of context, try keeping general documentation separate from break/fix or knowledge-base content. Then rewrite the user's prompt based on the general documentation so that, when you ultimately query the break/fix documentation, you ask the right question (see the prompt-rewrite sketch below). Or, alternatively, query the user to fill in missing information (e.g., product model number or other important context).
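
On point 1, the test harness can be very small. A sketch, assuming an ask() function wrapping your RAG pipeline and an LLM-as-judge scorer (both hypothetical):

import json

def run_suite(ask, judge, path="qa_suite.json", threshold=0.90):
    # qa_suite.json: [{"question": ..., "expected": ...}, ...]  (~200 entries)
    cases = json.load(open(path))
    passed = 0
    for case in cases:
        answer = ask(case["question"])
        # judge() compares the answer to the reference, e.g. via an LLM grader
        if judge(case["question"], case["expected"], answer):
            passed += 1
    accuracy = passed / len(cases)
    print(f"{passed}/{len(cases)} passed ({accuracy:.0%})")
    return accuracy >= threshold  # gate the go-live on this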
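
And on point 4, the two-stage prompt-rewrite looks roughly like this; general_store, breakfix_store, and llm are hypothetical interfaces standing in for your vector stores and model client:

def rewrite_then_query(user_prompt, general_store, breakfix_store, llm):
    # Stage 1: ground the question in the general documentation.
    background = general_store.search(user_prompt, k=3)
    rewritten = llm(
        "Rewrite this support question so it names the product, model, "
        f"and symptom explicitly.\nBackground:\n{background}\n\n"
        f"Question: {user_prompt}")
    # Stage 2: query the break/fix KB with the sharpened question.
    articles = breakfix_store.search(rewritten, k=5)
    return llm(f"Answer using only these articles:\n{articles}\n\n"
               f"Question: {rewritten}")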

We have found these systems are conceptually simple, but things get tricky at scale; you have to marry your organization's domain, documentation/KB, and technology together to get predictable answers.

Director of IT Governance in Finance (non-banking), 19 hours ago

We stood up some technology to leverage LLM capabilities with our contract data: to help determine risks, opportunities, consistency, etc. One of the first challenges was applying guardrails to the chat part of the solution and obfuscating/masking what is private to us. Our technology partner created an obfuscation engine to control what goes to the LLM and how the tool is used (a minimal sketch of the idea follows). It became a standalone solution, and thousands of people at our company are now using it; it sounds like what you're looking for. It also has a private LLM, etc. If you want to know more, connect with me; Gartner is very familiar with what we've done.
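
For anyone wanting the flavor of such an obfuscation engine, here is a toy sketch; a production version would use NER/DLP classifiers rather than a couple of regexes, and these patterns are illustrative only:

import re

# Illustrative patterns only; real engines use NER/DLP classifiers.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def mask(text):
    # Swap private values for placeholder tokens before the text leaves
    # the network; keep the mapping so the reply can be un-masked.
    mapping = {}
    for label, pat in PATTERNS.items():
        for i, value in enumerate(set(pat.findall(text))):
            token = f"<{label}_{i}>"
            mapping[token] = value
            text = text.replace(value, token)
    return text, mapping

def unmask(text, mapping):
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text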

VP of Product Management, 19 hours ago

Hi Cedric, our organization has faced similar challenges: deciding between hosting options and ensuring a new environment is secure, user-friendly, supports RAG, and integrates with existing systems. I'd be happy to have a conversation with you and one of our AI architects to discuss our path. I've sent you a connection request.
