If a mid size bank wants to create its own CHATGPT + its own data (like policy, procedures etc.) loaded into it for fast retrieval and search, what advice folks will have for the reference architecture and components/tools that might be needed to accomplish that? Any thoughts on resources and cost will also be welcome.
Thank you.

You will need at least the following infrastructure:
-> RAG (Retrieval-Augmented Generation) is the default: do not fine-tune initially.
-> Orchestration: LangChain, LlamaIndex, Semantic Kernel, or Guidance. Use it to:
--> Route queries (FAQ vs policy vs procedure).
--> Generate queries for multi-step retrieval (expansion, re-ranking, de-dup).
--> Insert citations and quoted snippets in answers.
-> Models (pick 1–2, keep pluggable):
--> Hosted API (enterprise controls): Azure OpenAI, OpenAI, Anthropic on AWS Bedrock, Google Vertex.
--> Self/Private-host (compliance/data residency): Llama 3.1/3.2 variants, Mistral Large, Qwen, etc. via NVIDIA NIM, vLLM, or TGI.
-> Embedding model: use a modern embedding model that supports multilingual text and long context. Keep an embeddings layer you can re-run offline if you swap models.
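To make the retrieve-then-generate flow concrete, here is a minimal sketch in plain Python. Everything here is a stand-in: the `embed` function is a toy bag-of-words hash (in production you would call your chosen embedding model), `PolicyStore` is a naive in-memory vector store (you would use a real vector database), and the final LLM call is only indicated by a comment.

```python
import math

def embed(text: str, dim: int = 64) -> list[float]:
    # Toy deterministic "embedding": hash each word into a fixed-size
    # bag-of-words vector, then L2-normalize. Placeholder only --
    # swap in your real embedding model here.
    vec = [0.0] * dim
    for word in text.lower().split():
        vec[sum(ord(c) for c in word) % dim] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a: list[float], b: list[float]) -> float:
    # Vectors are already normalized, so the dot product is the cosine.
    return sum(x * y for x, y in zip(a, b))

class PolicyStore:
    """Naive in-memory vector store: ingest once, search at query time."""

    def __init__(self):
        self.docs = []  # list of (doc_id, text, vector)

    def ingest(self, doc_id: str, text: str) -> None:
        self.docs.append((doc_id, text, embed(text)))

    def search(self, query: str, k: int = 2):
        qv = embed(query)
        ranked = sorted(self.docs, key=lambda d: cosine(qv, d[2]), reverse=True)
        return ranked[:k]

# Ingest two made-up internal documents, then retrieve for a question.
store = PolicyStore()
store.ingest("policy-42", "wire transfer limits require manager approval")
store.ingest("hr-7", "vacation policy accrual rates for employees")

top = store.search("wire transfer limits", k=1)[0]
# In production, the retrieved snippets (with their doc IDs for citations)
# would be packed into the prompt sent to your hosted or self-hosted LLM.
```

The same shape survives when you replace the toy pieces with real ones: the ingest path runs offline (so re-embedding after a model swap is a batch job), and the search path stays cheap at query time.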
Plus you have to add the application layers on top (depending on your use cases).
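As one example of an application-layer piece, the "route queries (FAQ vs policy vs procedure)" step from the orchestration list can start out very simple before you reach for an LLM-based router. A sketch, where the route names and keyword sets are made-up examples:

```python
# Hypothetical route categories and trigger keywords -- tune for your content.
ROUTES = {
    "faq": {"hours", "contact", "branch", "login"},
    "policy": {"policy", "limit", "compliance", "approval"},
    "procedure": {"how", "steps", "process", "submit"},
}

def route(query: str) -> str:
    """Score each route by keyword overlap; fall back to general RAG search."""
    words = set(query.lower().split())
    scores = {name: len(words & keywords) for name, keywords in ROUTES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"
```

A keyword router like this is crude, but it gives you a measurable baseline and a log of routing decisions before you decide whether an LLM classifier is worth the extra latency and cost.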
Cost-wise
**** NB: it really depends on the application type and on how many users will use it. ****
Based on my personal experience:
CAPEX (infrastructure setup, data ingestion and cleaning, vectorization and embeddings, app development, security and compliance setup) -- it really depends on in-house skills and the type of application, roughly $50k to $150k.
OPEX - recurring - (LLM hosting, vector database and storage, data ingestion pipeline, monitoring infrastructure, human oversight), roughly $30k-$50k/month.
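For the hosted-API part of that OPEX, a back-of-envelope token-cost model helps sanity-check the numbers against your own headcount. All figures below (user counts, token sizes, per-million-token prices) are placeholder assumptions for illustration, not vendor quotes:

```python
def monthly_llm_cost(users: int, queries_per_user_day: int,
                     tokens_in: int, tokens_out: int,
                     price_in_per_m: float, price_out_per_m: float,
                     workdays: int = 22) -> float:
    """Rough monthly API spend: queries x tokens x per-million-token price."""
    queries = users * queries_per_user_day * workdays
    per_query = (tokens_in * price_in_per_m + tokens_out * price_out_per_m) / 1e6
    return round(queries * per_query, 2)

# Assumed example: 500 staff, 10 queries/day each, 2k prompt tokens
# (question + retrieved snippets) and 500 completion tokens per query,
# at illustrative prices of $3 in / $15 out per million tokens.
cost = monthly_llm_cost(500, 10, 2000, 500, 3.0, 15.0)
```

Note that prompt tokens usually dominate in RAG (the retrieved context is large), so trimming how many chunks you stuff into each prompt moves the bill more than shortening the answers does.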