Resource Request for RAG-Based HPC User Support Chatbot (Faculty/Junior Researcher Collaboration Opportunity)

PI: Lindsay Wells

Objective:

To improve the accessibility of ICDS documentation and reduce the support load on client-facing teams by deploying a chatbot that provides accurate, context-aware answers to user questions, using a consolidated knowledge base (KB) and a retrieval-augmented generation (RAG) LLM architecture.

Background:

ICDS maintains a distributed knowledge base that includes help desk tickets (ServiceNow), internal documentation (BookStack, GitLab/Hub), and external-facing content (ICDS website, user guide, and vendor documentation such as Slurm and MATLAB). Navigating this fragmented knowledge base is challenging for both users and support teams. A semantic search and RAG-based chatbot system will let users and staff query this information in natural language, improving efficiency and the support experience.

Project Scope & Deliverables:

• Consolidated, extensible vector database of ICDS documentation

• Internal semantic search and chatbot interface for cross-team support

• User-facing chatbot for Level 0–1 user self-service, integrated into Client Support

• Scalable, maintainable system based on open-source LLMs and Responsible AI principles
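As a rough illustration of how the semantic search and chatbot deliverables fit together, the sketch below retrieves the most relevant KB snippets for a question and assembles them into an LLM prompt. It is a toy under stated assumptions: the bag-of-words `embed` function stands in for a real embedding model, the in-memory list stands in for the vector database, and the function names, prompt wording, and sample documents are all hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Stand-in for a real embedding model (assumption): lowercase term counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Rank every document against the query and keep the top k.
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str], k: int = 2) -> str:
    # Hypothetical prompt template: retrieved snippets become the LLM's context.
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    # Hypothetical placeholder snippets, not actual ICDS documentation.
    kb = [
        "Submit batch jobs with the sbatch command.",
        "MATLAB licenses are managed centrally.",
        "Reset your password through the account portal.",
    ]
    print(build_prompt("How do I submit a batch job?", kb, k=1))
```

In a production system the ranking step would be handled by the vector database and the prompt would be passed to the selected open-source LLM; only the overall retrieve-then-generate shape is meant to carry over.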

People Resource Request:

1–2 Junior Researchers:

• Perform web scraping, document parsing, text chunking, and metadata tagging, and generate semantic embeddings for ingestion into a vector database

• Scrub personal identifiers (names, user IDs, emails, etc.) from help desk tickets, logs, and internal documentation prior to ingestion into the vector database

• Evaluate and fine-tune embedding models and LLMs; build the RAG pipeline

• Curate domain-specific context; verify chatbot output relevance

• Develop ingestion pipelines, vector DB, backend API, and UI

• Assist with deploying inference services on ICDS infrastructure, and write documentation for future management and maintenance of the service
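The scrubbing and chunking tasks listed above could start from something like the following minimal sketch. It is illustrative only: the user-ID pattern (three lowercase letters followed by digits) is an assumed format, the placeholder tokens and function names are hypothetical, and regular expressions alone will not catch personal names, which would need a separate approach such as named-entity recognition or manual review.

```python
import re

# Simple email pattern; real ticket data may need a stricter or broader rule.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Assumed user-ID format for illustration only: e.g. "abc1234".
USER_ID_RE = re.compile(r"\b[a-z]{3}\d{1,5}\b")


def scrub(text: str) -> str:
    # Replace emails first so the user ID inside an address is not half-masked.
    text = EMAIL_RE.sub("[EMAIL]", text)
    return USER_ID_RE.sub("[USER_ID]", text)


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> list[str]:
    # Split text into windows of `size` words; consecutive windows share
    # `overlap` words so context is not lost at chunk boundaries.
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


if __name__ == "__main__":
    ticket = "User abc1234 (abc1234@example.edu) cannot submit jobs"
    print(scrub(ticket))
    print(chunk_words(scrub(ticket), size=5, overlap=2))
```

Scrubbing before chunking means identifiers never reach the embedding step; chunk size and overlap are tunable and would likely be chosen during the evaluation phase.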

Timeline:

• Initial prototype for internal testing: ~3 months

• Production deployment with feedback loop: ~6 months

Strategic Benefit:

This project aims to modernize user support, reduce manual load on staff, and democratize access to HPC knowledge. It also enables ICDS to evaluate internal capabilities with open-source LLM infrastructure, positioning the team for future AI-enhanced services.