Leveraging the OpenAI Assistant for a Customized Chatbot Experience: A Technical Overview

Introduction

In this article, I walk through integrating the OpenAI Assistant into a web-based chatbot, explain how it differs from the better-known ChatGPT, and show how preprocessing and retrieval features are used to produce personalized responses.

This was a unique project for a few reasons. Chiefly, without administrative access to this WordPress installation, I was limited to just a few options for integration. I therefore used a JavaScript front end and developed a wrapper around the OpenAI API as a backend. This separates functionality and, importantly, hides my API key.

Understanding the OpenAI Assistant vs. ChatGPT

The OpenAI Assistant is not a separate model. It is a configuration of one of OpenAI's GPT models, created through the Assistants API, with its own instructions and tools, and it keeps conversation state in threads. Unlike ChatGPT, which primarily generates text based on the input prompt, an Assistant can be equipped with retrieval capabilities, enabling it to access a predefined knowledge base to deliver more informed and relevant responses.

Setting Up the Backend

  1. API Access and Configuration: Securing access to the OpenAI API is the foundational step. This involves registering on the OpenAI platform and obtaining an API key for authenticating requests to the OpenAI Assistant.
  2. Server Setup: A server is essential for intermediating the communication between the frontend interface and the OpenAI API. This server processes requests from the frontend, forwards them to the OpenAI API, and relays the responses back to the frontend.
  3. Endpoint Establishment: An endpoint is created on the server to handle requests from the frontend. This endpoint is responsible for receiving user inputs, transmitting them to the OpenAI API, and returning the generated responses.
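
The endpoint's first job, checking the incoming payload before anything is forwarded to OpenAI, can be sketched in a few lines. This is an illustrative Python sketch, not the PHP backend used in this project; `validate_payload` and `MAX_MESSAGE_LENGTH` are hypothetical names, and the 500-character limit and error strings simply mirror the checks in index.php.

```python
import json

MAX_MESSAGE_LENGTH = 500  # assumption: same limit as the PHP backend


def validate_payload(raw_body: str) -> dict:
    """Parse a JSON request body; return the cleaned fields or an error dict."""
    try:
        payload = json.loads(raw_body)
    except json.JSONDecodeError:
        return {"error": "Invalid JSON"}
    if not isinstance(payload, dict):
        return {"error": "Invalid JSON"}

    message = payload.get("message") or ""
    if not isinstance(message, str) or not message:
        return {"error": "Message is empty"}
    if len(message) > MAX_MESSAGE_LENGTH:
        return {"error": "Message exceeds 500 characters"}

    # thread_id is optional; an empty value means a new thread is needed
    return {"message": message, "thread_id": payload.get("thread_id") or ""}
```

One difference worth knowing: Python's `len` counts characters, while PHP's `strlen` counts bytes, so the two limits are not identical for non-ASCII input.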

Backend Code (index.php)

<?php
header('Content-Type: application/json');

// Read the incoming JSON payload
$input = json_decode(file_get_contents('php://input'), true);
$user_message = $input['message'] ?? '';
$provided_thread_id = $input['thread_id'] ?? '';

// Check if the user message is valid
if (empty($user_message)) {
    echo json_encode(['error' => 'Message is empty']);
    exit;
} elseif (strlen($user_message) > 500) {
    echo json_encode(['error' => 'Message exceeds 500 characters']);
    exit;
}

// Your OpenAI API key
$api_key = 'XXX';

// Set up the headers for the OpenAI API requests
$headers = [
    'Content-Type: application/json',
    'Authorization: Bearer ' . $api_key,
    'OpenAI-Beta: assistants=v1'
];

if (empty($provided_thread_id)) {
    // 1. Create a thread
    $url = 'https://api.openai.com/v1/threads';
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    curl_setopt($ch, CURLOPT_POSTFIELDS, '{}');
    $response = curl_exec($ch);

    if (curl_errno($ch)) {
        error_log('Curl error: ' . curl_error($ch));
    }
    curl_close($ch); // Close here: the handle only exists in this branch

    // Stop early if no thread ID came back; every later step depends on it
    $thread = $response ? json_decode($response, true) : null;
    if (!isset($thread['id'])) {
        error_log('Thread creation failed: ' . var_export($response, true));
        echo json_encode(['error' => 'Could not create a thread']);
        exit;
    }
    $thread_id = $thread['id'];
} else {
    // Use the provided thread ID
    $thread_id = $provided_thread_id;
}

// 2. Add a message to the thread
$url = "https://api.openai.com/v1/threads/$thread_id/messages";
$message_data = json_encode([
    'role' => 'user',
    'content' => $user_message
]);
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POSTFIELDS, $message_data);
$response = curl_exec($ch);
curl_close($ch);

// 3. Create a run
$url = "https://api.openai.com/v1/threads/$thread_id/runs";
$run_data = json_encode([
    'assistant_id' => 'XXX'
]);

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
curl_setopt($ch, CURLOPT_POSTFIELDS, $run_data);
$response = curl_exec($ch);

if (curl_errno($ch)) {
    error_log('Curl error: ' . curl_error($ch));
}
curl_close($ch);

// Stop early if no run ID came back; the polling URL below depends on it
$run = $response ? json_decode($response, true) : null;
if (!isset($run['id'])) {
    error_log('Run creation failed: ' . var_export($response, true));
    echo json_encode(['error' => 'Could not start a run']);
    exit;
}
$run_id = $run['id'];

// 4. Check the run's status and wait for completion or timeout
$url = "https://api.openai.com/v1/threads/$thread_id/runs/$run_id";
$max_polling_duration = 10; // in seconds
$polling_interval = 1000000; // in microseconds (1 second)
$polling_start_time = time();

while (true) {
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
    $response = curl_exec($ch);

    if (curl_errno($ch)) {
        error_log('Curl error: ' . curl_error($ch));
    }
    curl_close($ch); // Close before any break so the handle is never leaked

    if (!$response) {
        error_log('No response from API');
    } else {
        $run_status = json_decode($response, true);
        if (isset($run_status['status']) && $run_status['status'] === 'completed') {
            break; // Exit the loop if the run is completed
        } else {
            error_log('Run status: ' . $response);
        }
    }

    if (time() - $polling_start_time > $max_polling_duration) {
        echo json_encode(['response' => 'Polling timed out.']);
        exit; // Exit the script if the polling duration is exceeded
    }

    usleep($polling_interval); // Wait before checking the status again
}

// 5. Display the results
$url = "https://api.openai.com/v1/threads/$thread_id/messages";
$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_HTTPHEADER, $headers);
$response = curl_exec($ch);
$messages = json_decode($response, true);
curl_close($ch);

// Output the final response (the messages list is returned newest first by default)
$assistant_response = '';
foreach (($messages['data'] ?? []) as $message) {
    if ($message['role'] === 'assistant') {
        foreach ($message['content'] as $content) {
            if ($content['type'] === 'text') {
                $assistant_response = $content['text']['value'];
                break; // Break the inner loop once the content is found
            }
        }
    }
    if (!empty($assistant_response)) {
        break; // Break the outer loop once the assistant's response is found
    }
}

echo json_encode(['response' => $assistant_response, 'thread_id' => $thread_id]);

?>
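
The polling step (step 4 in the script above) is the part most worth isolating, because the PHP loop only exits early on `completed` and otherwise waits for the timeout. Below is a minimal Python sketch of the same idea that also stops on the run statuses `failed`, `cancelled`, and `expired`. The function name `wait_for_run` and its injectable `get_status`, `sleep`, and `clock` parameters are hypothetical, chosen so the loop can be exercised without calling the API.

```python
import time
from typing import Callable

# Assumption: these statuses mean the run will never reach "completed"
TERMINAL_FAILURES = {"failed", "cancelled", "expired"}


def wait_for_run(
    get_status: Callable[[], str],
    max_wait: float = 10.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll get_status until the run finishes, fails, or the deadline passes."""
    deadline = clock() + max_wait
    while True:
        status = get_status()
        if status == "completed" or status in TERMINAL_FAILURES:
            return status
        if clock() >= deadline:
            return "timed_out"
        sleep(interval)  # wait before checking the status again
```

In a real backend, `get_status` would wrap the GET request to the run URL and return the `status` field of the decoded response.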

Incorporating Preprocessing and Retrieval

  1. Data Preprocessing: Before the Assistant can answer questions about the website, the site's pages go through a preprocessing step. This step cleans and formats the page text so that it is in a suitable form for the Assistant to process.
  2. Retrieval Feature: The OpenAI Assistant’s retrieval feature is leveraged to provide tailored responses. By accessing a predefined knowledge base, the Assistant can deliver responses that are more relevant and personalized to the user’s query.

Pre-processing Webpages (GPT_exporter.py)

import requests
from bs4 import BeautifulSoup
import re

# List of URLs to scrape
urls = [
    'https://neurotechhub.wustl.edu/',
    'https://neurotechhub.wustl.edu/about/',
    'https://neurotechhub.wustl.edu/about-us/culture-handbook/',
    'https://neurotechhub.wustl.edu/our-shop/',
    'https://neurotechhub.wustl.edu/membership/',
    'https://neurotechhub.wustl.edu/contact/'
]

for i, url in enumerate(urls):
    # Send a GET request to the website
    response = requests.get(url, timeout=10)  # time out rather than hang on a slow page

    # Parse the HTML content of the page
    soup = BeautifulSoup(response.content, 'html.parser')

    # Find the text within <div class="page-content">
    page_content = soup.find('div', class_='page-content')
    if page_content:
        text = re.sub(r'\n+', '\n', page_content.get_text())  # Replace multiple newlines with a single newline
    else:
        text = 'No page content found'

    # Save the text to a file, including the URL at the top
    filename = f'neurotechhub_data_{i+1}.txt'
    with open(filename, 'w', encoding='utf-8') as file:
        file.write(f'Page URL: {url}\n\n{text}\n')

    print(f'Data exported to {filename}')
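
The script above only collapses repeated newlines, which can leave indentation and whitespace-only lines in the exported files. A slightly stricter cleanup might look like the sketch below. It is an optional refinement rather than part of the original exporter, and `clean_page_text` is a hypothetical helper name.

```python
import re


def clean_page_text(raw_text: str) -> str:
    """Trim each line, collapse runs of spaces and tabs, and drop empty lines."""
    lines = (
        re.sub(r"[ \t\u00a0]+", " ", line).strip()
        for line in raw_text.splitlines()
    )
    return "\n".join(line for line in lines if line)
```

It could be applied to `page_content.get_text()` before the text is written to the file.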

Developing the Frontend

  1. User Interface Design: The user interface is designed to provide a seamless interaction experience, featuring a chat window for conversation display and an input field for user message entry.
  2. Message Transmission: JavaScript is utilized to capture the user’s input and send it to the backend server. This can be accomplished using the fetch API or other HTTP client libraries.
  3. Response Display: Once the response is received from the backend, it is displayed in the chat window. Ensuring a natural conversation flow and clear differentiation between user messages and chatbot responses is crucial.
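
For testing the endpoint outside the browser, the same request can be sent from a short Python script. The sketch below uses only the standard library; `build_request` and `ask` are hypothetical names, and the endpoint URL is whatever address the backend is hosted at. It also shows how the `thread_id` that the backend returns can be sent back with the next message to continue the same conversation.

```python
import json
import urllib.request
from typing import Optional


def build_request(endpoint: str, message: str,
                  thread_id: Optional[str] = None) -> urllib.request.Request:
    """Build the JSON POST request that the backend endpoint expects."""
    body = {"message": message}
    if thread_id:
        body["thread_id"] = thread_id  # reuse an existing conversation thread
    return urllib.request.Request(
        endpoint,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(endpoint: str, message: str,
        thread_id: Optional[str] = None, timeout: float = 15.0) -> dict:
    """Send one message and return the decoded JSON reply."""
    request = build_request(endpoint, message, thread_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

A follow-up question would pass the `thread_id` from the first reply, for example `ask(url, "And the opening hours?", first_reply["thread_id"])`.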

HubBot Frontend (embedded code)

<style>
    #chatModule {
        width: 100%;
        border: 1px solid #ccc;
        border-radius: 10px;
        padding: 10px;
        box-shadow: 0 2px 4px rgba(0, 0, 0, 0.1);
    }

    #chatbox {
        height: auto;
        overflow-y: auto;
        margin-bottom: 10px;
    }

    #inputBoxContainer {
        display: flex;
    }

    #inputbox {
        flex-grow: 1;
        border-radius: 5px;
        border: 1px solid #ccc;
        padding: 5px;
        margin-right: 5px;
    }

    #askButton {
        width: 25%;
    }

    #askButton:disabled {
        background-color: #ccc;
        cursor: not-allowed;
    }

    .message {
        margin-bottom: 5px;
    }

    .loading {
        border: 4px solid #f3f3f3;
        border-top: 4px solid #3498db;
        border-radius: 50%;
        width: 20px;
        height: 20px;
        animation: spin 2s linear infinite;
    }

    @keyframes spin {
        0% {
            transform: rotate(0deg);
        }

        100% {
            transform: rotate(360deg);
        }
    }
</style>
<div id="chatModule">
    <div id="chatbox"></div>
    <div id="inputBoxContainer">
        <input type="text" id="inputbox" placeholder="What can I help you with...">
        <button id="askButton" onclick="askQuestion()">Ask HubBot</button>
    </div>
</div>
<script>
    // Build a message element with textContent so that user or model text
    // is never parsed as HTML (prevents script injection through the chat)
    function renderMessage(sender, text) {
        var message = document.createElement('div');
        message.className = 'message';
        var label = document.createElement('strong');
        label.textContent = sender + ':';
        message.appendChild(label);
        message.appendChild(document.createTextNode(' ' + text));
        return message;
    }

    function askQuestion() {
        var inputBox = document.getElementById('inputbox');
        var chatbox = document.getElementById('chatbox');
        var askButton = document.getElementById('askButton');

        var userMessage = inputBox.value.trim();
        if (userMessage === '') {
            alert('Please enter a message.');
            return;
        }

        // Clear the chatbox, show the user's message, and lock the input
        chatbox.innerHTML = '';
        chatbox.appendChild(renderMessage('You', userMessage));
        inputBox.value = '';
        inputBox.disabled = true;
        askButton.disabled = true;
        var loadingSpinner = document.createElement('div');
        loadingSpinner.className = 'loading';
        chatbox.appendChild(loadingSpinner);

        fetch('https://labs.gaidi.ca/gpt/index.php', {
            method: 'POST',
            headers: {
                'Content-Type': 'application/json'
            },
            body: JSON.stringify({ message: userMessage })
        })
            .then(response => response.json())
            .then(data => {
                loadingSpinner.remove();
                chatbox.appendChild(renderMessage('Neurotech Hub', data.response || 'Sorry, I couldn\'t understand that.'));
            })
            .catch(error => {
                console.error('Error:', error);
                loadingSpinner.remove();
                chatbox.appendChild(renderMessage('Neurotech Hub', 'Sorry, there was an error processing your request.'));
            })
            .finally(() => {
                inputBox.disabled = false;
                askButton.disabled = false;
                inputBox.focus(); // Focus on the input box for the next message
            });
    }
</script>

Conclusion

Integrating the OpenAI Assistant into a chatbot offers a significant enhancement in providing personalized and context-aware responses. The incorporation of preprocessing and retrieval features further augments the chatbot’s ability to deliver relevant and informed responses, thereby elevating the user’s interaction experience. It’s not perfect, but I only gave it 3 hours (including this blog post).

Additional Considerations:

  • Security and Privacy: Ensuring the secure handling of the API key and user data is paramount.
  • Scalability and Performance: The chatbot should be designed to efficiently handle increased traffic and usage.
  • Continuous Improvement: Regular testing and refinement of the chatbot’s performance and user interface are essential for maintaining an engaging and effective user experience.
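
On the security point, one common approach is to keep the API key out of the source file and read it from the server's environment instead. The Python sketch below illustrates the idea. The variable name `OPENAI_API_KEY` and the helpers `load_api_key` and `auth_headers` are assumptions for illustration, and the header values mirror those used in index.php.

```python
import os
from typing import Mapping


def load_api_key(env_var: str = "OPENAI_API_KEY",
                 environ: Mapping[str, str] = os.environ) -> str:
    """Read the API key from the environment and fail loudly if it is missing."""
    key = environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"{env_var} is not set")
    return key


def auth_headers(api_key: str) -> dict:
    """Headers for Assistants API requests, matching the PHP backend."""
    return {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",
        "OpenAI-Beta": "assistants=v1",
    }
```

The same pattern applies in PHP, where `getenv` can replace the hardcoded key.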

In summary, the utilization of the OpenAI Assistant in a web-based chatbot, coupled with preprocessing and retrieval features, presents a sophisticated approach to creating conversational agents that can provide highly personalized and contextually relevant interactions.

*95% of this code and post were written by AI.
