This section documents the endpoints that facilitate AI interactions through providers such as OpenAI, Grok, DeepSeek, and Replicate. These APIs let users initiate AI conversations, store memories, submit batch jobs, retrieve results, and generate ephemeral keys for short-lived operations. All endpoints require authentication and operate on a token-based credit system.
1. Post AI Chat
Endpoint: /v2/ai_chat
Method: POST
Access: Private API
Description: Handles user chat requests to an AI backend (e.g., OpenAI, Grok, DeepSeek, Replicate). Sends the prompt, context, or shared-file information to the selected model and returns AI-generated output (text, image, or PDF).
Request Body:
{"share_info_urls": ["https://..."],// Optional: URLs to shared files for context"share_info_ids":"abc123,def456",// Optional: Comma-separated share info IDs"prompt":"Explain quantum computing.",// Required: User’s input question"model":"gpt-4o",// Required: AI model to use"is_file_processing":false,// Optional: If true, files will be processed"master_prompt":"summarize",// Optional: System-level prompt"messages": [ {"role":"user","content":"Hi!"} // Optional: Message history ],"file_type":"text",// Optional: File type if uploading"is_agent_input":false,// Optional: If true, returns a PDF"web_search":false// Optional: Enables web search context}
Response:
Status: 200 OK
Content-Type: application/json
Returns AI-generated content in text, image, or PDF form depending on input.
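As an illustration, a minimal Go client call could look like the following. The base URL and Bearer auth header are assumptions; this spec defines only the path and body.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
)

func main() {
	// Only the documented required fields are sent here; optional fields
	// (share_info_urls, web_search, etc.) can be added as needed.
	body, _ := json.Marshal(map[string]any{
		"prompt": "Explain quantum computing.",
		"model":  "gpt-4o",
	})
	req, _ := http.NewRequest("POST", "https://api.example.com/v2/ai_chat", bytes.NewReader(body))
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Authorization", "Bearer <YOUR_TOKEN>") // auth scheme is an assumption
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()
	out, _ := io.ReadAll(resp.Body)
	fmt.Println(resp.Status, string(out))
}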
Logic:
Verifies user’s AI token balance
Validates model access based on subscription tier
Applies master_prompt if provided
Loads previous memory/context if model supports it
Downloads files via GoSDK using share_info_urls/ids
Uploads image to S3 if file_type is image and prepares for AI processing
Sends prompt/context to model backend (e.g., OpenAI, Grok)
Deducts token cost based on operation
Returns AI-generated result (text/image/PDF)
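For orientation, here is a compact Go sketch of the flow above. Every helper and the flat cost value are assumptions standing in for internals this spec does not show; memory loading and the exact pricing logic are elided.

package main

import (
	"errors"
	"fmt"
)

type chatReq struct {
	Prompt, Model, MasterPrompt, FileType string
	ShareInfoURLs                         []string
	WebSearch                             bool
}

func handleChat(r chatReq, balance int) (string, error) {
	const cost = 10 // illustrative flat cost; real pricing varies by operation
	if balance < cost {
		return "", errors.New("insufficient AI tokens")
	}
	if !modelAllowed(r.Model) {
		return "", fmt.Errorf("model %q not in subscription tier", r.Model)
	}

	// The master prompt, when present, is applied ahead of the user prompt.
	prompt := r.Prompt
	if r.MasterPrompt != "" {
		prompt = r.MasterPrompt + "\n\n" + prompt
	}

	// Shared files are fetched (via GoSDK in the real service); images
	// are staged on S3 before being handed to the model backend.
	files := downloadFiles(r.ShareInfoURLs)
	if r.FileType == "image" {
		stageOnS3(files)
	}

	out := callBackend(r.Model, prompt, files, r.WebSearch)
	deductTokens(cost)
	return out, nil
}

// Trivial stand-ins so the sketch compiles; none of these are real internals.
func modelAllowed(string) bool        { return true }
func downloadFiles([]string) []string { return nil }
func stageOnS3([]string)              {}
func callBackend(model, prompt string, _ []string, _ bool) string {
	return "stub response from " + model
}
func deductTokens(int) {}

func main() {
	out, err := handleChat(chatReq{Prompt: "Explain quantum computing.", Model: "gpt-4o"}, 50)
	fmt.Println(out, err)
}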
2. Create Memory for AI
Endpoint: /v2/create_memory
Method: POST
Access: Private API
Description: Stores a memory/context string to personalize future AI interactions. Each operation deducts tokens.
Request Body:
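The body is not reproduced in this spec; a plausible minimal shape, with field names assumed, is:

{
  "memory": "User prefers concise, bulleted answers.",  // Assumed: memory/context string to store
  "model": "gpt-4o"                                     // Assumed: model the memory applies to
}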
Response:
Status: 200 OK
Content-Type: application/json
Logic:
Verifies AI token balance
If sufficient:
Stores memory via mem0Client.CreateMemoryWithContent
Deducts token cost
Returns success message
If insufficient:
Returns error message indicating low balance
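A compact sketch of this branch in Go; only the method name CreateMemoryWithContent comes from this spec, while the wrapper type, its signature, and the cost constant are assumptions.

package main

import (
	"errors"
	"fmt"
)

// mem0 stands in for the real mem0Client; only the method name is from the spec.
type mem0 struct{}

func (mem0) CreateMemoryWithContent(userID, content string) error { return nil }

var mem0Client mem0

func createMemory(userID, content string, balance int) (int, error) {
	const cost = 5 // illustrative; the actual per-operation cost is not specified
	if balance < cost {
		return balance, errors.New("insufficient AI token balance")
	}
	if err := mem0Client.CreateMemoryWithContent(userID, content); err != nil {
		return balance, err
	}
	return balance - cost, nil // token cost is deducted on success
}

func main() {
	left, err := createMemory("user_1", "Prefers concise answers.", 20)
	fmt.Println(left, err)
}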
3. Post OpenAI Batch Request
Endpoint: /v2/openai_batch_request
Method: POST
Access: Private API
Description: Queues a batch AI request for asynchronous processing by OpenAI, suitable for large-scale prompt execution or document-based answering.
Request Body:
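The body is not reproduced in this spec; a plausible minimal shape, with field names assumed, is:

{
  "prompts": ["Summarize document A.", "Summarize document B."],  // Assumed: prompts to execute in one batch
  "model": "gpt-4o"                                               // Assumed: model, validated against the plan
}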
Response:
Status: 200 OK
Content-Type: application/json
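Per the logic below, the response carries the batch ID and status. A representative shape, with field names assumed:

{
  "batch_id": "batch_abc123",  // Pass this to /v2/openai_batch_response later
  "status": "queued"
}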
Logic:
Verifies user’s AI token balance
Validates selected model against plan restrictions
Stores a batch request record in OpenAiBatchRepository with metadata
Returns batch ID and status to user
Worker Behavior:
A background worker named openai_batch_request picks up new requests
Submits the prompts to OpenAI’s batch API
Monitors for completion
Stores result and cost in the database
Deducts tokens accordingly
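A minimal Go sketch of what such a worker plausibly does against OpenAI's public Batch API. File upload, repository access, result storage, and token accounting are elided; nothing here is the service's actual code.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
	"time"
)

func createBatch(inputFileID string) (string, error) {
	payload, _ := json.Marshal(map[string]string{
		"input_file_id":     inputFileID, // .jsonl of prompts, uploaded earlier with purpose=batch
		"endpoint":          "/v1/chat/completions",
		"completion_window": "24h",
	})
	req, _ := http.NewRequest("POST", "https://api.openai.com/v1/batches", bytes.NewReader(payload))
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out struct {
		ID string `json:"id"`
	}
	err = json.NewDecoder(resp.Body).Decode(&out)
	return out.ID, err
}

func waitForCompletion(batchID string) {
	for {
		req, _ := http.NewRequest("GET", "https://api.openai.com/v1/batches/"+batchID, nil)
		req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return
		}
		var st struct {
			Status string `json:"status"`
		}
		json.NewDecoder(resp.Body).Decode(&st)
		resp.Body.Close()
		if st.Status == "completed" || st.Status == "failed" {
			fmt.Println("batch finished with status:", st.Status)
			return // the real worker would store the result/cost and deduct tokens here
		}
		time.Sleep(30 * time.Second)
	}
}

func main() {
	id, err := createBatch("file-abc123") // hypothetical input file ID
	if err != nil {
		panic(err)
	}
	waitForCompletion(id)
}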
4. Get OpenAI Batch Response
Endpoint: /v2/openai_batch_response
Method: GET
Access: Private API
Query Parameter:
batch_id: ID of the previously submitted batch job (required)
Description:
Fetches the results of a previously submitted batch AI job from OpenAI, streaming a .jsonl file.
Response is a streamed file containing one JSON response per line.
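Each line follows OpenAI's batch output format; a trimmed representative line:

{"id": "batch_req_abc", "custom_id": "request-1", "response": {"status_code": 200, "body": {"choices": [{"message": {"role": "assistant", "content": "..."}}]}}, "error": null}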
Logic:
Uses OpenAI API client to query batch status
Checks if job is completed
If not completed, returns an error
If completed, finds and downloads the result file
Streams the .jsonl output back to the user
5. Get Replicate Prediction Response
Endpoint: /v2/prediction_response
Method: GET
Access: Private API
Query Parameter:
prediction_id: ID of the prediction job on Replicate (required)
Description: Retrieves the output of a previously submitted AI model job from Replicate. The result is returned as JSON or a PDF depending on the model type and output.
Response (Example - Text JSON):
Status: 200 OK
Content-Type: application/json
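The body is not shown in this spec; a representative succeeded result, with all field names assumed, might be:

{
  "prediction_id": "pred_abc123",
  "status": "succeeded",
  "output": "..."
}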
Logic:
Polls Replicate API using PollPredictionResult(prediction_id)
Checks current job status:
If succeeded:
Returns raw result for most models
If model is deepseek and file type is not chat, generates and streams a PDF
If failed: returns an error with failure info
If running: returns a status update
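A small Go sketch of that branching; the prediction shape is an assumption, since PollPredictionResult's real signature is not shown.

package main

import "fmt"

// prediction mirrors only the fields the branching below needs.
type prediction struct {
	Status, Output, Error, Model, FileType string
}

func respond(p prediction) (contentType, body string) {
	switch p.Status {
	case "succeeded":
		if p.Model == "deepseek" && p.FileType != "chat" {
			return "application/pdf", renderPDF(p.Output) // generated and streamed as a PDF
		}
		return "application/json", p.Output // raw result for most models
	case "failed":
		return "application/json", fmt.Sprintf(`{"error": %q}`, p.Error)
	default:
		return "application/json", fmt.Sprintf(`{"status": %q}`, p.Status) // still running
	}
}

func renderPDF(s string) string { return s } // stand-in for PDF generation

func main() {
	ct, body := respond(prediction{Status: "succeeded", Model: "gpt-4o", Output: `{"text": "hi"}`})
	fmt.Println(ct, body)
}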
6. Generate Ephemeral Key
Endpoint: /v2/generate_ephemeral_key
Method: GET
Access: Private API
Description:
Generates an ephemeral key for secure, short-lived AI operations (e.g., audio chat using OpenAI). The key grants limited access and expires after a short period, improving security.
How it works:
Authenticates the user using headers.
Checks the user’s AI subscription token balance.
If tokens are sufficient, a fixed amount is deducted.
If not, an error is returned.
Calls OpenAI’s API to generate the ephemeral key.
Returns the generated ephemeral key to the client.
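For reference, a Go sketch of the upstream call, assuming the service uses OpenAI's Realtime sessions endpoint (this spec says only "audio chat using OpenAI"); authentication checks and token accounting are elided.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// mintEphemeralKey requests a short-lived client secret from OpenAI.
// Whether the service uses exactly this endpoint and model is an assumption.
func mintEphemeralKey() (string, error) {
	payload, _ := json.Marshal(map[string]string{
		"model": "gpt-4o-realtime-preview", // assumed model name
	})
	req, err := http.NewRequest("POST", "https://api.openai.com/v1/realtime/sessions", bytes.NewReader(payload))
	if err != nil {
		return "", err
	}
	req.Header.Set("Authorization", "Bearer "+os.Getenv("OPENAI_API_KEY"))
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	var out struct {
		ClientSecret struct {
			Value string `json:"value"`
		} `json:"client_secret"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.ClientSecret.Value, nil // expires after a short period
}

func main() {
	key, err := mintEphemeralKey()
	fmt.Println(key, err)
}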