AI APIs
This section covers the endpoints that enable AI interactions with models such as OpenAI, Grok, DeepSeek, and Replicate. These APIs let users initiate AI conversations, store memories, submit batch jobs, retrieve results, and generate ephemeral keys for short-lived operations. All endpoints require authentication and operate on a token-based credit system.
1. Post AI Chat
Endpoint: /v2/ai_chat
Method: POST
Access: Private API
Description: Handles user chat requests to the AI agent (e.g., OpenAI, Grok, DeepSeek, Replicate). Sends prompt, context, or shared file information and returns AI-generated output (text, image, or PDF).
Request Body:
{
"share_info_urls": ["https://..."], // Optional: URLs to shared files for context
"share_info_ids": "abc123,def456", // Optional: Comma-separated share info IDs
"prompt": "Explain quantum computing.", // Required: User’s input question
"model": "gpt-4o", // Required: AI model to use
"is_file_processing": false, // Optional: If true, files will be processed
"master_prompt": "summarize", // Optional: System-level prompt
"messages": [
{"role": "user", "content": "Hi!"} // Optional: Message history
],
"file_type": "text", // Optional: File type if uploading
"is_agent_input": false, // Optional: If true, returns a PDF
"web_search": false // Optional: Enables web search context
}
Response:
Status: 200 OK
Content-Type: application/json
Returns AI-generated content in text, image, or PDF form depending on input.
Logic:
Verifies user’s AI token balance
Validates model access based on subscription tier
Applies master_prompt if provided
Loads previous memory/context if the model supports it
Downloads files via GoSDK using share_info_urls/share_info_ids
Uploads the image to S3 if file_type is image and prepares it for AI processing
Sends the prompt/context to the model backend (e.g., OpenAI, Grok)
Deducts token cost based on operation
Returns AI-generated result (text/image/PDF)
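The request body above has only two required fields (prompt and model); everything else is optional. A minimal sketch of client-side payload construction, using the documented field names (the helper function name and validation logic are assumptions, not part of the API):

```python
import json

# Hypothetical helper: builds a /v2/ai_chat request body from the field
# names documented above. Only "prompt" and "model" are required.
def build_ai_chat_request(prompt, model, **optional):
    if not prompt or not model:
        raise ValueError("'prompt' and 'model' are required")
    allowed = {"share_info_urls", "share_info_ids", "is_file_processing",
               "master_prompt", "messages", "file_type",
               "is_agent_input", "web_search"}
    unknown = set(optional) - allowed
    if unknown:
        raise ValueError(f"unknown fields: {sorted(unknown)}")
    body = {"prompt": prompt, "model": model}
    body.update(optional)
    return json.dumps(body)

payload = build_ai_chat_request("Explain quantum computing.", "gpt-4o",
                                web_search=False)
```

The resulting JSON string can then be sent as the POST body with the usual authentication headers.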
2. Create Memory for AI
Endpoint: /v2/create_memory
Method: POST
Access: Private API
Description: Stores a memory/context string to personalize future AI interactions. Each operation deducts tokens.
Request Body:
{
"prompt": "Store this context for future reference." // Required
}
Response:
Status: 200 OK
Content-Type: application/json
{
"message": "Memory stored successfully."
}
Logic:
Verifies AI token balance
If sufficient:
Stores the memory via mem0Client.CreateMemoryWithContent
Deducts token cost
Returns success message
If insufficient:
Returns error message indicating low balance
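The balance-check-then-deduct flow above can be sketched as follows. The flat token cost and the storage callback are assumptions for illustration; mem0Client is named in the docs, but its interface here is stubbed out:

```python
# Assumed flat cost per memory write; the real cost is not documented.
MEMORY_TOKEN_COST = 5

def create_memory(balance, prompt, store):
    """Sketch of the /v2/create_memory logic: check balance, store, deduct."""
    if balance < MEMORY_TOKEN_COST:
        return balance, {"error": "Insufficient AI tokens"}
    store(prompt)  # stands in for mem0Client.CreateMemoryWithContent
    return balance - MEMORY_TOKEN_COST, {"message": "Memory stored successfully."}

stored = []
new_balance, resp = create_memory(20, "Store this context for future reference.",
                                  stored.append)
```

Note that the deduction happens only after the store succeeds, so a failed balance check leaves the user's tokens untouched.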
3. Post OpenAI Batch Request
Endpoint: /v2/openai_batch_request
Method: POST
Access: Private API
Description: Queues a batch AI request for asynchronous processing by OpenAI, suitable for large-scale prompt execution or document-based answering.
Request Body:
{
"share_info_urls": ["https://..."], // Optional
"share_info_ids": "abc123,def456", // Optional
"prompt": "Summarize the following document.", // Required
"model": "gpt-4o-mini", // Required
"is_file_processing": true, // Optional
"master_prompt": "summarize", // Optional
"messages": [], // Optional
"file_type": "text", // Optional
"is_agent_input": false, // Optional
"web_search": false // Optional
}
Response:
Status: 200 OK
Content-Type: application/json
{
"batch_id": "batch_abc123xyz",
"status": "queued"
}
Logic:
Verifies user’s AI token balance
Validates selected model against plan restrictions
Stores a batch request record with metadata in OpenAiBatchRepository
Returns the batch ID and status to the user
Worker Behavior:
A background worker named openai_batch_request picks up new requests
Submits the prompts to OpenAI's batch API
Monitors for completion
Stores result and cost in the database
Deducts tokens accordingly
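The queue-then-process split described above can be sketched as below: the request handler only records the job and returns immediately, while a separate worker pass submits it and stores the result. The class and field names are illustrative, not the service's actual internals:

```python
import uuid

class BatchQueue:
    """Toy model of the enqueue/worker flow for /v2/openai_batch_request."""

    def __init__(self):
        self.jobs = {}

    def enqueue(self, prompt, model):
        # Handler path: record the request and return a queued status at once.
        batch_id = f"batch_{uuid.uuid4().hex[:12]}"
        self.jobs[batch_id] = {"prompt": prompt, "model": model,
                               "status": "queued", "result": None}
        return {"batch_id": batch_id, "status": "queued"}

    def worker_tick(self, run_model):
        # Worker path: pick up queued jobs, run them, record completion.
        for job in self.jobs.values():
            if job["status"] == "queued":
                job["result"] = run_model(job["prompt"])
                job["status"] = "completed"

q = BatchQueue()
resp = q.enqueue("Summarize the following document.", "gpt-4o-mini")
q.worker_tick(lambda p: f"summary of: {p}")
```

The real worker additionally monitors OpenAI's batch API for completion and deducts tokens when results arrive; this sketch collapses that into a single synchronous tick.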
4. Get OpenAI Batch Response
Endpoint: /v2/openai_batch_response
Method: GET
Access: Private API
Query Parameter:
batch_id: ID of the previously submitted batch job (required)
Description: Fetches the results of a previously submitted batch AI job from OpenAI, streaming a .jsonl file.
Response:
Status: 200 OK
Content-Disposition: attachment; filename="batch_results.jsonl"
Content-Type: application/jsonl
Response is a streamed file containing one JSON response per line.
Logic:
Uses OpenAI API client to query batch status
Checks if job is completed
If not completed, returns an error
If completed, finds and downloads the result file
Streams the .jsonl output back to the user
5. Get Replicate Prediction Response
Endpoint: /v2/prediction_response
Method: GET
Access: Private API
Query Parameter:
prediction_id: ID of the prediction job on Replicate (required)
Description: Retrieves the output of a previously submitted AI model job from Replicate. Returns as JSON or PDF depending on model type and output.
Response (Example - Text JSON):
Status: 200 OK
Content-Type: application/json
{
"model": "deepseek",
"status": "succeeded",
"output": "Here is the predicted response."
}
Logic:
Polls the Replicate API using PollPredictionResult(prediction_id)
Checks the current job status:
If succeeded: returns the raw result for most models
If the model is deepseek and the file type is not chat, generates and streams a PDF
If failed: returns an error with failure info
If running: returns a status update
6. Generate Ephemeral Key
Endpoint: /v2/generate_ephemeral_key
Method: GET
Access: Private API
Description: Generates a temporary ephemeral key for secure, short-lived AI operations (e.g., audio chat using OpenAI). This key provides limited access and expires after a short period, improving security.
How it works:
Authenticates the user using headers.
Checks the user’s AI subscription token balance.
If tokens are sufficient, a fixed amount is deducted.
If not, an error is returned.
Calls OpenAI’s API to generate the ephemeral key.
Returns the generated ephemeral key to the client.
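The deduct-then-mint flow above can be sketched as follows. The fixed token cost is an assumption (the docs say "a fixed amount" without naming it), and the key is generated locally here only for illustration; the real service delegates key creation to OpenAI's API:

```python
import time
import secrets

# Assumed fixed deduction; the actual amount is not documented.
EPHEMERAL_KEY_COST = 10

def generate_ephemeral_key(balance, ttl_seconds=3600):
    """Sketch of the /v2/generate_ephemeral_key logic."""
    if balance < EPHEMERAL_KEY_COST:
        return balance, {"error": "Insufficient AI tokens to generate ephemeral key"}
    key = {
        "ephemeral_key": f"sk-ephemeral-{secrets.token_hex(8)}",
        "expires_in": ttl_seconds,
        "created_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
    }
    return balance - EPHEMERAL_KEY_COST, key

bal, key = generate_ephemeral_key(50)
```

Because the key expires after `expires_in` seconds, clients should request a fresh key per session rather than caching one long-term.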
Response:
Status: 200 OK
Content-Type: application/json
{
"ephemeral_key": "sk-ephemeral-xyz123456789",
"expires_in": 3600,
"created_at": "2025-06-24T12:34:56Z"
}
Error Responses:
Status: 401 Unauthorized
Returned if authentication fails or the token is missing.
Status: 402 Payment Required
Returned if the user's AI subscription token balance is insufficient.
{ "error": "Insufficient AI tokens to generate ephemeral key" }
Status: 500 Internal Server Error
Returned if the OpenAI API fails or a server error occurs.
{ "error": "Failed to generate ephemeral key. Please try again later." }