AI APIs

This section covers the endpoints that facilitate AI interactions through providers such as OpenAI, Grok, DeepSeek, and Replicate. These APIs let users initiate AI conversations, store memories, submit batch jobs, retrieve results, and generate ephemeral keys for short-lived operations. All endpoints require authentication and operate on a token-based credit system.

Post AI Chat

Path: /v2/ai_chat

Method: POST

Access: Private API

Description: Handles user chat requests to an AI backend (e.g., OpenAI, Grok, DeepSeek, Replicate). Accepts a prompt, context, or shared-file information and returns AI-generated output (text, image, or PDF).

Request Body:

{
  "share_info_urls": ["https://..."],         // Optional: URLs to shared files for context
  "share_info_ids": "abc123,def456",          // Optional: Comma-separated share info IDs
  "prompt": "Explain quantum computing.",      // Required: User’s input question
  "model": "gpt-4o",                           // Required: AI model to use
  "is_file_processing": false,                // Optional: If true, files will be processed
  "master_prompt": "summarize",               // Optional: System-level prompt
  "messages": [                               
    {"role": "user", "content": "Hi!"}         // Optional: Message history
  ],
  "file_type": "text",                         // Optional: File type if uploading
  "is_agent_input": false,                     // Optional: If true, returns a PDF
  "web_search": false                          // Optional: Enables web search context
}

Response:

Status: 200 OK

Content-Type: application/json

Returns AI-generated content in text, image, or PDF form depending on input.

Logic:

  • Verifies user’s AI token balance

  • Validates model access based on subscription tier

  • Applies master_prompt if provided

  • Loads previous memory/context if model supports it

  • Downloads files via GoSDK using share_info_urls/ids

  • Uploads image to S3 if file_type is image and prepares for AI processing

  • Sends prompt/context to model backend (e.g., OpenAI, Grok)

  • Deducts token cost based on operation

  • Returns AI-generated result (text/image/PDF)
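The request body above can be assembled client-side before posting. A minimal Python sketch, where the helper name and the drop-optional-fields-when-unset behavior are illustrative choices, not part of the API:

```python
import json

def build_ai_chat_payload(prompt, model, *, share_info_ids=None,
                          messages=None, web_search=False):
    """Assemble a /v2/ai_chat request body, leaving optional fields out
    when unset so the server applies its defaults. Hypothetical helper."""
    payload = {"prompt": prompt, "model": model}
    if share_info_ids:
        # The API expects share_info_ids as a single comma-separated string
        payload["share_info_ids"] = ",".join(share_info_ids)
    if messages:
        payload["messages"] = messages
    if web_search:
        payload["web_search"] = True
    return payload

body = build_ai_chat_payload(
    "Explain quantum computing.", "gpt-4o",
    share_info_ids=["abc123", "def456"],
    messages=[{"role": "user", "content": "Hi!"}],
)
print(json.dumps(body, indent=2))
```

Omitting unset optional fields keeps the payload small and lets the server's defaults (e.g., `web_search: false`) apply.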

Create Memory for AI

Path: /v2/create_memory

Method: POST

Access: Private API

Description: Stores a memory/context string to personalize future AI interactions. Each operation deducts tokens.

Request Body:

{
  "prompt": "Store this context for future reference."   // Required
}

Response:

Status: 200 OK

Content-Type: application/json

{
  "message": "Memory stored successfully."
}

Logic:

  • Verifies AI token balance

  • If sufficient:

    • Stores memory via mem0Client.CreateMemoryWithContent

    • Deducts token cost

    • Returns success message

  • If insufficient:

    • Returns error message indicating low balance
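The check-then-store flow above can be sketched as follows. `TOKEN_COST` and the in-memory `store` are hypothetical stand-ins for the server-defined cost and `mem0Client.CreateMemoryWithContent`:

```python
TOKEN_COST = 5  # hypothetical per-memory token cost; the real cost is server-defined

def create_memory(prompt, balance, store):
    """Mirror the endpoint's logic: verify balance, store the memory,
    then deduct tokens. Returns the new balance and a response body."""
    if balance < TOKEN_COST:
        return balance, {"error": "Insufficient AI token balance."}
    store.append(prompt)  # stands in for mem0Client.CreateMemoryWithContent
    return balance - TOKEN_COST, {"message": "Memory stored successfully."}

memories = []
balance, resp = create_memory("Store this context for future reference.", 20, memories)
_, err = create_memory("another memory", 2, memories)
```

Note that on an insufficient balance nothing is stored and the balance is returned unchanged.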

Post OpenAI Batch Request

Path: /v2/openai_batch_request

Method: POST

Access: Private API

Description: Queues a batch AI request for asynchronous processing by OpenAI, suitable for large-scale prompt execution or document-based answering.

Request Body:

{
  "share_info_urls": ["https://..."],         // Optional
  "share_info_ids": "abc123,def456",          // Optional
  "prompt": "Summarize the following document.", // Required
  "model": "gpt-4o-mini",                     // Required
  "is_file_processing": true,                 // Optional
  "master_prompt": "summarize",               // Optional
  "messages": [],                             // Optional
  "file_type": "text",                        // Optional
  "is_agent_input": false,                    // Optional
  "web_search": false                         // Optional
}

Response:

Status: 200 OK

Content-Type: application/json

{
  "batch_id": "batch_abc123xyz",
  "status": "queued"
}

Logic:

  • Verifies user’s AI token balance

  • Validates selected model against plan restrictions

  • Stores a batch request record in OpenAiBatchRepository with metadata

  • Returns batch ID and status to user

Worker Behavior:

  • A background worker named openai_batch_request picks up new requests

  • Submits the prompts to OpenAI’s batch API

  • Monitors for completion

  • Stores result and cost in the database

  • Deducts tokens accordingly
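The worker lifecycle above might look roughly like this. `submit` and `poll` are hypothetical stand-ins for the OpenAI batch API calls, and the real worker also records cost and deducts tokens:

```python
def process_batch(record, submit, poll):
    """Rough sketch of the openai_batch_request worker loop: submit the
    stored prompts, wait for a terminal state, then record the outcome."""
    batch_id = submit(record["prompts"])  # hand the prompts to the batch API
    status = poll(batch_id)
    while status["state"] not in ("completed", "failed"):
        status = poll(batch_id)  # the real worker would sleep between polls
    record["batch_id"] = batch_id
    record["status"] = status["state"]
    record["result"] = status.get("result")
    return record

# Fake submit/poll functions to demonstrate the flow without network access.
def fake_submit(prompts):
    return "batch_abc123xyz"

_states = iter(["queued", "in_progress", "completed"])

def fake_poll(batch_id):
    state = next(_states)
    return {"state": state, "result": "done" if state == "completed" else None}

rec = process_batch({"prompts": ["Summarize the following document."]},
                    fake_submit, fake_poll)
```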

Get OpenAI Batch Response

Path: /v2/openai_batch_response

Method: GET

Access: Private API

Query Parameter:

  • batch_id: ID of the previously submitted batch job (required)

Description: Fetches the results of a previously submitted batch AI job from OpenAI, streaming a .jsonl file.

Response:

Status: 200 OK

Content-Disposition: attachment; filename="batch_results.jsonl"

Content-Type: application/jsonl

Response is a streamed file containing one JSON response per line.

Logic:

  • Uses OpenAI API client to query batch status

  • Checks if job is completed

    • If not completed, returns an error

  • If completed, finds and downloads the result file

  • Streams the .jsonl output back to the user
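Client-side, the streamed `.jsonl` body decodes as one JSON object per line. A minimal sketch; the sample field names below are illustrative:

```python
import json

def parse_jsonl(body):
    """Decode a batch_results.jsonl body: one JSON object per line,
    skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]

raw = (
    '{"custom_id": "req-1", "response": {"status_code": 200}}\n'
    '{"custom_id": "req-2", "response": {"status_code": 200}}\n'
)
results = parse_jsonl(raw)
```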

Get Replicate Prediction Response

Path: /v2/prediction_response

Method: GET

Access: Private API

Query Parameter:

  • prediction_id: ID of the prediction job on Replicate (required)

Description: Retrieves the output of a previously submitted AI model job from Replicate. The response is JSON or a PDF, depending on the model type and output.

Response (Example - Text JSON):

Status: 200 OK

Content-Type: application/json

{
  "model": "deepseek",
  "status": "succeeded",
  "output": "Here is the predicted response."
}

Logic:

  • Polls Replicate API using PollPredictionResult(prediction_id)

  • Checks current job status:

    • If succeeded:

      • Returns raw result for most models

      • If model is deepseek and file type is not chat, generates and streams a PDF

    • If failed: returns an error with failure info

    • If running: returns a status update
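The status branching above can be sketched as a dispatch over the polled result. Only `status` and `output` mirror fields shown in this document; `model`, `file_type`, and `error` are assumed names for illustration:

```python
def handle_prediction(result):
    """Map a polled Replicate result onto the endpoint's response branches."""
    status = result["status"]
    if status == "succeeded":
        if result.get("model") == "deepseek" and result.get("file_type") != "chat":
            return {"type": "pdf"}  # server generates and streams a PDF
        return {"type": "json", "output": result["output"]}
    if status == "failed":
        return {"type": "error", "detail": result.get("error")}
    return {"type": "status", "status": status}  # job still running

ok = handle_prediction({"status": "succeeded", "model": "deepseek",
                        "file_type": "chat",
                        "output": "Here is the predicted response."})
pdf = handle_prediction({"status": "succeeded", "model": "deepseek",
                         "file_type": "document"})
running = handle_prediction({"status": "processing"})
```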

Generate Ephemeral Key

Path: /v2/generate_ephemeral_key

Method: GET

Access: Private API

Description: Generates a temporary ephemeral key for secure, short-lived AI operations (e.g., audio chat using OpenAI). This key provides limited access and expires after a short period, improving security.

How it works:

  1. Authenticates the user using headers.

  2. Checks the user’s AI subscription token balance.

    • If tokens are sufficient, a fixed amount is deducted.

    • If not, an error is returned.

  3. Calls OpenAI’s API to generate the ephemeral key.

  4. Returns the generated ephemeral key to the client.

Response:

Status: 200 OK

Content-Type: application/json

{
  "ephemeral_key": "sk-ephemeral-xyz123456789",
  "expires_in": 3600,
  "created_at": "2025-06-24T12:34:56Z"
}

Error Responses:

  • Status: 401 Unauthorized Returned if authentication fails or token is missing.

  • Status: 402 Payment Required Returned if the user’s AI subscription token balance is insufficient.

    {
      "error": "Insufficient AI tokens to generate ephemeral key"
    }
  • Status: 500 Internal Server Error If OpenAI API fails or a server error occurs.

    {
      "error": "Failed to generate ephemeral key. Please try again later."
    }
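A client can use the `expires_in` and `created_at` fields to decide when to request a fresh key. A minimal sketch; the helper is illustrative, not part of the API:

```python
from datetime import datetime, timedelta, timezone

def key_is_valid(created_at, expires_in, now=None):
    """Return True while an ephemeral key is within its lifetime."""
    now = now or datetime.now(timezone.utc)
    # fromisoformat does not accept a trailing "Z", so normalize it first
    issued = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return now < issued + timedelta(seconds=expires_in)

resp = {
    "ephemeral_key": "sk-ephemeral-xyz123456789",
    "expires_in": 3600,
    "created_at": "2025-06-24T12:34:56Z",
}
issued = datetime.fromisoformat(resp["created_at"].replace("Z", "+00:00"))
still_valid = key_is_valid(resp["created_at"], resp["expires_in"],
                           now=issued + timedelta(seconds=10))
expired = key_is_valid(resp["created_at"], resp["expires_in"],
                       now=issued + timedelta(hours=2))
```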
