AI APIs

This section covers the endpoints that facilitate AI interactions through providers such as OpenAI, Grok, DeepSeek, and Replicate. These APIs let users initiate AI conversations, store memories, submit batch jobs, retrieve results, and generate ephemeral keys for short-lived operations. All endpoints require authentication and operate on a token-based credit system.

Post AI Chat

Path: /v2/ai_chat

Method: POST

Access: Private API

Description: Handles user chat requests to an AI backend (e.g., OpenAI, Grok, DeepSeek, Replicate). Accepts a prompt, context, or shared-file information and returns AI-generated output (text, image, or PDF).

Request Body:

{
  "share_info_urls": ["https://..."],         // Optional: URLs to shared files for context
  "share_info_ids": "abc123,def456",          // Optional: Comma-separated share info IDs
  "prompt": "Explain quantum computing.",      // Required: User’s input question
  "model": "gpt-4o",                           // Required: AI model to use
  "is_file_processing": false,                // Optional: If true, files will be processed
  "master_prompt": "summarize",               // Optional: System-level prompt
  "messages": [                               
    {"role": "user", "content": "Hi!"}         // Optional: Message history
  ],
  "file_type": "text",                         // Optional: File type if uploading
  "is_agent_input": false,                     // Optional: If true, returns a PDF
  "web_search": false                          // Optional: Enables web search context
}

Response:

Status: 200 OK

Content-Type: application/json

Returns AI-generated content in text, image, or PDF form depending on input.

Logic:

  • Verifies user’s AI token balance

  • Validates model access based on subscription tier

  • Applies master_prompt if provided

  • Loads previous memory/context if model supports it

  • Downloads files via GoSDK using share_info_urls/ids

  • Uploads image to S3 if file_type is image and prepares for AI processing

  • Sends prompt/context to model backend (e.g., OpenAI, Grok)

  • Deducts token cost based on operation

  • Returns AI-generated result (text/image/PDF)
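The request body above can be assembled client-side before posting. A minimal Python sketch, where the helper name and the drop-optional-fields-when-unset behavior are illustrative choices, not part of the API:

```python
import json

def build_ai_chat_payload(prompt, model, *, share_info_ids=None,
                          messages=None, web_search=False):
    """Assemble a /v2/ai_chat request body, leaving optional fields out
    when unset so the server applies its defaults. Hypothetical helper."""
    payload = {"prompt": prompt, "model": model}
    if share_info_ids:
        # The API expects share_info_ids as a single comma-separated string
        payload["share_info_ids"] = ",".join(share_info_ids)
    if messages:
        payload["messages"] = messages
    if web_search:
        payload["web_search"] = True
    return payload

body = build_ai_chat_payload(
    "Explain quantum computing.", "gpt-4o",
    share_info_ids=["abc123", "def456"],
    messages=[{"role": "user", "content": "Hi!"}],
)
print(json.dumps(body, indent=2))
```

Omitting unset optional fields keeps the payload small and lets the server's defaults (e.g., `web_search: false`) apply.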

Create Memory for AI

Path: /v2/create_memory

Method: POST

Access: Private API

Description: Stores a memory/context string to personalize future AI interactions. Each operation deducts tokens.

Request Body:

{
  "prompt": "Store this context for future reference."   // Required
}

Response:

Status: 200 OK

Content-Type: application/json

{
  "message": "Memory stored successfully."
}

Logic:

  • Verifies AI token balance

  • If sufficient:

    • Stores memory via mem0Client.CreateMemoryWithContent

    • Deducts token cost

    • Returns success message

  • If insufficient:

    • Returns error message indicating low balance
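The check-then-store flow above can be sketched as follows. `TOKEN_COST` and the in-memory `store` are hypothetical stand-ins for the server-defined cost and `mem0Client.CreateMemoryWithContent`:

```python
TOKEN_COST = 5  # hypothetical per-memory token cost; the real cost is server-defined

def create_memory(prompt, balance, store):
    """Mirror the endpoint's logic: verify balance, store the memory,
    then deduct tokens. Returns the new balance and a response body."""
    if balance < TOKEN_COST:
        return balance, {"error": "Insufficient AI token balance."}
    store.append(prompt)  # stands in for mem0Client.CreateMemoryWithContent
    return balance - TOKEN_COST, {"message": "Memory stored successfully."}

memories = []
balance, resp = create_memory("Store this context for future reference.", 20, memories)
_, err = create_memory("another memory", 2, memories)
```

Note that on an insufficient balance nothing is stored and the balance is returned unchanged.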

Post OpenAI Batch Request

Path: /v2/openai_batch_request

Method: POST

Access: Private API

Description: Queues a batch AI request for asynchronous processing by OpenAI, suitable for large-scale prompt execution or document-based answering.

Request Body:

{
  "share_info_urls": ["https://..."],         // Optional
  "share_info_ids": "abc123,def456",          // Optional
  "prompt": "Summarize the following document.", // Required
  "model": "gpt-4o-mini",                     // Required
  "is_file_processing": true,                 // Optional
  "master_prompt": "summarize",               // Optional
  "messages": [],                             // Optional
  "file_type": "text",                        // Optional
  "is_agent_input": false,                    // Optional
  "web_search": false                         // Optional
}

Response:

Status: 200 OK

Content-Type: application/json

{
  "batch_id": "batch_abc123xyz",
  "status": "queued"
}

Logic:

  • Verifies user’s AI token balance

  • Validates selected model against plan restrictions

  • Stores a batch request record in OpenAiBatchRepository with metadata

  • Returns batch ID and status to user

Worker Behavior:

  • A background worker named openai_batch_request picks up new requests

  • Submits the prompts to OpenAI’s batch API

  • Monitors for completion

  • Stores result and cost in the database

  • Deducts tokens accordingly
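The worker lifecycle above might look roughly like this. `submit` and `poll` are hypothetical stand-ins for the OpenAI batch API calls, and the real worker also records cost and deducts tokens:

```python
def process_batch(record, submit, poll):
    """Rough sketch of the openai_batch_request worker loop: submit the
    stored prompts, wait for a terminal state, then record the outcome."""
    batch_id = submit(record["prompts"])  # hand the prompts to the batch API
    status = poll(batch_id)
    while status["state"] not in ("completed", "failed"):
        status = poll(batch_id)  # the real worker would sleep between polls
    record["batch_id"] = batch_id
    record["status"] = status["state"]
    record["result"] = status.get("result")
    return record

# Fake submit/poll functions to demonstrate the flow without network access.
def fake_submit(prompts):
    return "batch_abc123xyz"

_states = iter(["queued", "in_progress", "completed"])

def fake_poll(batch_id):
    state = next(_states)
    return {"state": state, "result": "done" if state == "completed" else None}

rec = process_batch({"prompts": ["Summarize the following document."]},
                    fake_submit, fake_poll)
```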

Get OpenAI Batch Response

Path: /v2/openai_batch_response

Method: GET

Access: Private API

Query Parameter:

  • batch_id: ID of the previously submitted batch job (required)

Description: Fetches the results of a previously submitted batch AI job from OpenAI, streaming a .jsonl file.

Response:

Status: 200 OK

Content-Disposition: attachment; filename="batch_results.jsonl"

Content-Type: application/jsonl

Response is a streamed file containing one JSON response per line.

Logic:

  • Uses OpenAI API client to query batch status

  • Checks if job is completed

    • If not completed, returns an error

  • If completed, finds and downloads the result file

  • Streams the .jsonl output back to the user
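Client-side, the streamed `.jsonl` body decodes as one JSON object per line. A minimal sketch; the sample field names below are illustrative:

```python
import json

def parse_jsonl(body):
    """Decode a batch_results.jsonl body: one JSON object per line,
    skipping blank lines."""
    return [json.loads(line) for line in body.splitlines() if line.strip()]

raw = (
    '{"custom_id": "req-1", "response": {"status_code": 200}}\n'
    '{"custom_id": "req-2", "response": {"status_code": 200}}\n'
)
results = parse_jsonl(raw)
```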

Get Replicate Prediction Response

Path: /v2/prediction_response

Method: GET

Access: Private API

Query Parameter:

  • prediction_id: ID of the prediction job on Replicate (required)

Description: Retrieves the output of a previously submitted AI model job from Replicate. The response is JSON or a PDF, depending on the model type and output.

Response (Example - Text JSON):

Status: 200 OK

Content-Type: application/json

{
  "model": "deepseek",
  "status": "succeeded",
  "output": "Here is the predicted response."
}

Logic:

  • Polls Replicate API using PollPredictionResult(prediction_id)

  • Checks current job status:

    • If succeeded:

      • Returns raw result for most models

      • If model is deepseek and file type is not chat, generates and streams a PDF

    • If failed: returns an error with failure info

    • If running: returns a status update
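The status branching above can be sketched as a dispatch over the polled result. Only `status` and `output` mirror fields shown in this document; `model`, `file_type`, and `error` are assumed names for illustration:

```python
def handle_prediction(result):
    """Map a polled Replicate result onto the endpoint's response branches."""
    status = result["status"]
    if status == "succeeded":
        if result.get("model") == "deepseek" and result.get("file_type") != "chat":
            return {"type": "pdf"}  # server generates and streams a PDF
        return {"type": "json", "output": result["output"]}
    if status == "failed":
        return {"type": "error", "detail": result.get("error")}
    return {"type": "status", "status": status}  # job still running

ok = handle_prediction({"status": "succeeded", "model": "deepseek",
                        "file_type": "chat",
                        "output": "Here is the predicted response."})
pdf = handle_prediction({"status": "succeeded", "model": "deepseek",
                         "file_type": "document"})
running = handle_prediction({"status": "processing"})
```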

Generate Ephemeral Key

Path: /v2/generate_ephemeral_key

Method: GET

Access: Private API

Description: Generates a temporary ephemeral key for secure, short-lived AI operations (e.g., audio chat using OpenAI). This key provides limited access and expires after a short period, improving security.

How it works:

  1. Authenticates the user using headers.

  2. Checks the user’s AI subscription token balance.

    • If tokens are sufficient, a fixed amount is deducted.

    • If not, an error is returned.

  3. Calls OpenAI’s API to generate the ephemeral key.

  4. Returns the generated ephemeral key to the client.

Response:

Status: 200 OK

Content-Type: application/json

{
  "ephemeral_key": "sk-ephemeral-xyz123456789",
  "expires_in": 3600,
  "created_at": "2025-06-24T12:34:56Z"
}

Error Responses:

  • Status: 401 Unauthorized Returned if authentication fails or token is missing.

  • Status: 402 Payment Required Returned if the user’s AI subscription token balance is insufficient.

    {
      "error": "Insufficient AI tokens to generate ephemeral key"
    }
  • Status: 500 Internal Server Error If OpenAI API fails or a server error occurs.

    {
      "error": "Failed to generate ephemeral key. Please try again later."
    }
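A client can use the `expires_in` and `created_at` fields to decide when to request a fresh key. A minimal sketch; the helper is illustrative, not part of the API:

```python
from datetime import datetime, timedelta, timezone

def key_is_valid(created_at, expires_in, now=None):
    """Return True while an ephemeral key is within its lifetime."""
    now = now or datetime.now(timezone.utc)
    # fromisoformat does not accept a trailing "Z", so normalize it first
    issued = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return now < issued + timedelta(seconds=expires_in)

resp = {
    "ephemeral_key": "sk-ephemeral-xyz123456789",
    "expires_in": 3600,
    "created_at": "2025-06-24T12:34:56Z",
}
issued = datetime.fromisoformat(resp["created_at"].replace("Z", "+00:00"))
still_valid = key_is_valid(resp["created_at"], resp["expires_in"],
                           now=issued + timedelta(seconds=10))
expired = key_is_valid(resp["created_at"], resp["expires_in"],
                       now=issued + timedelta(hours=2))
```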
