I've been experimenting with this repository to implement semantic search for YouTube videos using OpenAI + Pinecone. However, I keep hitting a 429 (rate limit) error at the following step: "Run the command npx tsx src/bin/process-yt-playlist.ts to pre-process the transcripts and fetch embeddings from OpenAI, then insert them into a Pinecone search index."
Any assistance would be greatly appreciated!
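In case it helps narrow things down, here is a rough sketch of the error logging I plan to add around the embedding call, to see whether the 429 response points at the requests-per-minute or the tokens-per-minute limit. It is untested and assumes the Axios-style error shape of the openai v3 SDK (err.response with status, headers, and body); the helper name createEmbeddingDebug is just something I made up for the sketch.

import { OpenAIApi } from 'openai'

// Hypothetical helper: call createEmbedding once and dump the rate-limit
// details if the request fails with an HTTP error such as 429.
async function createEmbeddingDebug(
  openai: OpenAIApi,
  input: string,
  model: string
): Promise<number[]> {
  try {
    const res = await openai.createEmbedding({ input, model })
    return res.data.data[0].embedding
  } catch (err: any) {
    // The v3 SDK surfaces Axios errors, so the HTTP details live on err.response.
    console.error('status:', err.response?.status)
    console.error('headers:', err.response?.headers)
    console.error('body:', err.response?.data)
    throw err
  }
}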
Here is my openai.ts file:
import pMap from 'p-map'
import unescape from 'unescape'
import * as config from '@/lib/config'
import * as types from './types'
import pMemoize from 'p-memoize'
import pRetry from 'p-retry'
import pThrottle from 'p-throttle'
// TODO: enforce max OPENAI_EMBEDDING_CTX_LENGTH of 8191
// https://platform.openai.com/docs/guides/rate-limits/what-are-the-rate-limits-for-our-api
// TODO: enforce TPM
const throttleRPM = pThrottle({
  // 3k per minute instead of 3.5k per minute to add padding
  limit: 3000,
  interval: 60 * 1000,
  strict: true
})
type PineconeCaptionVectorPending = {
  id: string
  input: string
  metadata: types.PineconeCaptionMetadata
}
export async function getEmbeddingsForVideoTranscript({
  transcript,
  title,
  openai,
  model = config.openaiEmbeddingModel,
  maxInputTokens = 100, // TODO???
  concurrency = 1
}: {
  transcript: types.Transcript
  title: string
  openai: types.OpenAIApi
  model?: string
  maxInputTokens?: number
  concurrency?: number
}) {
  const { videoId } = transcript

  let pendingVectors: PineconeCaptionVectorPending[] = []
  let currentStart = ''
  let currentNumTokensEstimate = 0
  let currentInput = ''
  let currentPartIndex = 0
  let currentVectorIndex = 0
  let isDone = false

  // const createEmbedding = pMemoize(throttleRPM(createEmbeddingImpl))

  // Pre-compute the embedding inputs, making sure none of them are too long
  do {
    isDone = currentPartIndex >= transcript.parts.length

    const part = transcript.parts[currentPartIndex]
    const text = unescape(part?.text)
      .replaceAll('[Music]', '')
      .replaceAll(/[\t\n]/g, ' ')
      .replaceAll('  ', ' ') // collapse double spaces left over from the removals above
      .trim()
    const numTokens = getNumTokensEstimate(text)

    if (!isDone && currentNumTokensEstimate + numTokens < maxInputTokens) {
      if (!currentStart) {
        currentStart = part.start
      }

      currentNumTokensEstimate += numTokens
      currentInput = `${currentInput} ${text}`
      ++currentPartIndex
    } else {
      currentInput = currentInput.trim()
      if (isDone && !currentInput) {
        break
      }

      const currentVector: PineconeCaptionVectorPending = {
        id: `${videoId}:${currentVectorIndex++}`,
        input: currentInput,
        metadata: {
          title,
          videoId,
          text: currentInput,
          start: currentStart
        }
      }

      pendingVectors.push(currentVector)

      // reset current batch
      currentNumTokensEstimate = 0
      currentStart = ''
      currentInput = ''
    }
  } while (!isDone)
  let index = 0
  console.log('Entering embeddings calculation')

  // Evaluate all embeddings with a max concurrency
  // const delay = (ms) => new Promise((resolve) => setTimeout(resolve, ms))
  const vectors: types.PineconeCaptionVector[] = await pMap(
    pendingVectors,
    async (pendingVector) => {
      // await delay(6000) // add a delay of 6 seconds before each iteration
      console.log(pendingVector.input + ' ' + model)

      // const { data: embed } = await openai.createEmbedding({
      //   input: pendingVector.input,
      //   model
      // })
      async function createEmbeddingImpl({
        input = pendingVector.input,
        model = 'text-embedding-ada-002'
      }: {
        input: string
        model?: string
      }): Promise<number[]> {
        const res = await pRetry(
          () =>
            openai.createEmbedding({
              input,
              model
            }),
          {
            retries: 4,
            minTimeout: 1000,
            factor: 2.5
          }
        )

        return res.data.data[0].embedding
      }
      // Note: this builds a fresh memoized wrapper for every pending vector;
      // pMemoize returns the memoized function synchronously, so no await is needed here.
      const embedding = pMemoize(throttleRPM(createEmbeddingImpl))

      const vector: types.PineconeCaptionVector = {
        id: pendingVector.id,
        metadata: pendingVector.metadata,
        values: await embedding(pendingVector)
      }

      console.log(`OpenAI embedding call #${index} completed`)
      index++

      return vector
    },
    {
      concurrency
    }
  )

  return vectors
}
function getNumTokensEstimate(input: string): number {
  const numTokens = (input || '')
    .split(/\s/)
    .map((token) => token.trim())
    .filter(Boolean).length

  return numTokens
}
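One thing I notice looking at this again: the commented-out line near the top (const createEmbedding = pMemoize(throttleRPM(createEmbeddingImpl))) wraps the embedding call once, whereas my current version calls pMemoize(throttleRPM(...)) inside the pMap callback, so a fresh memoized wrapper is built for every single vector. I'm considering moving the wrapper back out, roughly like the sketch below. It is untested; makeCreateEmbedding is a hypothetical helper name, and it assumes the v3 OpenAIApi client.

import pMemoize from 'p-memoize'
import pThrottle from 'p-throttle'
import { OpenAIApi } from 'openai'

const throttleRPM = pThrottle({
  limit: 3000,
  interval: 60 * 1000,
  strict: true
})

// Hypothetical helper: build one throttled + memoized embedding function and
// reuse it for every pending vector, instead of re-wrapping inside the pMap callback.
function makeCreateEmbedding(openai: OpenAIApi, model: string) {
  async function createEmbeddingImpl({ input }: { input: string }): Promise<number[]> {
    const res = await openai.createEmbedding({ input, model })
    return res.data.data[0].embedding
  }

  return pMemoize(throttleRPM(createEmbeddingImpl))
}

// Inside getEmbeddingsForVideoTranscript this would be used as:
//   const createEmbedding = makeCreateEmbedding(openai, model)
//   ...
//   values: await createEmbedding({ input: pendingVector.input })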
I have tried increasing the delay between API calls so that my request rate stays well below the allowed limit, but I still run into the same 429 error.
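I also wonder whether the whitespace-based getNumTokensEstimate is undercounting tokens and tripping the tokens-per-minute limit rather than the request limit (the TODOs about the 8191-token context length and TPM are still open in my file). A more accurate count, as a rough sketch assuming the gpt-3-encoder package, would look something like this:

import { encode } from 'gpt-3-encoder'

// BPE-based token count, closer to what the OpenAI embedding endpoint
// actually measures, instead of splitting on whitespace.
function getNumTokens(input: string): number {
  return encode(input || '').length
}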