When I asked ChatGPT 4.0 (Plus account) why it takes so long to print responses to the screen, it blamed my system. I don't know whether the processing speed is actually slow, but when it provides an answer it prints text to the screen so slowly that it feels like I am watching the opening up-scrolling credits of a Star Wars movie. I am just using the web client, and it is as slow as a tortoise. I fed it my system specs and internet speeds and asked it the following:

> Given my hardware and software config and my internet speeds, what percentile of all users do you estimate my setup is in, in terms of how fast I should be able to receive text from you and display it on the screen?

It answered:

> It is difficult to provide an exact percentile ranking for your setup because there are many factors that contribute to the overall performance of a system, including hardware, software, and network configurations. Additionally, the user base varies widely in terms of device specifications and internet speeds.

---

We have also run tests at all intervals of the 24-hour clock, and it can be much worse during higher-usage times. So my question is: what is OpenAI doing about back-end support and unacceptable response times? Telling all our customers that OpenAI is not ready for prime time is not a solution. We are looking at other LLMs at this point, and at least three-provider redundancy, as this is not going to sink our ship.

Edit: Was able to run some classification calls to the API at minimum size; 3 s was the fastest response time from the server. We were also able to run 20 threads with a 500 ms overlap and not time out. These are small calls to classify command text, so you won't be able to have a running chat at this speed, especially if you want to maintain thread context. This doesn't help with the 8 s responses, but you can make 20 calls in 8 seconds for larger background tasks like research or document summarization.

Edit: Also tried the beta chat endpoint, and there was no difference in response times.
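The "20 threads with a 500 ms overlap" test described above can be sketched as a staggered worker pool that times each call. This is a minimal illustration, not the poster's actual harness; `run_staggered` and the stub call are assumed names, and the real API request would replace the stub:

```python
import threading
import time

def run_staggered(call_fn, n_calls=20, stagger_s=0.5):
    """Launch n_calls worker threads, starting one every stagger_s seconds,
    and record each call's latency in seconds."""
    latencies = [None] * n_calls
    threads = []

    def worker(i):
        t0 = time.perf_counter()
        call_fn(i)                     # the real API request would go here
        latencies[i] = time.perf_counter() - t0

    for i in range(n_calls):
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        threads.append(t)
        time.sleep(stagger_s)          # the "500 ms overlap" between launches
    for t in threads:
        t.join()
    return latencies

# Example with a stub that pretends each call takes ~50 ms:
lats = run_staggered(lambda i: time.sleep(0.05), n_calls=4, stagger_s=0.01)
```

With a 3 s per-call floor, this staggering is what makes 20 calls in roughly 8 seconds possible: total wall time approaches the stagger window plus one call's latency, rather than 20 sequential latencies.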
If OpenAI drops the ball, we will take our more than $36 million in total funding and $500 billion TAM somewhere else. We need a guaranteed level of service and response to outages, as Microsoft and all the hyperscalers provide. So I need a definitive answer to give my team and company; otherwise the VCs will be lining up for their $12 million back.

Can you say faked, cached responses in the demo and pray they stay within cache scope, lol. Looking at demos with Microsoft, Amazon, Google, and IBM all lined up, and unacceptable response times. Fastest internet, located in North America. All that being said, with maximum optimization of the client and transport code, using the fastest /completions endpoint, the fast 3.5-turbo model, and minimal prompts, I cannot get a 500-token prompt-and-completion package back faster than 8250 ms, or 8.2 seconds.

I can't even inspect that, as it is shrouded in Cloudflare security. It could be caching, on the client but mostly on the server, plus Ajax streaming, that skews the results toward what appear to be significantly longer API response times and a lightning-fast browser response in comparison.

---

Using a paid Plus account and a paid API account, the response times are much slower on the API. I was just running a significant amount of automated testing of response times for various endpoints and models.

---

If PLUS doesn't help the slow API calls, would the Azure-hosted ChatGPT version help instead? It would be a pain to change a bunch of code to use Azure instead, but again, the OpenAI API endpoint is kind of unusable as things stand. If PLUS improves API response time, I'm more than happy to pay for it; just give us the proper documentation/information that we need.
I find myself in the same boat as other users here: I have a service/app nearing readiness for launch, but I have noticed a huge slowdown in the API (using the 3.5-turbo model), to the point where it's totally unusable and I'm unwilling even to risk a live demo of my system at this point. User patrickkzhao says that he has noticed a significant speed-up of API responses when subscribed to PLUS (Chat GPT's API is significantly slower than the website with GPT Plus - #12 by patrickkzhao); however, user ruby_coder said that the billing systems are unrelated and PLUS would not improve API responses (Chat GPT's API is significantly slower than the website with GPT Plus - #8 by ruby_coder). Can we please get an official OpenAI statement/clarification on this?