How are we supposed to build AI apps on Xano if requests to large, slow models (like OpenAI’s o3-pro or o1-pro) take several minutes to return a response, and any request that waits more than 2 minutes starts consuming a lot of memory in Xano?
I'm not even talking about multiple LLM requests, just one that takes maybe 5 or 10 minutes to complete, multiplied by a large number of users doing the same thing at the same time.
Is there a recommended way to handle long-running AI tasks without hitting memory issues?
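For context, this is the kind of async submit-and-poll pattern I'm imagining instead of holding the request open (all names here are hypothetical, just a Python sketch of the idea; in Xano this would presumably map to a background task plus a status/polling endpoint):

```python
import threading
import time
import uuid

# In-memory job store: job_id -> {"status": ..., "result": ...}
# (in a real backend this would be a database table, not a dict)
jobs = {}

def submit_job(slow_fn, *args):
    """Start the slow work in the background and return immediately with a job id."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "pending", "result": None}

    def run():
        jobs[job_id]["result"] = slow_fn(*args)
        jobs[job_id]["status"] = "done"

    threading.Thread(target=run, daemon=True).start()
    return job_id

def poll(job_id):
    """Cheap status check the client can call repeatedly."""
    return jobs[job_id]

# Stand-in for a multi-minute model call (hypothetical)
def slow_llm_call(prompt):
    time.sleep(0.2)
    return f"response to: {prompt}"

jid = submit_job(slow_llm_call, "hello")
while poll(jid)["status"] != "done":
    time.sleep(0.05)  # client polls instead of keeping one request open
print(poll(jid)["result"])  # -> response to: hello
```

The point is that no single HTTP request ever waits minutes for the model: the initial call returns a job id right away, and the client checks back cheaply until the result is ready. Is something like this the recommended approach on Xano, or is there a better built-in mechanism?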