I am exploring how to architect a new feature that requires processing a large volume of API calls to LLM providers such as OpenAI, Anthropic (Claude), OpenRouter, etc.
I am building an agentic research system where completing a user's task requires multiple API calls to LLMs (at minimum three) plus calls to other external APIs (e.g. search).
Users will be able to initiate many of these tasks (potentially thousands). Results are not expected to be instant: a user initiates a task and then awaits the results once it completes, however long that takes.
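To make the intended lifecycle concrete, here is a minimal sketch of the pattern I have in mind (plain Python with an in-process queue, purely illustrative; the names `initiate_task` and `run_pipeline` are my own placeholders, not Xano features):

```python
import queue
import threading
import uuid

tasks = {}                 # task_id -> {"status": ..., "result": ...}
work_queue = queue.Queue()

def initiate_task(payload):
    """Create a task record and enqueue it; returns immediately."""
    task_id = str(uuid.uuid4())
    tasks[task_id] = {"status": "pending", "result": None}
    work_queue.put((task_id, payload))
    return task_id

def run_pipeline(payload):
    """Placeholder for the real work: >= 3 LLM calls plus search calls."""
    steps = ["plan", "search", "synthesize"]   # stand-ins for the API calls
    return {"steps_completed": steps, "input": payload}

def worker():
    while True:
        task_id, payload = work_queue.get()
        tasks[task_id]["status"] = "running"
        try:
            tasks[task_id]["result"] = run_pipeline(payload)
            tasks[task_id]["status"] = "done"
        except Exception as exc:               # record per-task errors
            tasks[task_id] = {"status": "failed", "result": str(exc)}
        finally:
            work_queue.task_done()

threading.Thread(target=worker, daemon=True).start()

tid = initiate_task({"topic": "market research"})
work_queue.join()   # in practice the client would poll tasks[tid]["status"]
```

The key point is that task initiation and task execution are decoupled, and each task carries its own status and error record, which is what I am hoping to reproduce with Xano's primitives.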
As I understand it, Xano has a few options that could handle these batch-style calls: async functions, background tasks, and post process.
Does anyone have experience building a similar system, and how did it go? I'm particularly interested in which approach is best for optimizing performance, logging errors, and controlling costs.