Zendesk interview question

How would you design a real-time LLM inference service?