T-Rex Label

Inference Server

An inference server is a software or hardware system dedicated to deploying trained deep learning models and serving inference requests. It handles model loading, request scheduling, and resource management so that large volumes of inference requests can be processed efficiently. Examples include TensorRT Inference Server (since renamed NVIDIA Triton Inference Server) and TorchServe.
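
To make the three functions named above concrete, here is a minimal in-process sketch in Python. It is a toy, not how TensorRT Inference Server or TorchServe are implemented. Every name in it (TinyInferenceServer, load_model, infer, the "double" model) is hypothetical and chosen for illustration, and a "model" is assumed to be any callable that maps a list of inputs to a list of outputs of the same length.

```python
import queue
import threading
from concurrent.futures import Future


class TinyInferenceServer:
    """Toy inference server (hypothetical): model registry, batching, one worker."""

    def __init__(self, max_batch_size=4, batch_timeout_s=0.01):
        self.models = {}                  # model loading: name -> batch callable
        self.requests = queue.Queue()     # request scheduling: FIFO of pending work
        self.max_batch_size = max_batch_size
        self.batch_timeout_s = batch_timeout_s
        # Resource management, in the simplest form: a single worker thread
        # owns all model execution, so models never run concurrently.
        self.worker = threading.Thread(target=self._serve, daemon=True)
        self.worker.start()

    def load_model(self, name, batch_fn):
        """Register a model; batch_fn takes a list of inputs, returns a list."""
        self.models[name] = batch_fn

    def infer(self, name, x):
        """Enqueue one request and return a Future for its result."""
        fut = Future()
        self.requests.put((name, x, fut))
        return fut

    def shutdown(self):
        """Ask the worker to stop after the requests already queued."""
        self.requests.put(None)
        self.worker.join()

    def _serve(self):
        while True:
            item = self.requests.get()
            if item is None:
                return
            batch, stop = [item], False
            # Collect more requests until the batch is full or the queue
            # stays empty for batch_timeout_s seconds.
            while len(batch) < self.max_batch_size:
                try:
                    nxt = self.requests.get(timeout=self.batch_timeout_s)
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            self._run(batch)
            if stop:
                return

    def _run(self, batch):
        # Group the batch by model so each model is called once per batch.
        groups = {}
        for name, x, fut in batch:
            groups.setdefault(name, []).append((x, fut))
        for name, items in groups.items():
            try:
                outputs = self.models[name]([x for x, _ in items])
                if len(outputs) != len(items):
                    raise ValueError("model returned wrong number of outputs")
                for (_, fut), y in zip(items, outputs):
                    fut.set_result(y)
            except Exception as exc:  # unknown model or model failure
                for _, fut in items:
                    if not fut.done():
                        fut.set_exception(exc)


# Usage: register a stand-in "model" and send six requests.
server = TinyInferenceServer()
server.load_model("double", lambda xs: [2 * x for x in xs])
futures = [server.infer("double", i) for i in range(6)]
results = [f.result(timeout=5) for f in futures]
server.shutdown()
```

Grouping requests into batches before calling the model is the design choice worth noticing: accelerators generally process a batch more efficiently than the same inputs one at a time, so the worker waits briefly for more requests instead of running each immediately. Production servers add much more on top of this, such as network endpoints, model versioning, and multi-device placement.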