T-Rex Label

FP8

8-bit floating point (also known as quarter-precision). Reducing a model's precision to FP8 can enhance both its inference speed and efficiency while leveraging specialized features of modern GPUs such as tensor cores.