This is officially supported by Triton, but fp8 (also known as float8) will not work; see the known issue. In this case, I recommend using GGUF models instead of fp8 models. If you are using an RTX 30xx (Ampere) GPU, keep this in mind ...
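
To check which case applies to your GPU, something like the following minimal sketch may help. It assumes that native fp8 needs CUDA compute capability 8.9 or higher (RTX 30xx Ampere cards report 8.6), and the names `FP8_MIN_CAPABILITY`, `has_native_fp8` and `suggest_model_format` are hypothetical helpers made up for this illustration, not part of Triton or PyTorch. The only library call used is PyTorch's `torch.cuda.get_device_capability`.

```python
# Assumption: native fp8 needs compute capability 8.9 or newer.
# RTX 30xx (Ampere) cards report 8.6, so they fall below this threshold.
FP8_MIN_CAPABILITY = (8, 9)


def has_native_fp8(capability):
    """Return True if a (major, minor) compute capability is at least 8.9."""
    return tuple(capability) >= FP8_MIN_CAPABILITY


def suggest_model_format(capability):
    """Suggest "fp8" where it should work, otherwise fall back to "GGUF"."""
    return "fp8" if has_native_fp8(capability) else "GGUF"


if __name__ == "__main__":
    # PyTorch is optional here; without it (or without a GPU) nothing is printed.
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        cap = torch.cuda.get_device_capability(0)  # (major, minor) tuple
        print(f"compute capability {cap[0]}.{cap[1]}:", suggest_model_format(cap))
```

Treat the result as a hint only; the known issue remains the authoritative description of what does and does not work.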