To enable vLLM's sleep mode, you can first build vLLM from source using the community-patched code in this pull request. After the patch is merged into vLLM main ...
For now, you can build vLLM from source on the main branch. Alternatively, once they are available, you can install the pre-built ROCm wheels directly for any vLLM version later than 0.11.0.
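Whichever route you take, it can help to confirm that the installed build is new enough before trying sleep mode. The snippet below is a minimal sketch, not an official check. The helper names (`release_tuple`, `is_later_than_minimum`, `installed_vllm_version`, `try_sleep_mode`) are hypothetical. It assumes the package is installed under the distribution name `vllm`, and that your build exposes the `enable_sleep_mode` argument together with `LLM.sleep()` and `LLM.wake_up()`. The version comparison is deliberately coarse: it looks only at the leading numeric release segment and ignores pre-release and local suffixes.

```python
import re
from importlib import metadata
from typing import Optional, Tuple

# The text above says "later than 0.11.0", so this bound is exclusive.
MIN_EXCLUSIVE = (0, 11, 0)


def release_tuple(version: str) -> Tuple[int, ...]:
    """Return the leading numeric release segment, e.g. '0.11.1rc2+rocm' -> (0, 11, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_later_than_minimum(version: str) -> bool:
    """Coarse check: is the numeric release strictly greater than 0.11.0?"""
    parts = release_tuple(version)
    # Pad short versions such as '0.12' so they compare as (0, 12, 0).
    padded = parts + (0,) * (len(MIN_EXCLUSIVE) - len(parts))
    return padded > MIN_EXCLUSIVE


def installed_vllm_version() -> Optional[str]:
    """Return the installed vLLM version string, or None if vLLM is not installed."""
    try:
        return metadata.version("vllm")  # assumed distribution name
    except metadata.PackageNotFoundError:
        return None


def try_sleep_mode(model_name: str) -> None:
    """Load a model with sleep mode enabled, put it to sleep, then wake it up."""
    # Imported lazily so the version helpers above work without vLLM installed.
    from vllm import LLM

    llm = LLM(model=model_name, enable_sleep_mode=True)
    llm.sleep(level=1)  # level 1 offloads the weights and discards the KV cache
    llm.wake_up()
```

A typical use is to call `installed_vllm_version()`, pass the result to `is_later_than_minimum()`, and only then call `try_sleep_mode()` with a model of your choice. Source builds from the main branch usually carry a development suffix in their version string, so treat the result as a hint rather than a guarantee that the sleep-mode patch is present.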