{"id":281,"date":"2026-04-14T21:39:22","date_gmt":"2026-04-14T19:39:22","guid":{"rendered":"https:\/\/blog.nickywin.com\/?p=281"},"modified":"2026-06-05T16:24:00","modified_gmt":"2026-06-05T14:24:00","slug":"ubuntu-24-04-comfyui-vllm-local-ai-workstation","status":"publish","type":"post","link":"https:\/\/blog.nickywin.com\/ubuntu-24-04-comfyui-vllm-local-ai-workstation\/","title":{"rendered":"The Complete Guide to Setting Up ComfyUI + vLLM on Ubuntu 24.04: Building a Local AI Workstation"},"content":{"rendered":"<p><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' href='https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/tutorial_header_1776194185670.jpg'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  decoding=\"async\" data-original=\"https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/tutorial_header_1776194185670.jpg\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"Cover image for the Ubuntu 24.04 ComfyUI + vLLM tutorial\" style=\"width:100%;border-radius:12px;margin-bottom:20px\" \/><\/div><\/p>\n<h2>Introduction<\/h2>\n<p>As AI technology advances rapidly, deploying AI models locally has become a necessity for more and more developers and creators. <strong>ComfyUI<\/strong> is one of the most capable workflow tools for Stable Diffusion image generation, while <strong>vLLM<\/strong> 
is a high-throughput inference engine for large language models. This guide walks you step by step through setting up both on <strong>Ubuntu 24.04 LTS<\/strong> to build your own local AI workstation.<\/p>\n<h2>\ud83d\udccb Requirements<\/h2>\n<p>Before you begin, make sure your hardware and software meet the following requirements:<\/p>\n<table style=\"border-collapse:collapse;width:100%;border:1px solid #ddd\">\n<thead>\n<tr style=\"background-color:#2d3748;color:#fff\">\n<th style=\"padding:12px;text-align:left;border:1px solid #4a5568\">Item<\/th>\n<th style=\"padding:12px;text-align:left;border:1px solid #4a5568\">Minimum<\/th>\n<th style=\"padding:12px;text-align:left;border:1px solid #4a5568\">Recommended<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:10px;border:1px solid #4a5568\">Operating system<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">Ubuntu 24.04 LTS<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">Ubuntu 24.04 LTS Server<\/td>\n<\/tr>\n<tr style=\"background-color:#2d3748;color:#e2e8f0\">\n<td style=\"padding:10px;border:1px solid #4a5568\">GPU<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">NVIDIA GPU (compute capability 7.0+ for vLLM)<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">RTX 4090 \/ A100 (24GB+ VRAM)<\/td>\n<\/tr>\n<tr style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:10px;border:1px solid #4a5568\">RAM<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">16GB<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">32GB or more<\/td>\n<\/tr>\n<tr style=\"background-color:#2d3748;color:#e2e8f0\">\n<td style=\"padding:10px;border:1px 
solid #4a5568\">Disk<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">50GB free space<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">200GB+ SSD (model files are large)<\/td>\n<\/tr>\n<tr style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:10px;border:1px solid #4a5568\">Python<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">3.10+<\/td>\n<td style=\"padding:10px;border:1px solid #4a5568\">3.12 or 3.13<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2>\ud83d\udd27 Step 1: Prepare the Base System<\/h2>\n<h3>1.1 Update the System<\/h3>\n<p>First, make sure the system is up to date:<\/p>\n<pre><code>sudo apt update &amp;&amp; sudo apt upgrade -y\nsudo apt install -y git python3 python3-pip python3-venv python3-dev build-essential libgl1 wget<\/code><\/pre>\n<h3>1.2 Install the NVIDIA Driver<\/h3>\n<p>This is the most critical step. Without a working GPU driver, neither ComfyUI nor vLLM can use GPU acceleration.<\/p>\n<p><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' href='https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/nvidia_gpu_setup_1776194201327.jpg'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  decoding=\"async\" data-original=\"https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/nvidia_gpu_setup_1776194201327.jpg\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"NVIDIA GPU driver setup diagram\" 
style=\"width:100%;border-radius:8px;margin:15px 0\" \/><\/div><\/p>\n<pre><code># List the available NVIDIA driver versions\nsudo ubuntu-drivers list --gpgpu\n\n# Install the recommended driver (580 series used as an example)\nsudo ubuntu-drivers install --gpgpu nvidia:580-server\n\n# Reboot the system\nsudo reboot<\/code><\/pre>\n<p>After rebooting, verify the driver installation:<\/p>\n<pre><code>nvidia-smi<\/code><\/pre>\n<p>If you see output similar to the following, the driver was installed successfully:<\/p>\n<pre><code>+-----------------------------------------------------------------------------+\n| NVIDIA-SMI 580.xx.xx    Driver Version: 580.xx.xx    CUDA Version: 13.0     |\n|-------------------------------+----------------------+----------------------+\n| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |\n| Fan  Temp  Perf  Pwr:Usage\/Cap|         Memory-Usage | GPU-Util  Compute M. |\n|===============================+======================+======================|\n|   0  NVIDIA GeForce ...  
Off  | 00000000:01:00.0 Off |                  N\/A |\n|  0%   35C    P8    10W \/ 350W |      0MiB \/ 24576MiB |      0%      Default |\n+-------------------------------+----------------------+----------------------+<\/code><\/pre>\n<blockquote>\n<p>\ud83d\udca1 <strong>Tip:<\/strong> If your server already has the NVIDIA driver preinstalled, you can skip this step and simply run <code>nvidia-smi<\/code> to check.<\/p>\n<\/blockquote>\n<h2>\ud83c\udfa8 Step 2: Install ComfyUI<\/h2>\n<p>ComfyUI is a node-based interface for Stable Diffusion image generation. Its workflow design lets you flexibly combine AI models and processing nodes to build complex image-generation pipelines.<\/p>\n<p><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' href='https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/comfyui_interface_1776194214642.jpg'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  decoding=\"async\" data-original=\"https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/comfyui_interface_1776194214642.jpg\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"ComfyUI node-based workflow interface\" style=\"width:100%;border-radius:8px;margin:15px 0\" \/><\/div><\/p>\n<h3>Method 1: Use Comfy CLI (Recommended)<\/h3>\n<p>Comfy CLI 
is the official command-line management tool. It installs and manages ComfyUI with a single command:<\/p>\n<pre><code># Create a working directory\nmkdir ~\/ComfyUI_install &amp;&amp; cd ~\/ComfyUI_install\n\n# Create a Python virtual environment\npython3 -m venv venv\nsource venv\/bin\/activate\n\n# Install Comfy CLI\npip install comfy-cli\n\n# Install ComfyUI in one step\ncomfy install\n\n# Launch ComfyUI\ncomfy launch -- --listen 0.0.0.0<\/code><\/pre>\n<h3>Method 2: Manual Installation (from Source)<\/h3>\n<p>If you prefer to control each step yourself:<\/p>\n<pre><code># Clone the repository\ngit clone https:\/\/github.com\/comfyanonymous\/ComfyUI.git ~\/ComfyUI\ncd ~\/ComfyUI\n\n# Create a virtual environment\npython3 -m venv .venv\nsource .venv\/bin\/activate\n\n# Upgrade pip\npip install --upgrade pip wheel\n\n# Install PyTorch (CUDA 13.0 build)\npip install torch torchvision torchaudio --extra-index-url https:\/\/download.pytorch.org\/whl\/cu130\n\n# For older GPUs (10 series), use CUDA 12.6 instead:\n# pip install torch torchvision torchaudio --extra-index-url https:\/\/download.pytorch.org\/whl\/cu126\n\n# Install the ComfyUI dependencies\npip install -r requirements.txt<\/code><\/pre>\n<h3>Launch ComfyUI<\/h3>\n<pre><code># Activate the virtual environment (if it is not already active)\nsource ~\/ComfyUI\/.venv\/bin\/activate\n\n# Launch from the repository directory and allow LAN access\ncd ~\/ComfyUI\npython main.py --listen 0.0.0.0<\/code><\/pre>\n<p>Once it is running, open <code>http:\/\/your-server-ip:8188<\/code> to reach the ComfyUI interface.<\/p>\n<h3>Download Models<\/h3>\n<p>ComfyUI 
does not ship with any models, so you need to download them yourself. Common model types go in the following directories:<\/p>\n<pre><code># Stable Diffusion checkpoint models go here\n~\/ComfyUI\/models\/checkpoints\/\n\n# VAE models\n~\/ComfyUI\/models\/vae\/\n\n# LoRA models\n~\/ComfyUI\/models\/loras\/\n\n# ControlNet models\n~\/ComfyUI\/models\/controlnet\/<\/code><\/pre>\n<p>We recommend downloading models from <a href=\"https:\/\/huggingface.co\" target=\"_blank\">HuggingFace<\/a> or <a href=\"https:\/\/civitai.com\" target=\"_blank\">CivitAI<\/a>.<\/p>\n<h3>Start on Boot (Optional)<\/h3>\n<p>Use a systemd service to start ComfyUI automatically at boot:<\/p>\n<pre><code>sudo tee \/etc\/systemd\/system\/comfyui.service &lt;&lt; 'EOF'\n[Unit]\nDescription=ComfyUI Service\nAfter=network.target\n\n[Service]\nType=simple\nUser=your_username\nWorkingDirectory=\/home\/your_username\/ComfyUI\nExecStart=\/home\/your_username\/ComfyUI\/.venv\/bin\/python main.py --listen 0.0.0.0\nRestart=on-failure\nRestartSec=10\n\n[Install]\nWantedBy=multi-user.target\nEOF\n\nsudo systemctl daemon-reload\nsudo systemctl enable comfyui\nsudo systemctl start comfyui<\/code><\/pre>\n<h2>\ud83d\ude80 Step 3: Install vLLM<\/h2>\n<p>vLLM is a high-performance inference and serving engine for large language models (LLMs). It uses PagedAttention to run open-source LLMs such as Qwen, Llama, and Gemma at high throughput.<\/p>\n<p><div class='fancybox-wrapper lazyload-container-unload' data-fancybox='post-images' 
href='https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/vllm_architecture_1776194228757.jpg'><img class=\"lazyload lazyload-style-1\" src=\"data:image\/svg+xml;base64,PCEtLUFyZ29uTG9hZGluZy0tPgo8c3ZnIHdpZHRoPSIxIiBoZWlnaHQ9IjEiIHhtbG5zPSJodHRwOi8vd3d3LnczLm9yZy8yMDAwL3N2ZyIgc3Ryb2tlPSIjZmZmZmZmMDAiPjxnPjwvZz4KPC9zdmc+\"  decoding=\"async\" data-original=\"https:\/\/blog.nickywin.com\/wp-content\/uploads\/2026\/04\/vllm_architecture_1776194228757.jpg\" src=\"data:image\/png;base64,iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAAAJcEhZcwAADsQAAA7EAZUrDhsAAAANSURBVBhXYzh8+PB\/AAffA0nNPuCLAAAAAElFTkSuQmCC\" alt=\"vLLM inference serving architecture diagram\" style=\"width:100%;border-radius:8px;margin:15px 0\" \/><\/div><\/p>\n<h3>3.1 Install the uv Package Manager (Recommended)<\/h3>\n<p>uv is a very fast Python package manager, advertised as 10-100x faster than pip:<\/p>\n<pre><code># Install uv\nwget -qO- https:\/\/astral.sh\/uv\/install.sh | sh\n# Recent installers place uv in ~\/.local\/bin; load it into the current shell\nsource $HOME\/.local\/bin\/env<\/code><\/pre>\n<h3>3.2 Create a Virtual Environment and Install vLLM<\/h3>\n<pre><code># Create a dedicated directory\nmkdir ~\/vllm-service &amp;&amp; cd ~\/vllm-service\n\n# Create the virtual environment with uv\nuv venv --python 3.12\nsource .venv\/bin\/activate\n\n# Install vLLM (auto-detects the CUDA version)\nuv pip install vllm --torch-backend=auto<\/code><\/pre>\n<p>If you have not installed uv, you can use plain pip instead:<\/p>\n<pre><code>python3 -m venv ~\/vllm-service\/.venv\nsource ~\/vllm-service\/.venv\/bin\/activate\npip install vllm<\/code><\/pre>\n<h3>3.3 Verify the Installation<\/h3>\n<pre><code>python3 -c \"import vllm; print(f'vLLM version: {vllm.__version__}')\"<\/code><\/pre>\n<h3>3.4 Start the vLLM 
Service<\/h3>\n<p>vLLM can start an inference server that is compatible with the OpenAI API format:<\/p>\n<pre><code># Using the Qwen2.5-7B model as an example\nvllm serve Qwen\/Qwen2.5-7B-Instruct \\\n  --host 0.0.0.0 \\\n  --port 8000 \\\n  --gpu-memory-utilization 0.85 \\\n  --max-model-len 8192<\/code><\/pre>\n<p>Once it is running, you can test the API like this:<\/p>\n<pre><code># Test the chat completions endpoint\ncurl http:\/\/localhost:8000\/v1\/chat\/completions \\\n  -H \"Content-Type: application\/json\" \\\n  -d '{\n    \"model\": \"Qwen\/Qwen2.5-7B-Instruct\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello, please introduce yourself\"}\n    ],\n    \"temperature\": 0.7,\n    \"max_tokens\": 512\n  }'<\/code><\/pre>\n<h3>3.5 Common vLLM Parameters<\/h3>\n<table style=\"border-collapse:collapse;width:100%;border:1px solid #ddd\">\n<thead>\n<tr style=\"background-color:#2d3748;color:#fff\">\n<th style=\"padding:10px;text-align:left;border:1px solid #4a5568\">Parameter<\/th>\n<th style=\"padding:10px;text-align:left;border:1px solid #4a5568\">Description<\/th>\n<th style=\"padding:10px;text-align:left;border:1px solid #4a5568\">Example<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--host<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Listen address<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">0.0.0.0<\/td>\n<\/tr>\n<tr style=\"background-color:#2d3748;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--port<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Listen port<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">8000<\/td>\n<\/tr>\n<tr 
style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--gpu-memory-utilization<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Fraction of GPU memory vLLM may use<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">0.85<\/td>\n<\/tr>\n<tr style=\"background-color:#2d3748;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--max-model-len<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Maximum context length (tokens)<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">8192<\/td>\n<\/tr>\n<tr style=\"background-color:#1a202c;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--quantization<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Quantization method (saves VRAM)<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">awq \/ gptq \/ fp8<\/td>\n<\/tr>\n<tr style=\"background-color:#2d3748;color:#e2e8f0\">\n<td style=\"padding:8px;border:1px solid #4a5568\"><code>--tensor-parallel-size<\/code><\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">Number of GPUs for tensor parallelism<\/td>\n<td style=\"padding:8px;border:1px solid #4a5568\">2 (with two GPUs)<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3>3.6 Start on Boot (Optional)<\/h3>\n<pre><code>sudo tee \/etc\/systemd\/system\/vllm.service &lt;&lt; 'EOF'\n[Unit]\nDescription=vLLM Inference Server\nAfter=network.target\n\n[Service]\nType=simple\nUser=your_username\nWorkingDirectory=\/home\/your_username\/vllm-service\nExecStart=\/home\/your_username\/vllm-service\/.venv\/bin\/vllm serve Qwen\/Qwen2.5-7B-Instruct --host 0.0.0.0 --port 8000 --gpu-memory-utilization 0.85\nRestart=on-failure\nRestartSec=10\nEnvironment=HUGGING_FACE_HUB_TOKEN=your_token\n\n[Install]\nWantedBy=multi-user.target\nEOF\n\nsudo 
systemctl daemon-reload\nsudo systemctl enable vllm\nsudo systemctl start vllm<\/code><\/pre>\n<h2>\u2699\ufe0f Step 4: Combining ComfyUI + vLLM (Advanced)<\/h2>\n<p>You can combine vLLM and ComfyUI to build a fully automated workflow: <strong>&#8220;text description \u2192 AI-expanded prompt \u2192 generated image&#8221;<\/strong>.<\/p>\n<p>The idea is as follows:<\/p>\n<ol>\n<li>Deploy an LLM (such as Qwen2.5) with vLLM. It receives a short description through the API and expands it into a detailed image-generation prompt<\/li>\n<li>Pass the expanded prompt to the ComfyUI API<\/li>\n<li>ComfyUI generates the image from that prompt<\/li>\n<\/ol>\n<p>This combination is especially useful for batch-generating assets and for designing AI creative workflows.<\/p>\n<h2>\ud83d\udee1\ufe0f Troubleshooting<\/h2>\n<h3>Q1: nvidia-smi fails with &#8220;NVIDIA-SMI has failed&#8221;<\/h3>\n<p>Fix: reinstall the driver, or check whether the running kernel version matches the installed driver:<\/p>\n<pre><code>sudo apt install --reinstall nvidia-driver-580-server\nsudo reboot<\/code><\/pre>\n<h3>Q2: ComfyUI reports CUDA out of memory after startup<\/h3>\n<p>Launch it in low-VRAM mode:<\/p>\n<pre><code>python main.py --listen 0.0.0.0 --lowvram<\/code><\/pre>\n<h3>Q3: vLLM installation fails with an incompatible torch version<\/h3>\n<p>Install with an explicit CUDA backend:<\/p>\n<pre><code>uv pip install vllm --torch-backend=cu126<\/code><\/pre>\n<h3>Q4: How can ComfyUI and vLLM 
run at the same time without conflicts?<\/h3>\n<p>If you have only one GPU, we suggest:<\/p>\n<ul>\n<li>Limit the VRAM vLLM uses: <code>--gpu-memory-utilization 0.5<\/code><\/li>\n<li>Run ComfyUI in <code>--lowvram<\/code> mode<\/li>\n<li>Or start only one of the two services when you need it<\/li>\n<\/ul>\n<h2>\ud83d\udcdd Summary<\/h2>\n<p>By following this tutorial, you have set up the following on Ubuntu 24.04:<\/p>\n<ul>\n<li>\u2705 <strong>ComfyUI<\/strong>: a powerful workflow platform for AI image generation<\/li>\n<li>\u2705 <strong>vLLM<\/strong>: a high-performance inference engine for large language models<\/li>\n<\/ul>\n<p>Together, these two tools give you a full-featured local AI workstation. Whether you want to generate AI artwork or deploy your own chatbot, both are within easy reach. Have fun! \ud83c\udf89<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Introduction As AI technology advances rapidly, deploying AI models locally has become a necessity for more and more developers and creators. ComfyUI is 
[&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":284,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7],"tags":[],"class_list":["post-281","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-linux-tinkering"],"_links":{"self":[{"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/posts\/281","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/comments?post=281"}],"version-history":[{"count":4,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/posts\/281\/revisions"}],"predecessor-version":[{"id":301,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/posts\/281\/revisions\/301"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/media\/284"}],"wp:attachment":[{"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/media?parent=281"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/categories?post=281"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/blog.nickywin.com\/wp-json\/wp\/v2\/tags?post=281"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}