Weight-loading error when running MiniMax-M2.1 with sglang using pipeline parallelism

#18

by tuo02 - opened about 12 hours ago


When I run MiniMax-M2.1 on H200 with the following sglang command:
python3 -m sglang.launch_server --model /volume/models/MiniMaxAI/MiniMax-M2.1/snapshots/17f852dfda7d63c61b1107d47552bb30488ffbee --trust-remote-code --tp-size 2 --pp-size 2 --mem-fraction-static 0.85 --chunked-prefill-size 20480 --page-size 64 --cuda-graph-max-bs 64 --enable-metrics

the following error occurs:

[2026-01-12 02:39:23 PP3 TP1] Scheduler hit an exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 2653, in run_scheduler_process
    scheduler = Scheduler(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/scheduler.py", line 316, in __init__
    self.tp_worker = TpModelWorker(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 245, in __init__
    self._model_runner = ModelRunner(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 361, in __init__
    self.initialize(min_per_gpu_memory)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 436, in initialize
    self.load_model()
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 804, in load_model
    self.model = get_model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/__init__.py", line 28, in get_model
    return loader.load_model(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 600, in load_model
    self.load_weights_and_postprocess(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_loader/loader.py", line 608, in load_weights_and_postprocess
    model.load_weights(weights)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/models/minimax_m2.py", line 882, in load_weights
    param = params_dict[name]
KeyError: 'model.layers.37.block_sparse_moe.e_score_correction_bias'

[2026-01-12 02:39:23] Received sigquit from a child process. It usually means the child failed.
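
For readers following the traceback, here is one plausible reading of it. This is an assumption and has not been checked against the SGLang source. Under pipeline parallelism each rank builds only its own slice of layers, so a loader that looks up every checkpoint tensor name in `params_dict` fails on the first name that belongs to another stage. The toy sketch below reproduces that failure mode. `build_params_dict`, `load_weights`, and the 8-layer split are hypothetical stand-ins, not SGLang code or the real model layout.

```python
# Toy model of a pipeline-parallel weight loader. All names and sizes are
# hypothetical and only illustrate the failure mode; this is not SGLang code.

NUM_LAYERS = 8  # assumed toy size, not the real model depth


def param_name(layer_idx):
    return f"model.layers.{layer_idx}.block_sparse_moe.e_score_correction_bias"


def build_params_dict(start_layer, end_layer):
    """Parameters owned by one pipeline stage: layers [start_layer, end_layer)."""
    return {param_name(i): None for i in range(start_layer, end_layer)}


def load_weights(params_dict, checkpoint_names, skip_missing=False):
    """Match each checkpoint tensor to a local parameter; return the matched names."""
    loaded = []
    for name in checkpoint_names:
        if skip_missing and name not in params_dict:
            continue  # tensor belongs to another pipeline stage
        params_dict[name]  # raises KeyError if this stage does not own the layer
        loaded.append(name)
    return loaded


# The checkpoint holds every layer; this stage owns only layers 0..3.
checkpoint = [param_name(i) for i in range(NUM_LAYERS)]
stage_params = build_params_dict(0, 4)

try:
    load_weights(stage_params, checkpoint)
    failed_on = None
except KeyError as exc:
    failed_on = exc.args[0]

loaded = load_weights(stage_params, checkpoint, skip_missing=True)
```

Skipping tensors owned by other stages is only one piece of pipeline-parallel support. The forward pass must also hand activations from one stage to the next, so a guard like `skip_missing` alone would not make a model PP-capable.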

rogeryoungh

MiniMax org about 12 hours ago

Pipeline parallelism (PP) is currently not supported for M2 in either vLLM or SGLang. We recommend using tensor parallelism (TP) or expert parallelism (EP) instead, as both have been fully implemented and verified.
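
As a sketch of that advice applied to the command from the original post, the helper below drops `--pp-size` and folds its value into `--tp-size`, so the same number of GPUs is used with tensor parallelism only. `fold_pp_into_tp` is a hypothetical helper written for this illustration. Whether the resulting `--tp-size 4` suits this model and hardware is an assumption to verify before relying on it.

```python
import shlex


def fold_pp_into_tp(command):
    """Return the command with --pp-size removed and --tp-size multiplied by it.

    Only the space-separated form ("--tp-size 2") is handled, which is the
    form used in the original post.
    """
    args = shlex.split(command)
    tp, pp = 1, 1
    kept = []
    i = 0
    while i < len(args):
        if args[i] == "--tp-size":
            tp = int(args[i + 1])
            i += 2
        elif args[i] == "--pp-size":
            pp = int(args[i + 1])
            i += 2
        else:
            kept.append(args[i])
            i += 1
    # Keep the total GPU count unchanged: tp * pp devices, all used for TP.
    kept += ["--tp-size", str(tp * pp)]
    return shlex.join(kept)


original = (
    "python3 -m sglang.launch_server "
    "--model /volume/models/MiniMaxAI/MiniMax-M2.1/snapshots/"
    "17f852dfda7d63c61b1107d47552bb30488ffbee "
    "--trust-remote-code --tp-size 2 --pp-size 2 --mem-fraction-static 0.85 "
    "--chunked-prefill-size 20480 --page-size 64 --cuda-graph-max-bs 64 "
    "--enable-metrics"
)
rewritten = fold_pp_into_tp(original)
print(rewritten)
```

All other flags are passed through unchanged. Memory-related settings such as `--mem-fraction-static` may still need retuning once the parallelism layout changes.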
