In semi-supervised lesion segmentation, the teacher network performs poorly and therefore struggles to guide the student network toward effective segmentation. To address this issue, an efficient semi-supervised medical image lesion segmentation method is proposed that uses the medical segment anything model (MedSAM), which offers superior feature extraction capabilities, as the teacher network. A lightweight Mamba-based student network is constructed, and its segmentation performance is enhanced through knowledge distillation. To address the semantic mismatch that arises when features are aligned across heterogeneous networks, a perturbation-consistent cross-architecture knowledge distillation method is introduced. This approach maps teacher features into the student feature space and aligns perturbation responses, thereby strengthening the student network's feature representation and improving segmentation performance. In addition, because diverse lesion morphologies and low foreground-background contrast lead to poor segmentation consistency, a distribution-based self-supervised loss is proposed for optimization. Experiments on multiple types of medical image lesion segmentation datasets demonstrate that the proposed method outperforms existing methods in segmentation performance. Moreover, the student network has only 1.34 M parameters, which substantially improves model efficiency.
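
The perturbation-consistent cross-architecture distillation described above can be illustrated with a minimal, framework-free sketch. It is not the formulation used in this work: the linear projection, the mean-squared-error terms, their equal weighting, and all names (`project`, `response`, `perturbation_consistency_loss`) are assumptions made purely for illustration. The sketch maps teacher features into the student feature space, then penalizes both the feature mismatch and the mismatch between how the two networks' features change under an input perturbation.

```python
# Illustrative sketch only. The linear projection, the MSE objective, and all
# names here are assumptions, not the formulation used in the paper.

def project(features, weight):
    """Map a teacher feature vector into the student feature space (linear map)."""
    return [sum(w * f for w, f in zip(row, features)) for row in weight]

def response(clean, perturbed):
    """Perturbation response: change in features when the input is perturbed."""
    return [p - c for c, p in zip(clean, perturbed)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def perturbation_consistency_loss(t_clean, t_pert, s_clean, s_pert, weight):
    """Feature alignment term plus perturbation-response alignment term."""
    t_clean_proj = project(t_clean, weight)
    t_pert_proj = project(t_pert, weight)
    feature_term = mse(t_clean_proj, s_clean)
    response_term = mse(response(t_clean_proj, t_pert_proj),
                        response(s_clean, s_pert))
    return feature_term + response_term

# Hypothetical toy features: teacher dimension 4, student dimension 2.
weight = [[1.0, 0.0, 0.0, 0.0],
          [0.0, 1.0, 0.0, 0.0]]
t_clean = [1.0, 2.0, 3.0, 4.0]
t_pert = [1.5, 2.0, 3.0, 4.0]

# Student that matches the projected teacher on clean and perturbed inputs.
loss_matched = perturbation_consistency_loss(
    t_clean, t_pert, [1.0, 2.0], [1.5, 2.0], weight)

# Student that matches on clean input but ignores the perturbation.
loss_insensitive = perturbation_consistency_loss(
    t_clean, t_pert, [1.0, 2.0], [1.0, 2.0], weight)

print(loss_matched, loss_insensitive)
```

In the second case the clean features agree exactly, so only the response term contributes to the loss. This shows why aligning perturbation responses carries information beyond plain feature matching. In practice the projection would be learned and the computation would run on feature maps in a tensor library rather than on Python lists.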