Tags: ShenglongZ/DeepSpeed
Fix broken link to DeepSpeed Megatron fork (deepspeedai#2440) Co-authored-by: Lev Kurilenko <lekurile@microsoft.com>
fix deepspeedai#2240: wrong time unit in flops_profiler (deepspeedai#2241) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
fix ds-inference without policy (deepspeedai#2247) Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Tensor parallelism for Mixture of Experts (deepspeedai#2074) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Ammar Ahmad Awan <ammar.awan@microsoft.com>
use HF NeoX (deepspeedai#2087) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com> Co-authored-by: Jeff Rasley <jerasley@microsoft.com>