Tags: msp8955/DeepSpeed
Improve z3 trace management (deepspeedai#1916)
* Fix OOM and type mismatch
* Toggle prefetching
* Disable z3 prefetching for inference (temp workaround)
* Fix zero3 tracing issues
* Remove debug prints
* Enable prefetch for inference
* Code clarity
* Invalidate trace cache
* Trace cache invalidation when needed
* Separate nvme prefetch from all-gather prefetch
* Track last used step id
* Use debug name in error message
* Construct param trace from module trace
Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Fix OOM and type mismatch (deepspeedai#1884) Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
qkv_out can be a single tensor or a list. Handling these cases separately. (deepspeedai#1850) Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
[ZeRO] Default disable elastic ckpt in stage 1+2 and reduce CPU memory overhead during ckpt load (deepspeedai#1525) Co-authored-by: Olatunji Ruwase <olruwase@microsoft.com>
Various small documentation text improvements (deepspeedai#1665) Co-authored-by: Jeff Rasley <jerasley@microsoft.com>
Remove unused import of ssl.OP_ENABLE_MIDDLEBOX_COMPAT (deepspeedai#1601)