
pytorch_16gpu_0.sh 541 B

4 years ago
#!/bin/bash
# Launch node 0 of a 2-node, 16-GPU distributed PyTorch job.
# Usage: pytorch_16gpu_0.sh <model> <dataset>
GPUS_PER_NODE=8
# Change for multinode config
MASTER_ADDR=162.105.146.117
MASTER_PORT=6000
NNODES=2
NODE_RANK=0
# Total number of processes across all nodes (not used below)
WORLD_SIZE=$((GPUS_PER_NODE * NNODES))
workdir=$(cd "$(dirname "$0")" && pwd)
mainpy="${workdir}/../torch_main.py"
DISTRIBUTED_ARGS="--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $NODE_RANK --master_addr $MASTER_ADDR --master_port $MASTER_PORT"
# $DISTRIBUTED_ARGS is intentionally unquoted so it splits into separate arguments
python -m torch.distributed.launch $DISTRIBUTED_ARGS \
    "${mainpy}" \
    --model "$1" --dataset "$2" --learning-rate 0.01 --validate --timing --distributed
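
The script above fixes NODE_RANK=0, so it only starts the first of the two nodes. With torch.distributed.launch, the other node is normally started with the same arguments except for --node_rank, and each process gets the global rank node_rank * nproc_per_node + local_rank. The following is a minimal sketch of that relationship. It assumes the same addresses and sizes as the script above; build_distributed_args and global_rank_range are hypothetical helpers for illustration, not part of this repository.

```shell
#!/bin/bash
# Sketch only: shows how launcher arguments and global ranks vary per node.
# Values are copied from the script above; the helper names are hypothetical.
GPUS_PER_NODE=8
MASTER_ADDR=162.105.146.117
MASTER_PORT=6000
NNODES=2

# Print the torch.distributed.launch arguments for the given node rank.
build_distributed_args() {
    local node_rank=$1
    echo "--nproc_per_node $GPUS_PER_NODE --nnodes $NNODES --node_rank $node_rank --master_addr $MASTER_ADDR --master_port $MASTER_PORT"
}

# Print the range of global ranks owned by the given node,
# using global_rank = node_rank * GPUS_PER_NODE + local_rank.
global_rank_range() {
    local node_rank=$1
    local first=$((node_rank * GPUS_PER_NODE))
    echo "$first-$((first + GPUS_PER_NODE - 1))"
}

# Show the arguments and rank range for every node in the job.
for ((rank = 0; rank < NNODES; rank++)); do
    echo "node $rank (global ranks $(global_rank_range "$rank")): $(build_distributed_args "$rank")"
done
```

Only the --node_rank value differs between the two printed argument lines. MASTER_ADDR and MASTER_PORT stay the same on both machines because every process connects to the rank 0 node for rendezvous.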

Distributed deep learning system

Contributors (1)