PyTorch
1. Fix random seed:
import random
import numpy as np
import torch

torch.manual_seed(config['seed'])
torch.cuda.manual_seed_all(config['seed'])  # seed every GPU, not just the current one
np.random.seed(config['seed'])
random.seed(config['seed'])
Seed the DataLoader workers:
from torch.utils.data import DataLoader

def seed_worker(worker_id):
    # Derive a per-worker seed from the base seed set above
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

g = torch.Generator()
g.manual_seed(0)

train_loader = DataLoader(
    train_dataset,
    batch_size=batch_size,
    num_workers=num_workers,
    worker_init_fn=seed_worker,
    generator=g,
)
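Reproducibility also matters when a run is interrupted and resumed. A minimal sketch of saving and restoring the RNG states alongside a checkpoint (the file name and dictionary keys are arbitrary, and model is assumed to exist):
checkpoint = {
    'model': model.state_dict(),
    'torch_rng': torch.get_rng_state(),
    'cuda_rng': torch.cuda.get_rng_state_all(),
    'numpy_rng': np.random.get_state(),
    'python_rng': random.getstate(),
}
torch.save(checkpoint, 'ckpt.pth')

# ... later, when resuming:
checkpoint = torch.load('ckpt.pth')
model.load_state_dict(checkpoint['model'])
torch.set_rng_state(checkpoint['torch_rng'])
torch.cuda.set_rng_state_all(checkpoint['cuda_rng'])
np.random.set_state(checkpoint['numpy_rng'])
random.setstate(checkpoint['python_rng'])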
2. Distributed
Env:
import torch.distributed as dist

# torch.distributed.launch passes --local_rank to each spawned process
parser.add_argument("--local_rank", type=int, help='local rank for DistributedDataParallel')
opt = parse_option()

torch.cuda.set_device(opt.local_rank)  # bind this process to its own GPU
dist.init_process_group(backend='nccl', init_method='env://')
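The snippet assumes a parse_option() helper and an existing parser; a minimal sketch of what such a helper might look like (argument names chosen to match the rest of this page):
import argparse

def parse_option():
    parser = argparse.ArgumentParser('DDP training')
    parser.add_argument("--local_rank", type=int, default=0,
                        help='local rank for DistributedDataParallel')
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--num_workers", type=int, default=4)
    return parser.parse_args()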
Sampler and dataloader:
from torch.utils.data.distributed import DistributedSampler
from torch.utils.data import DataLoader

train_sampler = DistributedSampler(TRAIN_DATASET)  # shuffle=True by default, so the sampler does the shuffling
train_loader = DataLoader(TRAIN_DATASET,
                          batch_size=args.batch_size,
                          shuffle=False,
                          num_workers=args.num_workers,
                          sampler=train_sampler,
                          drop_last=True)
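DistributedSampler only reshuffles across epochs if it is told the epoch number; a minimal sketch of the per-epoch call (the training loop itself is schematic):
for epoch in range(num_epochs):
    # Without set_epoch, every epoch reuses the same shuffled order
    train_sampler.set_epoch(epoch)
    for batch in train_loader:
        ...  # forward / backward / optimizer step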
Model:
from torch.nn.parallel import DistributedDataParallel

model = model.cuda()
# broadcast_buffers=True (the default) broadcasts buffers such as BatchNorm running
# statistics from rank 0 at each forward pass; it does not enable SyncBatchNorm
model = DistributedDataParallel(model, device_ids=[opt.local_rank])
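If synchronized BatchNorm (statistics computed across all GPUs) is actually what is wanted, convert the model explicitly before wrapping it in DDP; a minimal sketch:
import torch

# Replace every BatchNorm*d layer with SyncBatchNorm; must happen before DDP wrapping
model = torch.nn.SyncBatchNorm.convert_sync_batchnorm(model)
model = model.cuda()
model = DistributedDataParallel(model, device_ids=[opt.local_rank])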
Run in shell:
CUDA_VISIBLE_DEVICES=0,1,2,3 \
python -m torch.distributed.launch --nproc_per_node 4 \
<your python script>
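Inside the script, work that should happen once per job (logging, saving checkpoints) is typically guarded by rank; a minimal sketch (the checkpoint path is arbitrary):
if dist.get_rank() == 0:
    # Only rank 0 writes to disk; model.module is the unwrapped model inside DDP
    torch.save(model.module.state_dict(), 'checkpoint.pth')
dist.barrier()  # keep the other ranks from racing ahead of the save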
3. cuDNN
torch.backends.cudnn.enabled = True
# Let cuDNN benchmark and pick the fastest convolution algorithms
# (faster when input sizes are fixed, but not deterministic)
torch.backends.cudnn.benchmark = True
# For reproducibility, fix the convolution algorithm instead:
# torch.backends.cudnn.deterministic = True
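For fully deterministic training (at some speed cost), PyTorch 1.8+ also has a global switch that can be combined with the flags above; a minimal sketch:
import torch

torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
# Errors out if an op without a deterministic implementation is hit;
# some CUDA ops additionally need CUBLAS_WORKSPACE_CONFIG=:4096:8 in the environment
torch.use_deterministic_algorithms(True)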
4. TORCH_CUDA_ARCH_LIST
To fix "nvcc fatal : Unsupported gpu architecture 'compute_86'" (the installed CUDA toolkit is too old to compile for compute capability 8.6), restrict the build to an architecture the toolkit does support:
export TORCH_CUDA_ARCH_LIST="7.5"
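To check which compute capability the local GPU reports, so TORCH_CUDA_ARCH_LIST can be set to match, a quick check from Python:
import torch

# Prints e.g. (8, 6) for an RTX 30-series card, i.e. compute capability 8.6
print(torch.cuda.get_device_capability(0))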