In PyTorch I tried to load the training dataset with multiple worker processes, using the following code:
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=3)
This raised the following error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
    if __name__ == '__main__':
        freeze_support()
        ...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
The error message says that a new process was started before the current process finished its bootstrapping phase. This probably means the child processes were not started with fork, or the proper idiom was not used in the main module.
After some searching I found the cause: on Windows, multiprocessing uses the spawn start method by default. With spawn, each child process re-imports the main module; if the code is not protected by an `if __name__ == '__main__':` guard, every child treats the module-level code as code to run again, executes the same code as its parent, spawns yet another process, and so on until the program crashes.
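The re-import behaviour can be demonstrated with plain `multiprocessing`, independent of PyTorch. This is a minimal sketch: forcing the spawn start method (the Windows default) on any platform, with all process-starting code behind the guard so the spawned workers can safely re-import the module:

```python
import multiprocessing as mp

def work(x):
    # Runs in a child process; under spawn, the child re-imports this
    # module from scratch before calling this function.
    return x * 2

if __name__ == '__main__':
    # 'spawn' is the default on Windows; forcing it here reproduces the
    # Windows behaviour on Linux/macOS as well.
    mp.set_start_method('spawn', force=True)
    with mp.Pool(processes=2) as pool:
        print(pool.map(work, [1, 2, 3]))  # prints [2, 4, 6]
```

If the `Pool` creation were at module top level instead of under the guard, each spawned child would hit it again during re-import and try to start its own pool, which is exactly the runaway recursion the RuntimeError guards against.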
The fix is simple: move the code that starts the worker processes under the `__main__` guard.
import torch
import torchvision
import torchvision.transforms as transforms

if __name__ == '__main__':
    transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=4, shuffle=True, num_workers=3)
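To see the guarded pattern end to end without downloading CIFAR-10, here is a self-contained sketch in which a synthetic `TensorDataset` of random 3x32x32 tensors stands in for the real dataset (the fake shapes and `make_loader` helper are illustrative assumptions, not part of the original post):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def make_loader(num_workers=2):
    data = torch.randn(64, 3, 32, 32)     # 64 fake CIFAR-10-shaped images
    labels = torch.randint(0, 10, (64,))  # 64 fake class labels in [0, 10)
    dataset = TensorDataset(data, labels)
    return DataLoader(dataset, batch_size=4, shuffle=True, num_workers=num_workers)

if __name__ == '__main__':
    # Worker processes are only started here, behind the guard, so a
    # spawned child re-importing this module does not start more workers.
    loader = make_loader()
    images, labels = next(iter(loader))
    print(images.shape)  # torch.Size([4, 3, 32, 32])
```

Note that only the code that *starts* the workers (creating and iterating the `DataLoader`) must sit under the guard; plain definitions such as `make_loader` are safe at module level because importing them has no side effects.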
Supplement: multiprocessing errors with the PyTorch DataLoader
When using a DataLoader to feed training with multiple worker processes, the multiprocessing setup can fail:
dataloader = DataLoader(transformed_dataset, batch_size=4,shuffle=True, num_workers=4)
Here the `num_workers` parameter sets the number of worker processes used to load the data. If it is nonzero, i.e. multiprocessing is used, the following error appears:
RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable.
Adding `if __name__ == '__main__':` before the data-loading code solves the problem:
if __name__ == '__main__':  # this guard fixes the multiprocessing problem
    for i_batch, sample_batched in enumerate(dataloader):
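An equivalent and slightly tidier idiom is to wrap the whole loading loop in a function and call it from under the guard. This sketch uses a hypothetical stand-in dataset of 16 scalar samples (the `iterate_batches` helper is an illustrative assumption):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def iterate_batches(num_workers=2):
    # Stand-in dataset: 16 samples, each a 1-element float tensor.
    dataset = TensorDataset(torch.arange(16.0).unsqueeze(1))
    dataloader = DataLoader(dataset, batch_size=4, num_workers=num_workers)
    shapes = []
    for i_batch, (sample_batched,) in enumerate(dataloader):
        shapes.append(tuple(sample_batched.shape))
    return shapes

if __name__ == '__main__':
    # The guard ensures worker processes are started only when the script
    # is run directly, never during a spawn re-import.
    print(iterate_batches())  # four batches of shape (4, 1)
```

Everything that starts workers is reached only through the guarded call, so the module stays safe to re-import.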
The above is based on my personal experience; I hope it serves as a useful reference, and I hope you will continue to support Script Home.