Generating custom training data with PyTorch Dataset and DataLoader: a worked example

1. torch.utils.data.Dataset

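torch.utils.data.Dataset is the abstract class from which PyTorch's dataset classes derive. Below is the basic skeleton of a custom Dataset: initialization goes in __init__(), and the two methods __getitem__() and __len__() must be overridden.

__getitem__() returns one training sample, such as an image and its label, and __len__() returns the number of samples in the dataset.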

from torch.utils import data


class CustomDataset(data.Dataset):  # must inherit from data.Dataset
    def __init__(self):
        # TODO
        # 1. Initialize file path or list of file names.
        pass

    def __getitem__(self, index):
        # TODO
        # 1. Read one data item from file (e.g. using numpy.fromfile, PIL.Image.open).
        # 2. Preprocess the data (e.g. torchvision.transforms).
        # 3. Return a data pair (e.g. image and label).
        # Note on step 1: each call reads exactly one sample, not a batch.
        pass

    def __len__(self):
        # You should change 0 to the total size of your dataset.
        return 0
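
To make the skeleton concrete, here is a minimal sketch of a filled-in Dataset. SquaresDataset and its contents are hypothetical and exist only for illustration; the point is that __getitem__() returns a single (input, target) pair and __len__() returns the sample count.

```python
import torch
from torch.utils.data import Dataset


class SquaresDataset(Dataset):
    """Hypothetical toy dataset: sample i is the pair (i, i * i)."""

    def __init__(self, size):
        self.size = size

    def __getitem__(self, index):
        # Reject out-of-range indices so that iteration terminates cleanly.
        if index < 0 or index >= self.size:
            raise IndexError(index)
        x = torch.tensor(float(index))
        return x, x * x

    def __len__(self):
        return self.size


dataset = SquaresDataset(5)
x, y = dataset[3]
print(len(dataset), x.item(), y.item())  # 5 3.0 9.0
```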

2. torch.utils.data.DataLoader

DataLoader accepts the following arguments:

dataset (Dataset): the dataset to load from.

batch_size (int, optional): number of samples per batch.

shuffle (bool, optional): if True, the data is reshuffled at the start of every epoch.

sampler (Sampler, optional): a custom strategy for drawing samples from the dataset. If this is specified, shuffle must be False.

batch_sampler (Sampler, optional): like sampler, but yields the indices of one whole batch at a time. Once this is specified, batch_size, shuffle, sampler, and drop_last can no longer be specified (they are mutually exclusive).

num_workers (int, optional): number of subprocesses used for data loading. 0 means the data is loaded in the main process. (default: 0)

collate_fn (callable, optional): function that merges a list of samples into a mini-batch.

pin_memory (bool, optional): if True, the data loader copies tensors into CUDA pinned memory before returning them.

drop_last (bool, optional): controls what happens to the final incomplete batch. If True, it is dropped: for example, with batch_size=64 and 100 samples per epoch, the last 36 samples are discarded. If False (the default), the final batch is kept and is simply smaller.

timeout (numeric, optional): if positive, the time allowed for collecting one batch from the worker processes; if no batch arrives within that time, the DataLoader raises an error. The value should always be non-negative. (default: 0)

worker_init_fn (callable, optional): per-worker initialization function. If not None, this will be called on each worker subprocess with the worker id (an int in [0, num_workers - 1]) as input, after seeding and before data loading. (default: None)
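
How batch_size, shuffle, and drop_last interact can be modelled in a few lines of plain Python. This is only a toy model of the index bookkeeping, not PyTorch's actual implementation; batch_indices and its seed parameter are hypothetical names introduced for illustration.

```python
import random


def batch_indices(num_samples, batch_size, shuffle=False, drop_last=False, seed=0):
    """Toy model of index batching; not PyTorch's actual implementation."""
    indices = list(range(num_samples))          # what a sequential sampler yields
    if shuffle:
        random.Random(seed).shuffle(indices)    # what a random sampler would do
    batches = [indices[i:i + batch_size]
               for i in range(0, num_samples, batch_size)]
    if drop_last and batches and len(batches[-1]) < batch_size:
        batches.pop()                           # discard the incomplete batch
    return batches


kept = batch_indices(100, 64, drop_last=False)
dropped = batch_indices(100, 64, drop_last=True)
print([len(b) for b in kept])     # [64, 36]
print([len(b) for b in dropped])  # [64]
```

With 100 samples and batch_size=64, keeping the last batch gives batches of 64 and 36 samples, while drop_last=True leaves only the full batch of 64.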

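3. Generating custom training data with Dataset and DataLoader

Suppose a TXT file lists the images and their labels in the following format: the first column is the image file name and the second column is the label.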

0.jpg 0
1.jpg 1
2.jpg 2
3.jpg 3
4.jpg 4
5.jpg 5
6.jpg 6
7.jpg 7
8.jpg 8
9.jpg 9

The data can also carry multiple labels per image, for example:

0.jpg 0 10
1.jpg 1 11
2.jpg 2 12
3.jpg 3 13
4.jpg 4 14
5.jpg 5 15
6.jpg 6 16
7.jpg 7 17
8.jpg 8 18
9.jpg 9 19

The ten original images are placed in the ./dataset/images directory. We can then define a custom Dataset that parses this file and reads the images, and use the DataLoader class to produce batches of training data.
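
As a quick illustration of the file format, one line can be parsed into a file name and a list of integer labels as sketched below. parse_label_line is a hypothetical helper, and splitting on any whitespace is an assumption that also tolerates tabs or repeated spaces.

```python
def parse_label_line(line):
    """Split 'name label1 label2 ...' into (name, [int labels])."""
    fields = line.split()          # split on any run of whitespace
    return fields[0], [int(v) for v in fields[1:]]


sample_lines = ["0.jpg 0", "1.jpg 1 11\n"]
records = [parse_label_line(line) for line in sample_lines if line.strip()]
print(records)  # [('0.jpg', [0]), ('1.jpg', [1, 11])]
```

The same function handles both the single-label and the multi-label layout, because everything after the first column is treated as a label.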

3.1 Custom Dataset

First, define a TorchDataset class that reads the image data and produces the labels.

Pay attention to the initialization function:

import os

import numpy as np
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms

from utils import image_processing


class TorchDataset(Dataset):
    def __init__(self, filename, image_dir, resize_height=256, resize_width=256, repeat=1):
        '''
        :param filename: TXT data file; each line has the format: image_name.jpg label1_id label2_id
        :param image_dir: image directory; image_dir + image_name.jpg forms the full image path
        :param resize_height: if None, the height is not resized
        :param resize_width: if None, the width is not resized
               PS: when only one of resize_height and resize_width is None, the image is
               resized with its aspect ratio preserved
        :param repeat: number of times the whole sample set is repeated (default: 1);
               when repeat is None, the dataset reports a very large length (10000000)
               so that iteration is effectively endless
        '''
        self.image_label_list = self.read_file(filename)
        self.image_dir = image_dir
        self.len = len(self.image_label_list)
        self.repeat = repeat
        self.resize_height = resize_height
        self.resize_width = resize_width

        # Set up the preprocessing transforms.
        # torchvision.transforms.ToTensor converts a PIL.Image or numpy.ndarray of
        # shape (H, W, C) into a tensor of shape (C, H, W); uint8 pixel values in
        # [0, 255] are scaled to a torch.FloatTensor in [0.0, 1.0].
        self.toTensor = transforms.ToTensor()

        # torchvision.transforms.Normalize(mean, std) operates on tensors: given
        # per-channel means (R, G, B) and standard deviations (R, G, B), it computes
        # channel = (channel - mean) / std.
        # self.normalize=transforms.Normalize()

    def __getitem__(self, i):
        index = i % self.len  # wrap around so indices beyond one pass stay valid
        # print("i={},index={}".format(i, index))
        image_name, label = self.image_label_list[index]
        image_path = os.path.join(self.image_dir, image_name)
        img = self.load_data(image_path, self.resize_height, self.resize_width, normalization=False)
        img = self.data_preproccess(img)
        label = np.array(label)
        return img, label

    def __len__(self):
        if self.repeat is None:
            data_len = 10000000
        else:
            data_len = len(self.image_label_list) * self.repeat
        return data_len

    def read_file(self, filename):
        image_label_list = []
        with open(filename, 'r') as f:
            lines = f.readlines()
            for line in lines:
                # rstrip removes trailing whitespace (\n, \r, \t and ' ')
                content = line.rstrip().split(' ')
                name = content[0]
                labels = []
                for value in content[1:]:
                    labels.append(int(value))
                image_label_list.append((name, labels))
        return image_label_list

    def load_data(self, path, resize_height, resize_width, normalization):
        '''
        Load one image
        :param path:
        :param resize_height:
        :param resize_width:
        :param normalization: whether to normalize to [0.0, 1.0]
        :return:
        '''
        image = image_processing.read_image(path, resize_height, resize_width, normalization)
        return image

    def data_preproccess(self, data):
        '''
        Preprocess the data
        :param data:
        :return:
        '''
        data = self.toTensor(data)
        return data
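
The arithmetic behind ToTensor and Normalize can be reproduced with NumPy alone. The following is a sketch of the equivalent computation rather than a call into torchvision; the 2x2 image and the choice of mean = std = 0.5 are assumptions made up for the example.

```python
import numpy as np

# Hypothetical 2x2 RGB image with uint8 pixels, shape (H, W, C).
image_hwc = np.array([[[0, 128, 255], [255, 0, 0]],
                      [[0, 255, 0], [0, 0, 255]]], dtype=np.uint8)

# ToTensor-style step: (H, W, C) -> (C, H, W), scaled to [0.0, 1.0].
image_chw = image_hwc.transpose(2, 0, 1).astype(np.float32) / 255.0

# Normalize-style step with assumed mean = std = 0.5 for every channel:
# channel = (channel - mean) / std
mean = np.array([0.5, 0.5, 0.5], dtype=np.float32).reshape(3, 1, 1)
std = np.array([0.5, 0.5, 0.5], dtype=np.float32).reshape(3, 1, 1)
normalized = (image_chw - mean) / std

print(image_chw.shape)                     # (3, 2, 2)
print(normalized.min(), normalized.max())  # -1.0 1.0
```

With these assumed statistics, pixel value 0 maps to -1.0 and 255 maps to 1.0. Transposing with (1, 2, 0) reverses the layout change, which is what the display code does before showing an image.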

3.2 Producing training batches with DataLoader

if __name__ == '__main__':
    train_filename = "../dataset/train.txt"
    # test_filename = "../dataset/test.txt"
    image_dir = '../dataset/images'

    epoch_num = 2  # number of passes over the whole sample set
    batch_size = 7  # number of samples per training batch
    train_data_nums = 10
    # Total iterations: ceil(train_data_nums / batch_size) per epoch, times epoch_num.
    # Integer division is required here; float division would give a wrong count in general.
    max_iterate = (train_data_nums + batch_size - 1) // batch_size * epoch_num

    train_data = TorchDataset(filename=train_filename, image_dir=image_dir, repeat=1)
    # test_data = TorchDataset(filename=test_filename, image_dir=image_dir, repeat=1)
    train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=False)
    # test_loader = DataLoader(dataset=test_data, batch_size=batch_size, shuffle=False)

    # [1] Iterate by epoch; TorchDataset is built with repeat=1
    for epoch in range(epoch_num):
        for batch_image, batch_label in train_loader:
            image = batch_image[0, :]
            image = image.numpy()  # image = np.array(image)
            image = image.transpose(1, 2, 0)  # channel order [c,h,w] -> [h,w,c]
            image_processing.cv_show_image("image", image)
            print("batch_image.shape:{},batch_label:{}".format(batch_image.shape, batch_label))
            # Wrapping in Variable is unnecessary since PyTorch 0.4; tensors are used directly.

The iteration code above uses two nested for loops. The parameter epoch_num is the number of passes over the whole sample set; for example, epoch_num=2 means every sample is iterated over twice.

This causes a problem, however: when the total number of samples train_data_nums is not divisible by batch_size, the last batch of each epoch is smaller than batch_size. Here, with train_data_nums=10 and batch_size=7, the first iteration yields 7 samples and the second yields only the remaining 3.
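
The batch arithmetic can be checked with the standard library alone. This small sketch uses the same numbers as the example and computes the batches per epoch, the total iteration count, and the size of each batch within one epoch.

```python
import math

train_data_nums = 10
batch_size = 7
epoch_num = 2

# Each epoch needs ceil(10 / 7) = 2 batches.
batches_per_epoch = math.ceil(train_data_nums / batch_size)
max_iterate = batches_per_epoch * epoch_num

# Size of each batch within one epoch; the last one holds the remainder.
sizes = [min(batch_size, train_data_nums - start)
         for start in range(0, train_data_nums, batch_size)]

print(batches_per_epoch, max_iterate, sizes)  # 2 4 [7, 3]
```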

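We would like every iteration to yield a batch of the same size, so we can iterate as follows. TorchDataset was designed with repeated iteration in mind: setting repeat to None makes the dataset loop effectively without end, and the loop is left after max_iterate steps. It is called like this: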

    '''
    In the two approaches below, TorchDataset is built with repeat=None so that it loops
    effectively without end; max_iterate decides when to leave the loop.
    '''
    train_data = TorchDataset(filename=train_filename, image_dir=image_dir, repeat=None)
    train_loader = DataLoader(dataset=train_data, batch_size=batch_size, shuffle=False)
    # [2] Second iteration approach
    for step, (batch_image, batch_label) in enumerate(train_loader):
        if step >= max_iterate:  # stop after exactly max_iterate batches
            break
        image = batch_image[0, :]
        image = image.numpy()  # image = np.array(image)
        image = image.transpose(1, 2, 0)  # channel order [c,h,w] -> [h,w,c]
        image_processing.cv_show_image("image", image)
        print("step:{},batch_image.shape:{},batch_label:{}".format(step, batch_image.shape, batch_label))
    # [3] Third iteration approach
    # data_iter = iter(train_loader)  # create the iterator once, outside the loop
    # for step in range(max_iterate):
    #     batch_image, batch_label = next(data_iter)
    #     image = batch_image[0, :]
    #     image = image.numpy()  # image = np.array(image)
    #     image = image.transpose(1, 2, 0)  # channel order [c,h,w] -> [h,w,c]
    #     image_processing.cv_show_image("image", image)
    #     print("batch_image.shape:{},batch_label:{}".format(batch_image.shape, batch_label))
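
The wrap-around that repeat=None relies on can be modelled without PyTorch. The following is a minimal sketch, not the loader itself: wrapped_batches is a hypothetical stand-in that batches the virtual indices 0..virtual_len-1 in order (as with shuffle=False) and maps each one to a real sample with i % num_samples, mirroring __getitem__ and the fixed length of 10000000 returned by __len__.

```python
from itertools import islice


def wrapped_batches(num_samples, batch_size, virtual_len):
    """Toy stand-in for the loader: batch virtual indices, wrapping each with % num_samples."""
    for start in range(0, virtual_len, batch_size):
        stop = min(start + batch_size, virtual_len)
        yield [i % num_samples for i in range(start, stop)]


num_samples, batch_size, max_iterate = 10, 7, 4
loader = wrapped_batches(num_samples, batch_size, virtual_len=10000000)

# One iterator, created once, is advanced max_iterate times.
batches = list(islice(loader, max_iterate))
for step, batch in enumerate(batches):
    print(step, batch)
```

Every batch has 7 entries because the index simply wraps from 9 back to 0, so the second batch reads samples 7, 8, 9, 0, 1, 2, 3. The iterator is created once and then advanced; creating a new iterator from the loader on every step would restart at the first batch each time when shuffle=False.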

3.3 Appendix: image_processing.py

The code above uses image_processing, a small image-processing module I wrote that wraps basic operations such as reading and displaying images:

# -*-coding: utf-8 -*-
"""
 @Project: IntelligentManufacture
 @File : image_processing.py
 @Author : panjq
 @E-mail : pan_jinquan@163.com
 @Date : 2019-02-14 15:34:50
"""
 
import os
import glob
import cv2
import numpy as np
import matplotlib.pyplot as plt
 
def show_image(title, image):
    '''
    Display an RGB image with matplotlib
    :param title: image title
    :param image: image data
    :return:
    '''
    # plt.figure("show_image")
    # print(image.dtype)
    plt.imshow(image)
    plt.axis('on')  # use 'off' to hide the axes
    plt.title(title)  # image title
    plt.show()


def cv_show_image(title, image):
    '''
    Display an RGB image with OpenCV
    :param title: image title
    :param image: input RGB image
    :return:
    '''
    channels = image.shape[-1]
    if channels == 3:
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)  # convert RGB to BGR, the order OpenCV expects
    cv2.imshow(title, image)
    cv2.waitKey(0)
 
def read_image(filename, resize_height=None, resize_width=None, normalization=False):
    '''
    Read an image; by default the result is uint8 in [0, 255]
    :param filename:
    :param resize_height:
    :param resize_width:
    :param normalization: whether to normalize to [0.0, 1.0]
    :return: the image data in RGB order
    '''

    bgr_image = cv2.imread(filename)
    # bgr_image = cv2.imread(filename,cv2.IMREAD_IGNORE_ORIENTATION|cv2.IMREAD_COLOR)
    if bgr_image is None:
        print("Warning: file does not exist or cannot be read: {}".format(filename))
        return None
    if len(bgr_image.shape) == 2:  # convert a grayscale image to three channels
        print("Warning:gray image", filename)
        bgr_image = cv2.cvtColor(bgr_image, cv2.COLOR_GRAY2BGR)

    rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB
    # show_image(filename,rgb_image)
    # rgb_image=Image.open(filename)
    rgb_image = resize_image(rgb_image, resize_height, resize_width)
    rgb_image = np.asanyarray(rgb_image)
    if normalization:
        # Divide by 255.0, not 255: under Python 2, uint8 / 255 would be integer division
        rgb_image = rgb_image / 255.0
    # show_image("src resize image",image)
    return rgb_image
 
def fast_read_image_roi(filename, orig_rect, ImreadModes=cv2.IMREAD_COLOR, normalization=False):
    '''
    Fast image reading that returns only a region of interest
    :param filename: image path
    :param orig_rect: region of interest (rect) in the original image
    :param ImreadModes: IMREAD_UNCHANGED
                        IMREAD_GRAYSCALE
                        IMREAD_COLOR
                        IMREAD_ANYDEPTH
                        IMREAD_ANYCOLOR
                        IMREAD_LOAD_GDAL
                        IMREAD_REDUCED_GRAYSCALE_2
                        IMREAD_REDUCED_COLOR_2
                        IMREAD_REDUCED_GRAYSCALE_4
                        IMREAD_REDUCED_COLOR_4
                        IMREAD_REDUCED_GRAYSCALE_8
                        IMREAD_REDUCED_COLOR_8
                        IMREAD_IGNORE_ORIENTATION
    :param normalization: whether to normalize to [0.0, 1.0]
    :return: the region of interest (ROI)
    '''
    # An IMREAD_REDUCED mode downscales the image on load, so the rect must be scaled to match
    scale = 1
    if ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_2 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_2:
        scale = 1 / 2
    elif ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_4 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_4:
        scale = 1 / 4
    elif ImreadModes == cv2.IMREAD_REDUCED_GRAYSCALE_8 or ImreadModes == cv2.IMREAD_REDUCED_COLOR_8:
        scale = 1 / 8
    rect = np.array(orig_rect) * scale
    rect = rect.astype(int).tolist()
    bgr_image = cv2.imread(filename, flags=ImreadModes)

    if bgr_image is None:
        print("Warning: file does not exist or cannot be read: {}".format(filename))
        return None
    if len(bgr_image.shape) == 3:  # color image
        rgb_image = cv2.cvtColor(bgr_image, cv2.COLOR_BGR2RGB)  # convert BGR to RGB
    else:
        rgb_image = bgr_image  # grayscale image, keep as is
    rgb_image = np.asanyarray(rgb_image)
    if normalization:
        # Divide by 255.0, not 255: under Python 2, uint8 / 255 would be integer division
        rgb_image = rgb_image / 255.0
    roi_image = get_rect_image(rgb_image, rect)
    # show_image_rect("src resize image",rgb_image,rect)
    # cv_show_image("reROI",roi_image)
    return roi_image
 
def resize_image(image, resize_height, resize_width):
    '''
    Resize an image; if only one target side is given, the aspect ratio is preserved
    :param image:
    :param resize_height:
    :param resize_width:
    :return:
    '''
    image_shape = np.shape(image)
    height = image_shape[0]
    width = image_shape[1]
    # Wrong way to write this test: resize_height and resize_width is None
    if (resize_height is None) and (resize_width is None):
        return image
    if resize_height is None:
        resize_height = int(height * resize_width / width)
    elif resize_width is None:
        resize_width = int(width * resize_height / height)
    image = cv2.resize(image, dsize=(resize_width, resize_height))
    return image


def scale_image(image, scale):
    '''
    :param image:
    :param scale: (scale_w,scale_h)
    :return:
    '''
    image = cv2.resize(image, dsize=None, fx=scale[0], fy=scale[1])
    return image


def get_rect_image(image, rect):
    '''
    :param image:
    :param rect: [x,y,w,h]
    :return:
    '''
    x, y, w, h = rect
    cut_img = image[y:(y + h), x:(x + w)]
    return cut_img


def scale_rect(orig_rect, orig_shape, dest_shape):
    '''
    When an image is resized, the corresponding rectangle must be scaled as well
    :param orig_rect: rect=[x,y,w,h] in the original image
    :param orig_shape: shape=[h,w] of the original image
    :param dest_shape: shape=[h,w] of the resized image
    :return: the scaled rectangle
    '''
    new_x = int(orig_rect[0] * dest_shape[1] / orig_shape[1])
    new_y = int(orig_rect[1] * dest_shape[0] / orig_shape[0])
    new_w = int(orig_rect[2] * dest_shape[1] / orig_shape[1])
    new_h = int(orig_rect[3] * dest_shape[0] / orig_shape[0])
    dest_rect = [new_x, new_y, new_w, new_h]
    return dest_rect
 
def show_image_rect(win_name, image, rect):
    '''
    Draw rect on the image (in place) and display it
    :param win_name:
    :param image:
    :param rect: [x,y,w,h]
    :return:
    '''
    x, y, w, h = rect
    point1 = (x, y)
    point2 = (x + w, y + h)
    cv2.rectangle(image, point1, point2, (0, 0, 255), thickness=2)
    cv_show_image(win_name, image)


def rgb_to_gray(image):
    image = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    return image


def save_image(image_path, rgb_image, toUINT8=True):
    if toUINT8:
        # Expects a float image in [0.0, 1.0]; scale it back to uint8
        rgb_image = np.asanyarray(rgb_image * 255, dtype=np.uint8)
    if len(rgb_image.shape) == 2:  # convert a grayscale image to three channels
        bgr_image = cv2.cvtColor(rgb_image, cv2.COLOR_GRAY2BGR)
    else:
        bgr_image = cv2.cvtColor(rgb_image, cv2.COLOR_RGB2BGR)
    cv2.imwrite(image_path, bgr_image)


def combime_save_image(orig_image, dest_image, out_dir, name, prefix):
    '''
    Naming convention: out_dir/name_prefix.jpg
    :param orig_image:
    :param dest_image:
    :param out_dir:
    :param name:
    :param prefix:
    :return:
    '''
    dest_path = os.path.join(out_dir, name + "_" + prefix + ".jpg")
    save_image(dest_path, dest_image)

    # Also save the original and the result side by side
    dest_image = np.hstack((orig_image, dest_image))
    save_image(os.path.join(out_dir, "{}_src_{}.jpg".format(name, prefix)), dest_image)
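
The size and rectangle arithmetic used by resize_image and scale_rect can be tried without OpenCV. In the sketch below, fit_size is a hypothetical helper that isolates the aspect-ratio logic of resize_image, and scale_rect restates the formula from the module; the image size and rectangle are made-up values.

```python
def fit_size(height, width, resize_height=None, resize_width=None):
    """Return (new_height, new_width); a None side is derived from the aspect ratio."""
    if resize_height is None and resize_width is None:
        return height, width
    if resize_height is None:
        resize_height = int(height * resize_width / width)
    elif resize_width is None:
        resize_width = int(width * resize_height / height)
    return resize_height, resize_width


def scale_rect(orig_rect, orig_shape, dest_shape):
    """Scale rect=[x, y, w, h] from an image of shape [h, w] to one of shape [h, w]."""
    new_x = int(orig_rect[0] * dest_shape[1] / orig_shape[1])
    new_y = int(orig_rect[1] * dest_shape[0] / orig_shape[0])
    new_w = int(orig_rect[2] * dest_shape[1] / orig_shape[1])
    new_h = int(orig_rect[3] * dest_shape[0] / orig_shape[0])
    return [new_x, new_y, new_w, new_h]


# A 480x640 image resized to width 320 keeps its 3:4 aspect ratio.
new_h, new_w = fit_size(480, 640, resize_width=320)
print(new_h, new_w)  # 240 320

# The rectangle shrinks by the same factor as the image.
print(scale_rect([100, 50, 200, 120], (480, 640), (new_h, new_w)))  # [50, 25, 100, 60]
```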

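3.4 Complete code

The complete script, dataset.py, is the TorchDataset class from section 3.1 followed by the __main__ block from section 3.2, under this file header:

# -*-coding: utf-8 -*-
"""
 @Project: pytorch-learning-tutorials
 @File : dataset.py
 @Author : panjq
 @E-mail : pan_jinquan@163.com
 @Date : 2019-03-07 18:45:06
"""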

The above is based on my personal experience; I hope it serves as a useful reference. If anything is wrong or not fully thought through, corrections are welcome.
