Suppose you have a multi-task network with more than one loss. How do those losses work during training?
model = Model(inputs=input, outputs=[y1, y2])
l1 = 0.5
l2 = 0.3
model.compile(loss=[loss1, loss2], loss_weights=[l1, l2], ...)
The losses are combined into a single weighted sum:

final_loss = l1 * loss1 + l2 * loss2

What training ultimately does is minimize final_loss.
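This raises a question: during training, does loss2 update only the part of the network that leads to y2, or does it update every layer?
The answer lies in how gradients flow backward, that is, in the backpropagation algorithm.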
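Backpropagation sends the gradient of a loss only to the parameters that take part in computing that loss; for every other parameter the partial derivative is zero. So loss1 affects only x1 and x2, while loss2 affects only x1 and x3. Reading x1 as the layers shared by both outputs, x2 as the branch that produces y1, and x3 as the branch that produces y2, loss2 updates the shared layers and its own branch, but not the y1 branch.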
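The claim can be checked numerically on a toy network. The sketch below is an illustration under stated assumptions, not the article's model: it uses three scalar weights named x1 (shared), x2 (branch to y1) and x3 (branch to y2), squared-error losses, and central finite differences in place of Keras's automatic differentiation.

```python
# Toy scalar network (hypothetical, for illustration only):
#   h  = x1 * inp    shared layer, weight x1
#   y1 = x2 * h      branch 1, weight x2
#   y2 = x3 * h      branch 2, weight x3

def forward(params, inp):
    x1, x2, x3 = params
    h = x1 * inp
    return x2 * h, x3 * h

def losses(params, inp, t1, t2):
    y1, y2 = forward(params, inp)
    return (y1 - t1) ** 2, (y2 - t2) ** 2

def numeric_grad(fn, params, eps=1e-6):
    """Central finite-difference gradient of fn with respect to each parameter."""
    grads = []
    for i in range(len(params)):
        up = list(params)
        up[i] += eps
        down = list(params)
        down[i] -= eps
        grads.append((fn(up) - fn(down)) / (2 * eps))
    return grads

params = [0.7, -1.2, 0.4]        # x1, x2, x3
inp, t1, t2 = 2.0, 1.0, -1.0     # one input and two targets
l1, l2 = 0.5, 0.3                # loss weights, as in the compile call

grad_loss1 = numeric_grad(lambda p: losses(p, inp, t1, t2)[0], params)
grad_loss2 = numeric_grad(lambda p: losses(p, inp, t1, t2)[1], params)
grad_final = numeric_grad(
    lambda p: l1 * losses(p, inp, t1, t2)[0] + l2 * losses(p, inp, t1, t2)[1],
    params,
)

print("d loss1 / d (x1, x2, x3):", grad_loss1)
print("d loss2 / d (x1, x2, x3):", grad_loss2)
print("d final / d (x1, x2, x3):", grad_final)
```

The x3 entry of the loss1 gradient and the x2 entry of the loss2 gradient are zero, while x1 receives a contribution from both losses. The gradient of final_loss is the same weighted combination of the two per-loss gradients.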
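Addendum: defining a sum of multiple losses in Keras
Pass the losses as a dict whose keys are the names of the model's output layers. Each loss can be either a function you define yourself or one that ships with Keras.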
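As a sketch of what the dict form expresses, the plain-Python snippet below mimics the bookkeeping: one loss function and one weight per output name, combined into a weighted sum. The output names `out_a` and `out_b`, the toy `mse` and `mae` helpers, and the numbers are all hypothetical. In Keras the same two dicts would be passed to `model.compile`, as the closing comment indicates.

```python
def mse(y_true, y_pred):
    """Mean squared error over a flat list of values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error over a flat list of values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Keys are (hypothetical) output-layer names; values may be custom or built-in losses.
loss = {"out_a": mse, "out_b": mae}
loss_weights = {"out_a": 0.5, "out_b": 0.3}

def total_loss(y_true, y_pred):
    """Weighted sum of the per-output losses, keyed by output name."""
    return sum(
        loss_weights[name] * loss[name](y_true[name], y_pred[name])
        for name in loss
    )

y_true = {"out_a": [1.0, 2.0], "out_b": [0.0, 1.0]}
y_pred = {"out_a": [1.5, 2.0], "out_b": [0.5, 0.5]}
print(total_loss(y_true, y_pred))

# The corresponding Keras call would look like:
# model.compile(optimizer=..., loss=loss, loss_weights=loss_weights)
```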
Addendum: Keras in practice, implementing multi-class segmentation losses
All examples here assume 3D data with one-hot labels, i.e. y_true has shape (batch_size, x, y, z, class_num).
from keras import backend as K  # Keras 2.x backend API


def dice_coef_fun(smooth=1):
    def dice_coef(y_true, y_pred):
        # Dice for every class of every sample: sum over the spatial axes only.
        intersection = K.sum(y_true * y_pred, axis=(1, 2, 3))
        union = K.sum(y_true, axis=(1, 2, 3)) + K.sum(y_pred, axis=(1, 2, 3))
        sample_dices = (2. * intersection + smooth) / (union + smooth)  # shape (batch_size, class_num)
        # Average over the batch to get one Dice value per class.
        dices = K.mean(sample_dices, axis=0)  # shape (class_num,)
        return K.mean(dices)  # mean Dice over all classes
    return dice_coef


def dice_coef_loss_fun(smooth=0):
    def dice_coef_loss(y_true, y_pred):
        # Loss = 1 - Dice, so a perfect overlap gives a loss of 0.
        return 1 - dice_coef_fun(smooth=smooth)(y_true=y_true, y_pred=y_pred)
    return dice_coef_loss
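To make the role of `smooth` concrete, here is a minimal plain-Python sketch of the same Dice formula for a single class of a single sample, with the voxels flattened into a list. The `dice` helper and the tiny inputs are hypothetical stand-ins for the tensor version.

```python
def dice(y_true, y_pred, smooth=1.0):
    """Dice coefficient for one class of one sample, on flat lists."""
    intersection = sum(t * p for t, p in zip(y_true, y_pred))
    union = sum(y_true) + sum(y_pred)
    return (2.0 * intersection + smooth) / (union + smooth)

y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
partial = dice(y_true, y_pred, smooth=0.0)   # 2 * 1 / (2 + 1)
print(partial)

# A class that is absent from both the label and the prediction:
empty = [0, 0, 0, 0]
absent = dice(empty, empty, smooth=1.0)      # (0 + 1) / (0 + 1)
print(absent)

try:
    dice(empty, empty, smooth=0.0)
except ZeroDivisionError:
    print("smooth=0 with an absent class divides by zero")
```

With float tensors the last case would produce NaN rather than raise, so a nonzero `smooth` is worth considering whenever a class can be missing from a sample.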
from keras import backend as K  # Keras 2.x backend API


def generalized_dice_coef_fun(smooth=0):
    # Note: `smooth` is not used in this function.
    def generalized_dice(y_true, y_pred):
        # Compute weights: "the contribution of each label is corrected by the inverse of its volume"
        w = K.sum(y_true, axis=(0, 1, 2, 3))
        w = 1 / (w ** 2 + 0.00001)  # per-class weight: the larger a class's share, the smaller its weight
        # Compute gen dice coef:
        numerator = y_true * y_pred
        numerator = w * K.sum(numerator, axis=(0, 1, 2, 3))
        numerator = K.sum(numerator)
        denominator = y_true + y_pred
        denominator = w * K.sum(denominator, axis=(0, 1, 2, 3))
        denominator = K.sum(denominator)
        gen_dice_coef = numerator / denominator
        return 2 * gen_dice_coef
    return generalized_dice


def generalized_dice_loss_fun(smooth=0):
    def generalized_dice_loss(y_true, y_pred):
        return 1 - generalized_dice_coef_fun(smooth=smooth)(y_true=y_true, y_pred=y_pred)
    return generalized_dice_loss
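The effect of the inverse-squared-volume weights is easier to see with numbers. The sketch below redoes the generalized Dice arithmetic in plain Python, starting from per-class sums instead of tensors. The two classes and their voxel counts are made up for illustration.

```python
def generalized_dice(true_counts, pred_counts, intersections, eps=1e-5):
    """Generalized Dice from per-class sums (one list entry per class)."""
    weights = [1.0 / (n ** 2 + eps) for n in true_counts]
    numerator = sum(w * i for w, i in zip(weights, intersections))
    denominator = sum(
        w * (t + p) for w, t, p in zip(weights, true_counts, pred_counts)
    )
    return 2.0 * numerator / denominator

def plain_dice(true_count, pred_count, intersection):
    """Unweighted Dice for a single class."""
    return 2.0 * intersection / (true_count + pred_count)

# Hypothetical sums: class 0 is a large background, class 1 a small structure.
true_counts = [100.0, 4.0]      # sum of y_true per class
pred_counts = [102.0, 2.0]      # sum of y_pred per class
intersections = [100.0, 2.0]    # sum of y_true * y_pred per class

background = plain_dice(100.0, 102.0, 100.0)
small_class = plain_dice(4.0, 2.0, 2.0)
gd = generalized_dice(true_counts, pred_counts, intersections)
print(background, small_class, gd)
```

The generalized score lands much closer to the small class's Dice than to the background's, which is the point of the weighting: a well-segmented large class cannot mask a poorly segmented small one.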
from keras import backend as K  # Keras 2.x backend API

# Ref: salehi17, "Tversky loss function for image segmentation using 3D FCDN"
# -> the score is computed for each class separately and then averaged
# alpha=beta=0.5 : dice coefficient
# alpha=beta=1   : tanimoto coefficient (also known as jaccard)
# alpha+beta=1   : produces set of F*-scores
# implemented by E. Moebel, 06/04/18


def tversky_coef_fun(alpha, beta):
    def tversky_coef(y_true, y_pred):
        p0 = y_pred      # proba that voxels are class i
        p1 = 1 - y_pred  # proba that voxels are not class i
        g0 = y_true
        g1 = 1 - y_true
        # Tversky index for every class of every sample
        num = K.sum(p0 * g0, axis=(1, 2, 3))
        den = num + alpha * K.sum(p0 * g1, axis=(1, 2, 3)) + beta * K.sum(p1 * g0, axis=(1, 2, 3))
        T = num / den  # shape (batch_size, class_num)
        # Average over the batch to get one value per class
        dices = K.mean(T, axis=0)  # shape (class_num,)
        return K.mean(dices)
    return tversky_coef


def tversky_coef_loss_fun(alpha, beta):
    def tversky_coef_loss(y_true, y_pred):
        return 1 - tversky_coef_fun(alpha=alpha, beta=beta)(y_true=y_true, y_pred=y_pred)
    return tversky_coef_loss
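The special cases listed in the header comment can be verified by hand. The following plain-Python sketch evaluates the Tversky index for one class on a flattened sample and compares it with Dice and Jaccard computed directly. The helper names and the soft predictions are hypothetical.

```python
def tversky(y_true, y_pred, alpha, beta):
    """Tversky index for one class; alpha weights false positives, beta false negatives."""
    tp = sum(p * g for p, g in zip(y_pred, y_true))
    fp = sum(p * (1 - g) for p, g in zip(y_pred, y_true))
    fn = sum((1 - p) * g for p, g in zip(y_pred, y_true))
    return tp / (tp + alpha * fp + beta * fn)

def dice(y_true, y_pred):
    intersection = sum(p * g for p, g in zip(y_pred, y_true))
    return 2.0 * intersection / (sum(y_true) + sum(y_pred))

def jaccard(y_true, y_pred):
    intersection = sum(p * g for p, g in zip(y_pred, y_true))
    return intersection / (sum(y_true) + sum(y_pred) - intersection)

y_true = [1, 1, 1, 0, 0]
y_pred = [0.9, 0.8, 0.2, 0.4, 0.1]   # soft predictions for the same voxels

print(tversky(y_true, y_pred, 0.5, 0.5), dice(y_true, y_pred))
print(tversky(y_true, y_pred, 1.0, 1.0), jaccard(y_true, y_pred))
print(tversky(y_true, y_pred, 0.3, 0.7))  # beta > alpha: misses cost more than false alarms
```

Because this sample has more false-negative mass than false-positive mass, shifting weight toward beta lowers the score relative to the alpha=beta=0.5 case.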
from keras import backend as K  # Keras 2.x backend API


def IoU_fun(eps=1e-6):
    def IoU(y_true, y_pred):
        # if np.max(y_true) == 0.0:
        #     return IoU(1-y_true, 1-y_pred)  ## empty image; calc IoU of zeros
        intersection = K.sum(y_true * y_pred, axis=[1, 2, 3])
        union = K.sum(y_true, axis=[1, 2, 3]) + K.sum(y_pred, axis=[1, 2, 3]) - intersection
        # Per-class IoU averaged over the batch; eps guards against an empty union.
        ious = K.mean((intersection + eps) / (union + eps), axis=0)
        return K.mean(ious)  # mean IoU over all classes
    return IoU


def IoU_loss_fun(eps=1e-6):
    def IoU_loss(y_true, y_pred):
        return 1 - IoU_fun(eps=eps)(y_true=y_true, y_pred=y_pred)
    return IoU_loss
The above is based on my personal experience; I hope it serves as a useful reference.