Hi, I was wondering why you calculated the sum of squared errors in the style loss calculation:

m_loss = sse(d_mean, s_mean) / batch_size  # normalized w.r.t. batch size

while the Torch implementation uses MSE, in this line:

self.mean_criterion = nn.MSECriterion()
self.mean_loss = self.mean_criterion:forward(self.input_mean, self.target_mean)
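To make the comparison concrete, here is a small sketch of how the two normalizations differ (the array names and shapes below are illustrative, not taken from either implementation; it assumes sse is a plain sum of squared differences and that nn.MSECriterion averages over all elements, its default behavior):

```python
import numpy as np

rng = np.random.default_rng(0)
batch_size, n_features = 4, 3
d_mean = rng.random((batch_size, n_features))
s_mean = rng.random((batch_size, n_features))

# Sum of squared errors, normalized only by batch size
# (what the Python line above computes):
sse_loss = np.sum((d_mean - s_mean) ** 2) / batch_size

# MSE as nn.MSECriterion computes it by default:
# mean over *all* elements, i.e. batch_size * n_features:
mse_loss = np.mean((d_mean - s_mean) ** 2)

# The two differ by a constant factor equal to the feature dimension:
print(np.isclose(sse_loss, mse_loss * n_features))  # → True
```

So the two losses agree up to a constant scale factor, which in practice only rescales the effective style weight rather than changing the gradient direction.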