Large-Margin Softmax Loss, Angular Softmax Loss, Additive Margin Softmax, ArcFaceLoss And FocalLoss In Tensorflow

May 29, 2018

This repository contains the core code for a TensorFlow reimplementation of the following papers:

If your goal is to reproduce the results in the original papers, please use the official code:

To use these Ops on your own machine:

  • copy the header file "cuda_config.h" from "your_python_path/site-packages/external/local_config_cuda/cuda/cuda/cuda_config.h" to "your_python_path/site-packages/tensorflow/include/tensorflow/stream_executor/cuda/cuda_config.h".

  • run the following script:

mkdir build
cd build && cmake ..
make
  • run "test_op.py" and check the numerical errors to verify your installation

  • follow the code snippets below to integrate these Ops into your own code:

    • For Large Margin Softmax Loss:
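    import tensorflow as tf
    from tensorflow.python.framework import ops
    
    # so_lib_path: path to the compiled *.so file built in the previous steps
    op_module = tf.load_op_library(so_lib_path)
    large_margin_softmax = op_module.large_margin_softmax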
    
    @ops.RegisterGradient("LargeMarginSoftmax")
    def _large_margin_softmax_grad(op, grad, _):
      '''The gradients for `LargeMarginSoftmax`.
      '''
      inputs_features = op.inputs[0]
      inputs_weights = op.inputs[1]
      inputs_labels = op.inputs[2]
      cur_lambda = op.outputs[1]
      margin_order = op.get_attr('margin_order')
    
      grads = op_module.large_margin_softmax_grad(inputs_features, inputs_weights, inputs_labels, grad, cur_lambda[0], margin_order)
      return [grads[0], grads[1], None, None]
    
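    # initial_value, features, labels and global_step come from your own model code
    var_weights = tf.Variable(initial_value, trainable=True, name='lsoftmax_weights')
    result = large_margin_softmax(features, var_weights, labels, global_step, 4, 1000., 0.000025, 35., 0.)
    # result[0] is used as the logits of a standard cross-entropy loss
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))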
    
    • For Angular Softmax Loss:
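    import tensorflow as tf
    from tensorflow.python.framework import ops
    
    # so_lib_path: path to the compiled *.so file built in the previous steps
    op_module = tf.load_op_library(so_lib_path)
    angular_softmax = op_module.angular_softmax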
    
    @ops.RegisterGradient("AngularSoftmax")
    def _angular_softmax_grad(op, grad, _):
      '''The gradients for `AngularSoftmax`.
      '''
      inputs_features = op.inputs[0]
      inputs_weights = op.inputs[1]
      inputs_labels = op.inputs[2]
      cur_lambda = op.outputs[1]
      margin_order = op.get_attr('margin_order')
    
      grads = op_module.angular_softmax_grad(inputs_features, inputs_weights, inputs_labels, grad, cur_lambda[0], margin_order)
      return [grads[0], grads[1], None, None]
    
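    # initial_value, features, labels and global_step come from your own model code
    var_weights = tf.Variable(initial_value, trainable=True, name='asoftmax_weights')
    # the weights are L2-normalized along axis 1 before being passed to the Op
    normed_var_weights = tf.nn.l2_normalize(var_weights, 1, 1e-10, name='weights_normed')
    result = angular_softmax(features, normed_var_weights, labels, global_step, 4, 1000., 0.000025, 35., 0.)
    # result[0] is used as the logits of a standard cross-entropy loss
    loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))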
    
    • For the others, just refer to this script.
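
The trailing numeric arguments in the calls above (`4, 1000., 0.000025, 35., 0.`) are not named in the snippets. Assuming they correspond, in order, to the margin order and to the `base`, `gamma`, `power` and `lambda_min` parameters of the original Caffe large-margin layer, the blending factor `lambda` would decay with the training step roughly as sketched below. The function name and parameter names are illustrative assumptions, not part of this repository's API.

```python
def annealed_lambda(step, base=1000.0, gamma=0.000025, power=35.0, lambda_min=0.0):
    """Assumed schedule: lambda = max(lambda_min, base * (1 + gamma * step) ** -power)."""
    value = base * (1.0 + gamma * step) ** (-power)
    # lambda never drops below the configured floor
    return max(lambda_min, value)


# Print the assumed schedule at a few training steps
for step in (0, 1000, 10000, 100000):
    print(step, annealed_lambda(step))
```

Under this assumption, a large `lambda` early in training keeps the logits close to those of a plain softmax layer, and the margin term gains weight as `lambda` decays.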

All the code was tested under TensorFlow 1.6, Python 3.5, and Ubuntu 16.04 with CUDA 8.0. The outputs of the C++ Ops were compared with the outputs of the original Caffe code, and the differences were negligible. The gradients of these Ops were checked using tf.test.compute_gradient_error and tf.test.compute_gradient. The other losses are implemented with Python Ops, following the official implementations.
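
A gradient check of this kind compares an analytic gradient with a finite-difference estimate. The sketch below illustrates the idea in plain Python on a toy cosine function; it is not the code in `test_op.py`, it does not call TensorFlow, and all helper names are hypothetical.

```python
import math

W = [0.6, 0.8]  # toy unit-length "class weight" vector (an assumption for illustration)


def cos_theta(x):
    """Cosine of the angle between x and the unit vector W."""
    dot = sum(a * w for a, w in zip(x, W))
    return dot / math.sqrt(sum(a * a for a in x))


def cos_theta_grad(x):
    """Analytic gradient of cos_theta with respect to x."""
    norm = math.sqrt(sum(a * a for a in x))
    dot = sum(a * w for a, w in zip(x, W))
    return [w / norm - dot * a / norm ** 3 for a, w in zip(x, W)]


def numeric_grad(f, xs, eps=1e-6):
    """Central-difference estimate of the gradient of f at xs."""
    grads = []
    for i in range(len(xs)):
        hi = list(xs)
        lo = list(xs)
        hi[i] += eps
        lo[i] -= eps
        grads.append((f(hi) - f(lo)) / (2.0 * eps))
    return grads


def max_gradient_error(f, analytic_grad, xs, eps=1e-6):
    """Largest absolute gap between the analytic and numeric gradients."""
    numeric = numeric_grad(f, xs, eps)
    analytic = analytic_grad(xs)
    return max(abs(a - n) for a, n in zip(analytic, numeric))


print(max_gradient_error(cos_theta, cos_theta_grad, [1.5, -2.0]))
```

A correct analytic gradient yields an error close to floating-point noise, while a wrong one yields an error of the same order as the gradient itself.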

If you encounter linkage problems when building or loading the *.so, it is highly recommended that you read this section of the official tutorial to make sure you are using the same C++ ABI version.

Contributions to this repo are welcome.
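
One way to see which C++ ABI the installed TensorFlow was built with is to inspect its compile flags. The sketch below assumes `tf.sysconfig.get_compile_flags()` is available in your TensorFlow version; the helpers `cxx11_abi_value` and `installed_tensorflow_abi` are hypothetical names introduced here and are not part of this repository.

```python
def cxx11_abi_value(flags):
    """Return the value of -D_GLIBCXX_USE_CXX11_ABI in a list of compiler flags, or None."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in flags:
        if flag.startswith(prefix):
            return flag[len(prefix):]
    return None


def installed_tensorflow_abi():
    """Look up the ABI flag of the installed TensorFlow (requires TensorFlow)."""
    import tensorflow as tf  # imported lazily so cxx11_abi_value works without it
    return cxx11_abi_value(tf.sysconfig.get_compile_flags())
```

Calling `installed_tensorflow_abi()` in the environment where you load the Op should return `"0"` or `"1"` if the flag is present; building the *.so with the same `-D_GLIBCXX_USE_CXX11_ABI` value is the usual way to avoid undefined-symbol errors at load time.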

MIT License