Large-Margin Softmax Loss, Angular Softmax Loss, Additive Margin Softmax, ArcFace Loss and Focal Loss in TensorFlow
May 29, 2018 · View on GitHub
This repository contains the core code of TensorFlow reimplementations of the following papers:
- Large-Margin Softmax Loss for Convolutional Neural Networks
- SphereFace: Deep Hypersphere Embedding for Face Recognition
- Additive Margin Softmax for Face Verification or CosFace: Large Margin Cosine Loss for Deep Face Recognition
- ArcFace: Additive Angular Margin Loss for Deep Face Recognition
- Focal Loss for Dense Object Detection
If your goal is to reproduce the results in the original papers, please use the official code:
- Large Margin Softmax Loss in ICML 2016
- Angular Softmax Loss in CVPR 2017
- Additive Margin Softmax
- ArcFace: Additive Angular Margin Loss
- Focal Loss in ICCV 2017
To use these Ops on your own machine:

- copy the header file "cuda_config.h" from "your_python_path/site-packages/external/local_config_cuda/cuda/cuda/cuda_config.h" to "your_python_path/site-packages/tensorflow/include/tensorflow/stream_executor/cuda/cuda_config.h"
- run the following script:

```sh
mkdir build
cd build && cmake ..
make
```

- run "test_op.py" and check the numeric errors to verify your installation
- follow the code snippets below to integrate the Op into your own code:
- For Large Margin Softmax Loss:
```python
import tensorflow as tf
from tensorflow.python.framework import ops

op_module = tf.load_op_library(so_lib_path)
large_margin_softmax = op_module.large_margin_softmax

# register the gradient so that tf.gradients can back-propagate through the Op
@ops.RegisterGradient("LargeMarginSoftmax")
def _large_margin_softmax_grad(op, grad, _):
    '''The gradients for `LargeMarginSoftmax`.'''
    inputs_features = op.inputs[0]
    inputs_weights = op.inputs[1]
    inputs_labels = op.inputs[2]
    cur_lambda = op.outputs[1]
    margin_order = op.get_attr('margin_order')

    grads = op_module.large_margin_softmax_grad(inputs_features, inputs_weights,
                                                inputs_labels, grad,
                                                cur_lambda[0], margin_order)
    return [grads[0], grads[1], None, None]

var_weights = tf.Variable(initial_value, trainable=True, name='lsoftmax_weights')
result = large_margin_softmax(features, var_weights, labels, global_step,
                              4, 1000., 0.000025, 35., 0.)
# result[0] holds the modified logits, which go through the usual cross-entropy
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))
```

- For Angular Softmax Loss:

```python
op_module = tf.load_op_library(so_lib_path)
angular_softmax = op_module.angular_softmax

@ops.RegisterGradient("AngularSoftmax")
def _angular_softmax_grad(op, grad, _):
    '''The gradients for `AngularSoftmax`.'''
    inputs_features = op.inputs[0]
    inputs_weights = op.inputs[1]
    inputs_labels = op.inputs[2]
    cur_lambda = op.outputs[1]
    margin_order = op.get_attr('margin_order')

    grads = op_module.angular_softmax_grad(inputs_features, inputs_weights,
                                           inputs_labels, grad,
                                           cur_lambda[0], margin_order)
    return [grads[0], grads[1], None, None]

var_weights = tf.Variable(initial_value, trainable=True, name='asoftmax_weights')
# SphereFace requires the class weights to be L2-normalised
normed_var_weights = tf.nn.l2_normalize(var_weights, 1, 1e-10, name='weights_normed')
result = angular_softmax(features, normed_var_weights, labels, global_step,
                         4, 1000., 0.000025, 35., 0.)
loss = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=result[0]))
```

- For the others (Additive Margin Softmax / CosFace, ArcFace and Focal Loss), just refer to this script; a rough sketch of these losses in plain Python Ops is also given below.
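As a rough illustration of how those remaining losses can be written directly in Python Ops, here is a minimal sketch of the additive-margin (CosFace / AM-Softmax) and additive angular-margin (ArcFace) logits. The function name, weight shape, initializer and the default values of the scale `s` and margin `m` are my own choices for this sketch, not values taken from this repository:

```python
import tensorflow as tf

def margin_softmax_logits(features, labels, num_classes, s=30.0, m=0.35,
                          arc_margin=False, scope='margin_softmax'):
    """Sketch of CosFace/AM-Softmax and ArcFace logits in plain TF ops.
    `s` (scale) and `m` (margin) are typical paper values, not this repo's."""
    with tf.variable_scope(scope):
        embedding_dim = features.get_shape().as_list()[-1]
        weights = tf.get_variable(
            'weights', [embedding_dim, num_classes],
            initializer=tf.truncated_normal_initializer(stddev=0.01))
        # cos(theta) between L2-normalised features and class weights
        normed_features = tf.nn.l2_normalize(features, 1)
        normed_weights = tf.nn.l2_normalize(weights, 0)
        cos_theta = tf.matmul(normed_features, normed_weights)

        if arc_margin:
            # ArcFace: use cos(theta + m) for the target class
            theta = tf.acos(tf.clip_by_value(cos_theta, -1.0 + 1e-7, 1.0 - 1e-7))
            target_logits = tf.cos(theta + m)
        else:
            # CosFace / AM-Softmax: use cos(theta) - m for the target class
            target_logits = cos_theta - m

        one_hot = tf.one_hot(labels, depth=num_classes)
        # apply the margin only at the ground-truth class, then rescale by s
        logits = s * tf.where(tf.cast(one_hot, tf.bool), target_logits, cos_theta)
        return logits

# the scaled logits then go through the usual cross-entropy:
# loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
#     labels=labels, logits=margin_softmax_logits(features, labels, num_classes)))
```

And a similarly simplified sketch of the focal loss for multi-class classification; the paper applies the loss to per-class sigmoid outputs for dense detection, so this version only illustrates the modulating-factor idea, with `gamma` and `alpha` as assumed defaults:

```python
def focal_loss(logits, labels, gamma=2.0, alpha=0.25):
    """Sketch of a multi-class focal loss; `gamma` and `alpha` follow the
    paper's defaults, not necessarily the values used in this repository."""
    one_hot = tf.one_hot(labels, depth=logits.get_shape().as_list()[-1])
    probs = tf.nn.softmax(logits)
    # probability assigned to the true class
    p_t = tf.reduce_sum(one_hot * probs, axis=1)
    # down-weight well-classified examples with (1 - p_t)^gamma
    modulating_factor = tf.pow(1.0 - p_t, gamma)
    ce = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    return tf.reduce_mean(alpha * modulating_factor * ce)
```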
All the code was tested under TensorFlow 1.6, Python 3.5 and Ubuntu 16.04 with CUDA 8.0. The outputs of the C++ Ops have been compared against those of the original Caffe code, and the deviation is negligible. The gradients of these Ops have been checked using tf.test.compute_gradient_error and tf.test.compute_gradient. The remaining losses are implemented in plain Python Ops, following the official implementations.
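For reference, a gradient check along those lines might look roughly like the following; the shapes, dtypes and attribute values are illustrative guesses only, and the actual checks live in "test_op.py":

```python
import numpy as np
import tensorflow as tf

# Sketch only: assumes the .so has been built and `large_margin_softmax` has
# been loaded and its gradient registered exactly as in the snippet above.
with tf.Session() as sess:
    # illustrative shapes/dtypes: 8 samples, 128-d features, 10 classes
    features = tf.constant(np.random.randn(8, 128), dtype=tf.float32)
    weights = tf.constant(np.random.randn(10, 128), dtype=tf.float32)
    labels = tf.constant(np.random.randint(0, 10, size=(8,)), dtype=tf.int32)
    global_step = tf.constant(0, dtype=tf.int64)

    result = large_margin_softmax(features, weights, labels, global_step,
                                  4, 1000., 0.000025, 35., 0.)

    # compare the registered analytic gradient against a numeric estimate
    err = tf.test.compute_gradient_error([features, weights],
                                         [(8, 128), (10, 128)],
                                         result[0], (8, 10))
    print('max gradient error:', err)
```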
If you encounter linkage problems when building or loading the *.so, you are strongly encouraged to read this section of the official tutorial and make sure you are using the same C++ ABI version.
Contributions to this repo are welcome.
MIT License