Surveillance and security scenarios usually require highly efficient facial image compression schemes for face recognition and identification. However, both traditional general-purpose image codecs and specialized facial image compression schemes only refine the codec heuristically and separately according to a face verification accuracy metric. We propose a Learning based Facial Image Compression (LFIC) framework with a novel Regionally Adaptive Pooling (RAP) module whose parameters can be automatically optimized according to gradient feedback from an integrated hybrid semantic fidelity metric, including a successful exploration of applying a Generative Adversarial Network (GAN) directly as a metric in an image compression scheme. Experimental results verify the framework's efficiency, demonstrating bitrate savings of 71.41%, 48.28% and 52.67% over JPEG2000, WebP and neural network-based codecs, respectively, under the same face verification accuracy distortion metric. We also evaluate LFIC against the latest dedicated facial image codecs and observe superior performance gains. Visual experiments further offer some interesting insights into how LFIC automatically captures the information in critical areas based on semantic distortion metrics for optimized compression, which is quite different from the heuristic optimization used in traditional image compression algorithms.
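
A bitrate saving "under the same face verification accuracy" can be read as follows: for a target accuracy, find the bitrate each codec needs to reach it, then compare the two bitrates. The Python sketch below illustrates one simple way to do this. The function names (`rate_at_accuracy`, `bitrate_saving`), the use of linear interpolation between measured points, and every numeric value are illustrative assumptions; they are not the evaluation protocol or the data of this work.

```python
import numpy as np


def rate_at_accuracy(rates, accuracies, target):
    """Linearly interpolate the bitrate at which a codec reaches `target` accuracy.

    Assumes accuracy increases with bitrate and that `target` lies inside
    the measured accuracy range (hypothetical helper, not from the paper).
    """
    rates = np.asarray(rates, dtype=float)
    accuracies = np.asarray(accuracies, dtype=float)
    order = np.argsort(accuracies)  # np.interp needs increasing x values
    return float(np.interp(target, accuracies[order], rates[order]))


def bitrate_saving(rate_test, rate_ref):
    """Percentage of bitrate saved by the test codec relative to the reference."""
    return 100.0 * (1.0 - rate_test / rate_ref)


# Made-up rate/accuracy points (bits per pixel, verification accuracy).
ref_rates, ref_acc = [0.10, 0.20, 0.40, 0.80], [0.80, 0.90, 0.95, 0.98]
test_rates, test_acc = [0.05, 0.10, 0.20, 0.40], [0.85, 0.92, 0.96, 0.985]

target = 0.95
r_ref = rate_at_accuracy(ref_rates, ref_acc, target)
r_test = rate_at_accuracy(test_rates, test_acc, target)
saving = bitrate_saving(r_test, r_ref)
print(f"bitrate saving at accuracy {target}: {saving:.2f}%")
```

Averaging such savings over a range of target accuracies, rather than a single point, would give a more stable summary figure; which of these the reported numbers correspond to is not stated here.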