### Image quality evaluation

An image-quality metric compares a source image to a reconstruction. Mean squared error (MSE) and signal-to-noise ratio (SNR) seem like natural image-quality metrics, but both are a poor match to human perceptual judgments. The multiscale structural similarity (MSSIM) measure was developed to better capture *perceived* image quality (Wang, Simoncelli, & Bovik, 2003). MSSIM has been widely adopted as a measure for evaluating image compression schemes, and more recently we have shown that using MSSIM as a training loss in a deep network for image synthesis can produce superior results, as judged by human evaluators (Snell, Ridgeway, Liao, Roads, Mozer, & Zemel, 2017). However, we find that while MSSIM correlates better with human judgments than MSE or SNR, it is far from a perfect predictor of the ultimate arbiter of image quality: the human perceptual system.
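To make the limitation of MSE and SNR concrete, here is a minimal sketch in Python with NumPy. It is an illustration, not part of the project's code. The toy image, the offset `c`, and the helper names `mse` and `snr_db` are all assumptions made for this example. It builds two distortions of the same image, a uniform brightness shift and a high-frequency checkerboard pattern, that receive the same MSE and SNR even though they are structurally very different.

```python
import numpy as np

def mse(reference, distorted):
    """Mean squared error between two equally shaped images."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(distorted, dtype=np.float64)
    return float(np.mean(diff ** 2))

def snr_db(reference, distorted):
    """Signal-to-noise ratio in decibels: signal power over error power."""
    reference = np.asarray(reference, dtype=np.float64)
    signal_power = float(np.mean(reference ** 2))
    return float(10.0 * np.log10(signal_power / mse(reference, distorted)))

# Hypothetical 8x8 grayscale image: a diagonal ramp with values in [0.25, 0.75].
rows, cols = np.indices((8, 8))
image = 0.25 + 0.5 * (rows + cols) / 14.0

c = 0.1  # distortion amplitude (an arbitrary choice for this sketch)

# Distortion 1: every pixel is brightened by the same amount.
shifted = image + c

# Distortion 2: neighbouring pixels are pushed up and down alternately.
checker = image + c * np.where((rows + cols) % 2 == 0, 1.0, -1.0)

# Both distortions change every pixel by exactly |c|, so MSE and SNR agree.
mse_shift, mse_checker = mse(image, shifted), mse(image, checker)
snr_shift, snr_checker = snr_db(image, shifted), snr_db(image, checker)
print(mse_shift, mse_checker)
print(snr_shift, snr_checker)
```

Because both metrics depend only on per-pixel error magnitude, they cannot tell these two distortions apart. Structure-aware measures such as MSSIM are intended to address this kind of case.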

Our goal is to leverage and improve on MSSIM, via deep learning and human judgment data, to obtain an image-quality metric that truly is a proxy for the human perceptual system. Such an improved metric should be useful for tasks such as image super-resolution, denoising, compression, synthesis via CycleGANs (Zhu, Park, Isola, & Efros, 2017), and embeddings via VAEs.

Our approach is as follows. First, we cache the MSSIM measure in a convolutional neural net that takes two images as input and outputs a quality score. The structure of the net (receptive fields, the use of multiscale representations, etc.) is designed to ensure that the net can replicate MSSIM scores. Second, we provide additional training to the net by way of human triplet judgments of the form "original image O is better approximated by reconstruction A than by reconstruction B." These judgments provide a soft constraint on a Siamese net, namely, Score(O,A) > Score(O,B).
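One common way to turn the constraint Score(O,A) > Score(O,B) into a trainable objective is a hinge-style ranking loss. The sketch below is illustrative only: the text does not specify the loss actually used, and the function name `triplet_ranking_loss`, the `margin` value, and the example scores are assumptions. In a Siamese setup, both scores would come from the same network applied to the pairs (O, A) and (O, B).

```python
import numpy as np

def triplet_ranking_loss(score_oa, score_ob, margin=0.1):
    """Hinge penalty on human triplet judgments.

    score_oa: scores for (original, preferred reconstruction A) pairs.
    score_ob: scores for (original, rejected reconstruction B) pairs.
    The penalty for a triplet is zero once Score(O,A) exceeds
    Score(O,B) by at least `margin`, which makes this a soft constraint.
    """
    score_oa = np.asarray(score_oa, dtype=np.float64)
    score_ob = np.asarray(score_ob, dtype=np.float64)
    per_triplet = np.maximum(0.0, margin - (score_oa - score_ob))
    return float(np.mean(per_triplet))

# Hypothetical scores for three triplets in which humans preferred A.
score_oa = np.array([0.9, 0.6, 0.4])
score_ob = np.array([0.5, 0.55, 0.7])

# Triplet 1 satisfies the constraint with room to spare (no penalty),
# triplet 2 satisfies it but by less than the margin (small penalty),
# triplet 3 violates it (largest penalty).
loss = triplet_ranking_loss(score_oa, score_ob)
print(loss)
```

The margin keeps the network from satisfying the constraint with arbitrarily small score differences. A logistic (pairwise cross-entropy) loss on the score difference would be a reasonable alternative.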

#### Students

Michael Neuder (Computer Science, Boulder)