In this paper we propose a new method to automatically select the rank of linear transforms during supervised learning. Our approach relies on a sparsity-enforcing element-wise soft-thresholding operation applied after the linear transform. This approach to supervised rank learning has the important advantage that it is very simple to implement and incurs no extra complexity relative to linear transform learning. Furthermore, we propose a simple Stochastic Gradient Descent (SGD) implementation suitable for large-scale learning, where SGD solvers have established themselves as the default workhorse. We compare our method to various other metric learning techniques in the application of image retrieval, one of the few remaining areas where supervised learning of low-rank linear transforms has not been fully exploited. The main reason is the lack of sufficiently large datasets. We therefore also introduce a new dataset consisting of groups of matching images derived from Cable News Network (CNN) videos, built using geometric verification and manual selection to find matching frames with adequate variability.
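As a rough illustration of the central operation, the sketch below applies element-wise soft-thresholding to the output of a linear transform and reports which output coordinates remain non-zero on a few inputs. It is a minimal sketch under stated assumptions, not the implementation proposed in this paper: the names `soft_threshold`, `transform` and `active_rows`, the fixed threshold `lam`, and the toy matrix `W` are all hypothetical, and no training or SGD step is shown. One way to read the connection to rank, assumed here, is that an output coordinate that is zero for every input contributes nothing, so the corresponding row of `W` can be dropped.

```python
def soft_threshold(z, lam):
    """Element-wise soft-thresholding: shrink each entry toward zero by lam."""
    out = []
    for v in z:
        mag = abs(v) - lam
        if mag <= 0:
            out.append(0.0)          # entries within [-lam, lam] become exactly zero
        else:
            out.append(mag if v > 0 else -mag)
    return out


def linear(W, x):
    """Plain matrix-vector product W @ x for a list-of-rows matrix."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def transform(W, x, lam):
    """Linear transform followed by soft-thresholding (assumed ordering)."""
    return soft_threshold(linear(W, x), lam)


def active_rows(W, samples, lam):
    """Indices of output coordinates that are non-zero on at least one sample."""
    active = set()
    for x in samples:
        for i, v in enumerate(transform(W, x, lam)):
            if v != 0.0:
                active.add(i)
    return sorted(active)


# Hypothetical toy data: the third row of W has small weights, so its
# outputs never exceed the threshold on these samples.
W = [[1.0, 0.0], [0.0, 2.0], [0.1, 0.1]]
samples = [[1.0, 1.0], [-2.0, 0.5], [0.5, -1.0]]
lam = 0.5

print(active_rows(W, samples, lam))
```

In this toy setting only the first two output coordinates survive thresholding, so the third row of `W` could be removed without changing any output on these samples.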