Adaptive transform learning schemes have been extensively studied in the literature with the goal of achieving better compression efficiency than the widely used Discrete Cosine Transform (DCT) inside a video codec. These transforms are learned offline on a large training set and are tested either in competition with or in place of the core transforms, i.e. the DCT. In our previous work, we proposed an alternative approach in which a set of non-separable content-adaptive transforms is learned on-the-fly on a sequence and then tested on the same sequence in competition with the core transforms. In this paper, we further improve this learning scheme by improving the coding of the transformed coefficients in the context of online transform learning. The first proposed method improves the convergence of the learning scheme by re-ordering the transformed coefficients at each iteration. The second proposed method improves compression efficiency by modifying the coding of the last significant coefficient position in the context of adaptive transforms. The results show that combining the two proposed methods yields a 1.2% gain on top of the previously proposed scheme.
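To illustrate the kind of re-ordering step described above, the following is a minimal sketch, not the paper's exact procedure: the rows of a learned transform (and the corresponding coefficients) are sorted by decreasing average coefficient energy, so that high-energy coefficients appear first, which typically suits entropy coders that expect energy to decay with scan position. The function name `reorder_by_energy` and the matrix shapes are assumptions made for this example.

```python
import numpy as np

def reorder_by_energy(T, C):
    """Sketch of a coefficient re-ordering step (hypothetical helper,
    not the paper's exact algorithm).

    T : (K, N) learned transform, one basis vector per row
    C : (K, B) transformed coefficients of B blocks under T

    Rows of T and C are permuted so that the average coefficient
    energy is non-increasing from the first row to the last.
    """
    energy = np.mean(C ** 2, axis=1)      # average energy per coefficient index
    order = np.argsort(energy)[::-1]      # indices sorted high-energy first
    return T[order], C[order]

# Toy example: 4 basis vectors, 3 blocks.
rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
C = np.array([[0.1, 0.2, 0.1],
              [3.0, 2.5, 3.2],
              [0.5, 0.4, 0.6],
              [1.0, 1.1, 0.9]])
T_sorted, C_sorted = reorder_by_energy(T, C)
```

After the call, the row of `C` with the largest average energy (here the second row) is moved to the front, and the rows of `T` follow the same permutation so that reconstruction is unchanged.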