This paper addresses the estimation of accurate long-term dense motion fields from videos of complex scenes. With computer vision applications such as video editing in mind, we exploit optical flows estimated with various inter-frame distances and combine them through multi-step integration and statistical selection (MISS). In this context, managing the numerous combinations of multi-step optical flows requires a complexity reduction scheme to overcome computational and memory issues. Our contribution is two-fold. First, we provide an exhaustive analysis of available single-reference complexity reduction strategies. Second, we propose a simple and efficient alternative, multi-reference-frame multi-step integration and statistical selection (MR-MISS). Our method automatically inserts intermediate reference frames once matching failures are detected, restarts the motion estimation process from these frames, and re-correlates the resulting dense trajectories. In this way, it obtains accurate displacement fields over longer temporal ranges while efficiently reducing complexity. Experiments on challenging sequences show improved results compared to state-of-the-art methods, including existing MISS schemes, in terms of both complexity reduction and accuracy.
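To make the idea of multi-step integration and statistical selection concrete, the following is a minimal sketch for a single tracked point. It is an illustration under stated assumptions, not the implementation evaluated in this paper: the callable `flow(a, b, p)`, the function `integrate_and_select`, the step set `(1, 2, 3)`, and the toy motion model are all hypothetical, and the selection criterion (candidate closest to the component-wise median) is a simple stand-in for the statistical selection step.

```python
from statistics import median


def integrate_and_select(flow, x0, n_frames, steps=(1, 2, 3)):
    """Track one 2D point from frame 0 through frame n_frames - 1.

    flow(a, b, p) is a hypothetical callable returning the (dx, dy)
    displacement of position p between frames a and b.
    """
    # disp[n] is the selected displacement of x0 from frame 0 to frame n.
    disp = [(0.0, 0.0)]
    for n in range(1, n_frames):
        candidates = []
        for s in steps:
            if s > n:
                continue
            # Multi-step integration: reuse the displacement already
            # selected for frame n - s, then add one flow of step s.
            px = x0[0] + disp[n - s][0]
            py = x0[1] + disp[n - s][1]
            dx, dy = flow(n - s, n, (px, py))
            candidates.append((disp[n - s][0] + dx, disp[n - s][1] + dy))
        # Stand-in for statistical selection (an assumption): keep the
        # candidate closest to the component-wise median of all candidates.
        mx = median(c[0] for c in candidates)
        my = median(c[1] for c in candidates)
        disp.append(
            min(candidates, key=lambda c: (c[0] - mx) ** 2 + (c[1] - my) ** 2)
        )
    return disp


def toy_flow(a, b, p):
    """Toy motion: constant velocity (1.0, 0.5) per frame.

    Step-3 flows are deliberately corrupted to mimic a matching failure.
    """
    k = b - a
    dx, dy = 1.0 * k, 0.5 * k
    if k == 3:
        dx, dy = dx + 5.0, dy + 5.0
    return dx, dy


trajectory = integrate_and_select(toy_flow, x0=(10.0, 20.0), n_frames=5)
```

In this toy setting, the corrupted step-3 candidates are outvoted by the consistent step-1 and step-2 candidates, so the selected displacements follow the true constant-velocity motion. The number of candidates grows with the number of steps and frames, which is the combinatorial cost that a complexity reduction scheme has to control.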