Generating complex discrete distributions remains as one of the challenging problems in machine learning. Existing techniques for generating complex distributions with high degrees of freedom depend on standard generative models like Generative Adversarial Networks (GAN), Wasserstein GAN, and associated variations. Such models are based on an optimization involving the distance between two continuous distributions. We introduce a Discrete Wasserstein GAN (DWGAN) model which is based on a dual formulation of the Wasserstein distance between two discrete distributions. We derive a novel training algorithm and corresponding network architecture based on the formulation. Experimental results are provided for both synthetic discrete data, and real discretized data from MNIST handwritten digits.