Memorability of media content such as images and videos has recently become an important research subject in computer vision. This paper presents our computation model for predicting image memorability, which is based on a deep learning architecture designed for a classification task. We exploit the use of both convolutional neural network (CNN) - based visual features and semantic features related to image captioning for the task. We train and test our model on the large-scale benchmarking memorability dataset: LaMem. Experiment result shows that the proposed computational model obtains better prediction performance than the state of the art, and even outperforms human consistency. We further investigate the genericity of our model on other memorability datasets. Finally, by validating the model on interestingness datasets, we reconfirm the uncorrelation between memorability and interestingness of images.