Although computer vision models designed for natural object recognition often transfer successfully to other domains, mammogram classification poses a distinct challenge: the images are high-resolution, and the decision depends on small local regions. This was particularly relevant in the Digital Mammography DREAM Challenge, where only image-level labels were available for training and the dataset was highly imbalanced. To overcome these difficulties, we implemented a two-stage training scheme: we first trained a patch classifier on a popular public dataset containing lesion segmentation masks, then trained at the image level on the public and DREAM datasets. The image-level model was built by globally aggregating features from the patch model, which was applied in a scanning-window (i.e., convolutional) fashion. Initialized with the patch-trained weights, the global model trained efficiently end-to-end, achieving an AUROC of up to 0.87 in the competition.
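The scanning-window idea can be illustrated with a toy, dependency-free sketch. It is not the model described above. The names `toy_patch_score`, `scan_patches`, and `image_score` are hypothetical, the patch scorer is a stand-in (mean intensity) for a trained patch classifier, and global max pooling is an assumed choice of aggregation. The actual model aggregates learned features rather than scalar scores and is trained end-to-end in a deep learning framework.

```python
from typing import Callable, List, Sequence

Image = Sequence[Sequence[float]]


def toy_patch_score(patch: Image) -> float:
    """Hypothetical stand-in for a trained patch classifier.

    Returns the mean pixel intensity clipped to [0, 1].
    """
    total = sum(sum(row) for row in patch)
    count = sum(len(row) for row in patch)
    return min(1.0, max(0.0, total / count))


def scan_patches(
    image: Image,
    patch_size: int,
    stride: int,
    patch_model: Callable[[Image], float],
) -> List[List[float]]:
    """Slide the patch model over the image and collect a 2-D score map."""
    height, width = len(image), len(image[0])
    score_map = []
    for top in range(0, height - patch_size + 1, stride):
        row_scores = []
        for left in range(0, width - patch_size + 1, stride):
            # Cut out one square window and score it with the patch model.
            patch = [
                row[left:left + patch_size]
                for row in image[top:top + patch_size]
            ]
            row_scores.append(patch_model(patch))
        score_map.append(row_scores)
    return score_map


def image_score(
    image: Image,
    patch_size: int,
    stride: int,
    patch_model: Callable[[Image], float] = toy_patch_score,
) -> float:
    """Aggregate the score map into one image-level score.

    Global max pooling is an assumption made for this sketch.
    """
    score_map = scan_patches(image, patch_size, stride, patch_model)
    return max(max(row) for row in score_map)
```

Because the aggregation step reduces the whole score map to a single value, an image-level label is enough to supervise the second training stage. A small localized region can still dominate the prediction, which reflects the local basis of the decision.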