Detection and segmentation of people within a scene have primarily been applied to indoor imagery for surveillance systems and to outdoor scenes for pedestrian detection. This paper proposes applying a similar semantic segmentation model to segment patients in the neonatal intensive care unit (NICU) during video-based monitoring. The segmentation will serve as part of a non-contact, non-invasive, and unobtrusive system that monitors neonates by acquiring a relevant region-of-interest from overhead RGB-D video. Situations typical of the NICU environment are examined to ensure that the solution generalizes to all patient scenarios. Transfer learning is applied to a pre-trained convolutional neural network using data from three different patients. Promising results are observed when the model is tested on a new patient. A final testing accuracy of 93% demonstrates the potential of such an algorithm to automatically determine a suitable region-of-interest for video-based patient monitoring.