Semi-supervised learning for photometric supernova classification

Joseph W. Richards*, Darren Homrighausen, Peter E. Freeman, Chad M. Schafer, Dovi Poznanski

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review


We present a semi-supervised method for photometric supernova typing. Our approach is to first use the non-linear dimension reduction technique diffusion map to detect structure in a database of supernova light curves and subsequently employ random forest classification on a spectroscopically confirmed training set to learn a model that can predict the type of each newly observed supernova. We demonstrate that this is an effective method for supernova typing. As supernova numbers increase, our semi-supervised method efficiently utilizes this information to improve classification, a property not enjoyed by template-based methods. Applied to supernova data simulated by Kessler et al. to mimic those of the Dark Energy Survey, our methods achieve (cross-validated) 95 per cent Type Ia purity and 87 per cent Type Ia efficiency on the spectroscopic sample, but only 50 per cent Type Ia purity and 50 per cent efficiency on the photometric sample, owing to the spectroscopic follow-up strategy of those simulations. To improve the performance on the photometric sample, we search for better spectroscopic follow-up procedures by studying the sensitivity of our machine-learned supernova classification to the specific strategy used to obtain training sets. With a fixed amount of spectroscopic follow-up time, we find that, despite collecting data on a smaller number of supernovae, deeper magnitude-limited spectroscopic surveys are better for producing training sets. For supernova Ia (II-P) typing, we obtain a 44 per cent (1 per cent) increase in purity to 72 per cent (87 per cent) and a 30 per cent (162 per cent) increase in efficiency to 65 per cent (84 per cent) of the sample using a 25th (24.5th) magnitude-limited survey instead of the shallower spectroscopic sample used in the original simulations. When redshift information is available, we incorporate it into our analysis using a novel method of altering the diffusion map representation of the supernovae. Incorporating host redshifts leads to a 5 per cent improvement in Type Ia purity and 13 per cent improvement in Type Ia efficiency.
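
As a rough illustration of the two ideas the abstract relies on, the sketch below computes a basic diffusion map embedding from a pairwise distance matrix and evaluates purity and efficiency for one class. It is not the authors' implementation. The function names (`diffusion_map`, `purity_efficiency`), the Gaussian kernel with bandwidth `eps`, the diffusion time `t`, and the toy one-dimensional data are all assumptions made for illustration; the abstract does not say how distances between light curves are computed. Purity is taken here in the conventional sense as the fraction of objects labelled Ia that truly are Ia, and efficiency as the fraction of true Ia that are recovered.

```python
import numpy as np


def diffusion_map(dist, eps, n_coords=2, t=1):
    """Embed n objects given an (n, n) matrix of pairwise distances.

    Hypothetical helper: Gaussian kernel, row-normalised Markov matrix,
    coordinates are eigenvalue**t times the non-trivial right eigenvectors.
    """
    w = np.exp(-dist ** 2 / eps)             # kernel weights
    deg = w.sum(axis=1)                      # row sums (degrees)
    # Symmetric conjugate of the Markov matrix, so eigh can be used.
    s = w / np.sqrt(np.outer(deg, deg))
    evals, evecs = np.linalg.eigh(s)
    order = np.argsort(evals)[::-1]          # sort eigenvalues descending
    evals, evecs = evals[order], evecs[:, order]
    psi = evecs / np.sqrt(deg)[:, None]      # right eigenvectors of the Markov matrix
    # Drop the first (trivial, constant) eigenvector.
    return (evals[1:n_coords + 1] ** t) * psi[:, 1:n_coords + 1]


def purity_efficiency(y_true, y_pred, target="Ia"):
    """Return (purity, efficiency) for the class named `target`."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_pred == target) & (y_true == target))
    n_pred = np.sum(y_pred == target)        # objects classified as target
    n_true = np.sum(y_true == target)        # objects that truly are target
    purity = tp / n_pred if n_pred else float("nan")
    efficiency = tp / n_true if n_true else float("nan")
    return purity, efficiency


# Toy data (assumed): two groups of points on a line stand in for light curves.
x = np.array([0.0, 0.5, 1.0, 3.0, 3.5, 4.0])
dist = np.abs(x[:, None] - x[None, :])
coords = diffusion_map(dist, eps=1.0, n_coords=2)

purity, efficiency = purity_efficiency(
    ["Ia", "Ia", "Ia", "II"], ["Ia", "Ia", "II", "II"]
)
```

In a full pipeline, the rows of `coords` for the spectroscopically confirmed objects would be passed as features to a classifier (for example scikit-learn's `RandomForestClassifier`), and the fitted model would then be applied to the remaining objects. That step is omitted here to keep the sketch dependent on NumPy alone.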

Original language: English
Pages (from-to): 1121-1135
Number of pages: 15
Journal: Monthly Notices of the Royal Astronomical Society
Issue number: 2
State: Published - Jan 2012
Externally published: Yes


  • Methods: data analysis
  • Methods: statistical
  • Supernovae: general
  • Surveys
  • Techniques: photometric

