The use of machine learning (ML), and in particular artificial neural networks (ANNs), in engineering applications has increased dramatically in recent years. However, by and large, the development of such applications, or the reporting of them, lacks proper evaluation. Deficient evaluation practice was observed in the general neural networks community and again in engineering applications, through a survey we conducted of articles published in AI in Engineering and elsewhere. This state of affairs hinders understanding and prevents progress. The goal of this article is to remedy this situation. First, several evaluation methods are discussed along with their relative qualities. Second, these qualities are illustrated by using the methods to evaluate ANN performance on two engineering problems. Third, a systematic evaluation procedure for ML is discussed. This procedure will lead to better evaluation of studies, and consequently to improved research and practice in the area of ML in engineering applications.