We present new results, both positive and negative, on the well-studied problem of learning disjunctive normal form (DNF) expressions. We first prove that an algorithm due to Kushilevitz and Mansour can be used to weakly learn DNF using membership queries in polynomial time, with respect to the uniform distribution on the inputs. This is the first positive result for learning unrestricted DNF expressions in polynomial time in any nontrivial formal model of learning. It provides a sharp contrast with the results of Kharitonov, who proved that AC0 is not efficiently learnable in the same model (given certain plausible cryptographic assumptions). We also present efficient learning algorithms in various models for the read-k and SAT-k subclasses of DNF. For our negative results, we turn our attention to the recently introduced statistical query model of learning [11]. This model is a restricted version of the popular Probably Approximately Correct (PAC) model, and practically every class known to be efficiently learnable in the PAC model is in fact learnable in the statistical query model [11]. Here we give a general characterization of the complexity of statistical query learning in terms of the number of uncorrelated functions in the concept class. This is a distribution-dependent quantity yielding upper and lower bounds on the number of statistical queries required for learning on any input distribution. As a corollary, we obtain that DNF expressions and decision trees are not even weakly learnable with respect to the uniform input distribution in polynomial time in the statistical query model. This result is information-theoretic and therefore does not rely on any unproven assumptions.
It demonstrates that no simple modification of the existing algorithms in the computational learning theory literature for learning various restricted forms of DNF and decision trees from passive random examples (and also several algorithms proposed in the experimental machine learning communities, such as the ID3 algorithm for decision trees and its variants) will solve the general problem. The unifying tool for all of our results is the Fourier analysis of a finite class of boolean functions on the hypercube.
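To make the Fourier viewpoint concrete, the following is a minimal sketch, not the Kushilevitz and Mansour algorithm itself: it computes the Fourier coefficients of a small DNF by brute-force enumeration of the hypercube and reports the parity function most correlated with it under the uniform distribution. The helper names (`dnf`, `parity`, `fourier_coefficient`, `heaviest_parity`), the term encoding, and the convention that true maps to -1 are illustrative assumptions. The enumeration takes time exponential in n, so it only illustrates the quantity that a membership-query algorithm would estimate efficiently.

```python
from itertools import product


def dnf(terms):
    """Build f: {0,1}^n -> {+1,-1} from a DNF (hypothetical encoding).

    Each term is a dict {variable_index: required_bit}; the DNF is the OR
    of its terms. Assumed convention: -1 means true, +1 means false.
    """
    def f(x):
        satisfied = any(all(x[i] == b for i, b in t.items()) for t in terms)
        return -1 if satisfied else 1
    return f


def parity(subset, x):
    """Parity function chi_A(x): -1 if an odd number of bits in A are set."""
    return -1 if sum(x[i] for i in subset) % 2 else 1


def fourier_coefficient(f, n, subset):
    """E[f(x) * chi_A(x)] over uniform x, by full enumeration (2^n points)."""
    total = sum(f(x) * parity(subset, x) for x in product((0, 1), repeat=n))
    return total / 2 ** n


def heaviest_parity(f, n):
    """Return (A, coefficient) maximizing |coefficient| over all subsets A."""
    subsets = [tuple(i for i in range(n) if mask >> i & 1)
               for mask in range(2 ** n)]
    scored = ((s, fourier_coefficient(f, n, s)) for s in subsets)
    return max(scored, key=lambda pair: abs(pair[1]))


if __name__ == "__main__":
    # (x0 AND x1) OR (NOT x2 AND x3) on 4 variables.
    f = dnf([{0: 1, 1: 1}, {2: 0, 3: 1}])
    subset, coef = heaviest_parity(f, 4)
    print("most correlated parity:", subset, "coefficient:", coef)
```

A parity with a coefficient of magnitude c agrees with f (after possibly negating it) on a 1/2 + c/2 fraction of uniformly drawn inputs, which is the sense in which a large coefficient gives a weak hypothesis.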