The problem of testing low-degree polynomials has received significant attention over the years due to its importance in theoretical computer science, and in particular in complexity theory. The problem is specified by three parameters: field size $q$, degree $d$, and proximity parameter $\delta$. The goal is to design a tester that makes as few queries as possible to a given function and distinguishes between the case that the function has degree at most $d$ and the case that it is $\delta$-far from any degree $d$ function. With respect to these parameters, we say that a tester is optimal if it makes $O(q^t + 1/\delta)$ queries, where $t = t(d,q)$ is the testing dimension of $d, q$ (defined as the minimum integer such that for every $g\colon \mathbb{F}_q^n \to \mathbb{F}_q$ of degree more than $d$, there is a subspace of dimension $t$ on which the restriction of $g$ has degree exceeding $d$). Such a tester was first given by Bhattacharyya et al. for $q = 2$, and later by Haramaty et al. for all prime powers $q$. In fact, they showed that the natural $t$-flat tester is an optimal tester for the Reed-Muller code, for an appropriate $t$. Here, the $t$-flat tester is the tester that picks a uniformly random affine subspace $A$ of dimension $t$ and checks that $\deg(f|_A) \le d$. Their analysis proves that the dependency of the $t$-flat tester on $d$ and $\delta$ is optimal; however, the dependency on the field size, i.e. the hidden constant in the $O(\cdot)$ notation, is a tower-type function in $q$. We improve the result of Haramaty et al., showing that the dependency on the field size is polynomial. Our technique also applies in the more general setting of lifted affine invariant codes, and gives the same polynomial dependency on the field size. This answers a problem raised in prior work. Our approach significantly deviates from the strategy taken in earlier works, and is based on studying the structure of the collection of erroneous subspaces, i.e. subspaces $A$ such that $f|_A$ has degree greater than $d$.
To this end, we observe that these sets are poorly expanding in the affine version of the Grassmann graph and use this to establish structural results on them via global hypercontractivity. We then use this structure to perform local correction on $f$.
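To make the $t$-flat tester described above concrete, the following is a minimal Python sketch of a single round, under simplifying assumptions that are not part of the paper: the field is a prime field $\mathbb{F}_p$ (the results above cover all prime powers), the function is given as a Python callable on tuples, and the degree of the restriction is computed by brute-force interpolation. The names `degree_on_cube` and `t_flat_test` are hypothetical and chosen only for illustration.

```python
import itertools
import random


def degree_on_cube(g, p, t):
    """Degree of the reduced polynomial representing g: F_p^t -> F_p (p prime).

    Returns -1 for the zero function. Brute force: about p^(2t) operations.
    """
    points = list(itertools.product(range(p), repeat=t))
    values = {a: g(a) % p for a in points}  # exactly p^t queries to g

    def basis_coeff(k, a):
        # Coefficient of x^k in 1 - (x - a)^(p-1), the indicator of x == a.
        if k == 0:
            return 1 if a == 0 else 0
        return (-pow(a, p - 1 - k, p)) % p

    best = -1
    for e in points:  # exponent vectors with entries in 0..p-1
        if sum(e) <= best:
            continue  # cannot improve the current maximum degree
        c = 0
        for a in points:
            term = values[a]
            for k, ai in zip(e, a):
                term = term * basis_coeff(k, ai) % p
            c = (c + term) % p
        if c != 0:
            best = sum(e)
    return best


def t_flat_test(f, p, n, t, d, rng=random):
    """One round of the t-flat test on f: F_p^n -> F_p (assumes t <= n).

    Accepts (returns True) iff deg(f restricted to A) <= d for a random t-flat A.
    """
    coeffs = list(itertools.product(range(p), repeat=t))
    while True:
        base = [rng.randrange(p) for _ in range(n)]
        dirs = [[rng.randrange(p) for _ in range(n)] for _ in range(t)]

        def point(lam):
            return tuple(
                (base[j] + sum(l * v[j] for l, v in zip(lam, dirs))) % p
                for j in range(n)
            )

        # The directions are linearly independent iff the p^t points are distinct.
        if len({point(lam) for lam in coeffs}) == p ** t:
            break
    return degree_on_cube(lambda lam: f(point(lam)), p, t) <= d
```

A function of degree at most $d$ passes every round, since restricting to an affine subspace cannot increase the degree. Each round queries the function at $p^t$ points; choosing $t$ as the testing dimension and bounding the rejection probability for far functions is the subject of the analysis, and is not captured by this sketch.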