We investigated the representation of visual inputs by multiple simultaneously recorded single neurons in the human medial temporal lobe, using their firing rates to infer which images were shown to subjects. The selectivity of these neurons was quantified with a novel measure. About four spikes per neuron, triggered between 300 and 600 ms after image onset in a handful of units (7.8 on average), predicted the identity of images far above chance. Decoding performance increased linearly with the number of units considered, peaked between 400 and 500 ms, did not improve when considering correlations among simultaneously recorded units, and generalized to very different images. The feasibility of decoding sensory information from human extracellular recordings has implications for the development of brain-machine interfaces.