Historical documents have been digitized on a large scale in recent years, placing massive image collections online. Optical character recognition (OCR) often performs poorly on such material, which makes these resources hard to search and their text hard to analyze. We present two approaches to overcoming this obstacle, one textual and one visual. We show that, for tasks like finding newspaper articles related by topic, poor-quality OCR text suffices. Articles are represented with a standard vector-space model. Further improvement is obtained by adding words with similar distributional representations. As an alternative to OCR-based methods, one can perform image-based search using word spotting. Synthetic images are generated for every word in a lexicon, and word spotting is used to compile vectors of their occurrences. Retrieval is performed by standard nearest-neighbor search. The results of this visual approach are comparable to those obtained using noisy OCR. We report on experiments applying both methods, separately and together, to historical Hebrew newspapers, whose language poses the added problem of rich morphology.
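
To make the textual approach concrete, the following is a minimal sketch of related-article retrieval with a vector-space model and nearest-neighbor search. It is not the implementation used in the experiments: the function names (`tf_vector`, `cosine`, `nearest`), the toy articles, and the choice of raw term frequencies with cosine similarity are all assumptions made for illustration, since the abstract does not specify the weighting scheme or the distance measure. The misspelled tokens (`harb0r`, `whcat`) stand in for OCR errors.

```python
import math
from collections import Counter


def tf_vector(text):
    """Bag-of-words term-frequency vector for whitespace-tokenized text."""
    return Counter(text.split())


def cosine(u, v):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(count * v[term] for term, count in u.items() if term in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(query, articles, k=1):
    """Return the names of the k articles most similar to the query text."""
    q = tf_vector(query)
    scored = [(cosine(q, tf_vector(text)), name) for name, text in articles.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:k]]


# Hypothetical toy collection; some tokens are corrupted as noisy OCR might do.
ARTICLES = {
    "a1": "harbor strike workers harb0r union",
    "a2": "wheat prices market whcat harvest",
    "a3": "theater premiere actors stage",
}

print(nearest("union workers strike at the harbor", ARTICLES))
```

In this toy case the corrupted token `harb0r` simply fails to match, yet the remaining correctly recognized words are enough to rank the right article first, which is the intuition behind the claim that poor-quality OCR text can suffice for topical relatedness.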