Learning Bayesian network structure from massive datasets: The "sparse candidate" algorithm

N Friedman, I Nachman, Dana Peer

Research output: Chapter in Book/Report/Conference proceedingConference contributionpeer-review

Abstract

Learning Bayesian networks is often cast as an optimization problem, where the computational task is to find a structure that maximizes a statistically motivated score. By and large, existing learning tools address this optimization problem using standard heuristic search techniques. Since the search space is extremely large, such search procedures can spend most of the time examining candidates that are extremely unreasonable. This problem becomes critical when we deal with data sets that are large either in the number of instances, or the number of attributes. In this paper, we introduce an algorithm that achieves faster learning by restricting the search space. This iterative algorithm restricts the parents of each variable to belong to a small subset of candidates. We then search for a network that satisfies these constraints. The learned network is then used for selecting better candidates for the next iteration. We evaluate this algorithm both on synthetic and real-life data. Our results show that it is significantly faster than alternative search procedures without loss of quality in the learned structures.
Original languageEnglish
Title of host publicationUNCERTAINTY IN ARTIFICIAL INTELLIGENCE, PROCEEDINGS
EditorsKB Laskey, H Prade
Place of PublicationSAN FRANCISCO, CA, USA
PublisherMORGAN KAUFMANN PUB INC
Pages206-215
Number of pages10
ISBN (Print)1-55860-614-9
StatePublished - 1999
Event15th Conference on Uncertainty in Artificial Intelligence - Stockholm, Sweden
Duration: 30 Jul 19991 Aug 1999
Conference number: 15

Conference

Conference15th Conference on Uncertainty in Artificial Intelligence
Abbreviated titleROYAL INST TECHNOL
Country/TerritorySweden
CityStockholm
Period30/07/991/08/99

Fingerprint

Dive into the research topics of 'Learning Bayesian network structure from massive datasets: The "sparse candidate" algorithm'. Together they form a unique fingerprint.

Cite this