Passively localizing and tracking a mobile emitter while jointly learning its reverberant three-dimensional (3D) acoustic environment, whose critical structural features are unknown, is a key open problem. Unaccounted-for occluders may be present, so the emitter can lose line-of-sight to the receivers and be observable only through its reflected raypaths. The locations of the reflective boundaries must therefore be estimated jointly with the emitter's position. A multistage global optimization and tracking architecture is developed to solve this problem under a relatively unconstrained model. Each stage of the architecture establishes domain knowledge, such as synchronization and an initial environment estimate, which serves as input to the more refined algorithms of the subsequent stages. The approach generalizes to different physical scales and modalities, and it improves on methods that do not exploit the motion of the emitter. In one stage, particle swarm optimization is used to estimate the environment and the emitter location simultaneously. In another stage, a Hough transform-inspired boundary localization algorithm is extended to 3D settings to establish the initial estimate of the environment. The performance of this holistic approach is analyzed, and its reliability is demonstrated in a reverberant water-tank testbed that models the shallow-water underwater acoustic setting.
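
To illustrate why boundary locations and emitter position are coupled when only reflected raypaths are observed, the following is a minimal sketch of the standard image-source relation for a single specular reflection off a planar boundary. It is not the architecture described above. The function names `mirror_point` and `reflected_path_length`, and the plane parameterization n·x = d, are assumptions introduced here for illustration; the sketch also assumes an infinite flat boundary with the emitter and receiver on the same side of it.

```python
import math


def mirror_point(point, plane_normal, plane_offset):
    """Reflect a 3D point across the plane {x : n . x = d}.

    plane_normal (n) need not be unit length; it is normalized here,
    and plane_offset (d) is rescaled to match.
    """
    norm = math.sqrt(sum(c * c for c in plane_normal))
    n = [c / norm for c in plane_normal]
    d = plane_offset / norm
    # Signed distance from the point to the plane along the unit normal.
    signed_dist = sum(p * c for p, c in zip(point, n)) - d
    # Move the point twice that distance back through the plane.
    return [p - 2.0 * signed_dist * c for p, c in zip(point, n)]


def reflected_path_length(emitter, receiver, plane_normal, plane_offset):
    """Length of the single-bounce specular path emitter -> plane -> receiver.

    By the image-source relation, this equals the straight-line distance
    from the mirrored emitter to the receiver (illustrative assumption:
    infinite planar boundary, both points on the same side).
    """
    image = mirror_point(emitter, plane_normal, plane_offset)
    return math.dist(image, receiver)


if __name__ == "__main__":
    # Hypothetical geometry: floor at z = 0, emitter 1 m and receiver 2 m above it.
    length = reflected_path_length([0.0, 0.0, 1.0], [4.0, 0.0, 2.0],
                                   [0.0, 0.0, 1.0], 0.0)
    print(length)
```

Because the path length depends on both the emitter coordinates and the plane parameters, a measured reflection delay alone constrains only their combination, which is one way to see why a joint estimate is needed.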