Smart-city intersections will play a crucial role in automated traffic management and in improving pedestrian safety in the cities of the future. They will (i) aggregate data from in-vehicle and infrastructure sensors; (ii) process the data by leveraging low-latency, high-bandwidth communications, edge-cloud computing, and AI-based object detection and tracking; and (iii) provide intelligent feedback and input to control systems. The Cloud Enhanced Open Software Defined Mobile Wireless Testbed for City-Scale Deployment (COSMOS) enables research on technologies supporting smart cities. In this paper, we present results of experiments that use bird's eye view cameras at the COSMOS pilot site to detect and track vehicles and pedestrians. We assess real-time computation capability as well as detection and tracking accuracy by evaluating and customizing a selection of video pre-processing and deep-learning algorithms. We explore and address issues specific to the difference in scale between pedestrians and cars in bird's eye view: the best multiple-object tracking accuracy (MOTA) scores are around 73.2% for cars but only around 2.8% for pedestrians. The real-time goal of 30 frames per second (i.e., a latency budget of 33.3 ms per frame for vehicle detection) will be reachable once processing time is reduced by roughly a factor of three.
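For reference, the MOTA scores quoted above follow the standard CLEAR-MOT definition, and the 33.3 ms figure is simply the per-frame budget implied by a 30 fps target. The sketch below illustrates both computations; the counts used are illustrative placeholders, not values from the paper's evaluation.

```python
# Sketch of the CLEAR-MOT accuracy metric and the real-time latency budget.
# The event counts passed in below are hypothetical, for illustration only.

def mota_percent(false_negatives: int, false_positives: int,
                 id_switches: int, ground_truth_objects: int) -> float:
    """MOTA = 1 - (FN + FP + IDSW) / GT, expressed here as a percentage."""
    errors = false_negatives + false_positives + id_switches
    return 100.0 * (1.0 - errors / ground_truth_objects)

# Per-frame latency budget implied by a 30 frames-per-second target.
FPS_TARGET = 30
latency_budget_ms = 1000.0 / FPS_TARGET  # ~33.3 ms per frame

if __name__ == "__main__":
    # Illustrative counts: 120 misses, 80 false positives, 15 ID switches
    # over 800 ground-truth object instances.
    print(f"MOTA: {mota_percent(120, 80, 15, 800):.1f}%")
    print(f"Latency budget: {latency_budget_ms:.1f} ms/frame")
```

Note that MOTA can be negative when the total error count exceeds the number of ground-truth objects, which is why very low scores (such as those for small-scale pedestrians) are still meaningful.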