Abstract: |
Our project aims to advance the understanding of social dynamics in urban environments. Inspired by Jane Jacobs’ concept of the “Sidewalk Ballet,” we seek to capture and analyze the spontaneous interactions that make city streets vibrant and cohesive. Using large multimodal models (LMMs), we will develop a framework to detect and categorize social interactions in large collections of geo-tagged images, including street-view imagery.
To achieve this, we will use ACCESS computational resources, specifically 120,000 CPU hours and 35,000 GPU hours. These resources will support the data processing, model training, and validation needed to create a benchmark dataset of over 100,000 annotated urban images. We will employ and fine-tune advanced models such as Qwen2.5-VL, LLaVA 1.6, and Janus-Pro. Additionally, we will use software packages tailored for large-scale image analysis and geospatial data integration.
Our approach will enable large-scale mapping and analysis of urban social life, giving urban planners and designers insights for creating more inclusive and vibrant public spaces. By automating the detection of social interactions, the project will produce scalable, actionable knowledge and lay the groundwork for future research and practical applications that enhance urban health and vitality. |