Understanding the human activities surrounding a vehicle is among the most crucial requirements for any fully automated or human-assistive driving system. The system should not only have a thorough understanding of the current situation, but also predict future states. These understanding and prediction steps are essential for planning the right actions, avoiding dangers, and raising warnings early enough. They should also cover a wide spectrum of information, ranging from a high-level comprehension of the scene to fine-grained details about human dynamics and intentions. We propose to achieve these goals by inferring a holistic comprehension of human activity adjacent to a vehicle from various sensing modalities, such as video (RGB), thermal (T), and Lidar/depth (D) data. Despite recent progress in this area, most existing systems focus on human or object detection, with very limited reasoning about the future states of the detected humans. We plan to go beyond detection and develop new, robust prediction models that infer humans’ own goals, human-human interactions, and human-space interactions.
Understand and predict human activities surrounding a vehicle
- Thrust 1: 3D human detection and pose estimation
- Thrust 2: Scene layout and space understanding
- Thrust 3: Human trajectory prediction
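As a minimal illustration of the prediction task in Thrust 3, the sketch below shows a constant-velocity baseline for pedestrian trajectory forecasting. This is an assumed, illustrative baseline commonly used for comparison in trajectory-prediction work, not the proposed model; the function name and interface are hypothetical.

```python
import numpy as np

def constant_velocity_forecast(track, horizon):
    """Extrapolate a 2D trajectory with a constant-velocity model.

    track: (T, 2) array of observed (x, y) positions at uniform timesteps.
    horizon: number of future timesteps to predict.
    Returns a (horizon, 2) array of predicted positions.
    """
    track = np.asarray(track, dtype=float)
    velocity = track[-1] - track[-2]            # last observed displacement
    steps = np.arange(1, horizon + 1)[:, None]  # 1, 2, ..., horizon
    return track[-1] + steps * velocity

# A pedestrian walking diagonally at one unit per timestep:
obs = [[0, 0], [1, 1], [2, 2]]
pred = constant_velocity_forecast(obs, 3)
# pred -> [[3, 3], [4, 4], [5, 5]]
```

Learned models such as those proposed here aim to beat this baseline by exploiting scene layout (Thrust 2) and interactions, which a purely kinematic extrapolation ignores.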