Video Footage
346 HD video clips, each 5-10 seconds long, recorded with an on-board camera at 30 FPS.
JAAD is a dataset for studying joint attention in the context of autonomous driving. The focus is on pedestrian and driver behaviors at the point of crossing and the factors that influence them. To this end, the JAAD dataset provides a richly annotated collection of 346 short video clips (5-10 sec long) extracted from over 240 hours of driving footage. These videos, filmed in several locations in North America and Eastern Europe, represent scenes typical of everyday urban driving in various weather conditions.
Bounding boxes with occlusion tags are provided for all pedestrians, making this dataset suitable for pedestrian detection.
Behavior annotations specify behaviors for pedestrians who interact with or require the attention of the driver. Each video has several tags (weather, location, etc.) and timestamped behavior labels from a fixed list (e.g. stopped, walking, looking). In addition, a list of demographic attributes is provided for each pedestrian (e.g. age, gender, direction of motion), as well as a list of visible traffic scene elements (e.g. stop sign, traffic signal) for each frame.
Bounding boxes are provided for all 2793 pedestrians present in the videos.
Behavior tags are provided for each pedestrian per frame, including actions like walking, standing, crossing, looking (at the traffic), etc.
A list of attributes is provided for each pedestrian. Attributes include age, gender, clothing and accessories, direction of motion, crossing location, number of people in the group.
Each video is annotated with weather and time-of-day attributes. Text annotations for visible elements of infrastructure, such as crosswalks, traffic lights, and stop lights, are provided per frame.
Videos were recorded in several locations in North America and Europe under different weather conditions.
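The annotation layers listed above (per-frame boxes with occlusion tags, per-frame behavior tags, per-pedestrian attributes, and per-video tags) can be pictured as nested records. The following is only a minimal sketch of one possible in-memory layout. Every class name, field name, and example value is hypothetical and chosen for illustration; none of them is taken from the official annotation files or tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


# NOTE: all names below are hypothetical; they do not mirror the official
# JAAD annotation schema.
@dataclass
class PedestrianFrame:
    frame: int                               # frame index within the clip
    bbox: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) in pixels
    occluded: bool                           # occlusion tag for this box
    behaviors: List[str] = field(default_factory=list)  # e.g. ["walking", "looking"]


@dataclass
class PedestrianTrack:
    pedestrian_id: str
    attributes: Dict[str, str]               # e.g. age, gender, direction of motion
    frames: List[PedestrianFrame] = field(default_factory=list)

    def visible_boxes(self) -> List[Tuple[float, float, float, float]]:
        """Boxes from frames where the pedestrian is not tagged as occluded."""
        return [f.bbox for f in self.frames if not f.occluded]

    def frames_with(self, behavior: str) -> List[int]:
        """Frame indices on which the given behavior tag is present."""
        return [f.frame for f in self.frames if behavior in f.behaviors]


@dataclass
class ClipAnnotation:
    video_id: str
    weather: str                             # per-video tag
    time_of_day: str                         # per-video tag
    scene_elements: Dict[int, List[str]] = field(default_factory=dict)  # frame -> elements
    pedestrians: List[PedestrianTrack] = field(default_factory=list)


# Made-up example values, for illustration only.
track = PedestrianTrack(
    pedestrian_id="ped_0",
    attributes={"age": "adult", "direction_of_motion": "left_to_right"},
    frames=[
        PedestrianFrame(0, (10, 20, 50, 120), False, ["standing", "looking"]),
        PedestrianFrame(1, (12, 20, 52, 120), True, ["walking"]),
        PedestrianFrame(2, (15, 20, 55, 120), False, ["walking", "crossing"]),
    ],
)
clip = ClipAnnotation(
    video_id="clip_0",
    weather="clear",
    time_of_day="daytime",
    scene_elements={0: ["crosswalk"], 2: ["crosswalk", "traffic light"]},
    pedestrians=[track],
)
```

With a layout like this, a detection experiment would read `visible_boxes()` and ignore the behavior tags, while a crossing-prediction experiment would query `frames_with("crossing")` together with the pedestrian attributes.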
| Statistic | Value |
|---|---|
| Total number of frames | 82,032 |
| Total number of annotated frames | 82,032 |
| Number of pedestrians with behavior annotations | 686 |
| Total number of pedestrians | 2,786 |
| Number of pedestrian bounding boxes | 378,643 |
| Average length of pedestrian track | 121 frames |
| Pedestrian counts | |
| Number of pedestrians who cross the street | 495 |
| Number of pedestrians who do not cross the street | 191 |
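A few derived quantities follow directly from the statistics above and may help when planning experiments, for example when judging class balance for crossing prediction. The short sketch below only does arithmetic on numbers quoted on this page; the variable names are illustrative.

```python
# Figures copied from the statistics table; clip count and frame rate
# come from the dataset description.
total_frames = 82_032
num_clips = 346
fps = 30
total_pedestrians = 2_786
with_behavior = 686
crossing = 495
not_crossing = 191

# The two crossing categories partition the behavior-annotated pedestrians.
assert crossing + not_crossing == with_behavior

avg_frames_per_clip = total_frames / num_clips
avg_clip_seconds = avg_frames_per_clip / fps
crossing_share = crossing / with_behavior
behavior_share = with_behavior / total_pedestrians

print(f"Average clip length: {avg_frames_per_clip:.1f} frames (~{avg_clip_seconds:.1f} s)")
print(f"Crossing share among behavior-annotated pedestrians: {crossing_share:.1%}")
print(f"Share of pedestrians with behavior annotations: {behavior_share:.1%}")
# Average clip length: 237.1 frames (~7.9 s)
# Crossing share among behavior-annotated pedestrians: 72.2%
# Share of pedestrians with behavior annotations: 24.6%
```

The average clip length of about 7.9 seconds is consistent with the stated 5-10 second range. The crossing and non-crossing groups are imbalanced at roughly 72% to 28%, which is worth accounting for when training or evaluating a crossing classifier.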
Get MP4 files (3.1GB)
Annotations are hosted on our GitHub page.
If you find our dataset useful in your research, please consider citing our papers:
@inproceedings{rasouli2017ICCVW,
title={Are they going to cross? A benchmark dataset and baseline for pedestrian crosswalk behavior},
author={Rasouli, Amir and Kotseruba, Iuliia and Tsotsos, John K},
booktitle={Proceedings of the IEEE International Conference on Computer Vision Workshops},
pages={206--213},
year={2017}
}
@inproceedings{Rasouli2017IV,
title={Agreeing to cross: How drivers and pedestrians communicate},
author={Rasouli, Amir and Kotseruba, Iuliia and Tsotsos, John K},
booktitle={IEEE Intelligent Vehicles Symposium (IV)},
pages={264--269},
year={2017}
}
For questions regarding the dataset, please contact Amir Rasouli (aras@eecs.yorku.ca) and Iuliia Kotseruba (yulia_k@eecs.yorku.ca).
The videos and annotations are released under the MIT License.