OK-VQA
Question&Answer
VQA
License: Custom

Overview

OK-VQA is a visual question answering dataset whose questions can only be answered by methods that draw on outside knowledge.

  • 14,055 open-ended questions
  • 5 ground truth answers per question
  • Manually filtered to ensure all questions require outside knowledge (e.g. from Wikipedia)
  • Questions with the most common answers were reduced in number to limit dataset bias

Data Format

Input Questions Format

The questions are stored using the JSON file format.

The questions format has the following data structure:

{
"info" : info,
"task_type" : str,
"data_type": str,
"data_subtype": str,
"questions" : [question],
"license" : license
}

info {
"year" : int,
"version" : str,
"description" : str,
"contributor" : str,
"url" : str,
"date_created" : datetime
}

license {
"name" : str,
"url" : str
}

question {
"question_id" : int,
"image_id" : int,
"question" : str
}

  • task_type: type of annotations in the JSON file (OpenEnded).
  • data_type: source of the images (mscoco or abstract_v002).
  • data_subtype: subset of the image source (e.g. train2014/val2014/test2015 for mscoco, train2015/val2015 for abstract_v002).
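
The questions structure above can be read with the standard `json` module. The following is a minimal sketch, not an official loader: the helper names `index_questions` and `load_questions` are hypothetical, and the sample record (IDs and question text) is made up for illustration. It assumes only the field names listed in the structure above.

```python
import json
from typing import Any, Dict


def index_questions(doc: Dict[str, Any]) -> Dict[int, Dict[str, Any]]:
    """Map each question_id to its question record."""
    return {q["question_id"]: q for q in doc["questions"]}


def load_questions(path: str) -> Dict[int, Dict[str, Any]]:
    """Read a questions JSON file and index it by question_id."""
    with open(path, encoding="utf-8") as f:
        return index_questions(json.load(f))


# Hypothetical in-memory sample; the IDs and question text are made up.
sample_doc = {
    "info": {},
    "task_type": "OpenEnded",
    "data_type": "mscoco",
    "data_subtype": "val2014",
    "questions": [
        {"question_id": 1, "image_id": 100, "question": "What sport is this?"},
        {"question_id": 2, "image_id": 101, "question": "Who invented this vehicle?"},
    ],
    "license": {},
}

questions = index_questions(sample_doc)
print(questions[2]["question"])
```

Indexing by `question_id` makes it straightforward to match each question with its annotation record later, since both files share that key.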

Annotation Format

The annotations are stored using the JSON file format.

The annotations format has the following data structure:

{
"info" : info,
"data_type": str,
"data_subtype": str,
"annotations" : [annotation],
"license" : license
}

info {
"year" : int,
"version" : str,
"description" : str,
"contributor" : str,
"url" : str,
"date_created" : datetime
}

license {
"name" : str,
"url" : str
}

annotation {
"question_id" : int,
"image_id" : int,
"question_type" : str,
"answer_type" : str,
"answers" : [answer],
"multiple_choice_answer" : str
}

answer {
"answer_id" : int,
"answer" : str,
"answer_confidence": str
}

  • data_type: source of the images (mscoco or abstract_v002).
  • data_subtype: subset of the image source (e.g. train2014/val2014/test2015 for mscoco, train2015/val2015 for abstract_v002).
  • question_type: type of the question, determined by the first few words of the question. For details, please see the README.
  • answer_type: type of the answer. Currently one of "yes/no", "number", or "other".
  • multiple_choice_answer: most frequent ground-truth answer.
  • answer_confidence: subject's confidence in answering the question. For details, please see Antol et al., ICCV 2015.
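
Because questions and annotations live in separate files, a common first step is to join them on `question_id`. The sketch below is one way to do that under the structures described above; the function names are hypothetical, and the sample records, including the `answer_confidence` values, are invented for illustration. It also recomputes the most frequent answer from the `answers` list, which is how `multiple_choice_answer` is described.

```python
from collections import Counter
from typing import Any, Dict, List


def most_frequent_answer(annotation: Dict[str, Any]) -> str:
    """Return the answer string that occurs most often in an annotation."""
    counts = Counter(a["answer"] for a in annotation["answers"])
    return counts.most_common(1)[0][0]


def join_questions_and_annotations(
    questions_doc: Dict[str, Any], annotations_doc: Dict[str, Any]
) -> List[Dict[str, Any]]:
    """Pair each annotation with its question text via question_id."""
    by_id = {q["question_id"]: q for q in questions_doc["questions"]}
    pairs = []
    for ann in annotations_doc["annotations"]:
        question = by_id.get(ann["question_id"])
        if question is None:
            continue  # skip annotations without a matching question
        pairs.append({
            "question_id": ann["question_id"],
            "image_id": ann["image_id"],
            "question": question["question"],
            "answers": [a["answer"] for a in ann["answers"]],
            "top_answer": most_frequent_answer(ann),
        })
    return pairs


# Hypothetical sample data; all values are made up.
questions_doc = {"questions": [
    {"question_id": 1, "image_id": 100, "question": "What sport is this?"},
]}
raw_answers = ["tennis", "tennis", "badminton", "tennis", "badminton"]
annotations_doc = {"annotations": [{
    "question_id": 1,
    "image_id": 100,
    "question_type": "what",
    "answer_type": "other",
    "answers": [
        {"answer_id": i + 1, "answer": text, "answer_confidence": "yes"}
        for i, text in enumerate(raw_answers)
    ],
    "multiple_choice_answer": "tennis",
}]}

pairs = join_questions_and_annotations(questions_doc, annotations_doc)
print(pairs[0]["question"], "->", pairs[0]["top_answer"])
```

Keeping the full `answers` list alongside the top answer preserves all five ground-truth answers per question for evaluation schemes that score against more than one of them.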

License

Custom

Data Summary

Data format: Text, Image
Data volume: --
File size: 18.77GB
Publisher: Allen Institute for Artificial Intelligence
AI2 is a non-profit research institute founded in 2014 with the mission of conducting high-impact AI research and engineering in service of the common good.