RAVEN
Image Captioning
License: Research Only

Overview

We propose a new visual reasoning dataset, called RAVEN (Relational and Analogical Visual rEasoNing), in the context of Raven's Progressive Matrices (RPM). Unlike previous work, RAVEN aims to lift machine intelligence by associating vision with structural, relational, and analogical reasoning in a hierarchical representation. Providing a structure representation allows us to establish a semantic link between vision and reasoning. We measure human performance on this dataset, benchmark several baseline models, and propose a simple neural module (Dynamic Residual Tree, or DRT) that combines visual understanding and structural reasoning. Comprehensive experiments show that incorporating structural information consistently improves model performance.

Data Summary

Data format: image
Data volume: 10K
File size: --
