We're releasing three versions of this dataset containing 1M datapoints each: 1.1 Binarized: each image has 2-3 white sprites on ... ground-truth segmentation masks for CLEVR [6] scenes. These were ...