Daily Paper: EditAR - Unified Conditional Generation with Autoregressive Models

Proposes EditAR, a unified autoregressive framework built on LlamaGen. It takes tokenized image and text inputs and uses DINOv2 feature distillation, covering diverse conditional generation tasks such as image editing, depth-to-image, edge-to-image, and segmentation-to-image.
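
To make the feature-distillation idea concrete, here is a minimal NumPy sketch of one common way such a term can be written: a cosine-distance penalty between the autoregressive model's per-token features and features from a frozen teacher such as DINOv2, added to the usual next-token loss. The summary above does not state the loss form, so the cosine distance, the function names `distillation_loss` and `total_loss`, and the `weight` parameter are all assumptions for illustration, not the paper's implementation.

```python
import numpy as np


def distillation_loss(student_feats, teacher_feats, eps=1e-8):
    """Mean cosine distance between per-token feature vectors.

    Both inputs have shape (num_tokens, dim). The cosine-distance form
    is an assumption; the source only says "feature distillation".
    """
    # L2-normalise each token's feature vector.
    s = student_feats / (np.linalg.norm(student_feats, axis=-1, keepdims=True) + eps)
    t = teacher_feats / (np.linalg.norm(teacher_feats, axis=-1, keepdims=True) + eps)
    # Per-token cosine similarity, then average the distance 1 - cos.
    cos = np.sum(s * t, axis=-1)
    return float(np.mean(1.0 - cos))


def total_loss(next_token_nll, student_feats, teacher_feats, weight=0.5):
    """Next-token loss plus a weighted distillation term.

    `weight` is a hypothetical hyperparameter, not a value from the paper.
    """
    return next_token_nll + weight * distillation_loss(student_feats, teacher_feats)
```

In this sketch the teacher features would be computed once with gradients disabled, so only the autoregressive model is pulled toward the teacher's feature space.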

Daily Paper: Are We Done with Object-Centric Learning? (OCCAM)

Argues that unsupervised object discovery is largely solved by pre-trained segmentation models (e.g., HQES, SAM). Proposes the OCCAM probe framework to show that object-centric learning (OCL) should shift its focus to downstream challenges such as OOD generalization and compositionality, using the object representations that are already available, and highlights robust foreground object selection as the new bottleneck.
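
A mask-based probe of this kind can be pictured roughly as follows: take candidate masks from an off-the-shelf segmenter, pick the one judged to be foreground, blank out everything else, and pass the result to a classifier. The sketch below covers only the masking and selection steps in NumPy. The helper names `apply_mask` and `select_foreground`, the constant background fill, and the per-mask `scores` are hypothetical and not taken from the paper.

```python
import numpy as np


def apply_mask(image, mask, fill=0.0):
    """Keep pixels inside the mask, replace the rest with `fill`.

    image: float array of shape (H, W, C)
    mask:  boolean array of shape (H, W)
    """
    # Add a channel axis so the mask broadcasts over all channels.
    return np.where(mask[..., None], image, fill)


def select_foreground(masks, scores):
    """Return the candidate mask with the highest foreground score.

    `scores` is a hypothetical per-mask foreground score; how to compute
    it robustly is the bottleneck the summary points to.
    """
    return masks[int(np.argmax(scores))]


# Usage: two candidate masks on a 2x2 RGB image of ones.
image = np.ones((2, 2, 3))
masks = [
    np.array([[True, False], [False, False]]),
    np.array([[False, False], [True, True]]),
]
foreground = select_foreground(masks, scores=[0.2, 0.9])
isolated = apply_mask(image, foreground)
```

The isolated image would then go to any pre-trained classifier. A poor choice in `select_foreground` hands the classifier the wrong region no matter how good the masks are.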