This paper introduces density measures to quantify the breadth-validity trade-off in language generation. Building on the generation-in-the-limit framework, it proposes an algorithm that combines dynamic adjustment, fallback mechanisms, a token system, and tree structures to ensure high-density output.
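A minimal illustration of one simplified notion of density, assuming density is measured as the fraction of an enumeration prefix of the target language that the generator has produced; the names and the enumeration-prefix simplification are mine, not the paper's formal definition.

```python
# Illustrative sketch only: density of a generator's outputs relative to a
# target language K, approximated over the first n strings of K's enumeration.
from typing import Iterable, Set


def empirical_density(generated: Set[str], target_enumeration: Iterable[str], n: int) -> float:
    """Fraction of the first n strings of the target language that the
    generator has already produced (higher = more breadth)."""
    prefix = []
    for i, s in enumerate(target_enumeration):
        if i >= n:
            break
        prefix.append(s)
    covered = sum(1 for s in prefix if s in generated)
    return covered / max(len(prefix), 1)


# Example: a generator that only ever outputs "aa" is valid but has low
# density in K = {"a", "aa", "aaa", ...}.
K = ("a" * k for k in range(1, 10_000))
print(empirical_density({"aa"}, K, n=100))  # 0.01
```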
Proposes ThoughtMani, a training-free method to reduce redundant reasoning in large reasoning models by leveraging external chain-of-thought from smaller models, improving efficiency and safety.
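A rough sketch of the CoT-injection idea, assuming the reasoning model wraps its reasoning in `<think>...</think>` tags and skips its own reasoning when that slot is pre-filled; `generate_small` and `generate_large` are placeholder stubs, not the paper's API.

```python
# Minimal sketch of external CoT injection: a small model drafts the chain of
# thought, which is pre-filled into the large reasoning model's thinking slot.

def generate_small(prompt: str) -> str:
    """Placeholder: a small, cheap model drafts a concise chain of thought."""
    return "Outline: parse the question, recall the relevant fact, answer."


def generate_large(prompt: str) -> str:
    """Placeholder: the large reasoning model completes the given prompt."""
    return "Final answer: ..."


def answer_with_external_cot(question: str) -> str:
    # 1. Draft a short CoT with the small model.
    draft_cot = generate_small(f"Briefly outline how to solve: {question}")
    # 2. Pre-fill the reasoning slot so the large model does not re-derive it.
    prefilled = f"{question}\n<think>\n{draft_cot}\n</think>\n"
    # 3. The large model then generates only the final answer.
    return generate_large(prefilled)


print(answer_with_external_cot("What is 17 * 24?"))
```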
Proposes Meta-LoRA, a meta-learning LoRA framework encoding domain priors via shared LoRA base components for efficient, high-fidelity few-shot ID personalization in diffusion models like FLUX.1. Introduces Meta-PHD benchmark and R-FaceSim metric.
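A toy sketch of the shared-base idea, assuming the meta-learned component is a shared LoRA down-projection and only the up-projection is tuned per identity; the split and all shapes are illustrative, not FLUX.1's architecture.

```python
# LoRA layer with a shared (meta-learned) base component and a lightweight
# identity-specific component; only the latter is tuned for a new subject.
import torch
import torch.nn as nn


class MetaLoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 16, scale: float = 1.0):
        super().__init__()
        self.base = base                       # frozen pretrained projection
        self.base.requires_grad_(False)
        # Shared across identities: learned once during meta-training.
        self.down_shared = nn.Linear(base.in_features, rank, bias=False)
        # Identity-specific: the only part fine-tuned in the few-shot stage.
        self.up_id = nn.Linear(rank, base.out_features, bias=False)
        nn.init.zeros_(self.up_id.weight)      # start as a no-op adapter
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * self.up_id(self.down_shared(x))


layer = MetaLoRALinear(nn.Linear(768, 768))
# Few-shot personalization would optimize only layer.up_id.parameters().
print(sum(p.numel() for p in layer.up_id.parameters()))
```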
Proposes Antidistillation Sampling, a method to poison LLM reasoning traces during generation, hindering model distillation while preserving the original model's performance.
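A conceptual sketch of the sampling-time adjustment, assuming a per-token "student benefit" score is available (in the paper it is estimated from a proxy student; here it is simply an input), with `lam` as an assumed trade-off knob.

```python
# Sampling-time trace poisoning: next-token logits are reweighted to penalize
# tokens estimated to most benefit a would-be student distilling the trace.
from typing import Optional

import numpy as np


def antidistill_sample(teacher_logits: np.ndarray,
                       student_benefit: np.ndarray,
                       lam: float = 1.0,
                       rng: Optional[np.random.Generator] = None) -> int:
    rng = rng or np.random.default_rng()
    adjusted = teacher_logits - lam * student_benefit   # poisoning term
    probs = np.exp(adjusted - adjusted.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))


logits = np.array([2.0, 1.5, 0.1])           # teacher's own preferences
benefit = np.array([3.0, 0.0, 0.0])          # token 0 would most help a student
print(antidistill_sample(logits, benefit))   # token 0 is now drawn far less often
```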
Proposes empirical scaling laws (Step Law) that accurately estimate the optimal batch size and learning rate from model size and data size, and remain robust across model architectures, sparsity levels, and data distributions.
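A sketch of what such estimators look like, assuming the learning rate follows a joint power law in model size N and data size D while batch size scales mainly with D; all coefficients below are placeholders, not the paper's fitted values.

```python
# Power-law hyperparameter estimators in the style of Step Law.
# Coefficients are illustrative stand-ins to be replaced by fitted values.

def optimal_lr(n_params: float, n_tokens: float,
               a: float = 1.0, alpha: float = -0.7, beta: float = 0.3) -> float:
    """Optimal peak learning rate as a power law of model size N and data size D."""
    return a * (n_params ** alpha) * (n_tokens ** beta)


def optimal_batch_tokens(n_tokens: float,
                         c: float = 1.0, gamma: float = 0.55) -> float:
    """Optimal batch size (in tokens) as a power law of data size D."""
    return c * (n_tokens ** gamma)


# Example: a 1B-parameter model trained on 100B tokens (placeholder coefficients).
print(optimal_lr(1e9, 100e9), optimal_batch_tokens(100e9))
```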
Proposes EditAR, a unified autoregressive framework based on LlamaGen, handling tokenized image and text inputs with DINOv2 feature distillation for diverse conditional generation tasks like editing, depth-to-image, edge-to-image, and segmentation-to-image.
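A sketch of the combined objective's shape only: next-token cross-entropy over image tokens plus a DINOv2 feature-distillation term; the projection head and the cosine-based distillation loss are assumptions, not the paper's exact formulation.

```python
# Combined training objective: autoregressive cross-entropy + feature distillation.
import torch
import torch.nn.functional as F


def editar_style_loss(logits: torch.Tensor,        # (B, T, vocab) AR predictions
                      target_tokens: torch.Tensor, # (B, T) ground-truth image tokens
                      ar_hidden: torch.Tensor,     # (B, T, d_model) AR hidden states
                      dino_feats: torch.Tensor,    # (B, T, d_dino) precomputed DINOv2 features
                      proj: torch.nn.Module,       # maps d_model -> d_dino
                      lam: float = 0.5) -> torch.Tensor:
    ce = F.cross_entropy(logits.flatten(0, 1), target_tokens.flatten())
    distill = 1.0 - F.cosine_similarity(proj(ar_hidden), dino_feats, dim=-1).mean()
    return ce + lam * distill


B, T, V, D, Dd = 2, 16, 1024, 512, 768
loss = editar_style_loss(torch.randn(B, T, V), torch.randint(0, V, (B, T)),
                         torch.randn(B, T, D), torch.randn(B, T, Dd),
                         torch.nn.Linear(D, Dd))
print(loss.item())
```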
Introduces Trelawney, a training method that improves language model planning, reasoning, and story generation by explicitly inserting future information (lookahead tokens delimited by <T> and </T>) into training sequences, enabling models to learn and utilize future goals.
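A minimal sketch of the data augmentation: copy a future span forward and wrap it in `<T> ... </T>` so the model sees its goal before generating the intervening tokens. The random span/position selection here is an assumption, not necessarily the paper's strategy.

```python
# Lookahead-token augmentation for a single training sequence.
import random


def insert_lookahead(tokens, span_len=3, seed=0):
    rng = random.Random(seed)
    insert_at = rng.randrange(0, len(tokens) - span_len)                  # where the hint goes
    future_start = rng.randrange(insert_at + 1, len(tokens) - span_len + 1)  # which span to reveal
    lookahead = tokens[future_start:future_start + span_len]
    return tokens[:insert_at] + ["<T>"] + lookahead + ["</T>"] + tokens[insert_at:]


story = "the knight rode north found the dragon and won the duel".split()
print(" ".join(insert_lookahead(story)))
# e.g. "the knight <T> won the duel </T> rode north found the dragon and won the duel"
```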
Argues that unsupervised object discovery is largely solved by pre-trained segmentation models (e.g., HQES, SAM). Proposes the OCCAM probe framework to show that object-centric learning (OCL) should shift its focus to downstream challenges such as OOD generalization and compositionality using readily available object representations, highlighting robust foreground object selection as the new bottleneck.
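A sketch of how such object representations can be obtained without a slot-based model: pool a pretrained feature map over precomputed segmentation masks. The shapes and the mean-pooling choice are assumptions for illustration.

```python
# Per-object representations from precomputed masks and a pretrained feature map.
import torch


def mask_pooled_objects(feat_map: torch.Tensor,   # (C, H, W) pretrained features
                        masks: torch.Tensor       # (K, H, W) binary object masks
                        ) -> torch.Tensor:        # (K, C) one vector per object
    masks = masks.float()
    area = masks.sum(dim=(1, 2)).clamp(min=1.0)               # (K,) pixels per mask
    pooled = torch.einsum("chw,khw->kc", feat_map, masks)     # sum features inside each mask
    return pooled / area[:, None]                             # mean-pool per object


feats = torch.randn(768, 32, 32)                 # e.g. DINOv2 patch features
masks = (torch.rand(5, 32, 32) > 0.5)            # stand-in for SAM/HQES masks
print(mask_pooled_objects(feats, masks).shape)   # torch.Size([5, 768])
```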