[A Paper Review by an Ordinary Undergrad] Text-to-Image Rectified Flow as Plug-and-Play Priors (ICLR 2025)


junseok-rh 2025. 7. 17. 10:41

Paper : https://arxiv.org/abs/2406.03293


Abstract

This paper shows that rectified flow models can serve as priors just as existing diffusion models do, and that they can in fact perform better in that role.

Preliminary

1. Methods

The paper proposes three distillation methods: RFDS, iRFDS, and RFDS-Rev.

1.1 RFDS

Using a pretrained network as a loss function can be viewed as optimizing the input instead of the model parameters. With $\mathbf{x}$ as the optimization variable, Eq. (4) can be written as follows.

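To find the parameters $\boldsymbol{\theta}$ of the generator $\mathbf{x} = g(\boldsymbol{\theta})$, the gradient of the expression above with respect to $\boldsymbol{\theta}$ is as follows.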
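As a rough sketch of these two expressions, assume the common rectified-flow convention $\mathbf{x}_t = (1-t)\mathbf{x} + t\boldsymbol{\epsilon}$ with target velocity $\boldsymbol{\epsilon} - \mathbf{x}$. The interpolation, the symbol $\mathbf{v}_\phi$ for the pretrained network, and the weight $w(t)$ are assumptions here and may differ from the paper's notation. Under those assumptions, the objective and its chain-rule gradient would look like:

$$
\mathcal{L}(\mathbf{x}) = \mathbb{E}_{t,\boldsymbol{\epsilon}}\left[ w(t)\,\lVert \mathbf{v}_\phi(\mathbf{x}_t, t) - (\boldsymbol{\epsilon} - \mathbf{x}) \rVert_2^2 \right], \qquad \mathbf{x} = g(\boldsymbol{\theta})
$$

$$
\nabla_{\boldsymbol{\theta}} \mathcal{L} = \mathbb{E}_{t,\boldsymbol{\epsilon}}\left[ 2\,w(t)\,\big(\mathbf{v}_\phi(\mathbf{x}_t, t) - (\boldsymbol{\epsilon} - \mathbf{x})\big)^{\top} \left( (1-t)\,\frac{\partial \mathbf{v}_\phi}{\partial \mathbf{x}_t} + \mathbf{I} \right) \frac{\partial \mathbf{x}}{\partial \boldsymbol{\theta}} \right]
$$

The term $\partial \mathbf{v}_\phi / \partial \mathbf{x}_t$ is the network Jacobian, which is the expensive part to backpropagate through.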

As with SDS in DreamFusion, the paper sets the network Jacobian to the identity matrix. The final RFDS loss is as follows.
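To make the update concrete, here is a minimal NumPy sketch of an RFDS-style step on a 2D toy problem. It is not the paper's implementation: the interpolation $\mathbf{x}_t = (1-t)\mathbf{x} + t\boldsymbol{\epsilon}$, the constant weight, the identity generator $g(\boldsymbol{\theta}) = \boldsymbol{\theta}$, and the names `toy_velocity`, `rfds_grad`, `run_rfds`, `MU`, and `S` are all assumptions made for illustration. A closed-form velocity for a Gaussian "data" distribution stands in for the pretrained network.

```python
import numpy as np

MU = np.array([2.0, -1.0])  # mean of the toy data distribution (assumption)
S = 0.5                     # its standard deviation (assumption)


def toy_velocity(x_t, t):
    """Closed-form E[eps - x | x_t] for x ~ N(MU, S^2 I), eps ~ N(0, I),
    with x_t = (1 - t) * x + t * eps. Stands in for a pretrained v_phi."""
    var_t = (1 - t) ** 2 * S ** 2 + t ** 2
    gain = (t - (1 - t) * S ** 2) / var_t
    return -MU + gain * (x_t - (1 - t) * MU)


def rfds_grad(x, t, eps, velocity=toy_velocity, w=1.0):
    """RFDS-style gradient w.r.t. x: the residual between predicted and
    target velocity, with the network Jacobian replaced by identity."""
    x_t = (1 - t) * x + t * eps
    return w * (velocity(x_t, t) - (eps - x))


def run_rfds(steps=500, lr=0.05, batch=64, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(2)  # g(theta) = theta here, so x itself is optimized
    for _ in range(steps):
        t = rng.uniform(0.2, 0.8)              # random time step
        eps = rng.standard_normal((batch, 2))  # fresh noise every step
        x = x - lr * rfds_grad(x, t, eps).mean(axis=0)
    return x


x_final = run_rfds()
print(x_final)  # ends close to MU, the mode of the toy distribution
```

In this toy the optimized point is pulled to the mode of the distribution, which is the mode-seeking behavior associated with SDS-style losses. With a real text-to-image model, `toy_velocity` would be replaced by the text-conditioned network and `x` by a rendered image.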

1.2 iRFDS

The velocity prediction objective of a rectified flow has a time-symmetry property, which can be used to optimize the noise. Changing the optimization target from $\mathbf{x} = g(\boldsymbol{\theta})$ to $\boldsymbol{\epsilon}$ turns Eq. (7) into the following.

Here, $w^\prime(t)$ has the opposite sign of $w(t)$.
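A short derivation sketch of where the sign flip comes from, assuming the convention $\mathbf{x}_t = (1-t)\mathbf{x} + t\boldsymbol{\epsilon}$ and writing the velocity residual as $\mathbf{r} = \mathbf{v}_\phi(\mathbf{x}_t, t) - (\boldsymbol{\epsilon} - \mathbf{x})$ (both the convention and the symbol $\mathbf{r}$ are assumptions, not the paper's notation):

$$
\frac{\partial \mathbf{r}}{\partial \mathbf{x}} = (1-t)\,\frac{\partial \mathbf{v}_\phi}{\partial \mathbf{x}_t} + \mathbf{I}, \qquad
\frac{\partial \mathbf{r}}{\partial \boldsymbol{\epsilon}} = t\,\frac{\partial \mathbf{v}_\phi}{\partial \mathbf{x}_t} - \mathbf{I}
$$

If the network Jacobian term is dropped, the remaining factors are $+\mathbf{I}$ and $-\mathbf{I}$. If it is replaced by the identity, they are $(2-t)\,\mathbf{I}$ and $(t-1)\,\mathbf{I}$ for $t \in (0, 1)$. Either way the same residual enters the image update and the noise update with opposite signs, which is consistent with $w^\prime(t)$ and $w(t)$ having opposite signs.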

This makes inversion possible, which in turn enables image editing.

1.3 RFDS-Rev

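Like the original SDS, RFDS produces results that lack object detail. Inspired by ReFlow, the paper proposes RFDS-Rev. The mode-seeking behavior of RFDS leads to an averaged velocity, which ultimately yields blurred images. The paper introduces the following assumption.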

Based on this assumption, the paper proposes a two-stage method to improve RFDS.

Although the method was designed with a ReFlow-finetuned model in mind, it improved performance regardless of whether ReFlow training had been done.
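The control flow of such a two-stage scheme can be sketched as below. This is only one reading of it, under loud assumptions: stage 1 refines the sampled noise with a few iRFDS-style steps so that it better matches the current image, and stage 2 reuses that noise for an RFDS-style update. The shared time step `t`, the step sizes, the toy closed-form velocity, and the names `residual`, `run_rfds_rev`, `inv_steps`, and `inv_lr` are hypothetical and not taken from the paper.

```python
import numpy as np

MU = np.array([2.0, -1.0])  # mean of the toy data distribution (assumption)
S = 0.5                     # its standard deviation (assumption)


def toy_velocity(x_t, t):
    """Closed-form E[eps - x | x_t] for x ~ N(MU, S^2 I), eps ~ N(0, I),
    with x_t = (1 - t) * x + t * eps. Stands in for a pretrained v_phi."""
    var_t = (1 - t) ** 2 * S ** 2 + t ** 2
    gain = (t - (1 - t) * S ** 2) / var_t
    return -MU + gain * (x_t - (1 - t) * MU)


def residual(x, t, eps):
    """Predicted velocity minus target velocity (eps - x)."""
    x_t = (1 - t) * x + t * eps
    return toy_velocity(x_t, t) - (eps - x)


def run_rfds_rev(steps=300, lr=0.05, inv_steps=1, inv_lr=0.1, batch=64, seed=0):
    rng = np.random.default_rng(seed)
    x = np.zeros(2)  # the generator is the identity in this toy
    for _ in range(steps):
        t = rng.uniform(0.2, 0.8)
        eps = rng.standard_normal((batch, 2))
        # Stage 1 (iRFDS-style): move the noise along +residual, i.e. the
        # opposite sign of the image update, keeping x fixed.
        for _ in range(inv_steps):
            eps = eps + inv_lr * residual(x, t, eps)
        # Stage 2 (RFDS-style): update x using the refined noise.
        x = x - lr * residual(x, t, eps).mean(axis=0)
    return x


x_rev = run_rfds_rev()
print(x_rev)  # ends close to MU in this unimodal toy
```

Because the toy distribution is unimodal, this only demonstrates the loop structure and the sign convention. It says nothing about the detail improvement reported for real text-to-image models. Setting `inv_steps=0` reduces the loop to a plain RFDS-style update.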

1.4 Applying iRFDS and RFDS-Rev to Diffusion models

By converting the score function into a velocity field, the three methods above can also be applied to diffusion models. When applied to a diffusion model, the RFDS baseline is identical to the SDS loss.
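One way to see the equivalence is the following sketch. It assumes a diffusion forward process $\mathbf{x}_t = \alpha_t \mathbf{x} + \sigma_t \boldsymbol{\epsilon}$ and a noise-prediction network $\boldsymbol{\epsilon}_\phi$ whose score is $-\boldsymbol{\epsilon}_\phi / \sigma_t$. This notation is an assumption, and the paper's exact conversion may be written differently. The true and predicted velocities are

$$
\mathbf{v} = \dot{\alpha}_t\,\mathbf{x} + \dot{\sigma}_t\,\boldsymbol{\epsilon}, \qquad
\hat{\mathbf{v}} = \dot{\alpha}_t\,\frac{\mathbf{x}_t - \sigma_t\,\boldsymbol{\epsilon}_\phi}{\alpha_t} + \dot{\sigma}_t\,\boldsymbol{\epsilon}_\phi
$$

Substituting $\mathbf{x}_t = \alpha_t \mathbf{x} + \sigma_t \boldsymbol{\epsilon}$ gives

$$
\hat{\mathbf{v}} - \mathbf{v} = \left( \dot{\sigma}_t - \frac{\dot{\alpha}_t\,\sigma_t}{\alpha_t} \right) \big( \boldsymbol{\epsilon}_\phi(\mathbf{x}_t, t) - \boldsymbol{\epsilon} \big)
$$

so the velocity residual used by RFDS is the SDS residual $\boldsymbol{\epsilon}_\phi - \boldsymbol{\epsilon}$ multiplied by a time-dependent factor that can be absorbed into $w(t)$.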

2. Experiments

2.1 RFDS vs RFDS-Rev vs Diffusion Prior vs Diffusion RFDS-Rev

Toy Experiments on Optimization of 2D case

Text-to-3D Generation by Lifting 2D Models

2.2 iRFDS vs Diffusion Methods

2.3 Ablation Experiments

iRFDS optimization steps in RFDS-Rev

w/ Network Jacobian vs w/o Network Jacobian
