Paper Review - Preference Learning (WIP)

Created 2025-07-19 | ML/CV/NLP
Author: Vines
Link: http://vinesmsuic.github.io/paper-preferencelearning/
Copyright Notice: All articles on this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.
Literature Review
Contents
  1. DPO (2023)
    1.0.1. Why PPO is hard to get right for LLMs
    1.0.2. Why DPO is easier and often better
    1.1. Intuition
      1.1.1. Core Idea of Bradley-Terry (BT) Preference Model
      1.1.2. Core Idea of Plackett–Luce (PL) Preference Model
    1.2. Code Sketch
  2. Extended Reads / Videos