
Paper Review - Audio-Visual Related Research (WIP)

Created 2025-02-25 | ML/CV/NLP
Author: Vines
Link: http://vinesmsuic.github.io/paper-survey-av/
Copyright Notice: All articles on this blog are licensed under CC BY-NC-SA 4.0 unless otherwise stated.
Literature Review
Contents
  1. Video-To-Audio
    1.1. MMAudio (2024)
    1.2. MultiFoley (2024)
  2. Binaural Audio Generation based on Video
    2.1. CCStereo (2025)
    2.2. PseudoBinaural (CVPR 2021)
  3. Audio Editing Based on Visuals
    3.1. AVEdit (ACCV 2024)
      3.1.1. Method
      3.1.2. Limitations
      3.1.3. Other Contributions