Anonymous
10/17/2025, 8:15:42 AM
No.106915790
>mfw Research news
10/17/2025
>Deep Compositional Phase Diffusion for Long Motion Sequence Generation
https://github.com/asdryau/TransPhase
>RealDPO: Real or Not Real, that is the Preference
https://vchitect.github.io/RealDPO-Project
>DialectGen: Benchmarking and Improving Dialect Robustness in Multimodal Generation
https://arxiv.org/abs/2510.14949
>Coupled Diffusion Sampling for Training-Free Multi-View Image Editing
https://coupled-diffusion.github.io
>ScaleWeaver: Weaving Efficient Controllable T2I Generation with Multi-Scale Reference Attention
https://arxiv.org/abs/2510.14882
>ImagerySearch: Adaptive Test-Time Search for Video Generation Beyond Semantic Dependency Constraints
https://arxiv.org/abs/2510.14847
>FraQAT: Quantization Aware Training with Fractional bits
https://arxiv.org/abs/2510.14823
>In-Context Learning with Unpaired Clips for Instruction-based Video Editing
https://arxiv.org/abs/2510.14648
>DEXTER: Diffusion-Guided EXplanations with TExtual Reasoning for Vision Models
https://arxiv.org/abs/2510.14741
>DOS: Directional Object Separation in Text Embeddings for Multi-Object Image Generation
https://arxiv.org/abs/2510.14376
>Adapting Self-Supervised Representations as a Latent Space for Efficient Generation
https://arxiv.org/abs/2510.14630
>Efficient Video Sampling: Pruning Temporally Redundant Tokens for Faster VLM Inference
https://arxiv.org/abs/2510.14624
>Learning an Image Editing Model without Image Editing Pairs
https://nupurkmr9.github.io/npedit
>Consistent text-to-image generation via scene de-contextualization
https://arxiv.org/abs/2510.14553
>Noise Projection: Closing the Prompt-Agnostic Gap Behind Text-to-Image Misalignment in Diffusion Models
https://arxiv.org/abs/2510.14526
>VIST3A: Text-to-3D by Stitching a Multi-view Reconstruction Network to a Video Generator
https://gohyojun15.github.io/VIST3A
>Self-Augmented Visual Contrastive Decoding
https://arxiv.org/abs/2510.13315