Anonymous
10/30/2025, 8:17:27 AM
No.107050567
10/29/2025
>VC4VG: Optimizing Video Captions for Text-to-Video Generation
https://arxiv.org/abs/2510.24134
>ETC: training-free diffusion models acceleration with Error-aware Trend Consistency
https://arxiv.org/abs/2510.24129
>OmniText: A Training-Free Generalist for Controllable Text-Image Manipulation
https://arxiv.org/abs/2510.24093
>Routing Matters in MoE: Scaling Diffusion Transformers with Explicit Routing Guidance
https://arxiv.org/abs/2510.24711
>What do vision-language models see in the context? Investigating multimodal in-context learning
https://arxiv.org/abs/2510.24331
>Training-free Source Attribution of AI-generated Images via Resynthesis
https://arxiv.org/abs/2510.24278
>Beyond Objects: Contextual Synthetic Data Generation for Fine-Grained Classification
https://arxiv.org/abs/2510.24078
>SAGE: Structure-Aware Generative Video Transitions between Diverse Clips
https://kan32501.github.io/sage.github.io
>Group Relative Attention Guidance for Image Editing
https://arxiv.org/abs/2510.24657
>A Dual-Branch CNN for Robust Detection of AI-Generated Facial Forgeries
https://arxiv.org/abs/2510.24640
>Decoupled MeanFlow: Turning Flow Models into Flow Maps for Accelerated Sampling
https://arxiv.org/abs/2510.24474
>MC-SJD: Maximal Coupling Speculative Jacobi Decoding for Autoregressive Visual Generation Acceleration
https://arxiv.org/abs/2510.24211
>Enhancing CLIP Robustness via Cross-Modality Alignment
https://arxiv.org/abs/2510.24038
>AutoPrompt: Automated Red-Teaming of Text-to-Image Models via LLM-Driven Adversarial Prompts
https://arxiv.org/abs/2510.24034
>Efficient Cost-and-Quality Controllable Arbitrary-scale Super-resolution with Fourier Constraints
https://arxiv.org/abs/2510.23978
>Neural USD: An object-centric framework for iterative editing and control
https://escontrela.me/neural_usd