Anonymous
11/1/2025, 12:33:59 AM
No.107067768
>mfw Research news
10/31/2025
>A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models
https://arxiv.org/abs/2510.26441
>The Quest for Generalizable Motion Generation: Data, Model, and Evaluation
https://arxiv.org/abs/2510.26794
>SteerVLM: Robust Model Control through Lightweight Activation Steering for Vision Language Models
https://arxiv.org/abs/2510.26769
>LoCoT2V-Bench: A Benchmark for Long-Form and Complex T2V Generation
https://arxiv.org/abs/2510.26412
>GLYPH-SR: Can We Achieve Both High-Quality Image Super-Resolution and High-Fidelity Text Recovery via VLM-guided Latent Diffusion Model?
https://arxiv.org/abs/2510.26339
>Sketch2PoseNet: Efficient and Generalized Sketch to 3D Human Pose Prediction
https://arxiv.org/abs/2510.26196
>BasicAVSR: Arbitrary-Scale Video Super-Resolution via Image Priors and Enhanced Motion Compensation
https://arxiv.org/abs/2510.26149
>FullPart: Generating each 3D Part at Full Resolution
https://fullpart3d.github.io
>ConceptScope: Characterizing Dataset Bias via Disentangled Visual Concepts
https://arxiv.org/abs/2510.26186
>StructLayoutFormer: Conditional Structured Layout Generation via Structure Serialization and Disentanglement
https://arxiv.org/abs/2510.26141
>Masked Diffusion Captioning for Visual Feature Learning
https://cfeng16.github.io/mdlm4vfl
>SEE4D: Pose-Free 4D Generation via Auto-Regressive Video Inpainting
https://see-4d.github.io
>Security Risk of Misalignment between Text and Image in Multi-modal Model
https://arxiv.org/abs/2510.26105
>Dynamic VLM-Guided Negative Prompting for Diffusion Models
https://arxiv.org/abs/2510.26052
>MIRO: MultI-Reward cOnditioned pretraining improves T2I quality and efficiency
https://nicolas-dufour.github.io/miro
>SplitFlow: Flow Decomposition for Inversion-Free T2I Editing
https://arxiv.org/abs/2510.25970
>StyleGuard: Preventing T2I-Model-based Style Mimicry Attacks by Style Perturbations
https://arxiv.org/abs/2505.18766