Anonymous
7/5/2025, 5:19:16 AM
No.105804954
>mfw Resource news
07/04/2025
>YuzuUI: A New Frontend for SD WebUI
https://github.com/crstp/sd-yuzu-ui
>VLM Image Captioning: Multi-turn VLM bulk captioning using your API service
https://github.com/victorchall/vlm-caption
>GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
https://huggingface.co/THUDM/GLM-4.1V-9B-Thinking
07/03/2025
>FreeMorph: Tuning-Free Generalized Image Morphing with Diffusion Model
https://yukangcao.github.io/FreeMorph
>LongAnimation: Long Animation Generation with Dynamic Global-Local Memory
https://cn-makers.github.io/long_animation_web
>CI-VID: A Coherent Interleaved Text-Video Dataset
https://github.com/ymju-BAAI/CI-VID
>Depth Anything at Any Condition
https://ghost233lism.github.io/depthanything-AC-page
>SketchColour: Channel Concat Guided DiT-based Sketch-to-Colour Pipeline for 2D Animation
https://bconstantine.github.io/SketchColour
>ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation
https://wlaud1001.github.io/ReFlex
>Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers
https://github.com/zysxmu/SARDFQ
07/02/2025
>SimpleTuner v2.0.1 with 2x Flux training speedup on Hopper + Blackwell support
https://github.com/bghira/SimpleTuner/releases/tag/v2.0.1
>DoppleDanger: Live face swapping and voice cloning
https://github.com/luispark6/DoppleDanger
>DAM-VSR: Disentanglement of Appearance and Motion for Video Super-Resolution
https://kongzhecn.github.io/projects/dam-vsr
>LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs
https://github.com/CnFaker/LLaVA-SP
>ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models
https://zifuwan.github.io/ONLY
>Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes
https://github.com/EndoluminalSurgicalVision-IMR/SR-LoRA