Qwen-Image-Edit

Vision Model

qwen/qwen-image-edit

by Alibaba Qwen Team • Release date: 8/19/2025

Qwen-Image-Edit is a general-purpose image editing model by Alibaba Qwen, enabling high-quality semantic and precise bilingual text edits on images.

Coming Soon

Technical Specs

Release Date: 8/19/2025

Input Formats

image

Output Formats

image

Capabilities & Features

Capabilities

Semantic and appearance-based image editing
Precise bilingual text editing within images (Chinese and English)
Add, remove, or modify image elements while preserving other areas
Object rotation and style transfer
IP (intellectual property) creative editing
State-of-the-art performance on public image editing benchmarks

Supported File Types

.png, .jpeg, .jpg

Qwen-Image-Edit - Background

Overview

Qwen-Image-Edit is a general-purpose image editing model developed by the Alibaba Qwen Team. Built on the 20B-parameter Qwen-Image architecture, it is designed to deliver high-quality and efficient image editing capabilities. The model excels at both low-level visual modifications and advanced semantic transformations, making it suitable for a wide range of creative and professional applications.

Development History

Qwen-Image-Edit was released on August 19, 2025, as an extension of the Qwen-Image model. Its development focused on carrying Qwen-Image's text rendering abilities over to precise image editing tasks. The model was engineered to handle both appearance-level and semantic-level edits, and its strong results on public benchmarks position it as a foundation model for image editing.

Key Innovations

Dual-level editing supporting both low-level appearance changes and high-level semantic transformations
Accurate bilingual text editing within images, preserving original font, size, and style
Integration of advanced text rendering capabilities into image editing workflows

Qwen-Image-Edit - Technical Specifications

Architecture

Qwen-Image-Edit is based on the Qwen-Image architecture, utilizing a transformer-based design optimized for image understanding and manipulation. The model is engineered to handle complex editing tasks, including both pixel-level and semantic-level modifications, and incorporates specialized modules for text rendering within images.

Parameters

The model has 20 billion parameters, which places it among large-scale vision-language models for image editing.

Capabilities

Low-level visual appearance editing such as adding, deleting, or modifying elements while preserving unaffected regions
High-level semantic editing including IP creation, object rotation, and style transfer with semantic consistency
Precise bilingual (Chinese and English) text editing within images, maintaining original visual characteristics

Limitations

Publicly available information does not specify context length or detailed technical constraints
Model performance and capabilities may evolve as further updates are released

Qwen-Image-Edit - Performance

Strengths

Outstanding results on multiple public image editing benchmarks
Robust foundational model for diverse and complex image editing tasks

Real-world Effectiveness

Qwen-Image-Edit delivers edits that combine visual fidelity with semantic accuracy. Its ability to handle intricate text modifications while keeping style consistent makes it useful for professional design, content creation, and automated editing workflows. These claims rest on the model's results on public image editing benchmarks.

Qwen-Image-Edit - When to Use

Scenarios

You have a creative design workflow that requires precise modifications to images, such as adding or removing visual elements without affecting the rest of the image. Qwen-Image-Edit is ideal for these tasks due to its ability to perform localized edits with high fidelity, ensuring the integrity of the original content is preserved.
You need to generate marketing materials or branded content that involves editing or inserting bilingual text directly into images. The model’s advanced text editing capabilities allow for seamless integration of Chinese and English text, maintaining the original font, size, and style, which streamlines localization and branding efforts.
You are developing applications that require advanced semantic image editing, such as object rotation, style transfer, or IP creation. Qwen-Image-Edit excels in these scenarios by enabling high-level transformations while preserving semantic consistency, reducing manual intervention and accelerating creative workflows.

Best Practices

Use high-quality input images in a supported file type (.png, .jpeg, .jpg) to maximize output fidelity
Clearly specify editing instructions, especially for complex semantic or text-based modifications, to achieve optimal results
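
The two practices above can be checked before an image is submitted for editing. The following is a minimal sketch in Python, not part of any Qwen API: the function names `is_supported_image` and `check_edit_request` are hypothetical, and the only facts taken from this page are the three supported file extensions. It checks the file extension and that an instruction was actually provided; it does not inspect image contents or call the model.

```python
from pathlib import Path

# Extensions listed under "Supported File Types" on this page.
SUPPORTED_EXTENSIONS = {".png", ".jpeg", ".jpg"}


def is_supported_image(path: str) -> bool:
    """Return True if the file extension is a listed type (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def check_edit_request(image_path: str, instruction: str) -> list[str]:
    """Hypothetical pre-flight check: return a list of problems, empty if none."""
    problems = []
    if not is_supported_image(image_path):
        suffix = Path(image_path).suffix or "(none)"
        problems.append(f"unsupported file type: {suffix}")
    if not instruction.strip():
        problems.append("editing instruction is empty")
    return problems
```

A caller might run `check_edit_request("poster.jpg", "Replace the headline with 'Summer Sale'")` and proceed only when the returned list is empty.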
