LibreChat supports multiple AI image generation models, allowing you to create images from text descriptions through various providers and endpoints.
Supported Image Models
DALL-E
OpenAI’s DALL-E 2 and DALL-E 3 models
Flux
Black Forest Labs’ Flux models via API
Stable Diffusion
Stability AI’s image generation
Gemini Image Gen
Google’s Imagen through Gemini
DALL-E Configuration
OpenAI DALL-E
DALL-E is available through the OpenAI endpoint and is configured in .env. Available models:
- dall-e-3: Latest model with highest quality
- dall-e-2: Legacy model, faster and cheaper
Azure DALL-E
For Azure-hosted DALL-E, set the Azure-specific values in .env.
DALL-E Reverse Proxy
To use a custom endpoint, set the reverse proxy URL in .env.
Image Generation Parameters
DALL-E 3 Parameters
prompt: Text description of the desired image (up to 4000 characters)
size: Image dimensions:
- 1024x1024: Square (default)
- 1792x1024: Wide/landscape
- 1024x1792: Tall/portrait
quality: Image quality:
- standard: Faster, lower cost
- hd: Higher detail, slower, more expensive
style: Image style:
- vivid: Hyper-real and dramatic
- natural: More natural, less dramatic
Example DALL-E Request
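The following is a minimal sketch of how a request body could be assembled and validated against the DALL-E 3 parameters described above. The function name `build_dalle3_request` is hypothetical, and the defaults for quality and style are assumptions (only the size default is stated in this page). It builds a plain dictionary and does not call any API.

```python
# Allowed values, taken from the DALL-E 3 parameter list in this page.
ALLOWED_SIZES = {"1024x1024", "1792x1024", "1024x1792"}
ALLOWED_QUALITIES = {"standard", "hd"}
ALLOWED_STYLES = {"vivid", "natural"}
MAX_PROMPT_CHARS = 4000


def build_dalle3_request(prompt, size="1024x1024", quality="standard", style="vivid"):
    """Return a validated request body (hypothetical helper, no network call).

    The defaults for quality and style are assumptions made for this sketch.
    """
    if not prompt or not prompt.strip():
        raise ValueError("prompt must not be empty")
    if len(prompt) > MAX_PROMPT_CHARS:
        raise ValueError(f"prompt exceeds {MAX_PROMPT_CHARS} characters")
    if size not in ALLOWED_SIZES:
        raise ValueError(f"unsupported size: {size}")
    if quality not in ALLOWED_QUALITIES:
        raise ValueError(f"unsupported quality: {quality}")
    if style not in ALLOWED_STYLES:
        raise ValueError(f"unsupported style: {style}")
    return {
        "model": "dall-e-3",
        "prompt": prompt,
        "size": size,
        "quality": quality,
        "style": style,
    }


request_body = build_dalle3_request(
    "A lighthouse on a rocky coast at dusk, photorealistic",
    size="1792x1024",
    quality="hd",
    style="natural",
)
```

Validating before sending avoids spending a request on a combination the model will reject, such as an unsupported size.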
Flux API Configuration
Flux models are available through various providers. Supported actions:
- generate: Standard image generation
- generate_finetuned: Use fine-tuned models
- list_finetunes: Get available custom models
Aspect ratios:
- 1:1: Square
- 16:9: Landscape
- 9:16: Portrait
- 4:3, 3:4, 21:9, etc.
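To make the aspect ratios concrete, here is a small sketch that converts a ratio string into pixel dimensions. The helper `dimensions_for_ratio`, the 1024-pixel long edge, and the rounding to a multiple of 32 are all assumptions for illustration; the actual dimensions a Flux provider accepts may differ, so check the provider's documentation.

```python
def dimensions_for_ratio(ratio: str, long_edge: int = 1024, multiple: int = 32) -> tuple[int, int]:
    """Convert a ratio such as "16:9" into (width, height).

    Hypothetical helper: long_edge and multiple are illustrative assumptions,
    not values taken from any provider's API.
    """
    w_part, h_part = ratio.split(":")
    w_ratio, h_ratio = int(w_part), int(h_part)
    if w_ratio <= 0 or h_ratio <= 0:
        raise ValueError(f"invalid aspect ratio: {ratio}")

    # Pin the longer side to long_edge and scale the shorter side.
    if w_ratio >= h_ratio:
        width, height = long_edge, long_edge * h_ratio / w_ratio
    else:
        width, height = long_edge * w_ratio / h_ratio, long_edge

    def snap(value: float) -> int:
        # Round to the nearest multiple, never below one multiple.
        return max(multiple, round(value / multiple) * multiple)

    return snap(width), snap(height)


landscape = dimensions_for_ratio("16:9")
portrait = dimensions_for_ratio("9:16")
```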
Stable Diffusion Configuration
Configure Stable Diffusion endpoints in .env.
Google Gemini Image Generation
Gemini models with image generation are also supported.
Using Image Generation
In Chat
Simply ask the AI to generate an image.
With Agents
Configure agents with image generation tools.
Tool Configuration
Image generation tools are automatically available.
Generated Image Handling
File Context
Generated images are stored with specific context.
File Storage
Configure storage for generated images in librechat.yaml.
Download and Usage
Generated images can be:
- Downloaded directly from the UI
- Referenced in future messages
- Used with vision models for analysis
- Saved to conversation history
Advanced Techniques
Prompt Engineering
Create effective image prompts:
- Be specific about subject, setting, lighting
- Include style preferences (photorealistic, artistic, etc.)
- Mention colors, mood, atmosphere
- Specify composition and perspective
- Add details about time of day, season
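The checklist above can be turned into a small prompt builder. This is only an illustrative sketch: `build_image_prompt` and its field names are hypothetical, and joining the parts with commas is one convention among many.

```python
def build_image_prompt(subject, setting="", lighting="", style="",
                       mood="", composition="", time_of_day=""):
    """Join the non-empty prompt elements into one comma-separated prompt.

    Hypothetical helper: the fields mirror the checklist in this section.
    """
    if not subject or not subject.strip():
        raise ValueError("subject is required")
    parts = [subject, setting, lighting, style, mood, composition, time_of_day]
    # Skip fields that were left empty so the prompt has no stray commas.
    return ", ".join(part.strip() for part in parts if part and part.strip())


prompt = build_image_prompt(
    subject="a red fox",
    setting="snowy pine forest",
    lighting="soft golden light",
    style="photorealistic",
    composition="low-angle close-up",
    time_of_day="early morning in winter",
)
```

Keeping each element in its own field makes iterative refinement easier, since one attribute (for example, lighting) can be changed while the rest of the prompt stays fixed.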
Iterative Refinement
Batch Generation
Model Comparison
- DALL-E 3
- DALL-E 2
- Flux
- Stable Diffusion
DALL-E 3 strengths:
- Highest quality results
- Best prompt following
- Good text rendering
- Consistent style
Best for:
- Professional content
- Marketing materials
- Detailed scenes
- Text in images
File Size Configuration
Set limits for generated images in librechat.yaml.
Best Practices
Cost Awareness: HD images and larger sizes cost more, so use them only when the extra detail or resolution is needed.
Example Use Cases
Marketing Materials
Social Media Content
Concept Art
Educational Diagrams
Troubleshooting
Content Policy Violation
Issue: Image rejected for policy violation
Solutions:
- Revise prompt to remove potentially sensitive content
- Be more specific and appropriate
- Avoid violent, adult, or copyrighted content
- Use different phrasing
Poor Quality Results
Issue: Generated images don’t match expectations
Solutions:
- Add more detail to prompt
- Specify style and mood
- Use quality: “hd” setting
- Try different style parameter
- Iterate with refinements
Generation Fails
Issue: Tool fails to generate image
Solutions:
- Check API key is valid
- Verify endpoint configuration
- Check rate limits
- Review error messages
- Try simpler prompt
Environment Variables
Cost Optimization
Pricing (DALL-E 3, per image):
- Standard 1024x1024: $0.040
- Standard 1024x1792/1792x1024: $0.080
- HD 1024x1024: $0.080
- HD 1024x1792/1792x1024: $0.120
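As a rough aid for budgeting, the sketch below estimates cost from the price list above. The `estimate_cost` helper is hypothetical, and the prices are copied from this page, so they may be out of date; verify current pricing with the provider before relying on the numbers.

```python
from decimal import Decimal

# Per-image prices in USD, copied from the list in this section.
DALLE3_PRICES = {
    ("standard", "1024x1024"): Decimal("0.040"),
    ("standard", "1024x1792"): Decimal("0.080"),
    ("standard", "1792x1024"): Decimal("0.080"),
    ("hd", "1024x1024"): Decimal("0.080"),
    ("hd", "1024x1792"): Decimal("0.120"),
    ("hd", "1792x1024"): Decimal("0.120"),
}


def estimate_cost(quality: str, size: str, count: int = 1) -> Decimal:
    """Return the estimated USD cost for `count` images (hypothetical helper)."""
    if count < 0:
        raise ValueError("count must not be negative")
    try:
        unit_price = DALLE3_PRICES[(quality, size)]
    except KeyError:
        raise ValueError(f"no price listed for quality={quality!r}, size={size!r}")
    return unit_price * count


draft_cost = estimate_cost("standard", "1024x1024", count=10)
final_cost = estimate_cost("hd", "1792x1024", count=10)
```

With these figures, ten standard square drafts cost a third of ten HD landscape images, which is why iterating at standard quality and switching to HD only for the final render keeps costs down.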
Related Features
Multimodal
Analyze generated images with vision models
Agents
Use image generation in agent workflows
Artifacts
Display generated images in artifacts
Code Interpreter
Process and manipulate generated images