16 Commits

Author SHA1 Message Date
MiniMax
1391b63464 Merge pull request #52 from fjqz177/main
Thanks for PR
2026-04-01 14:42:16 +08:00
MiniMax
b80cce48bd Merge pull request #54 from smallmj/main
fix: i2i mode jq command to handle base64 with embedded newlines
2026-04-01 14:36:46 +08:00
smallmj
9643620f1a fix: i2i mode jq command to handle base64 with embedded newlines
- Use 'jq -Rs split' instead of 'jq -R' to properly handle base64 data URIs
  that may contain embedded newlines in the data
- Fix double-nesting of subject_reference array causing API error 2013
- Also fix: base64 -w 0 to avoid line wrapping in base64 output
- Write payload to temp file for curl to avoid shell arg length limits
- Add full response output on API error for easier debugging

Fixes #53
2026-04-01 10:27:53 +08:00
fjqz177
f02366cd0f fix(install): create properly prefixed symlinks for independent skill management
Allow users to identify and uninstall minimax skills without affecting other skills in the same directory.
- Create individual minimax-{skill_name} symlinks instead of single broken link
- Uninstall now removes only minimax-* symlinks, preserving other skills
- Fix Windows PowerShell naming to match
- Add Chinese installation guide (low-priority)
2026-03-31 21:00:33 +08:00
zest0198
cf44f7b122 Merge pull request #42 from Ethereal49/patch-1
update: remove redundant characters in README_zh.md
2026-03-30 11:59:25 +08:00
Ethereal49
0e006d124b Update README: remove redundant characters
removed redundant characters
2026-03-29 15:54:12 +08:00
liyuan97
f87b423670 Merge pull request #21 from divitkashyap/feat/vision-analysis
feat(vision-analysis): add image analysis skill with OCR, UI review, and chart extraction
2026-03-27 20:47:30 +08:00
liyuan97
37046a3edb Merge pull request #25 from JithendraNara/codex/docs-install-guidance
docs: add Cursor install guide and VS Code note
2026-03-27 20:46:42 +08:00
liyuan97
ee1e834f6c Merge pull request #34 from dewu0224/fix/windows-i2i-payload
fix(image): use temp files for jq to avoid Windows command-line length limit
2026-03-27 20:44:41 +08:00
liyuan97
551541a974 Merge pull request #39 from MiniMax-AI/feat/minimax-plan-limits-768p-only
feat(minimax): add plan limits & fix video constraints to 768P/6s only
2026-03-27 20:43:57 +08:00
Claude
b4e6c16f4b fix(image): use temp files for jq to avoid Windows command-line length limit
When running in i2i (image-to-image) mode, the script passes a
base64-encoded image as a jq argument on the command line. On Windows,
cmd.exe limits arguments to ~32 KB, causing 'Argument list too long' errors
when the reference image is large.

Fix: write intermediate JSON fragments to temp files and pass them to jq
via --from-file instead of on the command line. This works reliably on
Windows and is equally valid on Linux/macOS.

Fixes the i2i (--mode i2i --ref-image) workflow on Windows.
2026-03-27 01:11:07 +08:00
JithendraNara
56a37aec73 docs: clarify Cursor Windows path example
2026-03-25 13:57:34 -04:00
JithendraNara
6c404106e3 docs: add Cursor install guide and VS Code note
2026-03-25 13:53:04 -04:00
Divit Kashyap
0a61f2be6a Merge branch 'main' into feat/vision-analysis
2026-03-25 11:55:49 +00:00
Divit Kashyap
6d8beb6ade feat(vision-analysis): add proactive MCP setup helper with multi-environment instructions
2026-03-25 11:54:29 +00:00
Divit Kashyap
0b2927a366 feat(vision-analysis): add image analysis skill with OCR, UI review, chart extraction
2026-03-25 11:23:06 +00:00
7 changed files with 473 additions and 32 deletions

93
.cursor-plugin/INSTALL.md Normal file

@@ -0,0 +1,93 @@
# Installing MiniMax Skills for Cursor
Enable MiniMax skills in Cursor by cloning the repository locally and pointing Cursor's skills path at the `skills/` directory.
## Prerequisites
- Cursor installed
- Git
## Installation
### macOS / Linux
```bash
git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
```
Set Cursor's skills path to:
```text
~/.cursor/minimax-skills/skills/
```
### Windows (PowerShell)
```powershell
git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.cursor\minimax-skills"
```
Set Cursor's skills path to:
```text
C:\Users\YOUR_USERNAME\.cursor\minimax-skills\skills\
```
Replace `YOUR_USERNAME` with your Windows account name.
After saving the path, restart Cursor or reload the window so it rescans local skills.
## Verify
Confirm the clone exists and contains `SKILL.md` files:
### macOS / Linux
```bash
find ~/.cursor/minimax-skills/skills -maxdepth 2 -name SKILL.md | head
```
### Windows (PowerShell)
```powershell
Get-ChildItem "$env:USERPROFILE\.cursor\minimax-skills\skills" -Directory | ForEach-Object {
  Get-ChildItem $_.FullName -Filter SKILL.md
}
```
## Updating
### macOS / Linux
```bash
cd ~/.cursor/minimax-skills && git pull
```
### Windows (PowerShell)
```powershell
Set-Location "$env:USERPROFILE\.cursor\minimax-skills"
git pull
```
## Uninstalling
### macOS / Linux
```bash
rm -rf ~/.cursor/minimax-skills
```
### Windows (PowerShell)
```powershell
Remove-Item -Recurse -Force "$env:USERPROFILE\.cursor\minimax-skills"
```
## VS Code Note
This repository does not currently ship a standalone VS Code extension.
If you use VS Code, the recommended options are:
- run a supported CLI tool such as Codex, Claude Code, or OpenCode inside the VS Code integrated terminal
- use Cursor if you want native local-skills configuration from this repository


@@ -12,7 +12,10 @@
 git clone https://github.com/MiniMax-AI/skills.git ~/.minimax-skills
 mkdir -p ~/.config/opencode/skills
-ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
+for skill in ~/.minimax-skills/skills/*/; do
+  skill_name=$(basename "$skill")
+  ln -s "$skill" ~/.config/opencode/skills/minimax-"$skill_name"
+done
 ```
 ### Windows (PowerShell)
@@ -22,7 +25,7 @@ git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.minimax-sk
 New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
 Get-ChildItem "$env:USERPROFILE\.minimax-skills\skills" -Directory | ForEach-Object {
-  New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\$($_.Name)" -Target $_.FullName
+  New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\minimax-$($_.Name)" -Target $_.FullName
 }
 ```
@@ -58,14 +61,14 @@ Symlinks will automatically point to the updated content — no need to re-link.
 ### macOS / Linux
 ```bash
-rm -rf ~/.config/opencode/skills
+rm -f ~/.config/opencode/skills/minimax-*
 rm -rf ~/.minimax-skills
 ```
 ### Windows (PowerShell)
 ```powershell
-Remove-Item -Recurse -Force "$env:USERPROFILE\.config\opencode\skills"
+Get-ChildItem "$env:USERPROFILE\.config\opencode\skills\minimax-*" | Remove-Item -Force
 Remove-Item -Recurse -Force "$env:USERPROFILE\.minimax-skills"
 ```
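The prefixed-symlink scheme in this diff can be tried out in a scratch directory before touching a real skills folder. The sketch below is illustrative only: `install_skills` and `uninstall_skills` are hypothetical helper names that do not exist in the repository, and the source and destination directories are passed as arguments (use absolute paths so the links resolve).

```bash
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the repository.

# Link every skill directory under $1 into $2 as minimax-<name>.
install_skills() {
  local src="$1" dest="$2" skill name
  mkdir -p "$dest"
  for skill in "$src"/*/; do
    [ -d "$skill" ] || continue            # the glob matched nothing
    name="$(basename "$skill")"
    # -f and -n make re-running the install replace an existing link
    ln -sfn "${skill%/}" "$dest/minimax-$name"
  done
}

# Remove only the minimax-* symlinks in $1; every other entry is left alone.
uninstall_skills() {
  local dest="$1" link
  for link in "$dest"/minimax-*; do
    [ -L "$link" ] && rm -f "$link"
  done
  return 0
}
```

Checking `-L` before removing means a regular directory that happens to be named `minimax-something` would be kept, which is a slightly more conservative choice than the `rm -f` glob in the guide.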

85
.opencode/INSTALL_zh.md Normal file
View File

@@ -0,0 +1,85 @@
# Installing MiniMax Skills for OpenCode
## Prerequisites
- [OpenCode.ai](https://opencode.ai) installed
## Installation
### macOS / Linux
```bash
git clone https://github.com/MiniMax-AI/skills.git ~/.minimax-skills
mkdir -p ~/.config/opencode/skills
for skill in ~/.minimax-skills/skills/*/; do
skill_name=$(basename "$skill")
ln -s "$skill" ~/.config/opencode/skills/minimax-"$skill_name"
done
```
### Windows (PowerShell)
```powershell
git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.minimax-skills"
New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
Get-ChildItem "$env:USERPROFILE\.minimax-skills\skills" -Directory | ForEach-Object {
New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\minimax-$($_.Name)" -Target $_.FullName
}
```
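> **Note:** Creating symbolic links on Windows may require administrator privileges or Developer Mode to be enabled.
Restart OpenCode to discover the skills.
To verify, ask: "List available skills"
## Available Skills
- **frontend-dev**: Frontend development, including UI design, animation, and AI-generated media assets
- **fullstack-dev**: Full-stack backend architecture and frontend-backend integration
- **android-native-dev**: Native Android app development with Material Design 3
- **ios-application-dev**: iOS app development, including UIKit, SnapKit, and SwiftUI
- **shader-dev**: GLSL shader techniques for creating stunning visual effects (ShaderToy-compatible)
- **gif-sticker-maker**: Convert photos into animated GIF stickers (Funko Pop / Pop Mart style)
- **minimax-pdf**: Generate, fill, and reformat PDF documents using a token-based design system
- **pptx-generator**: Generate, edit, and read PowerPoint presentations
- **minimax-xlsx**: Open, create, read, analyze, edit, or validate Excel/spreadsheet files
- **minimax-docx**: Professional creation, editing, and formatting of Word documents using the OpenXML SDK
## Updating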
```bash
cd ~/.minimax-skills && git pull
```
Symlinks automatically point to the updated content, so there is no need to re-link.
## Uninstalling
### macOS / Linux
```bash
rm -f ~/.config/opencode/skills/minimax-*
rm -rf ~/.minimax-skills
```
### Windows (PowerShell)
```powershell
Get-ChildItem "$env:USERPROFILE\.config\opencode\skills\minimax-*" | Remove-Item -Force
Remove-Item -Recurse -Force "$env:USERPROFILE\.minimax-skills"
```
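## Troubleshooting
### Skills not found
1. Verify that the symlinks exist: `ls -la ~/.config/opencode/skills/`
2. Each skill folder should contain a `SKILL.md` file
3. Restart OpenCode after installation
## Getting Help
- Issues: https://github.com/MiniMax-AI/skills/issues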
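
The troubleshooting checks above can be scripted. This is a minimal sketch that assumes the `minimax-` link prefix from the install step; `check_skill_links` is a hypothetical helper name, not something the repository provides.

```bash
# Hypothetical helper: report minimax-* links in directory $1 that are broken
# or whose target has no SKILL.md. Prints one line per problem and returns
# non-zero if any problem was found.
check_skill_links() {
  local dir="$1" link problems=0
  for link in "$dir"/minimax-*; do
    [ -L "$link" ] || continue             # skip non-links and an unmatched glob
    if [ ! -d "$link" ]; then
      echo "broken link: $link"
      problems=$((problems + 1))
    elif [ ! -f "$link/SKILL.md" ]; then
      echo "missing SKILL.md: $link"
      problems=$((problems + 1))
    fi
  done
  [ "$problems" -eq 0 ]
}
```

For example, `check_skill_links ~/.config/opencode/skills` prints nothing and returns success when every link resolves to a folder containing `SKILL.md`.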


@@ -22,6 +22,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
 | `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. | Official |
 | `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Covers creating new xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation, validation, and professional financial formatting. | Official |
 | `minimax-docx` | Professional DOCX document creation, editing, and formatting using OpenXML SDK (.NET). Three pipelines: create new documents from scratch, fill/edit content in existing documents, or apply template formatting with XSD validation gate-check. | Official |
+| `vision-analysis` | Analyze, describe, and extract information from images using vision AI models. Supports describe, OCR, UI mockup review, chart data extraction, and object detection. Powered by MiniMax VL API with OpenAI GPT-4V fallback. | Community |
 | `minimax-multimodal-toolkit` | Generate voice, music, video, and image content via MiniMax APIs, the unified entry for MiniMax multimodal use cases. Covers TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract) via FFmpeg. | Official |
 ## Installation
@@ -40,6 +41,7 @@ git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
 ```
 Add to your Cursor settings: point the skills path to `~/.cursor/minimax-skills/skills/`.
+For Windows setup and verification, see [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ### Codex
@@ -63,6 +65,17 @@ ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
 Restart OpenCode to discover the skills. See [`.opencode/INSTALL.md`](.opencode/INSTALL.md) for details.
+### VS Code
+This repository does not currently ship a standalone VS Code extension.
+If you use VS Code, the supported approach is to run one of the supported CLI tools inside the integrated terminal:
+- Codex
+- Claude Code
+- OpenCode
+If you want native local-skills configuration from this repo, use Cursor and follow [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ## Contributing
 We welcome contributions! Before submitting a PR, please read:


@@ -22,6 +22,7 @@
 | `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, table of contents, content, section divider, and summary slides), edit existing PPTX files via XML workflows, or extract text with markitdown. | Official |
 | `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Supports creating xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation and validation, and professional financial formatting. | Official |
 | `minimax-docx` | Professional DOCX document creation, editing, and typesetting based on the OpenXML SDK (.NET). Three pipelines: create new documents from scratch, fill in or edit content in existing documents, and apply template formatting with an XSD validation gate check. | Official |
+| `vision-analysis` | Analyze, describe, and extract information from images using vision AI models. Supports description, OCR text recognition, UI review, chart data extraction, and object detection. Based on the MiniMax VL API, with OpenAI GPT-4V as a fallback. | Community |
 | `minimax-multimodal-toolkit` | Generate voice, music, video, and image content via MiniMax APIs, the unified entry point for MiniMax multimodal use cases. Covers TTS (text-to-speech, voice cloning, voice design, multi-segment synthesis), music (songs with lyrics, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and FFmpeg-based media processing (format conversion, concatenation, trimming, extraction). | Official |
 ## Installation
@@ -40,6 +41,7 @@ git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
 ```
 In Cursor settings, point the skills path to `~/.cursor/minimax-skills/skills/`.
+For Windows installation and verification, see [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ### Codex
@@ -61,7 +63,18 @@ mkdir -p ~/.config/opencode/skills
 ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
 ```
-Restart OpenCode to discover the skills. See [`.opencode/INSTALL.md`](.opencode/INSTALL.md) for details.
+Restart OpenCode to discover the skills. See [`.opencode/INSTALL_zh.md`](.opencode/INSTALL_zh.md) for details.
+### VS Code
+This repository does not yet provide a standalone VS Code extension.
+If you use VS Code, the recommended approach is to run a supported CLI tool in the integrated terminal:
+- Codex
+- Claude Code
+- OpenCode
+If you want to use this repository's local skills configuration directly, use Cursor and follow [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ## Contributing


@@ -44,7 +44,7 @@ image_to_data_url() {
 local mime
 mime="$(file -b --mime-type "$path" 2>/dev/null)" || mime="image/jpeg"
 local b64
-b64="$(base64 < "$path")"
+b64="$(base64 -w 0 < "$path")"
 echo "data:${mime};base64,${b64}"
 }
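
The `-w 0` flag in this hunk is a GNU coreutils option, and other `base64` implementations may not accept it. A minimal sketch of a fallback that strips line breaks after encoding instead; `to_data_url` is a hypothetical name, and the MIME type is passed in rather than detected with `file`.

```bash
# Hypothetical variant of image_to_data_url: encode file $1 as a single-line
# data URL. $2 is the MIME type (default image/jpeg, the same fallback the
# script uses when detection fails).
to_data_url() {
  local path="$1" mime="${2:-image/jpeg}" b64
  # tr removes any line wrapping, so no implementation-specific flag is needed
  b64="$(base64 < "$path" | tr -d '\r\n')"
  printf 'data:%s;base64,%s\n' "$mime" "$b64"
}
```

Because the output never contains an embedded newline, downstream steps that read the data URL line by line see the whole value.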
@@ -57,6 +57,78 @@ resolve_image() {
 esac
 }
# ============================================================================
# Payload builder — avoids command-line length limits on Windows
# Uses temp files for jq when the payload may contain large base64 data.
# ============================================================================
# Build JSON payload, writing large fields (base64 image data) to temp files
# to avoid Windows cmd.exe argument-length limits (~32KB).
build_payload() {
local model="$1" prompt="$2" response_format="$3" n="$4"
local prompt_optimizer="$5" aigc_watermark="$6"
local aspect_ratio="$7" width="$8" height="$9" seed="${10:-}"
local ref_image="${11:-}"
# Start with base payload using temp file to avoid long command lines
local base_tmp
base_tmp="$(mktemp)"
trap "rm -f '$base_tmp'" EXIT INT TERM HUP
jq -n \
--arg model "$model" \
--arg prompt "$prompt" \
--arg rf "$response_format" \
--argjson n "$n" \
--argjson po "$prompt_optimizer" \
--argjson aw "$aigc_watermark" \
'{model: $model, prompt: $prompt, response_format: $rf, n: $n, prompt_optimizer: $po, aigc_watermark: $aw}' \
> "$base_tmp"
# Add optional fields, each via temp file to stay within Windows arg limits
if [[ -n "$aspect_ratio" ]]; then
local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
jq --arg ar "$aspect_ratio" '. + {aspect_ratio: $ar}' "$base_tmp" > "$tmp2"
mv "$tmp2" "$base_tmp"
fi
if [[ -n "$width" ]]; then
local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
jq --argjson w "$width" '. + {width: $w}' "$base_tmp" > "$tmp2"
mv "$tmp2" "$base_tmp"
fi
if [[ -n "$height" ]]; then
local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
jq --argjson h "$height" '. + {height: $h}' "$base_tmp" > "$tmp2"
mv "$tmp2" "$base_tmp"
fi
if [[ -n "$seed" ]]; then
local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
jq --argjson s "$seed" '. + {seed: $s}' "$base_tmp" > "$tmp2"
mv "$tmp2" "$base_tmp"
fi
# Subject reference (i2i mode) — build via temp file to avoid huge command-line args
if [[ -n "$ref_image" ]]; then
local img_url
img_url="$(resolve_image "$ref_image")"
# Create temp files and set traps separately to avoid set -u issues
local ref_tmp; ref_tmp="$(mktemp)"
trap "rm -f '$base_tmp' '$ref_tmp'" EXIT INT TERM HUP
local url_tmp; url_tmp="$(mktemp)"; trap "rm -f '$base_tmp' '$ref_tmp' '$url_tmp'" EXIT INT TERM HUP
# Write URL to temp file to avoid long-argument issues, then build JSON
echo -n "$img_url" > "$url_tmp"
# Use jq -Rs to read the whole file as one raw string, then keep only the text before the first newline
jq -Rs 'split("\n")[0] | {type: "character", image_file: .}' "$url_tmp" > "$ref_tmp"
local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$ref_tmp' '$url_tmp' '$tmp2'" EXIT INT TERM HUP
jq --slurpfile ref "$ref_tmp" '. + {subject_reference: $ref}' "$base_tmp" > "$tmp2"
mv "$tmp2" "$base_tmp"
fi
cat "$base_tmp"
rm -f "$base_tmp"
trap - EXIT INT TERM HUP
}
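
Note that `split("\n")[0]` keeps only the text before the first newline, so it relies on the data URL already being a single line. If the input could ever contain wrapped base64, joining the pieces preserves all of it. The sketch below shows that alternative using only `jq` built-ins; `ref_object_from_file` is a hypothetical helper, not the script's code.

```bash
# Hypothetical helper: read a data URL from file $1 and emit the
# subject_reference entry as compact JSON, dropping embedded line breaks
# instead of truncating at the first one.
ref_object_from_file() {
  jq -Rs -c 'split("\n") | join("") | {type: "character", image_file: .}' "$1"
}
```

Since `--slurpfile` binds the parsed contents of the file as an array, passing this output through `{subject_reference: $ref}` as the function above does yields a one-element array.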
 # ============================================================================
 # Main
 # ============================================================================
@@ -144,31 +216,13 @@ USAGE
 echo "Error: -n must be between 1 and 9" >&2; exit 1
 fi
-# Build payload
+# Build payload using temp-file method (avoids Windows cmd.exe arg-length limit)
 local payload
-payload=$(jq -n \
-    --arg model "$model" \
-    --arg prompt "$prompt" \
-    --arg rf "$response_format" \
-    --argjson n "$n" \
-    --argjson po "$prompt_optimizer" \
-    --argjson aw "$aigc_watermark" \
-    '{model: $model, prompt: $prompt, response_format: $rf, n: $n, prompt_optimizer: $po, aigc_watermark: $aw}')
-[[ -n "$aspect_ratio" ]] && payload=$(echo "$payload" | jq --arg ar "$aspect_ratio" '. + {aspect_ratio: $ar}')
-[[ -n "$width" ]] && payload=$(echo "$payload" | jq --argjson w "$width" '. + {width: $w}')
-[[ -n "$height" ]] && payload=$(echo "$payload" | jq --argjson h "$height" '. + {height: $h}')
-[[ -n "$seed" ]] && payload=$(echo "$payload" | jq --argjson s "$seed" '. + {seed: $s}')
-# Subject reference (i2i mode)
-if [[ "$mode" == "i2i" ]]; then
-if [[ -z "$ref_image" ]]; then
-echo "Error: --ref-image is required for i2i mode" >&2; exit 1
-fi
-local img_url
-img_url="$(resolve_image "$ref_image")"
-payload=$(echo "$payload" | jq --arg img "$img_url" '. + {subject_reference: [{type: "character", image_file: $img}]}')
-fi
+payload=$(build_payload \
+    "$model" "$prompt" "$response_format" "$n" \
+    "$prompt_optimizer" "$aigc_watermark" \
+    "$aspect_ratio" "$width" "$height" "$seed" \
+    "$ref_image")
 local api_host="${MINIMAX_API_HOST:-https://api.minimaxi.com}"
 local api_url="${api_host}/v1/image_generation"
@@ -177,13 +231,18 @@ USAGE
 echo "Model: $model"
 echo "Generating $n image(s)..."
+# Write payload to temp file to avoid command-line length limits
+local payload_tmp; payload_tmp="$(mktemp)"
+trap "rm -f '$payload_tmp'" EXIT INT TERM HUP
+echo -n "$payload" > "$payload_tmp"
 local raw_output http_code response
 raw_output="$(curl -s -w "\n%{http_code}" \
     -X POST "$api_url" \
     -H "Authorization: Bearer ${MINIMAX_API_KEY}" \
     -H "Content-Type: application/json" \
     --max-time 120 \
-    -d "$payload" 2>/dev/null)" || {
+    -d "@$payload_tmp" 2>/dev/null)" || {
 echo "Error: curl request failed" >&2
 exit 1
 }
@@ -203,6 +262,7 @@ USAGE
 local status_msg
 status_msg="$(echo "$response" | jq -r '.base_resp.status_msg // "Unknown error"')"
 echo "Error: API error (code $status_code): $status_msg" >&2
+echo "Full response: $response" >&2
 exit 1
 fi
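
The script in this diff re-registers its `trap` every time another temp file is created. One alternative, sketched here as an assumption rather than the project's approach, is to create a single scratch directory and remove it with one trap. `with_workdir` and the `WORKDIR` variable are hypothetical names.

```bash
# Hypothetical sketch: run a command with a private scratch directory that is
# removed when the subshell exits, whether the command succeeds or fails.
# The command finds the directory path in the WORKDIR environment variable.
with_workdir() {
  (
    workdir="$(mktemp -d)" || exit 1
    trap 'rm -rf "$workdir"' EXIT
    WORKDIR="$workdir" "$@"
  )
}
```

A caller would write every intermediate file under `$WORKDIR`, so there is only one path to clean up and the trap never needs to be rewritten.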


@@ -0,0 +1,174 @@
---
name: vision-analysis
description: >
Analyze, describe, and extract information from images using the MiniMax vision MCP tool.
Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png,
.gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image:
"analyze", "analyse", "describe", "explain", "understand", "look at", "review",
"extract text", "OCR", "what is in", "what's in", "read this image", "see this image",
"tell me about", "explain this", "interpret this", in connection with an image, screenshot,
diagram, chart, mockup, wireframe, or photo.
Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction
from charts, object detection, person/animal/activity identification.
Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg),
or any request to analyze/describe/understand/review/extract text from an image, screenshot,
diagram, chart, photo, mockup, or wireframe.
license: MIT
metadata:
version: "1.0"
category: ai-vision
sources:
- MiniMax Token Plan MCP (understand_image tool)
---
# Vision Analysis
Analyze images using the MiniMax `MiniMax_understand_image` MCP tool available in the MiniMax Token Plan.
## Prerequisites
- MiniMax Token Plan subscription with valid `MINIMAX_API_KEY`
- MiniMax MCP configured (`MiniMax_understand_image` tool available)
### If MCP tool is not configured
**Step 1:** The agent should fetch setup instructions from:
**https://platform.minimaxi.com/docs/token-plan/mcp-guide**
**Step 2:** Detect the user's environment (OpenCode, Cursor, Claude Code, etc.) and output the exact commands needed. Common examples:
**OpenCode** — add to `~/.config/opencode/opencode.json` or `package.json`:
```json
{
  "mcp": {
    "MiniMax": {
      "type": "local",
      "command": ["uvx", "minimax-coding-plan-mcp", "-y"],
      "environment": {
        "MINIMAX_API_KEY": "YOUR_TOKEN_PLAN_KEY",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      },
      "enabled": true
    }
  }
}
```
**Claude Code**:
```bash
claude mcp add -s user MiniMax --env MINIMAX_API_KEY=your-key --env MINIMAX_API_HOST=https://api.minimaxi.com -- uvx minimax-coding-plan-mcp -y
```
**Cursor** — add to MCP settings:
```json
{
  "mcpServers": {
    "MiniMax": {
      "command": "uvx",
      "args": ["minimax-coding-plan-mcp"],
      "env": {
        "MINIMAX_API_KEY": "your-key",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      }
    }
  }
}
```
**Step 3:** After configuration, tell the user to restart their app and verify with `/mcp`.
**Important:** If the user does not have a MiniMax Token Plan subscription, inform them that the `understand_image` tool requires one; it cannot be used with free-tier API keys or keys from other plans.
## Analysis Modes
| Mode | When to use | Prompt strategy |
|---|---|---|
| `describe` | General image understanding | Ask for detailed description |
| `ocr` | Text extraction from screenshots, documents | Ask to extract all text verbatim |
| `ui-review` | UI mockups, wireframes, design files | Ask for design critique with suggestions |
| `chart-data` | Charts, graphs, data visualizations | Ask to extract data points and trends |
| `object-detect` | Identify objects, people, activities | Ask to list and locate all elements |
## Workflow
### Step 1: Auto-detect image
The skill triggers automatically when a message contains an image file path or URL with extensions:
`.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`, `.svg`
Extract the image path from the message.
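One way to pull candidate paths or URLs out of a message is a case-insensitive match on the extensions listed above. This is only a sketch: `extract_image_refs` is a hypothetical helper, and it assumes paths contain no spaces.

```bash
# Hypothetical helper: print each whitespace-delimited token of message $1
# that ends in one of the listed image extensions. Surrounding quotes,
# brackets, and trailing punctuation are stripped first.
# Returns non-zero when the message contains no match.
extract_image_refs() {
  printf '%s\n' "$1" \
    | tr -s '[:space:]' '\n' \
    | sed -E "s/^[(\"'<]+//; s/[[:punct:]]+\$//" \
    | grep -iE '\.(jpe?g|png|gif|webp|bmp|svg)$'
}
```

The non-zero return on no match lets a caller fall through to asking the user for an image path.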
### Step 2: Select analysis mode and call MCP tool
Use the `MiniMax_understand_image` tool with a mode-specific prompt:
**describe:**
```
Provide a detailed description of this image. Include: main subject, setting/background,
colors/style, any text visible, notable objects, and overall composition.
```
**ocr:**
```
Extract all text visible in this image verbatim. Preserve structure and formatting
(headers, lists, columns). If no text is found, say so.
```
**ui-review:**
```
You are a UI/UX design reviewer. Analyze this interface mockup or design. Provide:
(1) Strengths — what works well, (2) Issues — usability or design problems,
(3) Specific, actionable suggestions for improvement. Be constructive and detailed.
```
**chart-data:**
```
Extract all data from this chart or graph. List: chart title, axis labels, all
data points/series with values if readable, and a brief summary of the trend.
```
**object-detect:**
```
List all distinct objects, people, and activities you can identify. For each,
describe what it is and its approximate location in the image.
```
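The mode-to-prompt mapping above can be kept in a single lookup. A minimal sketch with a hypothetical `prompt_for_mode` function: each prompt is abbreviated to its opening sentence or two, and falling back to `describe` for an unknown mode is an assumption, not documented behavior.

```bash
# Hypothetical helper: print the (abbreviated) prompt for analysis mode $1.
# Unknown modes fall back to the describe prompt.
prompt_for_mode() {
  case "$1" in
    ocr)           echo "Extract all text visible in this image verbatim." ;;
    ui-review)     echo "You are a UI/UX design reviewer. Analyze this interface mockup or design." ;;
    chart-data)    echo "Extract all data from this chart or graph." ;;
    object-detect) echo "List all distinct objects, people, and activities you can identify." ;;
    describe|*)    echo "Provide a detailed description of this image." ;;
  esac
}
```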
### Step 3: Present results
Return the analysis clearly. For `describe`, use readable prose. For `ocr`, preserve structure. For `ui-review`, use a structured critique format.
## Output Format Example
For describe mode:
```
## Image Description
[Detailed description of the image contents...]
```
For ocr mode:
```
## Extracted Text
[Preserved text structure from the image]
```
For ui-review mode:
```
## UI Design Review
### Strengths
- ...
### Issues
- ...
### Suggestions
- ...
```
## Notes
- Images up to 20MB supported (JPEG, PNG, GIF, WebP)
- Local file paths work if MiniMax MCP is configured with file access
- The `MiniMax_understand_image` tool is provided by the `minimax-coding-plan-mcp` package
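
The limits in these notes can be checked before calling the tool. A sketch, assuming that 20MB means 20 * 1024 * 1024 bytes and that the format can be judged from the file extension alone; `check_image` is a hypothetical helper.

```bash
# Hypothetical pre-flight check for a local image at path $1.
# Succeeds only for an existing file with a listed extension and a size
# within the assumed 20 MiB limit; otherwise prints a reason to stderr.
check_image() {
  local path="$1" max_bytes=$((20 * 1024 * 1024)) size ext
  [ -f "$path" ] || { echo "not a file: $path" >&2; return 1; }
  ext="$(printf '%s' "${path##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    jpg|jpeg|png|gif|webp) ;;
    *) echo "unsupported format: .$ext" >&2; return 1 ;;
  esac
  size=$(( $(wc -c < "$path") ))
  [ "$size" -le "$max_bytes" ] || { echo "file too large: $size bytes" >&2; return 1; }
}
```

With this check, `.bmp` and `.svg` files are rejected even though they appear in the trigger list, because the notes name only JPEG, PNG, GIF, and WebP as supported.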