Merge pull request #52 from fjqz177/main

Thanks for PR
Merge pull request #54 from smallmj/main
7 changed files with 473 additions and 32 deletions
--- a/.cursor-plugin/INSTALL.md
+++ b/.cursor-plugin/INSTALL.md
@@ -0,0 +1,93 @@
 # Installing MiniMax Skills for Cursor
 Enable MiniMax skills in Cursor by cloning the repository locally and pointing Cursor's skills path at the `skills/` directory.
 ## Prerequisites
 - Cursor installed
 - Git
 ## Installation
 ### macOS / Linux
 ```bash
 git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
 ```
 Set Cursor's skills path to:
 ```text
 ~/.cursor/minimax-skills/skills/
 ```
 ### Windows (PowerShell)
 ```powershell
 git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.cursor\minimax-skills"
 ```
 Set Cursor's skills path to:
 ```text
 C:\Users\YOUR_USERNAME\.cursor\minimax-skills\skills\
 ```
 Replace `YOUR_USERNAME` with your Windows account name.
 After saving the path, restart Cursor or reload the window so it rescans local skills.
 ## Verify
 Confirm the clone exists and contains `SKILL.md` files:
 ### macOS / Linux
 ```bash
 find ~/.cursor/minimax-skills/skills -maxdepth 2 -name SKILL.md | head
 ```
 ### Windows (PowerShell)
 ```powershell
 Get-ChildItem "$env:USERPROFILE\.cursor\minimax-skills\skills" -Directory | ForEach-Object {
    Get-ChildItem $_.FullName -Filter SKILL.md
 }
 ```
 ## Updating
 ### macOS / Linux
 ```bash
 cd ~/.cursor/minimax-skills && git pull
 ```
 ### Windows (PowerShell)
 ```powershell
 Set-Location "$env:USERPROFILE\.cursor\minimax-skills"
 git pull
 ```
 ## Uninstalling
 ### macOS / Linux
 ```bash
 rm -rf ~/.cursor/minimax-skills
 ```
 ### Windows (PowerShell)
 ```powershell
 Remove-Item -Recurse -Force "$env:USERPROFILE\.cursor\minimax-skills"
 ```
 ## VS Code Note
 This repository does not currently ship a standalone VS Code extension.
 If you use VS Code, the recommended options are:
 - run a supported CLI tool such as Codex, Claude Code, or OpenCode inside the VS Code integrated terminal
 - use Cursor if you want native local-skills configuration from this repository
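The Verify step in the guide above checks that each skill directory contains a `SKILL.md` file. A minimal sketch of the same check as a reusable function follows. The name `list_skills` is hypothetical and not part of the repository, and the demo runs against a throwaway directory rather than the real clone.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): print the name of every
# directory directly under ROOT that contains a SKILL.md file.
list_skills() {
  local root="$1" manifest
  for manifest in "$root"/*/SKILL.md; do
    # When nothing matches, the glob stays literal, so skip non-files.
    [ -f "$manifest" ] || continue
    basename "$(dirname "$manifest")"
  done
}

# Demo against a temporary layout that mimics skills/<name>/SKILL.md.
demo_root="$(mktemp -d)"
mkdir -p "$demo_root/frontend-dev" "$demo_root/shader-dev" "$demo_root/empty-dir"
touch "$demo_root/frontend-dev/SKILL.md" "$demo_root/shader-dev/SKILL.md"
list_skills "$demo_root"
rm -rf "$demo_root"
```

Against a real install, the call would be `list_skills ~/.cursor/minimax-skills/skills`, assuming the clone location used in the guide.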
--- a/.opencode/INSTALL.md
+++ b/.opencode/INSTALL.md
@@ -12,7 +12,10 @@
 git clone https://github.com/MiniMax-AI/skills.git ~/.minimax-skills
 mkdir -p ~/.config/opencode/skills
-ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
+for skill in ~/.minimax-skills/skills/*/; do
+  skill_name=$(basename "$skill")
+  ln -s "$skill" ~/.config/opencode/skills/minimax-"$skill_name"
+done
 ```
 ### Windows (PowerShell)
@@ -22,7 +25,7 @@ git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.minimax-sk
 New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
 Get-ChildItem "$env:USERPROFILE\.minimax-skills\skills" -Directory | ForEach-Object {
-    New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\$($_.Name)" -Target $_.FullName
+    New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\minimax-$($_.Name)" -Target $_.FullName
 }
 ```
@@ -58,14 +61,14 @@ Symlinks will automatically point to the updated content — no need to re-link.
 ### macOS / Linux
 ```bash
-rm -rf ~/.config/opencode/skills
+rm -f ~/.config/opencode/skills/minimax-*
 rm -rf ~/.minimax-skills
 ```
 ### Windows (PowerShell)
 ```powershell
-Remove-Item -Recurse -Force "$env:USERPROFILE\.config\opencode\skills"
+Get-ChildItem "$env:USERPROFILE\.config\opencode\skills\minimax-*" | Remove-Item -Force
 Remove-Item -Recurse -Force "$env:USERPROFILE\.minimax-skills"
 ```
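The change above installs each skill as a `minimax-`prefixed symlink so that uninstalling can remove only those links. A sandboxed sketch of that behavior follows. The function names `link_skills` and `unlink_skills` are hypothetical, and `ln -sfn` is swapped in for the plain `ln -s` of the guide so that re-running the install does not fail on existing links.

```bash
#!/usr/bin/env bash
# Hypothetical helpers (not from the repository). SRC and DEST are assumed
# to be absolute paths so the symlink targets resolve from anywhere.

# link_skills SRC DEST: create DEST/minimax-<name> for each directory in SRC.
link_skills() {
  local src="$1" dest="$2" skill name
  mkdir -p "$dest"
  for skill in "$src"/*/; do
    [ -d "$skill" ] || continue
    name="$(basename "$skill")"
    # -f replaces an existing link; -n stops ln from following an existing
    # link into the directory it points at.
    ln -sfn "${skill%/}" "$dest/minimax-$name"
  done
}

# unlink_skills DEST: remove only minimax-* symlinks, leaving everything else.
unlink_skills() {
  local dest="$1" link
  for link in "$dest"/minimax-*; do
    if [ -L "$link" ]; then
      rm -f "$link"
    fi
  done
  return 0
}

# Demo in a temporary sandbox with one unrelated skill already present.
sandbox="$(mktemp -d)"
mkdir -p "$sandbox/src/frontend-dev" "$sandbox/src/shader-dev" "$sandbox/dest/other-skill"
link_skills "$sandbox/src" "$sandbox/dest"
ls "$sandbox/dest"
unlink_skills "$sandbox/dest"
ls "$sandbox/dest"
rm -rf "$sandbox"
```

The second listing should show that `other-skill` survives the uninstall, which is the point of the prefix.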
--- a/.opencode/INSTALL_zh.md
+++ b/.opencode/INSTALL_zh.md
@@ -0,0 +1,85 @@
 # Installing MiniMax Skills for OpenCode
 ## Prerequisites
 - [OpenCode.ai](https://opencode.ai) installed
 ## Installation
 ### macOS / Linux
 ```bash
 git clone https://github.com/MiniMax-AI/skills.git ~/.minimax-skills
 mkdir -p ~/.config/opencode/skills
 for skill in ~/.minimax-skills/skills/*/; do
    skill_name=$(basename "$skill")
    ln -s "$skill" ~/.config/opencode/skills/minimax-"$skill_name"
 done
 ```
 ### Windows (PowerShell)
 ```powershell
 git clone https://github.com/MiniMax-AI/skills.git "$env:USERPROFILE\.minimax-skills"
 New-Item -ItemType Directory -Force -Path "$env:USERPROFILE\.config\opencode\skills"
 Get-ChildItem "$env:USERPROFILE\.minimax-skills\skills" -Directory | ForEach-Object {
    New-Item -ItemType SymbolicLink -Path "$env:USERPROFILE\.config\opencode\skills\minimax-$($_.Name)" -Target $_.FullName
 }
 ```
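 > **Note:** Creating symbolic links on Windows may require administrator privileges or Developer Mode to be enabled.
 Restart OpenCode to discover the skills.
 To verify, ask: "List available skills"
 ## Available Skills
 - **frontend-dev**: Frontend development, including UI design, animation, and AI-generated media assets
 - **fullstack-dev**: Full-stack backend architecture and frontend-backend integration
 - **android-native-dev**: Native Android app development with Material Design 3
 - **ios-application-dev**: iOS app development with UIKit, SnapKit, and SwiftUI
 - **shader-dev**: GLSL shader techniques for creating stunning visual effects (ShaderToy-compatible)
 - **gif-sticker-maker**: Convert photos into animated GIF stickers (Funko Pop / Pop Mart style)
 - **minimax-pdf**: Generate, fill, and reformat PDF documents using a token-based design system
 - **pptx-generator**: Generate, edit, and read PowerPoint presentations
 - **minimax-xlsx**: Open, create, read, analyze, edit, or validate Excel/spreadsheet files
 - **minimax-docx**: Professionally create, edit, and format Word documents using the OpenXML SDK
 ## Updating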
 ```bash
 cd ~/.minimax-skills && git pull
 ```
 Symlinks will automatically point to the updated content; there is no need to re-link.
 ## Uninstalling
 ### macOS / Linux
 ```bash
 rm -f ~/.config/opencode/skills/minimax-*
 rm -rf ~/.minimax-skills
 ```
 ### Windows (PowerShell)
 ```powershell
 Get-ChildItem "$env:USERPROFILE\.config\opencode\skills\minimax-*" | Remove-Item -Force
 Remove-Item -Recurse -Force "$env:USERPROFILE\.minimax-skills"
 ```
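 ## Troubleshooting
 ### Skills not found
 1. Verify that the symlinks exist: `ls -la ~/.config/opencode/skills/`
 2. Each skill folder should contain a `SKILL.md` file
 3. Restart OpenCode after installation
 ## Getting Help
 - Report issues: https://github.com/MiniMax-AI/skills/issues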
--- a/README.md
+++ b/README.md
@@ -22,6 +22,7 @@ Development skills for AI coding agents. Plug into your favorite AI coding tool
 | `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. | Official |
 | `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Covers creating new xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation, validation, and professional financial formatting. | Official |
 | `minimax-docx` | Professional DOCX document creation, editing, and formatting using OpenXML SDK (.NET). Three pipelines: create new documents from scratch, fill/edit content in existing documents, or apply template formatting with XSD validation gate-check. | Official |
 | `vision-analysis` | Analyze, describe, and extract information from images using vision AI models. Supports describe, OCR, UI mockup review, chart data extraction, and object detection. Powered by MiniMax VL API with OpenAI GPT-4V fallback. | Community |
 | `minimax-multimodal-toolkit` | Generate voice, music, video, and image content via MiniMax APIs — the unified entry for MiniMax multimodal use cases. Covers TTS (text-to-speech, voice cloning, voice design, multi-segment), music (songs, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and media processing (convert, concat, trim, extract) via FFmpeg. | Official |
 ## Installation
@@ -40,6 +41,7 @@ git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
 ```
 Add to your Cursor settings — point the skills path to `~/.cursor/minimax-skills/skills/`.
 For Windows setup and verification, see [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ### Codex
@@ -63,6 +65,17 @@ ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
 Restart OpenCode to discover the skills. See [`.opencode/INSTALL.md`](.opencode/INSTALL.md) for details.
 ### VS Code
 This repository does not currently ship a standalone VS Code extension.
 If you use VS Code, the recommended approach is to run one of the supported CLI tools inside the integrated terminal:
 - Codex
 - Claude Code
 - OpenCode
 If you want native local-skills configuration from this repo, use Cursor and follow [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ## Contributing
 We welcome contributions! Before submitting a PR, please read:
--- a/README_zh.md
+++ b/README_zh.md
@@ -22,6 +22,7 @@
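 | `pptx-generator` | Generate, edit, and read PowerPoint presentations. Create from scratch with PptxGenJS (cover, TOC, content, section divider, summary slides), edit existing PPTX via XML workflows, or extract text with markitdown. | Official |
 | `minimax-xlsx` | Open, create, read, analyze, edit, or validate Excel/spreadsheet files (.xlsx, .xlsm, .csv, .tsv). Covers creating new xlsx from scratch via XML templates, reading and analyzing with pandas, editing existing files with zero format loss, formula recalculation and validation, and professional financial formatting. | Official |
 | `minimax-docx` | Professional DOCX document creation, editing, and formatting using OpenXML SDK (.NET). Three pipelines: create new documents from scratch, fill/edit content in existing documents, or apply template formatting with an XSD validation gate-check. | Official |
 | `vision-analysis` | Analyze, describe, and extract information from images using vision AI models. Supports describe, OCR, UI review, chart data extraction, and object detection. Powered by MiniMax VL API with OpenAI GPT-4V as a fallback. | Community |
 | `minimax-multimodal-toolkit` | Generate voice, music, video, and image content via MiniMax APIs: the unified entry point for MiniMax multimodal use cases. Covers TTS (text-to-speech, voice cloning, voice design, multi-segment synthesis), music (songs with lyrics, instrumentals), video (text-to-video, image-to-video, start-end frame, subject reference, templates, long-form multi-scene), image (text-to-image, image-to-image with character reference), and FFmpeg-based media processing (convert, concat, trim, extract). | Official |
 ## Installation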
@@ -40,6 +41,7 @@ git clone https://github.com/MiniMax-AI/skills.git ~/.cursor/minimax-skills
 ```
 In your Cursor settings, point the skills path to `~/.cursor/minimax-skills/skills/`.
 For Windows installation and verification, see [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ### Codex
@@ -61,7 +63,18 @@ mkdir -p ~/.config/opencode/skills
 ln -s ~/.minimax-skills/skills/* ~/.config/opencode/skills/
 ```
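-Restart OpenCode to discover the skills. See [`.opencode/INSTALL.md`](.opencode/INSTALL.md) for details.
+Restart OpenCode to discover the skills. See [`.opencode/INSTALL_zh.md`](.opencode/INSTALL_zh.md) for details.
 ### VS Code
 This repository does not yet provide a standalone VS Code extension.
 If you use VS Code, the recommended approach is to run one of the supported CLI tools in the integrated terminal:
 - Codex
 - Claude Code
 - OpenCode
 If you want to use this repository's local skills configuration directly, use Cursor and follow [`.cursor-plugin/INSTALL.md`](.cursor-plugin/INSTALL.md).
 ## Contributing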
--- a/skills/minimax-multimodal-toolkit/scripts/image/generate_image.sh
+++ b/skills/minimax-multimodal-toolkit/scripts/image/generate_image.sh
@@ -44,7 +44,7 @@ image_to_data_url() {
  local mime
  mime="$(file -b --mime-type "$path" 2>/dev/null)" || mime="image/jpeg"
  local b64
-  b64="$(base64 < "$path")"
+  b64="$(base64 -w 0 < "$path")"
  echo "data:${mime};base64,${b64}"
 }
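The hunk above switches to `base64 -w 0` so the data URL contains no line breaks. The `-w` flag is a GNU coreutils option and may not be accepted by every `base64` implementation, so a sketch of a variant that strips newlines with `tr` is shown here. The function name `encode_single_line` is hypothetical and this is an alternative technique, not what the script does.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): base64-encode FILE as one
# unbroken line. Deleting newlines with tr gives the same text as GNU
# `base64 -w 0` without relying on the -w flag.
encode_single_line() {
  base64 < "$1" | tr -d '\n'
}

# Demo: 300 input bytes are enough for base64 to wrap its output by default
# on implementations that wrap at 76 columns.
sample="$(mktemp)"
head -c 300 /dev/zero > "$sample"
encoded="$(encode_single_line "$sample")"
printf '%s\n' "${#encoded}"   # prints 400 (300 bytes / 3 * 4 characters)
rm -f "$sample"
```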
@@ -57,6 +57,78 @@ resolve_image() {
  esac
 }
 # ============================================================================
 # Payload builder — avoids command-line length limits on Windows
 # Uses temp files for jq when the payload may contain large base64 data.
 # ============================================================================
 # Build JSON payload, writing large fields (base64 image data) to temp files
 # to avoid Windows cmd.exe argument-length limits (~32KB).
 build_payload() {
  local model="$1" prompt="$2" response_format="$3" n="$4"
  local prompt_optimizer="$5" aigc_watermark="$6"
  local aspect_ratio="$7" width="$8" height="$9" seed="${10:-}"
  local ref_image="${11:-}"
  # Start with base payload using temp file to avoid long command lines
  local base_tmp
  base_tmp="$(mktemp)"
  trap "rm -f '$base_tmp'" EXIT INT TERM HUP
  jq -n \
    --arg model "$model" \
    --arg prompt "$prompt" \
    --arg rf "$response_format" \
    --argjson n "$n" \
    --argjson po "$prompt_optimizer" \
    --argjson aw "$aigc_watermark" \
    '{model: $model, prompt: $prompt, response_format: $rf, n: $n, prompt_optimizer: $po, aigc_watermark: $aw}' \
    > "$base_tmp"
  # Add optional fields, each via temp file to stay within Windows arg limits
  if [[ -n "$aspect_ratio" ]]; then
    local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
    jq --arg ar "$aspect_ratio" '. + {aspect_ratio: $ar}' "$base_tmp" > "$tmp2"
    mv "$tmp2" "$base_tmp"
  fi
  if [[ -n "$width" ]]; then
    local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
    jq --argjson w "$width" '. + {width: $w}' "$base_tmp" > "$tmp2"
    mv "$tmp2" "$base_tmp"
  fi
  if [[ -n "$height" ]]; then
    local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
    jq --argjson h "$height" '. + {height: $h}' "$base_tmp" > "$tmp2"
    mv "$tmp2" "$base_tmp"
  fi
  if [[ -n "$seed" ]]; then
    local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$tmp2'" EXIT INT TERM HUP
    jq --argjson s "$seed" '. + {seed: $s}' "$base_tmp" > "$tmp2"
    mv "$tmp2" "$base_tmp"
  fi
  # Subject reference (i2i mode) — build via temp file to avoid huge command-line args
  if [[ -n "$ref_image" ]]; then
    local img_url
    img_url="$(resolve_image "$ref_image")"
    # Create temp files and set traps separately to avoid set -u issues
    local ref_tmp; ref_tmp="$(mktemp)"
    trap "rm -f '$base_tmp' '$ref_tmp'" EXIT INT TERM HUP
    local url_tmp; url_tmp="$(mktemp)"; trap "rm -f '$base_tmp' '$ref_tmp' '$url_tmp'" EXIT INT TERM HUP
    # Write URL to temp file to avoid long-argument issues, then build JSON
    echo -n "$img_url" > "$url_tmp"
    # Read the file as one raw string (jq -Rs) and keep only the text before the first newline
    jq -Rs 'split("\n")[0] | {type: "character", image_file: .}' "$url_tmp" > "$ref_tmp"
    local tmp2; tmp2="$(mktemp)"; trap "rm -f '$base_tmp' '$ref_tmp' '$url_tmp' '$tmp2'" EXIT INT TERM HUP
    jq --slurpfile ref "$ref_tmp" '. + {subject_reference: $ref}' "$base_tmp" > "$tmp2"
    mv "$tmp2" "$base_tmp"
  fi
  cat "$base_tmp"
  rm -f "$base_tmp"
  trap - EXIT INT TERM HUP
 }
 # ============================================================================
 # Main
 # ============================================================================
@@ -144,31 +216,13 @@ USAGE
    echo "Error: -n must be between 1 and 9" >&2; exit 1
  fi
-  # Build payload
+  # Build payload using temp-file method (avoids Windows cmd.exe arg-length limit)
   local payload
-  payload=$(jq -n \
-    --arg model "$model" \
-    --arg prompt "$prompt" \
-    --arg rf "$response_format" \
-    --argjson n "$n" \
-    --argjson po "$prompt_optimizer" \
-    --argjson aw "$aigc_watermark" \
-    '{model: $model, prompt: $prompt, response_format: $rf, n: $n, prompt_optimizer: $po, aigc_watermark: $aw}')
-  [[ -n "$aspect_ratio" ]] && payload=$(echo "$payload" | jq --arg ar "$aspect_ratio" '. + {aspect_ratio: $ar}')
-  [[ -n "$width" ]] && payload=$(echo "$payload" | jq --argjson w "$width" '. + {width: $w}')
-  [[ -n "$height" ]] && payload=$(echo "$payload" | jq --argjson h "$height" '. + {height: $h}')
-  [[ -n "$seed" ]] && payload=$(echo "$payload" | jq --argjson s "$seed" '. + {seed: $s}')
-  # Subject reference (i2i mode)
-  if [[ "$mode" == "i2i" ]]; then
-    if [[ -z "$ref_image" ]]; then
-      echo "Error: --ref-image is required for i2i mode" >&2; exit 1
-    fi
-    local img_url
-    img_url="$(resolve_image "$ref_image")"
-    payload=$(echo "$payload" | jq --arg img "$img_url" '. + {subject_reference: [{type: "character", image_file: $img}]}')
-  fi
+  payload=$(build_payload \
+    "$model" "$prompt" "$response_format" "$n" \
+    "$prompt_optimizer" "$aigc_watermark" \
+    "$aspect_ratio" "$width" "$height" "$seed" \
+    "$ref_image")
  local api_host="${MINIMAX_API_HOST:-https://api.minimaxi.com}"
  local api_url="${api_host}/v1/image_generation"
@@ -177,13 +231,18 @@ USAGE
  echo "Model: $model"
  echo "Generating $n image(s)..."
  # Write payload to temp file to avoid command-line length limits
  local payload_tmp; payload_tmp="$(mktemp)"
  trap "rm -f '$payload_tmp'" EXIT INT TERM HUP
  echo -n "$payload" > "$payload_tmp"
  local raw_output http_code response
  raw_output="$(curl -s -w "\n%{http_code}" \
    -X POST "$api_url" \
    -H "Authorization: Bearer ${MINIMAX_API_KEY}" \
    -H "Content-Type: application/json" \
    --max-time 120 \
-    -d "$payload" 2>/dev/null)" || {
+    -d "@$payload_tmp" 2>/dev/null)" || {
    echo "Error: curl request failed" >&2
    exit 1
  }
@@ -203,6 +262,7 @@ USAGE
    local status_msg
    status_msg="$(echo "$response" | jq -r '.base_resp.status_msg // "Unknown error"')"
    echo "Error: API error (code $status_code): $status_msg" >&2
    echo "Full response: $response" >&2
    exit 1
  fi
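The script above re-registers its `trap` every time it creates another temp file, listing all paths again each time. One possible alternative, sketched here under the assumption that bash arrays are acceptable, is to register paths in a single array and install one trap. The names `TMP_FILES`, `new_tmp_file`, and `cleanup_tmp_files` are hypothetical and do not appear in the repository.

```bash
#!/usr/bin/env bash
# Hypothetical pattern (not from the repository): one array, one trap.
TMP_FILES=()

cleanup_tmp_files() {
  # Guard the expansion so an empty array is never passed to rm.
  if [ "${#TMP_FILES[@]}" -gt 0 ]; then
    rm -f "${TMP_FILES[@]}"
  fi
  TMP_FILES=()
}

# new_tmp_file VAR: create a temp file, register it, store its path in VAR.
# The path is returned through printf -v rather than $(...), because a
# command substitution runs in a subshell and would lose the array update.
new_tmp_file() {
  local path
  path="$(mktemp)"
  TMP_FILES+=("$path")
  printf -v "$1" '%s' "$path"
}

trap cleanup_tmp_files EXIT

# Demo: write a small JSON body to a registered temp file, then clean up.
new_tmp_file payload_file
printf '%s' '{"prompt":"example"}' > "$payload_file"
cat "$payload_file"
echo
cleanup_tmp_files
```

Because the script calls `build_payload` inside `$(...)`, a trap set in the parent shell would likely not fire in that subshell, so a version of this pattern would need the `trap` line inside the function body.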
--- a/skills/vision-analysis/SKILL.md
+++ b/skills/vision-analysis/SKILL.md
@@ -0,0 +1,174 @@
 ---
 name: vision-analysis
 description: >
  Analyze, describe, and extract information from images using the MiniMax vision MCP tool.
  Use when: user shares an image file path or URL (any message containing .jpg, .jpeg, .png,
  .gif, .webp, .bmp, or .svg file extension) or uses any of these words/phrases near an image:
  "analyze", "analyse", "describe", "explain", "understand", "look at", "review",
  "extract text", "OCR", "what is in", "what's in", "read this image", "see this image",
  "tell me about", "explain this", "interpret this", in connection with an image, screenshot,
  diagram, chart, mockup, wireframe, or photo.
  Also triggers for: UI mockup review, wireframe analysis, design critique, data extraction
  from charts, object detection, person/animal/activity identification.
  Triggers: any message with an image file extension (jpg, jpeg, png, gif, webp, bmp, svg),
  or any request to analyze/describe/understand/review/extract text from an image, screenshot,
  diagram, chart, photo, mockup, or wireframe.
 license: MIT
 metadata:
  version: "1.0"
  category: ai-vision
  sources:
    - MiniMax Token Plan MCP (understand_image tool)
 ---
 # Vision Analysis
 Analyze images using the MiniMax `MiniMax_understand_image` MCP tool available in the MiniMax Token Plan.
 ## Prerequisites
 - MiniMax Token Plan subscription with valid `MINIMAX_API_KEY`
 - MiniMax MCP configured (`MiniMax_understand_image` tool available)
 ### If MCP tool is not configured
 **Step 1:** The agent should fetch setup instructions from:
 **https://platform.minimaxi.com/docs/token-plan/mcp-guide**
 **Step 2:** Detect the user's environment (OpenCode, Cursor, Claude Code, etc.) and output the exact commands needed. Common examples:
 **OpenCode** — add to `~/.config/opencode/opencode.json` or `package.json`:
 ```json
 {
  "mcp": {
    "MiniMax": {
      "type": "local",
      "command": ["uvx", "minimax-coding-plan-mcp", "-y"],
      "environment": {
        "MINIMAX_API_KEY": "YOUR_TOKEN_PLAN_KEY",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      },
      "enabled": true
    }
  }
 }
 ```
 **Claude Code**:
 ```bash
 claude mcp add -s user MiniMax --env MINIMAX_API_KEY=your-key --env MINIMAX_API_HOST=https://api.minimaxi.com -- uvx minimax-coding-plan-mcp -y
 ```
 **Cursor** — add to MCP settings:
 ```json
 {
  "mcpServers": {
    "MiniMax": {
      "command": "uvx",
      "args": ["minimax-coding-plan-mcp"],
      "env": {
        "MINIMAX_API_KEY": "your-key",
        "MINIMAX_API_HOST": "https://api.minimaxi.com"
      }
    }
  }
 }
 ```
 **Step 3:** After configuration, tell the user to restart their app and verify with `/mcp`.
 **Important:** If the user does not have a MiniMax Token Plan subscription, inform them that the `understand_image` tool requires one — it cannot be used with free or other tier API keys.
 ## Analysis Modes
 | Mode | When to use | Prompt strategy |
 |---|---|---|
 | `describe` | General image understanding | Ask for detailed description |
 | `ocr` | Text extraction from screenshots, documents | Ask to extract all text verbatim |
 | `ui-review` | UI mockups, wireframes, design files | Ask for design critique with suggestions |
 | `chart-data` | Charts, graphs, data visualizations | Ask to extract data points and trends |
 | `object-detect` | Identify objects, people, activities | Ask to list and locate all elements |
 ## Workflow
 ### Step 1: Auto-detect image
 The skill triggers automatically when a message contains an image file path or URL with extensions:
 `.jpg`, `.jpeg`, `.png`, `.gif`, `.webp`, `.bmp`, `.svg`
 Extract the image path from the message.
 ### Step 2: Select analysis mode and call MCP tool
 Use the `MiniMax_understand_image` tool with a mode-specific prompt:
 **describe:**
 ```
 Provide a detailed description of this image. Include: main subject, setting/background,
 colors/style, any text visible, notable objects, and overall composition.
 ```
 **ocr:**
 ```
 Extract all text visible in this image verbatim. Preserve structure and formatting
 (headers, lists, columns). If no text is found, say so.
 ```
 **ui-review:**
 ```
 You are a UI/UX design reviewer. Analyze this interface mockup or design. Provide:
 (1) Strengths — what works well, (2) Issues — usability or design problems,
 (3) Specific, actionable suggestions for improvement. Be constructive and detailed.
 ```
 **chart-data:**
 ```
 Extract all data from this chart or graph. List: chart title, axis labels, all
 data points/series with values if readable, and a brief summary of the trend.
 ```
 **object-detect:**
 ```
 List all distinct objects, people, and activities you can identify. For each,
 describe what it is and its approximate location in the image.
 ```
 ### Step 3: Present results
 Return the analysis clearly. For `describe`, use readable prose. For `ocr`, preserve structure. For `ui-review`, use a structured critique format.
 ## Output Format Example
 For describe mode:
 ```
 ## Image Description
 [Detailed description of the image contents...]
 ```
 For ocr mode:
 ```
 ## Extracted Text
 [Preserved text structure from the image]
 ```
 For ui-review mode:
 ```
 ## UI Design Review
 ### Strengths
 - ...
 ### Issues
 - ...
 ### Suggestions
 - ...
 ```
 ## Notes
 - Images up to 20MB supported (JPEG, PNG, GIF, WebP)
 - Local file paths work if MiniMax MCP is configured with file access
 - The `MiniMax_understand_image` tool is provided by the `minimax-coding-plan-mcp` package
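Step 1 of the workflow above says the image path is extracted from a message by file extension. A rough sketch of what such a check could look like in bash follows. The function `find_image_ref` is hypothetical: the skill itself is a prompt document and does not ship this code. The sketch only inspects the first line of the message and ignores URLs with query strings.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not from the repository): print the first
# whitespace-separated token on the first line of MESSAGE that ends in one
# of the extensions listed in the skill; return 1 if none is found.
find_image_ref() {
  local message="$1" word lower
  local -a words
  # read -a splits on whitespace without performing glob expansion.
  read -r -a words <<< "$message"
  for word in "${words[@]}"; do
    # Drop one trailing punctuation mark, such as the comma in "chart.png,".
    word="${word%[,.;:!?]}"
    lower="$(printf '%s' "$word" | tr '[:upper:]' '[:lower:]')"
    case "$lower" in
      *.jpg|*.jpeg|*.png|*.gif|*.webp|*.bmp|*.svg)
        printf '%s\n' "$word"
        return 0
        ;;
    esac
  done
  return 1
}

find_image_ref "Please describe ./screens/Login.PNG, thanks"   # prints ./screens/Login.PNG
```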
Author	SHA1	Message	Date
MiniMax	1391b63464	Merge pull request #52 from fjqz177/main Thanks for PR	2026-04-01 14:42:16 +08:00
MiniMax	b80cce48bd	Merge pull request #54 from smallmj/main fix: i2i mode jq command to handle base64 with embedded newlines	2026-04-01 14:36:46 +08:00
smallmj	9643620f1a	fix: i2i mode jq command to handle base64 with embedded newlines - Use 'jq -Rs split' instead of 'jq -R' to properly handle base64 data URIs that may contain embedded newlines in the data - Fix double-nesting of subject_reference array causing API error 2013 - Also fix: base64 -w 0 to avoid line wrapping in base64 output - Write payload to temp file for curl to avoid shell arg length limits - Add full response output on API error for easier debugging Fixes #53	2026-04-01 10:27:53 +08:00
fjqz177	f02366cd0f	fix(install): create properly prefixed symlinks for independent skill management Allow users to identify and uninstall minimax skills without affecting other skills in the same directory. - Create individual minimax-{skill_name} symlinks instead of single broken link - Uninstall now removes only minimax-* symlinks, preserving other skills - Fix Windows PowerShell naming to match - Add Chinese installation guide (low-priority)	2026-03-31 21:00:33 +08:00
zest0198	cf44f7b122	Merge pull request #42 from Ethereal49/patch-1 update: remove redundant characters in README_zh.md	2026-03-30 11:59:25 +08:00
Ethereal49	0e006d124b	Update README: remove redundant characters removed redundant characters	2026-03-29 15:54:12 +08:00
liyuan97	f87b423670	Merge pull request #21 from divitkashyap/feat/vision-analysis feat(vision-analysis): add image analysis skill with OCR, UI review, and chart extraction	2026-03-27 20:47:30 +08:00
liyuan97	37046a3edb	Merge pull request #25 from JithendraNara/codex/docs-install-guidance docs: add Cursor install guide and VS Code note	2026-03-27 20:46:42 +08:00
liyuan97	ee1e834f6c	Merge pull request #34 from dewu0224/fix/windows-i2i-payload fix(image): use temp files for jq to avoid Windows command-line length limit	2026-03-27 20:44:41 +08:00
liyuan97	551541a974	Merge pull request #39 from MiniMax-AI/feat/minimax-plan-limits-768p-only feat(minimax): add plan limits & fix video constraints to 768P/6s only	2026-03-27 20:43:57 +08:00
Claude	b4e6c16f4b	fix(image): use temp files for jq to avoid Windows command-line length limit When running in i2i (image-to-image) mode, the script passes a base64-encoded image as a jq argument on the command line. On Windows, cmd.exe limits arguments to ~32 KB, causing 'Argument list too long' errors when the reference image is large. Fix: write intermediate JSON fragments to temp files and pass them to jq via --from-file instead of on the command line. This works reliably on Windows and is equally valid on Linux/macOS. Fixes the i2i (--mode i2i --ref-image) workflow on Windows.	2026-03-27 01:11:07 +08:00
JithendraNara	56a37aec73	docs: clarify Cursor Windows path example	2026-03-25 13:57:34 -04:00
JithendraNara	6c404106e3	docs: add Cursor install guide and VS Code note	2026-03-25 13:53:04 -04:00
Divit Kashyap	0a61f2be6a	Merge branch 'main' into feat/vision-analysis	2026-03-25 11:55:49 +00:00
Divit Kashyap	6d8beb6ade	feat(vision-analysis): add proactive MCP setup helper with multi-environment instructions	2026-03-25 11:54:29 +00:00
Divit Kashyap	0b2927a366	feat(vision-analysis): add image analysis skill with OCR, UI review, chart extraction	2026-03-25 11:23:06 +00:00