3 Steps to Configure Gemini Thinking Mode: Cherry Studio and Chatbox Complete Setup Tutorial

Want your AI to "think before it speaks" just like a human? Gemini Thinking mode is Google's latest deep reasoning feature, allowing models to showcase their complete thought process before delivering an answer. In this post, we'll dive into how to correctly configure Gemini Thinking mode in two popular AI clients: Cherry Studio and Chatbox.

Core Value: By the end of this article, you'll know how to enable Gemini's thinking mode in Cherry Studio and Chatbox, view the model's reasoning process, and improve your results for complex tasks.



Key Highlights of Gemini Thinking Mode

Gemini Thinking mode is a deep reasoning feature introduced by Google in the Gemini 2.5 and 3 series models. Unlike standard conversations, Thinking mode allows the model to perform internal reasoning before providing a final answer, significantly improving accuracy for complex tasks.

Key Point | Description | Value
Thought Visualization | Displays the model's reasoning process | Understand how the AI reaches its conclusions
Enhanced Reasoning | Multi-step logical reasoning | Solve complex math and programming problems
Controllable Thinking Depth | Adjust the thinking token budget | Balance speed and accuracy
Model Compatibility | Full Gemini 2.5/3 series | Flexibly choose the right model for your scenario

Models Supporting Gemini Thinking Mode

Currently, the following Gemini models support Thinking mode:

Model Name | Model ID | Thinking Parameter | Default Behavior
Gemini 3 Pro | gemini-3-pro-preview | thinking_level | Dynamic thinking (HIGH)
Gemini 3 Flash | gemini-3-flash-preview | thinking_level | Dynamic thinking (HIGH)
Gemini 2.5 Pro | gemini-2.5-pro | thinking_budget | Dynamic (8192 tokens)
Gemini 2.5 Flash | gemini-2.5-flash | thinking_budget | Dynamic (-1)
Gemini 2.5 Flash-Lite | gemini-2.5-flash-lite | thinking_budget | Off by default (0)

🎯 Technical Tip: For actual development or production, we recommend using the APIYI (apiyi.com) platform to call Gemini Thinking models. It provides OpenAI-compatible interfaces, so you don't have to deal with complex Google API authentication flows.


Detailed API Parameters for Gemini Thinking Mode

Different versions of Gemini models use different parameters to control the thinking process:

Gemini 3 Series – thinking_level parameter

Level | Description | Best For
minimal | Minimal thinking | Simple Q&A
low | Low thinking | Casual conversation
medium | Medium thinking | General reasoning
high | Deep thinking (default) | Complex tasks

Gemini 2.5 Series – thinking_budget parameter

Value | Description | Best For
0 | Disable thinking | Fast responses
-1 | Dynamic thinking (recommended) | Automatic adjustment
128-32768 | Specify an exact token count | Fine-grained control
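
To see how these two parameter families are actually passed in code, here is a minimal sketch using Google's official google-genai Python SDK. It follows the patterns in Google's thinking documentation, but exact field support (particularly thinking_level on the Gemini 3 preview models) can vary with SDK version, so treat it as an illustration rather than a drop-in snippet.

# Minimal sketch: controlling thinking depth with the google-genai SDK.
# pip install google-genai
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

# Gemini 2.5 series: thinking_budget (token count; -1 = dynamic, 0 = off)
resp_25 = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Prove that the square root of 2 is irrational.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=-1)
    ),
)
print(resp_25.text)

# Gemini 3 series: thinking_level ("minimal" / "low" / "medium" / "high")
resp_3 = client.models.generate_content(
    model="gemini-3-pro-preview",
    contents="Prove that the square root of 2 is irrational.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_level="high")
    ),
)
print(resp_3.text)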

Configuring Gemini Thinking Mode in Cherry Studio

Cherry Studio is a powerful AI client that supports 300+ models and various AI providers. Here's a detailed guide on how to configure Gemini Thinking mode in Cherry Studio.

Step 1: Add the Gemini API Provider

  1. Open Cherry Studio and go to Settings → Providers.
  2. Find Gemini or Custom Provider.
  3. Enter your API configuration:
     • API Address: https://api.apiyi.com/v1
     • API Key: your APIYI key

💡 Configuration Tip: Using APIYI (apiyi.com) as your API address provides more stable access and a unified interface format.
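
Before moving on to Step 2, you can optionally verify that the address and key are accepted. The sketch below uses the OpenAI Python SDK and assumes the gateway exposes the standard /v1/models listing, as OpenAI-compatible endpoints generally do.

# Optional sanity check: confirm the API address and key work before adding models.
import openai

client = openai.OpenAI(
    api_key="sk-your-apiyi-key",          # the key you will paste into Cherry Studio
    base_url="https://api.apiyi.com/v1"   # the same API address
)

# Listing models is the cheapest way to confirm the endpoint accepts the key.
for model in client.models.list().data:
    if "gemini" in model.id:
        print(model.id)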

Step 2: Add Gemini Thinking Models

Click the "Manage" or "Add" button at the bottom to manually add the following models:

Model Name to Add | Description
gemini-3-pro-preview | Gemini 3 Pro (Thinking version)
gemini-3-flash-preview | Gemini 3 Flash (Thinking version)
gemini-2.5-pro | Gemini 2.5 Pro (Thinking version)
gemini-2.5-flash | Gemini 2.5 Flash (Thinking version)

Step 3: Enable the Thinking Mode Toggle

In the chat interface:

  1. Click the Settings icon in the top right corner.
  2. Find the Thinking Mode option.
  3. Switch the toggle to ON.


Custom Parameter Configuration in Cherry Studio

If the UI toggle doesn't seem to work, you can manually configure the custom parameters:

For Gemini 3 models:

{
  "generationConfig": {
    "thinkingConfig": {
      "thinkingLevel": "high",
      "includeThoughts": true
    }
  }
}

For Gemini 2.5 models:

{
  "generationConfig": {
    "thinkingConfig": {
      "thinkingBudget": -1,
      "includeThoughts": true
    }
  }
}

Paste the relevant JSON into the Custom Parameters section in Cherry Studio.


Detailed configuration steps:

  1. Open Model Settings: Click the model name at the top of the chat box.
  2. Go to Advanced Settings: Scroll down to the "Custom Parameters" area.
  3. Paste JSON: Copy and paste the JSON configuration for your specific model.
  4. Save and Test: Send a message to verify that the thinking process is displayed.

Troubleshooting:

  • Ensure the JSON format is correct (no trailing commas); a quick way to check is shown below.
  • Confirm the model name matches your configuration.
  • Check if your API Key is valid.
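
If you suspect the JSON itself is the problem, a quick local check catches most issues (this is a generic Python validation sketch, not a Cherry Studio feature):

# Quick check that a custom-parameters snippet is valid JSON (trailing commas fail here too).
import json

snippet = '''
{
  "generationConfig": {
    "thinkingConfig": {
      "thinkingLevel": "high",
      "includeThoughts": true
    }
  }
}
'''

try:
    json.loads(snippet)
    print("Valid JSON")
except json.JSONDecodeError as err:
    print(f"Invalid JSON: {err}")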

🚀 Quick Start: We recommend using the APIYI (apiyi.com) platform to get your API Key. It supports the full range of Gemini models and is much easier to set up.


Configuring Gemini Thinking Mode in Chatbox

Chatbox is another popular AI desktop client known for its clean interface and multi-platform support. Here's how to set up Gemini Thinking mode in Chatbox.

Step 1: Configure the API Provider

  1. Open Chatbox and click Settings in the bottom left.
  2. Select Model Provider → Custom.
  3. Configure the API details:
     • Name: Gemini Thinking
     • API Type: OpenAI Compatible
     • API Host: https://api.apiyi.com
     • API Key: sk-your-apiyi-key

Step 2: Select a Thinking Model

Enter or select one of the following in the model selector:

  • gemini-3-pro-preview – Best reasoning capabilities
  • gemini-2.5-pro – Balanced performance and cost
  • gemini-2.5-flash – Fast responses

Step 3: Configure Thinking Parameters

Chatbox allows you to configure thinking mode via Extra Parameters:

{
  "thinking_config": {
    "thinking_level": "high"
  }
}

Or using thinking_budget:

{
  "thinking_config": {
    "thinking_budget": 8192
  }
}

Chatbox Thinking Process Display Settings

By default, Chatbox collapses the thinking process. You can adjust how it's displayed:

Setting | Function | Recommended Value
Show Thinking Process | Expand/collapse thinking content | On
Thinking Process Style | Separate block / inline display | Separate block
Auto-collapse | Automatically collapse long thoughts | On
Chatbox configuration code example:

# Configuring Gemini Thinking using the OpenAI SDK
import openai

client = openai.OpenAI(
    api_key="sk-your-apiyi-key",
    base_url="https://api.apiyi.com/v1"  # APIYI unified interface
)

response = client.chat.completions.create(
    model="gemini-2.5-pro",
    messages=[
        {"role": "user", "content": "Explain why 1+1=2"}
    ],
    extra_body={
        "thinking_config": {
            "thinking_budget": 8192,
            "include_thoughts": True
        }
    }
)

# Output thinking process and the answer
print(response.choices[0].message.content)
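
For the Gemini 3 series, the same pattern applies with thinking_level in place of thinking_budget. This mirrors the snippet above and assumes the gateway passes the extra body through unchanged, so treat it as a sketch rather than a guaranteed interface.

# Same pattern for a Gemini 3 model: thinking_level instead of thinking_budget.
import openai

client = openai.OpenAI(
    api_key="sk-your-apiyi-key",
    base_url="https://api.apiyi.com/v1"
)

response = client.chat.completions.create(
    model="gemini-3-pro-preview",
    messages=[
        {"role": "user", "content": "Design an algorithm to detect a cycle in a linked list"}
    ],
    extra_body={
        "thinking_config": {
            "thinking_level": "high",   # Gemini 3 uses a level, not a token budget
            "include_thoughts": True
        }
    }
)

print(response.choices[0].message.content)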

Gemini Thinking Mode Best Practices

Thinking Depth Configuration for Different Scenarios

Scenario | Recommended Model | Thinking Configuration | Description
Mathematical proofs | gemini-3-pro-preview | thinking_level: high | Requires rigorous reasoning
Code debugging | gemini-2.5-pro | thinking_budget: 16384 | Complex logic analysis
General Q&A | gemini-2.5-flash | thinking_budget: -1 | Dynamic adaptation
Fast responses | gemini-2.5-flash-lite | thinking_budget: 0 | Thinking disabled
Article writing | gemini-3-flash-preview | thinking_level: medium | Balances creativity and efficiency

Thinking Token Budget Recommendations

  • Simple questions: 0-1024 tokens
  • General reasoning: 1024-4096 tokens
  • Complex tasks: 4096-16384 tokens
  • Extreme reasoning: 16384-32768 tokens
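
If you want to encode these ranges in code, a small illustrative helper might look like the following; the thresholds simply mirror the recommendations above and should be tuned to your own workloads.

# Illustrative helper: map a task category to a thinking_budget value.
# The values mirror the recommendations above; adjust them for your workloads.
BUDGETS = {
    "simple": 1024,      # simple questions: 0-1024 tokens
    "general": 4096,     # general reasoning: 1024-4096 tokens
    "complex": 16384,    # complex tasks: 4096-16384 tokens
    "extreme": 32768,    # extreme reasoning: 16384-32768 tokens
}

def thinking_budget_for(task_category: str) -> int:
    """Return a thinking_budget for a task category, defaulting to dynamic (-1)."""
    return BUDGETS.get(task_category, -1)

print(thinking_budget_for("complex"))  # 16384
print(thinking_budget_for("unknown"))  # -1 (let the model decide)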

💡 Pro Tip: Choosing the right thinking depth depends largely on the complexity of your task. We recommend running actual tests on the APIYI (apiyi.com) platform to find the configuration that works best for your specific use case. The platform supports all Gemini Thinking models, making it easy to compare results quickly.

Figure: Thinking Token Budget vs. Task Complexity (choose the right thinking depth based on task type)

  • Simple questions (0-1,024 tokens): simple translation, info lookup
  • General reasoning (1K-4K tokens): code explanation, article summaries
  • Complex tasks (4K-16K tokens): algorithm design, debugging, option comparison, architecture decisions
  • Extreme reasoning (16K-32K tokens): math proofs, complex coding, deep reasoning, strategic planning
💡 Recommended: Use thinking_budget: -1 (Dynamic Mode) to let the model auto-adjust its thinking depth

Thinking Mode vs. Standard Mode Comparison

Dimension | Standard Mode | Thinking Mode
Response Speed | Fast (1-3 s) | Slower (3-10 s)
Reasoning Depth | Shallow | Deep, multi-step
Token Consumption | Low | Medium to high
Accuracy (Complex Tasks) | 60-70% | 85-95%
Explainability | Low | High (thinking process visible)
Best For | Simple Q&A | Complex reasoning tasks

FAQ

Q1: Why isn't Cherry Studio showing the reasoning process even after I turn on Thinking Mode?

This is a known issue. The UI switch for some providers might not work correctly, so you'll need to manually add the JSON configuration in the "Custom Parameters":

{
  "generationConfig": {
    "thinkingConfig": {
      "thinkingLevel": "high",
      "includeThoughts": true
    }
  }
}

Make sure includeThoughts is set to true; this is the key parameter for displaying the reasoning process. When calling through the APIYI (apiyi.com) platform, returning the reasoning summary is enabled by default.

Q2: What’s the difference between parameters for Gemini 2.5 and Gemini 3?

The two series use different parameters to control Thinking Mode:

  • Gemini 3 Series: Uses the thinkingLevel parameter, with values: minimal/low/medium/high.
  • Gemini 2.5 Series: Uses the thinkingBudget parameter, with values ranging from 0 to 32768.

Don't mix these parameters, or the API will return an error. We recommend using the unified interface via APIYI (apiyi.com), as the platform automatically handles parameter compatibility for you.
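
If you switch between the two series in the same code base, a small hypothetical helper can keep the parameter families apart; the field names below follow the camelCase thinkingConfig format used in the Cherry Studio custom-parameter examples earlier in this guide.

# Hypothetical helper: build the correct thinkingConfig block for a given model.
def thinking_config_for(model: str) -> dict:
    if model.startswith("gemini-3"):
        # Gemini 3 series: thinkingLevel (minimal / low / medium / high)
        return {"thinkingConfig": {"thinkingLevel": "high", "includeThoughts": True}}
    if model.startswith("gemini-2.5"):
        # Gemini 2.5 series: thinkingBudget (0, -1, or 128-32768)
        return {"thinkingConfig": {"thinkingBudget": -1, "includeThoughts": True}}
    raise ValueError(f"Not a recognized Gemini model: {model}")

print(thinking_config_for("gemini-3-pro-preview"))
print(thinking_config_for("gemini-2.5-flash"))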

Q3: How much extra token consumption does Thinking Mode add?

Thinking tokens are billed separately. Taking Gemini 2.5 Pro as an example:

  • Default thinking budget: 8192 tokens
  • Maximum thinking budget: 32768 tokens

Actual consumption depends on the task's complexity. For simple questions, the model might only use a few hundred thinking tokens; for complex problems, it might use up the entire budget. Setting thinkingBudget: -1 allows the model to adjust automatically, which is often the most cost-effective choice.
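
To see how many thinking tokens a specific call consumed, Google's google-genai SDK exposes a separate counter in the usage metadata. A sketch based on the official thinking documentation (field names may shift slightly between SDK versions):

# Sketch: inspect thinking-token usage with the google-genai SDK.
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Find all integer solutions of x^2 - y^2 = 17.",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(thinking_budget=-1)
    ),
)

# thoughts_token_count: tokens spent on internal reasoning (billed separately)
print("Thinking tokens:", response.usage_metadata.thoughts_token_count)
print("Output tokens:", response.usage_metadata.candidates_token_count)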

Q4: How can I get just the reasoning summary instead of the full process?

By setting includeThoughts: true in your API call, you'll get a reasoning summary rather than the full internal thinking tokens. Summaries are more concise and perfect for displaying in a UI. The full internal reasoning process isn't currently open to the public.
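
With Google's official google-genai SDK, the reasoning summary comes back as separate response parts flagged with part.thought; the sketch below follows the official thinking documentation.

# Sketch: separate the thought summary from the final answer (google-genai SDK).
from google import genai
from google.genai import types

client = genai.Client(api_key="YOUR_GEMINI_API_KEY")

response = client.models.generate_content(
    model="gemini-2.5-pro",
    contents="Explain why 1+1=2",
    config=types.GenerateContentConfig(
        thinking_config=types.ThinkingConfig(
            thinking_budget=8192,
            include_thoughts=True   # return a reasoning summary alongside the answer
        )
    ),
)

for part in response.candidates[0].content.parts:
    if not part.text:
        continue
    if part.thought:
        print("Thought summary:\n", part.text)
    else:
        print("Answer:\n", part.text)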

Q5: Which tasks are best suited for Thinking Mode?

Thinking Mode is particularly great for tasks requiring multi-step reasoning:

  • Math proofs and calculations
  • Code debugging and algorithm design
  • Logical reasoning and problem analysis
  • Strategic planning and decision analysis

Simple tasks like information retrieval, translation, or basic summarization don't really need Thinking Mode—it'll just increase latency and costs without much benefit.


Summary

Gemini Thinking Mode is a powerful feature for boosting Large Language Model reasoning capabilities. Through this configuration guide, you've learned how to:

  1. Understand Thinking Mode principles: Master the differences between the thinking_level and thinking_budget parameters.
  2. Configure Cherry Studio: Enable Thinking Mode via the UI switch or custom JSON parameters.
  3. Configure Chatbox: Use "Extra Parameters" to set up reasoning options.
  4. Follow best practices: Choose the right thinking depth based on task complexity.

We recommend using APIYI (apiyi.com) to quickly verify the effects of Gemini Thinking Mode. The platform provides a unified OpenAI-compatible interface, supports all Gemini 2.5 and 3 series models, and offers simpler configuration with more stable access.


References

  1. Google Gemini Thinking official documentation (full API parameter descriptions): ai.google.dev/gemini-api/docs/thinking
  2. Cherry Studio official documentation (client configuration guide): docs.cherry-ai.com
  3. Gemini model list (models supporting Thinking mode): ai.google.dev/gemini-api/docs/models

Author: APIYI Team
Technical Support: For Gemini API access or technical consulting, please visit APIYI at apiyi.com
