Decoding Cross-Lingual Responses: Why Your AI Assistant Switches from Chinese to Korean and How to Fix It
Overview
Have you ever typed a prompt in Chinese to your coding assistant, only to get a reply in Korean? This puzzling behavior is more than a glitch—it’s a window into how large language models (LLMs) handle multilingual input, especially when code vocabulary reshapes the embedding space. In this tutorial, we’ll explore the mechanics behind such cross-lingual responses, then build a practical solution to detect and prevent unwanted language switches. By the end, you’ll understand embedding spaces, token overlap, and how to fine-tune your assistant for consistent language output.

Prerequisites
To follow along, you’ll need:
- Python 3.8 or later
- Basic knowledge of transformers and tokenizers (e.g., from Hugging Face)
- Familiarity with NumPy and cosine similarity
- An environment with transformers, torch, and sentence-transformers installed
Install dependencies:
pip install transformers torch sentence-transformers
Step-by-Step Instructions
Step 1: Understand Embeddings and Language Overlap
LLMs like GPT or CodeLlama represent every token as a vector in a high-dimensional embedding space. When you mix languages—especially in coding contexts—tokens from different languages can occupy similar regions due to overlapping semantics (e.g., common programming keywords like print()). This similarity can cause the model to produce tokens from a different language than expected.
For example, consider these embeddings:
- Chinese token: “打印” (print)
- Korean token: “인쇄” (print)
- Java keyword: “System.out.println”
If your prompt contains code, the model might anchor to a region where Chinese and Korean embeddings intersect, leading to a Korean response.
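To make the geometry concrete, here is a toy sketch with made-up 3-dimensional "embeddings" (the vectors below are illustrative assumptions, not real model outputs). Tokens whose vectors point in nearly the same direction have cosine similarity close to 1, which is exactly the condition under which decoding can drift between languages:

```python
import math

# Toy 3-D "embeddings" (illustrative values, not from a real model)
vectors = {
    "打印 (zh)": [0.90, 0.40, 0.10],
    "인쇄 (ko)": [0.88, 0.43, 0.12],
    "banana":    [0.05, 0.20, 0.95],
}

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

# The two "print" tokens are nearly parallel; the unrelated word is not.
print(cosine(vectors["打印 (zh)"], vectors["인쇄 (ko)"]))  # close to 1.0
print(cosine(vectors["打印 (zh)"], vectors["banana"]))     # much lower
```

In a real model the vectors have hundreds of dimensions, but the intuition is the same: when the Chinese and Korean "print" vectors sit this close together, a small nudge from surrounding code tokens can tip the model into the other language.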
Step 2: Inspect the Embedding Space
We’ll use sentence-transformers to visualize token similarities. Run this Python script:
from sentence_transformers import SentenceTransformer
import numpy as np

model = SentenceTransformer('all-MiniLM-L6-v2')
sentences = [
    "打印变量",        # Chinese: "print variable"
    "변수 출력",       # Korean: "print variable"
    "print variable",  # English
    "int main()"       # Code snippet
]
# Normalize so the dot product equals cosine similarity
embeddings = model.encode(sentences, normalize_embeddings=True)
similarities = embeddings @ embeddings.T
print(similarities)
You should see noticeably higher similarity between the Chinese and Korean programming phrases than between unrelated pairs. This overlap is what allows the model to drift from one language to the other mid-generation.
Step 3: Detect Language Switch in Real Time
Build a detection function that monitors the assistant’s output language. We’ll use langdetect (or a simple character-range check). First, install it:
pip install langdetect
Then implement a wrapper for your assistant:
from langdetect import detect

def check_language(text):
    try:
        return detect(text)
    except Exception:  # langdetect raises on empty or ambiguous input
        return 'unknown'
# Example: when you send a Chinese prompt, check whether the response language changes
prompt = "如何在Python中打印变量?"  # Chinese: "How do I print a variable in Python?"
response = assistant.generate(prompt)  # your model call
lang_resp = check_language(response[:50])  # check the first 50 characters
if lang_resp == 'ko':
    print("ALERT: Language switch to Korean detected!")
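As noted, langdetect is unreliable on short, code-mixed strings, so a pragmatic fallback is a plain Unicode-range check. This sketch covers only the main Hangul Syllables (U+AC00–U+D7A3) and CJK Unified Ideographs (U+4E00–U+9FFF) blocks; it is a simplified heuristic, not a full language identifier:

```python
def char_range_language(text):
    """Guess zh/ko by counting Hangul vs. CJK characters; ASCII code is ignored."""
    hangul = sum(1 for c in text if '\uac00' <= c <= '\ud7a3')
    cjk = sum(1 for c in text if '\u4e00' <= c <= '\u9fff')
    if hangul == 0 and cjk == 0:
        return 'other'
    return 'ko' if hangul >= cjk else 'zh'

print(char_range_language("변수 출력"))              # 'ko'
print(char_range_language("如何在Python中打印变量?"))  # 'zh'
print(char_range_language("print(x)"))               # 'other'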
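As noted, langdetect is unreliable on short, code-mixed strings, so a pragmatic fallback is a plain Unicode-range check. This sketch covers only the main Hangul Syllables (U+AC00–U+D7A3) and CJK Unified Ideographs (U+4E00–U+9FFF) blocks; it is a simplified heuristic, not a full language identifier:

```python
def char_range_language(text):
    """Guess zh/ko by counting Hangul vs. CJK characters; ASCII code is ignored."""
    hangul = sum(1 for c in text if '\uac00' <= c <= '\ud7a3')
    cjk = sum(1 for c in text if '\u4e00' <= c <= '\u9fff')
    if hangul == 0 and cjk == 0:
        return 'other'
    return 'ko' if hangul >= cjk else 'zh'

print(char_range_language("변수 출력"))              # 'ko'
print(char_range_language("如何在Python中打印变量?"))  # 'zh'
print(char_range_language("print(x)"))               # 'other'
```

Because it only counts code points, this works even on mixed strings like a Chinese comment inside a Python snippet, where langdetect tends to misfire.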
Step 4: Fix with Context Reinforcement
Prevent language switching by adding explicit language instructions in your system prompt. For example:
system_prompt = "You are a helpful coding assistant. Always respond in the same language as the user's last message. If the user writes in Chinese, reply in Chinese."
response = assistant.generate(system_prompt + "\n" + user_input)
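The system prompt and the detector from Step 3 can be combined into a simple retry loop: generate, check the response language, and regenerate with a stronger instruction if it drifted. The sketch below takes the generation and detection functions as parameters so any backend can be plugged in; `generate_fn`, `check_language`, and the retry limit are assumptions, not a fixed API:

```python
def generate_in_language(generate_fn, prompt, expected_lang, check_language,
                         max_retries=2):
    """Call generate_fn(prompt); if the detected language differs from
    expected_lang, retry with an explicit language instruction prepended."""
    response = generate_fn(prompt)
    for _ in range(max_retries):
        if check_language(response) == expected_lang:
            return response
        reinforced = f"Respond ONLY in language '{expected_lang}'.\n{prompt}"
        response = generate_fn(reinforced)
    return response  # best effort after exhausting retries
```

With your assistant this would be called as, e.g., `generate_in_language(assistant.generate, prompt, 'zh', check_language)`. Keeping the functions injectable also makes the wrapper easy to unit-test with stub generators.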
Alternatively, use logit bias to suppress tokens from undesired languages. Here’s a snippet using Hugging Face transformers:

from transformers import (AutoTokenizer, AutoModelForCausalLM,
                          LogitsProcessor, LogitsProcessorList)
import torch

class LanguageBiasProcessor(LogitsProcessor):
    """Add a large negative bias to the logits of unwanted-language tokens."""
    def __init__(self, banned_ids, penalty=-100.0):
        self.banned_ids = banned_ids
        self.penalty = penalty

    def __call__(self, input_ids, scores):
        scores[:, self.banned_ids] += self.penalty
        return scores

tokenizer = AutoTokenizer.from_pretrained("codellama/CodeLlama-7b-hf")
model = AutoModelForCausalLM.from_pretrained("codellama/CodeLlama-7b-hf")
# Tokenize input
inputs = tokenizer(prompt, return_tensors="pt")
# Identify Korean token IDs. This requires inspecting your tokenizer's
# vocabulary mapping; the range below is only a placeholder.
korean_ids = list(range(50000, 52000))  # placeholder
processors = LogitsProcessorList([LanguageBiasProcessor(korean_ids)])
outputs = model.generate(**inputs, logits_processor=processors)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
Step 5: Train Explicit Language Embedding
For a permanent solution, fine-tune the model with language-annotated data. Collect paired examples where the language tag is prepended. Example training data:
"[LANG_ZH] 打印变量" → "打印变量"   (Chinese: "print variable")
"[LANG_KO] 변수 출력" → "변수 출력"  (Korean: "print variable")
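Preparing tagged examples like these can be automated from (language, text) pairs. This is a minimal sketch: the `[LANG_XX]` tag format follows the examples above, while the record field names (`prompt`/`completion`) are an assumption, so adapt them to whatever your dataset loader expects:

```python
def build_tagged_examples(pairs):
    """Turn (lang_code, text) pairs into prompt/completion records
    with a [LANG_XX] tag prepended, matching the examples above."""
    records = []
    for lang, text in pairs:
        tag = f"[LANG_{lang.upper()}]"
        records.append({"prompt": f"{tag} {text}", "completion": text})
    return records

data = build_tagged_examples([("zh", "打印变量"), ("ko", "변수 출력")])
print(data[0]["prompt"])  # [LANG_ZH] 打印变量
```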
Fine-tune using standard causal LM loss. This aligns the model’s outputs with the expected language tag. Use Trainer from Hugging Face:
from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=multi_lang_dataset,  # your custom dataset
)
trainer.train()
Common Mistakes
Here are pitfalls to avoid:
- Ignoring tokenization differences: A token in Chinese might be split into multiple sub-tokens, while the same concept in Korean is a single token. Check your tokenizer’s behavior first.
- Over-relying on langdetect: Langdetect works on full sentences, not short code snippets. Use character-range detection for mixed-language code.
- Applying logit bias to wrong token IDs: Tokenizer vocabularies vary. Always inspect token IDs with tokenizer.encode("한국어") to verify.
- Forgetting to normalize embeddings: Cosine similarity requires normalized vectors. Use torch.nn.functional.normalize before computing similarity.
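The last point is easy to get wrong because a raw dot product looks like a similarity score but scales with vector magnitude. A minimal check, using torch as the bullet suggests:

```python
import torch
import torch.nn.functional as F

a = torch.tensor([[3.0, 4.0]])  # norm 5
b = torch.tensor([[6.0, 8.0]])  # same direction, norm 10

raw = (a @ b.T).item()                            # 50.0: depends on magnitudes
cos = (F.normalize(a) @ F.normalize(b).T).item()  # 1.0: direction only
print(raw, cos)
```

Two vectors pointing the same way should score 1.0 regardless of length; only the normalized product delivers that.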
Summary
Language switching in coding assistants happens because code vocabulary merges embedding spaces across languages. By detecting the switch, reinforcing language context, and optionally fine-tuning, you can ensure consistent responses. This tutorial gave you a hands-on path from theory to implementation—now you can debug your AI assistant when it starts replying in Korean to your Chinese prompts.