Written by: Andrew Scheiner & Lauren Johnston

EDST370: “Education and Emerging Technologies”, Dickinson College

        Everyone remembers the frustration of talking to a customer service chatbot while trying to cancel a phone plan or check a package’s shipping status. Communicating with a “robot” on your computer can be a real struggle, especially for those who are less comfortable with technology and just want some help. But what if I told you that a new wave of chatbots can do as much as translate foreign languages for you? November 30th, 2022 was a day that changed the trajectory of human connection with artificial intelligence (AI). On this day, an AI research company named OpenAI released the first demo version of a brand-new chatbot whose name will go down in history - ChatGPT. Hopefully this product name rings a bell, though I can’t know whether hearing “ChatGPT” gives you the chills or makes you really, really excited to work with an incredible new piece of technology. Whether or not you are afraid that this is the first step toward robots taking over the world, we are here to talk about how generative AI chatbots like ChatGPT can help improve the future of education, specifically in the field of foreign language learning. In our case, we investigated how successful ChatGPT, Google’s Gemini, and Microsoft’s Copilot each were in translating English to Chinese. This blog post will cover a lot, including:

  • What generative AI is and how users can interact with it to translate languages
  • How exactly AI translates
  • What our experiment was: a test of each platform’s performance
  • Our results, for each model specifically and for AI as a whole
  • Concluding thoughts, translation potential, and where we could go next

Generative AI and Language Translations

        For those who are new to the idea of generative artificial intelligence, I would recommend thinking of it as machine learning systems that are capable of generating text, images, code, or other content.[1] While our post will specifically examine (1) ChatGPT, (2) Google’s Gemini, and (3) Microsoft’s Copilot, there are several other platforms, such as Claude (another chatbot) and DALL-E 2 (an image generator), among others.

        Next, I think it’s important to understand what models each of our chatbots is based on. To start, ChatGPT is a chat-style user interface that generates human-like text responses using either GPT-3.5 or GPT-4, both of which are large language models (LLMs). In essence, LLMs are models trained on huge amounts of text, much of it from the Internet, so that they can analyze and generate human-like content, particularly in conversation.[2]

        Then, we have Google’s Gemini, which is the name of both Google’s own LLM and the chat interface that users can communicate with and even ask to generate images! And lastly, we will look at the performance of Copilot, Microsoft’s chatbot (formerly Bing Chat), which is built on GPT models from OpenAI (the same company that developed ChatGPT). It should not be confused with GitHub Copilot, a separate Codex-based tool that specializes in assisting with computer programming.

        So how can we use this knowledge to help us learn new languages? Well, the good news is that we have some great models and interfaces, which we will be experimenting with later on. But for now, getting started with a new language can be as easy as opening up your web browser, clicking a button, and typing in

“Can you please translate ‘good morning’ into Spanish for me?”

Or

“I am a first-year student at a university who is taking a Spanish class. You will play the role of my Spanish professor to help me learn. Let’s first have a conversation in Spanish and you will give me feedback on my responses.”

        Both of these prompts will get us what we want, in different ways. Directly asking the chatbot to translate will get the job done for sure, but giving the AI a role makes for a more fluid conversation, encouraging independent and continued learning. In our experiment, we will be showing sample prompts and responses and comparing the models to one another.
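
        As a rough sketch of the difference between the two prompt styles, here is how they could be built as plain text in Python. The function names direct_prompt and role_prompt are our own hypothetical helpers for illustration, not part of any chatbot's software; the resulting text would simply be pasted into the chat window.

```python
# Hypothetical helpers for illustration only; not part of any chatbot's API.

def direct_prompt(phrase: str, language: str) -> str:
    """Build a one-off translation request."""
    return f"Can you please translate '{phrase}' into {language} for me?"


def role_prompt(language: str) -> str:
    """Build a prompt that gives the chatbot the role of a professor."""
    return (
        f"I am a first-year student at a university who is taking a {language} class. "
        f"You will play the role of my {language} professor to help me learn. "
        f"Let's first have a conversation in {language} and you will give me "
        "feedback on my responses."
    )


print(direct_prompt("good morning", "Spanish"))
print(role_prompt("Spanish"))
```

        Swapping "Spanish" for "Chinese" gives the kind of prompts used for the two conditions in our experiment.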

Translation Specifics

        It’s great that we have this new tech to help us speak another language - but how does it translate, and how can we de-stigmatize artificial intelligence? I want to take a quick step backwards and point out a platform that many are well aware of: Google Translate. Has anyone else wondered how it works? Well, it works very similarly to how something like GPT learns: through a neural network, where a machine learns from a massive amount of data (think of it as examples). Thus, you can think of the generative AI platforms we have discussed as working in much the same way as Google Translate. Translate draws on the patterns it has learned from a vast number of existing translations to work out how to translate what you ask it to! We won’t be diving into the idea of neural networks here (it’s really complex), but we recommend learning more if you are interested in the software/technology/business side of educational technology.

        The main difference between something like Google’s Gemini and Google Translate is that Gemini (like all LLMs) is actually predicting which word will come next in its response to you. So when we ask it to translate something from English to Chinese, Gemini is actually predicting what the most logical thing to say is. The goal in machine learning is to teach a machine what is correct or known through data, then use what it has learned to make predictions about unknown data. So, if we were to ask it to translate “Hello” into Chinese, Gemini would use its vast network of knowledge to predict that the most logical response is 你好, or nǐ hǎo. Luckily, in this case Gemini makes the right prediction, and that is the correct translation. Hopefully this explains why, if we ask AI a question, the answer might not always be 100% accurate. It’s basically continually predicting the next word in the response and giving the result to us to evaluate.
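
        To make the idea of "predicting the next word" a little more concrete, here is a toy sketch in Python. The probabilities are invented for illustration, and the function predict_next is our own hypothetical name; a real LLM computes scores like these with a neural network over a huge vocabulary, so treat this as a cartoon of the idea rather than how Gemini actually works.

```python
# Toy illustration only: these probabilities are made up. A real LLM computes
# a score for every token in a very large vocabulary using a neural network.
next_word_probs = {
    "你好": 0.90,  # "hello"
    "再见": 0.06,  # "goodbye"
    "谢谢": 0.04,  # "thank you"
}


def predict_next(probs: dict[str, float]) -> str:
    """Return the candidate word with the highest probability."""
    return max(probs, key=probs.get)


prompt = 'Translate "Hello" into Chinese:'
print(prompt, predict_next(next_word_probs))
```

        Because the model only picks what looks most likely, a wrong answer is simply the case where the highest-scoring word is not the correct one.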

Our Experiment

        Finally, what everyone has been waiting for: our experiment in using generative AI to translate. Our goal was to find out how successful AI would be in translating English to Chinese. On a “scientific” level, here were the details of our experiment:

  • Independent variables (what changed): the chatbot used and whether we gave it a role or asked for translations directly
  • Dependent variable (what we measured): each chatbot’s success in translation
  • Controls: the prompts we used and the phrases we asked each chatbot to translate

        First, let’s look at how ChatGPT did. As for success, when we gave it the role of a teacher, it correctly translated 6 out of 7 prompts, but only 4 out of 7 when we directly asked it to translate for us. In addition, we found that its responses were very brief. Here is an example of what it looked like to ask ChatGPT for a translation:

ChatGPT translating "so-so" and "it's a deal" from English to Chinese.

Luckily, we had a much better experience working with Google’s Gemini chatbot. When we asked Gemini to translate directly, it was 100% successful, which was a great start. And to our surprise, it was even more helpful when we gave Gemini a role as a teacher. Not only did Gemini translate all of our statements correctly, but it also:

  • Gave a friendlier response in its role as our Chinese teacher
  • Gave more advice on how it could help us further with translation
  • Often gave several options for a translation (formal, informal, additional alternatives) and explained the differences
  • Gave more sophisticated answers

Gemini translating "it's a deal" from English to Chinese.

Lastly, to conclude our experiment, we observed Microsoft’s Copilot. We were pleased to see that Copilot was 100% successful in translating our seven statements, both when asked to translate directly and when acting as our teacher. The best feature we found was that it gave us hyperlinks to the sources it drew its responses from. However, when we first started the roleplay, it responded only in Chinese. Once we prompted it for translations, it answered in English while giving the Chinese translations. So even though our roleplay began by telling Copilot we are a first-year student, it automatically started in Chinese. This is a problem, as it can be daunting to read a response written only in Chinese when you are still learning.

Copilot translating "it's a deal" from English to Chinese.

Results and Platform Performance

        As we analyze the impact of our experiment, the best place to start is that we clearly got more accurate and effective translations out of Gemini and Copilot than out of ChatGPT. Gemini and Copilot have the most potential to be used as “teaching assistants,” while ChatGPT seems much more similar to Google Translate - a direct translator. But it’s important to note that using chatbots can be an upgrade over Google Translate because we can have a flowing, human-like conversation and we can save our conversation history (and even copy the output!). Now that we know Gemini and Copilot were the best conversational translation chatbots, which should we use? Since their accuracy was the same in our tests, we recommend Gemini if you are more interested in seeing several ways to say a statement and getting more explanation. If you would rather get helpful translations while also being connected to external resources, Copilot can be a great choice too.
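
        For readers who like to see the numbers side by side, the counts reported in our experiment section can be tallied with a few lines of Python. This is only a sketch: the layout of the results table and the accuracy helper are our own hypothetical choices, and Gemini’s “100%” is assumed to mean 7 out of 7, matching the seven statements used for the other chatbots.

```python
# Counts as reported in the experiment section. Assumption: Gemini's "100%"
# means 7 out of 7, the same seven statements used for the other chatbots.
results = {
    ("ChatGPT", "direct"): 4,
    ("ChatGPT", "teacher role"): 6,
    ("Gemini", "direct"): 7,
    ("Gemini", "teacher role"): 7,
    ("Copilot", "direct"): 7,
    ("Copilot", "teacher role"): 7,
}
TOTAL = 7  # statements per chatbot and prompt style


def accuracy(correct: int, total: int = TOTAL) -> float:
    """Share of statements translated correctly, as a percentage."""
    return 100 * correct / total


for (chatbot, style), correct in results.items():
    print(f"{chatbot:8} {style:13} {correct}/{TOTAL} = {accuracy(correct):.0f}%")
```

        With only seven statements per condition, a single miss shifts the score by about 14 percentage points, so these percentages are best read as rough indicators rather than precise measurements.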

        If this experiment should convey anything, we hope it alleviates the fear that these chatbots will be taking over the world anytime soon. But on a more serious note, generative AI has the potential to be a massive upgrade over getting help with a homework assignment from Google Translate. Chatbots like Gemini and Copilot can be great secondary learning tools to expand language learning because of their communication ability, enormous knowledge bank, and power of explanation.

Conclusion and What’s Next?

We think that generative AI could potentially serve as a “language partner” for new learners. An essential part of improving one’s foreign language skills is to practice communicating. However, not everyone is in contact with native speakers. A simple solution is to practice with generative AI. This method cannot help a language learner communicate orally, but it can help one practice new vocabulary and sentence structures. Generative AI as a tool for studying is a much more engaging method compared to reviewing textbook dialogue and can even introduce new words.

However, while generative AI has the ability to translate almost anything from English into Chinese, we do not see it replacing foreign language teachers. Based on this experiment, we can conclude that generative AI does not reliably understand one’s personal language level. For example, when we gave Microsoft’s Copilot the role of a first-year Chinese teacher, it instantly switched to writing in Chinese, using words that a level-one student would not know, especially when first learning the language.

The phrases that we chose for this experiment were based on patterns that do not exist in English. The generative AI platforms that we chose are English-based by default. One phrase in particular, “it’s a deal,” is expressed in Chinese with a four-character idiom. These idioms (or 成语 chéngyǔ) are an essential part of the Chinese language, as some ideas are only expressed using them.

We can see future foreign language teachers encouraging students to study with generative AI. While I highly doubt that AI can replace teachers, it can be a very helpful tool outside of the classroom. All forms of AI continue to evolve, and updated versions of these platforms can be used to enrich a classroom if they serve as tools. In addition, in a large foreign language classroom, educators can use AI to act as a teacher to students on an individual level while still keeping cautious oversight of how AI is helping each student. This would help students get the one-on-one help they might require to maximize their language learning potential.

Thanks for reading!


[1] Generative AI Defined: How It Works, Benefits and Dangers (techrepublic.com).
[2] Wikipedia and https://indatalabs.com/blog/chatgpt-large-language-model.