On the same day that Google announced it was laying off 12,000 employees, I ran a small experiment. I logged onto ChatGPT, OpenAI’s hugely successful chatbot, and asked it to write a letter from Google’s CEO to the company’s employees, informing them of layoffs. I placed the result next to the letter from the real CEO, Sundar Pichai, and asked colleagues and random people to guess which of the texts was written by a human and which by a machine. Perhaps as a testament to the generic nature of texts coming from corporate spokespeople, they all chose the ChatGPT-generated text over the human-written one without hesitation. Well, all but one: an online tool developed by Copyleaks, an Israeli startup with offices in New York and a research and development center in Kiryat Shmona. “Written by AI (artificial intelligence) with high probability,” it said.
Of course, the company’s founder and CEO, Alon Yamin, wasn’t surprised. The company has incorporated the ability to recognize text created by artificial intelligence into its (also AI-based, of course) copy protection system. The system has already proven to be a powerful tool for dealing with this new technological development, and it has helped show just how heavily some people already rely on texts from artificial intelligence. “We have access to a lot of content from students and researchers,” Yamin told Calcalist in an interview from the Copyleaks offices in New York. “Over the past few weeks, we have begun enabling AI detection for content to determine the percentage of students using artificial intelligence to write content. Over the last three weeks, more than 10% of the content submitted to the system, which amounts to hundreds of thousands of documents, included text created by AI, and at that point ChatGPT had only just come out. The figure will continue to rise. We were very surprised by this number.”
Alon Yamin – Copyleaks CEO
(Photo: Red Golan, Studio Golan)
ChatGPT’s ability to generate intelligent and informative-looking texts at a level high enough to pass certification exams in subjects such as medicine or accounting, or to pass entrance exams for an MBA program with a high score, has raised fears that the days of the written academic paper – today a central tool in the learning process – are over. Yamin believes that the solution offered by Copyleaks successfully manages this crisis: “Students need to know how to write content, it is an important skill that will not disappear from the world, but there is a process of figuring out how to use this tool. Everything is very, very new.”
The company was founded about eight years ago by Yamin and his partner Yehonatan Bitton, VP of R&D. “I met Yehonatan at IDF Intelligence Unit 8200,” Yamin said. “We were programmers. After military service, he studied computer science and I studied economics and management. Soon after that we started working on Copyleaks. We focus on AI technologies for text analysis. What does the text mean, where does it come from, is it original or not, what tone was it written in, who wrote it?
“Our starting point was Yehonatan’s family business. They sell ornamental fish. He developed a website for it when he was 11 and uploaded a lot of content to rank high on Google. One day, he saw that they were sinking in the search results rankings, affecting site traffic and revenue. He saw that their competitors were copying content, and Google penalized them because the search engine ranks pages with duplicated content lower and cannot tell which is the source and which is the copy. That was the starting point. We wanted to develop a tool that would be able to identify the distribution of content on the web and determine whether it was original or not. We found that a lot of content isn’t a one-to-one copy, so we wanted something smarter that could recognize cases where someone has played with the text but left it similar enough in structure, meaning, and tone.
“From there we shifted the focus to the world of education. There it is very important to know if content is original. There are also many uses in the advertising, media and business worlds: is someone copying or stealing your content? Is there any leakage of sensitive content on the network? Everything is at a more sophisticated level than just copy and paste. We can also identify instances where content has been copied and translated, providing protection from literally every direction.”
The emergence of ChatGPT, says Yamin, didn’t surprise them: “We saw this development months ago and have been busy developing technologies that could reduce the risk. There are many advantages to working with ChatGPT, but as users we don’t know if the text was written by a human or an AI. Our technology knows what it is. It might be difficult for humans to tell the difference, but at the end of the day, an AI system writes differently, and its text looks different from a statistical perspective. There are AI crumbs that technology like ours can recognize and use to determine that content was not written by humans. The transition wasn’t easy; we worked on it for a very long time. In the end, it’s all about text, even if it’s created by AI, and we’re constantly working to analyze text content. There were a lot of changes and developments that we had to make, but also a lot of common parts that allowed us to base it on our existing infrastructure.”
“Imagine you hear a knock at the door. It sounds like a normal knock to us, but if you understand Morse code, it has meaning. Our AI knows how to speak the AI’s language and can tell AI-generated text apart from text that was not generated by AI. Our system understands how an AI text is created: it is a text based on statistical models, on data files, it is not human. There are unique things in text written by AI and that’s why it looks different. We know how to identify these things and reconstruct how the text was created.”
What feedback does the user receive?
“Currently, we say whether a submitted text was written by AI with a probability of more than 99%, for the content as a whole. That means we only say whether the text contains AI-authored content, without specifying which parts of the text were AI-authored. In the next two to three weeks, we will launch an update that will allow identification based on paragraphs and sentences. It will be possible to know at sentence and paragraph level what is and is not written by AI, and we will attach confidence percentages to each sentence. At the moment we only present things that we are 99% sure about.”
Since the release of ChatGPT, a number of tools have appeared claiming to recognize text created with it. The chatbot developer OpenAI is also planning to launch its own identification tool. What is your advantage in this game?
“We are not limited to any particular platform or model. Our technology can recognize any text generated by AI, not just ChatGPT. Additionally, the ability to detect AI text at the paragraph or sentence level is something unique to us, and this affects the quality and reliability of the results. Moreover, our development is part of a complete platform. We can also determine whether the text is original or not. We are the only platform that covers everything from duplication to copyright infringement. We are available in five languages (English, French, Spanish, Portuguese and German) and are working on more languages.”
Despite the slowdown in the global high-tech industry, Alon Yamin said this has been a prosperous period for the company: “We are in the midst of growth and hiring processes, not layoffs. This is an interesting time.”
It is very rare to find a startup in Kiryat Shmona.
“Yehonatan is from a kibbutz in the area, so we settled there. We wanted to stay in the area to see how we could do something with startups there. Now the VC firm JVP has opened offices and there is progress.”
Is it difficult to recruit employees there?
“At the stage we’re at now, that’s less of an issue. At first, it took us a while to figure out how best to do this. We had to figure out how to work with colleges and universities in the area. Our first employees were Druze from the area who were studying engineering, and now there are five or six Druze workers, many workers who come from the universities, and many workers who worked in Tel Aviv and are originally from the North. For them, this made it possible to return north.”