Technology
- Raima Muhammad
- Jan 1, 2025
- 2 min read
Updated: Jan 8
When was the last time you called a company and spoke directly to a human? When was the last time you typed a full sentence into Google before it finished it for you? These everyday moments reflect the evolution of AI: models many gigabytes in size, trained on enormous amounts of data, that predict what word comes next as they carry out our searches across different platforms.
The use of Large Language Models has grown drastically over the past decade as organizations compete to develop newer and bigger versions. In 2020, OpenAI unveiled GPT-3, a language model trained to predict the next word in a sentence. GPT-3 has 175 billion parameters and was trained on 570 gigabytes of text, in contrast to its predecessor, GPT-2, which had only 1.5 billion parameters, over 100 times fewer. GPT-3 can perform tasks it was not explicitly trained for, such as translating sentences into various languages with few or no training examples. It also supports capabilities like text summarization, chatbots, search, and code generation that are absent in earlier models.
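The next-word objective described above can be illustrated with a toy bigram model. This is a minimal sketch for intuition only, not how GPT-3 is actually built: instead of simple word-pair counts over a tiny corpus, GPT-3 learns 175 billion parameters over hundreds of gigabytes of text.

```python
from collections import Counter, defaultdict

# Tiny illustrative corpus (an assumption for this sketch).
corpus = (
    "the model predicts the next word "
    "the model learns from text "
    "the next word depends on the previous word"
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the word most frequently seen after `word` in the corpus."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("next"))  # the word that most often follows "next"
```

A real LLM replaces these counts with a neural network that conditions on the entire preceding context, but the task is the same: predict the next word.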
Large Language Models will continue to grow in size, power, and versatility; however, that does not mean they will be free of shortcomings. GPT-3, for instance, has been known to generate racist, sexist, and bigoted text, as well as plausible-sounding content that, on closer inspection, is factually inaccurate, undesirable, or unpredictable. Additionally, these models can be used to produce misleading essays, tweets, and news stories. A question then arises: who should be held accountable for harms resulting from poor performance, bias, or misuse?
Let’s analyze one of the greatest thought experiments in computing: the Turing Test. In 1950, the British mathematician and cryptanalyst Alan Turing published a paper outlining a provocative thought experiment whose aim was to determine whether machines could think. If a machine could consistently fool a human interrogator into believing it was human about 30% of the time, it would be considered intelligent. Gary Marcus, a cognitive scientist and co-author of the book “Rebooting AI,” argues that the test measures not intelligence but the ability of a software program to pass as human. Indeed, over the decades various programs have attempted to beat it, including ELIZA, PARRY, and Eugene Goostman.
In 2014, a chatbot called Eugene Goostman passed the legendary Turing Test, tricking 33% of a panel of judges into believing it was a real 13-year-old Ukrainian boy over the course of a five-minute chat. Not only is the test outdated now that it has been defeated, it is also a red flag: it is fundamentally about deception, and any system capable of passing it carries the danger of deceiving people.
One way to be cautious about these LLMs, then, is to amend the Turing Test: evaluate how quickly the models evolve, test whether their underlying mechanisms lead them to raise controversial issues, and, as a final measure, withhold from public release any model that fails the test. Doing so takes the moral high ground of protecting the public while still allowing the technology to evolve.
-Raima Muhammad
