Garbage In, Garbage Out

By Paul

December 20, 2022

OpenAI is an organization dedicated to the benevolent use and development of artificial intelligence (AI).

I’ve had some interesting discussions with our children about AI. Can it think for itself? Has it become sentient on its own? Are we just a few years away from Skynet taking over the world?

OpenAI has released a chatbot called ChatGPT, a piece of software that lets you have a conversation with the artificial intelligence OpenAI has developed.

I thought, great. It’s almost Christmas. I’ll ask the chatbot a couple of questions and let ChatGPT write my blog. A Christmas present to me. I immediately got political, and found that the answers, for the most part, were reasonable and fairly well “thought out”.

Then, I posed this question: Is it reasonable to expect identical twins to score the same test results?

Here is its answer:

Identical twins are often thought to be very similar to each other due to their shared genetic makeup, but it is not reasonable to expect them to score the same on all tests or to have identical abilities and traits. While identical twins do share the same genetic material, they are still two separate individuals who have been exposed to different environments and experiences throughout their lives. These differences can lead to variations in their cognitive abilities, personality traits, and other characteristics that can affect their test scores and other performance measures.
Additionally, the nature of testing itself can introduce variability in test scores. Tests can be designed to measure different things, and the way that a test is administered and scored can affect the results. Additionally, individual differences in factors such as motivation, anxiety, and test-taking strategies can also impact test scores.
Overall, it is important to recognize that identical twins, like all individuals, are unique and can have differences in their abilities, traits, and performance on various tasks. While it is possible that identical twins may score similarly on some tests, it is not reasonable to expect them to score identically on all tests.

In this case, AI is incorrect, and reveals why AI has a very long way to go.

I posed this question because of Kayla and Kellie Bingham. They’re sisters and identical twins. Back in 2016, they wrote their Medical University of South Carolina medical school exams at the same time. As they wrote their eight-hour exams remotely, the professor monitoring their exams noticed that not only did Kayla and Kellie progress through the exam at the same rate, they were getting the same questions right and the same questions wrong. In fact, the sisters had identical answers to 296 of 307 questions, including 54 wrong answers.

Alarm bells sounded. They were accused of cheating, of passing notes to each other, and were subsequently found guilty of “academic dishonesty” and lost their places in medical school.

But then science took over. Kayla and Kellie claimed they didn’t cheat and sued the school for defamation to prove it. They found a Cal State professor, Nancy Segal, who has spent her career studying identical twins. She’s the founding director of the Twin Studies Center at Cal State Fullerton.

Dr. Segal testified at the twins’ trial that not only is it normal for identical twins to score exactly the same in an exam, but that it would be an aberration if they didn’t. Even when identical twins are reared apart, they behave in a similar manner. The twins won their lawsuit.

Science to the rescue. OpenAI seemingly didn’t feed ChatGPT Dr. Segal’s 2012 publication, “Born Together, Reared Apart: The Landmark Minnesota Twin Study”, in which she delves into how twins behave.

Kayla and Kellie got the same questions wrong, and their right answers matched almost exactly too. The chatbot comes down squarely on the side of the professor monitoring their exam: that it couldn’t happen.

And this is AI’s problem. What is being fed into the vast database of information? What is the source? Is it credible? Does it show bias on the basis of age, gender, race, creed, colour or ethnicity? How much weight is given to any individual piece of information for an outcome?

This is similar to the trouble Tesla is having with its self-driving software. It can’t predict every situation that a driver might encounter, only the set of encounters that have already occurred. So it makes constant errors, and drivers testing the software usually end up turning it off except on controlled-access highways, where the number of new and unique variables is reduced.

One of Google’s self-driving car experiments is a case in point. A woman in an electric wheelchair was chasing a duck down a street in Mountain View, California, a few years back. Wielding a broom, she was in hot pursuit of the duck. The Google self-driving car that encountered this scene had no idea what to do, so it simply came to a stop.

It’s the exceptions, the one-offs, that fool the automated systems. As humans, we can take these into account and take the appropriate action. In the case of the duck, the car’s answer was to come to a stop. But maybe an actual driver would have steered around the scene and continued on their way, or diverted to another street to avoid the interaction altogether.

AI has a very long way to go. ChatGPT is kind of like the Wikipedia of AI. It can be used as a resource, but take it with a pinch of salt. It might be mostly correct, but not necessarily so.

Merry Christmas everyone. See you in the new year.

Lobby Christmas Tree Royal York Hotel


Be sure to check out Dana's blog, Time to Write. I like to think I'm a pretty good writer. Dana is an AMAZING writer.