Students, Parents Figure Out School Is Using AI To Grade Exams And Immediately Game The System
from the teacher-bot dept
With the COVID-19 pandemic still working its way through the United States and many other countries, we've finally arrived at the episode of this apocalypse drama where school has resumed (or will be shortly) for our kids. It seems that one useful outcome of the pandemic, if we're looking for some kind of silver lining, is that it has put on full display just how inept we are as a nation in so many ways. Federal responses, personal behavior, our medical system, and our financial system are all basically getting failing grades at every turn.
Speaking of grades, schools that are now trying to suddenly pull off remote learning for kids are relying on technology to do so. Unfortunately, here too we see that we simply weren't prepared for this kind of thing. Aside from all of the other complaints you've probably heard or uttered yourselves -- internet connections are too shitty for all of this, teachers aren't properly trained for distance learning, the technology being handed out by schools mostly sucks -- we can also add to that unfortunate attempts by school districts to get AI to grade exams.
This story begins with a parent seeing her 12-year-old son, Lazare Simmons, fail a virtual exam. Taking an active role, Dana Simmons went on to watch her son complete more tests and assignments using the remote learning platform the school had set students up on, Edgenuity. As she watched, it quickly became apparent how the platform was performing its scoring function.
She looked at the correct answers, which Edgenuity revealed at the end. She surmised that Edgenuity’s AI was scanning for specific keywords that it expected to see in students’ answers. And she decided to game it. Now, for every short-answer question, Lazare writes two long sentences followed by a disjointed list of keywords — anything that seems relevant to the question. “The questions are things like... ‘What was the advantage of Constantinople’s location for the power of the Byzantine empire,’” Simmons says. “So you go through, okay, what are the possible keywords that are associated with this? Wealth, caravan, ship, India, China, Middle East, he just threw all of those words in.”
“I wanted to game it because I felt like it was an easy way to get a good grade,” Lazare told The Verge. He usually digs the keywords out of the article or video the question is based on.
And Lazare appears to have been right, as he now gets perfect scores on all of his tests. This is obviously both lazy teaching and lazy technology. Relying on software to grade tests that are essentially short-form essay tests, as opposed to multiple-choice Scantron-style tests, makes zero sense. Human grading is needed.
But the technology is quite lazy as well. How can a platform that is grading exams of this nature not build in a check for proper grammar, for instance? The fact that a student can simply toss in a bunch of disjointed words at the end of an answer, like some kind of keyword metadata, and get away with it is crazy. Especially when Edgenuity informs everyone that it's supposed to work this way.
According to the website, answers to certain questions receive 0% if they include no keywords, and 100% if they include at least one. Other questions earn a certain percentage based on the number of keywords included.
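To see why that is so easy to beat, here is a minimal Python sketch of the two scoring rules as described above. Edgenuity has not published its grading code, so everything here is an assumption for illustration: the function name `keyword_score`, the `all_or_nothing` flag, the word-splitting rule, and the keyword list are all hypothetical.

```python
import re

def keyword_score(answer, keywords, all_or_nothing=False):
    """Score an answer purely by which expected keywords appear in it.

    Hypothetical reconstruction of the rules described on the website,
    not Edgenuity's actual implementation.
    """
    # Lowercase the answer and split it into a set of plain words.
    words = set(re.findall(r"[a-z]+", answer.lower()))
    # Count how many expected keywords show up anywhere in the answer.
    hits = sum(1 for kw in keywords if kw.lower() in words)
    if all_or_nothing:
        # Rule 1: 0% with no keywords, 100% with at least one.
        return 100.0 if hits > 0 else 0.0
    # Rule 2: credit proportional to the number of keywords included.
    return 100.0 * hits / len(keywords)

# Assumed keyword list, loosely based on the Constantinople example.
keywords = ["wealth", "caravan", "ship", "india", "china"]

thoughtful = "Its position on the strait let it tax commerce moving between continents."
stuffed = "It was good. Wealth, caravan, ship, India, China, Middle East."

print(keyword_score(thoughtful, keywords))  # 0.0
print(keyword_score(stuffed, keywords))     # 100.0
```

Under these assumptions, a reasonable answer that happens to use different vocabulary scores zero, while a throwaway sentence followed by a word list gets full marks. Nothing in the scoring looks at grammar, word order, or whether the answer makes any sense.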
Whatever that is, it sure as hell isn't good education. And while testing practices in education are generally under scrutiny wholesale at the moment, there is little reason to issue tests at all if everyone involved is going to be this lazy about it.
And, to be clear, this is happening all over the place, with students finding more than one way to game the system.
More than 20,000 schools currently use the platform, according to the company’s website, including 20 of the country’s 25 largest school districts, and two students from different high schools to Lazare told me they found a similar way to cheat. They often copy the text of their questions and paste it into the answer field, assuming it’s likely to contain the relevant keywords. One told me they used the trick all throughout last semester and received full credit “pretty much every time.”
Another high school student, who used Edgenuity a few years ago, said he would sometimes try submitting batches of words related to the questions “only when I was completely clueless.” The method worked “more often than not.”
I think it's fair to say that Edgenuity probably doesn't get a passing grade for its platform, now widely used thanks to COVID-19.
Filed Under: ai, covid, covid-19, distance learning, grading, schools
Companies: edgenuity
Reader Comments
'If you won't put in the effort why should we?'
According to the website, answers to certain questions receive 0% if they include no keywords, and 100% if they include at least one. Other questions earn a certain percentage based on the number of keywords included.
If the grades are based not upon the accuracy of the answer but merely upon whether they have the correct words in the answer, I'm struggling to see a problem here (on the students' end anyway), as it seems the students have learned the real lesson being taught here and are providing the correct answers. If the school systems don't like that the system has been gamed, maybe don't make use of a program that's such a train-wreck and try to implement one where learning the material matters more than just throwing out the 'right' words.
[ link to this | view in chronology ]
Re: 'If you won't put in the effort why should we?'
Yup, they learned one of the most important lessons in life. If you want to make the people in charge happy, give them what they want, which is not necessarily what they asked for.
[ link to this | view in chronology ]
Re: 'If you won't put in the effort why should we?'
"If the grades are based not upon the accuracy of the answer but merely upon whether they have the correct words in the answer I'm struggling to see a problem here(on the students' end anyway), as it seems the students have learned the real lesson being taught here and are providing the correct answers"
Which is a real problem. That means it's possible for a student with a deep understanding of the subject to have described it perfectly using different synonyms that weren't scanned for, and get a low score, while a student with a vague understanding and an inability to really describe the issue gets 100% because they guessed a few keywords.
It's true that this is a design issue rather than one the students can really be held liable for, and that such a system is wide open to abuse. But, if the aim of the education is to test actual knowledge and understanding (which it should be), then there is a real problem.
[ link to this | view in chronology ]
Re: Re: 'If you won't put in the effort why should we?'
On the contrary I do think that Lazare and friends have learned an extremely valuable real life lesson from this affair:
Gaming the system is more profitable than putting in the actual work.
[ link to this | view in chronology ]
Re: Re: Re: 'If you won't put in the effort why should we?'
"Gaming the system is more profitable than putting in the actual work."
Which perhaps explains the existence of this software product that completely misses the mark.
[ link to this | view in chronology ]
Re: Re: Re: Re: 'If you won't put in the effort why should we?'
Which makes me wonder... When this software misses the mark, does it also miss the grade?
[ link to this | view in chronology ]
Re: Game the System
Academic Grades are a quality-control tool to measure the effectiveness of the specific education 'system' in use.
That system involves much more than just individual students -- though students are always blamed for any failures to meet whatever arbitrary educational objectives are in 'system' play.
But the teachers should get most of the blame for any low student grades. Teachers are supposed to be the experts on educating students-- especially the slower, struggling students.
Overall, public school students & teachers end up in a stifling school bureaucracy with mindless rules & rituals -- gaming that sorry system makes good sense.
[ link to this | view in chronology ]
Re: Re: Game the System
The problem is that funding became tied to test results, and schools with less funding abandoned actually teaching the subject to instead teach the test. As long as you've learned the answers by rote, the school gets desperately needed funding even if the students know nothing outside of the answers on the test. It's why critical thinking is in such low supply in many areas - it's possible to know a subject thoroughly but get a low test score, or know almost nothing about the actual subject and get an A, so students with wider knowledge can actually get punished for their greater understanding.
[ link to this | view in chronology ]
Re: Re: 'If you won't put in the effort why should we?'
Yet when I look back at the Humanities course I was required to take, the instructors, or more likely their aides, marked based on keywords and regurgitating whatever the prof spouted, despite documented evidence to the contrary.
[ link to this | view in chronology ]
Re: Re: Re: 'If you won't put in the effort why should we?'
Well, that sounds like a very specific experience, and was also likely the wrong way to do things depending on the aim of the course. But, that doesn't validate doing it here.
[ link to this | view in chronology ]
Human beings are really good at figuring out the algorithm.
As a teen, McDonalds gave out promotional scratcher cards. Each card had an NFL trivia question and if you scratched off the right answer it'd be good for a free big mac.
But the answer was always in the same location on every card. I didn't have to read the question and got it right every time.
This is how hackers work. It's also how police officers work to circumvent rights.
[ link to this | view in chronology ]
Re: Human beings are really good at figuring out the algorithm.
An equivalent situation to the news article would be if you were allowed to scratch off all squares to expose both the right and wrong answers and since the right answer was "visible" you would win the big mac.
[ link to this | view in chronology ]
Edgenuity - training people to be SEO specialists at over 20,000 campuses nationwide.
[ link to this | view in chronology ]
Re:
Does keyword stuffing of this type even work for SEO any more?
Sure, it's still a thing, but being that lazy/obvious about it doesn't work any more..
[ link to this | view in chronology ]
Re: Re:
Not only doesn't it work anymore, it will actually hurt your score as search engines figured this out ages ago and Edgenuity still went ahead with this plan.
[ link to this | view in chronology ]
Re: Re: Re:
I never said Edgenuity was good at it.
[ link to this | view in chronology ]
That's not AI, that's just a string search function. If they were actually using AI, then the computer would have some kind of understanding of the answer, instead of just looking for keywords.
[ link to this | view in chronology ]
Re:
You've found the con with "AI" at present. What companies are selling as "AI" is anything but actual intelligence. They dress up algorithms and search matrices as "Artificial Intelligence" and sucker everybody into believing it is. Then the product gets used in a scenario that requires intelligence and everybody is "absolutely shocked" when it fails to perform. If kids can figure out how to break it, imagine what criminals are doing to it.
[ link to this | view in chronology ]
Re: Re:
LOL - so quick sort is actually artificial intelligence software.
[ link to this | view in chronology ]
There's only one passing grade that counts
The fact that Edgenuity is still in use is the only passing grade that counts.
CHA-CHING$$$
[ link to this | view in chronology ]
I'm a poor parent for not helping my kids find holes to exploit - because that may be the most valuable skill they'll learn.
I will do better.
[ link to this | view in chronology ]
This is not artificial intelligence, as several people have already noted. But ... it isn't a COMPUTER failure at all--it's a human failure enabled by computers. I've worked the same scam on high school! teachers. (Not often, not that it was often needed--and I didn't have that many bad high school teachers. Your mileage may vary.)
And there are classes that train multiple-choice-answer guessing strategies, like for all those pesky college-entrance exams. (I never took the training classes: mostly because I figured out the strategies on my own. And again, I didn't think I needed them all that often.)
At what level in school do they stop just caring whether you're paying attention and start wondering if you can actually understand well enough to do something? I suspect it's different in different subjects, but many students can graduate from high school without actually facing a "need to understand" course, and I've taken a few junior-college/lower-division courses that didn't get to that level. (I didn't do at all well in school until I got to the need-to-understand parts. Paying attention to random assertions has never been one of my skills.)
[ link to this | view in chronology ]
Funniest and Most Insightful Comment
This is both the funniest and most insightful comment. This comment is both the most insightful and funniest. Joke. Laugh. Post. Comment. Keyword. Funny. Funniest. Insightful.Laugh. Side-splitter. Blow. Your. Mine. Mind.
[ link to this | view in chronology ]
Re: Funniest and Most Insightful Comment
touche
[ link to this | view in chronology ]
Re: Funniest and Most Insightful Comment
Don't validate them.
[ link to this | view in chronology ]
Re: Funniest and Most Insightful Comment
You are hereby notified that, due to your use of the keywords "blow", "splitter" and "mine", you have been promoted a tier on the DHS naughty list of "people to look for".
Have a nice day.
...you'd think what was proven to have such bad results in national security ought to not become the model for basic education...
[ link to this | view in chronology ]
Dangit
The NSA Haiku Generator isn't working.
[ link to this | view in chronology ]
Re: Dangit
I'm guessing the NSA getting miffed at stuff like this is why we can't have nice things...
[ link to this | view in chronology ]
Possible DMCA or CFAA issue?
One issue is that, as it's dealing with computers, there may be a possibility this could run afoul of the DMCA Anti Circumvention clause (they are circumventing the purpose/intent of the exam). Parts of the DMCA are so poorly, or broadly, written that many things violate this DMCA clause.
At the very least, this could be cause for charges under the CFAA, again, because it is dealing with computers. This was a big favorite to charge everyone with, before the DMCA.
This is what happens when a bunch of know-nothing lawmakers watch the movie Wargames, and freak out.
[ link to this | view in chronology ]
Re: Possible DMCA or CFAA issue?
Fix bad software with laws saying you can not take advantage of the bad software's flaws - or actually fix said bad software. Tough decision.
[ link to this | view in chronology ]
Re: Possible DMCA or CFAA issue?
Um... no, this doesn't violate the DMCA. That law says you can't circumvent copy protection. This isn't copy protection.
Whether it violates the CFAA depends on the user agreement and what interpretation of the CFAA you're taking.
[ link to this | view in chronology ]
"Using AI to grade"
If "artificial aroma" were as recklessly used where substitutes for natural aroma are involved as "artificial intelligence" is for actual intelligence, we'd have less of an industrial waste problem.
[ link to this | view in chronology ]
Bit generous calling the dozen or so 'if' statements needed to do that an 'AI'.
[ link to this | view in chronology ]
Isn't it smart?
That no matter how we do things,
we call it cheating if they don't do it "the way we want them to".
Isn't it smart to find other ways?
The thing about this is being smarter than the kids.
How many of us would have killed to have access to a library, a dictionary, and the very world in an instant? Like the internet.
[ link to this | view in chronology ]
Techdirt please read about HireVue!
I just had an "interview" for General Motors which is an utter fraud. No human is present, or reviews the footage. It is an AI-based pre-screening designed to determine your employability, created by a company called HireVue, and even they aren't aware of the criteria their algorithm uses. Their algorithm hasn't been reviewed for bias by any outside party. There has been a complaint filed with the FTC by EPIC. Unfortunately, over 700 companies already drank the Kool-Aid without considering the ethical implications of their actions. This is essentially the same as a polygraph, total pseudoscience with the pretty bow of AI on top. And there's no way to "stand out" to an AI. Showing off your little project won't impress a pile of linear algebra.
[ link to this | view in chronology ]
the man who predicted artificial intelligence grading early
[ link to this | view in chronology ]