Running a website populated by bright math students creates a number of unique challenges. When we make mistakes, they get called out pretty quickly. When we do something cool, everyone is thrilled… and then also offers up advice on how we could have done it better. And when we build something interesting, someone is always going to want to know how it works.
One of the tools that sparks the most questions is Alcumus, a free tool from AoPS that helps students learn by adaptively delivering problems for them to study. At its core, Alcumus is a giant database of well-written problems and solutions. Sitting on top of that database is the Alcumus engine which “rates” students and decides how well they understand topics, determining when they should move on and when they should keep studying.
(If you’re thinking, ‘Wait, Alcumus is a math thing? I thought it was a Transformer,’ then you might want to check out this AoPS News article to get better acquainted with Alcumus before you read on.)
How do you rate students?
Alcumus problems are sorted into topics. Students get to choose which topic they want to study, and Alcumus tracks their performance in each topic with topic ratings. A hypothetical student, Melissa, would like to learn about geometric probability, so she might set Alcumus to the Using Geometry in Probability topic. That’s a hard one. Melissa may have her work cut out for her.
Alcumus gives Melissa a rating between 0 and 100 for every topic. This number describes how well the system thinks she understands the topic. When Melissa gets a problem right, her rating goes up, since Alcumus thinks she understands it better. Getting a question wrong will make that rating go down. But what does all that mean?
First, Melissa’s topic rating is the probability that she will correctly answer the average problem in the topic. So if Melissa’s Using Geometry in Probability rating is 75, then Alcumus believes Melissa will get three out of four average problems in the topic correct.
Under the hood, every student and every problem has a hidden score. We use these scores to guess at the probability of the student getting the problem right: if the student’s score is way higher than the problem’s, then the probability is close to 100%. If it’s way lower, then the probability is close to 0%. If their scores are equal, that’s 50%. The problem in Using Geometry in Probability with the highest score is:
A boss plans a business meeting at Starbucks with the two engineers below him. However, he fails to set a time, and all three arrive at Starbucks at a random time between 2:00 and 4:00 p.m. When the boss shows up, if both engineers are not already there, he storms out and cancels the meeting. Each engineer is willing to stay at Starbucks alone for an hour, but if the other engineer has not arrived by that time, he will leave. What is the probability that the meeting takes place?
And the problem with the lowest score is:
The fair spinner shown is spun once. What is the probability of spinning an odd number? Express your answer as a common fraction.
As it turns out, at Melissa’s current score, Alcumus gives her a 97% chance of answering the second problem right, but only a 4% chance on the first one. The function we use to take scores and return probabilities is called the logistic curve, an S-shaped curve that climbs from 0 toward 1 as the student’s score overtakes the problem’s.
This is actually the same curve that underlies how chess ratings work. The specific function Alcumus uses to predict the probability that a student will answer a question correctly is:
$$\text{Probability} = \frac{1}{1 + e^{\,\text{problem score} - \text{student score}}}.$$
Melissa’s rating in a topic is the probability we get when we use this function to compare Melissa’s score to the average of the problem scores in the topic.
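To make that concrete, here’s a minimal Python sketch of those two calculations. The function names and the score values are invented for illustration; only the logistic formula itself comes from above:

```python
import math

def win_probability(student_score: float, problem_score: float) -> float:
    """Logistic curve: predicted probability that the student answers
    the problem correctly."""
    return 1.0 / (1.0 + math.exp(problem_score - student_score))

def topic_rating(student_score: float, problem_scores: list[float]) -> float:
    """Topic rating on the 0-100 scale: the student's predicted success
    rate against the average problem in the topic."""
    average_problem = sum(problem_scores) / len(problem_scores)
    return 100.0 * win_probability(student_score, average_problem)

# A score one unit above the topic's average problem gives a rating near 73.
print(topic_rating(1.0, [0.0, 0.5, -0.5]))  # ≈ 73.1
```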
Once we have that set up as our model, we let the students play and let the Alcumus engine do its work adjusting the student (and problem) scores as it watches what happens.
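We won’t spell out the engine’s exact adjustment rule here, but since this is the same curve behind chess ratings, an Elo-style nudge gives the right intuition for which way the scores move. Here’s a rough sketch under that assumption (the symmetric update and the step size k are simplifications, not the real rule), reusing win_probability from the snippet above:

```python
def update_scores(student_score: float, problem_score: float,
                  correct: bool, k: float = 0.1) -> tuple[float, float]:
    """One Elo-style step: nudge both scores by how surprising the
    outcome was (k is a made-up step size)."""
    expected = win_probability(student_score, problem_score)
    surprise = (1.0 if correct else 0.0) - expected
    # A surprising success raises the student's score and lowers the
    # problem's; a surprising failure does the reverse.
    return student_score + k * surprise, problem_score - k * surprise
```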
How do the ratings change?
So, how exactly do we measure how well a student understands a topic? What happens when Melissa starts to get better at geometric probability? First, we start with some guesses about how good Melissa is and we watch her solve problems. Next, as we start to see what she can and can’t do, we refine our opinion of her. If you’re a robot and you’re doing all this with math, it’s called Bayesian Updating.
Bayesian Updating is based on a fancy toy called Bayes’ Theorem. Statistics students first learn Bayes’ Theorem by solving a sequence of contrived problems about cancer or people dying in hospitals. (If you Google “Bayes’ Theorem examples,” you get about 1.4 million hits. If you Google “Bayes’ Theorem hospital,” you get around 700 thousand.) Statisticians are dark people.
When you write it out, Bayes’ Theorem is a scary bunch of symbols:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}.$$
I spent a lot of time staring at that equation once upon a time, so I know exactly how impenetrable it can be. It’s also not even “right” for us. All four of those copies of P in there mean something different, and they’re all hiding their own little secrets that confuse us mathematicians over and over and over again.
So let’s start over.
We’re given a brand new student. We don’t know much about her. We think there’s about a 50% chance she’s “average,” a 25% chance she’s “above average” and a 25% chance she’s “below average.” That’s called a prior, as in, “This is the information that I have prior to watching the student.” Mathematicians like to use the phrase “prioring on” to sound smart and say, “This is why I think some silly thing is going to happen.” As in, “Prioring on his outfit, I think he’s the most likely student to fall out of his chair,” or “Did you know he was going to fall out of his chair? No, but I had strong priors.”
Next we have the observation. If our student solves the first 10 problems easily, we might stop feeling good about our prior, and we update it. Now maybe we’d say, she’s 5% likely to be “below average,” 20% likely to be “average,” and 75% likely to be “above average.” This new belief is called our posterior, as in, um, “This is the information that I have posterior to watching the student.” Sadly, the meaning of the word posterior has evolved a bit in the past couple thousand years, as in “The student fell out of the chair and onto his posterior.”
How is this Bayes’ Theorem? Well, there are two ingredients to the posterior. First, there’s our prior belief. If you’ve been teaching a student for a full year, you have a very well-defined prior belief about her. She rarely falls out of her chair, so maybe we think of her as “probably above average.” Seeing how she answers one problem isn’t going to have much of an effect on that belief. Second, there’s how well the story fits the outcome. Getting all 10 problems right on a really hard test is more likely for an above-average student than for a below-average student. So that tells us she’s more likely to be an “above average” student than “below average.” What Bayes’ Theorem tells us is that we just smash these two effects together to get the posterior.
Posterior belief that she’s above average = (how well being above average predicts the result) * (prior belief that she’s above average), rescaled so the beliefs about all the possibilities still add up to 1.
Easy.
Homework for those inclined: the exact correspondence is
- prior: P(student is average)
- posterior: P(student is average | getting all 10 right)
- observation: P(getting all 10 right | student is average) / P(getting all 10 right).

What does this mean? Hint: write $P(B) = \sum_A P(B \mid A)\,P(A)$.
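If you’d rather run the homework than do it by hand, here’s a small sketch of the whole update. The three ability buckets and their per-problem success rates are invented numbers:

```python
# Bayesian updating with a discrete prior over three ability levels.
prior = {"below": 0.25, "average": 0.50, "above": 0.25}
p_correct = {"below": 0.3, "average": 0.5, "above": 0.8}  # per problem

# Likelihood of getting all 10 problems right under each hypothesis.
likelihood = {a: p_correct[a] ** 10 for a in prior}

# The hint in action: P(B) = sum over A of P(B|A) * P(A).
p_evidence = sum(likelihood[a] * prior[a] for a in prior)

# Bayes' Theorem: posterior = likelihood * prior / evidence.
posterior = {a: likelihood[a] * prior[a] / p_evidence for a in prior}
print(posterior)  # "above" ends up near 98% after 10 straight successes
```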
How does Alcumus pick problems?
This is one of students’ most common questions. First off, it’s random: Alcumus essentially rolls weighted dice and picks a problem for the student, so not every problem is equally likely.
Alcumus begins by deciding whether to give the student a problem in the current topic or a review problem (from a topic he or she has already passed). After the topic is chosen, Alcumus gives each problem in the topic a probability of being picked. Problems you’ve seen before are less likely to be delivered. If your rating is on the low end of the topic, Alcumus will prefer easier problems. If your rating is on the high end, Alcumus will prefer harder problems. If you change the difficulty setting, that shifts these probabilities, too.
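Qualitatively, the selection step looks something like the sketch below. The field names, the halving penalties, and the 50-point cutoff are all invented; only the general shape (repeats are penalized, and the preferred difficulty tracks your rating) matches what we just described:

```python
import random

def pick_problem(problems: list[dict], student_rating: float) -> dict:
    """Weighted random choice among a topic's problems. The weighting
    rules below are invented stand-ins for the real ones."""
    weights = []
    for p in problems:
        weight = 0.5 ** p["times_seen"]  # problems you've seen are less likely
        easy = p["difficulty"] < 50
        # Low-rated students lean toward easier problems; high-rated
        # students lean toward harder ones.
        if easy != (student_rating < 50):
            weight *= 0.5
        weights.append(weight)
    return random.choices(problems, weights=weights, k=1)[0]

# Hypothetical usage: a high-rated student is four times as likely to
# draw the hard, unseen problem as the easy, already-seen one.
problems = [
    {"name": "spinner", "difficulty": 10, "times_seen": 1},
    {"name": "meeting", "difficulty": 90, "times_seen": 0},
]
print(pick_problem(problems, student_rating=75)["name"])
```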
Picking problems is highly constrained: if a topic has only hard problems, Alcumus is going to give you a hard problem. Advanced Quadratics only has hard problems, so setting Alcumus to Easy and trying Advanced Quadratics will still get you hard problems.

Is Alcumus adaptive? Does Alcumus teach?
We close with these lovely questions, which speak to a lot of the current issues in modern education, circling around what teaching actually is. Alcumus adapts to students in some ways. It tracks where you are and what you understand and adjusts what you see based on that information. It points you to resources—book references, videos, community conversations—that you can use if you get stuck. It has solutions that you can read at exactly the point you need them.
The one tiny missing piece is that students still need to use these resources, or some other resource available to them, to get past that sticking point. Alcumus as a learning tool works best when wrapped in some amount of teaching. Many students are excellent at teaching themselves, whether by reading a book or by working carefully through problems they’ve already seen. Others may choose to ask a parent, sibling, tutor, or someone else when they are stuck. We at AoPS embed Alcumus as a motivation and training tool into some of our classes to complement the teaching that we do there.
If you haven’t already, go give Alcumus a try, and see if you can sense how the engine is working in the background as your ratings change. And we’d love to hear your opinions about how teaching fits into learning on our Facebooks or Twitters.