If you've taken an assessment and shared your technical scores, they are among the first few things a potential employer sees when viewing your Triplebyte profile. In this article, we’ll walk through some of the details of how we transform your answers into scores, and hopefully clarify some common questions about our assessments in the process.
Triplebyte’s assessments use a testing model called item response theory (or IRT for short). IRT is widely used in the testing industry, most notably by the GMAT graduate-school entrance exam, taken by several hundred thousand business and finance students each year.
The basic idea of IRT is that we begin with a set of beliefs about the traits we’re trying to assess, and as a test-taker answers questions, we adjust those beliefs. We adjust our estimate of your skill upward for a correct answer and downward for an incorrect one, and we adjust by an amount that depends on our initial expectations: a very surprising correct or incorrect answer adjusts our estimate by a lot; an unsurprising one adjusts it by only a little.
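This update rule can be sketched in a few lines. The logistic response curve and the discrete skill grid below are illustrative simplifications, not Triplebyte's actual model; the point is just that a surprising answer reweights the beliefs much more than an unsurprising one.

```python
import math

def p_correct(skill, difficulty):
    """Illustrative logistic response curve: the probability a test-taker
    at `skill` answers an item of `difficulty` correctly."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))

def update_beliefs(prior, skill_levels, difficulty, correct):
    """Bayesian update: reweight each candidate skill level by how likely
    it makes the observed answer, then renormalize."""
    likelihoods = [
        p_correct(s, difficulty) if correct else 1.0 - p_correct(s, difficulty)
        for s in skill_levels
    ]
    posterior = [p * l for p, l in zip(prior, likelihoods)]
    total = sum(posterior)
    return [p / total for p in posterior]

# Candidate skill levels from well below average (-3) to well above (+3).
skills = [-3, -2, -1, 0, 1, 2, 3]
prior = [1 / len(skills)] * len(skills)

# A surprising outcome (a hard question answered correctly) shifts the
# beliefs far more than an unsurprising one (an easy question answered
# correctly).
after_hard_correct = update_beliefs(prior, skills, difficulty=2.0, correct=True)
after_easy_correct = update_beliefs(prior, skills, difficulty=-2.0, correct=True)
```

Running this, the weight on the highest skill level grows much more after the hard-question success than after the easy one, which is exactly the "surprise matters" behavior described above.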
Triplebyte’s assessments are also adaptive, meaning that which questions you’re asked depends on our current best estimate of your skill level. If you’re doing very well so far, we’ll ask more difficult questions; if you’re struggling in an area, we’ll ask easier questions.
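One simple way to make a test adaptive (a sketch, not necessarily how Triplebyte's selector works) is to always ask the unanswered question whose difficulty best matches the current skill estimate, since an answer near that point carries the most information:

```python
def pick_next_question(questions, skill_estimate):
    """Adaptive selection sketch: choose the question whose difficulty is
    closest to the current estimate, where the outcome is most informative."""
    return min(questions, key=lambda q: abs(q["difficulty"] - skill_estimate))

# Hypothetical question bank with pre-estimated difficulties.
bank = [
    {"id": "q1", "difficulty": -1.5},
    {"id": "q2", "difficulty": 0.0},
    {"id": "q3", "difficulty": 1.0},
    {"id": "q4", "difficulty": 2.5},
]

# An engineer doing well (estimate 1.2) gets a harder question than one
# who is struggling (estimate -1.0).
doing_well = pick_next_question(bank, skill_estimate=1.2)
struggling = pick_next_question(bank, skill_estimate=-1.0)
```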
Your First Quiz
When you take a quiz on Triplebyte, we begin with an estimate based on the average engineer. Before we know anything about you, our best guess is that you’re probably not highly unusual. We assign a high initial probability to scores near average and a low initial probability to scores far from average.
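That starting point can be pictured as a bell curve over candidate skill levels, centered on the average engineer. The shape and scale here are illustrative assumptions:

```python
import math

def initial_beliefs(skill_levels, mean=0.0, sd=1.0):
    """Before any answers: weight each candidate skill level with a bell
    curve centered on the average engineer, so near-average scores get
    most of the initial probability."""
    weights = [math.exp(-0.5 * ((s - mean) / sd) ** 2) for s in skill_levels]
    total = sum(weights)
    return [w / total for w in weights]

skills = [-3, -2, -1, 0, 1, 2, 3]
prior = initial_beliefs(skills)
# prior peaks at skill 0 and falls off toward the extremes.
```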
Based on our past experience with thousands of other candidates, we estimate the difficulty of the questions in our question bank, and we try to ask you a few questions that we know strongly differentiate engineers with above-average knowledge in an area from those with below-average knowledge. We use those answers to start fine-tuning our estimates of your skill. For example, if you answer these questions correctly, we can be pretty confident that your knowledge is above average and can begin asking questions to help differentiate where in that range it lies.
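"Strongly differentiating" questions correspond, in IRT terms, to items with high discrimination: a steep response curve that separates above-average from below-average engineers sharply. A small illustration (the two-parameter logistic form and the numbers are assumptions for the sketch):

```python
import math

def p_correct(skill, difficulty, discrimination):
    """Two-parameter logistic response curve (illustrative form)."""
    return 1.0 / (1.0 + math.exp(-discrimination * (skill - difficulty)))

# How much a question of average difficulty separates an engineer one
# unit above average from one a unit below it:
sharp_gap = p_correct(1, 0, discrimination=3.0) - p_correct(-1, 0, discrimination=3.0)
weak_gap = p_correct(1, 0, discrimination=0.3) - p_correct(-1, 0, discrimination=0.3)
# sharp_gap is roughly 0.91, weak_gap roughly 0.15: the sharply
# discriminating question tells us far more per answer.
```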
For example, suppose we begin with an engineer whose real underlying skill is a bit above average. We don’t know this yet, so we begin by estimating their skill to be about average. They answer about half of our initial questions correctly, so we initially evaluate them as about average. As we ask more questions, though, we notice that they’re answering more questions correctly than we’d expect from someone of average skill, and we begin to adjust our estimate upward. Eventually, we’re fairly confident in an above-average score estimate.
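A toy simulation of this story, using the same kind of Bayesian update described earlier (the response curve, skill grid, and answer record are all hypothetical): an engineer who gets a bit more than half of average-difficulty questions right pulls the estimate above average.

```python
import math

def p_correct(skill, difficulty):
    """Illustrative logistic response curve."""
    return 1.0 / (1.0 + math.exp(-(skill - difficulty)))

def update(beliefs, skill_levels, difficulty, correct):
    """One Bayesian update step over a discrete grid of skill levels."""
    likelihood = [
        p_correct(s, difficulty) if correct else 1.0 - p_correct(s, difficulty)
        for s in skill_levels
    ]
    posterior = [b * l for b, l in zip(beliefs, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]

skill_levels = [-2, -1, 0, 1, 2]
beliefs = [0.1, 0.2, 0.4, 0.2, 0.1]  # start centered on "about average"

# Hypothetical answer record: 6 of 8 average-difficulty questions right,
# a bit better than the 50% we'd expect from an average engineer.
answers = [True, False, True, True, False, True, True, True]
for correct in answers:
    beliefs = update(beliefs, skill_levels, difficulty=0.0, correct=correct)

# The mean of the final beliefs drifts above 0 (average).
estimate = sum(s * b for s, b in zip(skill_levels, beliefs))
```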
In the image below, each vertical slice represents a single question and the different colors represent skill levels, with purples and blues representing weaker engineers and pinks and reds representing stronger ones. Our initial guess that this engineer’s knowledge is about average is visible as the large yellow bars in the left half of the image, and our shift towards a slightly-above-average estimate is visible as growing orange bars on the right side of the image.
There are a few extra layers of complexity in the ML model that handles these estimates and adjustments, but this is the basic principle.
Finally, we average out the remaining uncertainty in our model to generate a single numerical score, which is what we use to generate our feedback to you and our score displays to companies hiring through Triplebyte.
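That final step, collapsing a distribution of beliefs into one number, amounts to taking the expected value of the posterior. A minimal sketch (the skill grid and posterior values are made up for illustration):

```python
def point_score(skill_levels, posterior):
    """Collapse a posterior distribution over skill into a single score
    by taking its expected value."""
    return sum(s * p for s, p in zip(skill_levels, posterior))

skills = [-2, -1, 0, 1, 2]
# A posterior that leans above average yields an above-average score.
posterior = [0.05, 0.10, 0.25, 0.40, 0.20]
score = point_score(skills, posterior)  # 0.6, a bit above average
```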
If you're curious to learn more, we dive into the deeper mathematics in this article.
If you have additional questions, please contact Candidate Support.