AM Research is a division of Aspiring Minds. Aspiring Minds aspires to build an assessment-driven job marketplace (an SAT/GRE for jobs) to drive accountability in higher education and meritocracy in labor markets. The products developed from our research have impacted more than two million lives, and the resulting data is a source of continuous new research.
A cocktail of assessment, HR, machine learning, data science, education, social impact with two teaspoons of common sense stirred in it.
What makes a programming problem hard?
Why are some programming problems solved by more students than others? The varying numbers we saw got us thinking about how the human brain responds to programming problems. This was also an important question for us to answer when designing an assessment or looking for guidance on pedagogy. Understanding what makes a programming problem hard would let us build a programming assessment of a given difficulty, one where test-takers neither all get a perfect score nor all get a zero, and would also help us create equivalent test forms for the test.
We took a stab at answering it empirically. We rated 23 programming problems on four different parameters on a 2- or 3-point scale: how hard the data structure used in the implementation is, what data structure is returned from the target function, how hard it is to conceive the algorithm, how hard it is to implement the algorithm, and how hard it is to handle edge cases for the given problem. [See the attached PDF for more details on the metrics and the rubric followed.] There was some nuance involved in choosing these metrics. For instance, the algorithm for a problem could be hard to conceive if, say, it requires thinking through a dynamic programming approach, but its implementation can be fairly easy, involving a couple of loops. On the other hand, the algorithms to sort and then merge a bunch of arrays can be simple in themselves, but implementing such a requirement could be a hassle.
For each of these problems, we had responses from some 8000 CS undergraduates. Each problem was delivered to a test-taker in a randomized test form. From this we pulled out how many people were able to write compilable code (this ranged from as low as 3.6% to as high as 74% across problems) and how many got all test cases right. We wanted to see how well we could predict this using our expert-driven difficulty metrics (our difficulties are relative and can change based on the sample; for an absolute analysis we could have predicted the IRT parameters of the question. Wanna try?)
So, what came out? Yes, we can predict! Here is the base correlation matrix. The correlations are negative because a harder problem has a lower correct rate.
|Percent-pass all test cases||-0.25||-0.42||-0.43||-0.05|
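For anyone who wants to run the same kind of check on their own problems, here is a minimal sketch of a Pearson correlation between one difficulty rating and the pass rate. This is not our analysis code, and the ratings and pass rates below are hypothetical placeholders, not our data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical expert ratings (1 = easy, 3 = hard) for six problems,
# and the hypothetical percentage of test-takers passing all test cases.
algorithm_difficulty = [1, 1, 2, 2, 3, 3]
percent_pass_all = [40.0, 35.0, 20.0, 15.0, 5.0, 3.0]

r = pearson(algorithm_difficulty, percent_pass_all)
print(f"correlation: {r:.2f}")  # negative when harder problems pass less often
```

Repeating this for each difficulty metric against each outcome (compilable code, all test cases passed) gives one cell of the matrix per pair.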
We tried a first-cut analysis on our data by building a regression tree with some simple cross-validation. We got a really cool, intuitive tree and a prediction accuracy of 0.81! This is our ‘Tree of Program Difficulty’. So what do we learn?
The primary metric in predicting whether a good percentage of people are able to solve a problem correctly is the algorithmic difficulty. Problems for which the algorithm is easy to deduce (<1.5) immediately witness a high pass rate, whereas those for which it is hard (>2.5) witness a very poor pass rate. For those that are moderately hard algorithmically (between 1.5 and 2.5), the next criterion deciding the pass percentage is the difficulty of implementing the algorithm. If it's easy to implement (<2), we see a high pass rate being predicted. For those that are moderately hard in both implementation and algorithm, the difficulty of the data structures used in the problem then predicts the pass rate. If an advanced data structure is used, the rate falls to less than 6%; otherwise it is a moderate 11% or so.
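The branches described above can be sketched as a small set of rules. This is only an illustration of the tree as we describe it in words, not the fitted model: the function name, the boolean `advanced_data_structure` flag, and the handling of scores that fall exactly on a threshold are all assumptions.

```python
def predicted_pass_band(algorithm, implementation, advanced_data_structure):
    """Return the pass-rate band for a problem, following the tree in the text.

    algorithm, implementation: expert difficulty scores (higher = harder).
    advanced_data_structure: hypothetical flag, True if the problem needs one.
    """
    if algorithm < 1.5:
        # Algorithm is easy to deduce.
        return "high"
    if algorithm > 2.5:
        # Algorithm is hard to conceive.
        return "very poor"
    # Moderately hard algorithm: implementation difficulty decides next.
    if implementation < 2:
        return "high"
    # Moderately hard on both counts: the data structure decides.
    if advanced_data_structure:
        return "below 6%"
    return "around 11%"


print(predicted_pass_band(algorithm=2, implementation=2, advanced_data_structure=True))
```

Scoring your own problems with this gives the node each one lands on, which you can then compare against the pass rate you actually observe.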
So, what nodes do your problems fall on? Does it match our result? Tell us!
Thanks Ramakant for the nifty work with data!
-Shashank and Varun
We finally have a place to feature the work which we began five years ago. Great effort, Tarun, to get this up and running.
We thought this was important since education technology and assessments are going through a revolution. We wish to add our two teaspoons of wisdom (did I actually say that?) to the ongoing battle against the conventional non-scalable and unscientific ways of training, assessing and skill matching. We look forward to making this a means to collaborate with academics, the industry and anyone who feels positively about education technology.
|Sales and Business Development||15.88|
|ANALYTICS AND COMMUNICATION|
|Corporate Communication/Content Development||2.20|
|IT AND ITeS INDUSTRY|
|ITeS and BPO||21.37|
Table 1: Using standardized assessments of job suitability in a study of 60,000 Indian undergraduates, we find that a strikingly low proportion of them have the skills required by the industry. All these students got detailed feedback from us to improve. The table shows the percentage of students who have the required skills for different jobs. (Refer: National Employability Report for Graduates, under Reports in Publications)
We think assessments will be the key to democratizing learning and employment opportunity: they provide a benchmark for measuring the success of training interventions, they give feedback to learners, creating a ‘dialogue’ in the learning process, and most importantly, they help link learning to tangible outcomes in terms of jobs and otherwise.
Let me state it simply: To scale learning and make employment markets meritocratic, we need to scale automated assessments. This is the space we dabble in!
If you are thirsty for data, refer to the table and figure in this post. They tell the story of the problem we are up against and trying to solve.
Figure 1: 2500 undergraduates were surveyed to find their employment outcomes one year after they completed their undergraduate education. We grouped their colleges into three categories (tier 1-3) based on their overall performance in AMCAT, our employability test. We find that a candidate in a tier 3 college has 24% lower odds of getting a job and a 26% lower salary than a tier 1 student with the same merit (AMCAT scores). Similarly, a 1 point drop in college GPA (on a 10 pt scale) decreases job odds by 16% and salary by 9%. Neither of these two parameters is a useful predictor of job success beyond AMCAT scores. This shows a clear bias in the employment ecosystem. (Refer ‘Who gets a job’ under Reports in Publications)
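A quick note on reading "24% lower odds": odds are p / (1 - p), so a drop in odds is not the same as an equal drop in probability. The sketch below converts an odds change into a probability. The 50% baseline is a hypothetical number chosen only for illustration, not a figure from our study, and the function name is our own.

```python
def apply_odds_change(probability, relative_change):
    """Scale the odds of an event by (1 + relative_change); return the new probability."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    if relative_change <= -1:
        raise ValueError("relative_change must be greater than -1")
    odds = probability / (1 - probability)
    new_odds = odds * (1 + relative_change)
    return new_odds / (1 + new_odds)


# Hypothetical baseline: a tier 1 student with a 50% chance of getting a job.
baseline = 0.50
tier3 = apply_odds_change(baseline, -0.24)      # 24% lower odds
gpa_drop = apply_odds_change(baseline, -0.16)   # 16% lower odds per GPA point

print(f"tier 3 college:   {tier3:.3f}")
print(f"1 pt lower GPA:   {gpa_drop:.3f}")
```

The size of the probability gap depends on the baseline, which is why the finding is stated in terms of odds.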
How do we solve it? Stay tuned for our subsequent job posts…