AM Research is a division of Aspiring Minds. Aspiring Minds aspires to build an assessment-driven job marketplace (an SAT/GRE for jobs) to drive accountability in higher education and meritocracy in labor markets. The products developed from our research have impacted more than two million lives, and the resulting data is a source of continuous new research.
A cocktail of assessment, HR, machine learning, data science, education and social impact, with two teaspoons of common sense stirred in.
Hurray! Our paper “Grading uncompilable programs” has been accepted at IAAI 2019! This is the first attempt to provide semantic feedback on uncompilable code. It is our fourth paper on the topic, as we keep improving the technology to grade programming skills and provide feedback on them.
Automata, the world’s only AI-based evaluator of programming skills, is used across the world to hire software engineers. In an earlier post, we discussed how a US-based company used the platform to improve hiring efficiency.
In our IAAI paper, we describe a study with a Chinese company. We find that the system helped select around 26% more candidates for interviews. These candidates had written logically meaningful code that did not compile. Of these candidates, around 19% were hired. This is a big win: many worthy candidates who would have been missed by traditional program grading systems were hired. (See details below.)
| Level | Code category | Count (A) | Count (B) |
|---|---|---|---|
| 1 | Code unrelated to given task | 3361 | 1979 |
| 2 | Appropriate keywords and tokens are present | 3264 | 2125 |
| 3 | Right control structure exists with missing data dependency | 2547 | 1440 |
| 4 | Correct with inadvertent errors | 2955 | 1017 |
| ≥ 3 | Selected for interview | 9330 | 2457 |
Recently, the system has been used by a large IT company in India for large-scale hiring of entry-level software engineers. The company hires thousands of software engineers every year and struggles to fill all open positions. By grading non-compiling code, it has been able to improve hiring throughput by 30-40%.
Stay tuned, more to come… Programs + AI opens up a million opportunities!!!
- Rohit and Varun
Aspiring Minds has been doing machine learning, aka artificial intelligence, for 8 years now, long before it became fashionable. We solved original problems using AI rather than copying the West, achieving many world firsts. Here is a quick recap of Aspiring Minds’ tryst with AI, together with how AI evolved in India.
Phase I: “ML in a niche”. We hired two engineers to work on machine learning projects in 2010. After one year, they came to my room and inquired about their future, since all their friends were doing software development. Hardly anyone knew about ML.
- 2012: Launched SVAR: an AI-based spoken English evaluation product
Today, SVAR is used across the world, including in India, the Philippines, China and Latin America. It automatically scores pronunciation and fluency from a person’s speech samples.
Was this the first AI-based product from India that reached scale?
- 2012: Made one of our data sets public and organized a machine learning competition
The competition had entries from India, Brazil, Belgium and Pakistan. See the leader-board and winners here. This was probably the first by an Indian company and among the first few in the world.
- 2013: Launched AUTOMATA: World’s first machine learning based programming assessment
Automata is used by companies across the world – some examples include Wipro, Cognizant, Baidu, ZTE and one of the largest ecommerce giants in the USA. It is backed by three publications and several patents.
- 2014: Published our first ML paper on grading programming skills automatically at KDD
The paper quickly garnered 28 citations. It was followed by several other papers on automatic grading of spoken English, motor skills and soft skills, published at KDD, ACL, Ubicomp, IJSE and others. We also organized the first workshop on AI + assessments at KDD with international collaborators.
Aspiring Minds remains one of the very few Indian companies that publish in ML conferences.
Phase II: Big Data and Data Science fascination. By now, everyone had started talking about Big Data and Data Science, a new name for machine learning! Most work in India was around data engineering, not deriving intelligence from data. MOOCs on AI exploded, though not everyone who took a course really learned.
- 2015: Organized the world’s first Data Science Camp for Kids
We organized a very successful hands-on data science camp for kids in grades 5-8. The kids performed the full flow of supervised learning. Since then, this open-source project has been replicated in Illinois, Seattle, Pune and Bangalore. It also led to a paper on the pedagogy of teaching machine learning to kids.
- 2015: Launched ml-india.org, the first effort ever to audit India’s ML activity and a resource repository for all MLers
ML India brings all ML efforts in India under a single roof. Read more about how India fares in ML, the main motivation behind setting up this forum. The group has 1800+ members, has hosted 27 machine learning meetups, and lists 146 ML professionals, 55 companies, 28 data sets and 11 groups.
We also launched a new data set, AMEO, at iKDD. It attracted users from Harvard Kennedy School, Dublin Institute of Technology, New York University, TCS, Sapient and Flytxt.
- 2016: Launched the World’s first automated motor skills test
Aptitude tests have been automated for ages. But motor skills tests, a way to measure the skills of blue-collar workers, have not. We used the power of tablets and machine learning to build one, and showed that it is predictive of blue-collar workers’ job performance. Read here.
- 2016: US Skill Map and India Skill Map- Big Data Analysis
We automatically crawled the web to aggregate jobs in the USA and India, creating the world’s first interactive Skill Demand Map. Check it out here.
Phase III: National interest in AI, but with nascent understanding. Data science had by now died a silent death, only to be replaced by Artificial Intelligence. From the PMO and the Finance Minister to Niti Aayog, today everyone is interested in AI. Yet we have few novel methods or applications of AI from India, and little local expertise: our research contribution is 1/15th of the US’s and 1/8th of China’s.
- 2017: Machines started understanding code that does not compile!
Automata, our program skill grading platform, started scoring uncompilable code, a first in the world! Our algorithm could read the meaning of programs that a compiler couldn’t, and generate feedback for many more students.
And the journey continues!!!
This has been possible through the efforts of many in Aspiring Minds’ research team, most notably Shashank Srikant, Rohit Takhar, Vishal Venugopal, Gursimran Singh, Bhanu Pratap Singh, Vinay Shashidhar and Milan Sachdeva.
Phase IV: How can India lead in Artificial Intelligence? From doing research, we started thinking about research policy. My recent book, ‘Leading Science and Technology: India Next’, focuses primarily on the research ecosystem in India and highlights several areas where we should improve. It is supported by a white paper on how India should invigorate its Artificial Intelligence ecosystem. This is where we need to go next…
If you have ever taken a coding test on a machine, you probably frowned when you couldn’t make your code compile. Your program might have been almost right, but thanks to some silly bug you couldn’t find in the limited time, you got a ZERO.
Not any more! Aspiring Minds’ research team has created technology that can assess how good a program’s algorithm is, even if it doesn’t compile.
How do we do it? First, we fix some of the code using artificial intelligence. By looking at patterns in good, compilable code, our algorithms minimally modify a program to make it compile. This approach lets us compile 40% of uncompilable code. Once it compiles, our patented machine-learning-based algorithm generates a grade that mimics human raters.
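The idea behind this first step can be sketched as a search over small edits: try the program as-is, and if it fails to parse, apply minimal single-line modifications until a variant compiles. The toy Python below is only an illustration under that assumption; the hand-written fixes stand in for patterns that the real system learns from large corpora of compilable code.

```python
# Toy sketch of "minimally modify until it compiles" (illustrative only;
# the hand-written edits below stand in for learned repair patterns).

def try_compile(src: str) -> bool:
    """Return True if the source parses as valid Python."""
    try:
        compile(src, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False

def minimal_fix(src: str):
    """Try small single-line edits; return the first variant that compiles."""
    if try_compile(src):
        return src
    lines = src.splitlines()
    for i, line in enumerate(lines):
        # Candidate micro-edits: add a missing colon, close a parenthesis,
        # drop a stray semicolon.
        for edit in (line + ":", line + ")", line.replace(";", "")):
            candidate = "\n".join(lines[:i] + [edit] + lines[i + 1:])
            if try_compile(candidate):
                return candidate
    return None  # could not repair with one small edit

broken = "def add(a, b)\n    return a + b"  # missing colon
fixed = minimal_fix(broken)
```

A one-edit search like this is already enough to recover many near-misses; the production problem is choosing which edits to try, which is where the learned patterns come in.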
Fancy as it may seem, we had a harder problem to solve: what about the code that still does not compile? Using smart static analysis, we automatically derive features, signatures of the program’s logic, from such code. With these features and a customized form of our machine learning algorithm, we can grade these programs with surprising accuracy.
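As a rough illustration of such static features (not our actual patented feature set), one can count control-structure keywords, assignments and identifiers straight from the raw text. This works even when the code cannot be compiled:

```python
import re

# Toy illustration of deriving "logic signature" features from raw code
# text without compiling it. Even code that fails to compile still
# reveals its control structure and data flow through simple counts.

def logic_features(code: str) -> dict:
    tokens = re.findall(r"[A-Za-z_]\w*", code)
    return {
        "n_loops":        sum(t in ("for", "while") for t in tokens),
        "n_conditionals": sum(t in ("if", "elif", "else", "switch") for t in tokens),
        "n_returns":      tokens.count("return"),
        # Single '=' not part of ==, !=, <=, >=
        "n_assignments":  len(re.findall(r"[^=!<>]=[^=]", code)),
        "n_unique_identifiers": len(set(tokens)),
    }

# A non-compiling snippet (missing colon after "if") still yields features:
snippet = "def mx(a):\n    m = a[0]\n    for x in a:\n        if x > m\n            m = x\n    return m"
feats = logic_features(snippet)
```

Feature vectors like this one become the input to a grading model trained against human ratings; the regexes above are merely a sketch of the far richer program analysis the real system performs.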
On a set of programs attempted for a job at a large e-commerce player in the USA, we found that 46% of the codes did not compile but were not blank.
Our AI-based algorithm found that 6% of these codes, from 596 students, had nearly correct logic. Another 29% of candidates, with a little guidance, would have reached the right logic. All these candidates deserved a shot with the company!
In another data set, from a technology giant in China, we found that 27% of candidates whose code does not compile have sound programming logic.
What’s more, our AI algorithm can provide feedback to every candidate whose code does not compile. Some can be told how to fix their programs and make them compile. All can be given feedback on their algorithmic approach, tips to reach the correct logic, and notes on the stylistic and maintainability issues in their code.
Disappointed with coding platforms that give everyone a poor score and no feedback? We have corrected this for all time to come!
A new year is on the horizon. For many people it is time to make resolutions about what to do in the coming year. This year, instead of focusing entirely on what you want to do, consider thinking more carefully about those things you want to avoid. Our recent research, also covered by WSJ print, found that the secret to success is knowing what NOT to do and then not doing it! For instance, there were many things during this past year that experts advised should not be done – such as NOT to do a Brexit, NOT to elect ultra-nationalist voices and NOT to demonetize one’s currency without a plan. Only time will tell whether these were actually bad decisions. We find that recognizing a bad decision and avoiding it is far more important for success than focusing on the best things to do.
Our evidence came from tracking job success. We found that the most successful salespersons, customer service agents and managers weren’t those who chose the “best” course of action in a given situation, but rather were those who knew what NOT to do in a situation and avoided those actions. For instance, in a situation when you are very late for a sales meeting, what one absolutely should not do is fail to apologize. On the other hand, there might be different ways one could apologize or show regret, some being better than others, as deemed by experts. However, our work showed choosing among these different ways of expressing regret was not predictive of one’s success in a sales job. What mattered was the ability to identify what should not be done (i.e., expressing no regret). The wrong response may seem obvious in this situation, but it isn’t obvious to everyone and is also not obvious for many other situations.
Our study was based on a methodology called situational judgment testing (SJT). We provided candidates with a series of specific situations and asked them to choose from among a number of possible responses to each (see Figure 2 for an example). We asked them to identify which of the options for each situation would be the best way to respond and which would be the worst. We then analyzed whether their choices predicted actual job performance (such as sales targets achieved) for a few different roles.
We expected that the people who were most successful in the workplace would be those who were able to identify what experts in the field said were the best ways to respond to each scenario. It turns out that was not the case. Instead, we found that the people who were most successful on the job were those who correctly identified the worst answer for a larger number of situations: they knew what course of action was important to avoid in more scenarios. Specifically, the correlation between the ability to correctly identify the worst responses and job performance ranged from r = 0.28 to 0.33 and was statistically significant. By contrast, the correlation between the ability to correctly identify the best responses and performance ranged from r = 0.14 to 0.16 and was not statistically significant.
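The core computation behind these numbers is a Pearson correlation between each candidate's SJT score and a job-performance measure. A minimal sketch, with made-up illustrative scores rather than the study's data:

```python
import math

# Sketch of the analysis: correlate per-candidate SJT scores with a
# job-performance measure. All numbers below are invented for
# illustration; they are not the study's data.

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-candidate counts: scenarios where the candidate
# correctly picked the "worst" answer, scenarios where they picked the
# "best" answer, and a performance measure (e.g., % of sales target).
worst_correct = [9, 7, 8, 4, 6, 3, 8, 5]
best_correct  = [5, 6, 4, 5, 7, 4, 6, 5]
performance   = [92, 78, 85, 60, 72, 55, 88, 66]

r_worst = pearson_r(worst_correct, performance)
r_best  = pearson_r(best_correct, performance)
```

In the study, the worst-answer correlations (0.28 to 0.33) were statistically significant while the best-answer correlations (0.14 to 0.16) were not; the toy data above is simply constructed to show the same qualitative pattern.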
This work has important ramifications, the first and most immediate of which is being able to filter and hire better performers simply by checking whether candidates know how to avoid doing the wrong things, which are typically widely agreed upon, rather than trying to find people who pick the best answer. This should influence interview methodologies, case-based discussions and other forms of candidate evaluation.
Another significant contribution is to the field of situational judgment testing. Unlike IQ tests, situational judgment tests are traditionally hard to standardize. Different organizations, functions and cultures have different notions of the ‘best’ way to handle a situation, so under the best-answer philosophy one must build a different test and scoring mechanism for each. Our work shows, by contrast, that the ‘worst’ answer is more universal and consistent across diverse environments. It suggests that SJTs can be standardized across fields in a way that has not previously been possible.
Above and beyond all of these, the results have implications for our daily lives. Specifically, they suggest that maybe this year you ought to concentrate on what ‘not to do’ and train your mind to avoid those things! Our conjecture is that it will bring greater happiness to your life.
Make a start and list the things you will avoid doing in 2017. We have the top of our list… not writing boring blogs!
Results presented at Ubicomp 2016, Heidelberg, Germany
Knowledge and cognitive ability tests have been automated, and taken on computers, for more than three decades now. Pretty much all of you will have taken an SAT, GMAT or GRE. What about motor skills? They are needed for almost all vocational jobs: think of a plumber’s manual dexterity in fixing a screw. The best tests for them are still bulky boards, pegs and instruments.
Until now, no one had really thought about exploiting the power of touch interfaces to develop such tests. Touch-screen devices are now ubiquitous in the form of mobile phones and tablets. We wanted to find out whether we could test people’s skills, say in tailoring and machining, by having them do things on a tablet. We wrote creative apps that make them perform various actions on the tablet: rotating their fingers, pinching, moving their elbows and shoulders to trace, and so on.
We reported in our Ubicomp paper, presented last week, that the scores from these tests actually do predict the speed and accuracy of industrial tasks done by machinists, tailors and machine operators. In fact, they are better predictors than the bulky manual tests! Our test scores predict all the parameters of task performance we measured, with correlations ranging from 0.19 to 0.37, similar to what a logical ability test predicts for a knowledge worker. In comparison, the manual test scores correlate significantly with only 4 of the 7 task performance ratings, with correlations ranging from 0.19 to 0.33.
This has great implications for the training and job matching of vocational workers. Using these apps, vocational job aspirants can test their motor skills from the comfort of their homes. They can get feedback and work on improving their skills. And if they perform well, they can generate credentials such as “Motor skills certified for a tailor” and highlight them to employers. The same assessments can be used by industry to filter and recruit high-performing employees.
We are happy to present the world’s first validated motor skills test. There is so much more opportunity for further research: figuring out which scores correlate with performance in which task, creating a job-to-score map, creating more innovative apps and so on… Let us do it with the power of the touch interface.