Aspiring Minds releases AMCAT employment data at CODS 2016!

Aspiring Minds Research is pleased to announce that it will be co-organizing this year’s data challenge at CODS 2016, the annual top-tier conference on machine learning and data science organized by the Indian chapter of KDD.


Undergraduates – performance and salaries
This year, we wanted data science enthusiasts to get a flavor of the kind of data we have and work on. We have released AMEO 2015 – a dataset on Aspiring Minds’ Employability Outcomes, which captures the academic and demographic information of engineering undergraduates taking AMCAT, Aspiring Minds’ battery of standardized assessments. What makes this dataset unique and rich is that it also has employment outcomes (annual salaries of students’ first jobs) along with standardized test scores.

Interesting questions
The answers to a lot of interesting questions possibly lie in this dataset –

  • Can we predict the salaries a particular undergraduate would get on graduating?
  • Is the recruitment industry meritocratic – do people with higher skills get paid more? Or are there biases which prevent this?
  • How important are English skills in getting a job?

and many more!

Participate and spread the word – 1000 USD cash prizes!
Interested in finding out the answers to these questions?
Take a stab at the data right away by downloading it from the contest website (mentioned below).

Get started right away and help spread the word!
1000 USD cash prizes to those with the best submissions!

Contest website
http://ikdd.acm.org/Site/CoDS2016/datachallenge.html

Who is prepared to learn computer programming?

Everyone wants to learn programming – or at least some of us want everyone to learn programming (see code.org). We believe that knowing programming doesn’t only enable us to write software; it also teaches us to think about problems objectively and to push the solution to a problem into a structure, a step-by-step process. This makes you a better problem solver in life in general, greatly improving your ability to manage things around you. Strangely enough, we found this hypothesis challenged in our own research group’s recruitment drives: people who ranked very highly on competitive programming websites did very poorly when asked to think about an open problem, such as how one would build a tool to automatically grade the quality of an email. This has led us to believe that knowing programming doesn’t cover everything and that there are many other skills to take care of, but nevertheless…

In this blog post we want to look specifically at the question of who has the prerequisites to learn programming in a reasonable amount of time, say 3 months, in an adult education scenario. This is a very important question. For one, software engineering remains a lucrative career across the world, given what it pays and the volume of jobs it has to offer. For another, there’s a dearth of skilled people available in the market for these jobs! (See our report based on outcome data here). As a consequence, many companies want to hire ‘trainable’ candidates, whom they can train in the right skills within 3 months and deploy on live projects. Besides this, the answer is equally important to students who take courses or MOOCs to make themselves more employable: they need to know whether they will be able to pick up the required skills by the end of the course and make good on their investment.

I will share the results from one study involving 1371 candidates; the result has since been confirmed multiple times in similar studies we’ve done. These candidates, just out of college, were to join a large IT company (150,000-plus people) as software engineers. They were to go through 3 months of training in programming. At the end of the training, the company would put them in three buckets: high, medium and low, with the low performers being asked to leave. We tested all these candidates at the beginning of their training on four IRT-based adaptive tests – English, Logical Ability, Quantitative Ability and Computer Programming (more about these here). Could their scores in these skills predict who would eventually be asked to leave?

The answer is yes: we could predict with fairly good accuracy who was successful after the training. But then, the question that follows is – which skills finally mattered in making this prediction?

First, English and Logical Ability matter: English to understand instructions, which are all in English, and Logical Ability as the basic deductive and inductive ability. But Quantitative Ability doesn’t matter. See the graph below. The model that includes Quantitative Ability scores does no better than the model using just English and Logical scores. Thus we should not be testing and filtering candidates on quantitative ability for programming roles – unfortunately many have been doing this :( ! With a filter on a combination of English and Logical scores, we get a 20-20 Type 1/Type 2 point.

graph_blog3

Figure 1: Type 1: High/mid performers classified as low performers. Type 2: Low performers classified as mid/high performers. EL: Model using just English and Logical scores. ELQ: Model using English, Logical and Quantitative Ability scores. The ELQ model doesn’t add significant incremental value.
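To make the two error types in the caption concrete, here is a minimal sketch of how they could be computed from training outcomes and a filter’s decisions. The function name, the boolean encoding, and the choice of denominators (Type 1 as a fraction of actual high/mid performers, Type 2 as a fraction of actual low performers) are assumptions for illustration; the post does not spell out how the rates were normalized.

```python
def error_rates(actual_low, predicted_low):
    """Return (type1_rate, type2_rate) for two parallel lists of booleans.

    actual_low[i]    -- True if candidate i ended up in the low bucket
    predicted_low[i] -- True if the filter flagged candidate i as low

    Assumed normalization (not stated in the post):
      Type 1 = high/mid performers flagged as low / all high/mid performers
      Type 2 = low performers flagged as high/mid / all low performers
    """
    if len(actual_low) != len(predicted_low):
        raise ValueError("inputs must have the same length")

    pairs = list(zip(actual_low, predicted_low))
    n_high_mid = sum(1 for a, _ in pairs if not a)
    n_low = sum(1 for a, _ in pairs if a)

    type1 = sum(1 for a, p in pairs if not a and p)
    type2 = sum(1 for a, p in pairs if a and not p)

    type1_rate = type1 / n_high_mid if n_high_mid else 0.0
    type2_rate = type2 / n_low if n_low else 0.0
    return type1_rate, type2_rate


# Toy data, invented purely for illustration: 4 high/mid and 2 low performers.
actual = [False, False, False, False, True, True]
flagged = [True, False, False, False, False, True]
print(error_rates(actual, flagged))
```

Sweeping the filter’s cut-off and plotting the two rates against each other would give a trade-off curve of the kind the figure describes.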

When we introduce the computer programming score, we can predict much better. What scares institutions, though, is the belief that if they put a filter on programming, very few candidates will clear the intake metric. This is actually untrue! We verified this empirically: if you use the programming score in your model, more candidates from the population qualify and, importantly, more of those who qualify succeed in training.

But for many this is counterintuitive. Since our models are interpretable, we can actually see why this is happening. Here are the rough qualification criteria in their bare-bones structure:

  Logical score + (1/2) * English score > Sc1

  Logical score > Sc2

OR

  (1/2) * English score + Programming score + Logical score > Sc3

  Programming score > Sc4

So what does this mean? English is half as important as the others!

But more so, if the candidate doesn’t know programming, he/she needs a high logical ability (constrained by Sc2). On the other hand, if the person has some basic exposure to programming (Sc4 removes the bottom 30% of candidates by their programming score), a lower logical score can be offset by their programming ability. This means that candidates with higher programming scores can succeed even if they have a lower logical score. If we do not test for programming at all, all these candidates get cut out, even though they know enough programming to succeed.
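The criteria above can be sketched in code to show how the two routes interact. This is only an illustration, not the model from the study: it assumes the two conditions within each route must both hold, and the threshold values and candidate scores below are made up, since the post does not give the actual values of Sc1 to Sc4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Cut-offs Sc1..Sc4 from the criteria; real values are not published."""
    sc1: float
    sc2: float
    sc3: float
    sc4: float


def qualifies(english, logical, programming, t):
    """Return True if a candidate passes either route of the criteria."""
    # Route 1: no programming needed, but logical ability must be high.
    route_logic = (logical + 0.5 * english > t.sc1) and (logical > t.sc2)
    # Route 2: some programming exposure can offset a lower logical score.
    route_prog = (0.5 * english + programming + logical > t.sc3) and (
        programming > t.sc4
    )
    return route_logic or route_prog


# Hypothetical thresholds and scores, chosen only to exercise both routes.
t = Thresholds(sc1=700, sc2=500, sc3=1100, sc4=350)

# High logical score, no programming: passes through route 1.
print(qualifies(english=500, logical=600, programming=0, t=t))
# Lower logical score offset by a decent programming score: passes route 2.
print(qualifies(english=500, logical=450, programming=500, t=t))
# Same candidate with a weak programming score: fails both routes.
print(qualifies(english=500, logical=450, programming=300, t=t))
```

The second candidate is exactly the kind who would be cut out if programming were not tested at all: the logical score alone is below Sc2, so only the programming route lets them through.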

So the verdict is in, and it is a neat result: ability in the language of instruction and logical ability predict success in a short-duration programming course. Language is half as important as Logical Ability, and Quantitative Ability is not important at all. If the person knows some programming, his/her level of programming can offset the requirement of Logical Ability and also language skills.

So, want to know whether you are trainable for programming? Talk to us! We will make you take AMCAT.

Want to know the details behind this simple, neat result? Ask us for a tech report! Vinay will be happy to send it. :)

Till next time, learn how to code!

- Varun