Plan what NOT to do in 2017!

A new year is on the horizon. For many people it is time to make resolutions about what to do in the coming year. This year, instead of focusing entirely on what you want to do, consider thinking more carefully about those things you want to avoid. Our recent research, also covered by WSJ print, found that the secret to success is knowing what NOT to do and then not doing it! For instance, there were many things during this past year that experts advised should not be done – such as NOT to do a Brexit, NOT to elect ultra-nationalist voices and NOT to demonetize one’s currency without a plan. Only time will tell whether these were actually bad decisions. We find that recognizing a bad decision and avoiding it is far more important for success than focusing on the best things to do.

Figure 1

Our evidence came from tracking job success. We found that the most successful salespersons, customer service agents and managers weren’t those who chose the “best” course of action in a given situation, but rather those who knew what NOT to do in a situation and avoided those actions. For instance, when you are very late for a sales meeting, what you absolutely should not do is fail to apologize. On the other hand, there might be different ways to apologize or show regret, some better than others, as deemed by experts. However, our work showed that choosing among these different ways of expressing regret was not predictive of one’s success in a sales job. What mattered was the ability to identify what should not be done (i.e., expressing no regret). The wrong response may seem obvious in this situation, but it isn’t obvious to everyone, and it is not obvious in many other situations.

Our study was based on a methodology called situational judgment testing, or SJT. We provided candidates with a series of specific situations and asked them to choose from among a number of possible ways to respond to each situation (see Figure 2 for an example). We asked them to choose which of the options presented for each situation would be the best way to respond and which would be the worst. We then analyzed the data to see if their choices predicted actual job performance (such as sales targets achieved) for a few different roles.

Figure 2: Sample question from an SJT
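To make the two scores concrete, here is a minimal sketch of how best-answer and worst-answer identification could be counted for one candidate. The `Item` class, the `score_candidate` function, the option letters and the three-item key are all hypothetical and only for illustration; they are not the scoring scheme used in the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    """One SJT situation with the expert-keyed best and worst options."""
    best: str
    worst: str


def score_candidate(items, picks):
    """Return (best_hits, worst_hits) for one candidate.

    picks holds one (chosen_best, chosen_worst) pair per item, in order.
    """
    if len(items) != len(picks):
        raise ValueError("need exactly one pick pair per item")
    best_hits = sum(1 for item, (b, _) in zip(items, picks) if b == item.best)
    worst_hits = sum(1 for item, (_, w) in zip(items, picks) if w == item.worst)
    return best_hits, worst_hits


# Hypothetical answer key and one candidate's choices (made up).
key = [Item("A", "D"), Item("B", "C"), Item("C", "A")]
picks = [("A", "D"), ("C", "C"), ("B", "A")]

best_hits, worst_hits = score_candidate(key, picks)
print(best_hits, worst_hits)  # prints: 1 3
```

This made-up candidate picks the expert "best" option only once but spots the "worst" option in all three situations, so the two counts can differ a lot for the same person.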

We expected that the people who were most successful in the workplace would be those who were able to identify what experts in the field said were the best ways to respond to each scenario. It turns out that was not the case. Instead, we found that the people who were most successful on the job were those who correctly identified the worst answer in a larger number of situations. They knew which course of action to avoid in more scenarios. Specifically, the correlation between the ability to correctly identify the worst responses and job performance ranged from r = 0.28 to 0.33 and was statistically significant. By contrast, the correlation between the ability to correctly identify the best responses and job performance ranged from r = 0.14 to 0.16 and was not statistically significant.
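For readers who want to see what such a comparison looks like in practice, below is a small sketch that computes a Pearson correlation between each kind of score and a performance measure. The `pearson_r` helper and all the numbers are assumptions made up for illustration; they are not the study's data and will not reproduce the r values reported above.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Made-up scores for six hypothetical candidates.
worst_scores = [2, 5, 3, 6, 4, 7]        # worst answers correctly identified
best_scores = [4, 3, 5, 4, 6, 3]         # best answers correctly identified
performance = [50, 80, 60, 90, 65, 95]   # e.g. percent of sales target achieved

r_worst = pearson_r(worst_scores, performance)
r_best = pearson_r(best_scores, performance)
print(f"worst-answer score vs performance: r = {r_worst:.2f}")
print(f"best-answer score vs performance:  r = {r_best:.2f}")
```

A real analysis would also test each correlation for statistical significance, which this sketch leaves out.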

This work has important ramifications. The first and most immediate is the ability to filter and hire better performers simply by concentrating on whether they know how to avoid doing the wrong things, which are typically widely agreed upon, rather than trying to find people who pick the best answer. This should influence interview methodologies, case-based discussions and other ways of evaluating candidates.

Another significant contribution is to the field of situational judgment testing. Unlike IQ tests, situational judgment tests are traditionally hard to standardize. Different organizations, functions and cultures have different notions of the ‘best’ way to handle a situation. Thus, with the best-answer philosophy, one needs to build different tests and scoring mechanisms for each. Our work, on the other hand, shows that the ‘worst’ answer is more universal and consistent across diverse environments. It suggests that the development of SJTs can be relatively standardized across fields in a way that has not previously been possible.

Above and beyond all of these, the results have implications for our daily lives. Specifically, they suggest that maybe this year you ought to concentrate on what ‘not to do’ and train your mind to avoid those things! Our conjecture is that it will bring greater happiness to your life.

Make a start and list the things you will avoid doing in 2017. Here is the top item on our list… NOT to write boring blogs!

-Varun and Steve

World’s first automated motor skill test – exploiting the power of touch tablets

Results presented at Ubicomp 2016, Heidelberg, Germany

Knowledge and cognitive ability tests have been automated and taken on computers for more than three decades now. Pretty much all of you would have taken the SAT, GMAT or GRE. What about motor skills? They are needed for almost all vocational jobs, say a plumber’s manual dexterity in fixing a screw. The best tests for them still rely on bulky boards, pegs and instruments.

Fig 1. Manual motor skill testing equipment – Pegboard and Pinboard

No one to date had really thought about exploiting the power of touch interfaces to develop such tests. Touch-screen devices are now ubiquitous in the form of mobile phones and tablets. We wanted to find out whether we could test people’s skills, say in tailoring and machining, by making them do things on a tablet. We wrote creative apps that make them perform various actions on a tablet: rotating their fingers, pinching them, moving their elbows and shoulders to trace… and so on.

Fig 2. Touch interface device (Tab)

Fig 3. Snapshots of our motor skills assessment apps

We reported in our Ubicomp paper, presented last week, that the scores from these tests actually do predict the speed and accuracy of industrial tasks done by machinists, tailors and machine operators. In fact, they are better predictors than the bulky manual tests! Our test scores predict all the task performance parameters we measured. The correlations range from 0.19 to 0.37, similar to what a logical ability test would predict for a knowledge worker. In comparison, manual test scores correlate significantly with only 4 out of 7 task performance ratings, with correlations ranging from 0.19 to 0.33.

This has great implications for the training and job matching of vocational workers. Using these apps, vocational job aspirants can test their motor skills from the comfort of their homes. They can get feedback and work on improving their skills. Also, if they perform well, they can generate credentials such as “Motor skills certified for a tailor” and highlight them to employers. The same assessments can be used by the industry to filter and recruit high-performing employees.

We are happy to present the world’s first validated motor skill test. There is so much more opportunity for further research – figuring out which scores correlate with performance in which task, creating a job-to-score map, creating more innovative apps and so on… Let us do it with the power of the touch interface.

-Varun

An Automated Test of Motor Skills for Job Prediction and Feedback

We’re pleased to announce that our recent work on designing automated assessments to test motor skills (skills like finger dexterity and wrist dexterity) has been accepted for publication at the 9th International Conference on Educational Data Mining (EDM 2016).
Here are some highlights of our work –

  • The need: Motor skills are required in a large number of blue-collar jobs today. However, no automated means exist to test and provide feedback on these skills. We explore the use of touch-screen surfaces and tablet apps to measure these skills.
  • Gamified apps: We design novel app-based gamified tests to measure one’s motor skills. We’ve designed apps to specifically check finger dexterity, manual dexterity and multilimb co-ordination.
  • Validation on three jobs: We validated the scores from the apps on three different job roles – tailoring, plumbing and carpentry. The results we present make a strong case for using such automated, touch-screen based tests in job selection and to provide automatic feedback for test-takers to improve their skills!

If you’re interested in the work and would like to learn more, please feel free to write to research@aspiringminds.com

Data Science For Kids Goes International

We successfully organised our first international data science workshop for kids at the University of Illinois as a part of SAIL, a one-day event to learn more about life on campus by attending classes taught by current students.
The workshop aimed to introduce the idea of machine learning and data-driven techniques to middle-to-high-school kids. Participants went through a fun exercise to understand the complete data science pipeline, starting from problem formulation to prediction and analysis.
Special mention and thanks to the mentors, Narender Gupta, Colin Graber and Raghav Batta, students at the university who helped us execute the academic and peripheral logistics of the workshop efficiently and made the experience engaging and interesting for the attendees.

Mentors: Narender Gupta, Colin Graber and Raghav Batta

To read the mentor experiences click here.
Visit sail.cs.illinois.edu for more information on the event or workshop.

What AM Research told you in 2015 – the data science way?

As the year came to an end, we looked back on what we shared with the world in 2015. As data nerds, we pushed all our blog articles into an NLP engine to cluster them and identify key themes. Given the small sample size and the challenges of finding semantic similarity in our specialized area, we waded through millions of unsupervised samples through deep learning with a Bayesian framework, ran it on a cluster of GPUs for a month…yada yada. Well, for some problems humans can simply do things easily and efficiently, so that is what we actually did.

The key themes were:

Grading of programs – 4 posts

We need to grade programs better to be able to give automated feedback to learners, help companies hire more efficiently and expand the pool considered for hiring. We at AM dream of having an automated teaching assistant – we think it is possible and will be disruptive. Thus we dedicated 4 of our posts to telling you about automatically grading programs and its impact.

The tree of program difficulty – We found that we could determine the empirical difficulty of a programming problem based on the data structures and control structures it uses and its return type, among other parameters. We used these features in a nice decision tree to predict how many test takers would answer the question correctly, and we predicted with a correlation of 0.81! This tells us about human cognition, helps improve pedagogy and also helps generate the right questions for a balanced test. And this is just the tip of the iceberg. Second, we approached the same question by looking at the difficulty of test cases and their intercorrelation. We understood what conceptual mistakes people make, got a recipe for making better test cases for programs and gained insights on how to score them. For instance, we found that a trailing comma in a test case can make it unnecessarily difficult!
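As a rough illustration of the idea, the sketch below fits a single-split regression tree (a decision stump, the simplest case of a decision tree) that predicts the fraction of test takers who solve a problem from yes/no features of that problem. The feature names, the four problems and their solve rates are invented for this example, and the `best_split` function is a hypothetical helper, not the model described in the post.

```python
def sum_squared_error(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(rows, targets, features):
    """Choose the binary feature whose yes/no split gives the lowest error.

    Returns (feature, mean_if_yes, mean_if_no), or None if no feature
    separates the rows into two non-empty groups.
    """
    best = None
    for feature in features:
        yes = [t for row, t in zip(rows, targets) if row[feature]]
        no = [t for row, t in zip(rows, targets) if not row[feature]]
        if not yes or not no:
            continue  # this feature does not split the data
        error = sum_squared_error(yes) + sum_squared_error(no)
        if best is None or error < best[0]:
            best = (error, feature, sum(yes) / len(yes), sum(no) / len(no))
    return None if best is None else best[1:]


# Invented problems: which constructs each one needs, and its solve rate.
problems = [
    {"nested_loop": True, "uses_array": True},
    {"nested_loop": True, "uses_array": False},
    {"nested_loop": False, "uses_array": True},
    {"nested_loop": False, "uses_array": False},
]
solve_rates = [0.20, 0.30, 0.70, 0.80]

feature, rate_if_yes, rate_if_no = best_split(
    problems, solve_rates, ["nested_loop", "uses_array"]
)
print(feature, round(rate_if_yes, 2), round(rate_if_no, 2))
```

On this toy data the stump splits on `nested_loop`, because that feature separates the hard problems from the easy ones far better than `uses_array` does. A full tree would repeat the same search inside each branch.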

Finding super good programmers – Given these thoughts on how to construct a programming test and score it, we showed you how, by putting all this intelligence together with our super semantic machine learning algorithm, we can spot the 16% of good programmers missed by test-case-based measures. Additionally, we automatically found the super good ones who write efficient and maintainable code. So please say a BIG NO to test-case-based programming assessment tools!

Reproduced from “AI can help you spot the right programmers”. It shows that a test case metric misses 16% of good programmers. Furthermore, AI can help spot 20% super good coders.

Pre-reqs to learn programming – Stepping back, we tried determining who could learn programming through a short-duration course. We found that it was a function of a person’s logical ability and English but did not depend on her/his quantitative skills. Interestingly, we found that a basic exposure to a programming language could compensate for lower logical ability in predicting which students would successfully learn programming. A data way to find course prerequisites!

Building a machine learning ecosystem – 3 posts

Catching them young! We designed a cognitively manageable hands-on supervised learning exercise for 5th-9th graders. We helped kids, in three workshops spread across different cities, make fairly accurate friend predictors with great success! We think data science is going to become a horizontal skill across job roles and want to find ways to get it into schools, universities and informal education.

“Exams. I would take my exam results, from the report card of every year. And then I will make it on excel and then I will remember the grades and the one I get more grades I will take a gift” [sic.]

Reproduced from datasciencekids.org. Whom will you befriend? Can machine learning models devised by high school kids predict this?

The ML India ecosystem – Our next victims were those in universities. We launched ml-india.org to catalyse the Indian machine learning ecosystem. Given India’s very low research output in machine learning, we have put together a resource center and a mailing list to promote machine learning. We also declared ourselves self-styled evaluators of machine learning research in India and promise to share monthly updates.

Employment outcome data release – We recently launched AMEO, our employability outcome data set at CODS. This unique data set has assessment details, education and demographic details of close to 6000 students together with their employment outcomes – first job designation and salary. This can tell us so much about the labor market to guide students and also identify gaps – to guide policy makers. We are keenly looking forward to what wonderful insights we get from the crowd! Come, contribute!

Pat our back! – 3 posts 

Reproduced from “Work on spoken English grading gets accepted at ACL, AM-R&D going to Beijing!”. We describe our system that mixes machine learning with crowdsourcing to do spontaneous speech evaluation

We told you about our KDD and ACL papers on automatic spoken English evaluation – the first semi-automated grading of free speech. We loved mixing crowdsourcing with machine learning – a cross between peer and machine grading – to do super reliable automated evaluation.

And then our ICML workshop paper talked about how to build models of ‘employability’ – interpretable, theoretically plausible yet non-linear models which could predict outcomes based on grades. More than 200 organizations have benefited from using these models in recruiting talent, and the models do way better than linear ones!

Other posts

In the posts outside these three clusters, we told you about –

– Why we exist – why we need data science to promote labor market meritocracy

– The state of the art and goals for assessment research for the next decade (See ASSESS 2015)

– Our work on classifying with 80-80 accuracy for 1500+ classes

It has been an interesting year at AM, learning from all our peers and contributing our bit to research, while using it to build super products. We promise to treat you to a lot more interesting stuff in open-response grading and in standardizing and understanding the labor market next year. Stay tuned to this space!

Varun