Data science camp for kids!

It is an open secret that data science is becoming pervasive. What was once the preserve of statisticians and computer scientists – deft at trudging through mountains of data – has seen its tools and techniques percolate into every industry, at every level. Peer into the crystal ball and you don’t need to suspend reality too much to imagine a future in which a factory manager looks at production data to predict which machine might break down soon. A cab operator analyzes his Uber receipts to figure out where he should drive to make the most money. A sales manager looks at what kinds of customers his sales agents are most successful with to decide whom to deploy where. Decidedly, the future belongs to the data scientist. Where will these data scientists come from? Who is going to train them?

The very nature of the subject eschews traditional learning modes. The data scientist must be able to quickly learn the context of the data, build hypotheses, use techniques to confirm his suspicions and then construct predictors or automated systems. It marries technology with knowledge; intuition with scientific rigor. Our education systems will be slow to adapt – they will have to devise new methodologies, develop syllabi and learn to simultaneously involve multiple teachers. In the meantime, a whole generation of students might graduate without the skills that industry expects from them in a data-rich environment.

At Aspiring Minds, we’re passionate about helping students reach their full potential. We plan to pursue a series of initiatives to help advance data science education in India and around the world. As a first step, we held a data science camp for elementary school students! The participants continuously surprised us – with their knowledge, their understanding and even their wit. Two things became clear quickly – a. kids seldom confront open-ended problems, and it took them some time to get used to the idea of there being no one correct, pre-decided answer, and b. with some guidance, they learn astonishingly quickly.

Read more about our exciting and rewarding weekend here!

At the end of the camp, the participating kids blogged about their experiences and the plots/analysis that they came up with. Read about them here.

Our team got enthusiastically involved in mentoring the students through the exercise and ended up learning more about their own teaching styles in the process.

We’ve also put out the exercises and resources we used for the camp so that you can replicate it in your school/university/workplace. If the thought of engaging high schoolers in data science seems absurd to you, snap out of it! It is possible; we tried it and the kids had a fun time picking up these concepts.

Let us know what you thought of our data camp. Please do write to us if you go ahead and try this out with students around you. We’ll eagerly look forward to that!

Samarth Singal
Research Intern, Aspiring Minds
Class of 2017, Computer science, Harvard.

Paper accepts at ICML and KDD!

Some more good news!

Soon after the recent acceptance of our spoken English grading work at ACL, our work on learning models for job selection and personalized feedback has been accepted at the Machine Learning for Education workshop at ICML! Some results from this paper were discussed in one of our previous posts. The tool was built five years ago and has since helped a couple of million students get personalized feedback and helped 200+ companies hire better. I shall also be giving an invited talk at this workshop.

Earlier this month, we also had a paper accepted at KDD, which builds on our previous work in spontaneous speech evaluation. We examine how well we can grade the spontaneous speech of natives of different countries and also analyze the benefits the industry gets from such an evaluation system.

Busy year ahead it seems – paper presentations in France, Beijing, Australia and finally New Jersey, where we’re organizing the second edition of ASSESS, our annual workshop on data mining for educational assessment and feedback. It will be held at ICDM 2015 this winter. July 20th is the submission deadline for the workshop. Here is a list of submissions we saw in our workshop last year, at KDD. Spread the word!

– Varun

Work on spoken English grading gets accepted at ACL, AM-R&D going to Beijing!

Good news! Our work on using crowdsourcing and machine learning to grade spontaneous English has been accepted at ACL 2015.

  • Ours is the first semi-automated approach to grade spontaneous speech.
  • We propose a new general technique which sits between completely automated grading techniques and peer grading: we use the crowd to do the tough human-intelligence task, derive features from its output and use ML to build high-quality models.
  • We think this is the first time anyone has used crowdsourcing to get accurate features that are then fed into ML to build great models. Correct us if we are wrong!
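
For the curious, here is a toy sketch of the crowd-then-ML idea described above. It is not our actual system: the choice of transcription as the crowd task, the `agreement` feature, the tiny made-up training set and the one-variable linear model are all assumptions made purely for illustration.

```python
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Word-set overlap between two transcripts (1.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def agreement(transcripts):
    """Hypothetical crowd-derived feature: mean pairwise overlap between
    the transcripts that different workers wrote for one spoken response."""
    word_sets = [set(t.lower().split()) for t in transcripts]
    pairs = list(combinations(word_sets, 2))
    return mean(jaccard(a, b) for a, b in pairs) if pairs else 1.0


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


# Made-up examples: (crowd transcripts of one response, expert grade).
training = [
    (["i like to play cricket"] * 3, 5.0),
    (["my name is ravi", "my name is rabbi", "my game is ravi"], 3.0),
    (["we go market", "he got marked", "week old mark it"], 1.0),
]

features = [agreement(transcripts) for transcripts, _ in training]
grades = [grade for _, grade in training]
slope, intercept = fit_line(features, grades)


def predict_grade(transcripts):
    """Grade a new response from its crowd transcripts alone."""
    return slope * agreement(transcripts) + intercept


print(predict_grade(["good morning sir", "good morning sir", "good morning sar"]))
```

The point of the sketch is the division of labour: people do the part machines find hard (making sense of the speech), and the model only has to learn the mapping from crowd-derived numbers to expert grades.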

Figure 1: Design of our Automated Spontaneous Speech grading system.

The technique helps scale spoken English testing, which in turn means spoken English training at super scale!

Great job Vinay and Nishant.

PS: Also check out our KDD paper on programming assessment if you haven’t already.

– Varun