This competition concerns educational diagnostic questions, which are pedagogically effective, multiple-choice questions (MCQs) whose distractors embody misconceptions. With a large and ever-increasing number of such questions, it becomes overwhelming for teachers to know which questions are the best ones to use for their students. We thus seek to answer the following question: how can we use data on hundreds of millions of answers to MCQs to drive automatic personalized learning in large-scale learning scenarios where manual personalization is infeasible? Success in using MCQ data at scale helps build more intelligent, personalized learning platforms that ultimately improve the quality of education en masse. To this end, we introduce a new, large-scale, real-world dataset and formulate 4 data mining tasks on MCQs that mimic real learning scenarios and target various aspects of the above question in a competition setting at NeurIPS 2020. We report on our NeurIPS competition in which nearly 400 teams submitted approximately 4000 submissions, with encouragingly diverse and effective approaches to each of our tasks.