Competition encourages data scientists to use AI to improve cancer screening

Two out of every five people in the US will be diagnosed with cancer during their lifetimes, according to the National Cancer Institute. Now artificial intelligence, the same technology behind improved voice assistants and credit card fraud detection, is being applied to lung cancer screening in this year’s Data Science Bowl.

Booz Allen Hamilton and Kaggle are hoping to inspire data scientists and medical communities around the world to use artificial intelligence to improve lung cancer screening technology at this year’s Data Science Bowl. The 90-day competition will award $1 million in prizes, with the prize purse funded by the Laura and John Arnold Foundation.

“Cancer is an intensely personal disease for so many of us: it hits loved ones at home, colleagues at work and friends in our communities. Improving cancer screening and treatment is among the most important responsibilities we have in the next decade,” said Dr. Josh Sullivan, senior vice president, Booz Allen Hamilton. “Artificial Intelligence and human ingenuity can be powerful in the fight against cancer. Through last year’s Data Science Bowl, hedge fund analysts who had no medical experience created an algorithm that can review heart MRI images on par with trained technicians, helping to better heart disease screening. This year, data scientists, professionals and hobbyists alike, can make a difference in the lives of millions of people facing a cancer diagnosis.”

The 2017 Data Science Bowl: In Depth

  • Competition Aims to Improve Early Detection of Lung Cancer: Low-dose computed tomography (CT) scans can reduce lung cancer deaths by 20 percent, as demonstrated in screening trials sponsored by the National Cancer Institute. This reduction would save more lives each year than any cancer-screening test in history. However, low-dose CT scans have a high false-positive rate, which creates patient anxiety and can lead to costly, unnecessary diagnostic work such as invasive biopsies that put patients at risk for collapsed lungs and other complications. Reducing the false-positive rate is a critical step in making these scans available to more patients.
  • Participants Will Use Machine Learning and Artificial Intelligence to Scan Lung Images: Using a data set of anonymized high-resolution lung scans provided by the National Cancer Institute, Data Science Bowl participants will develop artificial intelligence algorithms that accurately determine when lesions in the lungs are cancerous, thereby dramatically decreasing the false-positive rate of current low-dose CT technology.
  • Leading Health and Technology Organizations Join Booz Allen and Kaggle: The competition receives additional sponsorship and support from a number of leading health and technology organizations, including the American College of Radiology, Amazon Web Services, NVIDIA and many others. For a full list of sponsors, visit

“The Data Science Bowl is an exciting opportunity for data scientists to work with unique data sets that they wouldn’t have access to unless conducting medical research,” said Anthony Goldbloom, CEO, Kaggle. “This year’s competition has an especially important goal. By reducing the false positive rate of low-dose CT scans, we can not only prevent thousands of inaccurate lung cancer diagnoses, but also save lives through critical early detection of cancer.”

To join the Data Science Bowl and the Kaggle community, visit Competitors can download the data set and participate in the competition by registering on