Despite billions invested in cancer research, our understanding of the disease, its treatment, and its prevention remains limited. Natural language processing (NLP) can offer new insights by mining the rich but underutilized information encoded in physicians’ observations and clinical findings, which are still primarily recorded as free-form text. NLP-based models can make a difference in clinical practice by improving models of disease progression, preventing over-treatment, and narrowing the search for a cure.