AI PIONEER Transforms Drug Discovery With Breakthrough Protein Mapping

PIONEER, a collaborative creation by Cleveland Clinic and Cornell University, revolutionizes how researchers identify key protein interactions for drug development, combining extensive genomic and protein data to target diseases efficiently.

Researchers have developed PIONEER, new software that simplifies the identification of crucial protein-protein interactions for drug targeting.

This tool, which integrates vast genomic data and physical protein structures, aids researchers in pinpointing interaction points that could lead to effective treatments for diseases like cancer.

Scientists from Cleveland Clinic and Cornell University have developed publicly available software and a web database designed to simplify the identification of key protein-protein interactions that can be targeted with medication.

The tool, named PIONEER (Protein-protein InteractiOn iNtErfacE pRediction), was showcased by researchers who used it to identify potential drug targets for various cancers and other complex diseases. Their findings were published today (October 24) in Nature Biotechnology.

Advancements in Genomic Research and Drug Discovery

While genomic research plays a crucial role in drug discovery, it is often not sufficient by itself, explains study co-lead author Feixiong Cheng, PhD, director of Cleveland Clinic’s Genome Center. Developing medications based on genomic data can take an average of 10-15 years from identifying a disease-causing gene to starting clinical trials.

“In theory, making new medicines based on genetic data is straightforward: mutated genes make mutated proteins,” Dr. Cheng says. “We try to create molecules that stop these proteins from disrupting critical biological processes by blocking them from interacting with healthy proteins, but in reality, that is much easier said than done.”

Challenges in Protein Interaction Networks

One protein in our body can interact with hundreds of other proteins in many different ways. Those proteins can then interact with hundreds more, forming a complex network of protein-protein interactions called the interactome, Dr. Cheng explains. This becomes even more complicated when disease-causing DNA mutations are introduced into the mix. Some genes can be mutated in many ways to cause the same disease, so a single condition can be associated with many different interactomes, each arising from a different mutation of the same protein.

Drug developers are left with tens of thousands of potential disease-causing interactions to pick from – and that’s only after they generate the list based on the affected protein’s physical structures.

Utilizing AI to Navigate Complex Interactomes

Dr. Cheng sought to make an artificial intelligence (AI) tool to help genetic/genomic researchers and drug developers identify the most promising protein-protein interactions more easily, teaming up with Haiyuan Yu, PhD, director of the Cornell University Center for Innovative Proteomics. The group integrated massive amounts of data from multiple sources including:

  • Genomic sequences from almost 100,000 individuals who were either born with disease-causing mutations or acquired them later in life (usually cancer).
  • Physical three-dimensional structures of over 16,000 human proteins, and data on how DNA mutations impact those structures.
  • Known interactions between almost 300,000 different protein-protein pairs.

Their resulting database allows researchers to navigate the interactome for more than 10,500 diseases, from alopecia to von Willebrand Disease.

Researchers who identified a disease-associated mutation can input it into PIONEER to receive a ranked list of protein-protein interactions that contribute to the disease and can potentially be treated with a drug. Scientists can search for a disease by name to receive a list of potential disease-causing protein interactions that they can then go on to research. PIONEER is designed to help biomedical researchers who specialize in almost any disease across categories including autoimmune, cancer, cardiovascular, metabolic, neurological, and pulmonary.
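The mutation-to-ranked-list lookup described above can be pictured with a small Python sketch. This is only an illustration under stated assumptions, not PIONEER's actual method or interface: the `Interaction` record, the `rank_interactions` function, the placeholder protein names, and every number are hypothetical. It assumes each interaction stores the residue positions predicted to sit at the interface, along with a confidence score.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """Hypothetical record for one protein-protein interaction."""
    protein_a: str
    protein_b: str
    interface_a: frozenset  # residue positions on protein_a assumed to contact protein_b
    score: float            # hypothetical confidence in the interface prediction


# Toy records with placeholder names and made-up numbers; not PIONEER data.
TOY_INTERACTOME = [
    Interaction("PROT_A", "PROT_B", frozenset({10, 11, 42}), 0.91),
    Interaction("PROT_A", "PROT_C", frozenset({42, 77}), 0.64),
    Interaction("PROT_A", "PROT_D", frozenset({120, 121}), 0.88),
]


def rank_interactions(protein, position, interactome=TOY_INTERACTOME):
    """Return interactions whose interface on `protein` contains the
    mutated residue `position`, sorted from highest to lowest score."""
    hits = [
        i for i in interactome
        if i.protein_a == protein and position in i.interface_a
    ]
    return sorted(hits, key=lambda i: i.score, reverse=True)


# A mutation at residue 42 of PROT_A touches two of the three toy interfaces.
for hit in rank_interactions("PROT_A", 42):
    print(hit.protein_b, hit.score)
```

In this toy version, a mutation that falls outside every stored interface simply returns an empty list. A real tool would also have to weigh how strongly the mutation is predicted to disrupt each interaction.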

Impact and Validation of PIONEER

The team validated their database’s predictions in the lab, where they made almost 3,000 mutations on over 1,000 proteins and tested their impact on almost 7,000 protein-protein interaction pairs. Preliminary research based on these findings is already underway to develop and test treatments for lung and endometrial cancers. The team also demonstrated that their model’s protein-protein interaction mutations can predict:

  • Survival rates and prognoses for various cancer types, including sarcoma, a rare but potentially deadly cancer.
  • Anti-cancer drug responses in large pharmacogenomics databases.

The researchers also experimentally validated that protein-protein interaction mutations between the proteins NRF2 and KEAP1 can predict tumor growth in lung cancer, offering a novel target for cancer therapeutic development.

“The resources needed to conduct interactome studies poses a significant barrier to entry for most genetic researchers,” says Dr. Cheng. “We hope PIONEER can overcome these barriers computationally to lessen the burden and grant more scientists with the ability to advance new therapies.”

Reference: “A structurally informed human protein–protein interactome reveals proteome-wide perturbations caused by disease mutations” 24 October 2024, Nature Biotechnology.
DOI: 10.1038/s41587-024-02428-4

This study has five co-first authors who contributed equally: Dapeng Xiong, PhD (Cornell University); Yunguang Qiu, PhD (Cleveland Clinic); Junfei Zhao, PhD (Columbia University); Yadi Zhou, PhD (Cleveland Clinic); and Dongjin Lee, PhD (Cornell University).

It was funded in part by the National Institute on Aging (R01AG084250, R56AG074001, U01AG073323, R01AG066707, R01AG076448, R01AG082118, RF1AG082211, and R21AG083003) and the National Institute of Neurological Disorders and Stroke (RF1NS133812).

The work was also supported in part by the late Charis Eng, MD, PhD, the Sondra J. and Stephen R. Hardis Chair of Cancer Genomic Medicine at Cleveland Clinic. Dr. Cheng wishes to dedicate this paper to the memory of Dr. Eng, founding Chair of the Genomic Medicine Institute. She will be remembered for her lifelong dedication to human genetics, personalized genomic healthcare research, and mentorship.