Scientist, Machine Learning for Computer-Aided Drug Design

Location: Philadelphia Area
Job Type: Full-time

Janssen Research & Development, L.L.C., a division of Johnson & Johnson’s Family of Companies, is recruiting for a Scientist, Machine Learning for Computer-Aided Drug Design, located in Spring House, PA.

At the Janssen Pharmaceutical Companies of Johnson & Johnson, we are working to create a world without disease. Transforming lives by finding new and better ways to prevent, intercept, treat and cure disease inspires us. We bring together the best minds and pursue the most promising science.

Our mission is to develop innovative therapeutics to treat diseases such as Alzheimer’s disease, various types of cancer, and infectious diseases such as hepatitis B and influenza. To that end, we are seeking to recruit new talent for the comprehensive analysis of high-dimensional datasets using state-of-the-art data science methods applied to drug discovery programs. The position is open at Spring House, PA, a headquarters of Janssen Research & Development. We have significantly increased our investment in the workforce for data analysis pipelines, with an emphasis on current cutting-edge technologies, to support future Artificial Intelligence-driven drug design and discovery.

Janssen Research & Development, L.L.C. has an open position for a full-time scientist to support drug design and discovery projects using machine learning approaches. We are looking for candidates with a track record in machine learning, strong coding experience, and preferably experience working with chemical or biological data, but we are happy to teach chemistry and biology to anyone from a different background who has a passion for complex data.

The candidate will develop prediction pipelines for small-molecule compounds by integrating diverse data sources (chemical structure, microscopy images, gene expression) to infer the biological activities of millions of chemical compounds, and will apply these predictions to drive drug design projects. Predictive pipelines aim to increase the safety and efficacy of drug candidates and to decrease the time needed to progress from hit compound to lead compound to compound in clinical trials. Testing these pipelines in real projects will require interaction with chemists, biologists, and data scientists, and iterative model optimization where needed. The candidate will also be responsible for integrating new machine learning pipelines, optimizing them, and deploying them on current projects.

Additionally, the candidate will drive research in transfer learning, with an emphasis on deep transfer learning. Deep learning and other multi-task machine learning techniques have already shown promise for small-molecule projects at Janssen. However, most of those models require a significant amount of data, while many bioassays generate much smaller datasets, which can only be integrated successfully into the predictive pipelines through transfer learning.


• Support drug design and development using machine learning techniques (emphasis on deep learning);

• Build machine learning models that effectively learn from heterogeneous multi-modal data, e.g. modeling the biological effects of compounds using chemical structure together with gene expression, high-content image descriptors, or other omics data;

• Apply the developed pipelines in drug design and development projects to reach actionable conclusions;

• Design and develop transfer learning pipelines for small and medium-size datasets (SMSDS), either from scratch (Keras, TensorFlow, etc.) or by adapting open-source code where available;

• Find beneficial ways to transfer learning from large datasets (millions of compounds by thousands of biological end-point types) to small and medium-size datasets reflecting ADMET assays (thousands of end-points per assay);

• Integrate the pipelines into the internal expert system together with end users (chemists, biologists, data scientists) and IT support, to promote the transparency, traceability, and visualization of the developed techniques;

• Contribute to the scientific standing of the department by authoring peer-reviewed papers and presenting at relevant conferences.


• A minimum of a Bachelor’s or Master’s degree is required; a PhD in Computer Science, Data Science, Computational Chemistry, Bioinformatics, or a related quantitative field is preferred.

• Experience interacting with research chemists or biologists in an academic or industrial setting, or equivalent experience, is preferred.

• A minimum of 1 year of experience in applied machine learning is required.

• Experience in Life Sciences or a related field is preferred.

• Advanced programming skills in Python, sufficient to develop functional prototypes, are required.

• Experience with deep learning frameworks such as PyTorch, Keras, TensorFlow, or similar is preferred.

• Excellent communication, reporting, and team interaction skills, self-motivation, proactivity, and the ability to work independently are required.


• A passion for hands-on science and delivering high quality results is required.

• Good communication, organizational, and planning skills, and the ability to take the lead on and drive decision-making in research projects, are required.

• Ability to develop and deliver a presentation on technical data with strong business impact is preferred.

• A creative mind that can see things from different perspectives and come up with innovative solutions to complex problems, and the ability to introduce best practices from previous work experience into the new team.

• Desire for continuous learning and the ability to identify, evaluate and implement emerging areas of science.