Senior Machine Learning Scientist - Bioinformatics, Python, ML - Europe
Posted 1 day ago by MRP Technology Ltd
Negotiable
Undetermined
Remote
100% Remote, UK
Summary: A large global organization is seeking a Senior Machine Learning Scientist to develop and evaluate predictive models linking viral genotype and phenotype data. The role emphasizes applying machine learning techniques to biological datasets to support research and decision-making. This position is remote and is initially a 6-8 month contract with potential extensions.
Key Responsibilities:
- Develop classification models to analyze curated genotype-phenotype datasets
- Apply appropriate modelling strategies to predict viral sensitivity or resistance based on sequence-derived features
- Implement training, validation, and hyperparameter tuning workflows using predefined datasets
- Evaluate alternative feature representations provided by the bioinformatics team and assess their suitability
- Assess model performance using metrics appropriate for imbalanced biological datasets
- Evaluate robustness across data splits, phenotype definitions, and successive data releases
- Identify failure modes, instability, and limitations, and document their implications
- Document modelling assumptions, trade-offs, uncertainty, and limitations in a reproducible and transparent manner
- Provide interpretable summaries of model behavior, including feature importance and consistency of signals
- Identify amino acid positions or features that recur across models or resampling strategies, while highlighting where signals are not reproducible
- Clearly document and communicate findings, assumptions, and caveats within the bioinformatics team
Key Skills:
- Publication-backed experience in predictive ML on biological data (classification modelling)
- Experience in genotype/feature-to-phenotype modelling
- Method development or algorithmic contributions
- Model interpretability and generation of biological insight
- Strong background in machine learning or statistical learning with substantial hands-on experience developing classification models
- Experience working with high-dimensional, sparse biological or omics datasets
- Strong proficiency in Python for end-to-end machine learning workflows
- Demonstrated experience designing validation strategies and assessing performance under significant class imbalance and limited sample sizes
- Clear understanding of model limitations, uncertainty, and overfitting risks in real-world biological datasets
- Experience delivering machine learning analyses intended to inform research and internal decision-making
- Experience making principled methodological recommendations in the face of incomplete or noisy data
- Experience working with biological sequence data or genotype-phenotype analyses
- Experience with interpretability or explainability approaches applied to biological machine learning models
- Background in pharmaceutical, biotech, or regulated research environments
Salary (Rate): undetermined
City: undetermined
Country: UK
Working Arrangements: remote
IR35 Status: undetermined
Seniority Level: undetermined
Industry: IT
Senior Machine Learning Scientist - Bioinformatics, Python, ML - Europe, Remote
A large global organization is seeking a Senior Machine Learning Scientist to develop and evaluate internal predictive models linking viral genotype data to phenotype data from sensitivity analyses. The role focuses on applying established machine learning approaches to curated biological datasets to support research, validation, and internal decision-making.
This role is an initial contract of 6-8 months or more, with the possibility of extensions.
The role can be performed remotely from within Europe.
Key Skills/Responsibilities:
The top priority is for the candidate to have publication-backed experience, specifically in:
- Predictive ML on biological data (classification modelling)
- Genotype/feature-to-phenotype modelling
- Method development or algorithmic contributions
- Model interpretability and generation of biological insight