The Data Scientist will be primarily responsible for designing and executing the analysis of our uniquely complex combinatorial datasets of microbial interactions in order to identify beneficial microbial “ensembles”. You will fully own the data analysis process for these datasets and will be able to develop new methods of analysis and visualization, with the results of your work directly informing the direction of laboratory experiments to drive our discovery efforts. As an early member of a growing team, your contributions will be transformative: working closely with the Director of Engineering and laboratory scientists, you will build out the core data analysis methods at Concerto, influencing upstream image processing and downstream predictive modeling. Finally, as our team grows, you’ll play a critical role in developing Concerto’s culture and putting our core values into practice.
Responsibilities
Standardize and operationalize data cleaning methods
Design and implement analysis techniques for combinatorial microbe interaction data with the strong statistical rigor required of large biological datasets, to identify beneficial groups of microbes working together
Design and implement new and insightful data visualizations to communicate scientific results to a technical audience
Standardize your analysis work into reusable code modules
Work with the Director of Engineering to inform upstream image processing and feature extraction methods used to generate data
Work with the Chief Scientific Officer and other scientists to inform the design of experimental screens, as well as follow-up experiments aiming to address the working hypothesis
Identify opportunities to extract new value from kChip datasets
Embody Concerto’s core values in your work
Required Qualifications
PhD or Master's degree in Data Science, Statistics, Computational Biology, or a related field, with 1-3 years of working experience
3-5 years of experience designing and implementing data analysis methods to interrogate complex datasets
A strong understanding of data cleaning and filtering processes
Experience implementing statistical inference, machine learning, and/or other analytical methods
Experience using Python and Jupyter Notebooks for data analysis, including the use of packages such as NumPy and pandas
Familiarity with data visualization tools or software packages
Strong written and verbal communication skills, including the ability to communicate technical material simply and clearly to expert and non-expert audiences
Demonstrated track record of working well in team environments
Proactive, with a strong drive for problem solving and creative data exploration
Human-friendly, reproducible coding habits (well-commented, logically structured, etc.)