Course English 2.5 ECTS

Introduction to Python for Data Analysis and Automation in Biology

BioEng Python

Want to gain an understanding of the theory and skills needed to make decisions related to upstream process development?

This course aims to help programming novices (those with little to no experience) get started with Python, in place of Excel and Word, in their daily work. In contrast to many existing Python courses, which target computer scientists and software engineers, this course is tailored specifically to biotechnology. It focuses on Python as a tool for data analysis and automation and de-emphasizes topics relevant only to software development. Participants also learn about data analytics and relevant machine learning methods, including best practices, troubleshooting, and how to avoid common pitfalls.

For more details on scope, form, and exam, please see the course base.

Course schedule

Day 1

  • 09:00                Automating Tasks with the Unix Shell
  • 10:20                Morning break
  • 10:35                Automating Tasks with the Unix Shell (Continued)
  • 12:30                Lunch break
  • 13:30                Version Control with Git
  • 14:30                Afternoon break
  • 14:45                Version Control with Git (Continued)
  • 15:45                Wrap-up
  • 16:00                END

Day 2

  • 09:00                Introduction to Python
  • 10:30                Morning break
  • 10:45                Introduction to Python (Continued)
  • 12:00                Lunch break
  • 13:00                Introduction to Python (Continued)
  • 14:30                Afternoon break
  • 14:45                Introduction to Python (Continued)
  • 15:45                Wrap-up
  • 16:00                END

Day 3

  • 09:00                Introduction to Pandas
  • 10:30                Morning break
  • 10:45                Visualizations with Altair
  • 12:00                Lunch break
  • 13:00                Introduction to Machine Learning with Scikit Learn
  • 14:30                Afternoon break
  • 14:45                Introduction to Machine Learning with Scikit Learn (Continued)
  • 15:30                Wrap-up and Outlook
  • 16:00                END

* The course schedule is subject to possible adjustments.

Content

With data generation and genetic engineering becoming ever easier in biology, life scientists and bioengineers increasingly face challenges in processing and analyzing data and in automating experimental workflows. For example, simple tasks (such as designing primers) can become a huge drain on scientists’ time when they repeatedly copy and paste information into web interfaces instead of running batch operations.
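As a rough illustration of what a batch operation can look like, the sketch below processes several sequences in one pass instead of one at a time through a web form. It uses only the Python standard library. The sequence names, the sequences themselves, and the helper functions `reverse_complement` and `gc_content` are hypothetical and are not taken from the course materials.

```python
# Hypothetical example sequences, made up for illustration only.
sequences = {
    "primer_fwd": "ATGGCTAGCAAAGGAGAAG",
    "primer_rev": "TTATTTGTAGAGCTCATCC",
}

# Translation table mapping each DNA base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Return the fraction of G and C bases in a non-empty DNA sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


# Process every sequence in one loop rather than one by one by hand.
report = {
    name: {
        "length": len(seq),
        "gc": gc_content(seq),
        "reverse_complement": reverse_complement(seq),
    }
    for name, seq in sequences.items()
}

for name, row in report.items():
    print(f"{name}: length={row['length']}, GC={row['gc']:.2f}, "
          f"revcomp={row['reverse_complement']}")
```

The same loop works unchanged for two sequences or two thousand, which is the main point of scripting a repetitive task.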

Furthermore, qualifications demanded of biotechnologists in the industry are shifting away from pipetting towards the analysis of data and automation of workflows. Therefore, it is essential that life science and biotechnology PhD students are trained in the computational tools needed for data analysis and task/lab automation.

In this course, you will:

* Obtain a working knowledge of Python basics and fundamentals relevant to data analysis and automation.

* Adopt a modern development and reporting environment for Python in the form of Jupyter notebooks.

* Obtain a good overview of key Python libraries covering Bioinformatics/Sequence analysis (Biopython, pydna), data analysis and statistics (Pandas), machine learning (scikit-learn), and image processing (scikit-image).

Software and Data Carpentry curricula

This course is based on the Software and Data Carpentry curricula (https://carpentries.org) and their style of teaching (live coding, hands-on exercises, etc.). Since 1998, Software Carpentry has been teaching basic lab skills for research computing to scientists and engineers, and its course materials have been continuously adapted to their problems and needs. We have extensively tailored the materials for this course to life science and biotech problems that can be solved with Python, with life science and biotech PhD students as the target audience.

Scope and form

The course is 100% interactive and relies on the proven approach of teachers conveying the knowledge through live coding while the participants follow along (supported by teaching assistants). Furthermore, live coding is frequently interrupted by hands-on exercises in which the participants develop programming solutions to appropriate tasks on their own (with the help of the teachers and teaching assistants).


Learning outcomes

At the end of this course, participants will be able to:

  • Use the Unix shell for working with files and directories, pipes and filters, loops, shell scripts, and searching
  • Use Python for data analysis and task automation, including the import of libraries, reading and plotting of data, selection and filtering of data, writing of conditional statements and functions, and debugging
  • Utilize basic version control of data and programming code with Git
  • Adopt a modern development and reporting environment for Python in the form of Jupyter notebooks
  • Clean, filter, transform and summarize tabular data with Pandas
  • Visualize data using the Python plotting libraries Matplotlib and Altair
  • Apply scikit-learn for basic machine learning tasks such as classification, regression, clustering, and PCA
  • Apply Biopython for basic DNA sequence handling
  • Simulate and plan experiments involving the creation of recombinant DNA using pydna
  • Perform basic image processing using scikit-image
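
To give a flavor of the reading, filtering, and summarizing tasks listed above, here is a minimal sketch in plain Python using only the standard library. The data, the column names (`strain`, `replicate`, `od600`), and the functions `load_measurements` and `mean_by_strain` are invented for illustration. In the course itself, this kind of task is handled with Pandas, which is typically more concise.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Hypothetical optical density readings; one value is missing on purpose.
RAW = """strain,replicate,od600
wild_type,1,0.82
wild_type,2,0.78
mutant_a,1,0.41
mutant_a,2,
mutant_a,3,0.45
"""


def load_measurements(text):
    """Parse CSV text and drop rows that have no OD600 value."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        if row["od600"]:  # an empty string means the measurement is missing
            rows.append({"strain": row["strain"], "od600": float(row["od600"])})
    return rows


def mean_by_strain(rows):
    """Group the readings by strain and return the mean OD600 per strain."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["strain"]].append(row["od600"])
    return {strain: mean(values) for strain, values in grouped.items()}


measurements = load_measurements(RAW)
summary = mean_by_strain(measurements)

for strain, value in summary.items():
    print(f"{strain}: mean OD600 = {value:.2f}")
```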

Important information 

Here you will find important information for short course participants regarding cancellation policies, location and waitlist policies.

Fee, registration and location 

  1. This course is free for PhD students registered at a Danish university
  2. The course fee for industry participants is 15,000 DKK.

To register, please contact the Program Coordinator, Seungmi Nam (seunam@dtu.dk), and include your name, your university or company, and your department.

The application deadline is May 2024.

The FBM initiative – funded by the Novo Nordisk Foundation  

This course was developed in the framework of the Fermentation Based Biomanufacturing Initiative, in collaboration between DTU Bioengineering, DTU Biosustain, and DTU Chemical Engineering. The project is funded by the Novo Nordisk Foundation.

Duration

6-8 May 2024

Place

DTU Lyngby Campus

ECTS

2.5

Price

15,000 DKK