Summer School for Big Data in Biology (CBRS)

Big Data in Biology Summer School


Intensive four- and five-day workshops on diverse topics for analysis of large-scale DNA, RNA, and protein datasets


May & June
Online & In-Person
All Levels

The Center for Biomedical Research Support hosts the Annual Summer School for Big Data in Biology each May and June. Participants gain hands-on experience with real datasets and tools, guided by experts in computational biology. Courses are tailored for beginners through advanced users.

Course Format

  • Runs for 4–5 consecutive days (mornings or afternoons).
  • Offered online or in-person.
  • Includes lectures, datasets, and practical exercises.
  • No exams — certificates of completion available on request.
  • Academic credit is not issued.

Before You Enroll

  • Check course prerequisites carefully.
  • Many courses expect familiarity with Unix, Bash scripting, and TACC.
  • Make sure you have a UT EID and a TACC account.

Fees and Registration


We accept personal credit cards (AmEx, MasterCard, Visa, Discover), UT ProCards (see details), and IDT (interdepartmental transfer).

  • Groups of 5 or more from the same agency or institution receive a 20% discount.
  • Register for 3 or more courses and receive a 50% discount.
     
Affiliation        Students / Post-docs    Faculty / Staff
UT System          $195*                   $295*
Non-UT Schools     $275**                  $500**

Other
  • Non-Profit: $500
  • Government: $500
  • Industry: $1,000

* Our staff will confirm affiliations with UT.
** Non-UT registrants must provide a copy of their current student or faculty/staff ID.
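
As a rough illustration of how the fees and discounts above combine, here is a minimal Python sketch. The function name, the role labels, and the rule that only the larger of the two discounts applies are assumptions made for this example; the page does not say whether the discounts can be combined, so confirm the actual total at checkout or by email.

```python
# Per-course fees in US dollars, taken from the fee table above.
# The role keys ("student_postdoc", "faculty_staff") are labels invented for this sketch.
FEES = {
    "UT System": {"student_postdoc": 195, "faculty_staff": 295},
    "Non-UT Schools": {"student_postdoc": 275, "faculty_staff": 500},
    "Non-Profit": 500,
    "Government": 500,
    "Industry": 1000,
}


def estimate_total(affiliation, role=None, n_courses=1, group_size=1):
    """Estimate the total fee for one registrant (hypothetical helper).

    Assumption: if both discounts would apply, only the larger one is used.
    The source does not state how the discounts interact.
    """
    entry = FEES[affiliation]
    # School affiliations are priced by role; the "Other" categories have a flat fee.
    per_course = entry[role] if isinstance(entry, dict) else entry

    discount_pct = 0
    if group_size >= 5:   # groups of 5 or more from the same institution
        discount_pct = max(discount_pct, 20)
    if n_courses >= 3:    # registering for 3 or more courses
        discount_pct = max(discount_pct, 50)

    return per_course * n_courses * (100 - discount_pct) / 100


if __name__ == "__main__":
    # A UT System student taking three courses under the stated assumptions.
    print(estimate_total("UT System", "student_postdoc", n_courses=3))
```

Under these assumptions, a UT System student registering for three courses would pay half of 3 × $195.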

 

Refund and Cancellation Policy

Refunds (minus a $25 fee) are available if requested in writing at least one week before the course start. No refunds or substitutions will be granted after that date. Failure to cancel on time and non-attendance still require full payment. UT Austin may cancel courses with low enrollment and will issue full refunds in those cases.

For issues or questions about registration, email bcg@utexas.edu.

 

Important Registration Notice

If the cart remains empty, you may need to use incognito (private) mode, clear your web browser cache, or switch browsers to complete course registration.

Do NOT use someone else’s PIN during registration, or your registration will not be complete. If you are new, use the unique PIN assigned to you during registration; otherwise, use the same PIN you used for earlier registrations.

Also, if you are registering on behalf of someone else, PLEASE DO NOT use your name, contact information, or EID at any point in the process. You MUST use the student's information, or they will not be included on the course roster properly and could miss crucial course communication. Ask the student to forward you the receipt once they receive it by email.

No refunds will be issued within 2 business days of the course start date.

Introduction to Statistical Modeling

Date:
Tue, May 26 - Fri, May 29
Time:
9:00 am - 12:00 pm
Location:
Instructor:

Layla Guyot

Close Date:
Modality:
In-Person
Course Description:

This course is a hands-on introduction to building and interpreting statistical models in R, with a focus on real-world applications. We will cover key concepts in hypothesis testing, multiple linear regression, and logistic regression. You will learn how to choose appropriate modeling approaches, fit models using R, check assumptions, interpret results, and clearly communicate your findings. Each topic will include a brief introduction to foundational concepts, a demonstration of analysis in R, and guided practice through interactive coding exercises. Emphasis will be placed on using statistical modeling to answer research questions within reproducible workflows. By the end of the course, the goal is for you to be able to apply statistical modeling to your own data.

Preferred or Prerequisite Skills:
This course is recommended for students with some prior knowledge of R or programming in general.

Computer Requirement:
Participants are expected to provide their own laptops.

If using a UT Procard, read this disclaimer.


Introduction to RNA-Seq

Date:
Tue, May 26 - Fri, May 29
Time:
1:00 pm - 4:00 pm
Location:
Instructor:

Dhivya Arasappan (Co-Director, Bioinformatics Consulting Group, CBRS)

Close Date:
Modality:
Hybrid, but in-person encouraged
Course Description:

This four-day course provides an introduction to methods for analysis of RNA-seq data. It follows a typical RNA-seq workflow: quality assessment of raw data, mapping (bwa, kallisto), differential expression analysis (DESeq2), and downstream analyses and visualization. The course also covers analysis methods for single-cell RNA-seq data. Participants will gain hands-on experience using these tools in a Linux command line environment.

Preferred or Prerequisite Skills:
None

Computer Requirement:
Students should have their own laptop computer. A UT EID is required for wireless access on campus. Please be sure you know your UT EID when you come to class. To obtain a UT EID, go here.

If using a UT Procard, read this disclaimer.


Introduction to Biocomputing: Working in Unix and R

Date:
Mon, Jun 1 - Fri, Jun 5
Time:
9:00 am - 12:00 pm
Location:
Instructor:

Matt Bramble (Bioinformatician, Bioinformatics Consulting Group, CBRS)

Close Date:
Wed, May 27
Modality:
Hybrid, but in-person encouraged
Course Description:

This course will cover the Unix command line and data analysis in R within the context of biocomputing. We will start at the Unix command line and cover command line tools for manipulating data files, before transitioning to RStudio to cover introductory topics and engage with data analysis methods in R. The course will finish up with tidyverse tools and methods for visualizing data using ggplot2.

Preferred or Prerequisite Skills:
None

Computer Requirement:
Students should have their own laptop computer. A UT EID is required for wireless access on campus. Please be sure you know your UT EID when you come to class. To obtain a UT EID, go here.

If using a UT Procard, read this disclaimer.


Introduction to Python

Date:
Mon, Jun 1 - Fri, Jun 5
Time:
1:00 pm - 4:00 pm
Location:
Instructor:

James Derry (Senior Systems Administrator)

Close Date:
Modality:
In-person
Course Description:

This five-day course will introduce students to basic concepts in programming using the Python language, establishing a foundation for scientific computing. Trainees will learn introductory topics such as data structures, control flow, functions, file input/output, and data parsing. The class will work with libraries from the SciPy ecosystem, such as pandas. Trainees will have full access to the instructor’s course book and course content (datasets, scripts, and Jupyter notebooks).

Preferred or Prerequisite Skills:
None

Computer Requirement:
This class is offered in-person. Students must provide laptops able to connect to the internet, and a Firefox or Chrome browser. UT EID is required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UT EID, go here.

If using a UT Procard, read this disclaimer.


Introduction to Core NGS Concepts and Tools

Date:
Mon, Jun 8 - Fri, Jun 12
Time:
9:00 am - 12:30 pm
Location:
Instructor:

Anna Battenhouse (Associate Research Scientist and Bioinformatics Consultant, CBRS)

Close Date:
Modality:
Hybrid, but in-person recommended
Course Description:

This five-day course provides an introduction to the concepts and vocabulary of Next Generation Sequencing (NGS) with an emphasis on common protocols, tools and file formats used in NGS data analysis. Subjects covered include quality assessment and manipulation of raw NGS sequences (FastQC, cutadapt), read mapping (bwa, bowtie2), the Sequence Alignment Map (SAM) format, and tools for manipulating BAM files (samtools, bedtools). Participants will gain hands-on experience using these and other NGS tools in the Linux command line environment at TACC, as well as exposure to the many bioinformatics resources TACC makes available.

Preferred or Prerequisite Skills:
None. UNIX/Linux command line experience is not required, and becoming familiar with how to use the command line for NGS analysis will be a major focus of this course. However, to get a head start on developing this important skill you can look through our Intro to Unix/Linux workshop wiki, and our Intermediate Unix/Linux workshop wiki.

Computer Requirement:
To participate fully in the hands-on exercises, students should have their own laptop computer with an SSH client program. Macs have SSH available in the Terminal application. Recent Windows versions have an SSH client built into their PowerShell and Command Prompt programs, or PuTTY can be used if SSH is not available. A TACC account and UT EID are also required. To obtain a UT EID, go here. To sign up for a TACC account, go here.

If using a UT Procard, read this disclaimer.


Principles of Machine Learning for Bioinformatics

Date:
Mon, Jun 15 - Thu, Jun 18
Time:
9:00 am - 12:00 pm
Location:
Instructor:

Dennis Wylie (Co-Director, Bioinformatics Consulting Group, CBRS)

Close Date:
Modality:
Hybrid
Course Description:

This four-day course will introduce a selection of machine learning methods used in bioinformatic analyses, with a focus on RNA-seq gene expression data. We will cover unsupervised learning, dimensionality reduction, and clustering; feature selection and extraction; and supervised learning methods for classification (e.g., random forests, SVM, LDA, kNN) and regression (with an emphasis on regularization methods appropriate for high-dimensional problems). Participants will have the opportunity to apply these methods, as implemented in R and Python, to publicly available data.

Preferred or Prerequisite Skills:
This course is recommended for students with some prior knowledge of either R or Python. Participants are expected to provide their own laptops with recent versions of R and/or Python installed. Students will be instructed to download several free software packages (R packages and Python libraries such as pandas and scikit-learn).

Computer Requirement:
Students should have their own laptop computer. UT EID is required for wireless access. Please be sure you know your UT EID when you come to class. To obtain a UT EID, go here.

Cost:

$50

If using a UT Procard, read this disclaimer.
