Tag Archives: coding

Matt Suntay’s jump into the PINC computing program

27 May

Matt Suntay is one of the students in the PINC program and also a research student in my lab in the E. coli / drug resistance / machine learning team. A few days ago he gave a speech at our PINC/GOLD/gSTAR graduation event. I thought it was a great speech and Matt was kind enough to let me share it here both as a video and the text for those of you who prefer reading.

“To those of you who may know me, you all know I’m pretty adventurous. For those of you who may not know me, first off, my name is Matthew Suntay, and I have jumped off planes, cliffs, and bridges – and each time was just as exhilarating as the last. But, let me tell you about my most favorite jump: the leap of faith I took for the PINC program.

I call it a leap of faith because when I first heard about the PINC program, and specifically CSC 306, I thought, “Ain’t no way this could be for me. I may be stupid because I can barely understand the English in o-chem and now I gotta understand the English in Python? Maaaan, English isn’t even my first language… But they said I don’t need any prior computer science knowledge, so why not? It’s Spring ‘21, new year, new me, right?”

And let me tell you, it definitely made me a new me. I went from printing “Hello World!” to finding genes in Salmonella to constructing machine-learning models to study Alzheimer’s Disease and antibiotic resistance in E. coli. These are some pretty big jumps–my favorite, right?–and they weren’t easy to make. However, I was never scared to make any one of those jumps because of the PINC program.

When I think PINC, I don’t only see lines of code across my screen or cameras turned off on Zoom. I see friends, colleagues, mentors, and teachers. I see a community.

I see a community willing to support me in my efforts to develop myself as a scientist. I see a community providing me the platform and opportunities to grow as a researcher. And most importantly, I see a community that shared hardships, tears, laughter, and success with me.

I can confidently say that the PINC program was, and still is, monumental to my journey through science. Thanks to the PINC program, many doors have been opened to me and one of those doors I’m always happy to walk through each time is the one in Hensill Hall, Room 406 – or the CoDE lab. It was here in this lab that I met some of the most amazing people who want to do nothing but help me reach new heights. I’m so grateful and lucky to have them. So thank you, Dr. Pennings, for believing in me and continuing to believe in me. Thank you to everyone in the CoDE lab for supporting me and laughing at my terrible jokes – and real talk, please keep doing so, I don’t know how to handle the embarrassment that comes after a bad joke.

If I haven’t said it enough already, thank you so much to the PINC program. If you were to ask the me from a year ago what his plans were for the future, he would tell you, “Slow down, dude, I don’t even know what I’m trying to eat for breakfast tomorrow.” But now if you were to ask me what my plans for the future are, I’d still tell you I don’t know what I’m trying to eat for breakfast tomorrow because I’m too busy writing code to solve my most current research question, whatever it may be.

For many students, including myself, one of the biggest causes of an existential crisis is, “What am I gonna do after I graduate?” To be honest, I’m still thinking that same thought, but without the dread of an existential crisis. One of the coolest parts of the PINC program is the exposure to research and the biotechnology industry, and learning that research == me and not just != the stereotype of a scientist.

Dr. Yoon, thank you for taking the time and effort to push me and my teammates forward, because even though our projects were difficult, we learned a lot about machine-learning and ourselves, like who knew we had it in us this whole time? You definitely did and you helped us see that. Professor Kulkarni, you also helped us realize that we should give ourselves more credit. 601 and 602 showed us we can be competitive and that we’re worth so much more than we make ourselves out to be. Also, I would like to give a quick shoutout to Chris Davies and Chun-Wan Yan for the wonderful seminars because those talks gave me hope and inspiration for the future. Knowing that there’s something out there for me makes going into the future a lot less scary and a lot more exciting because who knows what awesome opportunity is waiting for me?

And one last honorable mention I would like to make is to Professor Milo Johnson. He was my CSC 306 professor, and I don’t know if he is here today, but he was an amazing teacher in more ways than one. He helped me turn my ideas into possibilities and I have him to thank for helping kick start my journey through PINC. When I thought “I couldn’t do it, this isn’t for me,” he said “Don’t worry, you got this.”

So, once again, to wrap things up, thank you to everyone who’s helped me out this far and continues to help me out. Thank you to all my friends, mentors, and teachers that I’ve met along the way. And thank you to the PINC program, the best jump I’ve ever made.

Matthew Suntay – PINC graduate 2022

Scientist Spotlight: Alennie Roldan

7 Jun
Alennie (they/them) graduated from SFSU in 2021 and will be working as a Bioinformatics Programmer in the lab of Dr. Marina Sirota.

Pleuni: Hi Alennie, congratulations on graduating this semester! 

Alennie: Thank you! I really enjoyed my time at SFSU and I’m excited to move onto the next chapter. 

Pleuni: You told me that you are starting a job at UCSF soon. Would you mind telling me what you’ll be doing there and how you found that job? 

Alennie: I’ll be working as a Bioinformatics Programmer in the lab of Dr. Marina Sirota. The work is very in line with the interdisciplinary concepts I learned through the PINC program–– coding meets life science and health data. Prior to getting the position, I heard about an event, “NIH Diversity Supplement Virtual Matchmaking,” from the PINC and SEO mailing list. At the event, I met with many different UCSF PIs and learned about their research. I kept in contact with some of the PIs I met whose research I thought was very interesting. From there I scheduled different meetings and interviews with each PI to see if we’d be a good match. I ended up moving forward with the Sirota lab because I wanted to be involved in their research and felt that I could learn a lot from the experience. 

Pleuni: When did you start to learn coding? 

Alennie: Honestly, I feel like my first stint with coding began with Tumblr. In middle and high school I picked up some HTML to personalize my Tumblr page. It was exciting to input strange strings of numbers and letters and churn out wacky graphics. When I stopped using Tumblr I didn’t seriously pick up coding until summer 2019 for the BDSP, where I learned that there were so many different ways programming could be used. 

Pleuni: Did you always want to learn coding? 

Alennie: When I was younger, I’d watch the crime show “Criminal Minds” with my mother. One of my favorite characters was Penelope Garcia, the show’s FBI Technical Analyst. She fills the tech-savvy role of the group and I always enjoyed seeing how she’d help solve the case by unlocking “digital secrets” or finding classified information. Based on portrayals like that, I always considered coding as an exclusive skill limited to cyber security and creating complex software. So I was always interested in coding, but the idea of learning how seemed too daunting.

Pleuni: You did the entire PINC program – which part did you like most? Which part was frustrating? 

Alennie: I enjoyed the creative freedom of the PINC program. Many of the classes I took had final projects that encouraged us to come up with our own ideas. It was satisfying and challenging to take all that I’ve learned so far and use that knowledge to come up with my own projects. One of my favorite projects was for CSC 307: Machine Learning for Life Science Data Scientists. The goal of my group’s project was to address the lack of diversity in dermatology datasets by applying a machine-learning model that could identify various skin disorders; our dataset consisted of skin image samples from People of Color. The assignment was especially rewarding because it allowed me to combine my passion for health equity, social justice, and programming into a single project. 

The most frustrating part of the program was primarily due to the pandemic. It was difficult to communicate with my professors and classmates through a remote format. The experience sometimes felt isolating because I had been so used to seeing my mentors in-person or meeting up with classmates to work on an assignment/project. Thankfully, I had met many of the same classmates in person before switching to virtual learning so I felt like I had some familiar faces to interact with. 

Pleuni: Sometimes it looks like coding is something for only some kinds of people. There are a lot of stereotypes associated with coding. How do you feel about that? 

Alennie: This is a very good question, as there are many layers to the coder/programmer stereotype. If you were to ask people to draw a picture of a coder, the most common image you’d likely see is a lonely man furiously typing in a darkened room, hunched over in his chair and focused on screens covered with indecipherable numbers and symbols. Simply put, we often imagine a typical coder as a cisgender white man who typically exhibits loner or awkward behaviors. It’s a very narrow and negative stereotype which ultimately promotes negative connotations regarding neurodivergent individuals and excludes Women and People of Color from the narrative. 

The stereotype does little to encourage or welcome most people. But in reality, the coding community at large desperately needs a diverse range of people who can contribute their unique perspectives. Stereotypes can be discouraging and unwelcoming, so it’s important for institutions to emphasize inclusivity to show how students can be fantastic coders and still be true to their unique identities. 

…it’s important for institutions to emphasize inclusivity to show how students can be fantastic coders and still be true to their unique identities.

Pleuni: I know you are applying to medical school. Do you think it is useful for a doctor to know about computer science? 

Alennie: I believe that computer science can be very useful to a physician because it can improve how they can take care of people. Since they are face-to-face with patients every day, healthcare professionals are in a position where they can recognize and understand what unique problems need to be addressed in their communities.

For example, by having some knowledge in computer science a doctor could aid in the design of an app that patients can use to let them know if they’re experiencing side effects to their medication, create a website that shows local doctors who are LGBTQ+ friendly, or even better navigate electronic health records. The possibilities are endless!

Pleuni: Do you have any tips for students who are just starting out? 

Alennie: Embrace your creativity! We often think of coding as a sterile and strict subject, but as you create new programs, websites, apps, etc., you realize how much creative freedom you actually have. Learning how to code can be very daunting, so when you personalize programs to fit your style or reflect things that you like, it makes the journey seem less scary and more fun. When I started coding, I had the most bare-bones of tools at my disposal, but I could still find ways to inject things to make my code feel like it belonged to me. The very first game I programmed, a basic recreation of Pong, I signed with my favorite color, pastel pink.

Alennie recreated the classic game of Pong with a little extra flair for one of their coding projects.

Pleuni: Thank you, Alennie! Please stay in touch!

Scientist Spotlight: Berenice Chavez Rojas

28 May

Berenice Chavez Rojas graduated from SFSU in 2021 with a major in biology and a minor in computing applications. She is moving to Boston to work in a lab at Harvard’s Medical School.

Pleuni: Hi Berenice, congratulations on graduating this semester! 
I know that you are starting a job at Harvard soon. Would you mind telling me what you’ll be doing there and how you found that job? Did your coding skills help you land this job?

Berenice: I’ll be working as a research assistant in a wet lab. The model organism is C. elegans and the project will focus on apical-basal polarity in neurons and glia. I found this job on Twitter! Having a science Twitter is a great way to find research and job opportunities as well as learn new science from other scientists. While I won’t be using my computational skills as part of this job, the research experience I have been able to obtain with my coding skills did help me. 

“coding always seemed intimidating and unattainable”

Pleuni: When did you start to learn coding? 

Berenice: I started coding after I was accepted to the Big Data Summer Program two years ago [Note from Pleuni: this is now the PINC Summer Program]. This was also my first exposure to research and I’m grateful I was given this opportunity. This opportunity really changed my experience here at SFSU and it gave me many new opportunities that I don’t think I would have gotten had I not started coding. Following the Big Data Summer Program I started working in Dr. Rori Rohlfs’ computational biology lab. I also received a fellowship [https://seo.sfsu.edu/] which allowed me to stop working my retail job; this gave me more time to focus on school and research.

Pleuni: Did you always want to learn coding?

Berenice: Not at all, coding always seemed intimidating and unattainable. After my first exposure to coding, I still thought it was intimidating and I was slightly hesitant in taking CS classes. Once I started taking classes, the more I practiced, the more everything began to make sense. I also realized that Google and StackOverflow were great resources that I could access at any time. To this day, I still struggle and sometimes feel like I can’t make any progress on my code, but I remind myself that I’ve struggled many times before and I was able to persevere all those times. It just takes time!

The forensic genetics team at the Big Data Science Program in the summer of 2019. Berenice Chavez Rojas is in the middle.

“At the end of this project, I was able to see how much I had learned and accomplished”

Pleuni: You did the entire PINC program – which part did you like most? Which part was frustrating?

Berenice: My favorite part of the PINC program was working on a capstone project of our choice. At the end of this project, I was able to see how much I had learned and accomplished as part of the PINC program and it was a great, rewarding feeling. As with any project, our team goals changed as we made progress and as we faced new obstacles in our code. Despite taking many redirections, we made great progress and learned so much about coding, working in teams, time management, and writing scientific proposals/reports.

Link to a short video Berenice made about her capstone project: https://www.powtoon.com/c/eKaZB3kkxE5/0/m

Pleuni: Sometimes it looks like coding is something for only some kinds of people. There are a lot of stereotypes associated with coding. How do you feel about that? 

Berenice: I think computer science is seen as a male-dominated field and this makes it a lot more intimidating and may even push people away. The PINC program does a great job of creating a welcoming and accepting environment for everyone. As a minority myself, this type of environment made me feel safe and I felt like I actually belonged to a community. Programs like PINC that strive to get more students into coding are a great way to encourage students that might be nervous about taking CS classes due to stereotypes associated with such classes. 

“talking to classmates […] was really helpful”

Pleuni: Do you have any tips for students who are just starting out?

Berenice: You can do it! It is challenging to learn how to code and at times you will want to give up but you can absolutely do it. The PINC instructors and your classmates are always willing to help you. I found that talking to classmates and making a Slack channel where we could all communicate was really helpful. We would post any questions we had and anyone could help out; oftentimes more than a few people had the same question. Since this past year was online, we would meet over Zoom if we were having trouble with homework and go over code together. Online resources such as W3Schools, YouTube tutorials and GeeksforGeeks helped me so much. Lastly, don’t bring yourself down when you’re struggling. You’ve come so far; you can and will accomplish many great things!

Pleuni: What’s your dog’s name and will it come with you to Boston?

Berenice: His name is Bowie and he’ll be staying with my family here in California. 

Pleuni: Final question. Python or R?

Berenice: I like Python, mostly because it’s the one I use the most. 

Pleuni: Thank you, Berenice! Please stay in touch!

Scientist spotlight: meet Dr Sabah Ul-Hasan!

28 Apr

Dr Ul-Hasan (they/them) is a postdoc and lecturer in bioinformatics under Dr Andrew Su and Dr Dawn Eastmond at Scripps Research, doing biocuration and automated data integration work within the Gene Wiki project of Wikidata. They received their PhD in Quantitative & Systems Biology from UC Merced, their Master’s in Biochemistry from the University of New Hampshire and their BSc degrees (3 majors! Biology, Chemistry, and Environmental & Sustainability Studies) from the University of Utah. Sabah is involved in what feels like a thousand different activities related to science, research, coding, outreach, conservation, environmental justice and other things. 

I got to know Sabah a couple of years ago when I visited UC Merced and then started following them on Twitter. One thing I really love about them is how they don’t limit themselves to just doing one thing. They are ambitious and radical. They founded the Biota project to connect underrepresented communities with nature. They are a filmmaker (see here)! They volunteer for The Carpentries, and they started the venom-microbiome research consortium. They organize workshops, speak at events, teach classes and do many other things.

In my opinion, too few scientists use their platform to fight for justice and to share their passion and knowledge. At the same time, many PhD students and postdocs and even assistant professors are shy about taking a stance, thinking that they would speak up louder (about science or justice or both) when they are more senior. But Sabah proves that you don’t have to be a tenured professor to make a difference in science (they have more than 8000 followers on Twitter, just saying).

Pleuni: Hi Sabah, thanks for taking the time to answer my questions! Could you tell us in a few sentences how you became interested in data science? 

Sabah: One of my dissertation chapters involved data that was over 100 years old. I know this isn’t a new concept for anyone doing paleo research. I was also well-familiar with “old” data through all the climate change reports that have come up in the public over the years. 

However, to directly work with data like that I realized there were so many more questions I wanted to ask people from 100 years ago. That then got me wondering, “How can I contribute to research in a way that can be sustainable 20, 50, or even 5 years from now?”

My interest in data science thus came from a position of wanting to be part of something bigger in terms of the infrastructure for how we can sustain the science of today and tomorrow. 

Pleuni: How did you start learning coding skills? Was it hard for you to learn? 

Sabah: I was first introduced to R during my (Biochemistry) Master’s at the University of New Hampshire in 2013. I sat in on a casual meeting among graduate students and postdocs and truly had no idea what anyone was talking about.

The data analysis section of my MSc thesis ended up utilizing Excel to make bar charts. In retrospect, I see how much faster I could’ve done the analyses if I took the time to learn coding. When I began the doctoral program at UC Merced in January 2015, I knew coding was a skill I wanted to learn and so I did through classes and workshops. 

Now it’s my job as a postdoctoral scholar and lecturer for bioinformatics, and I still sometimes struggle with basic concepts. The difference between then and now is I’m a lot better at admitting when I don’t know something, how to ask a question for what I need to learn, and where to go to find that answer. 

I’m not sure anyone who does bioinformatics considers themselves an expert, but perhaps the expertise lies within the ability to problem solve especially when it is difficult or can feel overwhelming. In sum, the sooner you can confront your fears the better! Don’t let them freeze you. Believe in your ability to constantly learn and grow, even when you’re a titled expert!

Pleuni: For your paper that appeared in PLOS ONE in 2019, you studied the diversity of microorganisms (including archaea, bacteria and eukaryotes) in seawater and sediment in three different locations. It sounds like a complex dataset to work with.

Community ecology across bacteria, archaea and microbial eukaryotes in the sediment and seawater of coastal Puerto Nuevo, Baja California

Sabah: It’s funny to only be two years out from that publication and already think of so many things I would’ve done differently. I guess that’s growth! 

I attribute a lot of credit and thanks to the co-authors of the paper and those in the acknowledgements. It came a long way from when I first drafted it to the final publication form, and posting it on bioRxiv also helped a great deal in soliciting feedback. 

What I think really makes a difference is the transparency of that research and associated code, especially in reference to data clean-up (which is the bulk of the analysis work, in my personal opinion). I’ve since received several inquiries from people for their own work, and to me it feels great to know that it can serve as something people can apply to their own research in making things a little easier.

I also think it’s important we as scientists specify the microbes we’re investigating in any ‘microbial community’-type paper. Many of the amplicon and metagenomics studies I see really focus on bacteria or fungi, which is absolutely fine, but that isn’t the comprehensive microbial community that many of the titles for these papers tend to imply. In this study, too, we focus on whatever microbial groups we identified solely through 16S and 18S. We need to be better at saying what the data is rather than wordsmithing for a nice story. That will help the next group build upon those gaps for something stronger next time, and overall our intent as scientists is to always have research be advancing further and further. Right?

Pleuni: You used R for your data analysis (but also other software such as QIIME2). What do you like or not like about R? Could you imagine doing a paper like this one without R?

Sabah: Wrappers such as QIIME2 and mothur are great for people who want to do an analysis of a microbial dataset and then perhaps never touch one again. For me, I found myself continuously asking a lot of “Why?” and wanting to dig deeper on the fundamentals behind what the software I was using was doing. In the end, R took more time to learn short-term but made more sense to me of what was happening each step of the way in the analysis. It was also a good way to affirm my results in trying different avenues and seeing the same output.

What I learned from putting together the paper is it’s not about finding the ‘right’ or ‘wrong’ answer, it’s about finding an answer that is logical and as unbiased as possible. A lot of the time we have these hypotheses we ‘prove’ through confirmation bias. To me, code (when done with intention) is a way to step outside of ourselves and see what the data is telling us rather than what we want the data to say — and that’s where the interesting science lives.

This publication, for example, wasn’t exactly what we were wanting to see. It’s actually a failed attempt at sequencing the venom microbial community of Californiconus californicus, which was the focus of my dissertation (venom microbiomes), due to too much host contamination of the tissues we sampled for that region of Puerto Nuevo. So, what do we do? Do we call it all a wash? There was a lot of thought, time, and resources that went into that work. 

I had sampled the sediment and water of the area, along with some generic chemistry tests, to see if the venom microbial community was largely specialized to the snail venom glands or from the surrounding environment (they burrow in the sand). That data was still usable, had good replication, and we didn’t know anything about the microbial community of Puerto Nuevo before that point. Ah-ha! A different story than we were thinking, but still a valuable one. Let the data tell you, don’t misconstrue the data to fit your narrative. 

R, and all the programming languages I’ve learned thus far, have helped me learn that.

Pleuni: On your Twitter profile, you list many interests, such as advocacy, consulting, data visualization. Can you tell us a bit about your different interests? Are these things linked to each other?

Sabah: Well… haha. The link is that, at heart, I’m a bit of a troublemaker. It’s the nature of a scientist to ask a lot of questions, and asking too many questions can often get us into trouble! I likewise enjoy being asked a lot of questions, and hope to always maintain humility in learning just as much from high school students as I do from tenured professors. 

I wanted my Twitter profile and bio to emulate that duality of being both a ‘credible academic’ while also pushing back on what we define as ‘the norm’. I disagree with the idea that a science expert needs to possess a PhD (or some other form of higher education certification) because of the privilege and whiteness involved, but I do also benefit from it after completing the process and there is of course also danger in believing ‘just anyone’ on the internet. And I love learning and helping, which are really the only drivers behind all my many interests.

In my view, the most important quality in being a scientist is being approachable. If only a few people can understand the work you do, then what’s the point? That’s why I’m on Twitter, and also as a way to keep myself grounded, especially learning from moments of being called out (which does happen from time to time). I’d also say my family keeps me in check, as I’m one of the few with a science background. I have one cousin on my Mom’s side with a Ph.D. and that’s it for our extended family of over 100 people (South Asian families are big). Being a good scientist is just as much about humanity as it is about the basic research. I think only good things can come from staying tuned into the reality of the world around us, even though it can feel like a lot to balance.

Pleuni: Do you have any advice for the bio and chem Master’s students in my Data Science class? 

Sabah: My advice is to just go for it! 

This past Fall I taught a bioinformatics course to (mainly) graduate students and it was an adventure for all of us. It was my first time as a full instructor for a course (versus a teaching assistant), during COVID no less, and it was also the first time many students in the course were getting into bioinformatics. 

At the end, it was clear to me that student progress in the course wasn’t about who knew how much at the start but rather about showing up with enthusiasm and simply trying. That went both ways for me as the instructor giving lectures my all as well as for the students and their performance. And life happens! I had to cancel one of the days due to personal life things, and that’s okay. Be good to yourself when you need to and also don’t hold yourself back. And be good to others, too. We really never know what someone else may be experiencing behind the scenes for them to be flakey or on edge, and the more we can find the good in each other the better we can focus on doing the good science. 

On that note, I can’t express enough how much of a difference it’s made in my life to work for or alongside even just one considerate person. As they say, “You are what you eat.” My PhD co-advisors (Dr Tanja Woyke and Dr Clarissa Nobile) and my current PIs (Dr Su and Dr Eastmond) are truly outstanding people. They have so many stresses in their own careers and lives, and they still somehow show up with kindness and professionalism every day. And they also believe in me to do good work, even when I’ve had a bad week (or month!). That trust really goes such a long way when you’re underrepresented in your field, and often used to being discouraged and/or people expecting very little of you. Being entrusted to teach a course at a renowned research institute directly out of my PhD, for instance, is a big reason why I chose this position in knowing that my voice was heard and respected. That’s been true throughout, and makes it much easier to show up with my best foot forward even on the tough days.

Tying it all together, so many times I’ve got myself stuck because I see others who are ahead of me, doing better than me, and/or with access to more resources than me. One truth we can all agree upon is that life is unfair, and while hopefully it will become equitable over time through our own efforts to create change, the fact is that life is still happening in the meantime. No one will help you as much as you can help yourself, and the moments where I’ve been able to just sit down and see something through are how I’ve realized more and more just how much more ability I have than I thought. You’re much more capable than you give yourself credit for! It’s super cheesy, but it’s very true. And feel free to reach out any time!

Pleuni: Thanks for answering my questions, Sabah! So much here that resonates with me, including one of the last things you said, that you realized that you have more ability than you thought. This happens to me too! As just one example, just over a year ago, I didn’t think I could learn Machine Learning, but now I am even teaching it. Not that I am suddenly an expert, but I can do it and it is no longer scary. 

I look forward to seeing all the science, art, and justice-related projects you will be doing in the future! 


Sabah Ul-Hasan Google Scholar profile 

Sabah Ul-Hasan, PhD Twitter Profile (@sabahzero)

How we run an inclusive & online coding program for biology and chem undergrads in 2020 

7 May

By: Nicole Adelstein, Pleuni Pennings, Rori Rohlfs

Coding summer program (BDSP) in 2018, when students were in the same room for 8 hours a week.

In 2018 this team (led by Chinomnso Okorie) met in the “yellow room” for 8 hours a week to learn R.  

We have been running combined coding/research summer programs for several years, with a focus on undergraduate students, women, and students from historically underrepresented racial and ethnic groups. This summer, we will run our 9-week program as an online program. We think that others may be interested in doing this too, so we’ll share here how we plan to do it.

Some of the information below will also be published as a “ten rules paper” in PLOS Computational Biology*, but we wanted to share this sooner and focus on doing things online vs in person.

TL;DR version

  1. Have students work in teams of 4 or 5, for 2 hours per day, 4 days a week. Learning to code should be done part-time, even if your program is full time. 
  2. Use near-peer mentors to facilitate the team meetings (not to teach, but to facilitate). 
  3. Use existing online courses – we’ll share a few that we like. Don’t try to make your own curriculum last minute. There are good online courses available. 
  4. Give the students a simple (repeat: simple!) research project to work on together. 

1. Have students work in teams for two hours a day – with pre-set times. 

Learning to code is stressful and tiring. Even though many students may not have jobs this summer, that doesn’t mean they can code for 8 hours a day. First, because they have other stuff to do (like taking care of family members), and second, because there’s a limit to how long you can be an effective learner.

Our program is 10 hours per week (8 hours of coding, 2 hours of “all-hands” meeting). We make it clear that no work is expected outside of these hours. For example, a team may meet from 10am to 12pm four days a week for coding. 

Check-ins, quiet working, shared problem solving. 

During the coding hours, the near-peer mentor is always present (on Zoom, of course!) and facilitates the meeting. The very first day should be all about introductions and expectations. After that, we suggest that every day there is time for check-ins (everybody shares how they are doing, what they’re excited about or struggling with, or what music they’re listening to), quiet working (mute all microphones, set a timer, everybody works on the online class by themselves) and shared problem solving (for example, let’s talk about assignment X from the online class). One of the mentors last year had success starting every meeting with a guided meditation.

Each team has a faculty mentor in our program (this could be a postdoc or faculty member). Once a week, the faculty mentor joins the meeting for about 1 hour. This hour could consist of introductions / check-ins, a short presentation or story by the faculty mentor, and the opportunity for the team to ask questions. It’s great if the near-peer mentor and the team prepare questions beforehand. 

1B. Add a non-coding meeting (if you can/want)

In addition to the 8 coding hours per week, our students also meet for 2 hours per week in an “all hands meeting”. Such an all-hands meeting is not absolutely necessary, but if you have the bandwidth, it may be nice to meet once a week to do something other than coding. Maybe to read a paper together or meet with someone online (an alum who is now somewhere else? A faculty member or grad student?). 

If your program is full time (like an REU program), we suggest still doing only about 8-15 hours of coding per week. Fill up the rest with more standard things such as lectures, reading, etc. (and don’t make anyone do Zoom 40 hours a week!). If students are enjoying themselves with coding and getting more confident, they may do more coding by themselves, but in our program it is not the expectation.

2. Mentors and teams are key 

When working alone, we’ve often seen students get stuck on technical problems, leaving many feeling lost and inadequate and wanting to discontinue learning this new skill. Working in a mentored team, however, students have access to immediate support from their peers and mentor. This helps them learn technical skills more efficiently, develop relationships with each other, and cultivate a shared sense of belonging in computational research (Kephart et al. 2008). We recommend that each participant in a coding summer program be assigned to a team of 4 to 5 students with similar technical skill levels led by a near-peer mentor. 

Mentors in our program are typically a year or two ahead of participants but belong to similar demographic groups and come from similar academic backgrounds. The mentor facilitates the meetings and leads the team in learning skills and applying them to a research question (without doing the work themselves). 

Each team also has a faculty advisor, who comes up with a research project that is likely to be completed in the available time and that is of interest to the students (Harackiewicz et al. 2008). The faculty advisor meets with the whole team at least once per week to guide learning and research. Of note, acting as a mentor improves students’ retention and success in STEM (Trujillo et al. 2015); therefore, this setup benefits mentors as well as mentees.

2B. Who can be mentors? 

Over the years, we have found that near-peer mentors are incredibly useful for a number of reasons, including: 1) student participants are more likely to ask for help from a near-peer mentor than from a faculty advisor, 2) near-peer mentors serve as role models, giving participants an idea of what they can aim for in the next year or two, and 3) the use of mentors allows the program to serve many more participants than it could if it relied on a few time-pressed faculty advisors. Our selection criteria for mentors include essential knowledge (for example, the mentor for a team doing an advanced chemistry research project should have taken physical chemistry), mentoring experience or potential, logistical availability, and a demographic background similar to that of the participants. Mentors don’t need experience with the specific coding language or research topic they will work on with their team. Rather than being the expert in the room, they are expected to help team members work together to find solutions or formulate questions for the faculty advisor.

Mentors are crucial for the success of the program and need to be paid well for their work. Each week of the program, we pay our mentors a competitive wage for 8 contact hours with their team, a 2-hour all-hands lunch meeting, a 2-hour mentor meeting, and 3-4 additional hours to account for preparation. However, we realize that this summer, things may be different for many! You may find that PhD students or Master’s students who cannot work in the lab (but are still paid / on a fellowship) could be excellent near-peer mentors. Just make sure that the mentors know that this is a real commitment that will eat up a significant chunk of time each week.

3. Identify an appropriate online course for each team

We have found that when learning basic coding skills, interactive online classes to learn computer programming (for example, from DataCamp, Udacity or Coursera) motivate and engage students better than books or online texts. Yet, when working individually, most students – especially beginners and historically underrepresented students – don’t finish online classes (Ihsen et al. 2013; Jordan 2015). As a solution, we have found that in teams, where students can work together and support each other, they learn a great deal from an online class.

Each team’s faculty advisor picks a free, clearly structured online class with videos and assignments to teach participants coding skills. We have had good experiences with Udacity’s Exploratory Data Analysis course because this class is suitable for beginners. It does a good job motivating students to think about data and learn R. In early team meetings, participants spend time quietly working on the online class with their headphones on, followed by a team discussion or collaborative problem-solving session. If students encounter difficulty with any of the material, mentors may develop mini-lectures or create their own exercises to facilitate learning. Note, the students’ goal is not necessarily to finish the online course, but to learn enough to perform their research project. 

3B. Suggested classes:

Udacity Exploratory Data Analysis with R https://www.udacity.com/course/data-analysis-with-r--ud651

CodeHS https://codehs.com/ (the faculty mentor or the near-peer mentor needs to create a section on CodeHS; we use the introduction to Python (rainforest)).

Coursera https://www.coursera.org/learn/r-programming (this one is a tip from our UCSF colleague Dr Kala Mehta)

4. Assign each team a simple and engaging research project 

Learning to code without a specific application in mind can feel boring and irrelevant, sometimes leading students to abandon the effort. In our summer program, teams carry out a research project to motivate them to learn coding skills, improve their sense of belonging in science (Jones, Barlow, and Villarejo 2010) and cultivate their teamwork and time/project management skills. Faculty advisors assign each team a research project early in the program. These projects should answer real questions so that participants feel their work is valuable (Woodin, Carter, and Fletcher 2017). The projects should also be relatively simple. Small and self-contained projects that can be completed within a three-week time frame are ideal to ensure completion and make participants feel that their efforts have been successful. For example, past research projects in our program, which reflect the interests of faculty advisors and the students, include writing computer simulations to model the evolution of gene expression, analyzing bee observations from a large citizen science project, examining trends in Google search term data with respect to teen birth outcomes, and building an app for finding parking spots on or near campus.

For 2020, we’d like to encourage you to pick a project that appears extremely simple if you normally use R or Python to make your plots / do stats, but that would be quite challenging if you’re new to coding. We also suggest that – unless the students are already quite advanced – you don’t give them a project that you want to publish on quickly. Nobody needs more pressure this summer.  

Here are some suggestions for simple research projects:

  1. Let students plot the number of COVID-19 cases in their county over time using R. Let them plot the number of cases in 5 different counties on the same figure. Add an arrow for when a stay-at-home order was implemented or terminated. Easy-to-download data are here: https://github.com/nytimes/covid-19-data 
  2. Let students keep track of how many steps they take each day for 10 days using their phone or watch. Let them plot the number of steps per day using R. Let them add a line for the mean. Collect data from 6 people and create a pdf with 6 plots in different colors. 
  3. If you have any data from your lab, let the students plot those data. Try making 4 different plots with the same data (scatter, box, histogram, etc). 
  4. Let students recreate an existing plot from a publication when the data are available. 
  5. Let students analyze (anonymized) data from your class. How strong is the correlation between midterm grades and final exam grades? Do students who hand in homework regularly do better on the test? 
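
To give a feel for how small these projects can be, here is a minimal sketch of the data-handling part of suggestions 1 and 5. It is written in Python rather than R (the suggestions above name R; either works), and it is an illustration under assumptions, not the code we use in the program. The sample rows are invented and only assume the column layout `date,county,state,fips,cases,deaths`; the county names, grades, and the helper names `cases_by_county` and `pearson_r` are hypothetical.

```python
import csv
import io
import math
from collections import defaultdict

# Invented sample rows, assuming the column layout
# date,county,state,fips,cases,deaths (county names and numbers are made up).
SAMPLE_CSV = """date,county,state,fips,cases,deaths
2020-03-01,Alpha,California,00001,1,0
2020-03-02,Alpha,California,00001,3,0
2020-03-03,Alpha,California,00001,6,0
2020-03-01,Beta,California,00002,0,0
2020-03-02,Beta,California,00002,2,0
2020-03-03,Beta,California,00002,2,0
"""


def cases_by_county(csv_text, counties):
    """Return {county: [(date, cases), ...]} with each list sorted by date."""
    series = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["county"] in counties:
            series[row["county"]].append((row["date"], int(row["cases"])))
    return {name: sorted(points) for name, points in series.items()}


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Suggestion 1: one time series per county, ready to hand to a plotting library.
county_series = cases_by_county(SAMPLE_CSV, {"Alpha", "Beta"})
for name, points in county_series.items():
    print(name, points)

# Suggestion 5: hypothetical, anonymized midterm and final exam grades.
midterm = [55, 62, 70, 78, 85, 91]
final = [58, 60, 75, 74, 88, 95]
print("correlation:", round(pearson_r(midterm, final), 2))
```

With real data, students would read the downloaded file instead of `SAMPLE_CSV` and pass each county's list of points to whatever plotting tool the team is learning.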

* Reference: Pleuni Pennings, Mayra M. Banuelos, Francisca L. Catalan, Victoria R. Caudill, Bozhidar Chakalov, Selena Hernandez, Jeanice Jones, Chinomnso Okorie, Sepideh Modrek, Rori Rohlfs, Nicole Adelstein. Ten simple rules for an inclusive summer coding program for non-CS undergraduates. Accepted for publication in PLOS Computational Biology.

Meet Francisca Catalan, SFSU PINC alum and research associate at UCSF (spotlight)

9 Jan


Francisca Catalan, SFSU PINC alum and research associate at UCSF

  1. How did you get into coding? 

I took a regular CS class my second year at SF State. I thought it would be a good skill to have as an aspiring researcher and saw that it fulfilled one of my major requirements. It was a PowerPoint-heavy 8 am class three times a week. I didn’t talk to anyone else in the class and by the end of the semester I found it very difficult to show up. I passed the class but was really devastated about my experience. I thought I could never learn to program, though I never gave up completely. A couple of semesters went by and I saw a friendly flier announcing PINC, SFSU’s program that promotes inclusivity in computing for biologists and other non-computer science majors. I eagerly signed up and started the “Intro to Python” class soon after. Then, with some more programming under my belt, I joined Dr. Rohlfs’ lab and began doing research in the dry lab for the remainder of my undergraduate career.

  2. What kind of work do you do now? 

I currently work at UCSF as a dry lab research associate. Our lab focuses on an aggressive form of brain cancer, glioblastoma. We try to find gene targets for new drug treatments and research the cell type of these cancerous cells in order to fight drug resistance. My main duties now include creating pipelines for our single-cell, RNA-Seq, and Whole Genome Sequencing data. You can read about our lab’s latest study in our new publication in Cancer Discovery! DOI: 10.1158/2159-8290.


  3. How did learning coding skills impact your career?

Coding has opened so many pathways for me. I was able to find a great job at UCSF soon after graduating with my Bachelor of Science in cell and molecular biology and minor in Computing Applications. It has also given me a giant boost of confidence! As a woman of color in STEM, I often felt underrepresented and out of place, but those feelings now quickly subside when I can help my colleagues answer coding questions! It’s motivating to feel like a necessary component of your community when oftentimes you feel pushed out. It’s also impacted my career choices! I know now I want to be a professor in the future. I want to provide access to programming to others in hopes it will open pathways like it did for me!

  4. Do you have any advice for students who are just starting? 

Yes! Don’t give up! It can be really difficult to learn coding, but know that it’s not you; talking to a computer can just be hard sometimes! Continue practicing and ask questions, Google your heart out. Take breaks when necessary, remember to breathe, and keep in mind all the amazing science you will be able to do once you have these skills under your belt!

Meet Simone Webb, Bioinformatics and Immunology PhD student

2 Dec


I am spotlighting scientists who code for my students who are learning to code in Python. Today, I’ve chatted with Simone Webb from the UK. Simone Webb is a PhD student in the group of Professor Haniffa at Newcastle University in the UK.

Pleuni: Hi Simone, how did you get into coding?

Simone: I got into coding during my undergraduate degree, where I took some compulsory statistics and intro to bioinformatics courses.

To be honest, I struggled with it a lot! These courses remain my worst grades during university. However, there was something about it that drew me to it. The maths-based logic of it all really appealed to me at a time where the bio-related content I was learning seemed a lot more uncertain and up for debate. I’m not a natural at it by any means.

I liked how it felt to get an answer correct during our tutorials and stuck with it.

By the time I got to my undergraduate thesis, I realized that my real interest lay in microbiology and bioinformatics. The projects on offer for my thesis didn’t have massive diversity in these fields, so I crossed my fingers and applied for the project led by our first-year bioinformatics tutor – I got in! From then onwards, it’s fair to say that I would always choose coding over wet lab work. My thesis project was purely bioinformatics and I had a very encouraging and hands-on supervisor who was patient with me and taught me a lot to do with coding technique, method and reasoning. After I graduated I knew I wanted to keep coding, whether in research or a non-academic role.

Pleuni: What is your current job or project?

Simone: I’m currently studying for a PhD in bioinformatics and immunology. I now use coding (in both R and Python) to analyze sequencing data. In this work, code is able to help us understand exactly what cells are present in both healthy and diseased tissue, and helps us look further into the role these cells could be playing.

Pleuni: Do you have any advice for students who are starting to learn coding skills?

Simone: If you have an interest in anything bioinformatics related, my advice is to seek out a role model and be brave – ask for their advice and see what you can learn from their experiences! Also, there are active online communities for women in STEM, women who code and people who are Black in academia. Reach out if any of these groups relate to you and know that you are not alone.

You can find Simone on Twitter at @SimSci9!

The ridiculous order of the streets in the Excelsior (SF)

26 Sep

I live in the Excelsior neighborhood in San Francisco. My street is Athens Street. If I walk westwards from my home, I come to Vienna Street and then Naples, Edinburgh and Madrid. If you have any knowledge of the map of Europe, you realize that the order makes no sense!

(Also, why is there Naples, but not Rome, and why Munich, but not Berlin? And why oh why, is there no Amsterdam Street? So many questions!)

Last week, I asked the students in the CoDE lab to create a map to show the ridiculous order of the streets in the Excelsior. They had fun figuring out how to make a map in R, so I thought I’d share their work here. Several students were involved, but my graduate student Olivia Pham did most of the work.

The code is here: http://rpubs.com/pleunipennings/212840


The surprising order of street names in the Excelsior neighborhood in San Francisco. We connected the cities in the order of the streets. London Street is the first city-name street if you enter the neighborhood from Mission Street; just east of London Street is Paris Street, then Lisbon Street, etc. The last city-name street is Dublin Street, which is closest to McLaren Park.


A map of part of the Excelsior neighborhood showing the order of the city-name streets.

How to get started with R

1 Feb


I often get asked how to get started with learning R if there is not currently a class offered. Here is what I recommend:

1. Start with a free online Code School tutorial

First of all, check out this (free) online course: https://www.codeschool.com/courses/try-r
No need to install anything, no need to pay. Students in my bioinformatics class liked this online Code School course a lot. It will not make you a master of R, but it’s a nice starting point.

2. Install R, RStudio and swirl on your computer

Next, it is time to install R and RStudio on your computer. Once you have that, install the swirl package. Instructions for installing R, RStudio and swirl can be found here: http://swirlstats.com/students.html
swirl is an R package that helps you learn R while you are in the RStudio environment. I highly recommend using the RStudio environment! The swirl tutorials teach you the basics of vectors, matrices, logical expressions, base graphics, apply functions and many other topics. Kind words included (“Almost! Try again. Or, type info() for more options.”)

3. Dive in with a great Udacity class …

If you are ready to really dive in (and have some time to invest), try out this great Udacity class: https://www.udacity.com/course/data-analysis-with-r--ud651 (no need to pay for it; you can do the free version). This class is taught by people from the Facebook data science team. They do a great job guiding you through a lot of R coding. Importantly, they always take the time to explain why you’d want to do something before they let you do it. A large part of the course is focused on using the ggplot2 package.

… or start reading The R Book

The R Book is a book by biologist and R hero Michael Crawley. The pdf of the book is available from many websites (for example: ftp://ftp.tuebingen.mpg.de/pub/kyb/bresciani/Crawley%20-%20The%20R%20Book.pdf). Make sure you also download the example data that come with the book (http://www.bio.ic.ac.uk/research/mjcraw/therbook/).

The R Book is a great resource and very clearly written. The students in my lab enjoy reading from it and trying out the code. If you are a biologist, it’ll be fun to work with the biology examples in The R Book.

4. Find others who are using R or learning R.

Learning R is hard. You will get frustrated sometimes. If you know someone who is learning with you or who could help you when you are stuck, things will be easier! If there is no one near you, try to find R-minded people on Twitter or elsewhere online. Also, check out the R tag on Stack Overflow (http://stackoverflow.com/questions/tagged/r) for many questions and answers on R.

Good luck!