
The acknowledgement section of our NSF proposal

25 Aug

A few weeks ago two colleagues and I submitted an NSF proposal. We submitted on a Friday afternoon even though the deadline wasn’t until Tuesday! I am proud that we managed this almost without any deadline stress!

I had fun and we wrote a great proposal

I know that we may not end up getting funded by NSF, but until we get that message, I plan to be very optimistic. We wrote a really neat proposal for a great project. I can’t wait to get started! The ambitious goal of the project is to determine the fitness cost of every possible point mutation in the HIV genome in vivo.

I think nobody likes to write proposals when the success rate is only 5%, but I actually enjoyed working on this proposal and I learned a lot while writing it: both about the biology of our project and about the art of proposal writing. It’s important for me to commit that to paper (OK, screen) so that if NSF decides not to fund us, I will remember that writing the proposal was actually a good experience.

Writing with a newborn

In addition to the many scientists and administrators who contributed to the proposal, I also want to mention how I could write a proposal with a newborn. We started working on the proposal two weeks before I gave birth and we submitted the proposal when our baby was just shy of seven weeks old. The hours that I spent on the proposal were made possible by my mom, who flew in to help, and by the fact that Facebook gives new parents four months of paid paternity leave, so that my husband was also at home during my maternity leave. It was fun to be home together with my husband and we took shifts working and taking care of Maya. Most days I worked on the proposal just two or three hours, so a large part of the work was done by others.

Me in my home office with baby, changing table, a laptop and a grant writing handbook.

It was a huge team effort

Many people were involved in writing the proposal. Many more than I ever expected to be. I want to list them here so that I remember who helped out and also to show that being a researcher doesn’t have to be a lonely affair.

Note that these are only the people I am aware of. Others certainly helped my co-PI Adi Stern.

The main team that wrote the proposal consisted of four people:

  • co-PI Adi Stern (Tel Aviv)
  • postdoc Marion Hartl (SFSU)
  • professional grant writer Kristin Harper
  • myself

At SFSU, people from the Office for Research and Sponsored Programs helped:

  • Rowena Manalo
  • Raman Paul
  • Michael Scott
  • Jessica Mankus
  • Uschi Simonis (vice-dean for Research)

At Stanford there were

  • co-PI Bob Shafer
  • collaborator David Katzenstein
  • Elizabeth White (Katzenstein lab)
  • Holly Osborne (Office for Sponsored Research)

In Tel Aviv

  • Office for Sponsored Research
  • Adi Stern’s lab members brainstormed ideas
  • Maoz Gelbart helped with ideas and figures

Colleagues who read earlier versions of the proposal

  • Sarah Cobey (U Chicago)
  • Sarah Cohen (SFSU)
  • Alison Feder (Stanford)
  • Nandita Garud (UCSF)
  • Arbel Harpak (Stanford)
  • Joachim Hermisson (U Vienna)
  • Claus Wilke (U Texas Austin)

A huge thank you to all these amazing people! I am lucky to be part of such a supportive community.


Why I write my NSF preproposal by hand and to a lay audience

18 Jan

Susan Holmes suggests (here) that it’s best to write your first draft of anything on paper, with an old-fashioned pen, rather than on your computer. She believes that the process of writing by hand helps us clear our thoughts. I think she has a point. So, I am writing this blog post on paper.

I would like to add my own piece of advice for better writing: I like to write my first draft as if I am writing to a friend or family member. For me, this strategy helps to remedy some of the anxiety I have when thinking about the colleagues who may ultimately read my manuscript or proposal, and who may be harsh and skeptical. Writing with a lay person in mind also helps me to use simple words and to get to the point faster.

Years ago, I was struggling with the introduction chapter of my PhD thesis. The audience for this chapter would be my advisor and the other committee members. They were all well established and accomplished researchers in the field of population genetics. I was completely writer’s blocked. What could I write that they didn’t already know? I guess the only real information they were going to get from this chapter was whether I had mastered the material, but I had no motivation at all to write the chapter as a test of my knowledge.
I don’t remember who or what gave me the idea, but I decided to write the chapter as if it was meant for a lay audience. I actually didn’t think that my committee cared about the chapter much anyway, so I imagined an audience of friendly lay people and students who were interested in entering the field, and I started to write for them.

This change of perspective made a huge difference to my writing. Suddenly, I was eager to write and I enjoyed the process. I had no more fear and a clear goal. (If you’re interested, you can download the introduction of my thesis here:  2007_Pennings_Pleuni_ThesisIntroduction).

This week, I am working on an NSF proposal. This is just as daunting and possibly nearly as futile as writing an intro chapter to my thesis (OK, not really). I therefore decided to try the same trick. I will write my first draft as if I’m writing to a friendly lay person, not the NSF committee that will ultimately read and judge my work. In addition to writing to a lay person, I will write my first draft on paper, following Susan Holmes’ advice. Clear thoughts and sentences, here I come!

 

No programming background? No problem! Learn R

14 Jun

Guest post by Rosana Callejas

Rosana Callejas

Can someone with no programming knowledge learn “R”? The answer is yes! My name is Rosana Callejas. I am a Physiology major and a recent graduate of San Francisco State University. I began to learn the programming language “R” at the beginning of February of this year. Despite not having any previous programming experience, I analyzed my first data set of more than 20,000 data points in only a couple of months. Would you like to learn how I did it? Stay tuned.

The power of “R”

So what exactly is “R”? It is a programming language used by many data analysts, scientists, and statisticians to analyze data, perform statistical tests, and make graphs and figures. “R” is a great tool for analyzing large data sets. It has many additional packages that can be downloaded, which allow the user to expand or simplify commands when analyzing data.

How R coded its way into my heart

Dr. Pleuni Pennings, an evolutionary biologist and professor at SFSU, introduced me to this wonderful tool. “I do all my research on my computer,” Dr. Pennings said, as she showed me the open program. At first, the idea puzzled me. In all my years as a biology student, I had never met a biologist like Dr. Pennings, who has made many discoveries from analyzing HIV DNA sequences using R. She explained to me that there is an accumulation of data collected by scientists every day, waiting to be analyzed. Therefore, there is a need for scientists with the skills to interpret and draw conclusions from such large data sets. This interested me as a biologist. I imagined all the new findings that could be made if all the data collected were analyzed. It would definitely contribute to the advancement of science. With this in mind, I embarked on the adventure of learning R.

One command at a time

I began by taking the online course “Exploratory Data Analysis with R” on Udacity.com. The course is composed of 6 lessons, in which I first learned the basics of R and a few basic commands, followed by the analysis of one variable and how to make simple plots. In my learning, I used R and RStudio, which can both be downloaded for free online. I also used data sets provided by Udacity to analyze. In addition, R comes with other data sets that I practiced with. My first graphing assignment was a simple bar plot (Figure 1) that represented friend count for Facebook users of different ages. This task required the graphing package “ggplot2”.

Figure 1. Friend count as a function of age.
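For readers who want to try something similar, here is a minimal sketch of what such a first bar plot could look like in R. It is not the code from the course: the data frame `users` and its columns `age` and `friend_count` are made up for illustration, and the plot is only drawn if the ggplot2 package happens to be installed.

```r
# Toy stand-in for the course data; the values and column names are invented.
users <- data.frame(
  age          = c(18, 18, 25, 25, 25, 40, 40, 63),
  friend_count = c(310, 150, 200, 80, 20, 95, 45, 12)
)

# Mean friend count for each age (base R, no extra packages needed).
by_age <- aggregate(friend_count ~ age, data = users, FUN = mean)
print(by_age)

# Draw the bar plot only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(by_age, aes(x = age, y = friend_count)) +
    geom_col() +
    labs(x = "Age (years)", y = "Mean friend count")
  print(p)
}
```

Summarizing the data first and plotting second keeps the two steps easy to check separately, which helps when a command does not do what you expect.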

As I learned more, I began to work with different packages and new commands, and to make better graphs. I discovered how to add color to the graphs. I learned how to order variables, make subsets, group variables, add new columns to my data sets, work with multiple variables, run correlation tests, and much more. The following are some figures that followed that first one, and show the progress of my learning as I added more detail to that first plot throughout the course.

Figure 2. Median friend count as a function of age by gender.
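The skills listed above (subsetting, adding columns, grouping, and correlation tests) can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the data are invented, the column names `age`, `gender`, and `friend_count` are hypothetical, and the course itself may have used different functions or packages for the same steps.

```r
# Made-up toy data; the real course data set is much larger.
users <- data.frame(
  age          = c(18, 18, 18, 18, 30, 30, 30, 30),
  gender       = c("female", "female", "male", "male",
                   "female", "female", "male", "male"),
  friend_count = c(400, 200, 150, 50, 120, 80, 90, 30),
  stringsAsFactors = FALSE
)

# Make a subset: keep only the users younger than 25.
young <- subset(users, age < 25)

# Add a new column that flags users with more than 100 friends.
users$many_friends <- users$friend_count > 100

# Group by age and gender and take the median friend count per group.
medians <- aggregate(friend_count ~ age + gender, data = users, FUN = median)
print(medians)

# Run a correlation test between age and friend count.
print(cor.test(users$age, users$friend_count))

# Plot one line per gender, only if ggplot2 is installed.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(medians, aes(x = age, y = friend_count, color = gender)) +
    geom_line() +
    geom_point() +
    labs(x = "Age (years)", y = "Median friend count")
  print(p)
}
```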

Figure 3. Friend count as a function of age. In the green graph each point represents 20 data points in the data set. The black line represents the mean friend count. The blue line represents the 50th percentile (the median). The dotted lines represent the 90th and 10th percentiles.

 

Figure 4. The top graph represents friend count as a function of age in months, with the blue line representing the mean. The middle graph represents friend count as a function of age, with the blue line representing the mean. The bottom graph represents friend count vs. age in months rounded, multiplied, and divided by 5.

Patience is the mother of all virtues

Learning R was definitely a challenge. Commands that in theory should work sometimes did not work. As a new user, it was difficult to know exactly what had gone wrong. Fortunately, I had the guidance of Dr. Pennings, who helped me through the process. I also looked for resources outside of Udacity. One great package to use along with R is “swirl,” which is a teaching package. With swirl, I learned commands not taught in the Udacity course. It has multiple lessons that give the user immediate feedback. Patience and persistence are key to learning R. Now that I have seen what R can do, I know it was worth learning.

The possibilities are endless

My favorite feature of R is that the code used in a previous analysis can be saved and reused. R users can also share pieces of code with one another, which helps expand the knowledge among users. If changes need to be made in the middle of an analysis, this is rather simple: the code can be edited and run again, so there is no need to redo the analysis by hand. R can be used to study many different types of data of any size or background. Scientists such as Dr. Pennings make major findings in biology using R.

Although new to R, I was able to begin the analysis of my own data set [1] within only a few months of learning about it. Below is a figure which resulted from the question: Which HIV regimens are most common and in what years? In order to answer this question, many hours of work were invested in preparing the data set, excluding undesired data points, subsetting, color coding, etc., ending up with 6255 HIV data points, which included only the 26 most common unique regimens as a function of time. The graph represents the most common regimens of HIV treatments taken by patients in different years. It is also organized in order of increasing number of drugs per regimen. Each regimen was color coded by whether it includes an NNRTI drug, includes a PI drug, or consists only of nRTIs.

Figure 5. The graph represents the most common regimens of HIV treatments taken by patients in different years belonging either to NNRTI, nRTI, or PI.
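The data preparation behind a figure like this (count how often each regimen occurs, keep only the most common ones, and order them by number of drugs) can be sketched as follows. This is not the analysis code used for Figure 5: the data frame `records`, the regimen labels, and the cutoff `top_n` are all invented for illustration, and the real data set is not reproduced here.

```r
# Invented toy records: one row per patient record, with placeholder
# regimen names ("A+B" stands for a two-drug regimen, and so on).
records <- data.frame(
  year    = c(2000, 2000, 2001, 2001, 2001, 2002, 2002, 2003),
  regimen = c("A", "A+B", "A", "A+B+C", "A+B", "A", "D", "A+B+C"),
  stringsAsFactors = FALSE
)

# Count how often each unique regimen occurs, most common first.
counts <- sort(table(records$regimen), decreasing = TRUE)
print(counts)

# Keep only the most common regimens (the post kept 26; 3 suits toy data).
top_n  <- 3
common <- names(counts)[seq_len(top_n)]
kept   <- subset(records, regimen %in% common)

# Number of drugs per regimen, used to order the regimens.
kept$n_drugs <- lengths(strsplit(kept$regimen, "+", fixed = TRUE))
kept <- kept[order(kept$n_drugs, kept$year), ]

# Regimen-by-year table: the numbers a plot like Figure 5 is built from.
print(table(kept$regimen, kept$year))
```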

As the graph shows, in 1989 and the early 1990s HIV treatment consisted of the single drug AZT, and later, in 1997, NVP. As the years progressed, regimens composed of two drugs became more common. It isn’t until 1996 that we begin to see regimens composed of three drugs. Regimens composed of three drugs are the most abundant and continue to be taken by patients up to 2013, while the single-drug treatments seem to have ceased in 2008. In 2002, we first observe regimens composed of four drugs (although RTV is often not counted as a drug, so these regimens may be considered three-drug regimens as well), which also continue to be used along with the three-drug regimens.

R is a great program for data analysis. I believe that anyone who would like to learn it can definitely do so with persistence. I will continue learning R and analyzing my data set. I hope to use it as a tool for future investigations in my career.

[1] Thanks to Dr Robert Shafer from Stanford University for sharing the data with us!

A reading seminar where every student reads, writes and contributes to the discussion in class

16 Jan

I remember reading seminars as follows: one student spends the entire week preparing a PowerPoint presentation, which often turns out to be stressful for the student and somewhat boring and uninformative for the audience. The other students only glance over the paper, and so any discussion quickly falls flat. I therefore decided to have multiple short presentations without PowerPoint (less preparation, more fun to listen to, plus repetition is good for learning a skill). I also decided to use short writing assignments as homework to make sure that all students were prepared to contribute to the discussion in class. At the same time, I wanted to keep things manageable for everyone.

1. Learning to present: every student does multiple short presentations without PowerPoint.

No PowerPoint: I didn’t want students to spend too much time preparing a presentation. I believe that often, when students spend a lot of time preparing presentations, they focus too much on making PowerPoint slides and not enough on informing the audience and telling a story.

Short presentations: Doing an engaging 45-minute presentation is extremely difficult, and a skill that most postdocs don’t have, so why do we use 45-minute presentations in our graduate seminars? I decided instead to let each student do three 10-minute presentations.

Feedback: After each presentation the presenters got feedback (from the other students and myself), so that they could improve their presentation skills during the semester.

Easy listening: An added benefit of 10-minute presentations is that they are much easier on the audience. Each week started with three student presentations: one on the background and main question of the paper, one on the data and the results of the paper, and one on the conclusion and implications of the paper.

2. Practice writing: every student does a different writing assignment every week.

Graded homework each week: A paper discussion can only work if people have read the paper. If students don’t read, they may spend most of their energy trying to hide that they didn’t read (I know, I was in that situation!). So even though I understand that life and research get in the way of reading, I really wanted to make sure that the students were prepared for the seminar. To do that, I made every student do a written assignment every week that would count towards their grade (unless they were presenting that week).

A different assignment for each student: I had a long list of assignments so that each week, many different assignments were done AND so that over the course of the semester each student did many different assignments. This guaranteed that the students read the paper, but each with a different question in mind.

There were several types of written assignments.

Descriptive: 1. Describe the background and main question of the paper, 2. describe the data and the results, 3. describe the conclusions, 4. describe which virus the paper is about.

Critical: 5. What is your opinion of the paper? 6. What do you think the authors should have done differently? 7. Play the devil’s advocate: why should the paper not have been published?

Summaries: 8. Summarize the paper in your own words, as if writing to a friend, 9. summarize the paper using only the most common 1000 words of the English language, 10. summarize the paper in a graphical abstract, 11. summarize the paper in a tweet.

Meta: 12. Who are the authors of the paper? 13. How often is the paper cited, do you think it is influential?

Short! Each written assignment could not be more than 150 words, to keep the workload manageable for me and for the students.

Surprisingly hard: Some of the assignments were harder than others. Summarizing the paper using only the 1000 most common words of the English language turned out to be very hard, but some of the students did a great job (see here and here). The graphical abstract was also hard for some students, but others liked it just because it was so different from their usual work (see here and here). The “devil’s advocate” writing assignment was always very interesting to read.

Easy: Grading the written assignments was quite easy. I simply gave a plus or minus for 5 categories (answered the question, scientific accuracy, clarity, grammar and word count).

Revisions allowed: After a request from a student, I decided that the students could redo any assignment where they had gotten less than 100% because I believe that feedback is most useful when it can be applied to a revision.

3. Promoting equity: thanks to the written assignments, every student could contribute to every class.

Everyone contributes: One of the nice things about the homework schedule with different assignments for everyone is that in class, I could ask each student about their homework. This way, each student contributed to the class, promoting equity, and the brief discussions of the homework assignments always led to questions from other students. Even if I didn’t ask, some students would volunteer to share information they found while they researched for their homework. For example, I remember someone remarking at the end of a presentation: “In your presentation, you said this result may be very important, but I found that the paper hardly has any citations even though it was published ten years ago, so I think it may not have been picked up by anyone.”

Sharing homework: I also encouraged the students to share their written assignments on the online forum we had for the class, so that the other students (and not just me) could read them. Sometimes they led to interesting forum threads. I also published some of the written assignments on my blog, after asking the students for permission. This way even more people could enjoy them.

Reading about using phylogenetics in court

5 Sep

In my new job at SFSU, I am teaching a seminar on the evolution of human viruses. We are reading one paper every week and every student gets a different assignment for each paper. We’ve done one week now and I am very happy with the results. The paper we read was Metzker et al. (PNAS, 2002), which is about using phylogenetic methods in an HIV infection case that went to court (thanks to Graham Coop for suggesting the paper).

I asked the students if I could publish some of their work. Here we go:

Describe the context and main question of the paper

The Metzker et al. study details the first instance of the admission of phylogenetic analysis as forensic evidence in a criminal case. It sought to determine whether scientific support existed for the proposed viral transmission event between the suspect (via injection of blood from an HIV-positive patient) and the victim by inferring phylogenies of the patient, victim, and HIV-infected control strains from the same geographic region using two loci under different selective pressures. In trees generated from both loci, the isolates from the victim clustered with the patient’s, supporting a close relationship between victim and patient HIV strains. Phylogenetic analysis has previously been used in inferring HIV transmission events, notably in the “Florida dentist case”. Five individuals were inferred to have contracted HIV-1 from their dentist based on the distinct clustering of their strains with the dentist’s relative to geographically similar HIV-positive controls.

Roxanne Bantay

Who are the (main) authors of the paper?

Dr. Michael Metzker, the primary author of Molecular evidence of HIV-1 transmission in a criminal case (2002), is an associate professor at Baylor College of Medicine and Rice University, where he teaches human genetics. Additionally, he is president & CEO at RedVault Biosciences, a technology company that aims to advance personalized genomic medicine. Metzker is also an active researcher in the field of bioinformatics and next-generation sequencing.

The last author of the preceding publication is Dr. David Hillis, who is a current evolutionary biology professor and former director of the biology and bioinformatics department at the University of Texas (Austin). Hillis’ research focuses on experimental laboratory evolution; he believes that by studying this process we can ultimately gain insight into the underlying mechanisms that drive evolution.

Eduardo Lujan

Explain the main results of the paper using only the 1000 most common English words

This paper is about a doctor who tried to kill his girlfriend by using blood from a sick person. The doctor got the blood from their work and stuck their girlfriend during a fight. The important part of this case is the way that they showed that it really was the doctor who made the woman sick. For this case, tiny changes that happened in the thing that made the woman sick were found. These changes can show which person made the other people sick and show the relationships between all of the sick people.   By looking at these changes and the relationships, they showed that the doctor was the one who was at fault for making the woman sick.

Bradley Bowser

(see http://splasho.com/upgoer5/)

Make a graphical abstract of the paper

Peter Manzo

 

 

Thoughts on arXiv and journals

9 Jul

One of the best things about working at Stanford is having lunch outside with my colleagues almost every day. Last Friday it was fairly cold (70 degrees or so, about 20°C) but we are a tough bunch and we were sitting outside.

One of the newer people in the lab asked the others: “Do you publish your manuscripts on the arXiv?” What followed was a brief discussion of the pros and cons of publishing on the arXiv before a paper is published in a journal. Here is my summary.

Pros and cons of publishing on the arXiv

Pros

1. Science goes faster when we share our results faster.

2. Published papers will be better if more people can give feedback early on.

3. There is some evidence (though not from a randomized trial) that papers get cited more when they are first published on the arXiv.

4. Getting your paper “out there” before it is accepted by a journal takes away some of the stress of getting the paper accepted by a journal. Others can already see what you’ve done, and an arXiv-ed paper looks much better on your CV than “in preparation.”

5. In quantitative biology, the arXiv is cool and you will look like a modern 21st century scientist if you publish on the arXiv. But don’t try to impress a physicist with your new-found arXiv-fondness, because they already used the arXiv before most current graduate students were born. If you go for hip, consider publishing your preprint on Figshare, because they allow you to keep track of traffic, and PeerJ Preprints is another new option.

6. If you’re in evolutionary biology, you can benefit from exposure on Haldane’s Sieve if you publish on arXiv (or another preprint server).

Cons

1. The paper may still change a lot and you cannot remove the arXiv-ed version (though you can add a newer version, and I think it is unlikely that anyone looks at an old version).

2. Some journals don’t like to publish arXiv-ed papers, see this list: http://en.wikipedia.org/wiki/List_of_academic_journals_by_preprint_policy

3. If many people read the arXiv-ed version, they may not bother reading the improved journal-version.

Honestly, I am not too convinced of these cons.

So should we do away with publishing in peer-reviewed journals?

I don’t think so. Despite everything that is wrong with journals, I think they are very useful.
Ask yourself: when was the last time you really took the time to read through a paper by someone you didn’t know?
Right, I think that may have been when you were reviewing a paper! And chances are that you were reviewing that paper because an editor asked you. There is not yet a system – outside of journals – that makes sure that a paper gets read and scrutinized by at least a few people. When I tried to publish a somewhat controversial paper on HIV last year, I was annoyed with the peer review system, because I felt it was unfair to a newbie in the field. But without the review system, chances are that my paper would have been ignored entirely. If it wasn’t for journals, how would a person who is not yet known in the field get the attention of the community?

Editors are important hubs in our scientific community

Of course, there are reviewers who do not take their task seriously, and there are scientists who do take time to read papers by unknown scientists even if they are not reviewing, but I bet that both are rather small minorities. I like to review papers, I am happy that my papers get reviewed, and I think that the editors who organize it all are important hubs in our scientific community. We shouldn’t do away with that!

January is for writing

9 Jan

A new month has started and I decided to use it to focus on becoming a better writer. For me, writing is one of my favorite parts of being a scientist. Obviously, writing is also very important for my work, because the written word is one of the main communication tools in science. January is a good month to write. No holidays, no conferences, just four-and-a-half weeks of uninterrupted time to work. So, as I plan to write both a paper and a grant proposal this month, it is the perfect time to “become a better writer.” My goals are simple:

1. Learn to write texts that are more convincing and easier to follow.
2. Learn to write papers and proposals more efficiently.

In order to do this, I plan to follow other people’s advice.

1. I will watch videos and read blogs on writing papers and proposals.
2. I will read the relevant chapters in the HHMI lab management book. If you don’t own this book already, you should definitely order it. It is useful and free.
3. I will read parts of The Elements of Style book and apply what I learn.

A one page narrative

I already watched one video on writing papers and thought it was pretty good. What I took from it is to write a one page narrative of the contents of a paper before starting to actually write the paper. If you cannot write this one page narrative, you are not ready to write the paper.

I use a similar rule – or rather a set of rules – for presentations. Before even thinking about the slides, I need to have the story ready (usually written out entirely). Before I start to write the entire story, I prepare an informal abstract. And before writing the abstract, I will write a one-sentence take home message. This way, I am sure that I don’t end up with slides but no story. The other way around (a story but no slides) wouldn’t be so bad – but in reality this never happens: as soon as the story is there, the slides can be made in almost no time.

By the way, I used this site from the Purdue Writing Lab to check that my use of commas in this blog post is correct. Commas are complicated, especially in English and German.

If you have tips on how to become a better writer, I would love to hear from you!