The acknowledgement section of our NSF proposal

25 Aug

A few weeks ago two colleagues and I submitted an NSF proposal. We submitted on a Friday afternoon even though the deadline wasn’t until Tuesday! I am proud that we managed this almost without any deadline stress!

I had fun and we wrote a great proposal

I know that we may not end up getting funded by NSF, but until we get that message, I plan to be very optimistic. We wrote a really neat proposal for a great project. I can’t wait to get started! The ambitious goal of the project is to determine the fitness cost of every possible point mutation in the HIV genome in vivo.

I think nobody likes to write proposals when the success rate is only 5%, but I actually enjoyed working on this proposal and I learned a lot while writing it: both about the biology of our project and about the art of proposal writing. It’s important for me to commit that to paper (OK, screen) so that if NSF decides not to fund us, I will remember that writing the proposal was actually a good experience.

Writing with a newborn

In addition to the many scientists and administrators who contributed to the proposal, I also want to mention how I could write a proposal with a newborn. We started working on the proposal two weeks before I gave birth and we submitted the proposal when our baby was just shy of seven weeks old. The hours that I spent on the proposal were made possible by my mom, who flew in to help, and by the fact that Facebook gives new parents four months of paid paternity leave, so my husband was also at home during my maternity leave. It was fun to be home together with my husband and we took shifts working and taking care of Maya. Most days I worked on the proposal for just two or three hours, so a large part of the work was done by others.

Me in my home office with baby, changing table, a laptop and a grant writing handbook.

It was a huge team effort

Many people were involved in writing the proposal. Many more than I ever expected to be. I want to list them here so that I remember who helped out and also to show that being a researcher doesn’t have to be a lonely affair.

Note that these are only the people I am aware of. Others certainly helped my co-PI Adi Stern.

The main team that wrote the proposal consisted of four people:

  • co-PI Adi Stern (Tel Aviv)
  • postdoc Marion Hartl (SFSU)
  • professional grant writer Kristin Harper
  • myself

At SFSU, people from the Office for Research and Sponsored Programs helped:

  • Rowena Manalo
  • Raman Paul
  • Michael Scott
  • Jessica Mankus
  • Uschi Simonis (vice-dean for Research)

At Stanford there were

  • co-PI Bob Shafer
  • collaborator David Katzenstein
  • Elizabeth White (Katzenstein lab)
  • Holly Osborne (Office for Sponsored Research)

In Tel Aviv

  • Office for Sponsored Research
  • Adi Stern’s lab members brainstormed ideas
  • Maoz Gelbart helped with ideas and figures

Colleagues who read earlier versions of the proposal

  • Sarah Cobey (U Chicago)
  • Sarah Cohen (SFSU)
  • Alison Feder (Stanford)
  • Nandita Garud (UCSF)
  • Arbel Harpak (Stanford)
  • Joachim Hermisson (U Vienna)
  • Claus Wilke (U Texas Austin)

A huge thank you to all these amazing people! I am lucky to be part of such a supportive community.

Using deep sequencing data to estimate selection coefficients in HIV

28 Apr

Messer, P. W., & Neher, R. (2011). Estimating the strength of selective sweeps from haplotype diversity data. Genetics.

I recently reread this paper by my colleagues Philipp Messer (used to be my office mate at Stanford) and Richard Neher (who works on the population genetics of HIV, just like I do). I thought it’d be worth writing a short blog post about this paper because it has some really nice ideas but it is quite technical and you may not have read it.

Selective sweeps in HIV

Selective sweeps happen in HIV when the virus fixes immune escape mutations or drug resistance mutations. Often, we don’t have good enough time series data to determine the frequency path of the beneficial mutation (i.e., how fast the beneficial mutation increases in frequency in the viral population). Without the frequency path, it is hard to quantify the selection coefficient of the beneficial mutation: how much fitter is a virus that carries it than the virus it replaces?

The authors of the paper present a new method to estimate the selection coefficient of a beneficial mutation. The method requires deep sequencing data from a population in which a beneficial mutation has recently gone to fixation. The method is applied to HIV sequences from patients in whom a drug resistance mutation or an immune escape mutation has just gone to fixation. It seems to me that the method may be especially useful for drug resistance mutations because they may go to fixation rapidly and at unpredictable times, so that it is hard to follow their frequency path. The proposed method just requires a sample taken after fixation has happened.

The idea

The method is based on the following idea: If the selection coefficient of a beneficial mutation is very high, then the selected allele will quickly reach a high frequency without accumulating many new mutations. But if the selection coefficient is not so high, then it will take more time for the selected allele to reach a high frequency, and during this time it will accumulate new mutations.

New, neutral mutations that occur on the background of the beneficial mutation will be dragged to a higher frequency by the beneficial mutation. If a new mutation occurs on the background of the beneficial mutation very early, when there is only one copy of the beneficial mutation, then the frequency of the new mutation will always be the same as the frequency of the beneficial mutation. The two will likely fix in the population together. If, however, the new mutation occurs when there are already 8 copies of the beneficial mutation, then the new mutation sits on 1 of those 8 copies and will likely reach approximately 12% frequency (like the red fraction of the population in the figure).
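The rule of thumb in this paragraph can be written down in a couple of lines. This is only a sketch of the blog's back-of-the-envelope reasoning, not the estimator from the paper: it assumes that every copy of the beneficial mutation leaves about the same number of descendants, and the function name is made up for illustration.

```python
def hitchhiker_final_frequency(copies_at_mutation: int) -> float:
    """Approximate final frequency of a neutral mutation that arises on one
    copy of the beneficial mutation when `copies_at_mutation` copies exist.

    Assumption (illustrative only): all copies contribute equally to the
    final population, so the new mutation ends up on 1/k of the viruses.
    """
    if copies_at_mutation < 1:
        raise ValueError("there must be at least one copy of the beneficial mutation")
    return 1.0 / copies_at_mutation


# The two cases from the text: 1 copy (fixes together) and 8 copies (about 12%).
for copies in (1, 8):
    print(copies, hitchhiker_final_frequency(copies))
```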

This figure shows how earlier mutations on the background of the beneficial mutation reach higher frequencies. (Fig 1 A in the paper)

In a fast sweep, the “5 copy moment” goes by quickly

For a new, neutral mutation on the background of the beneficial mutation to ultimately reach a frequency of 20% in the population, it needs to occur when the beneficial mutation is present at approximately 5 copies. The new mutation then occurs on one of the 5 copies, and is thus present on 20% of the viruses with the beneficial mutation. If the beneficial mutation fixes, the new mutation will have a population frequency of around 20%. In a slow sweep, the beneficial mutation may spend several generations at around 5 copies, whereas in a fast sweep, the “5 copy moment” goes by quickly. Similarly, a mutation that happens when there are 10 copies may reach a frequency of 10%, and one that happens at 100 copies may reach 1%. If we have many sequences from the population (say, 1000), we can look at all the new mutations and their frequencies and determine how fast the sweep went, or what the frequency path of the beneficial mutation was. If we know the frequency path, we can estimate the selection coefficient of the beneficial mutation.
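To get a feeling for how quickly the “5 copy moment” passes, here is a toy calculation. It assumes the beneficial mutation grows deterministically as exp(s * t) copies after t generations, which is a simplification I am making for illustration (the real early phase of a sweep is stochastic, and this is not the calculation from the paper). The function name and the two values of s are made up.

```python
import math


def generations_near_copy_number(copies: int, s: float) -> float:
    """Generations an allele growing as exp(s * t) spends between
    `copies` and `copies + 1` copies.

    Toy assumption: deterministic exponential growth with rate s per generation.
    """
    if copies < 1 or s <= 0:
        raise ValueError("need copies >= 1 and s > 0")
    return math.log((copies + 1) / copies) / s


# Compare a slow sweep and a fast sweep at the "5 copy moment".
for s in (0.01, 0.1):
    print(s, generations_near_copy_number(5, s))
```

In this toy model, a tenfold higher selection coefficient makes the window at 5 copies ten times shorter, so fewer new mutations have a chance to arise there.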

Richard and Philipp used their method on HIV data because these data are deep enough to do this.

This is a sweep of a drug resistance mutation. The inset shows the genetic distances between the most common haplotypes in the dataset. All haplotypes have just one new mutation, except haplotype 13 which has 2. The main figure shows the ranks of the haplotypes on the x-axis vs their abundance (relative to the haplotype that had no new mutations) on the y-axis. Haplotype 1 (with 1 new mutation) has approximately frequency 0.05, so it must have occurred when there were around 20 copies of the beneficial mutation. The estimated selection coefficient is 0.07. This is figure 6 A in the paper.
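The step from “frequency 0.05” to “around 20 copies” is just the 1/frequency rule of thumb read backwards. A minimal sketch, assuming the relative abundance in the figure can be treated as a population frequency (the function name is hypothetical and this is not the paper's estimator):

```python
def copies_when_mutation_arose(final_frequency: float) -> float:
    """Rough number of copies of the beneficial mutation that existed when a
    hitchhiking mutation with the given final frequency arose.

    Assumption (illustrative only): final frequency is about 1 / copies.
    """
    if not 0.0 < final_frequency <= 1.0:
        raise ValueError("frequency must be in (0, 1]")
    return 1.0 / final_frequency


# Haplotype 1 in the figure has a frequency of approximately 0.05.
print(round(copies_when_mutation_arose(0.05)))
```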

Use the method to study new infections?

I wonder whether this method could be used to see how quickly a new HIV infection is growing in a person, if we had deep sequencing data from a newly infected person.