Monday, July 29, 2013

The Science OR

Today I got an email via Ecolog, informing me of the birth of a new way for scientists to disseminate their research. Launched by Queens University a couple of weeks ago, SciOR has as one of its main objectives bringing accountability to the article review process.

The idea that reviewers can say whatever they want and reject papers that conflict with their own beliefs, all behind a veil of anonymity, is not new. Nor is the idea that journal editors select what knowledge is published, and therefore the sphere of knowledge that scientists have. So how does SciOR (or Science Open Reviewed) claim to offer a way around this?

1. Reviewers register with the website and advertise their reviewing experience and offer a list of topics they feel qualified to review papers on. 

2. Authors post paper titles and abstracts as a sales pitch for their papers; SciOR provides a platform for potential reviewers to contact the authors and offer their reviewing services.

3. Authors pick from the list of offers (or invite new ones), and both author and reviewer complete a No-Conflict-of-Interest (NCOI) declaration.

4. Authors pay the reviewers, if that was part of the agreement, and SciOR facilitates the transaction.

5. The authors revise and re-upload the paper, asking for more reviews if they so wish.

6. SciOR serves as a kind of marketer for these articles; editors from other journals (or the in-house Proceedings of Science Open Reviewed) pick from the rack of finished reviewed products. They then contact the authors, and the authors unpost the paper.
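For what it's worth, the six steps above can be sketched as a simple state machine. Everything here is hypothetical - the class and method names are mine, not SciOR's - it's just a way to make the lifecycle concrete:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the SciOR submission lifecycle described above.
# None of these names come from SciOR itself.

@dataclass
class Submission:
    title: str
    abstract: str
    status: str = "posted"   # posted -> under_review -> revised -> picked_up
    reviews: list = field(default_factory=list)
    ncoi_signed: bool = False

    def accept_reviewer(self, reviewer: str, paid: bool = False):
        """Steps 3-4: author picks a reviewer; both sign the NCOI declaration."""
        self.ncoi_signed = True
        self.reviews.append({"reviewer": reviewer, "paid": paid, "report": None})
        self.status = "under_review"

    def receive_report(self, reviewer: str, report: str):
        for r in self.reviews:
            if r["reviewer"] == reviewer:
                r["report"] = report

    def revise(self):
        """Step 5: author revises and re-uploads, possibly asking for more reviews."""
        self.status = "revised"

    def unpost(self):
        """Step 6: a journal picks the paper up and the author unposts it."""
        self.status = "picked_up"
```

Whether SciOR's actual platform looks anything like this, I have no idea - but writing it out does make clear how much the author, rather than an editor, drives the process.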

I don't really have enough experience with the peer review process to know whether this will work, but the idea of paying reviewers seems a little weird. I don't like the SciOR people's assumption that many reviewers don't offer their services "because they are nice people wanting to help advance science", but rather because they enjoy the power the position gives them. Wouldn't money push things the other way? If people are interested in power rather than the advancement of science, wouldn't that hold true for authors too, meaning a reviewer motivated by money could become a people-pleaser, letting things slip through? Sure, there is a second round of editing before complete acceptance, but editors are not specialists in the subject and therefore may not catch mistakes. Meanwhile, the reviewer has cash in hand and the author another publication to his/her name.

Furthermore, is this really any better and more open than the traditional publication model? First, if journals are going to send the paper out for review again, why bother marketing a reviewed product? Won't this retard the publication timeline? Second, external journals are still picking and choosing the articles they think are interesting, while the editor of the Proceedings of SciOR has his/her say on the remainders. Editors are still controlling what we know. Foucault lives on.

What do you think - can ScienceOR resuscitate science communication?

Thursday, July 25, 2013

X, Y, (and Z?)

Since it is my lab tech Rebecca's last full day tomorrow before she leaves for grad school in Florida, I thought I would write about one of her favorite topics: sex determination.

Let's start with us humans - as you know, humans with two copies of the X chromosome are female, and those with two different ones (XY) are male. But what if you are missing an X or a Y chromosome? What if you have an extra sex chromosome? XYY individuals aren't "super men", but XXYY (or XXXY or XXY) individuals are sterile males. Women with an extra X chromosome (XXX - trisomy X) are developmentally delayed, so there is such a thing as being "too much woman". But missing an X chromosome results in Turner's syndrome for women (XO), and spontaneous abortion for males (OY).

So what does this tell us about these chromosomes and sex determination? Well, you HAVE to have a Y chromosome to be male, and an X chromosome to live, but having more X chromosomes doesn't make you more of a woman - it just makes you sicker. This is because the X chromosome actually carries a lot of important information - including genes for sperm production - while the Y chromosome has been shrinking over millions of years and pretty much just carries a single important gene. This gene, SRY (pronounced "sorry"), is a regulator for testis development early on, but the genes it controls are on other chromosomes. Thus came the (in)famous statement that Y chromosome shrinkage may be driving men extinct - although this has been disproved. One reason for this is that the SRY gene alone can determine maleness, and it is possible for it to insert into the X chromosome (XX males).
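The rule of thumb above - a Y makes you male, at least one X keeps you alive - can be boiled down to a toy classifier. This is of course a gross simplification of human development, purely to make the karyotype list concrete:

```python
# Toy rule-of-thumb classifier for the human karyotypes mentioned above.
# A simplification for illustration only: real development is far messier.

def classify(karyotype: str) -> str:
    x = karyotype.count("X")
    y = karyotype.count("Y")
    if x == 0:
        return "not viable"   # OY: no X means no essential X-linked genes
    if y >= 1:
        return "male"         # SRY on the Y triggers testis development
    return "female"           # one or more X, no Y

for k in ["XX", "XY", "XO", "OY", "XXY", "XXX", "XYY"]:
    print(k, "->", classify(k))
```

Note that the toy rule says nothing about fertility or health - XXY comes out "male" and XXX comes out "female", exactly because extra X copies don't flip the sex, they just cause problems.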

Look how tiny the Y chromosome is compared to the X chromosome!

But what about other organisms? That fly buzzing over your half-rotten bowl of fruit you haven't quite managed to finish? Femaleness is decided by having two X chromosomes, rather than maleness being decided by the presence or absence of a Y chromosome. Those cockroaches crawling out of your walls? They only have X chromosomes, with males having a single copy and females having two. And that sparrow outside your window? It is the opposite of humans - males have two of the same chromosome (ZZ) while females have two different ones (ZW).

But perhaps the most intriguing method of sex determination is found in sea turtles (and alligators), whose sex is determined by egg incubation temperature. Hot eggs become female, and cooler eggs become male; the temperature difference is very slight, so a mother can somewhat control the sex ratio of her offspring by rearranging eggs. However, some researchers worry that the cooler temperatures needed to produce males will not occur under a future climate, and so the reptilian world will be run by females. Of course, females are generally better at spacing sexual encounters than males, so a female-shifted population may not be all bad and could mean more sea turtles.
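Here is a minimal sketch of how temperature-dependent sex determination works, assuming a single pivotal temperature of about 29 C - roughly right for many sea turtle populations, though it varies by species and beach, so treat the number as a placeholder:

```python
# Illustrative sketch of temperature-dependent sex determination (TSD).
# The pivotal temperature is an assumption for illustration; real nests
# also produce mixed clutches near the pivot rather than a hard cutoff.

PIVOTAL_C = 29.0

def hatchling_sex(incubation_temp_c: float) -> str:
    # Warmer nests produce females, cooler nests produce males.
    return "female" if incubation_temp_c > PIVOTAL_C else "male"

nest = [27.5, 28.9, 29.4, 30.2, 31.0]   # hypothetical egg temperatures (C)
print([hatchling_sex(t) for t in nest])
# -> ['male', 'male', 'female', 'female', 'female']
```

The tiny spread of the hypothetical nest temperatures is the point: a degree or two, the kind of difference a mother can create by rearranging eggs, is enough to flip the outcome.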

How many methods of sex determination can you count?

Or what about organisms which can reproduce without sex (an administrator at denies Jesus was conceived this way). Or those which switch sex at some point in the lifecycle (think Nemo). Or earthworms, which just serve as whatever sex they feel like, and usually have sex with another worm, but can also just fertilize their own eggs.

There are so many more organisms - like sharks (thanks Rebecca!)- that we know nothing about. Understanding sex determination is more than just an intellectual question - it has important implications for managing wildlife, whether it be on the endangered species list, our menu, or a novel invasive species.

Sunday, July 21, 2013

Now, now...share nicely!

In lab meeting a couple of years ago, we discussed whether making government-funded ecology research publicly available would actually benefit science. The general consensus was that while the public should have the right to access research its tax dollars have paid for, making data open would not really benefit them or science. My labmates argued that there is already too much data and too few people with the knowledge necessary to make meaning from it. Furthermore, they argued, the schedules on which some grants require data to be made publicly available would force researchers to take time away from science during peak field season in order to enter and upload the data. And I followed along.

However, attitudes are shifting. Recently there has been a flurry of papers and blog posts on open data and what it means for ecology. For example, in a really nice article in Frontiers in Ecology and the Environment, Hampton and colleagues argue that if ecologists are to survive, they must both share and use shared data. Yet in a survey, the authors found that fewer than half of the papers produced using NSF funds had also published some or all of the data used to write the paper. As another incentive to "open" data, the authors argue that there are instances - such as when rapid responses to environmental crises are needed - when open data is used more extensively than what they refer to as "dark data". Thus worries about data overload and lack of relevance appear to be unfounded; the government needs bang for its buck, not tree-hugging.

Joern Fischer, a professor at Leuphana University, responded to this paper on his blog, stating that while he believes sharing is a nice idea, in practice there is no shortage of data, and letting people who are not intimate with the sites from which the data were collected analyze it is dangerous. Ecology, apparently, is a touchy-feely science which cannot be reduced to data points that can be used to look for larger global patterns - a point which the Hampton paper also brings up.

But I would argue that 1. getting too intimate with your site is dangerous (you start seeing patterns which aren't there, so you MAKE them there when you do statistical analyses), and 2. we really just need more complete metadata, including many pictures of research sites throughout the seasons. For example, there have been fires in various plots at the Boston Area Climate Experiment, and they have been logged in the online shared lab notebook. However, to my knowledge, this information is only accessible to people working at the site. "Hidden" metadata like this must be made available to anyone reading papers and using the associated data to complete a meta-analysis of climate warming effects themselves. 

Another point that Joern brings up is that field ecologists will do the hard work of collecting data and have to publish in smaller, regional, less-prestigious journals, while the modelers sit at their desks, distant from the field, and compile all this data into articles the top journals are begging for. I have a number of gripes with this statement. First, if you are doing ecology to get publicity, you are in the wrong field. That applies to all desk-, lab-, and field-bound types. Second, this separation between writers and doers is ancient - how many techs do biomedical labs have, and yet PIs write the paper with no input from the technicians about what funky things happened along the way? Third, having gone from an almost exclusively field-based position to an almost exclusively computer-based one, I would do anything to be spending my summer outside looking at nature's pixels; working at a computer is not some lazy-ass bliss. Nothing is. Fourth, most ecological data collection can be done by minimally-trained volunteers (Earthwatch actually requires that projects it funds use volunteer data collectors extensively); I reckon the future of ecology will be a PI with a model or question they want to ask going to public data, identifying a hole, and involving the public to collect that data, and possibly analyze it. It seems like a grant-writer's dream given the current funding requirements.

So what are we really worried about? The idea of more work? Being responsible for a broader array of literature? Isn't it our job to understand the world? Ecologists don't write grants which say "I want to understand exactly what happens in the four 6m*6m plots I will be studying", but rather "I will design a study using four 6m*6m plots superficially representative of the broader environment with the hope of understanding patterns and processes in ecology which can be extended to larger spatial scales". 

But to scale up in this day and age, we have a responsibility not just to conjecture, but to actually test our conjectures. If nobody is asking the same question (or it has been asked, but the data has been analyzed inappropriately), and we only have published results to go on, how will we do this? We can ask people for their raw data, but emailing busy professors who have to dig up datasets not necessarily formatted for sharing is a time-consuming process.

It's time to go beyond the costs of taking the time now to put your data in a clear format for others (and you a few years down the line) to access, and to think long-term. That is not to say that I think all data should be analyzed blindly without respect to site intricacies; we don't know what factors are important in ecological data, and how they may differ with time and space. However, looking over larger landscapes allows us to examine broader patterns and identify best practices for land management in the absence of finer resolution data, and if the metadata we have does not predict responses of interest at a broader scale, we have a reason to apply for more funding to do field work and ask why. 

For a field so obsessed with statistics, such aversion to testing the effect of increasing sample size seems ridiculous. 

For a more positive spin on open data, Chris Lortie of York University has made a pre-print on the role of open data in meta-analyses available here.

Saturday, July 13, 2013

Cows can fly!

At the Gordon Conference last week, I was introduced to flying cows (aka Hoatzin, or stinkbirds), which, like happy cows, feed almost exclusively on leaves. Because a diet composed exclusively of leaves is incredibly poor in nutrients and hard to digest, like cows, hoatzins use microbes to ferment the food they eat. 

A hoatzin. Hoatzins are awesome not only because they are "flying bioreactors", but also because they are a bit like modern-day versions of Archaeopteryx, the ancient gliding bird ancestor whose wing claws enabled it to climb up trees. Hoatzins live in the Amazon basin.

Both animals host a wide array of bacteria which make cellulases and lignin-degrading enzymes the host animal cannot. These enzymes break down leaf components such as cellulose (long strings of glucose linked together) and lignin (the irregular, phenolic (ringed) compounds which give the leaf structure), which the microbes ferment into short-chain fatty acids such as butyric acid (which gives Parmesan its "distinctive" smell), propionic acid (which smells like really bad sweat), and acetic acid (as in vinegar). Because this process is relatively slow, the animals must eat a lot of food and have a large fermentation chamber; hoatzins are poor flyers and have an extra bump on their chest to help them balance on branches so their full gut doesn't topple them, and the cow rumen is so big you could probably fit an adult human in it, though I don't think anyone has tried.

Compare how much space the crop - the pouch birds use to store food when they over-gorge themselves - takes up in the hoatzin (left) compared to the chicken (right). This is where the "pre-digestion" of vegetation occurs in the hoatzin. Small amounts of fermented fluid are released into the small intestine, where the short-chain fatty acids can be absorbed.

Cows and hoatzins aren't the only animals which depend on microbes to break down their food. We too depend on microbes, except the majority of ours live in our large intestine and feed on our "leftovers", because most of our nutrients are absorbed in the small intestine. Research indicates that some other organisms, such as the giant panda, have lost some of the ability to degrade complex plant matter, and their genomes contain fewer genes encoding enzymes involved in this process than those of their nearest omnivorous relatives. This might explain why there have been reports of mother pandas feeding offspring their feces - populating your gut with the right microbes is obviously important if you cannot digest your food yourself.

Of course, pandas aren't the only animals to practice coprophagy (poo-eating). Babies do it. Dogs do it. And rabbits and rodents like guinea pigs do it. The last two animals are relatively easy to explain...they are hindgut fermenters, which means the majority of the microbes responsible for breaking down the plants they eat live in a part of the gut which comes after where the majority of absorption occurs. Therefore, in order to get all of the nutrients out of the food they have taken in, the food has to pass through the gut a second time. But babies and dogs...let's just say I don't kiss them.
If you want to learn more about poo, Wikipedia has your a** covered

Monday, July 8, 2013

Baas-Becking for Trouble?

I am currently at my first conference (the GRC on Applied & Environmental Microbiology), and I thought it would be good to write about some of the topics being covered. Last night, the conference opened with a discussion of the ubiquity of microbial "species", so I thought that would be a good place to start here too.

Perhaps one of the most provocative statements made by a microbiologist to date is Baas Becking's 1934 claim that "alles is overal: maar het milieu selecteert". This translates to "Everything is everywhere, but the environment selects", which I interpret as: through wind and wave, microbes have the ability to disperse anywhere on the planet, but whether or not they are able to thrive and grow depends on their needs being met.

You may think this sounds a bit obvious - how could something like Neisseria gonorrhoeae, the human-dependent bacterium which causes gonorrhea, be found surviving and thriving in Antarctica, thousands of miles from the nearest human? And we know that the obligate pathogen Bacillus anthracis, which causes anthrax, is not in most people's lungs, because if it were, they would be dead.
And yet this question is under particularly intense debate at the moment. But why?

To understand this question, we have to consider the journey environmental microbiology has taken since Baas Becking made this statement. Twenty or thirty years ago, if you wanted to know whether your microbe of interest was everywhere, you would have to take samples of water or soil or rocks from all different places, and then use a series of different growth conditions to try and enrich for the microbe. But not all microbes are culturable using current techniques, or perhaps at all, so we are missing part of the picture. Perhaps more importantly, a given microbe may be very easily cultivated and identified in samples from some places, yet present but impossible to culture from another place, meaning that it may be present - even thriving - but not detected. Shockingly, this is the case for some strains of fecal bacteria used as indicators of water quality - they may become unculturable after passing through wastewater treatment, making it difficult to assess the safety of the effluent.

Fortunately, the rapid growth of new culture-independent methods for detecting microbes in the environment means that our understanding of who is present (though not always what they are doing) is much less of a problem. In these methods, researchers collect a piece of the environment (seawater, soil, leaves, rocks), extract the DNA, and, using microbe-specific DNA fragments as primers to initiate sequencing reactions, sequence the DNA (usually just the ribosomal RNA sequences) in the sample. These rRNA sequences are like barcodes for the microbes, and the most widely-used definition of bacterial "species" is based on similarity of this sequence. Thus by sequencing just a short stretch of DNA, we can see who is present, and theoretically we can detect any microbe that is present whether or not we know anything about its preferred growth conditions. This makes it much easier to answer the "everything is everywhere" question! Theoretically.
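To make the "barcode" idea concrete, here is a minimal sketch of binning sequences into species-like units (often called OTUs) by similarity. Real pipelines align reads and correct errors first; this toy version just compares equal-length strings, using the ~97% identity cutoff commonly used to delimit bacterial "species":

```python
# Minimal sketch of clustering rRNA "barcodes" into species-like units.
# Toy version: assumes equal-length, pre-aligned sequences. The 97%
# identity threshold is the conventional bacterial "species" cutoff.

def identity(a: str, b: str) -> float:
    """Fraction of positions at which two equal-length sequences match."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def cluster(seqs, threshold=0.97):
    otus = []  # each OTU is a list; its first member acts as the centroid
    for s in seqs:
        for otu in otus:
            if identity(s, otu[0]) >= threshold:
                otu.append(s)
                break
        else:
            otus.append([s])  # no existing OTU is close enough: start a new one
    return otus

# With 10 bp toy reads, a single mismatch is only 90% identity,
# so the third and fourth reads each found their own OTU.
reads = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC", "TTTTACGTAC"]
print(len(cluster(reads)))  # -> 3
```

The arbitrariness the post complains about later is visible right in the `threshold` parameter: move it and the number of "species" changes.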

Even with this enhanced ability to see who is where, we are still debating whether everything is everywhere, possibly because, as my PI pointed out, this question means something very different today than it did the day it was first posed. In a study examining seasonal changes in the microbial community at a site in the English Channel, it was noted that organisms previously thought to be lost from the community were in fact present in low numbers, possibly deeper in the water column; this is the so-called microbial seedbank hypothesis. Microbes are everywhere, in low abundance and in various stages of dormancy, and the environment selects from this pool. Here, time (everywhen) is used as an analog for everywhere.
But other studies have concluded that, in fact, everything is not everywhere. For example, in a study utilizing data from thousands of samples taken all over the global oceans, researchers from Woods Hole found that in some instances geographical proximity, rather than environment type, determines whether a given microbial species is present. The environment selects, but dispersal limitation also plays a key role.

Some researchers have responded to the observation that everything is not everywhere by stating that not all microbes are found in all samples because "sequencing isn't deep enough" - that is, not enough DNA has been sampled and sequenced from the environment - and some of the "singletons" (sequences which appear only once in a sample), which are routinely assumed to be sequencing errors and therefore discarded, may be real, though rare, organisms. Furthermore, there is always at least some sequencing bias - the primers used to initiate sequencing runs may not bind all bacterial genomes equally or at all, meaning that some microbes are missed. Even if we could sequence all kinds of bacteria, the depth necessary would come at a high cost: Tim Vogel of the Ecole Centrale de Lyon estimated that we would need about a thousand Illumina sequencing runs to get all the microbes in a gram of soil, which would cost millions of dollars per sample. Of course, as many of the speakers and commenters brought up last night, a much cheaper way to prove that everything is everywhere is to define everything at a broad taxonomic level (i.e. bacteria vs. E. coli O157:H7 substrain xxx) and everywhere at a large scale (for example, on this continent)!
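Vogel's estimate is easy to turn into a back-of-envelope calculation. The per-run cost below is my placeholder assumption, not his figure; only the order of magnitude matters:

```python
# Back-of-envelope version of Vogel's estimate above. runs_needed comes
# from his talk; cost_per_run is a hypothetical placeholder (2013-era
# Illumina runs cost on the order of thousands to tens of thousands
# of dollars).

runs_needed = 1_000      # estimated runs to capture all microbes in 1 g of soil
cost_per_run = 10_000    # assumed dollars per sequencing run
total = runs_needed * cost_per_run
print(f"~${total:,} per soil sample")  # -> ~$10,000,000
```

Even if my placeholder is off by a factor of ten in either direction, the answer stays firmly in "millions of dollars per sample" territory, which is the point.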

So after all that, is it worth it to try and sequence all these bacteria to see if everything really is everywhere? Last night, Rob Knight told us that knowing whether everything is everywhere is very important for understanding how to treat patients. Take a patient about to undergo chemotherapy that will wipe out his immune system. If a potential pathogen is already in his body, putting him in a clean room will do little good. But if it isn't, then taking this preventative approach could save his life. But in other systems, such as soils, there appear to be enough microbes doing the same thing, and enough stochastic extinction and local recolonization of species, that whether everything is everywhere isn't a particularly good (or "biologically informative") debate to have. Maybe we need to look at whether functions - rather than arbitrarily defined species - are everywhere, and what functions the environment selects for.

Thursday, July 4, 2013


This post finishes up the trio of open access posts here on my blog. We began with my own naive view as a scientist in training, before moving on to my mum's position as an employee of a large publishing company. Here we finish up with a post by my scientist-science communication guru grandfather, Jack Meadows, as a response to both our views. I hope you enjoy the "stop the worrying and excuses and get on with it" position he takes. You can read more about his opinions on open access here, where he was interviewed by Richard Poynder.
Remember how this Open Access thing started. It mainly stemmed from a gripe by research-based institutions in the latter part of the last century. They asked - putting it crudely - why they should supply information for free to publishers, only to be charged heavily to have it back again. Things came to a head then for a number of reasons. For example, in the days of hot-metal printing, publishers had to supervise the transition from the input MS to the printed page. By the 1990s, authors were expected to prepare their own MSS in print-ready form: neatly transferring part of the effort from the publisher to the institution. More importantly, publishers had started to assume that they owned the copyright in published papers. (Earlier in the last century, it was generally accepted that authors retained the copyright.) Irritatingly, this claim was only made for university-based authors: publishers accepted that governments could claim the copyright in any material published by their own researchers. At the same time, the Internet was making it possible for unlimited direct contact between authors and readers. So, it was asked - why should not research papers be transmitted directly from author to reader without going via a publisher? Such thoughts soon led on to the exploration of Open Access publishing.

Publishers, of course, had an answer to these various institutional complaints. Their function, they said, was to provide 'added value'. They took in the literary creations of researchers, polished them into an acceptable form for reading, and then circulated them to potential users. Above all, they provided the quality control mechanism which ensured that only acceptable research was published. The control mechanism usually runs as follows. Papers are submitted by authors to editors, who may well be fellow-academics, who, in turn, farm them out to (mainly academic) referees. Most of the academics, however, provide their input either cheaply, or free of charge. From an institutional viewpoint, therefore, the publishers' arguments actually bolstered the institutions' own case: the quality control mechanism is parasitic, since the institutions pay the people involved. (To be fair, publishers have a case for arguing that the relationship is actually symbiotic.) However, everybody - authors, publishers, institutions, readers - asserts that peer review is essential when publishing research. Any new method of publishing must take account of this. So it is worth looking at the activity a little more closely.
Quality control has become the shibboleth of research publication. Publish in an unrefereed journal and you join the ranks of the damned (or at least the ignored). Yet the hardline form such control tends to take currently is relatively recent. Thus, back in the dark ages when I started research, a number of high-prestige journals had no external referees: all the reviewing was done by the editor(s). (Indeed, one journal that had used external referees ceased to do so for a time because they rejected a couple of ground-breaking papers that the editors would have accepted.) The point is that you don’t usually need to be an expert in each small area of research in order to decide whether a paper is publishable or not. Separating the wheat from the chaff is not all that difficult. What the expert can do is to suggest improvements to the paper. If the worry is about publishing poor research, then cursory editing can do that quite well, and is a good deal more cost-effective.

 In some disciplines, acceptable research is sufficiently well defined that assessments of the accept/reject sort are a minor problem. The obvious Open Access example is arXiv. (May I note, in passing, that not all the contributions to arXiv subsequently appear in journals and, in any case, many of the citations by other authors are to the online version. Anecdotally, from discussions I have had, I would suspect that arXiv could continue to exist without a journal-based back-up.) It has been said that there are two types of science - physics and stamp-collecting. It is true that the arXiv approach might not work so well for the latter as for the former. But its success within its field does suggest that greater flexibility in achieving quality control is both desirable and feasible.

 Then there is the problem of bad research on the Internet. I don’t actually see this as an important question for research journals.  Most online ‘bad research’ lies outside the normal system of academic communication. It is, unfortunately, often more readable than any research paper. Hence, members of the general public are attracted to it. I doubt whether either tightening or relaxing quality control in academic publishing would have much effect on public interest. Gresham’s law applied to this situation suggests not.

So, to return to the original question, can Open Access provide all the added value, and especially the quality control, that traditional publishers claim to provide? The immediate answer is obviously that it can, since a variety of Open Access journals are already available. But like any other journals they have to be funded - by subscription, or by tapping the authors, or in some other way. From this viewpoint, the whole thing simply boils down to which is the more cost-efficient method of publishing - the existing system or a new Open Access system. But, of course, this over-simplifies. Researchers by and large are not interested in the routine of organising research communication. In addition, most researchers don't like rapid change in the system - they have too much intellectual capital tied up in their publications. I fancy, in consequence, that, for the foreseeable future, publishers will continue to be involved as intermediaries. However, their financial pickings will decrease. A word of encouragement to anyone currently in their fifties, and involved in commercial journal publishing. Take as your motto some words from Dr. Johnson: 'These things will last our time, and we may leave posterity to shift for themselves'.

Actually, I find all this a little disappointing. Journals - that is, bundles of research papers - were devised as an efficient way of distributing research using print. With computer-based handling, the individual paper is a more sensible unit to use. Maybe the question we should be concentrating on, therefore, is - when can we do away with journals altogether? Incidentally, all our discussion has been about science. I reckon the more interesting questions now are about the humanities. What about open access to scholarly monographs, for example?