Among these subfamilies, the extensive expansion of several TKLs was very apparent

nt groups; therefore, many confusing names and synonyms exist. We adhered to SWISS-PROT names where possible and compiled a list of all available synonyms and accession numbers for 196 human GPCRs with known ligands and 84 human orphan receptors. Gustatory and olfactory receptors were omitted. Multiple protein sequences were aligned, and the extremely variable amino termini upstream of the first transmembrane domain and carboxyl termini downstream of the seventh transmembrane domain were deleted to avoid length heterogeneity. The deleted regions contained no significant sequence conservation.

Phylogenetic analysis

Because of the large number of sequences in family A, we had to combine several computational methods to obtain the best possible description of their phylogenetic relationships. In a first step we used the distance-based neighbor-joining method, the only one that was computationally feasible. Neighbor joining has been shown to be efficient at recovering the correct tree topology, but it is greatly influenced by methodological errors such as sampling error. This can in part be overcome by bootstrapping, a method of testing the reliability of a dataset by creating pseudoreplicate datasets through resampling. Bootstrapping assesses whether stochastic effects have influenced the distribution of amino acids. In previous publications on this topic, bootstrapping has not generally been used. We generated a neighbor-joining tree of family-A sequences and considered tree branches to be confirmed if they were found in more than 500 of 1,000 bootstrap steps. The same branching pattern was found by least squares as implemented in FITCH, but it was not possible to compute enough bootstrap steps with the equipment used.
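The bootstrap criterion described above (a grouping counts as confirmed when it appears in more than 500 of 1,000 pseudoreplicates) can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the four toy sequences are invented, and `closest_pair` is a hypothetical stand-in for tree building that only reports the most similar pair of sequences, whereas the actual analysis built neighbor-joining trees. Only the column resampling and the majority threshold correspond to the procedure in the text.

```python
import random
from collections import Counter
from itertools import combinations


def resample_columns(alignment, rng):
    """Build one pseudoreplicate by drawing alignment columns with replacement."""
    n_cols = len(next(iter(alignment.values())))
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return {name: "".join(seq[i] for i in picks) for name, seq in alignment.items()}


def closest_pair(alignment):
    """Toy stand-in for tree building (assumption): the pair with the fewest mismatches."""
    def mismatches(pair):
        a, b = (alignment[name] for name in pair)
        return sum(x != y for x, y in zip(a, b))
    return frozenset(min(combinations(sorted(alignment), 2), key=mismatches))


def bootstrap_support(alignment, replicates=1000, seed=0):
    """Count how often each grouping is recovered across the pseudoreplicates."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(replicates):
        counts[closest_pair(resample_columns(alignment, rng))] += 1
    return counts


def confirmed(counts, replicates=1000):
    """Keep only groupings found in more than half of the replicates."""
    return {group for group, n in counts.items() if n > replicates / 2}


# Invented, already-aligned toy sequences (not real receptor data).
toy = {
    "A": "MKTLLVAGSW",
    "B": "MKTLLVAGSF",
    "C": "MRSIIPQGSW",
    "D": "MRSIVPQGTW",
}
counts = bootstrap_support(toy)
for group in confirmed(counts):
    print(sorted(group), counts[group])
```

In a real analysis the stand-in would be replaced by a tree-building step that returns every clade of the tree, so that support is tallied per branch rather than for a single pair.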
The remaining sequences of unconfirmed branches were then assigned to existing branches according to results obtained with the local alignment tool BLASTP, to account for similarities in parts of the sequences that were not sufficient for repeated global alignment. The main advantage of maximum-likelihood methods is the application of a well-defined model of sequence evolution to a given dataset. Maximum likelihood is the estimation method least affected by sampling error and tends to be robust to many violations of the assumptions in the evolutionary model. The methods are statistically well founded, evaluate different tree topologies and use all sequence information available. Because of their smaller size, families B and C could be subjected to these methods without prior subgrouping.

This resulted in 19 phylogenetic trees comprising 241 receptors for family A, one tree from 23 sequences for family B and one tree from 14 sequences for family C. Family-A trees were rooted with the human family-B receptor GPRC5B, and families B and C with the family-A receptor 5H1A. The sequence used to root a tree is supposed to be a distant, though related, sequence. In some of our groups, the phylogenetic trees could not be fully resolved. This could be due to either very similar or very distant sequences; in both cases the phylogenetic signal is too weak to resolve the tree. Several receptors were found to be only distantly related to the other known receptors used in our analysis. A possible explanation is the previously proposed convergent evolution of this large protein family, meaning that these receptors acquired the compelling similarity in their overall structures as a result of functional need, not phylogenetic relationship. The lack of signific