Guanylate-binding proteins (GBPs) have important roles in immunity, being involved in the cell-autonomous innate immune response against bacterial, parasitic and viral infections [28]. Despite their relevance, evidenced by their presence in a broad range of eukaryotes (from plants to humans) [18, 19], the evolution of GBPs has been mainly addressed in humans [18]. Recently, we have expanded the knowledge of GBP evolution to several mammalian groups such as Primates [22], Muroids (Rodentia) [23], Lagomorphs [25] and Chiroptera [24]. These studies disclosed a pattern of evolution with gain and loss events, leading to a different GBP gene repertoire in each group. With the present work, we sought to expand the knowledge of GBPs to the most ancient mammalian groups, the Monotremes and the Marsupials, and to obtain an evolutionary model for mammalian GBPs.

Searching public databases, we obtained 62 sequences annotated as Monotreme and Marsupial GBPs. The presence of the conserved GTP-binding motifs GxxxxGK(S/T) and TVRD/TLRD [20] in all these GBPs indicates that these genes belong to the GBP family. Unlike other GTPases, which contain an (N/T)(K/Q)xD motif, GBPs have a unique TVRD/TLRD motif for GTP binding, which identifies these proteins [12]. In the TVRD/TLRD motif, the conservative exchange of threonine for hydrophobic amino acids (A/I/V) has already been observed in Rodents and Lagomorphs and has been present since the origin of Mammals; it therefore most likely has only minor functional consequences [44]. Interestingly, in the mouse and Primates, the TLRD motif is associated with GBP1, 2 and 5, which also carry the CaaX motif, while the TVRD motif is linked with GBP4, 6 and 7 [20]. However, this pattern is not observed in Marsupials and Monotremes, which indicates that the association between the TLRD and CaaX motifs only emerged in the Placental ancestor between 66 and 102 mya [32, 45, 46, 47, 48]. The ancestral motif was most likely TVRD, since it was the most common motif found across GBPs in both Marsupials and Monotremes (Additional file 1: Table S1). Analysing the Gallus gallus GBP sequences used as outgroup in this study, we verified that these have a TVRD/VVRD motif (see Additional file 2: Data 1 for sequence alignment), corroborating that TVRD was the ancestral mammalian motif.
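As an illustration of the motif screening described above, the following minimal Python sketch searches a protein sequence for the P-loop consensus GxxxxGK(S/T) and for the GBP-type T(V/L)RD motif. It is not the procedure used in this study: the function name, the regular expressions and the toy sequence are assumptions made for illustration, and the first position of the second motif is relaxed to T/A/I/V only to reflect the substitutions mentioned in the text.

```python
import re

# P-loop consensus quoted in the text: GxxxxGK(S/T), where x is any residue.
P_LOOP = re.compile(r"G.{4}GK[ST]")

# GBP-type motif: TVRD or TLRD; the first position is relaxed to T/A/I/V
# (assumption based on the substitutions and the chicken VVRD variant
# mentioned in the text).
GBP_MOTIF = re.compile(r"[TAIV][VL]RD")


def classify_gbp_motifs(seq: str) -> dict:
    """Return the first P-loop and GBP-type motif found in seq, or None."""
    seq = seq.upper()
    p_loop = P_LOOP.search(seq)
    gbp = GBP_MOTIF.search(seq)
    return {
        "p_loop": p_loop.group(0) if p_loop else None,
        "gbp_motif": gbp.group(0) if gbp else None,
    }


# Toy sequence invented for illustration; it is NOT a real GBP.
toy = "MAAGLYRTGKSYLMNKAAATVRDFSL"
print(classify_gbp_motifs(toy))
```

Because a four-residue pattern can match by chance anywhere in a protein, a screen of this kind would normally be restricted to the aligned GTPase domain rather than applied to whole sequences.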

Our phylogenetic analysis demonstrates that all mammalian GBPs share a common ancestor. Still, each major mammalian group has evolved a specific subset of GBP genes, as shown by the existence of well-supported GBP clusters for each mammalian group in our ML tree (Fig. 1). This is in line with our previous studies on several placental mammal groups [22,23,24,25]. Monotremes present four GBP clusters, two of which are basal to all other mammalian GBPs, and Marsupials present two GBP clusters. As such, we suggest a new nomenclature for these groups (see Additional file 1: Table S1). All groups are well supported with high bootstrap values, and the presence of characteristic amino acids in each group further supports the newly proposed groups (Fig. 2). By analysing the GBPs of ancestral mammals, this study confirms that the GBP3 and GBP7 clusters are only present in primates [22].

GBP evolutionary models in mammals

Based on previous analysis in primates, we assumed that the GBP gene cluster arose from a common ancestor [20]. Moreover, it has been hypothesised that after the first duplication, one gene gave rise to modern GBPs 1, 2, 3 and 5, and the other to GBPs 4, 6 and 7 [20]. Based on our ML analysis, we hypothesised that the GBP ancestor first duplicated into the GBP8, GBP9 and GBP1/2/3/5/4/6/7 ancestors (Hypothesis A), with further duplication giving rise to four genes: GBP8, GBP9, GBP1/2/3/5 and GBP4/6/7 (Fig. 5A). Monotremes maintained these four genes, but Marsupials and Placentals lost GBP8 and GBP9. Considering the analysis of the mammalian GBP syntenic regions, in which no partial sequences, pseudogenes or traces of deletions of GBP8 and GBP9 were found in Marsupials and Placentals, we alternatively hypothesised that the mammals’ GBP ancestor first duplicated into the GBP1/2/3/5 and GBP4/6/7 ancestors. In Monotremes, GBP1/2/3/5 and GBP4/6/7 were maintained, and two new genes, GBP8 and GBP9, emerged in a different chromosomal location (Hypothesis B; Fig. 5B). A third hypothesis would be that the GBP ancestor duplicated into the GBP8/9 and GBP1/2/3/5/4/6/7 ancestors (Hypothesis C; Fig. 5C). Our ML tree suggests that GBP8 and GBP9 form distinct clades that were present in the ancestral genome. In this scenario, we would expect to find traces of GBP8 and GBP9 in non-Monotremes after their loss in these groups. Our analysis of the mammalian syntenic regions corresponding to the Monotreme GBP8 and GBP9 regions failed to reveal remnants of these genes; it also showed that these are very dynamic regions that have undergone chromosomal rearrangements, which may explain the absence of such remnants. Constraint-based tests rejected neither our ML tree (Hypothesis A) nor the alternative hypotheses in which GBP9 and GBP8 are nested within the GBP1–7 lineage (Hypothesis B) or form a sister clade to GBP1–7 (Hypothesis C). These results suggest that the current dataset does not have enough phylogenetic signal to conclusively resolve the placement of GBP8 and GBP9 (Additional file 1: Table S2). Additional data, broader taxonomic sampling, or complementary genomic context (e.g. synteny or functional divergence) may be required to clarify their evolutionary history.
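To make the topological difference between the three hypotheses concrete, the sketch below encodes each one as a nested tuple and checks which groups of genes form a clade. It does not perform the likelihood-based constraint tests reported here; it only shows what each constraint asserts. The tuple encodings and function names are assumptions made for illustration, and the position at which GBP8 and GBP9 are nested under Hypothesis B is arbitrary, since the text does not specify it.

```python
def leaves(tree):
    """Return the set of leaf names below a node (a leaf is a string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clades(tree):
    """Return the leaf set of every internal node in the tree."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clades(child)
    return found


def is_monophyletic(tree, group):
    """True if the genes in group form a clade in tree."""
    return frozenset(group) in clades(tree)


# Illustrative encodings (assumptions, not the trees inferred in this study).
HYP_A = ("GBP8", "GBP9", ("GBP1/2/3/5", "GBP4/6/7"))    # GBP8 and GBP9 as separate basal lineages
HYP_B = (("GBP1/2/3/5", ("GBP8", "GBP9")), "GBP4/6/7")  # nested placement chosen arbitrarily
HYP_C = (("GBP8", "GBP9"), ("GBP1/2/3/5", "GBP4/6/7"))  # GBP8/9 sister to GBP1-7

core = {"GBP1/2/3/5", "GBP4/6/7"}
monotreme_specific = {"GBP8", "GBP9"}

for name, tree in [("A", HYP_A), ("B", HYP_B), ("C", HYP_C)]:
    print(name, is_monophyletic(tree, core), is_monophyletic(tree, monotreme_specific))
```

Under these encodings, the GBP1–7 lineage is a clade in Hypotheses A and C but not in B, and GBP8 plus GBP9 form a clade in C but not in A, which is the distinction the constraint tests were unable to resolve.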

Fig. 5. Schematic representations of the possible evolution of GBPs in mammals. Integrating our ML and synteny analysis results, we tested three putative evolutionary models that could explain the origin and evolution of GBPs in mammals. (A) Hypothesis A: the mammals’ GBP ancestor sequentially duplicated, originating GBP8, GBP9, GBP1/2/3/5 and GBP4/6/7. Monotremes maintained these four genes, but GBP8 and GBP9 were lost in Marsupials and Placentals. (B) Hypothesis B: the mammals’ GBP ancestor first duplicated into the GBP1/2/3/5 and GBP4/6/7 ancestors. In Monotremes, GBP1/2/3/5 and GBP4/6/7 were maintained, and two new genes, GBP8 and GBP9, emerged in a different chromosomal location. (C) Hypothesis C: the GBP ancestor duplicated into the GBP8/9 and GBP1/2/3/5/4/6/7 ancestors, which duplicated again, originating GBP8, GBP9, GBP1/2/3/5 and GBP4/6/7. Monotremes maintained these four genes, but GBP8 and GBP9 were lost in Marsupials and Placentals.

After the Marsupial and Placental split, GBP1/2/3/5 gave rise to Marsupial GBP1/2/3/5a, GBP1/2/3/5b and GBP1/2/3/5c and to Placental GBP1, 2 and 5, with GBP3 arising only in Primates as described elsewhere [22]. A similar pattern is observed for the GBP4/6/7 gene. Although the Marsupial GBP4/6/7 group has undergone several duplications, it cannot be divided into counterparts of Placental GBP4, 6 and 7, since no clear gene clusters appear. Instead, the sequences cluster by species, indicating that all Marsupial GBP4/6/7 genes belong to the same gene family. In Placentals, this group differentiated into two genes, GBP4 and 6, with GBP7 duplicating from GBP4 in primates as previously described [22]. Interestingly, Monotreme GBP8 has a carboxy-terminal CaaX motif, comparable to GBP1, 2 and 5 (Fig. 5). This suggests that the CaaX motif might have been present in the mammals’ GBP ancestor gene and was then lost in three independent lineages: Monotreme GBP9, GBP4/6/7 and Marsupial GBP1/2/3/5b. Since Marsupial GBP1/2/3/5b is present in both Marsupial superorders, it seems that the loss of the CaaX motif occurred in their common ancestor and that the gene might have been deleted from some Marsupial species. Marsupial GBP1/2/3/5b was also found to be the least expressed GBP gene in Monodelphis domestica. Altogether, these findings suggest that Marsupial GBP1/2/3/5b could be in the process of being deleted or pseudogenised in Marsupial genomes.

The model of birth-and-death evolution was proposed [29] to explain the evolution of multigene families that did not fit the concerted evolution model. According to this model, when duplication events give rise to new copies of a gene, and these persist in the genome for long periods of time, divergence will lead to different fates. Some copies may diverge and remain functional, while others will diverge further and split functions (subfunctionalization), acquire new functions (neofunctionalization) or, if deleterious mutations occur, become pseudogenes, which can be either deleted or maintained in the genome [27, 29]. As previous studies have demonstrated, the evolutionary history of the GBP multigene family in mammals is complex, with duplications, deletions and neofunctionalization in several groups [22,23,24,25]. GBP3 is a Simiiformes-specific gene that emerged through a duplication of GBP1 [22] and gained a new function: regulation of caspase-4 activation [30]. Other genes have emerged through duplication, such as Primate GBP7 [22], Rodent GBPa, b, c and d [23], the multiple copies of Leporidae GBP5 and GBP4 [25], and Bat GBP6a and 6b [24], though the function of these new genes is yet to be studied. In contrast, genes have been deleted from some mammalian lineages: GBP4 and GBP5 have been deleted from the genomes of Old-World monkeys, GBP6 has been deleted from the Lagomorph genome, GBP1 is not present in the Muroid genome, and GBP2 has been lost in several bat families [22,23,24,25]. Given this context, it is not entirely surprising to find new genes in the phylogenetically more basal mammals. Given the divergence between GBP8 and 9 and the other GBPs, it is possible that these two genes have acquired new functions.

In each mammalian group, the evolution of GBPs presents a pattern of gene gain and loss, and the mechanisms that regulate this are not yet understood. It is possible that evolutionary pressures due to the environment and invading pathogens dictate the birth and death of genes. Despite this, GBPs share common features and their functions seem to be conserved in some mammalian species (i.e. human, mouse, Tupaia and Oryctolagus cuniculus) [18, 25]. The N-terminal GTPase domain is the most conserved domain and includes five motifs, the P-loop (GxxxxGK(S/T)), switch I (T), switch II (DxxG), the T(V/L)RD motif and the guanine cap, which are involved in GTP binding/orientation and GTP hydrolysis [44, 49–51]. These can be found in several mammalian species (human, pig, mouse, Tupaia and O. cuniculus) and also in Marsupials and Monotremes, hinting at shared functions. Additionally, GBP1, 2 and 5 present the CaaX motif, which allows isoprenylation and consequently targets GBPs to intracellular membranes [52]. The CaaX motif can also be found in Monotreme GBP1/2/3/5 and GBP8 and in Marsupial GBP1/2/3/5 a and b, indicating that these proteins can anchor to membranes, which would enable the destruction of pathogen-containing vacuoles (mainly bacteria) [9, 18]. Moreover, Danio rerio GBP3 and 4 have been described as having physical and functional interactions with the inflammasome similar to those of human GBPs [3].
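The presence of a carboxy-terminal CaaX box can be checked with a simple pattern, as in the minimal sketch below. The pattern is deliberately permissive (a cysteine followed by exactly three residues at the end of the sequence), which is an assumption: stricter definitions restrict the two middle positions to aliphatic residues. The function name and the toy sequences are invented for illustration and are not taken from the dataset analysed here.

```python
import re

# Permissive CaaX definition (assumption): Cys followed by exactly three
# residues at the C-terminus. Stricter definitions restrict the "a" positions.
CAAX = re.compile(r"C[A-Z]{3}$")


def caax_box(seq: str):
    """Return the C-terminal CaaX box of seq, or None if there is none."""
    # Normalise case and drop a trailing stop-codon symbol, if present.
    cleaned = seq.strip().upper().rstrip("*")
    match = CAAX.search(cleaned)
    return match.group(0) if match else None


# Toy C-terminal fragments, invented for illustration.
print(caax_box("MSEQLKKCTIS"))   # cysteine at the fourth-last position
print(caax_box("MSEQLKKCTISA"))  # cysteine is fifth from the end
```

A match only indicates that the sequence motif is present; whether the protein is actually isoprenylated would need experimental confirmation.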

GBPs are ubiquitously expressed in humans, mice, Tupaia and O. cuniculus [3, 25, 26], and the same is observed in the opossum. It has been demonstrated that the expression of GBPs increases upon stimulation by IFN (human, mouse, Tupaia and O. cuniculus), indicating a function related to the cell-autonomous innate immune response, providing defence against a broad range of invading pathogens. Indeed, mouse and human GBPs have been described as offering protection against bacteria (L. monocytogenes and M. bovis) and the protozoan pathogen T. gondii [53]. Additionally, GBPs promote antiviral activities against a broad range of viruses, interfering with viral replication through different mechanisms (recently reviewed [18, 53]). It was also observed that GBP5 from O. cuniculus was able to inhibit furin activity, similarly to hGBP2/5 [25, 54], suggesting that ocGBP5 could interfere with the proper cleavage of virus glycoproteins (HIV, influenza A, Zika virus and measles).

It seems that GBPs have consistent features across several species; however, newly formed genes could gain new functions, as observed for hGBP3 [30]. This suggests that each GBP could gain a specific function, and it would be interesting to investigate the functions of GBP8 and GBP9 in Monotremes, since they are not found in any other mammal.

Overall, it seems that the evolution of GBPs has not altered the features related to innate immunity (i.e. GTPase activity and IFN inducibility), which remain conserved across different families from fish to mammals, suggesting that Marsupial and Monotreme GBPs could have a role in innate immunity. However, we cannot rule out the possibility that some GBPs have functions specific to Marsupials and Monotremes. As such, it is paramount to perform a functional characterisation of these proteins to further understand their function.

GBP expression in laboratory opossum and platypus

The expression profile of GBPs has been described in human, mouse, European rabbit and Tupaia [20, 25, 26, 40, 55]. We examined GBP expression in 19 adult tissues, pooled newborn opossum tissues, and pooled foetal tissues at terminal gestation in M. domestica. We observed that GBPs are ubiquitously expressed in the 19 adult tissues, similarly to human, mouse, European rabbit and Tupaia [20, 25, 26, 40, 55]. However, no comparison between the GBP groups can be made. It should be noted that the publicly available M. domestica transcriptome dataset used here was generated from normalised total RNA libraries produced as part of the opossum genome project [56]. Normalised libraries allow discrimination between the presence and absence of any particular transcript; however, relative or quantitative information on the level of transcription is not available from this dataset.

Opossums, like all Marsupials, are born highly altricial, at a state similar to a foetal human or mouse [57]. Initiation of T cell development in the opossum can first be detected on the last day of gestation [58], and mature T cells are not detected until postnatal day 2 [59]. Likewise, B cell development occurs entirely postnatally, with mature B cells undetectable until the end of the first postnatal week [60]. The opossum adaptive immune system, therefore, develops postnatally. The same is likely true for all Marsupials, and it has long been established that newborn Marsupials are born “immune-incompetent” and often fail to mount significant adaptive immune responses until they are 2 to 3 weeks old [61, 62]. Their short gestation periods, followed by postnatal “foetal-like” development, clearly leave the newborn Marsupial highly dependent on both maternal immunity and its own innate immune system. It is noteworthy, therefore, that we found GBP transcripts in late foetal opossum tissues. The diversity of transcripts was also particularly broad in newborn opossums. These results are consistent with GBPs being available to the opossum at birth to contribute to the generation of innate-like immune responses.

In the platypus, the expression of GBPs in the six analysed tissues was lower than that observed for Placentals and Marsupials. A study of individual gene expression shifts in six tissues across mammalian species (including primates, mouse, opossum and platypus) showed that the platypus has the highest number of gene expression switches in most organs [60], which can help to explain the observed pattern. Furthermore, shifts in gene expression are often larger after gene duplication and are accompanied by accelerated or decelerated rates of protein evolution, supporting the idea that gene duplication tends to free genes for regulatory or structural functional divergence, and sometimes both [61]. These features could be related to the birth-and-death model of evolution that drives the evolutionary patterns of the GBP genes, giving rise to subfunctionalisation or neofunctionalisation.