I am reading a phylogenetic tree into R:
library(ape)
library(geiger)
library(caper)
taxatree <- read.tree("newicktest.tre")
LWEVIYRcombodata <- read.csv("LWEVIYR.csv")
LWEVIYRcombodataPGLS <- data.frame(Sum.of.percentage = LWEVIYRcombodata$Sum.of.percentage, OGT = LWEVIYRcombodata$OGT, Species = LWEVIYRcombodata$Species)
comp.dat <- comparative.data(taxatree, LWEVIYRcombodataPGLS, "Species")
But I get the following error message:
Error in comparative.data(taxatree, LWEVIYRcombodataPGLS, "Species") :
No tips are common to the dataset and phylogeny
My tree looks like this:
(('Nanoarchaeum equitans':4.0,('Aeropyrum pernix':4.0,'Pyrobaculum aerophilum':4.0,('Sulfolobus tokodaii':4.0,'Sulfolobus solfataricus':4.0,'Sulfolobus acidocaldarius':4.0):4.0):4.0,('Methanopyrus kandleri':4.0,('Methanothermobacter thermautotrophicus':4.0,'Methanosphaera stadtmanae':4.0):4.0,('Picrophilus torridus':4.0,('Thermoplasma volcanium':4.0,'Thermoplasma acidophilum':4.0):4.0):4.0,('Thermococcus kodakarensis':4.0,('Pyrococcus horikoshii':4.0,'Pyrococcus abyssi':4.0,'Pyrococcus furiosus':4.0):4.0):4.0,('Natronomonas pharaonis':4.0,'Haloarcula marismortui':4.0):4.0,'Archaeoglobus fulgidus':4.0,(('Methanococcoides burtonii':4.0,('Methanosarcina acetivorans':4.0,'Methanosarcina mazei':4.0,'Methanosarcina barkeri':4.0):4.0):4.0,'Methanospirillum hungatei':4.0):4.0,('Methanococcus maripaludis':4.0,'Methanocaldococcus jannaschii':4.0):4.0):4.0):4.0,('Candidatus Koribacter versatilis Ellin345':4.0,'Fusobacterium nucleatum subsp. nucleatum ATCC 25586':4.0,'Aquifex aeolicus':4.0,('Trichormus variabilis':4.0,('Thermosynechococcus elongatus':4.0,'Synechococcus elongatus':4.0):4.0):4.0,'Thermotoga maritima':4.0,('Mesoplasma florum':4.0,('Ureaplasma urealyticum':4.0,('Mycoplasma penetrans':4.0,'Mycoplasma mobile':4.0,'Mycoplasma synoviae':4.0,'Mycoplasma pulmonis':4.0,'Mycoplasma pneumoniae':4.0,'Mycoplasma hyopneumoniae':4.0,'Mycoplasma genitalium':4.0,'Mycoplasma gallisepticum':4.0):4.0):4.0):4.0,('Bifidobacterium longum':4.0,'Thermobifida fusca':4.0,('Streptomyces avermitilis':4.0,'Streptomyces coelicolor':4.0):4.0,'Cutibacterium acnes':4.0,('Nocardia farcinica':4.0,('Mycobacterium tuberculosis':4.0,'Mycobacterium leprae':4.0,'Mycobacterium bovis':4.0,'Mycobacterium avium':4.0):4.0,('Corynebacterium efficiens':4.0,'Corynebacterium jeikeium':4.0,'Corynebacterium glutamicum':4.0,'Corynebacterium diphtheriae':4.0):4.0):4.0,'Leifsonia xyli':4.0):4.0,((('Caldanaerobacter subterraneus':4.0,'Carboxydothermus hydrogenoformans':4.0,'Moorella 
thermoacetica':4.0):4.0,('Desulfitobacterium hafniense':4.0,'Symbiobacterium thermophilum':4.0,('Clostridium tetani':4.0,'Clostridium perfringens':4.0,'Clostridium acetobutylicum':4.0):4.0):4.0):4.0,((('Lactobacillus johnsonii':4.0,'Lactobacillus salivarius':4.0,'Lactobacillus sakei':4.0,'Lactobacillus plantarum':4.0,'Lactobacillus acidophilus':4.0):4.0,'Enterococcus faecalis':4.0,('Lactococcus lactis subsp. lactis':4.0,('Streptococcus pyogenes':4.0,'Streptococcus pneumoniae':4.0,'Streptococcus agalactiae':4.0,'Streptococcus mutans':4.0,'Streptococcus thermophilus':4.0):4.0):4.0):4.0,(('Listeria innocua':4.0,'Listeria monocytogenes':4.0):4.0,('Oceanobacillus iheyensis':4.0,'Geobacillus kaustophilus':4.0,('[Bacillus thuringiensis] serovar konkukian':4.0,'Bacillus halodurans':4.0,'Bacillus clausii':4.0,'Bacillus subtilis':4.0,'Bacillus licheniformis':4.0,'Bacillus cereus':4.0,'Bacillus anthracis':4.0):4.0):4.0,('Staphylococcus aureus subsp. aureus MRSA252':4.0,'Staphylococcus saprophyticus':4.0,'Staphylococcus haemolyticus':4.0,'Staphylococcus epidermidis':4.0):4.0):4.0):4.0):4.0,('Chlorobaculum tepidum TLS':4.0,'Pelodictyon luteolum':4.0):4.0,('Salinibacter ruber':4.0,('Porphyromonas gingivalis':4.0,('Bacteroides thetaiotaomicron':4.0,'Bacteroides fragilis':4.0):4.0):4.0):4.0,('Chlamydia pneumoniae AR39':4.0,'Chlamydia muridarum':4.0,'Chlamydia trachomatis':4.0):4.0,(('Deinococcus geothermalis':4.0,'Deinococcus radiodurans':4.0):4.0,'Thermus thermophilus':4.0):4.0,('Leptospira interrogans':4.0,(('Borreliella bavariensis':4.0,'Borreliella burgdorferi B31':4.0):4.0,('Treponema pallidum':4.0,'Treponema denticola':4.0):4.0):4.0):4.0,('Bdellovibrio bacteriovorus':4.0,('Caulobacter vibrioides CB15':4.0,('Ruegeria pomeroyi':4.0,'Rhodobacter sphaeroides':4.0):4.0,('Rickettsia typhi':4.0,'Rickettsia prowazekii':4.0,'Rickettsia conorii':4.0):4.0,('Erythrobacter litoralis':4.0,('Novosphingobium aromaticivorans':4.0,'Zymomonas mobilis':4.0):4.0):4.0,(('Magnetospirillum 
magneticum':4.0,'Rhodospirillum rubrum':4.0):4.0,'Gluconobacter oxydans':4.0):4.0,('Brucella melitensis':4.0,('Bartonella henselae':4.0,'Bartonella quintana':4.0):4.0,'Mesorhizobium loti':4.0,('Rhodopseudomonas palustris':4.0,('Nitrobacter winogradskyi':4.0,'Nitrobacter hamburgensis':4.0):4.0,'Bradyrhizobium japonicum':4.0):4.0,('Rhizobium etli':4.0,'Sinorhizobium meliloti':4.0,'Agrobacterium tumefaciens':4.0):4.0):4.0):4.0,(('Chromobacterium violaceum':4.0,('Neisseria meningitidis':4.0,'Neisseria gonorrhoeae':4.0):4.0):4.0,('Thiobacillus denitrificans':4.0,('Nitrosospira multiformis':4.0,'Nitrosomonas europaea':4.0):4.0,'Methylobacillus flagellatus':4.0):4.0,('Rhodoferax ferrireducens':4.0,('Bordetella pertussis':4.0,'Bordetella parapertussis':4.0,'Bordetella bronchiseptica':4.0):4.0,('Cupriavidus metallidurans':4.0,'Burkholderia thailandensis':4.0,'Ralstonia solanacearum':4.0):4.0):4.0):4.0,(('Hahella chejuensis':4.0,'Chromohalobacter salexigens':4.0):4.0,'Legionella pneumophila subsp. pneumophila':4.0,'Saccharophagus degradans':4.0,('Francisella tularensis subsp. tularensis SCHU S4':4.0,'Hydrogenovibrio crunogenus':4.0):4.0,'Nitrosococcus oceani':4.0,('Pasteurella multocida':4.0,('[Haemophilus] ducreyi 35000HP':4.0,'Haemophilus influenzae':4.0):4.0):4.0,('Aliivibrio fischeri ES114':4.0,'Photobacterium profundum':4.0,('Vibrio vulnificus':4.0,'Vibrio parahaemolyticus':4.0,'Vibrio cholerae':4.0):4.0):4.0,('Photorhabdus luminescens':4.0,('Sodalis glossinidius':4.0,'Pectobacterium atrosepticum':4.0):4.0,('Yersinia pseudotuberculosis':4.0,'Yersinia pestis':4.0):4.0,('Salmonella enterica subsp. enterica serovar Typhi str. 
CT18':4.0,('Shigella sonnei':4.0,'Shigella flexneri':4.0,'Shigella dysenteriae':4.0,'Shigella boydii':4.0):4.0,'Escherichia coli':4.0):4.0):4.0,'Methylococcus capsulatus':4.0,('Xylella fastidiosa':4.0,('Xanthomonas oryzae':4.0,'Xanthomonas citri':4.0,'Xanthomonas campestris':4.0):4.0):4.0,('Psychrobacter arcticus':4.0,('Pseudomonas protegens':4.0,'Pseudomonas savastanoi':4.0,'Pseudomonas putida':4.0,'Pseudomonas aeruginosa':4.0):4.0):4.0,('Idiomarina loihiensis':4.0,('Shewanella denitrificans':4.0,'Shewanella oneidensis':4.0):4.0,'Colwellia psychrerythraea':4.0,'Pseudoalteromonas haloplanktis':4.0):4.0):4.0,(('Sulfurimonas denitrificans':4.0,'Wolinella succinogenes':4.0,('Helicobacter hepaticus':4.0,'Helicobacter pylori':4.0):4.0):4.0,'Campylobacter jejuni':4.0):4.0,('Anaeromyxobacter dehalogenans':4.0,'Desulfotalea psychrophila':4.0,('Desulfovibrio alaskensis':4.0,'Desulfovibrio vulgaris':4.0):4.0,(('Geobacter sulfurreducens':4.0,'Geobacter metallireducens':4.0):4.0,'Pelobacter carbinolicus':4.0):4.0):4.0):4.0):4.0);
A subset of my input data looks like this:
+-------------------------------+-----+-------------------+
| Species | OGT | Sum of percentage |
+-------------------------------+-----+-------------------+
| Aeropyrum pernix | 95 | 46.3165467333 |
| Argobacterium fabrum | 26 | 39.0114463099 |
| Anaeromyxobacter dehalogenans | 27 | 40.7932155627 |
| Aquifex aeolicus | 85 | 45.4972652338 |
| Archaeoglobus fulgidus | 83 | 44.7570927331 |
| Bacillus anthracis | 30 | 40.9076162356 |
| Bacillus cereus | 30 | 40.8716699079 |
| Bacillus clausii | 30 | 40.3212556402 |
+-------------------------------+-----+-------------------+
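I know a few of the labels may be slightly off, but that should not make it report that none of them match.
Interestingly, once the tree is read into R, its tip labels no longer contain any spaces: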
'Anaeromyxobacterdehalogenans''Aeropyrumpernix'
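As you can see above, though, the labels in the tree file do contain spaces. I don't want to edit all of my csv files (I have about a million of them). Is there a way to edit the tip labels after the tree has been read into R, or is there another solution to this problem?
Thanks,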
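As patL mentioned in the comments, the answer is to use gsub to strip the spaces from the species names in the data so that they match the tip labels:
LWEVIYRcombodataPGLS$Species <- gsub(" ", "", LWEVIYRcombodataPGLS$Species)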
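If the names still fail to match after that, it may help to normalise both sides the same way and then check which names differ. The sketch below is only an illustration under some assumptions: normalise_label is a hypothetical helper (not part of ape or caper), the two character vectors are stand-ins for taxatree$tip.label and LWEVIYRcombodataPGLS$Species, and it assumes the tip labels may have kept the single quotes from the tree file, which may or may not be true for your version of ape.

```r
# Hypothetical helper: strip quote characters and all whitespace from labels.
normalise_label <- function(x) {
  x <- gsub("['\"]", "", x)    # drop single and double quotes
  gsub("[[:space:]]+", "", x)  # drop spaces, tabs, and line breaks
}

# Stand-ins for taxatree$tip.label and LWEVIYRcombodataPGLS$Species.
tip_labels <- c("'Aeropyrumpernix'", "'Anaeromyxobacterdehalogenans'", "'Aquifexaeolicus'")
species <- c("Aeropyrum pernix", "Argobacterium fabrum", "Anaeromyxobacter dehalogenans")

tip_clean <- normalise_label(tip_labels)
species_clean <- normalise_label(species)

# Names present in both, and data names with no matching tip.
matched <- intersect(species_clean, tip_clean)
unmatched_data <- setdiff(species_clean, tip_clean)

print(matched)
print(unmatched_data)
```

On the real objects, the same helper could be applied as taxatree$tip.label <- normalise_label(taxatree$tip.label) and LWEVIYRcombodataPGLS$Species <- normalise_label(LWEVIYRcombodataPGLS$Species) before calling comparative.data. Anything left in unmatched_data is a genuine spelling difference between the data and the tree rather than a spacing or quoting issue.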