Bioinformatics

GENEVESTIGATOR. Arabidopsis Microarray Database and Analysis Toolbox1[w]

Philip Zimmermann2, Matthias Hirsch-Hoffmann2, Lars Hennig, and Wilhelm Gruissem*

Institute of Plant Sciences, Swiss Federal Institute of Technology and Zurich-Basel Plant Science Center, ETH Center, CH–8092 Zurich, Switzerland (P.Z., M.H.-H., L.H., W.G.); and Functional Genomics Center Zurich, UNI Irchel, Y32 H52, CH–8057 Zurich, Switzerland (W.G.)

High-throughput gene expression analysis has become a frequent and powerful research tool in biology. At present, however, few software applications have been developed for biologists to query large microarray gene expression databases using a Web-browser interface. We present GENEVESTIGATOR, a database and Web-browser data mining interface for Affymetrix GeneChip data. Users can query the database to retrieve the expression patterns of individual genes throughout chosen environmental conditions, growth stages, or organs. Reversely, mining tools allow users to identify genes specifically expressed during selected stresses, growth stages, or in particular organs. Using GENEVESTIGATOR, the gene expression profiles of more than 22,000 Arabidopsis genes can be obtained, including those of 10,600 currently uncharacterized genes. The objective of this software application is to direct gene functional discovery and design of new experiments by providing plant biologists with contextual information on the expression of genes. The database and analysis toolbox is available as a community resource at https://www.genevestigator.ethz.ch.

A major challenge in biology today is the large-scale determination of gene function (Boyes et al., 2001). First, the establishment of standards and controlled vocabularies facilitates the integration of experimental data into a computational framework, thereby allowing structured and systematic processing of information (Ashburner et al., 2000; Brazma et al., 2001). Second, structured databases and data querying tools provide the means to assign putative functional information to genes.

The complete sequencing of the Arabidopsis genome achieved in the year 2000 (The Arabidopsis Genome Initiative, 2000) enables us to monitor gene expression of this flowering plant on a genome-scale using microarrays. In situ synthesis of high-density oligonucleotides on glass slides (Lockhart et al., 1996) has become a powerful tool to rapidly integrate the sequence knowledge into expression profiling platforms, such as the ATH1 full genome array developed by Affymetrix and The Institute for Genomic Research (TIGR), which represents approximately 23,750 genes from Arabidopsis (Redman et al., 2004). The availability of a full-genome array and the complete technical environment provided by the Affymetrix system led to a wide use of the GeneChip technology in the plant community. Thousands of arrays have since been processed, of which a significant number are publicly available through services and repositories such as Nottingham Arabidopsis Stock Centre Transcriptomics Service (NASCArrays; Craigon et al., 2004), ArrayExpress at the European Bioinformatics Institute (EBI; Brazma et al., 2003), or Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI; Edgar et al., 2002).

1 This work was supported by ETH, Strategic Excellence Project 2–74213–02/TH–8/02–2, and by the Functional Genomics Center Zurich.

2 These authors contributed equally to the paper.

* Corresponding author; e-mail wilhelm.gruissem@ipw.biol.ethz.ch; fax 41–1–632–10–79.

[w] The online version of this article contains Web-only data.

www.plantphysiol.org/cgi/doi/10.1104/pp.104.046367.

The exploitation of large-scale gene expression datasets, mainly from Saccharomyces cerevisiae and Escherichia coli, has already led to the discovery of global structures governing metabolic and regulatory networks (Lee et al., 2002; Ravasz et al., 2002; Stelling et al., 2002; Ihmels et al., 2004). Multiple-genome comparisons have also yielded interesting observations on the modularity and connectivity distributions of gene expression data (Bergmann et al., 2004). Nevertheless, the combination of multiple datasets still raises a number of questions concerning their compatibility, in particular when comparing data from different platforms and organisms. While analyses revealing global properties of networks or modules may not necessarily require full compatibility of expression datasets, the details are often noisy (Friedman, 2004) and the comparative search for the function of individual genes requires a more stringent selection.

The Affymetrix platform provides a standardized system with a high degree of reproducibility (Hennig et al., 2003; Redman et al., 2004). Although data from different experiments may not be pooled for a rigorous expression profiling analysis, one can assume that the large-scale combination and analysis of expression data from a single organism using a single platform like the Affymetrix system allows the identification of biologically meaningful expression patterns of


Plant Physiology, September 2004, Vol. 136, pp. 2621–2632, www.plantphysiol.org © 2004 American Society of Plant Biologists

Zimmermann et al.

individual genes. To date, few tools have been developed for biologists to query large gene expression databases. The Yeast Microarray Global Viewer (yMGV) is a database providing online tools for the analysis of transcriptional profiles of yeast genes among 82 different expression datasets (Lelandais et al., 2004). In the plant community, NASCArrays (Craigon et al., 2004) provides a repository for Arabidopsis microarray data and some simple "gene-centric" data mining tools.

Here, we describe a novel online tool called GENEVESTIGATOR comprising a gene expression database and a number of querying and analysis functionalities developed to facilitate gene functional discovery. GENEVESTIGATOR allows the data to be presented in the context of plant development, plant organ, and environmental conditions, both for individual genes or for families of genes, thereby answering questions such as "in which growth stage is my gene of interest expressed?" or "which genes are specifically expressed in roots?" The main objective of the software is to assign contextual information to gene expression data, directing the design of new experiments and gene functional discovery.

RESULTS

Database Concept and Software Design

GENEVESTIGATOR was conceived as a user-friendly online tool for large-scale expression data analysis. It consists of a MySQL relational database and a Web server application programmed in the PHP (PHP Hypertext Preprocessor) scripting language. The database works as a "data warehouse" containing experimental and annotation data, preprocessed data, as well as diverse tables for analysis and control of workflow (Fig. 1).

Raw experimental data from users is processed using Affymetrix MAS5.0 software to a target value (TGT) of 1,000 (Liu et al., 2002). Signal intensities and P values are collected for each hybridized Affymetrix GeneChip array. Alternatively, data and annotation can be imported from public repositories such as ArrayExpress (Brazma et al., 2003) and GEO (Edgar et al., 2002). The assignment of array elements (probe sets) to their Arabidopsis locus identifiers (AGI codes) and annotations is based on regularly updated datasets obtained from The Arabidopsis Information Resource (TAIR) ftp server (ftp://ftp.arabidopsis.org/home/tair/Microarrays/Affymetrix/; as of April 5, 2004, currently based on the final Arabidopsis genome annotation release from TIGR [version 5.0, January 2004]). In addition to probe sets representing unique genes (ending "_at"), the ATH1 and AG GeneChip arrays include nonunique probe sets representing two or more closely related genes (ending "_s_at") or multiple cross-hybridizing probe sets (ending "_x_at"; for details, see Redman et al., 2004). Although these probe set types represent two or more genes, only one locus identifier is displayed per probe set. These ambiguous probe sets are highlighted in GENEVESTIGATOR to draw the attention of the user to this issue.

Figure 1. Concept and design of GENEVESTIGATOR. The experimenter submits RNA profiling data to the database curator, who processes the data and uploads it to the database. The data warehouse contains raw signal intensity and P values, as well as preprocessed tables. A Web server application acts as an interface between users and the GENEVESTIGATOR database.

The experiment annotation is curated, entered, and structured in either unique (e.g. growth stage), hierarchical (e.g. plant organs), or multi-select form (e.g. environmental condition). The software has been designed for easy additions of new annotations in any of these annotation formats and for rapid creation of the corresponding tools to analyze and visualize the data. The annotation of arrays was based on the information provided by users or public repositories. Missing information does not impact the results, as the corresponding arrays were not included into the respective calculations. Ambiguous or unsuitable annotations are further ignored. For example, arrays from RNA extracted from whole adult plants (including roots, rosette leaves, and inflorescence) are unsuitable for calculations relating to plant organ specificity (Gene Atlas), and are therefore not included into the corresponding tools, but may be proper for use in other tools such as the Gene Chronologer. Each tool therefore accesses the respective best available sources of data for processing, while unsuitable data is ignored.

Data from the ATH1 and AG arrays are processed separately. Different sets of oligonucleotide sequences are used to probe identical target genes on the two array types, and thus different hybridization efficiencies of probe to target and cross-hybridization of probe to nontarget make a direct comparison of signal intensities impossible. Although a high degree of reproducibility was found for most target genes probed by both the ATH1 and the AG arrays, 300 pairs of probe sets yielded strongly differing results for identical target genes (Hennig et al., 2003).
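The probe set naming convention mentioned in this section (identifiers ending "_at", "_s_at", or "_x_at") lends itself to a small illustration. The following Python sketch is not taken from GENEVESTIGATOR, which the text describes as a PHP application; the function names and the example identifiers are hypothetical, and the only thing assumed from the source is the meaning of the three suffixes.

```python
def probe_set_type(probe_set_id: str) -> str:
    """Classify a probe set identifier by its suffix.

    The longer suffixes are tested first because every identifier
    of interest also ends in the plain "_at".
    """
    if probe_set_id.endswith("_x_at"):
        return "cross-hybridizing"
    if probe_set_id.endswith("_s_at"):
        return "closely related genes"
    if probe_set_id.endswith("_at"):
        return "unique"
    return "unknown"


def is_ambiguous(probe_set_id: str) -> bool:
    """True for probe set types that may represent two or more genes."""
    return probe_set_type(probe_set_id) in {
        "cross-hybridizing",
        "closely related genes",
    }


if __name__ == "__main__":
    # Hypothetical identifiers, used only to exercise the suffix logic.
    for example_id in ("250001_at", "250002_s_at", "250003_x_at"):
        print(example_id, probe_set_type(example_id), is_ambiguous(example_id))
```

A check of this kind is one way an interface could decide which probe sets to highlight for the user; how GENEVESTIGATOR itself makes that decision is not stated in the text.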

As of July 2004, the database contained publicly available data from 750 ATH1 and 121 AG arrays covering 81 public experiments from the Gruissem Laboratory (http://www.pb.ethz.ch; Menges et al., 2003; Hennig et al., 2004; Kleffmann et al., 2004), the Functional Genomics Center Zurich (http://www.fgcz.ethz.ch), NASCArrays (http://ssbdjc2.nottingham.ac.uk/narrays/experimentbrowse.pl; Craigon et al., 2004), ArrayExpress at EBI (http://www.ebi.ac.uk/arrayexpress/; Brazma et al., 2003), and from GEO at NCBI (http://www.ncbi.nlm.nih.gov/geo/; Edgar et al., 2002).

GENEVESTIGATOR is freely accessible to all academic institutions. Since the database contains both publicly available as well as confidential data, we have implemented a dual user profile management system for public and private users. All users are therefore asked to register once and, at present, to login for each session. We limit the collection and use of personal information to what is necessary to administer the database and to improve the utility of GENEVESTIGATOR. Personal information is not shared with third parties.

Analysis Tools

The GENEVESTIGATOR tools generally contain two types of queries: a gene-centric approach reporting signal intensity values for individual genes, and a genome-centric approach providing lists of genes fulfilling chosen criteria. The results obtained from any tool are based on all available signal intensity values and the corresponding annotations. In some cases, the present/absent call information as defined by the MAS5.0 algorithm is indicated (see below).

The first tool, Digital Northern, will retrieve the signal intensity values of input genes for a chosen selection of GeneChip experiments. An elaborate selection tool (Fig. 2A) allows the user to choose exactly those experiments that fit single or multiple criteria such as anatomy, growth stage, or environmental factors. Up to 10 probe sets can be processed simultaneously, displayed in several colors, shapes, and filling, revealing both signal intensity values and present call (closed symbols) and absent call (open symbols) information (Fig. 2B).

The Gene Correlator allows comparing the signal intensity values of two genes throughout all chosen experiments (Fig. 2C; identical selection tool as for the Digital Northern). Each spot represents a GeneChip and can be identified by mouse-over or by linking to the annotation database. The Pearson’s correlation coefficient between expression signals is given as a measure for the relationship of two genes. Present call information is additionally visualized by a color coding (Fig. 2C). Because the objective of the software was to provide contextual information for the expression of genes, we focused on relating gene expression to three main annotation groups: plant organ, developmental stage, and environmental stress.
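As an illustration of the correlation measure reported by the Gene Correlator, the sketch below computes Pearson's correlation coefficient for two series of signal values measured on the same arrays. It is a minimal example, not the tool's implementation: the function name and the input numbers are hypothetical, and the text does not say whether the tool correlates raw or transformed signals, so raw values are assumed here.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical signal values of two genes across five arrays.
    gene_a = [120.0, 340.0, 560.0, 210.0, 900.0]
    gene_b = [100.0, 300.0, 610.0, 250.0, 870.0]
    print(round(pearson(gene_a, gene_b), 3))
```

Each position in the two lists stands for one GeneChip, which corresponds to one spot in the scatter plot described above.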

GENEVESTIGATOR

The Gene Atlas tool similarly provides the average signal intensity values of a gene of interest in all organs or tissues annotated in the database (Fig. 2D). Reversely, GENEVESTIGATOR can output lists of genes for which signal intensities exceed a chosen threshold in selected organs versus a baseline choice of organs (Fig. 2E). This allows users to find genes expressed preferentially in certain organs or tissues, such as roots, young leaves or stamina. The anatomy annotation was based on standard anatomy terms as defined by the Plant Ontology Consortium (http://www.plantontology.org/) that we classified into six main groups (callus, cell suspension, seedling, inflorescence, rosette, and roots) and the corresponding subgroups. These categories cover all tissues that can currently be isolated for expression analysis, but can easily be extended as tissue and cell separation techniques become more precise (Birnbaum et al., 2003).

The Gene Chronologer tool, based on the Boyes growth stage ontology (Boyes et al., 2001), possesses two main features. First, it outputs the average signal intensities (or expression levels) and SEs of a gene of interest for 10 representative sections of the life cycle of Arabidopsis (Fig. 2F). Second, users can query the database to output all genes expressed above a given threshold at chosen growth stages. For example, all genes can be selected for which the average signal intensity at the seedling stage exceeds 90% of the sum of the signal intensity values of all categories measured for this gene throughout the life cycle of the plant (Fig. 2G).

The Response Viewer tool provides the same functionalities as Gene Atlas and Gene Chronologer, based on stress response annotations (Fig. 2, H and I). For each stress condition, one or several representative experiments were chosen. Each stress factor is given with the corresponding control from these experiments, allowing direct comparison.

The Meta-Analyzer utility has been designed to study simultaneously the gene expression profiles of several genes in the context of environmental stresses, organs, and growth stages (Fig. 2, J–L). Lists of genes can be entered in diverse formats (comma-, semi-colon-, CRLF [carriage return, linefeed], or space-separated, or directly copied from a spreadsheet). The output is a heat map of normalized signal intensity values (see Documentation section on our Web page) clustered by either single, average, or complete linkage hierarchical clustering. This tool is especially useful to compare members of gene families and to identify clusters of similarly expressed genes.

Finally, the Database and Documentation sections (Fig. 2, M and N) provide users with annotation information about experiments in the database, as well as technical information. Since GENEVESTIGATOR was conceived to be an analysis tool and not a data repository, a reduced set of annotations is stored locally. The full MIAME (Minimum Information About a Microarray Experiment) (Brazma et al., 2001) compliant annotations are available by linking to the


Figure 2. Screenshots of some of the features of GENEVESTIGATOR. Top left, Logo and available tools. A, Chip Selection tool; B, Digital Northern; C, Gene Correlator; D and E, Gene Atlas (relates to plant anatomy); F and G, Gene Chronologer (relates to the plant growth stages); H and I, Response Viewer (relates to environmental factors); J to L, Meta-Analyzer (multiple gene analysis with respect to anatomy, growth stage, and environmental factors); M and N, Database tool for viewing experiment and array annotation, and Documentation section for user information.


corresponding repository sites from which the experiments were downloaded.
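The genome-centric Gene Chronologer query described in this section (select genes whose average signal at one growth stage exceeds a chosen fraction, for example 90%, of the summed stage averages for that gene) can be sketched as follows. This is only an illustration under stated assumptions: the data layout (a dictionary of per-stage averages for each gene), the function name, and the example gene names and values are hypothetical and are not taken from the GENEVESTIGATOR code.

```python
def genes_enriched_at_stage(stage_means, stage, fraction=0.9):
    """Return genes whose mean signal at `stage` exceeds `fraction`
    of the sum of that gene's mean signals over all stage categories.

    stage_means maps gene -> {stage label -> average signal intensity}.
    """
    selected = []
    for gene, means in stage_means.items():
        total = sum(means.values())
        # Genes with no signal at all, or no value for the stage, never qualify.
        if total > 0 and means.get(stage, 0.0) > fraction * total:
            selected.append(gene)
    return sorted(selected)


if __name__ == "__main__":
    # Hypothetical averages for three growth stage groups (labels as in Table I).
    example = {
        "geneA": {"A": 950.0, "B": 30.0, "C": 20.0},
        "geneB": {"A": 500.0, "B": 300.0, "C": 200.0},
    }
    print(genes_enriched_at_stage(example, "A"))
```

Lowering `fraction` relaxes the selection, which is the kind of threshold choice the text leaves to the user.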

General Approach and Validation

The database contains expression data from a high diversity of experiments covering different tissues, ages, and treatments (Table I). The general hypothesis in our approach is that as the number of experiments per category (e.g. growth stage 5.10) increases, individual effects are averaged out and global trends become visible. As a measure of confidence for the expression of genes in different categories, we indicate the respective number of GeneChips and the SE of the mean for each category.
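The per-category summary described above (number of GeneChips, mean signal, and SE of the mean) can be illustrated with a short Python sketch. The input format, function name, and numbers are hypothetical; the SE is computed here as the sample standard deviation divided by the square root of the number of arrays, which is the usual definition and is assumed rather than quoted from the source.

```python
import math
from collections import defaultdict


def summarize_by_category(measurements):
    """Group (category, signal) pairs and return, per category,
    a tuple (number of arrays, mean signal, SE of the mean).

    The SE is None when a category holds a single array, because
    the sample standard deviation is undefined in that case.
    """
    groups = defaultdict(list)
    for category, signal in measurements:
        groups[category].append(signal)

    summary = {}
    for category, values in groups.items():
        n = len(values)
        mean = sum(values) / n
        if n > 1:
            variance = sum((v - mean) ** 2 for v in values) / (n - 1)
            se = math.sqrt(variance / n)
        else:
            se = None
        summary[category] = (n, mean, se)
    return summary


if __name__ == "__main__":
    # Hypothetical signal values for one gene on four arrays.
    data = [("roots", 100.0), ("roots", 200.0), ("roots", 300.0), ("seed", 50.0)]
    for category, stats in summarize_by_category(data).items():
        print(category, stats)
```

Reporting the array count next to the SE makes sparsely populated categories visible, in line with the hypothesis that trends become reliable only as the number of experiments per category grows.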

To validate our hypothesis, we checked whether strongly populated categories yield results that are consistent with the literature. In a first step, we selected a number of marker genes with preferential expression in particular organs, at specific growth stages, or in response to certain stresses and then analyzed their expression patterns generated by GENEVESTIGATOR. Marker genes were chosen from the literature.

First, using Gene Atlas, three AGAMOUS-like genes known to be preferentially expressed in roots as measured by reverse transcription-PCR (AGL12 [At1g71692], AGL14 [At4g11880], and AGL17 [At2g22630]; Parenicova et al., 2003) in fact showed strong expression in roots and radicle, but weaker signals in all other organs (Fig. 3, A–C). Two genes associated with pollen tube growth (At1g55570, Albani et al., 1992; and At2g25600, Mouline et al., 2002) were also identified as being specific to stamina (and by extension to the categories "flower" and "inflorescence") in our expression database (Fig. 3, D and E). Furthermore, two genes involved in photosynthesis (chlorophyll a/b binding proteins, At1g19150 and At3g040) were found to be abundantly expressed in green plant tissues (rosette, cauline leaf, stem, node, flower, cotyledon, and hypocotyl), but lowly expressed in photosynthetically inactive tissues (roots, stamen, and seeds; Fig. 3, F and G). This pattern was observed for all genes from the chlorophyll a/b binding family except for one gene (TAIR; http://www.arabidopsis.org/info/genefamily/Chloroplast.html; see Supplemental Table II, available at www.plantphysiol.org).

Second, to verify the reliability of the Gene Chronologer tool, we looked for genes annotated as being developmentally regulated. Two genes involved in seed germination and seedling development (encoding the embryonic abundant protein ATEM1 [AT3G51810,

Table I. Annotation categories incorporated in GENEVESTIGATOR as of July 2004

Plant Tissues/Organs
  0 Callus
  1 Cell suspension
  2 Seedling
    21 Cotyledons
    22 Hypocotyl
    23 Radicle
  3 Inflorescence
    31 Flower
      311 Carpel
      312 Petal
      313 Sepal
      314 Stamen
      315 Pedicel
    32 Silique
    33 Seed
    34 Stem
    35 Node
    36 Shoot apex
    37 Cauline leaf
  4 Rosette
    41 Juvenile leaf
    42 Adult leaf
    43 Petiole
    44 Senescent leaf
  5 Roots
    51 Primary root
    52 Lateral root
    53 Root hair
    Root tip
    55 Elongation zone

Developmental Stages
  10 categories based on the Boyes key ontology:
  A) 0.10–0.70
  B) 1.00–1.02
  C) 1.03–1.05
  D) 1.06–1.08/3.20
  E) 1.09–1.12/3.50
  F) 1.13/1.14/3.70/5.10
  G) 3.90/6.00/6.10
  H) 6.30/6.50
  I) 6.90/8.00
  J) 9.70

Environmental Factors
  Hormones: Ethylene, Auxin, Abscisic acid, Gibberellin
  Atmosphere: Ozone, Carbon dioxide
  Illumination: Light intensity, Light, Dark, Light quality, Far-red, Blue, UVA, UVB, Visible
  Biotic interactions: Pseudomonas syringae, Gigaspora rosea, Agrobacterium tumefaciens, Heterodera schachtii, Erysiphe cichoracearum
  Programmed cell death
  Senescence
  Heat
  Cold
  Nutrients/heavy metals: Phosphate, Nitrate, Sulfate, Potassium, Water, Suc/Glc, Lead, Zinc


Figure 3. Validation of the quality of data generated by GENEVESTIGATOR. A to G, Expression of organ or tissue-specific marker genes used for testing the Gene Atlas tool (A, AGL12, At1g71692; B, AGL14, At4g11880; C, AGL17, At2g22630; D, At1g55570; E, At2g25600; F, At1g19150; G, At3g040). H to K, Expression of growth stage specific marker genes used to validate the Gene Chronologer tool (H, ATEM1, At3g51810; I, At4g37580; J, APETALA1, At1g69120; K, FLOWERING LOCUS T, At1g680). L to Q, Expression of environmental factor specific marker genes to validate the Response Viewer tool (L, At4g14690; M, At5g190; N, ERF1, At3g23240; O, AtERF1, At4g17500; P, AtERF2, At5g47220; Q, AtERF13, At2g44840).

Vicient et al., 2000] and a gene involved in apical hook development [At4g37580, Lehman et al., 1996]) showed highest expression during mature seed and germination stages (Fig. 3, H and I), but lower levels in all other stages. In contrast, two genes involved in flowering (APETALA1 [At1g69120, Pelaz et al., 2001] and FLOWERING LOCUS T [At1g680, Ruiz-Garcia et al., 1997]) were shown to be most abundantly expressed in the flowering stages (Fig. 3, J and K).


Third, the Response Viewer tool was used for several genes known to be responsive to particular stresses (Fig. 3, L–Q). GENEVESTIGATOR correctly showed the expression pattern of a light-induced gene encoding a light-harvesting chlorophyll a/b binding protein (AT4G14690, Jansson et al., 2000) and of the light-repressed protochlorophyllide reductase A gene (At5g190, Runge et al., 1996; Fig. 3, L and M, respectively). Similarly, four genes reported to be


Table IIA. Representative samples of genes expressed in specific tissues or at particular growth stages

Table IIB.

Table IIC.

responsive to ethylene (ERF1 [At3g23240]; AtERF1 [At4g17500]; AtERF2 [At5g47220]; and AtERF13 [At2g44840]) were correctly found by the software to be responsive to ethylene and to the pathogen Pseudomonas syringae, as reported by the authors (Onate-Sanchez and Singh, 2002; Fig. 3, N–Q).

This first validation step confirms that global trends can be detected in the expression profiles of individual genes by combining numerous normalized expression datasets using the same technical platform, i.e. the Affymetrix system. Based on this information, we performed a second validation step, in which we tested whether GENEVESTIGATOR can identify genes with known expression profiles. Using Gene Atlas, 72 genes were identified to be expressed in pollen. Of these, 9 had been identified by Honys and Twell (2003) as well as Becker et al. (2003) to be pollen-specific using 8K Arabidopsis Genome Arrays (see Table IIA; Supplemental Table II). Of the remaining genes, several could be functionally associated with pollen based on annotations such as "self-incompatibility protein," "pollen coat protein-related," or "allergen." Further, 14 genes were annotated as "expressed protein," revealing the potential of GENEVESTIGATOR to identify novel genes related to


particular organs. A similar analysis was performed to identify genes expressed specifically in siliques (Table IIB, compare with Hennig et al., 2004), roots, photosynthetic active tissues, leaves, senescent leaves, stem and node, carpel, petal, sepal, and shoot apex (see Supplemental Table II) and at specific developmental stages such as seedling stage (Table IIC) or early flowering stage (Table IID; Supplemental Table II). We conclude that with the current set of data, GENEVESTIGATOR generates high quality results. Moreover, we expect that this quality will continue to rise as the size of the dataset increases.

DISCUSSION

Public repositories such as GEO and ArrayExpress provide tools for submission, storage, and retrieval of heterogeneous datasets. In contrast, GENEVESTIGATOR contains a coherent dataset from a single organism generated on a common hybridization platform. Despite the high diversity of experiments represented in the database, the validation steps we carried out demonstrate that the underlying hypothesis is valid and that biologically meaningful results can be obtained


Table IID.

Genes expressed preferentially (A) in stamina and pollen, (B) in seeds and siliques, (C) during seedling stage, and (D) during early flowering stage. For the description of growth stage groups (labeled A–J), see Table I. See also Supplemental Table II, which provides lists of genes expressed preferentially in roots, green tissues, photosynthetic active leaves, senescent leaves, stem and node, carpel, petal, sepal, and shoot apex.

using GENEVESTIGATOR. The software generally performs primary level analysis and displays results either as graphs or as numeric data, which can easily be combined, exported, or further analyzed with other data analysis and visualization tools.

The complexity of multicellular life requires the proper context-dependent expression of genes, which is achieved by highly interconnected transcriptional networks. The inference of such module networks may require the use of many data types such as gene expression, protein abundance, protein interaction, metabolite abundance, affinity precipitation, synthetic lethality, etc. (Troyanskaya et al., 2003). Nevertheless, the analysis of gene expression data can reveal significant patterns of such networks (Segal et al., 2003). In contrast to many other tools, GENEVESTIGATOR uses experiment annotation to yield contextual information that can be brought into understanding gene networks. The identification of genes exhibiting similar tissue localization and stress response attributes facilitates modeling of gene networks using network inference tools (Wille et al., 2004) by reducing the number of testable candidates. Thus, the combined gene-centric and genome-centric approaches make it a powerful tool for targeted functional genomics efforts.

Critical issues in using the GENEVESTIGATOR tools are (1) the questions being addressed by queries and (2) the interpretation of output data. First, GENEVESTIGATOR allows queries at a high level of detail and in a large variety of combinations specifying organ, developmental stage, or treatment. Although GENEVESTIGATOR currently contains information from more than 750 publicly available full genome arrays, some combinations at very detailed level may not yet have sufficient data support to yield robust results. The quality of the results therefore depends strongly on the level of granularity the user chooses and the number and types of underlying experiments. Second, care must be taken not to over-interpret output data computed by GENEVESTIGATOR. To facilitate data interpretation, the number of samples per category and the SEs of the means are indicated. Nevertheless, when working in a detailed level of granularity, a post-verification of individual genes using the Digital Northern tool is advised to confirm the origin of the effects observed.

CONCLUSION

Both the forward and reverse validation of GENEVESTIGATOR revealed that the combination of annotated data from various sources using the same technology platform is a valid approach to reveal contextual information about elements of the dataset. In our case, expression profiles of more than 22,000 genes from Arabidopsis can be generated in the context of plant organ, plant development and environmental stress. Although not all annotated categories are currently well covered in terms of number of arrays, and the output obtained from these categories may therefore be somewhat biased, the general quality of results using GENEVESTIGATOR is high. The permanent submission of new datasets is expected to constantly improve the quality of the output. The resulting information can be used to confirm previous hypotheses or generate new hypotheses about gene expression networks and genetic regulatory network structures, resulting in the design of more precise and targeted experiments.

ACKNOWLEDGMENTS

We thank Eva Vranová and Franziska Humair for feedback on the use of the software in development. We are also grateful to the Functional Genomics Center Zurich for providing support and the Affymetrix platform for GeneChip experiments, as well as all public repositories for providing data.

Received May 14, 2004; returned for revision July 12, 2004; accepted July 16, 2004.

LITERATURE CITED

The Arabidopsis Genome Initiative (2000) Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature 408: 796–815

Albani D, Sardana R, Robert LS, Altosaar I, Arnison PG, Fabijanski SF (1992) A Brassica napus gene family which shows sequence similarity to ascorbate oxidase is expressed in developing pollen. Molecular characterization and analysis of promoter activity in transgenic tobacco plants. Plant J 2: 331–342

Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al (2000) Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29

Becker JD, Boavida LC, Carneiro J, Haury M, Feijo JA (2003) Transcriptional profiling of Arabidopsis tissues reveals the unique characteristics of the pollen transcriptome. Plant Physiol 133: 713–725

Bergmann S, Ihmels J, Barkai N (2004) Similarities and differences in genome-wide expression data of six organisms. PLoS Biol 2: E9

Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN (2003) A gene expression map of the Arabidopsis root. Science 302: 1956–1960

Boyes DC, Zayed AM, Ascenzi R, McCaskill AJ, Hoffman NE, Davis KR, Gorlach J (2001) Growth stage-based phenotypic analysis of Arabidopsis: a model for high throughput functional genomics in plants. Plant Cell 13: 1499–1510

Brazma A, Hingamp P, Quackenbush J, Sherlock G, Spellman P, Stoeckert C, Aach J, Ansorge W, Ball CA, Causton HC, et al (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nat Genet 29: 365–371

Brazma A, Parkinson H, Sarkans U, Shojatalab M, Vilo J, Abeygunawardena N, Holloway E, Kapushesky M, Kemmeren P, Lara GG, et al (2003) ArrayExpress--a public repository for microarray gene expression data at the EBI. Nucleic Acids Res 31: 68–71

Craigon DJ, James N, Okyere J, Higgins J, Jotham J, May S (2004) NASCArrays: a repository for microarray data generated by NASC’s transcriptomics service. Nucleic Acids Res (Database issue) 32: D575–D577

Edgar R, Domrachev M, Lash AE (2002) Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 30: 207–210

Friedman N (2004) Inferring cellular networks using probabilistic graphical models. Science 303: 799–805

Hennig L, Gruissem W, Grossniklaus U, Köhler C (2004) Transcriptional programs of early stages of plant reproduction. Plant Physiol 135: 1765–1775

Hennig L, Menges M, Murray JA, Gruissem W (2003) Arabidopsis transcript profiling on Affymetrix GeneChip arrays. Plant Mol Biol 53: 457–465

Honys D, Twell D (2003) Comparative analysis of the Arabidopsis pollen transcriptome. Plant Physiol 132: 0–652

Ihmels J, Levy R, Barkai N (2004) Principles of transcriptional control in the metabolic network of Saccharomyces cerevisiae. Nat Biotechnol 22: 86–92

Jansson S, Andersson J, Kim SJ, Jackowski G (2000) An Arabidopsis thaliana protein homologous to cyanobacterial high-light-inducible proteins. Plant Mol Biol 42: 345–351

Kleffmann T, Russenberger D, von Zychlinski A, Christopher W, Sjolander K, Gruissem W, Baginsky S (2004) The Arabidopsis thaliana chloroplast proteome reveals pathway abundance and novel protein functions. Curr Biol 14: 3–362

Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, et al (2002) Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298: 799–804

Lehman A, Black R, Ecker JR (1996) HOOKLESS1, an ethylene response gene, is required for differential cell elongation in the Arabidopsis hypocotyl. Cell 85: 183–194

Lelandais G, Le Crom S, Devaux F, Vialette S, Church GM, Jacq C, Marc P (2004) yMGV: a cross-species expression data mining tool. Nucleic Acids Res (Database issue) 32: D323–D325

Liu WM, Mei R, Di X, Ryder TB, Hubbell E, Dee S, Webster TA, Harrington CA, Ho MH, Baid J, Smeekens SP (2002) Analysis of high density expression microarrays with signed-rank call algorithms. Bioinformatics 18: 1593–1599

Lockhart DJ, Dong H, Byrne MC, Follettie MT, Gallo MV, Chee MS, Mittmann M, Wang C, Kobayashi M, Horton H, et al (1996) Expression monitoring by hybridization to high-density oligonucleotide arrays. Nat Biotechnol 14: 1675–1680

Menges M, Hennig L, Gruissem W, Murray JA (2003) Genome-wide gene expression in an Arabidopsis cell suspension. Plant Mol Biol 53: 423–442

Mouline K, Very AA, Gaymard F, Boucherez J, Pilot G, Devic M, Bouchez D, Thibaud JB, Sentenac H (2002) Pollen tube development and competitive ability are impaired by disruption of a Shaker K(1) channel in Arabidopsis. Genes Dev 16: 339–350

Onate-Sanchez L, Singh KB (2002) Identification of Arabidopsis ethylene-responsive element binding factors with distinct induction kinetics after pathogen infection. Plant Physiol 128: 1313–1322

Parenicova L, de Folter S, Kieffer M, Horner DS, Favalli C, Busscher J, Cook HE, Ingram RM, Kater MM, Davies B, et al (2003) Molecular and phylogenetic analyses of the complete MADS-box transcription factor family in Arabidopsis: new openings to the MADS world. Plant Cell 15: 1538–1551

Pelaz S, Gustafson-Brown C, Kohalmi SE, Crosby WL, Yanofsky MF (2001) APETALA1 and SEPALLATA3 interact to promote flower development. Plant J 26: 385–394

Ravasz E, Somera AL, Mongru DA, Oltvai ZN, Barabasi AL (2002) Hierarchical organization of modularity in metabolic networks. Science 297: 1551–1555

Redman JC, Haas BJ, Tanimoto G, Town CD (2004) Development and evaluation of an Arabidopsis whole genome Affymetrix probe array. Plant J 38: 5–561

Ruiz-Garcia L, Madueno F, Wilkinson M, Haughn G, Salinas J, Martinez-Zapater JM (1997) Different roles of flowering-time genes in the activation of floral initiation genes in Arabidopsis. Plant Cell 9: 1921–1934

Runge S, Sperling U, Frick G, Apel K, Armstrong GA (1996) Distinct roles for light-dependent NADPH:protochlorophyllide oxidoreductases (POR) A and B during greening in higher plants. Plant J 9: 513–523

Segal E, Shapira M, Regev A, Pe’er D, Botstein D, Koller D, Friedman N (2003) Module networks: identifying regulatory modules and their condition-specific regulators from gene expression data. Nat Genet 34: 166–176

Stelling J, Klamt S, Bettenbrock K, Schuster S, Gilles ED (2002) Metabolic network structure determines key aspects of functionality and regulation. Nature 420: 190–193

Troyanskaya OG, Dolinski K, Owen AB, Altman RB, Botstein D (2003) A Bayesian framework for combining heterogeneous data sources for gene function prediction (in Saccharomyces cerevisiae). Proc Natl Acad Sci USA 100: 8348–8353

Vicient CM, Hull G, Guilleminot J, Devic M, Delseny M (2000) Differential expression of the Arabidopsis genes coding for Em-like proteins. J Exp Bot 51: 1211–1220

Wille A, Zimmermann P, Vranová E, Bleuler S, Fürholz A, Hennig L, Laule O, Prelić A, von Rohr P, Thiele L, et al (2004) Sparse graphical gaussian modeling for genetic regulatory network inference. Genome Biol (in press)
