Renu Tewari
IBM T.J. Watson Research Center
tewarir@watson.ibm.com

Michael Dahlin, Harrick M. Vin
Department of Computer Sciences
University of Texas at Austin
dahlin,vin@cs.utexas.edu

Jonathan S. Kay
Cephalapod Proliferationists, Inc.
jkay@cs.utexas.edu
Abstract
In this paper, we describe the design and implementation of an integrated architecture for cache systems that scale to hundreds or thousands of caches with thousands to millions of users. Rather than simply try to maximize hit rates, we take an end-to-end approach to improving response time by also considering hit times and miss times. We begin by studying several Internet caches and workloads, and we derive three core design principles for large scale distributed caches: (1) minimize the number of hops to locate and access data on both hits and misses, (2) share data among many users and scale to many caches, and (3) cache data close to clients. Our strategies for addressing these issues are built around a scalable, high-performance data-location service that tracks where objects are replicated. We describe how to construct such a service and how to use this service to provide direct access to remote data and push-based data replication. We evaluate our system through trace-driven simulation and find that these strategies together provide response time speedups of 1.27 to 2.43 compared to a traditional three-level cache hierarchy for a range of trace workloads and simulated environments.
1. Introduction
The growth of the Internet and the World Wide Web allow increasing numbers of users to access vast amounts of information.
among many users and scale to many caches, and (3) cache data close to clients. Although these principles may seem obvious in retrospect, current cache architectures routinely violate them at a significant performance cost. For example, hierarchical caches in the United States are often seen as a way to reduce bandwidth consumption rather than as a way to improve response time.
To address these principles, we design a scalable, high-performance data-location service that tracks where objects are replicated. We describe how to construct such a service and how to use this service to meet our design principles via direct access to remote data and push-based data replication. Through simulation using a range of workloads and network environments, we find that direct access to remote data can achieve speedups of 1.3 to 2.3 compared to a standard hierarchy. We also find that pushing additional replicas of data provides additional speedups of 1.12 to 1.25.
We construct our data-location service using a scalable hint hierarchy in which each node tracks the nearest location of each object. Scalability and performance of the hint hierarchy come from four sources. First, we use simple, compact data structures to allow each node's view of the hint hierarchy to track the location of many objects. Second, the location system satisfies all on-line requests locally using the hint cache; the system only sends network messages through the hierarchy to propagate information in the background, off the critical path for end-user requests. Third, the hierarchy prunes updates so that updates are propagated only to the affected nodes. Fourth, we adapt Plaxton's algorithm [24] to build a scalable, fault tolerant hierarchy for distributing information.
We have implemented a prototype of our system by augmenting the widely-deployed Squid proxy cache [34]. It implements hint caches, push-on-write, and self-configuring dynamic hierarchies.
The rest of the paper is organized as follows. Section 2 evaluates the performance of traditional cache hierarchies, examines the characteristics of several large workloads, and then derives a set of basic design principles for large-scale, distributed caches. Section 3 provides an overview of our design, and Section 4 discusses implementation details and evaluates system performance. Section 5 surveys related work, and Section 6 summarizes our conclusions and outlines areas for future research.
Table 1. Trace workloads examined in this study: DEC [6], Berkeley [13], and Prodigy, characterized by number of accesses (millions), number of distinct objects, and collection dates.
2. Evaluating traditional cache hierarchies
In this section, we evaluate the performance of traditional cache hierarchies using measurements of several caches on the Internet and trace-driven simulations, with the goal of understanding the factors that limit cache performance.
Our simulations assume that the system maintains strong cache consistency by invalidating all cached copies whenever data change. We do this for two reasons. First, techniques for approximating or providing strong cache consistency in this environment are improving [36], so we expect this assumption to be a good reflection of achievable future cache technology. Second, weak consistency distorts performance either by increasing apparent hit rates by counting "hits" to stale data or by reducing apparent hit rates by discarding good data. In either case, this would add a source of "noise" to our results.
These traces have two primary limitations that affect our results. First, although we use traces with thousands of clients, they still represent only a small fraction of the client population on the web. Several studies [3, 7, 13] suggest that hit rates will improve as more clients are included in a cache system. The second limitation of these traces is that they are gathered at proxies rather than at clients. Thus, all of these traces display less locality and lower total hit rates than would be seen by clients.
2.1.2. Sources of cache misses
Figure 1 shows the breakdown of cache miss rates and byte miss rates for a global cache shared by all clients in the system as cache size is varied. The cache uses LRU replacement. Misses fall into four categories:
1. Compulsory misses. These misses correspond to the first access to an object. The two key strategies for reducing compulsory misses are increasing the number of clients sharing a cache system and prefetching. Facilitating sharing is an important factor in designing large scale caches, and we discuss sharing in detail in the next subsection. We do not address prefetching in this paper.

2. Capacity misses. These misses occur when the system references an object that it has previously discarded from the cache to make space for another object. The data suggest that for shared caches, capacity misses are a relatively minor problem that can be adequately addressed by building cache nodes with a reasonable amount of disk space.

3. Communication/consistency. These misses occur when a cache holds a stale copy of data that has been modified since it was read into the cache. Providing efficient cache consistency in large systems is a current research topic [36], and we do not focus on that problem here. We do note, however, that the data location abstraction we construct could also be a building block for a scalable consistency system.

4. Uncachable/error. Objects are marked "uncachable" or encounter errors for a number of reasons, some of which might be addressed by more sophisticated cache protocols that support better cache consistency, caching dynamically generated results [31], dynamically replicating servers [33], negative result caching [5], and caching programs along with data [4, 32]. We do not address such protocol extensions here. Also, because we are interested in studying the effectiveness of caching strategies, for the remainder of this study, we do not include "Uncachable" or "Error" requests in our results.
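The first two categories above can be illustrated with a toy simulator. The sketch below (our own illustration, not the paper's simulator) classifies misses against an LRU cache: a miss on a never-before-seen object is compulsory, while a miss on a previously evicted object is a capacity miss. Consistency and uncachable misses require update and error information that this toy trace lacks.

```python
from collections import OrderedDict

def classify_misses(trace, capacity):
    """Replay a trace of object ids against an LRU cache of the given
    capacity (in objects) and count hits, compulsory misses, and
    capacity misses."""
    cache, seen = OrderedDict(), set()
    counts = {"hit": 0, "compulsory": 0, "capacity": 0}
    for obj in trace:
        if obj in cache:
            cache.move_to_end(obj)          # refresh LRU position
            counts["hit"] += 1
            continue
        counts["compulsory" if obj not in seen else "capacity"] += 1
        seen.add(obj)
        cache[obj] = True
        if len(cache) > capacity:
            cache.popitem(last=False)       # evict least recently used
    return counts
```

For the trace `["a", "b", "c", "a"]` with capacity 2, the first three accesses are compulsory misses and the final access to "a" is a capacity miss, since "a" was evicted to make room for "c".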
For all of the traces, even an ideal cache will suffer a significant number of misses. Thus, one key design principle is that in addition to having good hit rates and good hit times, cache systems should not slow down misses.
2.1.3. Sharing
Figure 2 illustrates the importance of enabling widespread sharing in large cache systems. In this experiment, we configure the system as a three-level hierarchy with 256 clients sharing an L1 proxy, eight L1 proxies (2048 clients) sharing an L2 proxy, and all L2 proxies sharing an L3 proxy. As more clients share a cache, the compulsory miss rate for that cache falls because it becomes less likely that any given access to an object is the first access to that object. For example, in the DEC traces, going from a 256-client shared cache to a 16,336-client shared cache improves the byte hit rate by nearly a factor of two. Prior studies [3, 7, 13] have also reached similar conclusions. This characteristic of the workload suggests that cache systems should accommodate large numbers of clients and thereby reduce compulsory miss rates.
For caches with different degrees of sharing, Figure 3 depicts the variation in request response times as the distance between clients and the shared cache increases. It illustrates a dilemma faced by the designers of shared proxy caches: although it is important to share a cache among many clients, it is also important that the shared cache be close to clients. For example, a 256-client cache with an average hit time of 50 ms can outperform a 16,336-client cache that averages 300 ms per access.
2.2. Analyzing traditional cache hierarchies
Hierarchical caches attempt to resolve the dilemma of sharing versus locality and scalability by putting different caches near different groups of clients, and sending misses from these L1 caches through additional layers of caches to exploit sharing. This subsection examines how such arrangements work in large-scale systems.
Traditional hierarchical cache architectures such as Harvest [5] or Squid [34] define parent-child relationships among caches. Each cache in the hierarchy is shared by a group of clients or a group of children caches. Data access proceeds as follows: if the lowest-level cache contains the data requested by a client, it sends the data to the client.
Figure 1. Request miss rates and byte miss rates for the traces examined in this study: (a) per-read miss rate; (b) per-byte miss rate. For each graph, we categorize misses as capacity for accesses to data that have been discarded from the cache to make space for other data, communication for accesses to data that have been invalidated from the cache due to an update, error for requests that generate an error reply, uncachable for requests that specify that the cache must contact the server to retrieve the result, for non-GET requests, or for requests that are designated as CGI requests in the traces, and compulsory for the first access to an object.
Otherwise, the cache asks each of its neighbors for the data. If none of the neighbors possess the data, then the cache sends a request to its parent. This process recursively continues up the hierarchy until the data is located or the root cache fetches the data from the server. The caches then send the data down the hierarchy, and each cache along the path stores the data.
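The lookup procedure above can be sketched in a few lines. This is our own illustrative model (not Squid's actual code, and it omits ICP timeouts and neighbor races): a miss queries neighbors, then recurses to the parent, and the reply is stored at every cache on the way back down.

```python
class Cache:
    """A node in a traditional cache hierarchy (illustrative sketch)."""
    def __init__(self, name, parent=None, neighbors=()):
        self.name = name
        self.parent = parent
        self.neighbors = list(neighbors)
        self.store = {}                        # url -> object

    def lookup(self, url, fetch_from_server):
        if url in self.store:                  # local hit
            return self.store[url]
        for n in self.neighbors:               # ask neighbors first
            if url in n.store:
                obj = n.store[url]
                break
        else:
            if self.parent is not None:        # recurse up the hierarchy
                obj = self.parent.lookup(url, fetch_from_server)
            else:                              # root: fetch from origin server
                obj = fetch_from_server(url)
        self.store[url] = obj                  # store on the way back down
        return obj
```

Note that a miss at the leaf pays one network round trip per level, the "store-and-forward" cost analyzed below.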
Although hierarchies such as Harvest and Squid were designed under the assumption that caches could be layered without adding much delay [5], we hypothesize that two aspects of this architecture as applied to Internet caches can significantly limit performance. First, the cost of accessing a series of caches in the hierarchy adds significant "store-and-forward" delays to higher-level cache hits and to cache misses [22]. Second, when high-level caches service a large number of clients distributed over a large region, the network delays between a client and a high-level cache may be large, which reduces the benefit of hits to these caches. To examine these effects, we use two sources of information. First, to understand the details of performance in a controlled environment, we construct a test hierarchy and examine it under a synthetic workload. Second, to understand how such systems perform in real hierarchies and under real workloads, we examine Rousskov's measurements of several Squid caches deployed at different levels of a hierarchy [26]. Although Squid supports the Internet Cache
Figure 2. Overall per-read hit rate (HR) and per-byte hit rate (BHR) with infinite L1 caches shared by 256 clients, infinite L2 caches shared by 2048, and infinite L3 caches shared by all clients in the trace. As sharing increases, so does the achievable hit rate.

Figure 3. Response time for caches with different degrees of sharing as the distance between clients and caches increases.
Table 2. Testbed cache locations: UC Berkeley, UC San Diego, UT Austin, and Cornell University.
Figure 4. Measured access times in the testbed for objects of various sizes: (a) objects accessed through a three-level hierarchy; (b) objects fetched directly from each cache and server; and (c) all requests go through the L1 proxy and then directly to the proxy or server.

Table 3. Summary of Squid cache hierarchy performance based on Rousskov's measurements: min/max client connect, disk, and proxy reply times at leaf and root caches, and miss times. All the measurements are in ms.
3. Design overview
In the previous section, we derived three basic design principles for large-scale caches: (1) minimize the number of hops to locate and access data on both hits and misses, (2) share data among many users and scale to many caches, and (3) cache data close to clients. In this section, we describe the strategies we pursue to address these design principles and provide a high-level description of our specific implementation of these strategies. The next section will provide a more detailed, quantitative analysis of the system and our design decisions.
Our system is built around a scalable data-location service called the hint hierarchy, which allows each node to know the nearest location of each object. The hint hierarchy provides the information needed by two basic strategies for providing good performance.
The first basic strategy is direct access of remotely cached data. Under the ideal direct access strategy, a client locates the nearest copy of data using an oracle and sends its request directly to the nearest shared cache that contains the desired data, or directly to the server if no cache contains the data. Caches and servers that receive such requests send the data directly back to the client. As we illustrated in the previous section, traditional single-level caches can limit sharing, while multi-level caches suffer high per-hop costs when they force hits and misses to traverse multiple layers of a hierarchy. This ideal direct access strategy, on the other hand, minimizes the number of hops to locate and access data (principle 1) while allowing widespread sharing (principle 2).
The hint hierarchy allows us to approximate this direct access strategy using hint caches. Clusters of nearby clients send their requests to a shared L1 proxy cache. The proxy consults a local hint cache that contains a mapping from the ID of the object being requested to the ID of the nearest proxy cache that contains the object. The first proxy sends the request directly to the second, which sends the data directly back. The client's proxy then sends the data to the client that issued the request.
Our second basic strategy is push. As the data in the previous section indicate, even with direct access, the cost of accessing data from a nearby cache is much lower than the cost of accessing data from a distant cache. For example, in the testbed hierarchy, L1 cache accesses for 8 KB objects are 4.7 times faster than direct accesses to caches that are as far away as L2 caches, and 6.2 times faster than direct accesses to caches that are as far away as L3 caches. The goal of push is to improve hit time by pushing objects near caches that are likely to reference them in the future. The ideal push algorithm postulates that whenever an object is brought into the system, it is magically pushed to all L1 caches in the system that will reference the data in the future; thus all L2 and L3 hits become L1 hits.
We examine several simple, practical approximations to this ideal that are enabled by the information in the hint hierarchy. Push-on-update is based on the observation that when an object is modified, the proxies caching the old version of the object are a good list of candidates to reference the new version of the object. Thus, when a cache fetches an object due to a cache consistency miss, it pushes the object to caches storing old versions of it. The push-shared algorithm is based on the intuition that if two subtrees of a node in the metadata hierarchy access an item, it is likely that many subtrees in the hierarchy will access the item. Thus, when one proxy fetches data from another, it also pushes the data to a subset of proxies that share a common metadata ancestor. Both of these algorithms are simple, and they achieve a significant fraction of the total performance achieved by the ideal strategy. However, there is room for more sophisticated algorithms to achieve further gains.
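The two triggers can be sketched as follows. This is our own simplified model (function and variable names are ours, and the real system selects push targets from the metadata hierarchy rather than a flat sibling list), built on a hint table mapping each object to the set of caches holding it.

```python
def push_on_update(obj_id, new_version, hints, caches):
    """On a consistency miss, push the fresh object to every cache that
    the hint table says held the (now stale) old version."""
    for cache_id in hints.get(obj_id, set()):
        caches[cache_id][obj_id] = new_version

def push_shared(obj_id, version, requester, source, siblings, caches,
                fraction=0.5):
    """When one proxy fetches from another, also push the object to a
    subset of proxies sharing a common metadata ancestor. fraction=0.5
    corresponds to push-half; 1.0 to push-all."""
    targets = [s for s in siblings if s not in (requester, source)]
    for cache_id in targets[: max(1, int(len(targets) * fraction))]:
        caches[cache_id][obj_id] = version
```

The `fraction` parameter mirrors the push-1/push-half/push-all variants evaluated later in the paper.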
4. Detailed design and evaluation
This section details our design and implementation. For each aspect of the design, we examine overall performance and major design decisions, and we discuss areas of potential concern. When appropriate, we use simulation results to examine our decisions.
Scalability and performance of the hint hierarchy come from four sources. First, we use simple, compact data structures to allow each node's view of the hint hierarchy to track the location of many objects. Second, the location system satisfies all on-line requests locally using the hint cache; the system only sends network messages through the hierarchy to propagate information in the background, off the critical path for end-user requests. Third, the hierarchy prunes updates by propagating them only to the affected nodes to support a global index while using minimal bandwidth. Fourth, we adapt Plaxton's algorithm [24] to build a scalable, fault tolerant hierarchy for distributing information.
We have implemented a prototype of our system that implements a hint hierarchy, hint caching, and push caching [16]. It is based on Squid 1.1.20 [34] and named Cuttlefish.
4.1. Hint caches and direct access
Local hint caches allow caches to approximate the direct access strategy. A hint is an (objectId, nodeId) pair, where nodeId identifies the closest cache that has a copy of objectId. In order to facilitate the principles of minimizing hops and allowing widespread sharing, the hint cache design allows caches to store large numbers of hints and to access them quickly.
An important implementation detail in our prototype is our decision to use small, fixed-sized records and to store hints in a simple array managed as a k-way associative cache. In particular, our design stores a node's hint cache in a memory mapped file consisting of an array of small, fixed-sized entries. Each entry consumes 16 bytes: an 8-byte hash of a URL and an 8-byte machine identifier.
We store a URL hash rather than a complete URL to save space in the table and to ensure that entries are of fixed size. Small records improve performance by allowing a node to store hint entries for a large number of objects; this facilitates the principle of maximizing sharing. Small records also reduce the network cost of propagating hints, and they allow a larger fraction of the hint table to be cached in a given amount of physical memory to avoid disk accesses. If two distinct URLs hash to the same value, the hint for one of them will be incorrect, and a node may waste time asking another cache for data it may not have. In that case, because the read request contains the full URL, the remote node returns an error and the first node treats the request as a miss. With 64-bit hash keys based on MD5 signatures of URLs, we do not anticipate that hash collisions will hurt performance. In fact, some hint cache implementors may consider reducing the keys to 32 bits to increase the reach of the cache for a given hint cache size.
Fixed-sized records simplify and speed up access when the hint cache does not fit in physical memory. The system currently stores hints in an array that it manages as a 4-way associative cache indexed by the URL hash, and it maps this array to a file. Thus, if a needed hint is not already cached in memory, the system can locate and read it with a single disk access.
We store entries in a 4-way associative cache rather than, for instance, maintaining a fully-associative LRU list of entries, to reduce the cost of maintaining the hint cache when it does not fit in memory. We include a modest amount of associativity to guard against the case when several hot URLs land in the same hash bucket.
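The data structure described above can be sketched compactly. The following is our own in-memory model, not the prototype's C code: 16-byte records (an 8-byte MD5-derived URL hash plus an 8-byte node id) packed into a 4-way set-associative array. The real system maps the array to a file; here a `bytearray` stands in for the mapped region, and the eviction choice within a set is simplified to "overwrite way 0."

```python
import hashlib
import struct

RECORD = struct.Struct("<QQ")              # (url_hash, node_id): 16 bytes
WAYS = 4

def url_hash(url):
    """First 8 bytes of the MD5 signature of the URL, as an integer."""
    return struct.unpack("<Q", hashlib.md5(url.encode()).digest()[:8])[0]

class HintCache:
    def __init__(self, num_sets):
        self.num_sets = num_sets
        self.table = bytearray(num_sets * WAYS * RECORD.size)

    def _slots(self, h):
        base = (h % self.num_sets) * WAYS * RECORD.size
        return [base + i * RECORD.size for i in range(WAYS)]

    def insert(self, url, node_id):
        h = url_hash(url)
        slots = self._slots(h)
        for off in slots:                  # reuse an empty or matching slot
            rec_h, _ = RECORD.unpack_from(self.table, off)
            if rec_h in (0, h):
                RECORD.pack_into(self.table, off, h, node_id)
                return
        RECORD.pack_into(self.table, slots[0], h, node_id)  # set full: evict

    def lookup(self, url):
        h = url_hash(url)
        for off in self._slots(h):
            rec_h, node = RECORD.unpack_from(self.table, off)
            if rec_h == h:
                return node                # may be a false hit on collision
        return None
```

Because the set index and the stored key both derive from the URL hash, a lookup touches exactly one 64-byte set, which is what makes the one-disk-access property possible when the array is file-backed.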
Our implementation of hint caches approximates the ideal of direct access. However, three aspects of the design may limit its performance. First, to simplify deployment, we focus on a proxy-based implementation of hint caches rather than a configuration where all clients maintain local hint caches. Second, the hint cache data structure is smaller and slower than the ideal oracle. Third, hint caches may contain an out-of-date picture of where objects are cached.

4.1.1. Overall performance and configuration
We will use the following simulation configuration throughout the paper. As in Figure 2, we assume that when the system is configured as a 3-level hierarchy, clusters of 256 nearby clients share an L1 cache, groups of 2048 clients share an L2 cache, and all clients in the trace share an L3 cache. We parameterize the distance between clients in these categories using the Testbed times shown in Figure 4 and the Min and Max access times measured by Rousskov. Our direct access and hint configurations are similar, but they use the Total Direct and Total via L1 times from those figures. We simulated performance for both infinite-sized caches and finite, 5 GB caches. For the finite cache case, we reserved 10% of the space for hint caches when appropriate. Due to space limitations, unless otherwise noted we display the graph for infinite caches and omit the graphs for finite caches. As Figure 1 suggests, we found that cache size is a secondary factor for performance, and our finite cache results are qualitatively the same as our infinite cache results.

Figure 6. Hit rate assuming that groups of 256 clients from the DEC trace each access an infinite proxy cache and that each proxy cache can access other proxy caches via a hint cache of the specified size. Each entry in the hint cache takes 16 bytes, and the entries are stored in a 4-way set-associative array with total size specified in MB on the x-axis.
To quantify the overall performance of the system and examine the impact of assuming a proxy-based implementation, Figure 5 shows the simulated performance for the DEC, Berkeley, and Prodigy traces under a series of algorithms. The figure provides two baselines for performance. It includes a traditional 3-level Hierarchy and an unrealizable best case, SharedAll, which is an L1 cache that is shared by all clients in the system, but that is as close to all clients as the smaller L1 caches used in the standard configuration.
First, note that this configuration is designed to let us explore how different architectures balance sharing, locality, and scaling. As the bars for the SharedAll L1 cache show, if the underlying physical network and cache machines allow sharing to be increased without increasing cache access delays, the best configuration is a large, shared cache.
Direct access significantly outperforms the traditional 3-level hierarchies, with speedups ranging from 1.3 to 2.9. The additional network hop for the proxy-based Hint Caching configuration hurts performance modestly compared to the Ideal Forwarding system. The realistic implementation achieves speedups of 1.3 to 2.3. For this set of network topologies and workloads, another reasonable alternative is for each group of 256 clients to share an L1 cache but not to share caches further. Such an approach falls short of the ideal direct access protocol by as much as a factor of 1.53 and falls short of the hint cache protocol by as much as a factor of 1.3.
4.1.2. Hint cache size
One obvious worry associated with hints is the storage requirement. How much storage is required to track the nearest copy of all objects in a large cache system? This number turns out to be surprisingly reasonable.
A node's hint cache will only be effective if it can index significantly more data objects than the node can store locally, and it will observe the design principle of maximizing sharing if it can index most or all of the data stored by the caches in the cache system. As we describe in detail in Section 4.1, our system compresses hints to 16-byte, fixed-sized records. At this size, each hint is almost three orders of magnitude smaller than an average 10 KB data object stored in a cache [2]. Thus, if a cache dedicates 10% of its capacity for hints, its hint cache will index about two orders of magnitude more data than it can store locally. Even if there were no overlap of what different caches store, such a directory would allow a node to directly access the contents of about 63 nearby caches. But, because the hint cache need only store one entry when an object is stored in multiple remote caches, coverage should be much broader in practice. Another way of viewing capacity is to consider the reach of a 500 MB index (10% of a modest, 5 GB proxy cache). Such an index could track the location of over 30 million unique objects stored in a cache system. Finally, consider that in October 1998 a 6 GB disk costs under $160, suggesting that flat indices of hundreds of millions to billions of objects are feasible.
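The capacity arithmetic above can be checked back-of-the-envelope, using the paper's own numbers (16-byte hints, 10 KB average objects, 10% of a 5 GB cache reserved for hints):

```python
hint_size = 16                    # bytes per hint record
obj_size = 10 * 1024              # average cached object, in bytes
ratio = obj_size / hint_size      # data indexed per byte of hint space
assert ratio == 640               # hint ~3 orders of magnitude smaller

hint_space = 0.10 * 5 * 2**30     # 10% of a 5 GB cache = 512 MB for hints
local_space = 0.90 * 5 * 2**30    # remaining 90% for data
indexed = (hint_space / hint_size) * obj_size
print(indexed / local_space)      # ~71x local capacity: ~2 orders of magnitude
print(hint_space / hint_size)     # ~33.6 million objects tracked
```

The ~71x figure is the no-overlap worst case corresponding to the "about 63 nearby caches" claim; with overlap between caches, the effective reach is broader.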
Figure 6 shows how the size of the hint cache affects overall performance in the DEC trace. Very small hint caches provide little improvement because they index little more than what is stored locally. For this workload, hint caches smaller than 10 MB provide little additional "reach" beyond what is already cached locally, but a 100 MB hint cache can track almost all data in the system.
4.1.3. Delayed hint propagation
Although the system uses direct requests and direct data transfers for client requests, it propagates hints through a hierarchy. We defer detailed discussion of hint propagation until after our description of how the system configures the hierarchy in Section 4.3. Here, we examine the general issue of delayed hint propagation. Whereas the oracle in the idealized direct access protocol always knows where to find the nearest copy of data, actual hint updates take time to propagate through the system, so caches may make suboptimal decisions about where to send their requests.
Figure 7 quantifies the dependence of global hit rate on the amount of time it takes to update hints in the system. In this experiment, we assume that whenever an object is dropped from a cache to make space for another object, or an object is added to a cache, none of the hint caches learn of the change for an amount of time specified on the x-axis in the figure. This experiment suggests that the performance of hint caches will be good as long as updates can be propagated through the system within a few minutes. As Section 4.3 will detail, to facilitate fast propagation, our system uses a metadata hierarchy that preserves locality for hint updates, it uses a scalable hierarchy to avoid bottlenecks, and it uses small hint records to reduce the network overheads of propagation.

Figure 5. Simulated performance for the DEC, Berkeley, and Prodigy traces. The three groups of bars for each trace show the performance when the access times are parameterized by the Testbed times shown in Figure 4 or the Min and Max access times measured by Rousskov shown in the Total Hierarchical and Total via L1 columns of Table 3. Within each group, we show performance for a 3-level hierarchy, idealized forwarding, hint caches, and an idealized cache shared by all clients in the system and as close to each as their normal L1 cache.

Figure 7. Hit rate assuming that groups of 256 clients from the DEC trace each access an infinite proxy cache and that each proxy cache can access other proxy caches via a hint cache that is updated the specified number of minutes after objects appear or disappear from caches.
4.2. Push
The ideal push algorithm uses future knowledge to redistribute data, thus transforming all L2 and L3 hits into L1 hits. Actual push algorithms must approximate this by sending data they predict will be referenced near caches they predict will reference it.
4.2.1. Algorithms
Because of the large numbers of objects that pass through the system, we focus on simple algorithms that do not explicitly track the reference frequency of individual objects.
Figure 9. Simulated response time for the DEC trace workload for six algorithms: no push (data hierarchy), no push (hint hierarchy), update push, push-1 (send a copy to 1 node in each eligible subtree), push-half (send a copy to half of the nodes in each eligible subtree), and push-all (send a copy to all nodes in an eligible subtree).

Figure 10. Bandwidth of push algorithms (DEC trace).

Figure 10 shows the bandwidth consumed by the algorithms. The push-shared algorithms increase the bandwidth consumed by up to a factor of four compared to the demand-only case. This may be an acceptable trade-off of bandwidth for latency in many environments.
Although the push-shared algorithm does not explicitly track popularity, it results in the desired effect that popular items are more widely replicated than unpopular ones. Also note that if there is locality within subtrees, items popular in one subtree but not another will be more widely replicated in the subtree where they are popular.
4.2.2. Evaluation
Actual push algorithms may fall short of ideal ones in two ways. First, they may fail to push the right data near the right caches, so latency will suffer. Second, they may push data unnecessarily, thereby consuming excess bandwidth. Figures 9 and 10 compare simulations of the ideal push algorithm and base case algorithms without push to push-on-update and push-shared.
Although we do not expect capacity misses to be a major factor in systems that only replicate data on demand, in push-based systems speculative replication may displace more valuable data from caches. To monitor this effect in our simulations, we report results for the space-constrained configuration in which each of the 64 L1 caches has 5 GB of capacity. Due to time constraints and memory limitations of our simulation machines, these results use only the first seven days of the DEC and Berkeley traces.
Figure 9 shows the simulated response time for the DEC trace under a range of push options. This experiment suggests that an ideal push algorithm could achieve speedups of 1.54 to 2.63 compared to the no-push data hierarchy, and speedups of 1.21 to 1.62 compared to the no-push hint hierarchy; the largest speedups come when the cost of accessing remote data is high, such as the Max value in Rousskov's measurements (see Table 3). The push-shared algorithms described here achieve speedups of 1.42 to 2.03 compared to the no-push data hierarchy, and speedups of 1.12 to 1.25 compared to the no-push hint hierarchy. Although a large fraction of objects pushed by push-on-update are used by their destinations, updates are infrequent, so its overall performance gains are small.
4.3. Hint hierarchy
To maximize sharing, our goal is to allow the system to scale to large numbers of caches. The discussion above described how the hint hierarchy uses simple, compact data structures to track the location of many objects and how hint caches satisfy on-line requests locally to avoid the network delays encountered when locating data in a traditional hierarchy or using ICP [35]. This section examines two remaining aspects of the hierarchy's design.
First, although the hierarchy can be visualized as a tree, in reality we use a more scalable data structure that dynamically configures subtrees to reflect network locality and that maps each node of a subtree across the subtree's leaves to provide scalability and fault tolerance, using an algorithm developed by Plaxton et al. [24]. Due to space limitations, we omit a detailed description of this mapping. Details of our implementation may be found in the extended version of this paper [29].
Second, to reduce the amount of information sent globally, the hierarchy prunes updates so that updates are propagated only to the affected nodes. Whenever a cache loads a new object (or discards a previously cached one), it informs its parent. The parent propagates that information to its children or its parent or both using a limited flooding algorithm in which nodes only propagate changes relating to the nearest copies of data. For example, if a node already knows that one of its children has a copy of an object, then on receiving a hint from its parent about a new copy of the object, the node does not propagate the hint to its children. This is because, assuming that the hierarchy's topology reflects network locality, propagating the hint will not alter the information regarding the nearest copy of the object with respect to any of its children nodes. Table 4 examines how effective the metadata hierarchy is at filtering traffic.
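The pruning rule in the example above can be sketched as follows. This is our own simplification (node naming and the delivery log are ours, and it models only the downward direction of the flood): a node drops a hint arriving from its parent when one of its own children already holds a copy, since the nearest-copy information for that subtree is then unchanged.

```python
class Node:
    """A node in the metadata hierarchy (illustrative sketch)."""
    def __init__(self, children=()):
        self.children = list(children)
        self.child_copies = {}        # obj_id -> set of children holding it

    def note_child_copy(self, obj_id, child):
        self.child_copies.setdefault(obj_id, set()).add(child)

    def hint_from_parent(self, obj_id, delivered):
        if self.child_copies.get(obj_id):
            return                    # pruned: subtree already has a nearer copy
        for c in self.children:       # otherwise continue the limited flood
            delivered.append((self, c, obj_id))
            c.hint_from_parent(obj_id, delivered)
```

The effect is that a new replica announcement only reaches the regions of the tree whose nearest-copy answer it actually changes, which is what keeps the global update bandwidth small.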
One potential danger is that when an object is first
Table 4. Hint update traffic for a centralized directory organization versus the metadata hierarchy (90th percentile and peak values).
References
[1] A. Rousskov and D. Wessels. Cache Digests. In Proceedings of the 3rd International WWW Caching Workshop, June 1998.
[2] M. Arlitt and C. Williamson. Web Server Workload Characterization: The Search for Invariants. In Proceedings of the SIGMETRICS Conference on Measurement and Modeling of Computer Systems, May 1996. http://www.cs.usask.ca/projects/discus/discus
[35] D. Wessels and K. Claffy. Application of Internet Cache Protocol (ICP), version 2. Request for Comments RFC-2187, Network Working Group, 1997. http://ds.internic.net/rfc/rfc2187.txt.
[36] J. Yin, L. Alvisi, M. Dahlin, and C. Lin. Volume Leases to Support Consistency in Large-Scale Systems. IEEE Transactions on Knowledge and Data Engineering, February 1999.
[37] L. Zhang, S. Floyd, and V. Jacobson. Adaptive Web Caching. In Proceedings of the NLANR Web Cache Workshop, June 1997. http://ircache.nlanr.net/Cache/Workshop97/.