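[Document description] Computer Organization (English edition), Chapter 1: Computer-Abstractions-and-Technology, courseware (.ppt), 63 pages, 8.526 MB, uploaded by 小橙橙
When reposting, please keep the link: https://www.ichengzhen.cn/view-76280.html
The following is part of the text of this document: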
Chapter 1
Computer Abstractions and Technology

§1.1 Introduction

The Computer Revolution
  Progress in computer technology
    Underpinned by Moore's Law
  What is Moore's Law?
    Moore's law describes a long-term trend in the history of computing hardware.
    The quantity of transistors that can be placed inexpensively on an integrated circuit has doubled approximately every two years.

Moore's Law
  The trend has continued for more than half a century and is not expected to stop until 2015 or later.

The Computer Revolution
  Makes novel applications feasible
    Computers in automobiles
    Cell phones

The Computer Revolution
  Makes novel applications feasible
    Human genome project
    World Wide Web
    Search engines
  Computers are pervasive

Classes of Computers
  Question: How do you classify computers?
    Desktop computers
    Server computers
    Embedded computers
  Desktop computers
    PC
    General purpose, variety of software
    Subject to cost/performance tradeoff
  Server computers
    Network based
    High capacity, performance, reliability
    Range from small servers to building sized
    [Figure: world's smallest web server]
  Embedded computers
    Hidden as components of systems
    Stringent power/performance/cost constraints

The Processor Market

What You Will Learn
  How programs are translated into the machine language
    And how the hardware executes them
  The hardware/software interface
  What determines program performance
    And how it can be improved
  How hardware designers improve performance
  What is parallel processing

Levels of Program Code
  High-level language
    Level of abstraction closer to problem domain
    Provides for productivity and portability
  Assembly language
    Textual representation of instructions
  Hardware representation
    Binary digits (bits)
    Encoded instructions and data

§1.2 Below Your Program

Below Your Program
  Application software
    Written in high-level language (HLL)
  System software
    Compiler: translates HLL code to machine code
    Operating System: service code
      Handling input/output
      Managing memory and storage
      Scheduling tasks & sharing resources
  Hardware
    Processor, memory, I/O controllers

Understanding Performance
  Algorithm
    Determines number of operations executed
  Programming language, compiler, architecture
    Determine number of machine instructions executed per operation
  Processor and memory system
    Determine how fast instructions are executed
  I/O system (including OS)
    Determines how fast I/O operations are executed

§1.3 Under the Covers

Components of a Computer
  The BIG Picture
  Same components for all kinds of computer
    Desktop, server, embedded
  Input/output includes
    User-interface devices
      Display, keyboard, mouse
    Storage devices
      Hard disk, CD/DVD, flash
    Network adapters
      For communicating with other computers

Anatomy (structure) of a Computer
  [Figure labels: output device, input devices, network cable]

Anatomy of a Mouse
  Optical mouse
    LED illuminates desktop
    Small low-res camera
    Basic image processor
      Looks for x, y movement
    Buttons & wheel
  Supersedes roller-ball mechanical mouse

Through the Looking Glass
  LCD screen: picture elements (pixels)
    Mirrors content of frame buffer memory

Opening the Box

Inside the Processor (CPU)
  Datapath: performs operations on data
  Control: sequences datapath, memory, ...
  Cache memory
    Small fast SRAM memory for immediate access to data
    SRAM: Static Random Access Memory

Inside the Processor
  AMD Barcelona: 4 processor cores

A Safe Place for Data
  Volatile (changeable) main memory
    Loses instructions and data when power off
  Non-volatile secondary memory
    Magnetic disk
    Flash memory
    Optical disk (CDROM, DVD)

Networks
  Communication and resource sharing
  Local area network (LAN): Ethernet
    Within a building
  Wide area network (WAN): the Internet

Abstractions
  The BIG Picture
  Instruction Set Architecture (ISA)
    An interface between the hardware and the lowest-level software
    The abstract image of a computing system that is seen by a machine/assembly language programmer
    Including instructions, registers, memory access, I/O, …

ISAs
  System/360 and upwards compatible successors
  z/Architecture
  Power Architecture
  PDP-11
  SPARC
  SuperH
  Tricore
  Transputer
  UNIVAC 1100/2200 series
  VAX
  x86
    IA-32 (32-bit x86, first implemented in the Intel 80386)
    x86-64 (64-bit superset of IA-32, first implemented in the AMD Opteron)
  EISC (AE32K)
  4004, 4040
  6800, 6502, 6809, 68HC11, 68HC08
  8008, 8080, 8085, Z80, Z180, eZ80, etc.
  8048, 8051, etc.
  Z8, eZ8, etc.
  Alpha
  ARM
  Burroughs B5000 series
  Burroughs B6000/B7000 series
  eSi-RISC
  IA-64 (Itanium)
  Mico32
  MIPS
  Motorola 68k
  PA-RISC
  IBM 700/7000 lines

Abstractions
  Application Binary Interface (ABI)
    The low-level interface between an application program and OS
    ABIs cover details such as data type, size, alignment, calling convention, binary format of object files, etc.
    Defines a standard for binary portability across computers.

Abstractions
  Implementation
    Hardware that obeys the architecture abstraction
    Many implementations for the same ISA
    Example: Intel Pentium vs AMD Athlon, almost identical ISA, but radically different internal designs.

PERFORMANCE

§1.4 Performance

Defining Performance
  Which airplane has the best performance?
  [Figure: four bar charts comparing the Douglas DC-8-50, BAC/Sud Concorde, Boeing 747, and Boeing 777 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]

Defining Performance
  Performance = Speed
    E.g., taking a single passenger from one point to another
    Winner: Concorde
  Performance = Passenger Throughput
    Passengers × m.p.h.
    E.g., transporting 450 passengers from one point to another
    Winner: Boeing 747

Response Time and Throughput
  Response time
    How long it takes to do a task
  Throughput
    Total work done per unit time
      e.g., tasks/transactions/… per hour
  How are response time and throughput affected by
    Replacing the processor with a faster version?
    Adding more processors?
  We'll focus on response time for now…

Relative Performance
  Define Performance = 1/Execution Time
  "X is n times faster than Y"
    $$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
  Example: time taken to run a program
    10 s on A, 15 s on B
    $$\frac{\text{Execution Time}_B}{\text{Execution Time}_A} = \frac{15\,\text{s}}{10\,\text{s}} = 1.5$$
    So A is 1.5 times faster than B

Measuring Execution Time
  Elapsed time
    Total response time, including all aspects
      Processing, I/O, OS overhead, idle time
    Determines system performance
  CPU time
    Time spent processing a given job
      Discounts I/O time, other jobs' shares
    Comprises user CPU time and system CPU time
    Different programs are affected differently by CPU and system performance

CPU Clocking
  Operation of digital hardware governed by a constant-rate clock
  [Figure: timing diagram labelled clock (cycles), data transfer and computation, update state, clock period]
  Clock cycle time: duration of a clock cycle
    e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s
  Clock frequency (clock rate): cycles per second
    e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz

CPU Time
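  As a rough illustration, clock cycle time is the reciprocal of clock rate, and CPU time is the number of clock cycles multiplied by the cycle time. The sketch below assumes nothing beyond those two relations; the function names and the 20e9-cycle workload are hypothetical, chosen only for illustration.

```python
def cycle_time_s(clock_rate_hz: float) -> float:
    """Clock cycle time in seconds: the reciprocal of the clock rate in Hz."""
    return 1.0 / clock_rate_hz


def cpu_time_s(clock_cycles: float, clock_rate_hz: float) -> float:
    """CPU time = clock cycles x cycle time = clock cycles / clock rate."""
    return clock_cycles * cycle_time_s(clock_rate_hz)


if __name__ == "__main__":
    # A 4.0 GHz clock has a 250 ps cycle, matching the clocking example.
    print(f"cycle time at 4.0 GHz: {cycle_time_s(4.0e9) * 1e12:.0f} ps")
    # Assumed workload for illustration only: 20e9 cycles on a 2 GHz clock.
    print(f"CPU time: {cpu_time_s(20e9, 2.0e9):.1f} s")
```

  Halving the cycle count or doubling the clock rate has the same effect on `cpu_time_s`, which is why the two are traded off against each other.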
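  Performance improved by
    Reducing number of clock cycles
    Increasing clock rate
    Hardware designer must often trade off clock rate against cycle count
  $$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

CPU Time Example
  Computer A: 2 GHz clock, 10 s CPU time
  Designing Computer B
    Aim for 6 s CPU time
    Can do faster clock, but causes 1.2 × clock cycles
  How fast must Computer B's clock rate be?
  $$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$
  $$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$
  $$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$

Instruction Count and CPI
  Instruction Count for a program
    Determined by program, ISA and compiler
  Average cycles per instruction
    Determined by CPU hardware
    If different instructions have different CPI
      Average CPI affected by instruction mix
  $$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
  $$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

CPI Example
  Computer A: Cycle Time = 250 ps, CPI = 2.0
  Computer B: Cycle Time = 500 ps, CPI = 1.2
  Same ISA
  Which is faster, and by how much?
  $$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
    A is faster…
  $$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
  $$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
    …by this much

CPI in More Detail
  If different instruction classes take different numbers of cycles
  $$\text{Clock Cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)$$
  Weighted average CPI
  $$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
    The ratio $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency

CPI Example
  Alternative compiled code sequences using instructions in classes A, B, C (IC = Instruction Count)

    Class               A   B   C
    CPI for class       1   2   3
    IC in sequence 1    2   1   2
    IC in sequence 2    4   1   1

  What is avg. CPI?
  Sequence 1: IC = 5
    Clock Cycles = 2×1 + 1×2 + 2×3 = 10
    Avg. CPI = 10/5 = 2.0
  Sequence 2: IC = 6
    Clock Cycles = 4×1 + 1×2 + 1×3 = 9
    Avg. CPI = 9/6 = 1.5

Performance Summary
  The BIG Picture
  $$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
  Performance depends on
    Algorithm: affects IC, possibly CPI
    Programming language: affects IC, CPI
    Compiler: affects IC, CPI
    Instruction set architecture: affects IC, CPI, Tc

POWER

§1.5 The Power Wall

Power Trends
  In CMOS IC technology
  $$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$
    Annotations: power ×30, voltage 5 V → 1 V, frequency ×1000

Reducing Power
  Suppose a new CPU has
    85% of capacitive load of old CPU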
    15% voltage and 15% frequency reduction
  $$\frac{P_\text{new}}{P_\text{old}} = \frac{C_\text{old} \times 0.85 \times (V_\text{old} \times 0.85)^2 \times F_\text{old} \times 0.85}{C_\text{old} \times V_\text{old}^2 \times F_\text{old}} = 0.85^4 = 0.52$$
  The power wall
    We can't reduce voltage further
    We can't remove more heat
  How else can we improve performance?

§1.6 The Sea Change: The Switch to Multiprocessors

Uniprocessor Performance
  Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
  Multicore microprocessors
    More than one processor per chip
  Requires explicitly parallel programming
    Compare with instruction level parallelism
      Hardware executes multiple instructions at once
      Hidden from the programmer
    Hard to do
      Programming for performance
      Load balancing
      Optimizing communication and synchronization

MANUFACTURING

§1.7 Real Stuff: The AMD Opteron X4

Manufacturing ICs
  Yield: proportion of working dies per wafer

AMD Opteron X2 Wafer
  X2: 300 mm wafer, 117 chips, 90 nm technology
  X4: 45 nm technology

Integrated Circuit Cost
  Nonlinear relation to area and defect rate
    Wafer cost and area are fixed
    Defect rate determined by manufacturing process
    Die area determined by architecture and circuit design
  $$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$
  $$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$
  $$\text{Yield} = \frac{1}{(1 + (\text{Defects per area} \times \text{Die area}/2))^2}$$

BENCHMARKING

SPEC CPU Benchmark
  Programs used to measure performance
    Supposedly typical of actual workload
  Standard Performance Evaluation Corp (SPEC)
    Develops benchmarks for CPU, I/O, Web, …
  SPEC CPU2006
    Elapsed time to execute a selection of programs
      Negligible I/O, so focuses on CPU performance
    Normalize relative to reference machine
    Summarize as geometric mean of performance ratios
      CINT2006 (integer) and CFP2006 (floating-point)
  $$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$

CINT2006 for Opteron X4 2356

  Name        Description                     IC×10^9  CPI    Tc (ns)  Exec time  Ref time  SPECratio
  perl        Interpreted string processing   2,118    0.75   0.40       637       9,777    15.3
  bzip2       Block-sorting compression       2,389    0.85   0.40       817       9,650    11.8
  gcc         GNU C Compiler                  1,050    1.72   0.40       724       8,050    11.1
  mcf         Combinatorial optimization        336   10.00   0.40     1,345       9,120     6.8
  go          Go game (AI)                    1,658    1.09   0.40       721      10,490    14.6
  hmmer       Search gene sequence            2,783    0.80   0.40       890       9,330    10.5
  sjeng       Chess game (AI)                 2,176    0.96   0.40       837      12,100    14.5
  libquantum  Quantum computer simulation     1,623    1.61   0.40     1,047      20,720    19.8
  h264avc     Video compression               3,102    0.80   0.40       993      22,130    22.3
  omnetpp     Discrete event simulation         587    2.94   0.40       690       6,250     9.1
  astar       Games/path finding              1,082    1.79   0.40       773       7,020     9.1
  xalancbmk   XML parsing                     1,058    2.70   0.40     1,143       6,900     6.0
  Geometric mean                                                                            11.7
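
  The summary row of the CINT2006 table can be checked with a short sketch. It assumes the SPECratio of a program is the reference time divided by the measured execution time, and that the summary is the geometric mean of the twelve ratios listed in the table; the function and constant names are illustrative, not part of any SPEC tooling.

```python
import math

# SPECratio column of the CINT2006 table for the Opteron X4 2356,
# in table order (perl ... xalancbmk).
SPEC_RATIOS = [15.3, 11.8, 11.1, 6.8, 14.6, 10.5,
               14.5, 19.8, 22.3, 9.1, 9.1, 6.0]


def spec_ratio(ref_time_s: float, exec_time_s: float) -> float:
    """Performance ratio relative to the reference machine."""
    return ref_time_s / exec_time_s


def geometric_mean(values: list[float]) -> float:
    """n-th root of the product, computed via logs to avoid overflow."""
    if not values:
        raise ValueError("geometric mean of an empty list is undefined")
    return math.exp(sum(math.log(v) for v in values) / len(values))


if __name__ == "__main__":
    print(f"perl SPECratio: {spec_ratio(9777, 637):.1f}")
    print(f"geometric mean: {geometric_mean(SPEC_RATIOS):.1f}")
```

  The geometric mean is used here rather than the arithmetic mean because the values being summarized are ratios against a reference machine.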

SPEC Power Benchmark
  Power consumption of server at different workload levels
    Performance: ssj_ops/sec
    Power: Watts (Joules/sec)
  $$\text{Overall ssj\_ops per Watt} = \left( \sum_{i=0}^{10} \text{ssj\_ops}_i \right) \Big/ \left( \sum_{i=0}^{10} \text{power}_i \right)$$

SPECpower_ssj2008 for X4

  Target Load %       Performance (ssj_ops/sec)   Average Power (Watts)
  100%                  231,867                     295
  90%                   211,282                     286
  80%                   185,803                     275
  70%                   163,427                     265
  60%                   140,160                     256
  50%                   118,324                     246
  40%                    92,035                     233
  30%                    70,500                     222
  20%                    47,126                     206
  10%                    23,066                     180
  0%                          0                     141
  Overall sum         1,283,590                   2,605
  ∑ssj_ops / ∑power                                 493

§1.8 Fallacies and Pitfalls

Pitfall: Amdahl's Law
  Improving an aspect of a computer and expecting a proportional improvement in overall performance
  $$T_\text{improved} = \frac{T_\text{affected}}{\text{improvement factor}} + T_\text{unaffected}$$
  Example: multiply accounts for 80 s/100 s
    How much improvement in multiply performance to get 5× overall?
    A 5× overall speedup means a total of 100 s / 5 = 20 s, so
    $$20 = \frac{80}{n} + 20$$
    Can't be done! The unaffected 20 s already uses up the whole budget.

Fallacy: Low Power at Idle
  Look back at X4 power benchmark
    At 100% load: 295 W
    At 50% load: 246 W (83%)
    At 10% load: 180 W (61%)
  Google data center
    Mostly operates at 10% – 50% load
    At 100% load less than 1% of the time
  Consider designing processors to make power proportional to load
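
  The percentages on the slide above can be reproduced from the SPECpower_ssj2008 figures. The sketch below is only an illustration: the names are hypothetical, and the 40% row is read as 92,035 ssj_ops/sec, which is the value consistent with the stated overall sum of 1,283,590.

```python
# (target load %, ssj_ops/sec, average watts) for the Opteron X4,
# taken from the SPECpower_ssj2008 table. The 40% row is read as 92,035.
X4_POWER = [
    (100, 231_867, 295), (90, 211_282, 286), (80, 185_803, 275),
    (70, 163_427, 265), (60, 140_160, 256), (50, 118_324, 246),
    (40, 92_035, 233), (30, 70_500, 222), (20, 47_126, 206),
    (10, 23_066, 180), (0, 0, 141),
]


def overall_ssj_ops_per_watt(rows) -> float:
    """Sum of ssj_ops over all load levels divided by the sum of power."""
    total_ops = sum(ops for _, ops, _ in rows)
    total_watts = sum(watts for _, _, watts in rows)
    return total_ops / total_watts


def power_fraction_of_peak(rows) -> dict[int, float]:
    """Power at each load level as a fraction of the power at 100% load."""
    peak = next(watts for load, _, watts in rows if load == 100)
    return {load: watts / peak for load, _, watts in rows}


if __name__ == "__main__":
    print(f"overall: {overall_ssj_ops_per_watt(X4_POWER):.0f} ssj_ops per watt")
    for load, frac in power_fraction_of_peak(X4_POWER).items():
        print(f"{load:3d}% load -> {frac:.0%} of peak power")
```

  If power were proportional to load, the fraction at 10% load would be close to 10%; the table gives about 61%, which is the point of the fallacy.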

Pitfall: MIPS as a Performance Metric
  MIPS: Millions of Instructions Per Second
    Doesn't account for
      Differences in ISAs between computers
      Differences in complexity between instructions
  $$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^{6}} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$$
  CPI varies between programs on a given CPU

§1.9 Concluding Remarks

Concluding Remarks
  Cost/performance is improving
    Due to underlying technology development
  Hierarchical layers of abstraction
    In both hardware and software
  Instruction set architecture
    The hardware/software interface
  Execution time: the best performance measure
  Power is a limiting factor
    Use parallelism to improve performance
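
  Since execution time is named as the best performance measure, the chapter's CPU-time relation (instruction count × CPI × clock cycle time) can be sketched in a few lines. This is a minimal illustration using the numbers from the two CPI examples; the function names and the instruction count of 1e9 are assumptions made for the sketch, not values from the slides.

```python
def weighted_cpi(class_cpi: list[float], class_counts: list[int]) -> float:
    """Average CPI: total clock cycles divided by total instruction count."""
    if len(class_cpi) != len(class_counts):
        raise ValueError("one CPI value is needed per instruction class")
    cycles = sum(cpi * n for cpi, n in zip(class_cpi, class_counts))
    return cycles / sum(class_counts)


def cpu_time_s(instruction_count: float, cpi: float, cycle_time_s: float) -> float:
    """CPU time = instruction count x CPI x clock cycle time."""
    return instruction_count * cpi * cycle_time_s


def times_faster(slower_time_s: float, faster_time_s: float) -> float:
    """Relative performance: ratio of the two execution times."""
    return slower_time_s / faster_time_s


if __name__ == "__main__":
    # Classes A, B, C with CPI 1, 2, 3 and the two compiled sequences.
    print(weighted_cpi([1, 2, 3], [2, 1, 2]))
    print(weighted_cpi([1, 2, 3], [4, 1, 1]))

    # Same ISA, so the same (assumed) instruction count on both computers.
    ic = 1e9
    time_a = cpu_time_s(ic, cpi=2.0, cycle_time_s=250e-12)
    time_b = cpu_time_s(ic, cpi=1.2, cycle_time_s=500e-12)
    print(f"A is {times_faster(time_b, time_a):.1f} times faster than B")
```

  The instruction count cancels in the ratio, so any positive value gives the same relative performance for the two computers.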