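[Document description] Principles of Computer Organization (English edition), Chapter 1: Computer-Abstractions-and-Technology, courseware (.ppt), 63 pages, 8.526 MB, uploaded by 小橙橙
Please keep this link when reposting: https://www.ichengzhen.cn/view-76280.html
The following is part of the text of this document: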
Chapter 1: Computer Abstractions and Technology

§1.1 Introduction

The Computer Revolution
- Progress in computer technology
  - Underpinned by Moore's Law

What is Moore's Law?
- Moore's Law describes a long-term trend in the history of computing hardware.
- The number of transistors that can be placed inexpensively on an integrated circuit has doubled approximately every two years.
- The trend has continued for more than half a century and is not expected to stop until 2015 or later.

The Computer Revolution
- Makes novel applications feasible
  - Computers in automobiles
  - Cell phones
  - Human genome project
  - World Wide Web
  - Search engines
- Computers are pervasive

Classes of Computers
- Question: How do you classify computers?
  - Desktop computers
  - Server computers
  - Embedded computers
- Desktop computers
  - PC
  - General purpose, variety of software
  - Subject to cost/performance tradeoff
- Server computers
  - Network based
  - High capacity, performance, reliability
  - Range from small servers to building sized
  - World's smallest web server
- Embedded computers
  - Hidden as components of systems
  - Stringent power/performance/cost constraints

The Processor Market

What You Will Learn
- How programs are translated into the machine language
  - And how the hardware executes them
- The hardware/software interface
- What determines program performance
  - And how it can be improved
- How hardware designers improve performance
- What is parallel processing

Levels of Program Code
- High-level language
  - Level of abstraction closer to problem domain
  - Provides for productivity and portability
- Assembly language
  - Textual representation of instructions
- Hardware representation
  - Binary digits (bits)
  - Encoded instructions and data

§1.2 Below Your Program

Below Your Program
- Application software
  - Written in high-level language (HLL)
- System software
  - Compiler: translates HLL code to machine code
  - Operating system: service code
    - Handling input/output
    - Managing memory and storage
    - Scheduling tasks and sharing resources
- Hardware
  - Processor, memory, I/O controllers

Understanding Performance
- Algorithm
  - Determines number of operations executed
- Programming language, compiler, architecture
  - Determine number of machine instructions executed per operation
- Processor and memory system
  - Determine how fast instructions are executed
- I/O system (including OS)
  - Determines how fast I/O operations are executed

§1.3 Under the Covers

Components of a Computer (The BIG Picture)
- Same components for all kinds of computer
  - Desktop, server, embedded
- Input/output includes
  - User-interface devices
    - Display, keyboard, mouse
  - Storage devices
    - Hard disk, CD/DVD, flash
  - Network adapters
    - For communicating with other computers

Anatomy (structure) of a Computer
[Figure labels: output device, input device, input device, network cable]

Anatomy of a Mouse
- Optical mouse
  - LED illuminates desktop
  - Small low-res camera
  - Basic image processor
    - Looks for x, y movement
  - Buttons and wheel
- Supersedes roller-ball mechanical mouse

Through the Looking Glass
- LCD screen: picture elements (pixels)
  - Mirrors content of frame buffer memory

Opening the Box

Inside the Processor (CPU)
- Datapath: performs operations on data
- Control: sequences datapath, memory, ...
- Cache memory
  - Small fast SRAM memory for immediate access to data
  - SRAM: Static Random Access Memory
- AMD Barcelona: 4 processor cores

A Safe Place for Data
- Volatile (changeable) main memory
  - Loses instructions and data when power is off
- Non-volatile secondary memory
  - Magnetic disk
  - Flash memory
  - Optical disk (CD-ROM, DVD)

Networks
- Communication and resource sharing
- Local area network (LAN): Ethernet
  - Within a building
- Wide area network (WAN): the Internet

Abstractions (The BIG Picture)
- Instruction Set Architecture (ISA)
  - An interface between the hardware and the lowest-level software
  - The abstract image of a computing system that is seen by a machine/assembly language programmer
  - Including instructions, registers, memory access, I/O, …

ISAs
- System/360 and upwards compatible successors
- z/Architecture
- Power Architecture
- PDP-11
- SPARC
- SuperH
- TriCore
- Transputer
- UNIVAC 1100/2200 series
- VAX
- x86
  - IA-32 (32-bit x86, first implemented in the Intel 80386)
  - x86-64 (64-bit superset of IA-32, first implemented in the AMD Opteron)
- EISC (AE32K)
- 4004, 4040
- 6800, 6502, 6809, 68HC11, 68HC08
- 8008, 8080, 8085, Z80, Z180, eZ80, etc.
- 8048, 8051, etc.
- Z8, eZ8, etc.
- Alpha
- ARM
- Burroughs B5000 series
- Burroughs B6000/B7000 series
- eSi-RISC
- IA-64 (Itanium)
- Mico32
- MIPS
- Motorola 68k
- PA-RISC
- IBM 700/7000 lines

Abstractions
- Application Binary Interface (ABI)
  - The low-level interface between an application program and the OS
  - ABIs cover details such as data type, size, alignment, calling convention, binary format
    of object files, etc.
  - Defines a standard for binary portability across computers.

Abstractions
- Implementation
  - Hardware that obeys the architecture abstraction
  - Many implementations for the same ISA
  - Example: Intel Pentium vs. AMD Athlon: almost identical ISA, but radically different internal designs.

PERFORMANCE

§1.4 Performance

Defining Performance
- Which airplane has the best performance?
  [Charts comparing Douglas DC-8-50, BAC/Sud Concorde, Boeing 747, and Boeing 777 on passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]
- Performance = speed
  - E.g., taking a single passenger from one point to another
  - Winner: Concorde
- Performance = passenger throughput
  - Passengers × mph
  - E.g., transporting 450
    passengers from one point to another
  - Winner: Boeing 747

Response Time and Throughput
- Response time
  - How long it takes to do a task
- Throughput
  - Total work done per unit time
  - e.g., tasks/transactions/… per hour
- How are response time and throughput affected by
  - Replacing the processor with a faster version?
  - Adding more processors?
- We'll focus on response time for now…

Relative Performance
- Define Performance = 1/Execution Time
- "X is n times faster than Y"
  $$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$
- Example: time taken to run a program
  - 10s on A, 15s on B
  - Execution Time_B / Execution Time_A = 15s / 10s = 1.5
  - So A is 1.5 times faster than B

Measuring Execution Time
- Elapsed time
  - Total response time, including all aspects
    - Processing, I/O, OS overhead, idle time
  - Determines system performance
- CPU time
  - Time spent processing a given job
    - Discounts I/O time, other jobs' shares
  - Comprises user CPU time and system CPU time
  - Different programs are affected differently by CPU and system performance

CPU Clocking
- Operation of digital hardware governed by a constant-rate clock
  [Timing diagram labels: clock (cycles), data transfer and computation, update state, clock period]
- Clock cycle time (clock period): duration of a clock cycle
  - e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
- Clock frequency (clock rate): cycles per second
  - e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz

CPU Time
  $$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$
- Performance improved by
  - Reducing number of clock cycles
  - Increasing clock rate
  - Hardware designer must often trade off clock rate against cycle count

CPU Time Example
- Computer A: 2GHz clock, 10s CPU time
- Designing Computer B
  - Aim for 6s CPU time
  - Can do faster clock, but causes 1.2 × clock cycles
- How fast must Computer B's clock rate be?
  $$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$
  $$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$
  $$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$

Instruction Count and CPI
  $$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$
  $$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$
- Instruction Count for a program
  - Determined by program, ISA and compiler
- Average cycles per instruction
  - Determined by CPU hardware
  - If different instructions have different CPI
    - Average CPI affected by instruction mix

CPI Example
- Computer A: Cycle Time = 250ps, CPI = 2.0
- Computer B: Cycle Time = 500ps, CPI = 1.2
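- Same ISA
- Which is faster, and by how much?
  $$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$
  $$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$
  $$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$
  - A is faster, by a factor of 1.2

CPI in More Detail
- If different instruction classes take different numbers of cycles
  $$\text{Clock Cycles} = \sum_{i=1}^{n} (\text{CPI}_i \times \text{Instruction Count}_i)$$
- Weighted average CPI
  $$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left( \text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}} \right)$$
  - The fraction Instruction Count_i / Instruction Count is the relative frequency of class i

CPI Example
- Alternative compiled code sequences using instructions in classes A, B, C (IC = Instruction Count)

  Class               A   B   C
  CPI for class       1   2   3
  IC in sequence 1    2   1   2
  IC in sequence 2    4   1   1

- What is avg. CPI?
- Sequence 1: IC = 5
  - Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  - Avg. CPI = 10/5 = 2.0
- Sequence 2: IC = 6
  - Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  - Avg. CPI = 9/6 = 1.5

Performance Summary (The BIG Picture)
  $$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$
- Performance depends on
  - Algorithm: affects IC, possibly CPI
  - Programming language: affects IC, CPI
  - Compiler: affects IC, CPI
  - Instruction set architecture: affects IC, CPI, Tc

POWER

§1.5 The Power Wall

Power Trends
- In CMOS IC technology
  $$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$
  - Chart annotations: power ×30, voltage 5V → 1V, frequency ×1000

Reducing Power
- Suppose a new CPU has
  - 85% of capacitive load of old CPU
  - 15% voltage and 15% frequency reduction
  $$\frac{P_\text{new}}{P_\text{old}} = \frac{C_\text{old} \times 0.85 \times (V_\text{old} \times 0.85)^2 \times F_\text{old} \times 0.85}{C_\text{old} \times V_\text{old}^2 \times F_\text{old}} = 0.85^4 = 0.52$$
- The power wall
  - We can't reduce voltage further
  - We can't remove more heat
- How else can we improve performance?

§1.6 The Sea Change: The Switch to Multiprocessors

Uniprocessor Performance
- Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
- Multicore microprocessors
  - More than one processor per chip
- Requires explicitly parallel programming
  - Compare with instruction level parallelism
    - Hardware executes multiple instructions at once
    - Hidden from the programmer
  - Hard to do
    - Programming for performance
    - Load balancing
    - Optimizing communication and synchronization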
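
The worked examples from §1.4 and §1.5 (CPU Time Example, CPI Example, weighted average CPI, and Reducing Power) can be re-checked with a short script. This is a minimal sketch and not part of the original slides: the function and variable names (`cpu_time`, `average_cpi`, `power_ratio`, and so on) are illustrative assumptions, and the instruction count used for the CPI comparison is an arbitrary placeholder because it cancels out of the ratio.

```python
def cpu_time(instruction_count, cpi, cycle_time):
    """CPU time = instruction count x CPI x clock cycle time (seconds)."""
    return instruction_count * cpi * cycle_time


def average_cpi(cpi_per_class, count_per_class):
    """Weighted average CPI = total clock cycles / total instruction count."""
    cycles = sum(c * n for c, n in zip(cpi_per_class, count_per_class))
    return cycles / sum(count_per_class)


def power_ratio(cap_scale, volt_scale, freq_scale):
    """P_new / P_old for power = capacitive load x voltage^2 x frequency."""
    return cap_scale * volt_scale ** 2 * freq_scale


# CPU Time Example: A runs 10 s at 2 GHz; B needs 1.2x the cycles in 6 s.
cycles_a = 10 * 2e9
clock_rate_b = 1.2 * cycles_a / 6          # in Hz

# CPI Example: same ISA, so both machines execute the same instruction count.
ic = 1_000_000                             # hypothetical; cancels in the ratio
time_a = cpu_time(ic, 2.0, 250e-12)
time_b = cpu_time(ic, 1.2, 500e-12)
speedup_a_over_b = time_b / time_a

# Weighted average CPI for the two code sequences (classes A, B, C).
cpi_seq1 = average_cpi([1, 2, 3], [2, 1, 2])
cpi_seq2 = average_cpi([1, 2, 3], [4, 1, 1])

# Reducing Power: 85% capacitive load, 15% lower voltage and frequency.
p_new_over_p_old = power_ratio(0.85, 0.85, 0.85)

print(f"Clock rate B: {clock_rate_b / 1e9:.1f} GHz")
print(f"A is {speedup_a_over_b:.1f}x faster than B")
print(f"Average CPI: sequence 1 = {cpi_seq1}, sequence 2 = {cpi_seq2}")
print(f"P_new / P_old = {p_new_over_p_old:.2f}")
```

Run as written, the printed values should agree with the slides: a 4 GHz clock for Computer B, a factor of 1.2 between A and B, average CPIs of 2.0 and 1.5, and a power ratio of about 0.52.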

MANUFACTURING

§1.7 Real Stuff: The AMD Opteron X4

Manufacturing ICs
- Yield: proportion of working dies per wafer

AMD Opteron X2 Wafer
- X2: 300mm wafer, 117 chips, 90nm technology
- X4: 45nm technology

Integrated Circuit Cost
  $$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$
  $$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$
  $$\text{Yield} = \frac{1}{(1 + (\text{Defects per area} \times \text{Die area}/2))^2}$$
- Nonlinear relation to area and defect rate
  - Wafer cost and area are fixed
  - Defect rate determined by manufacturing process
  - Die area determined by architecture and circuit design

BENCHMARKING

SPEC CPU Benchmark
- Programs used to measure performance
  - Supposedly typical of actual workload
- Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, …
- SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to reference machine
  - Summarize as geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)
  $$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$
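
The geometric-mean summary described above can be sketched in a few lines. This is an illustrative example, not code from the slides: the helper name `geometric_mean` is an assumption, and the input list is the SPECratio column reported for the Opteron X4 2356 in this deck's CINT2006 table. The mean is computed through logarithms, which is equivalent to taking the n-th root of the product.

```python
import math


def geometric_mean(ratios):
    """n-th root of the product of n ratios, computed via logarithms."""
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))


# SPECratio values for the twelve CINT2006 programs, in the order
# perl, bzip2, gcc, mcf, go, hmmer, sjeng, libquantum, h264avc,
# omnetpp, astar, xalancbmk.
spec_ratios = [15.3, 11.8, 11.1, 6.8, 14.6, 10.5, 14.5, 19.8, 22.3, 9.1, 9.1, 6.0]

cint2006_mean = geometric_mean(spec_ratios)
print(f"Geometric mean of SPECratios: {cint2006_mean:.1f}")
```

The result rounds to 11.7, the geometric mean quoted in the CINT2006 table.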

CINT2006 for Opteron X4 2356

Name        Description                    IC×10^9   CPI     Tc (ns)  Exec time (s)  Ref time (s)  SPECratio
perl        Interpreted string processing  2,118     0.75    0.40     637            9,777         15.3
bzip2       Block-sorting compression      2,389     0.85    0.40     817            9,650         11.8
gcc         GNU C Compiler                 1,050     1.72    0.40     724            8,050         11.1
mcf         Combinatorial optimization     336       10.00   0.40     1,345          9,120         6.8
go          Go game (AI)                   1,658     1.09    0.40     721            10,490        14.6
hmmer       Search gene sequence           2,783     0.80    0.40     890            9,330         10.5
sjeng       Chess game (AI)                2,176     0.96    0.40     837            12,100        14.5
libquantum  Quantum computer simulation    1,623     1.61    0.40     1,047          20,720        19.8
h264avc     Video compression              3,102     0.80    0.40     993            22,130        22.3
omnetpp     Discrete event simulation      587       2.94    0.40     690            6,250         9.1
astar       Games/path finding             1,082     1.79    0.40     773            7,020         9.1
xalancbmk   XML parsing                    1,058     2.70    0.40     1,143          6,900         6.0
Geometric mean                                                                                     11.7

SPEC Power Benchmark
- Power consumption of server
  at different workload levels
  - Performance: ssj_ops/sec
  - Power: Watts (Joules/sec)
  $$\text{Overall ssj\_ops per Watt} = \left( \sum_{i=0}^{10} \text{ssj\_ops}_i \right) \Big/ \left( \sum_{i=0}^{10} \text{power}_i \right)$$

SPECpower_ssj2008 for X4

Target Load %        Performance (ssj_ops/sec)   Average Power (Watts)
100%                 231,867                     295
90%                  211,282                     286
80%                  185,803                     275
70%                  163,427                     265
60%                  140,160                     256
50%                  118,324                     246
40%                  92,035                      233
30%                  70,500                      222
20%                  47,126                      206
10%                  23,066                      180
0%                   0                           141
Overall sum          1,283,590                   2,605
∑ssj_ops / ∑power                                493

§1.8 Fallacies and Pitfalls

Pitfall: Amdahl's Law
- Improving an aspect of a computer and expecting a proportional improvement in overall performance
  $$T_\text{improved} = \frac{T_\text{affected}}{\text{improvement factor}} + T_\text{unaffected}$$
- Example: multiply accounts for 80s/100s
  - How much improvement in multiply performance to get 5× overall?
  $$20 = \frac{80}{n} + 20$$
  - Can't be done!

Fallacy: Low Power at Idle
- Look back at X4 power benchmark
  - At 100% load: 295W
  - At 50% load: 246W (83%)
  - At 10% load: 180W (61%)
- Google data center
  - Mostly operates at 10%–50% load
  - At 100% load less than 1% of the time
- Consider designing processors to make power proportional to load

Pitfall: MIPS as a Performance Metric
- MIPS: Millions of Instructions Per Second
  - Doesn't account for
    - Differences in ISAs between computers
    - Differences in complexity between instructions
  $$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^{6}} = \frac{\text{Instruction count}}{\frac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^{6}} = \frac{\text{Clock rate}}{\text{CPI} \times 10^{6}}$$
- CPI varies between programs on a given CPU

§1.9 Concluding Remarks

Concluding Remarks
- Cost/performance is improving
  - Due to underlying technology development
- Hierarchical layers of abstraction
  - In both hardware and software
- Instruction set architecture
  - The hardware/software interface
- Execution time: the best performance measure
- Power is a limiting factor
  - Use parallelism to improve performance
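
The two pitfalls from §1.8 can also be explored numerically. The sketch below is an illustration under stated assumptions, not material from the slides: the function names are hypothetical, the 4× target is an extra case added for contrast with the 5× example, and the 4 GHz clock and the two CPI values in the MIPS comparison are made-up inputs.

```python
def improved_time(t_affected, t_unaffected, improvement_factor):
    """Amdahl's Law: T_improved = T_affected / factor + T_unaffected."""
    return t_affected / improvement_factor + t_unaffected


def required_factor(t_affected, t_unaffected, target_time):
    """Improvement factor needed to reach target_time, or None if impossible."""
    remaining = target_time - t_unaffected
    if remaining <= 0:
        return None  # even an infinitely fast component cannot reach the target
    return t_affected / remaining


def mips(clock_rate_hz, cpi):
    """MIPS = clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


# Multiply takes 80 s of a 100 s run; a 5x overall speedup means 20 s total.
factor_for_5x = required_factor(80, 20, 100 / 5)   # None: can't be done
factor_for_4x = required_factor(80, 20, 100 / 4)   # hypothetical 4x target

# Same hypothetical 4 GHz clock with two different CPIs: the MIPS rating
# changes with the program's CPI even though the hardware does not.
mips_low_cpi = mips(4e9, 1.0)
mips_high_cpi = mips(4e9, 2.0)

print(factor_for_5x, factor_for_4x, mips_low_cpi, mips_high_cpi)
```

The 5× target returns `None` because the 20 s of unaffected time already uses up the whole budget, which is the slide's "Can't be done!" result; relaxing the target to 4× leaves 5 s for multiply and so requires a 16× improvement.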