[Document note] Principles of Computer Organization, English edition, Chapter 1: Computer Abstractions and Technology (slides, .ppt), 63 pages, 8.526 MB, uploaded by 小橙橙
Chapter 1
Computer Abstractions and Technology

§1.1 Introduction

The Computer Revolution
- Progress in computer technology
  - Underpinned by Moore's Law
- What is Moore's Law?
  - Moore's law describes a long-term trend in the history of computing hardware.
  - The quantity of transistors that can be placed inexpensively on an integrated circuit has doubled approximately every two years.

Moore's Law
- The trend has continued for more than half a century and is not expected to stop until 2015 or later.

The Computer Revolution
- Makes novel applications feasible
  - Computers in automobiles
  - Cell phones
  - Human genome project
  - World Wide Web
  - Search engines
- Computers are pervasive

Classes of Computers
- Question: How do you classify computers?
  - Desktop computers
  - Server computers
  - Embedded computers
- Desktop computers (PC)
  - General purpose, variety of software
  - Subject to cost/performance tradeoff
- Server computers
  - Network based
  - High capacity, performance, reliability
  - Range from small servers to building sized
  - Figure caption: World's smallest web server
- Embedded computers
  - Hidden as components of systems
  - Stringent power/performance/cost constraints

The Processor Market

What You Will Learn
- How programs are translated into the machine language
  - And how the hardware executes them
- The hardware/software interface
- What determines program performance
  - And how it can be improved
- How hardware designers improve performance
- What is parallel processing

Levels of Program Code
- High-level language
  - Level of abstraction closer to problem domain
  - Provides for productivity and portability
- Assembly language
  - Textual representation of instructions
- Hardware representation
  - Binary digits (bits)
  - Encoded instructions and data

§1.2 Below Your Program

Below Your Program
- Application software
  - Written in high-level language (HLL)
- System software
  - Compiler: translates HLL code to machine code
  - Operating System: service code
    - Handling input/output
    - Managing memory and storage
    - Scheduling tasks & sharing resources
- Hardware
  - Processor, memory, I/O controllers

Understanding Performance
- Algorithm
  - Determines number of operations executed
- Programming language, compiler, architecture
  - Determine number of machine instructions executed per operation
- Processor and memory system
  - Determine how fast instructions are executed
- I/O system (including OS)
  - Determines how fast I/O operations are executed

§1.3 Under the Covers

Components of a Computer
The BIG Picture
- Same components for all kinds of computer
  - Desktop, server, embedded
- Input/output includes
  - User-interface devices
    - Display, keyboard, mouse
  - Storage devices
    - Hard disk, CD/DVD, flash
  - Network adapters
    - For communicating with other computers

Anatomy (structure) of a Computer
- Figure labels: output device, input device, input device, network cable

Anatomy of a Mouse
- Optical mouse
  - LED illuminates desktop
  - Small low-res camera
  - Basic image processor
    - Looks for x, y movement
  - Buttons & wheel
- Supersedes roller-ball mechanical mouse

Through the Looking Glass
- LCD screen: picture elements (pixels)
  - Mirrors content of frame buffer memory

Opening the Box

Inside the Processor (CPU)
- Datapath: performs operations on data
- Control: sequences datapath, memory, ...
- Cache memory
  - Small fast SRAM memory for immediate access to data
  - SRAM: Static Random Access Memory

Inside the Processor
- AMD Barcelona: 4 processor cores

A Safe Place for Data
- Volatile (changeable) main memory
  - Loses instructions and data when power off
- Non-volatile secondary memory
  - Magnetic disk
  - Flash memory
  - Optical disk (CDROM, DVD)

Networks
- Communication and resource sharing
- Local area network (LAN): Ethernet
  - Within a building
- Wide area network (WAN): the Internet

Abstractions
The BIG Picture
- Instruction Set Architecture (ISA)
  - An interface between the hardware and the lowest-level software
  - The abstract image of a computing system that is seen by a machine/assembly language programmer
  - Including instructions, registers, memory access, I/O, …

ISAs
- System/360 and upwards compatible successors
  - z/Architecture
- Power Architecture
- PDP-11
- SPARC
- SuperH
- TriCore
- Transputer
- UNIVAC 1100/2200 series
- VAX
- x86
  - IA-32 (32-bit x86, first implemented in the Intel 80386)
  - x86-64 (64-bit superset of IA-32, first implemented in the AMD Opteron)
- EISC (AE32K)
- 4004, 4040
- 6800, 6502, 6809, 68HC11, 68HC08
- 8008, 8080, 8085, Z80, Z180, eZ80, etc.
- 8048, 8051, etc.
- Z8, eZ8, etc.
- Alpha
- ARM
- Burroughs B5000 series
- Burroughs B6000/B7000 series
- eSi-RISC
- IA-64 (Itanium)
- Mico32
- MIPS
- Motorola 68k
- PA-RISC
- IBM 700/7000 lines

Abstractions
- Application Binary Interface (ABI)
  - The low-level interface between an application program and the OS
  - ABIs cover details such as data type, size, alignment, calling convention, binary format of object files, etc.
  - Defines a standard for binary portability across computers.

Abstractions
- Implementation
  - Hardware that obeys the architecture abstraction
  - Many implementations for the same ISA
  - Example: Intel Pentium vs AMD Athlon, almost identical ISA, but radically different internal designs.

PERFORMANCE

§1.4 Performance

Defining Performance
- Which airplane has the best performance?
- [Figure: four bar charts comparing the Douglas DC-8-50, BAC/Sud Concorde, Boeing 747, and Boeing 777 by passenger capacity, cruising range (miles), cruising speed (mph), and passengers × mph]

Defining Performance
- Performance = Speed
  - E.g., taking a single passenger from one point to another
  - Winner: Concorde
- Performance = Passenger Throughput
  - Passengers × m.p.h.
  - E.g., transporting 450 passengers from one point to another
  - Winner: Boeing 747

Response Time and Throughput
- Response time
  - How long it takes to do a task
- Throughput
  - Total work done per unit time
    - e.g., tasks/transactions/… per hour

Response Time and Throughput
- How are response time and throughput affected by
  - Replacing the processor with a faster version?
  - Adding more processors?
- We'll focus on response time for now…

Relative Performance
- Define Performance = 1/Execution Time
- "X is n times faster than Y"

$$\frac{\text{Performance}_X}{\text{Performance}_Y} = \frac{\text{Execution time}_Y}{\text{Execution time}_X} = n$$

- Example: time taken to run a program
  - 10 s on A, 15 s on B
  - Execution Time_B / Execution Time_A = 15 s / 10 s = 1.5
  - So A is 1.5 times faster than B

Measuring Execution Time
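
The "n times faster" ratio and the difference between elapsed time and CPU time can be tried out directly. The following is only a minimal sketch: the names `relative_performance`, `measure`, and `busy_then_idle` are made up for this illustration, and the 50 ms sleep is an assumed stand-in for I/O or idle time rather than anything taken from the slides.

```python
import time


def relative_performance(time_x: float, time_y: float) -> float:
    """Return n such that X is n times faster than Y (n = time_y / time_x)."""
    if time_x <= 0 or time_y <= 0:
        raise ValueError("execution times must be positive")
    return time_y / time_x


def measure(task):
    """Run task() once and return (elapsed_seconds, cpu_seconds)."""
    wall_start = time.perf_counter()  # wall-clock (elapsed) time
    cpu_start = time.process_time()   # user + system CPU time of this process
    task()
    cpu_seconds = time.process_time() - cpu_start
    elapsed_seconds = time.perf_counter() - wall_start
    return elapsed_seconds, cpu_seconds


def busy_then_idle():
    """Hypothetical workload: some computation followed by idle time."""
    sum(i * i for i in range(200_000))  # CPU-bound part
    time.sleep(0.05)                    # idle part, counted only in elapsed time


if __name__ == "__main__":
    # Slide example: 10 s on A, 15 s on B.
    print(relative_performance(10.0, 15.0))
    elapsed, cpu = measure(busy_then_idle)
    print(f"elapsed = {elapsed:.3f} s, CPU = {cpu:.3f} s")
```

Because the sleep contributes to elapsed time but not to CPU time, the two measurements should differ by roughly the idle period; the exact figures depend on the machine and are not shown here.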
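- Elapsed time
  - Total response time, including all aspects
    - Processing, I/O, OS overhead, idle time
  - Determines system performance
- CPU time
  - Time spent processing a given job
    - Discounts I/O time, other jobs' shares
  - Comprises user CPU time and system CPU time
  - Different programs are affected differently by CPU and system performance

CPU Clocking
- Operation of digital hardware governed by a constant-rate clock
- [Figure: clock timing diagram with labels: clock (cycles), data transfer and computation, update state, clock period]
- Clock cycle time: duration of a clock cycle
  - e.g., 250 ps = 0.25 ns = $250 \times 10^{-12}$ s
- Clock frequency (clock rate): cycles per second
  - e.g., 4.0 GHz = 4000 MHz = $4.0 \times 10^{9}$ Hz

CPU Time

$$\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \frac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$$

- Performance improved by
  - Reducing number of clock cycles
  - Increasing clock rate
  - Hardware designer must often trade off clock rate against cycle count

CPU Time Example
- Computer A: 2 GHz clock, 10 s CPU time
- Designing Computer B
  - Aim for 6 s CPU time
  - Can do faster clock, but causes 1.2 × clock cycles
- How fast must Computer B's clock rate be?

$$\text{Clock Rate}_B = \frac{\text{Clock Cycles}_B}{\text{CPU Time}_B} = \frac{1.2 \times \text{Clock Cycles}_A}{6\,\text{s}}$$

$$\text{Clock Cycles}_A = \text{CPU Time}_A \times \text{Clock Rate}_A = 10\,\text{s} \times 2\,\text{GHz} = 20 \times 10^{9}$$

$$\text{Clock Rate}_B = \frac{1.2 \times 20 \times 10^{9}}{6\,\text{s}} = \frac{24 \times 10^{9}}{6\,\text{s}} = 4\,\text{GHz}$$

Instruction Count and CPI

$$\text{Clock Cycles} = \text{Instruction Count} \times \text{Cycles per Instruction}$$

$$\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \frac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$$

- Instruction Count for a program
  - Determined by program, ISA and compiler
- Average cycles per instruction
  - Determined by CPU hardware
  - If different instructions have different CPI
    - Average CPI affected by instruction mix

CPI Example
- Computer A: Cycle Time = 250 ps, CPI = 2.0
- Computer B: Cycle Time = 500 ps, CPI = 1.2
- Same ISA
- Which is faster, and by how much?

$$\text{CPU Time}_A = \text{Instruction Count} \times \text{CPI}_A \times \text{Cycle Time}_A = I \times 2.0 \times 250\,\text{ps} = I \times 500\,\text{ps}$$

$$\text{CPU Time}_B = \text{Instruction Count} \times \text{CPI}_B \times \text{Cycle Time}_B = I \times 1.2 \times 500\,\text{ps} = I \times 600\,\text{ps}$$

- A is faster, by this much:

$$\frac{\text{CPU Time}_B}{\text{CPU Time}_A} = \frac{I \times 600\,\text{ps}}{I \times 500\,\text{ps}} = 1.2$$

CPI in More Detail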
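
The relation CPU Time = Instruction Count × CPI × Clock Cycle Time, and the weighted-average form of CPI, can be checked with a few lines of code. This is a sketch under stated assumptions: `cpu_time` and `average_cpi` are illustrative names, the instruction count is set to 1 because it cancels when both machines run the same program on the same ISA, and the per-class figures are the class A/B/C numbers from this chapter's compiled-code example.

```python
def cpu_time(instruction_count: float, cpi: float, cycle_time: float) -> float:
    """CPU Time = Instruction Count x CPI x Clock Cycle Time."""
    return instruction_count * cpi * cycle_time


def average_cpi(class_cpi, class_counts) -> float:
    """Weighted average CPI = total clock cycles / total instruction count."""
    if len(class_cpi) != len(class_counts):
        raise ValueError("need one instruction count per instruction class")
    total_instructions = sum(class_counts)
    if total_instructions <= 0:
        raise ValueError("instruction count must be positive")
    clock_cycles = sum(cpi * count for cpi, count in zip(class_cpi, class_counts))
    return clock_cycles / total_instructions


if __name__ == "__main__":
    ic = 1.0  # assumption: same program and ISA, so IC cancels in the ratio
    time_a = cpu_time(ic, 2.0, 250e-12)  # Computer A: CPI 2.0, 250 ps cycle
    time_b = cpu_time(ic, 1.2, 500e-12)  # Computer B: CPI 1.2, 500 ps cycle
    print(time_b / time_a)               # how many times faster A is than B

    class_cpi = [1, 2, 3]                        # classes A, B, C
    print(average_cpi(class_cpi, [2, 1, 2]))     # code sequence 1
    print(average_cpi(class_cpi, [4, 1, 1]))     # code sequence 2
```

Keeping the cycle count as a sum over instruction classes makes it easy to see why a sequence with more instructions can still need fewer cycles when its mix favours the cheaper classes.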
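- If different instruction classes take different numbers of cycles

$$\text{Clock Cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction Count}_i\right)$$

- Weighted average CPI

$$\text{CPI} = \frac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \frac{\text{Instruction Count}_i}{\text{Instruction Count}}\right)$$

  - The fraction $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency

CPI Example
- Alternative compiled code sequences using instructions in classes A, B, C

  Class              A   B   C
  CPI for class      1   2   3
  IC in sequence 1   2   1   2
  IC in sequence 2   4   1   1

- What is avg. CPI? (IC = Instruction Count)
- Sequence 1: IC = 5
  - Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  - Avg. CPI = 10/5 = 2.0
- Sequence 2: IC = 6
  - Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  - Avg. CPI = 9/6 = 1.5

Performance Summary
- Performance depends on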
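  - Algorithm: affects IC, possibly CPI
  - Programming language: affects IC, CPI
  - Compiler: affects IC, CPI
  - Instruction set architecture: affects IC, CPI, Tc
- The BIG Picture

$$\text{CPU Time} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Clock cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Clock cycle}}$$

POWER

§1.5 The Power Wall

Power Trends
- In CMOS IC technology

$$\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$$

  - Chart annotations: power ×30, voltage 5 V → 1 V, frequency ×1000

Reducing Power
- Suppose a new CPU has
  - 85% of capacitive load of old CPU
  - 15% voltage and 15% frequency reduction

$$\frac{P_{new}}{P_{old}} = \frac{C_{old} \times 0.85 \times (V_{old} \times 0.85)^2 \times F_{old} \times 0.85}{C_{old} \times V_{old}^2 \times F_{old}} = 0.85^4 = 0.52$$

- The power wall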
  - We can't reduce voltage further
  - We can't remove more heat
- How else can we improve performance?

§1.6 The Sea Change: The Switch to Multiprocessors

Uniprocessor Performance
- Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
- Multicore microprocessors
  - More than one processor per chip
- Requires explicitly parallel programming
  - Compare with instruction level parallelism
    - Hardware executes multiple instructions at once
    - Hidden from the programmer
  - Hard to do
    - Programming for performance
    - Load balancing
    - Optimizing communication and synchronization

MANUFACTURING

§1.7 Real Stuff: The AMD Opteron X4

Manufacturing ICs
- Yield: proportion of working dies per wafer

AMD Opteron X2 Wafer
- X2: 300 mm wafer, 117 chips, 90 nm technology
- X4: 45 nm technology

Integrated Circuit Cost

$$\text{Cost per die} = \frac{\text{Cost per wafer}}{\text{Dies per wafer} \times \text{Yield}}$$

$$\text{Dies per wafer} \approx \frac{\text{Wafer area}}{\text{Die area}}$$

$$\text{Yield} = \frac{1}{\left(1 + (\text{Defects per area} \times \text{Die area}/2)\right)^2}$$

- Nonlinear relation to area and defect rate
  - Wafer cost and area are fixed
  - Defect rate determined by manufacturing process
  - Die area determined by architecture and circuit design

BENCHMARKING

SPEC CPU Benchmark
- Programs used to measure performance
  - Supposedly typical of actual workload
- Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, …
- SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to reference machine
  - Summarize as geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)

$$\sqrt[n]{\prod_{i=1}^{n} \text{Execution time ratio}_i}$$

CINT2006 for Opteron X4 2356

  Name        Description                     IC×10^9   CPI     Tc (ns)   Exec time   Ref time   SPECratio
  perl        Interpreted string processing   2,118     0.75    0.40      637         9,777      15.3
  bzip2       Block-sorting compression       2,389     0.85    0.40      817         9,650      11.8
  gcc         GNU C Compiler                  1,050     1.72    0.40      724         8,050      11.1
  mcf         Combinatorial optimization      336       10.00   0.40      1,345       9,120      6.8
  go          Go game (AI)                    1,658     1.09    0.40      721         10,490     14.6
  hmmer       Search gene sequence            2,783     0.80    0.40      890         9,330      10.5
  sjeng       Chess game (AI)                 2,176     0.96    0.40      837         12,100     14.5
  libquantum  Quantum computer simulation     1,623     1.61    0.40      1,047       20,720     19.8
  h264avc     Video compression               3,102     0.80    0.40      993         22,130     22.3
  omnetpp     Discrete event simulation       587       2.94    0.40      690         6,250      9.1
  astar       Games/path finding              1,082     1.79    0.40      773         7,020      9.1
  xalancbmk   XML parsing                     1,058     2.70    0.40      1,143       6,900      6.0
  Geometric mean                                                                                 11.7

SPEC Power Benchmark
- Power consumption of server at different workload levels
  - Performance: ssj_ops/sec
  - Power: Watts (Joules/sec)

$$\text{Overall ssj\_ops per Watt} = \frac{\sum_{i=0}^{10} \text{ssj\_ops}_i}{\sum_{i=0}^{10} \text{power}_i}$$

SPECpower_ssj2008 for X4

  Target Load %       Performance (ssj_ops/sec)   Average Power (Watts)
  100%                231,867                     295
  90%                 211,282                     286
  80%                 185,803                     275
  70%                 163,427                     265
  60%                 140,160                     256
  50%                 118,324                     246
  40%                 92,035                      233
  30%                 70,500                      222
  20%                 47,126                      206
  10%                 23,066                      180
  0%                  0                           141
  Overall sum         1,283,590                   2,605
  ∑ssj_ops / ∑power   493

§1.8 Fallacies and Pitfalls

Pitfall: Amdahl's Law
- Improving an aspect of a computer and expecting a proportional improvement in overall performance

$$T_{improved} = \frac{T_{affected}}{\text{improvement factor}} + T_{unaffected}$$

- Example: multiply accounts for 80 s out of 100 s
  - How much improvement in multiply performance to get 5× overall?

$$20 = \frac{80}{n} + 20$$

  - Can't be done!

Fallacy: Low Power at Idle
- Look back at X4 power benchmark
  - At 100% load: 295 W
  - At 50% load: 246 W (83%)
  - At 10% load: 180 W (61%)
- Google data center
  - Mostly operates at 10% to 50% load
  - At 100% load less than 1% of the time
- Consider designing processors to make power proportional to load

Pitfall: MIPS as a Performance Metric
- MIPS: Millions of Instructions Per Second
  - Doesn't account for
    - Differences in ISAs between computers
    - Differences in complexity between instructions

$$\text{MIPS} = \frac{\text{Instruction count}}{\text{Execution time} \times 10^6} = \frac{\text{Instruction count}}{\dfrac{\text{Instruction count} \times \text{CPI}}{\text{Clock rate}} \times 10^6} = \frac{\text{Clock rate}}{\text{CPI} \times 10^6}$$

  - CPI varies between programs on a given CPU

§1.9 Concluding Remarks

Concluding Remarks
- Cost/performance is improving
  - Due to underlying technology development
- Hierarchical layers of abstraction
  - In both hardware and software
- Instruction set architecture
  - The hardware/software interface
- Execution time: the best performance measure
- Power is a limiting factor
  - Use parallelism to improve performance
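
Two of the pitfalls from §1.8 lend themselves to a quick numeric check. The sketch below is illustrative only: `improved_time`, `required_factor`, and `mips` are names invented here, and the 2.5 GHz clock rate is an assumption derived from the 0.40 ns cycle time listed in the CINT2006 table (1 / 0.40 ns), paired with the CPI values that table gives for perl (0.75) and mcf (10.00).

```python
def improved_time(t_affected: float, t_unaffected: float, factor: float) -> float:
    """Amdahl's Law: T_improved = T_affected / factor + T_unaffected."""
    if factor <= 0:
        raise ValueError("improvement factor must be positive")
    return t_affected / factor + t_unaffected


def required_factor(t_affected: float, t_unaffected: float, target_time: float):
    """Improvement factor needed to reach target_time, or None if unreachable."""
    budget = target_time - t_unaffected  # time left for the improved part
    if budget <= 0:
        return None  # the unaffected part alone already uses up the target
    return t_affected / budget


def mips(clock_rate_hz: float, cpi: float) -> float:
    """MIPS = Clock rate / (CPI x 10^6)."""
    return clock_rate_hz / (cpi * 1e6)


if __name__ == "__main__":
    # Multiply takes 80 s of a 100 s run; a 5x overall speedup means 20 s total.
    print(required_factor(80.0, 20.0, 100.0 / 5))  # None: can't be done
    print(improved_time(80.0, 20.0, 16.0))         # a 16x faster multiply

    clock_rate = 2.5e9  # assumed: 1 / 0.40 ns
    print(mips(clock_rate, 0.75))   # perl's CPI
    print(mips(clock_rate, 10.00))  # mcf's CPI
```

The same processor yields very different MIPS figures for the two programs, which is the reason the slide prefers execution time as the measure.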