Computer Architecture (1st Week)
Definition:
In computer science and engineering, computer architecture is the discipline that specifies the parts of a computer system and the relationships among them. At a high level, computer architecture is concerned with how the central processing unit (CPU) behaves and how it uses computer memory.
The main blocks of a digital computer system are:
1. ALU (arithmetic logic unit)
2. CU (control unit)
3. Central RAM (random access memory)
4. Storage devices
5. I/O (input/output interface units)
6. Communications devices
Functionality of Computer Architecture
Building Blocks
1. Arithmetic Logic Unit (ALU)
All calculations are performed in the Arithmetic Logic Unit (ALU); it also performs comparisons and makes logical decisions. The ALU can perform basic arithmetic operations such as addition, subtraction, multiplication and division, and logic operations such as >, <, and =. Whenever calculations are required, the control unit transfers the data from the storage unit to the ALU. Once the computations are done, the results are transferred back to the storage unit by the control unit and then sent to the output unit for display.
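As a rough illustration, here is a minimal ALU sketch in Python (the operation names and the two-operand interface are assumptions for illustration, not a description of real hardware):

def alu(op, a, b):
    # Arithmetic and logic operations on two operands; comparison
    # results stand in for the ALU's decision-making role.
    if op == "ADD": return a + b
    if op == "SUB": return a - b
    if op == "MUL": return a * b
    if op == "DIV": return a // b      # integer division
    if op == "GT":  return a > b       # logic operations: >, <, =
    if op == "LT":  return a < b
    if op == "EQ":  return a == b
    raise ValueError("unknown operation")

print(alu("ADD", 7, 5))   # 12
print(alu("GT", 7, 5))    # True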
2. Control Unit (CU)
This controls all the units in the computer. The control unit instructs the input units where to store the data after receiving it from the user/the outside world. It controls the flow of data and instructions from the storage units to the ALU. It also controls the flow of results from the ALU to the storage units and to the output units, which send them to the user/the outside world. The control unit is generally referred to as the "central nervous system" of the computer, since it controls and synchronizes its overall working.
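The control unit's coordinating role can be sketched as a toy fetch-decode-execute loop in Python (the instruction format and operation names here are invented for illustration):

# Toy fetch-decode-execute loop: the "control unit" moves data between
# memory and the ALU and routes results back to storage.
memory = {"x": 4, "y": 6, "z": 0}
program = [("LOAD", "x"), ("ADD", "y"), ("STORE", "z"), ("HALT",)]

acc, pc, running = 0, 0, True
while running:
    instr = program[pc]          # fetch
    pc += 1
    op = instr[0]                # decode
    if op == "LOAD":             # execute: storage -> processor
        acc = memory[instr[1]]
    elif op == "ADD":            # execute: ALU operation
        acc = acc + memory[instr[1]]
    elif op == "STORE":          # execute: processor -> storage
        memory[instr[1]] = acc
    elif op == "HALT":
        running = False

print(memory["z"])  # 10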
3. Central RAM
This stores and delivers data very fast, though not as fast as cache does. RAM is generally used to hold the program currently being executed in the computer, the data being received from the input unit, and the intermediate and final results of the program. RAM is temporary in nature: the data is lost when the computer is switched off. Primary storage costs more per megabyte than secondary storage, so most computers have limited primary storage capacity.
4. Storage Devices
Storage devices store programs, documents, databases, etc. The programs that you run on the computer are first transferred to primary memory before they are actually run. Whenever results are saved, they are stored back in secondary memory. Secondary memory is slower and cheaper than primary memory. Some commonly used secondary memory devices are hard disks, USB sticks, CDs, DVDs, etc.
5. I/O (input/output interface units)
I. Input Units
Input can be entered from a keyboard, a mouse pointing device, a USB stick, and the various types of photo storage cards used by digital cameras. Input can also be downloaded from the Internet via a communications device. Older personal computers used floppy disks and magnetic tape devices. Even older designs of minicomputers and mainframe computers took their input from punched cards, paper tape, etc., as well as communications devices.
All input peripheral devices perform the following functions:
· Accept the data and instructions from the outside world.
· Convert it to a form that the computer can understand.
· Supply that converted data to the computer system for further processing.
II. Output Units
The output units of a computer provide the information and results of an arithmetical computation - or some other kind of data processing, such as a search - to the outside world. Commonly used output units are printers and visual display units (VDUs), also known simply as "screens". Output data can also be sent to USB sticks and other types of data storage cards. Output can also be uploaded to the Internet via a communications device.
All output peripheral devices perform the following functions:
· Accept output data and instructions from the computer.
· Convert it to a form that the outside world can understand.
· Supply that converted data to the outside world.
Evolution of Computer Architecture
Computer architecture covers the design of a computer's CPU architecture, its instruction set, and its addressing modes. Computer design, by contrast, is concerned with the hardware implementation of the computer.
First Generation (1945-1958)
· Features
o Vacuum tubes, machine code, assembly language
o Computers contained a central processor that was unique to that machine
o A few machines were general-purpose, supporting different types of instructions
o Magnetic core or drum memory; data and programs loaded using paper tape or punched cards
Second Generation (1958-1964)
· Features
o Transistors: small, low-power, low-cost, more reliable than vacuum tubes; on-off switches controlled by electricity
o Magnetic core memory
o Two's complement, floating point arithmetic
o Reduced the computational time from milliseconds to microseconds
o High-level languages
o First operating systems: handled one program at a time
o Contained more than 50,000 transistors plus extremely fast magnetic core storage
o Basic cycle time: 2.18 microseconds
Third Generation (1964-1974)
· Features
o Introduction of integrated circuits, combining thousands of transistors on a single chip
o Semiconductor memory
o Timesharing, graphics, structured programming
o 2 Mb memory, 5 MIPS
o Use of cache memory
o Microprocessor chips combine thousands of transistors, an entire circuit on one computer chip
o Multiple computer models with different performance characteristics
o Smaller computers that did not need a specialized room
Fourth Generation (1974-present)
· Features
o Introduction of Very Large-Scale Integration (VLSI) / Ultra Large-Scale Integration (ULSI), combining millions of transistors
o Single-chip processors and the single-board computer emerged
o Smallest in size because of the high component density
o Creation of the Personal Computer (PC)
o Widespread use of data communications
o Object-oriented programming: objects & operations on objects
o Artificial intelligence: functions & logic predicates
o 1974-1977: the first personal computers, introduced on the market as kits (major assembly required)
Stored Program Concept
This refers to the ability of a calculating machine to store its instructions in its internal memory and process them in its arithmetic unit, so that in the course of a computation they may be not just executed but also modified at electronic speeds.
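A minimal sketch of the idea in Python, assuming a toy machine whose instructions live in the same memory as its data and can therefore be rewritten while the program runs (all opcodes are invented for illustration):

# Stored-program sketch: instructions and data share one memory, so the
# program can modify its own instructions at run time.
# Cell layout: (opcode, operand).
mem = [
    ("SET", 5),     # 0: acc = 5
    ("ADDC", 1),    # 1: acc += 1   <- this instruction will be rewritten
    ("PATCH", 1),   # 2: overwrite cell 1 with ("ADDC", 100)
    ("JMPIF", 1),   # 3: jump back to cell 1, once
    ("HALT", 0),    # 4
]

acc, pc, jumped = 0, 0, False
while True:
    op, arg = mem[pc]; pc += 1
    if op == "SET":     acc = arg
    elif op == "ADDC":  acc += arg
    elif op == "PATCH": mem[arg] = ("ADDC", 100)  # modify an instruction as data
    elif op == "JMPIF":
        if not jumped: jumped, pc = True, arg
    elif op == "HALT":  break

print(acc)  # 5 + 1 + 100 = 106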
Von Neumann Concept & Von Neumann Bottleneck
Von Neumann described a design architecture for an electronic digital computer with subdivisions of a processing unit consisting of an arithmetic logic unit and processor registers, a control unit containing an instruction register and program counter, a memory to store both data and instructions, external mass storage, and input and output mechanisms. The meaning of the term has evolved to mean a stored-program computer in which an instruction fetch and a data operation cannot occur at the same time because they share a common bus. This is referred to as the Von Neumann bottleneck and often limits the performance of the system.
Harvard Architecture
The Harvard architecture is a computer architecture with physically separate storage and signal pathways for instructions and data. The term originated with the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines had limited data storage, entirely contained within the central processing unit, and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not boot itself. Today, most processors implement such separate signal pathways for performance reasons but actually implement a modified Harvard architecture, so they can support tasks like loading a program from disk storage as data and then executing it.
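The difference can be made concrete with a rough cycle-counting model in Python (a simplified sketch assuming one-cycle memory accesses, not the timing of any real machine): on a shared bus, instruction fetches and data accesses must take turns, while separate pathways let them overlap.

# Simplified model: each instruction needs 1 fetch + possibly 1 data access.
# Von Neumann: fetch and data access share one bus, so they serialize.
# Harvard: separate instruction/data pathways, so they overlap.
def cycles(n_instr, frac_mem, shared_bus):
    data_accesses = n_instr * frac_mem
    if shared_bus:                       # von Neumann: accesses add up
        return n_instr + data_accesses
    return max(n_instr, data_accesses)   # Harvard: fetch overlaps data access

n, f = 1_000_000, 0.3                    # assume 30% of instructions touch data memory
print(cycles(n, f, shared_bus=True))     # 1,300,000 bus cycles
print(cycles(n, f, shared_bus=False))    # 1,000,000 cycles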
Pipeline Technique
An instruction pipeline is a technique used in the design of computers to increase their instruction throughput (the number of instructions that can be executed in a unit of time). Pipelining does not reduce the time to complete an instruction, but increases the number of instructions that can be processed at once. Each instruction is split into a sequence of dependent steps. The first step is always to fetch the instruction from memory; the final step is usually writing the results of the instruction to processor registers or to memory.
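The throughput claim can be made concrete with the standard pipeline timing relation: with k stages of one cycle each and n instructions, an ideal pipeline finishes in k + (n - 1) cycles instead of n * k. A small sketch (the five stage names are the classic textbook ones, assumed here for illustration):

# Ideal pipeline timing: k one-cycle stages, no stalls.
# Unpipelined: n * k cycles.  Pipelined: k + (n - 1) cycles.
# Each instruction still takes k cycles to complete (latency is unchanged);
# only throughput improves, matching the text above.
stages = ["FETCH", "DECODE", "EXECUTE", "MEMORY", "WRITEBACK"]
k = len(stages)

def unpipelined(n): return n * k
def pipelined(n):   return k + (n - 1)

n = 1000
print(unpipelined(n))                  # 5000 cycles
print(pipelined(n))                    # 1004 cycles
print(unpipelined(n) / pipelined(n))   # speedup ~4.98, approaching k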
Principle of Locality
In computer science, locality of reference, also known as the principle of locality, is the phenomenon of the same value, or related storage locations, being frequently accessed. The types of reference locality (the first two are illustrated by the sketch after this list) are:
1. Temporal locality:
If at one point in time a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. In this case it is common to make efforts to store a copy of the referenced data in special memory storage, which can be accessed faster.
2. Spatial locality:
If a particular memory location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. In this case it is common to attempt to guess the size and shape of the area around the current reference for which it is worthwhile to prepare faster access.
3. Branch locality:
There are only a few possible alternatives for the prospective part of the path in the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or when the possible outcome of a small system of conditional branching instructions is restricted to a small set of possibilities. Branch locality is typically not a spatial locality, since the few possibilities can be located far away from each other.
4. Equidistant locality:
This is halfway between spatial locality and branch locality. Consider a loop accessing locations in an equidistant pattern, i.e. the path in the spatial-temporal coordinate space is a dotted line. In this case, a simple linear function can predict which location will be accessed in the near future.
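Here is the sketch referred to above: a tiny cache model in Python illustrating temporal and spatial locality (the 8-word line size, full associativity, and no-eviction policy are simplifying assumptions):

# Tiny cache model: addresses in the same line share one cache entry.
LINE = 8
cache, hits, misses = set(), 0, 0

def access(addr):
    global hits, misses
    line = addr // LINE
    if line in cache:
        hits += 1
    else:
        misses += 1
        cache.add(line)

for addr in range(64):     # spatial locality: sequential addresses
    access(addr)           # 1 miss then 7 hits per 8-word line
for _ in range(10):        # temporal locality: the same address reused
    access(0)

print(hits, misses)        # 66 hits, 8 misses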
Processor-Memory Performance Gap
The prime reason for the processor-memory performance gap is the division of the semiconductor industry into microprocessor and memory fields. As a consequence, their technologies headed in different directions: the former increased in speed, while the latter increased in capacity. These two approaches led to an improvement rate of 60%/year in microprocessor performance, while DRAM access time has been improving at less than 10%/year. The performance gap grows exponentially. Although the disparity between microprocessor and memory speed is already a problem, it will increase over the next few years. This growing processor-memory performance gap is now the primary obstacle to improved computer system performance.
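Under the quoted improvement rates the gap compounds: after n years the ratio between processor and memory performance grows roughly as (1.6/1.1)^n. A quick check of the numbers:

# Processor improves ~60%/year, DRAM access ~10%/year (rates from the text).
# The ratio between them compounds, so the gap grows exponentially.
for years in (1, 5, 10):
    gap = (1.60 / 1.10) ** years
    print(years, round(gap, 1))   # 1 -> 1.5x, 5 -> 6.5x, 10 -> 42.4x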
To see where the problem lies, consider a hypothetical computer with a processor that operates at 800 MHz (a Pentium III, for instance), connected to a memory through a 100 MHz bus (SDRAM PC-100). Suppose this processor manipulates 800 million items (instructions and/or data) per second while the memory delivers 100 million items per second. In this computer, for each single memory access, 8 processor clock cycles elapse. This way, 7 in every 8 clock cycles are wasted waiting for items. That represents a very high cost.
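The arithmetic of this example, worked out:

# Hypothetical machine from the text: 800 MHz processor, 100 MHz memory bus.
cpu_rate = 800e6    # items (instructions/data) the processor handles per second
mem_rate = 100e6    # items the memory delivers per second

cycles_per_access = cpu_rate / mem_rate   # 8 processor cycles per memory access
wasted = cycles_per_access - 1            # 7 of every 8 cycles spent waiting
print(cycles_per_access, wasted)          # 8.0 7.0
print(wasted / cycles_per_access)         # 0.875 -> 87.5% of cycles wasted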
The performance of the processor-memory interface is characterized by two parameters: latency and bandwidth. Latency is the time between the initiation of a memory request by the processor and its completion. In fact, the growing divergence between memory and processor speeds is primarily a latency problem. Bandwidth is the rate at which information can be transferred to or from the memory system. Maximum performance is achieved by zero latency and infinite bandwidth, which characterize the ideal memory system.
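The two parameters combine in the usual first-order transfer-time model (a standard approximation, not taken from the text; the latency and bandwidth figures below are assumptions chosen only to show the two regimes): moving n bytes costs roughly latency + n / bandwidth, so latency dominates small transfers and bandwidth dominates large ones.

# First-order memory transfer model: time = latency + size / bandwidth.
def transfer_time(size_bytes, latency_s=60e-9, bandwidth_Bps=1.6e9):
    return latency_s + size_bytes / bandwidth_Bps

for size in (64, 4096, 1_048_576):
    t = transfer_time(size)
    print(size, f"{t*1e6:.2f} us")  # small transfers: latency-bound; large: bandwidth-bound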