Saturday 31 August 2013

1st Week Introductory - Knowing about Computer Architecture

Computer Architecture (1st Week)
Definition:
In computer science and engineering, computer architecture is the art of specifying the parts of a computer system and the relations between them.
At a high level, computer architecture is concerned with how the central processing unit (CPU) acts and how it uses computer memory.
The main blocks of a digital computer system are: 
1.    ALU (arithmetic logic unit)
2.    CU (control unit)
3.    Central RAM (random access memory)
4.    Storage devices
5.    I/O (input/output interface units)
6.    Communications devices
Functionality of Computer Architecture Building Blocks
1.   Arithmetic Logical Unit (ALU)
All calculations are performed in the Arithmetic Logic Unit (ALU); it also performs comparisons and takes decisions based on their logical outcomes.
The ALU can perform basic arithmetic operations such as addition, subtraction, multiplication and division, as well as logic operations such as >, < and =. Whenever calculations are required, the control unit transfers the data from the storage unit to the ALU; once the computations are done, the results are transferred back to the storage unit by the control unit and then sent to the output unit for display.
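A minimal sketch of this behaviour in Python (not part of the original text, and a deliberately simplified model): one opcode selects either an arithmetic operation or a logical comparison on two operands, much as a real ALU does.

# Hypothetical ALU sketch: one opcode, two operands, one result.
def alu(opcode, a, b):
    if opcode == "ADD":
        return a + b
    if opcode == "SUB":
        return a - b
    if opcode == "MUL":
        return a * b
    if opcode == "DIV":
        return a // b          # integer division; a real ALU would also signal divide-by-zero
    if opcode == "GT":
        return a > b           # logic operations yield a true/false flag
    if opcode == "LT":
        return a < b
    if opcode == "EQ":
        return a == b
    raise ValueError("unknown opcode: " + opcode)

print(alu("ADD", 7, 5))   # 12
print(alu("GT", 7, 5))    # True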
2.   Control Unit (CU)
This controls all the units in the computer. The control unit instructs the input units where to store the data after receiving it from the user/the outside world. It controls the flow of data and instructions from the storage units to the ALU. It also controls the flow of results from the ALU to the storage units and to the output units, which send them to the user/the outside world. The control unit is often referred to as the "central nervous system" of the computer, since it controls and synchronizes its overall working.
3.   Central RAM
This stores and delivers data very fast, though not as fast as cache memory does. RAM generally holds the program currently being executed, the data received from the input unit, and the intermediate and final results of the program. RAM is temporary (volatile) in nature: its contents are lost when the computer is switched off.
Primary storage costs more per megabyte compared to secondary storage. Therefore most computers have limited primary storage capacity. 
4.   Storage Devices
Storage devices store programs, documents, databases, etc. A program that you run on the computer is first transferred to primary memory before it is actually executed. Whenever results are saved, they are stored back in secondary memory. Secondary memory is slower and cheaper than primary memory. Commonly used secondary memory devices include hard disks, USB sticks, CDs and DVDs.
5.   I/O (input/output interface units)
I. Input units
Input can be entered from a keyboard, a mouse pointing device, a USB stick and the various types of photo storage cards used by digital cameras. Input can also be downloaded from the Internet via a communications device.
Older personal computers used floppy disks and magnetic tape devices. Even older designs of mini-computers and main-frame computers took their input from punched cards, paper tape, etc. as well as communications devices. 


All input peripheral devices perform the following functions:

·         Accept data and instructions from the outside world.
·         Convert them to a form that the computer can understand.
·         Supply the converted data to the computer system for further processing.
II. Output Units
The output units of a computer provide the information and results of an arithmetical computation - or some other kind of data processing, such as a search - to the outside world. 
Commonly used output units are printers and visual display units (VDUs), also known simply as "screens". Output data can also be sent to USB sticks and other types of data storage cards. Output can also be uploaded to the Internet via a communications device.
All output peripheral devices perform the following functions:
·         Accept output data and instructions from the computer.
·         Convert them to a form that the outside world can understand.
·         Supply the converted data to the outside world.
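As a small illustration of the "convert" step in the two lists above (a sketch, not from the original text), the Python below shows how text typed at a keyboard might be turned into the bytes a machine stores, and how output bytes are turned back into characters for the outside world.

# Hypothetical sketch: input devices convert human-readable characters into a
# binary form; output devices convert binary data back into a readable form.
typed = "Hi!"                         # what the user types
as_bytes = typed.encode("ascii")      # input side: convert to machine form
print(list(as_bytes))                 # [72, 105, 33]

displayed = as_bytes.decode("ascii")  # output side: convert back for the user
print(displayed)                      # Hi!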
Evolution of computer architecture
Computer architecture covers the design of a computer's CPU, its instruction set and its addressing modes; computer design is concerned with the hardware implementation of the machine.
First Generation (1945-1958)
·         Features     
o    Vacuum tubes, Machine code, Assembly language
o    Computers contained a central processor that was unique to that machine
o    Only a few machines were general purpose, each supporting a different type of instruction set
o    Used magnetic core or drum memory; data and programs were loaded using paper tape or punched cards
Second Generation (1958-1964)
·         Features
o    Transistors
o    Small, low-power, low-cost, more reliable than vacuum tubes,
o    Magnetic core memory
o    Two's complement, floating point arithmetic
o    Reduced the computational time from milliseconds to microseconds
o    High level languages
o    First operating Systems: handled one program at a time
o    On-off switches controlled by electricity
o    Contained more than 50,000 transistors plus extremely fast magnetic core storage
o    Basic cycle time: 2.18 microseconds
Third Generation (1964-1974)
·         Features
o   Introduction of integrated circuits combining thousands of transistors on a single chip
o   Semiconductor memory
o   Timesharing, graphics, structured programming
o   2 MB memory, 5 MIPS
o   Use of cache memory
o   Microprocessor chips combined thousands of transistors, putting an entire circuit on one chip
o   Multiple computer models with different performance characteristics
o   Smaller computers that did not need a specialized room
Fourth Generation (1974-present)
·         Features
o    Introduction of Very Large Scale Integration (VLSI) / Ultra Large Scale Integration (ULSI), combining millions of transistors
o    Single-chip processor and the single-board computer emerged
o    Smallest in size because of the high component density
o    Creation of the Personal Computer (PC)
o    Widespread use of data communications
o    Object-Oriented programming: Objects & operations on objects
o    Artificial intelligence: Functions & logic predicates
o    1974-1977: the first personal computers were introduced on the market as kits (major assembly required)
Stored Program Concept
This refers to the ability of a calculating machine to store its instructions in its internal memory and process them in its arithmetic unit, so that in the course of a computation they may be not just executed but also modified at electronic speeds. 
Von Neumann Concept & Von Neumann bottleneck
Von Neumann described a design architecture for an electronic digital computer with subdivisions of a processing unit consisting of an arithmetic logic unit and processor registers, a control unit containing an instruction register and program counter, a memory to store both data and instructions, external mass storage, and input and output mechanisms. The meaning of the term has evolved to mean a stored-program computer in which an instruction fetch and a data operation cannot occur at the same time because they share a common bus. This is referred to as the Von Neumann bottleneck and often limits the performance of the system.
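The sketch below (illustrative only, using a made-up three-instruction machine) shows both the stored-program idea and the shared-bus bottleneck: instructions and data live in the same memory, so every instruction fetch and every data access goes over the same channel and is counted against it.

# Hypothetical von Neumann machine: one memory array holds both the program
# and its data, and every access to it goes over one shared "bus".
memory = [
    ("LOAD", 5),     # address 0: load the value stored at address 5
    ("ADD", 6),      # address 1: add the value stored at address 6
    ("STORE", 7),    # address 2: store the result at address 7
    ("HALT", None),  # address 3
    None,            # address 4: unused
    40,              # address 5: data
    2,               # address 6: data
    0,               # address 7: result goes here
]

bus_accesses = 0

def bus_read(addr):
    global bus_accesses
    bus_accesses += 1
    return memory[addr]

def bus_write(addr, value):
    global bus_accesses
    bus_accesses += 1
    memory[addr] = value

pc, acc = 0, 0
while True:
    op, operand = bus_read(pc)      # the instruction fetch uses the bus...
    pc += 1
    if op == "LOAD":
        acc = bus_read(operand)     # ...and so does every data access
    elif op == "ADD":
        acc += bus_read(operand)
    elif op == "STORE":
        bus_write(operand, acc)
    elif op == "HALT":
        break

print(memory[7], bus_accesses)      # 42, and 7 bus accesses for only 4 instructions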
Harvard Architecture
The Harvard architecture is a computer architecture with physically separate storage and signal pathways for instructions and data. The term originated from the Harvard Mark I relay-based computer, which stored instructions on punched tape (24 bits wide) and data in electro-mechanical counters. These early machines had limited data storage, entirely contained within the central processing unit, and provided no access to the instruction storage as data. Programs needed to be loaded by an operator; the processor could not boot itself.

Today, most processors implement such separate signal pathways for performance reasons but actually implement a Modified Harvard architecture, so they can support tasks like loading a program from disk storage as data and then executing it.
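Continuing the illustrative machine from the previous section (same made-up instruction set, still only a sketch), a Harvard-style variant keeps the program in a separate instruction memory, so an instruction fetch and a data access travel over their own pathways.

# Hypothetical Harvard-style variant: separate instruction and data memories,
# each with its own pathway, so fetching code never competes with data traffic.
instruction_memory = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
data_memory = [40, 2, 0]

pc, acc = 0, 0
while True:
    op, operand = instruction_memory[pc]   # instruction pathway
    pc += 1
    if op == "LOAD":
        acc = data_memory[operand]         # data pathway, independent of fetches
    elif op == "ADD":
        acc += data_memory[operand]
    elif op == "STORE":
        data_memory[operand] = acc
    elif op == "HALT":
        break

print(data_memory[2])   # 42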
Pipeline Technique
An instruction pipeline is a technique used in the design of computers to increase their instruction throughput (the number of instructions that can be executed in a unit of time). Pipelining does not reduce the time to complete an instruction, but increases the number of instructions that can be processed at once.
Each instruction is split into a sequence of dependent steps. The first step is always to fetch the instruction from memory; the final step is usually writing the results of the instruction to processor registers or to memory.
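A back-of-the-envelope sketch of the throughput argument (illustrative numbers, not from the text): with S pipeline stages of one cycle each, N instructions take roughly S + (N - 1) cycles instead of N * S, even though each individual instruction still needs S cycles.

# Illustrative pipeline arithmetic: each instruction passes through S one-cycle
# stages (fetch ... write-back). Pipelining overlaps instructions, so one
# instruction completes every cycle once the pipeline is full.
def cycles_unpipelined(n_instructions, n_stages):
    return n_instructions * n_stages

def cycles_pipelined(n_instructions, n_stages):
    return n_stages + (n_instructions - 1)   # fill the pipe, then one per cycle

N, S = 1000, 5
print(cycles_unpipelined(N, S))                            # 5000 cycles
print(cycles_pipelined(N, S))                              # 1004 cycles
print(cycles_unpipelined(N, S) / cycles_pipelined(N, S))   # ~4.98x throughput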
Principle of Locality
In computer science, locality of reference, also known as the principle of locality, describes the tendency of a program to access the same values, or related storage locations, frequently. The types of reference locality are listed below; a short sketch after the list illustrates the first two.
1.   Temporal locality:
If at one point in time a particular memory location is referenced, then it is likely that the same location will be referenced again in the near future. In this case it is common to make efforts to store a copy of the referenced data in special memory storage, which can be accessed faster.
2.   Spatial locality:
If a particular memory location is referenced at a particular time, then it is likely that nearby memory locations will be referenced in the near future. In this case it is common to attempt to guess the size and shape of the area around the current reference for which it is worthwhile to prepare faster access.
3.   Branch locality:
Branch locality occurs when there are only a few possible alternatives for the prospective part of the path through the spatial-temporal coordinate space. This is the case when an instruction loop has a simple structure, or when the possible outcomes of a small system of conditional branching instructions are restricted to a small set of possibilities. Branch locality is typically not a spatial locality, since the few possibilities can be located far away from each other.
4.   Equidistant locality:
It is halfway between the spatial locality and the branch locality. Consider a loop accessing locations in an equidistant pattern, i.e. the path in the spatial-temporal coordinate space is a dotted line. In this case, a simple linear function can predict which location will be accessed in the near future.
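The first two kinds of locality can be made concrete with a toy direct-mapped cache (a sketch with made-up sizes, not from the original text): sequential accesses to an array keep hitting in the block that was just fetched (spatial locality), while re-reading recently used locations hits because their block is still resident (temporal locality).

# Toy direct-mapped cache: made-up sizes of 16 lines, each holding one 8-word block.
import random

BLOCK_WORDS, NUM_LINES = 8, 16

def hit_rate(addresses):
    cache = [None] * NUM_LINES           # which memory block each line holds
    hits = 0
    for addr in addresses:
        block = addr // BLOCK_WORDS      # which memory block the word belongs to
        line = block % NUM_LINES         # which cache line that block maps to
        if cache[line] == block:
            hits += 1                    # block already resident: a hit
        else:
            cache[line] = block          # miss: fetch the whole block
    return hits / len(addresses)

random.seed(0)
N = 1024
sequential = list(range(N))                       # spatial locality: neighbours next
repeated = [0, 1, 2, 3] * (N // 4)                # temporal locality: same words again
scattered = [random.randrange(N) for _ in range(N)]

print(hit_rate(sequential))  # 0.875: 7 of every 8 words hit within a cached block
print(hit_rate(repeated))    # ~0.999: after one miss the block stays resident
print(hit_rate(scattered))   # much lower: little spatial or temporal reuse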
Processor Memory Performance Gap
The prime reason for the processor-memory performance gap is the division of the semiconductor industry into separate microprocessor and memory fields. As a consequence, their technologies have headed in different directions: the former has increased in speed, while the latter has increased in capacity. These two approaches have led to an improvement rate of about 60% per year in microprocessor performance, while DRAM access time has been improving at less than 10% per year.
The performance gap therefore grows exponentially. Although the disparity between microprocessor and memory speed is already a problem, it will increase over the next few years. This growing processor-memory performance gap is now the primary obstacle to improved computer system performance.
To see where the problem lies, consider a hypothetical computer with a processor that operates at 800 MHz (a Pentium III, for instance) connected to memory through a 100 MHz bus (SDRAM PC-100). Suppose this processor can manipulate 800 million items (instructions and/or data) per second, while the memory delivers 100 million items per second. In this computer, 8 processor clock cycles elapse for each memory access, so 7 out of every 8 clock cycles are wasted waiting for items. That represents a very high cost.
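The arithmetic of that hypothetical example, spelled out with the same numbers as the paragraph above:

# Hypothetical example from the text: 800 MHz processor, 100 MHz memory bus,
# one item moved per bus clock.
cpu_clock_mhz = 800
bus_clock_mhz = 100

cycles_per_access = cpu_clock_mhz // bus_clock_mhz    # 8 CPU cycles per memory item
wasted = cycles_per_access - 1                        # cycles spent waiting
print(cycles_per_access, wasted / cycles_per_access)  # 8, 0.875 -> 7 of 8 cycles wasted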

The performance of the processor-memory interface is characterized by two parameters: latency and bandwidth. Latency is the time between the initiation of a memory request by the processor and its completion. In fact, the growing divergence between memory and processor speeds is primarily a latency problem. Bandwidth is the rate at which information can be transferred to or from the memory system. Maximum performance would be achieved by zero latency and infinite bandwidth, which characterize the ideal memory system.
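A common way to make the two parameters concrete (a sketch with illustrative, made-up numbers, not figures from the text) is the transfer-time estimate: time ≈ latency + size / bandwidth.

# Illustrative estimate: total time to move a block is the fixed latency to start
# the transfer plus the time to stream its bytes at the given bandwidth.
def transfer_time_ns(block_bytes, latency_ns, bandwidth_bytes_per_ns):
    return latency_ns + block_bytes / bandwidth_bytes_per_ns

# Made-up memory system: 60 ns latency, 1.6 bytes/ns (about 1.6 GB/s) bandwidth.
print(transfer_time_ns(64, 60, 1.6))    # 100.0 ns for a 64-byte block
print(transfer_time_ns(4096, 60, 1.6))  # 2620.0 ns for a 4 KiB block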
