aqualiner.blogg.se - Superscalar processor

A superscalar processor is a machine designed to improve on the performance of the scalar processor. In most applications, most of the operations are on scalar quantities, and the superscalar approach produces high-performance general-purpose processors. The main principle of the superscalar approach is that it executes instructions independently in different pipelines. Instruction pipelining already leads to parallel processing, thereby speeding up the processing of instructions. In a superscalar processor, multiple such pipelines are introduced for different operations, which further improves parallel processing. There are multiple functional units, each of which is implemented as a pipeline. Each pipeline consists of multiple stages so that it can handle multiple instructions at a time, which supports parallel execution of instructions.
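As a rough illustration of why several pipelines help, the sketch below counts the cycles needed to finish a batch of instructions. It is an idealized model, not a description of any real processor: it assumes all instructions are independent, all pipelines are identical, and there are no stalls. The function name `ideal_cycles` and its parameters are made up for this example.

```python
import math

def ideal_cycles(num_instructions: int, stages: int, width: int = 1) -> int:
    """Cycles to finish `num_instructions` on `width` identical pipelines.

    Assumes independent instructions and no hazards or stalls (idealized).
    """
    if num_instructions == 0:
        return 0
    # Each cycle, up to `width` instructions enter the first stage together.
    groups = math.ceil(num_instructions / width)
    # The first group needs `stages` cycles; each later group finishes one cycle after.
    return stages + groups - 1

scalar_cycles = ideal_cycles(100, stages=5, width=1)       # one pipeline
superscalar_cycles = ideal_cycles(100, stages=5, width=2)  # two pipelines
print(scalar_cycles, superscalar_cycles)  # prints "104 54"
```

Real programs contain dependent instructions, so the actual gain is smaller than this model suggests.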

From the microarchitecture viewpoint, a superscalar design makes the pipeline wider, in the sense that its representation is no longer linear. The most evident effect is that several functional units are needed but, as we shall see, each stage of the pipeline is affected.

A scalar processor is a normal processor that works on one simple instruction at a time, each operating on single data items. In today's world this technique proves highly inefficient, because the overall processing of instructions is very slow. There is a class of computational problems that is beyond the capabilities of a conventional computer. These problems require a vast number of computations on multiple data items, which would take a conventional computer (with a scalar processor) days or even weeks to complete. Such complex instructions, which operate on multiple data items at the same time, require a better way of executing instructions, and this was achieved by vector processors.

Scalar CPUs can manipulate only one or two data items at a time, which is not very efficient. Simple instructions such as "ADD A to B, and store into C" are also not efficient in practice. Addresses are used to point to the memory locations where the data to be operated on will be found, which adds the overhead of data lookup. Until the data is found, the CPU sits idle, which is a big performance issue.

Hence the concept of the instruction pipeline comes into the picture, in which an instruction passes through several sub-units in turn. These sub-units perform independent functions: for example, the first decodes the instruction, the second fetches the data, and the third performs the math itself. While the data is being fetched for one instruction, the CPU does not sit idle; it works on decoding the next instruction, ending up working like an assembly line.

A vector processor not only uses an instruction pipeline but also pipelines the data, working on multiple data items at the same time. A normal scalar processor instruction would be ADD A, B, which adds two operands. But what if we could instruct the processor to ADD a group of numbers (say, memory locations 0 to n) to another group of numbers (say, memory locations n to k)? This can be achieved by vector processors. In a vector processor, a single instruction can ask for multiple data operations, which saves time because the instruction is decoded once and then keeps operating on different data items.

Applications of Vector Processors

Computers with vector processing capabilities are in demand in specialized applications. The following are some areas where vector processing is used:

Aerodynamics and space flight simulations.
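The difference between a scalar ADD and a vector ADD described above can be sketched in a few lines. This is only a software analogy: the functions `scalar_add` and `vector_add` and the decode counter are invented for illustration and do not model real hardware.

```python
def scalar_add(a, b):
    """Add element by element, counting one decode per ADD instruction."""
    decodes = 0
    result = []
    for x, y in zip(a, b):
        decodes += 1          # every scalar ADD is decoded separately
        result.append(x + y)
    return result, decodes

def vector_add(a, b):
    """Add two groups of numbers with a single (conceptual) vector instruction."""
    decodes = 1               # the vector ADD is decoded once...
    result = [x + y for x, y in zip(a, b)]  # ...then applied to every data item
    return result, decodes

a = [1, 2, 3]
b = [10, 20, 30]
print(scalar_add(a, b))  # prints "([11, 22, 33], 3)"
print(vector_add(a, b))  # prints "([11, 22, 33], 1)"
```

Both versions produce the same sums; what changes is how many times an instruction has to be decoded.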

In the previous chapter we introduced a five-stage pipeline. The basic concept was that the instruction execution cycle could be decomposed into nonoverlapping stages, with one instruction passing through each stage at every cycle. This so-called scalar processor had an ideal throughput of 1; in other words, ideally the number of instructions per cycle (IPC) was 1.

If we return to the formula giving the execution time, namely,

EX_CPU = Number of instructions × CPI × cycle time

we see that in order to reduce EX_CPU in a processor with the same ISA (that is, without changing the number of instructions, N) we must either reduce CPI (increase IPC) or reduce the cycle time, or both.

The only possibility to increase the ideal IPC of 1 is to radically modify the structure of the pipeline to allow more than one instruction to be in each stage at a given time. In doing so, we make a transition from a scalar processor to a superscalar one.
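The execution-time formula is easy to try out with numbers. The values below (one million instructions, a 1 ns cycle time, a CPI of 0.5 for the wider pipeline) are made-up assumptions chosen only to show the arithmetic, and `execution_time` is a hypothetical helper name.

```python
def execution_time(num_instructions: int, cpi: float, cycle_time_ns: float) -> float:
    """Execution time = number of instructions x CPI x cycle time (in ns)."""
    return num_instructions * cpi * cycle_time_ns

N = 1_000_000  # same ISA, so the instruction count does not change

# Ideal scalar pipeline: IPC = 1, so CPI = 1.
scalar_ns = execution_time(N, cpi=1.0, cycle_time_ns=1.0)

# Hypothetical superscalar pipeline sustaining IPC = 2, so CPI = 0.5.
superscalar_ns = execution_time(N, cpi=0.5, cycle_time_ns=1.0)

print(scalar_ns, superscalar_ns, scalar_ns / superscalar_ns)
# prints "1000000.0 500000.0 2.0"
```

With N and the cycle time held fixed, halving CPI halves the execution time, which is the motivation for allowing more than one instruction per stage.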