Computer architecture and design interview questions

  1. What is pipelining?
  2. What are the five stages in a DLX pipeline?
  3. For a pipeline with ‘n’ stages, what’s the ideal throughput? What prevents us from achieving this ideal throughput?
  4. What are the different hazards? How do you avoid them?
  5. Instead of just 5-8 pipe stages why not have, say, a pipeline with 50 pipe stages?
  6. What are Branch Prediction and Branch Target Buffers?
  7. How do you handle precise exceptions or interrupts?
  8. What is a cache?
  9. What’s the difference between Write-Through and Write-Back Caches? Explain advantages and disadvantages of each.
  10. Cache Size is 64KB, Block size is 32B and the cache is Two-Way Set Associative. For a 32-bit physical address, give the division between Block Offset, Index and Tag. (A worked sketch follows this list.)
  11. What is Virtual Memory?
  12. What is Cache Coherency?
  13. What is MESI?
  14. What is a Snooping cache?
  15. What are the components in a Microprocessor?
  16. What is ACBF(Hex) divided by 16?
  17. Convert 65(Hex) to Binary
  18. Convert a number to its two’s complement and back
  19. The CPU is busy but you want to stop and do some other task. How do you do it?
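
For question 10, here is a hedged worked sketch of the address breakdown, assuming a byte-addressed memory where the offset selects a byte within the block and the index selects a set (variable names are illustrative):

    # Worked sketch for question 10 (assumptions: byte-addressed memory,
    # index bits select a set, offset bits select a byte within the block).
    import math

    cache_size = 64 * 1024                               # 64 KB
    block_size = 32                                      # 32 B
    ways = 2                                             # two-way set associative
    address_bits = 32

    sets = cache_size // (block_size * ways)             # 1024 sets
    offset_bits = int(math.log2(block_size))             # 5 block-offset bits
    index_bits = int(math.log2(sets))                    # 10 index bits
    tag_bits = address_bits - index_bits - offset_bits   # 17 tag bits
    print(offset_bits, index_bits, tag_bits)             # -> 5 10 17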

12 Comments on Computer architecture and design interview questions

  1. Posted 3/23/2005 at 5:05 am | Permalink

    I think the questions should include clues, with references to the relevant sections, to aid in effective revision.

  2. Deepak
    Posted 10/24/2007 at 4:21 am | Permalink

    What are the five stages in a DLX pipeline?

  3. Deepak
    Posted 10/24/2007 at 4:22 am | Permalink

    What is MESI?

  4. Rahul Singhal
    Posted 11/6/2007 at 3:14 am | Permalink

    Deepak said,

    What are the five stages in a DLX pipeline?

    Answer is:

    Instruction Fetch Stage
    Instruction Decode Stage
    Instruction Execution Stage
    Memory Stage
    Write Back

  5. Rahul Singhal
    Posted 11/6/2007 at 3:16 am | Permalink

    Deepak said,

    What is MESI?

    Answer is:

    MESI is a cache coherency protocol used in multiprocessor systems to track the state of each line in a particular processor’s cache. It stands for Modified, Exclusive, Shared and Invalid.
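
    A minimal sketch of the four states and a couple of typical transitions, assuming a bus-based snooping design; the function names and simplified transitions are illustrative assumptions, not a full protocol:

        from enum import Enum

        # The four MESI states a cache line can be in (sketch only).
        class MesiState(Enum):
            MODIFIED = "M"    # dirty; this cache holds the only valid copy
            EXCLUSIVE = "E"   # clean; this cache holds the only copy
            SHARED = "S"      # clean; other caches may also hold copies
            INVALID = "I"     # the line holds no valid data

        def on_local_write(state):
            # Writing requires ownership; in a real protocol S/I lines first
            # invalidate other copies (and I fetches the line) before this.
            return MesiState.MODIFIED

        def on_remote_read(state):
            # Another processor reads the line: M/E copies are downgraded to
            # SHARED (a MODIFIED copy is written back to memory first).
            if state in (MesiState.MODIFIED, MesiState.EXCLUSIVE):
                return MesiState.SHARED
            return state

        def on_remote_write(state):
            # Another processor writes the line: all other copies become INVALID.
            return MesiState.INVALID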

  6. sateesh rahul
    Posted 8/6/2008 at 9:47 am | Permalink

    1. Instruction fetch cycle (IF):
    Send the program counter (PC) to memory and fetch the current instruction
    from memory. Update the PC to the next sequential PC by adding 4 (since
    each instruction is 4 bytes) to the PC.
    2. Instruction decode/register fetch cycle (ID):
    Decode the instruction and read the registers corresponding to register
    source specifiers from the register file. Do the equality test on the registers
    as they are read, for a possible branch. Sign-extend the offset field of the
    instruction in case it is needed. Compute the possible branch target address
    by adding the sign-extended offset to the incremented PC. In an aggressive
    implementation, which we explore later, the branch can be completed at the
    end of this stage, by storing the branch-target address into the PC, if the
    condition test yielded true.
    Decoding is done in parallel with reading registers, which is possible
    because the register specifiers are at a fixed location in a RISC architecture.
    This technique is known as fixed-field decoding. Note that we may read a
    register we don’t use, which doesn’t help but also doesn’t hurt performance.
    (It does waste energy to read an unneeded register, and power-sensitive
    designs might avoid this.) Because the immediate portion of an instruction
    is also located in an identical place, the sign-extended immediate is also calculated
    during this cycle in case it is needed.
    3. Execution/effective address cycle (EX):
    The ALU operates on the operands prepared in the prior cycle, performing
    one of three functions depending on the instruction type.
    • Memory reference: The ALU adds the base register and the offset to form
    the effective address.
    • Register-Register ALU instruction: The ALU performs the operation
    specified by the ALU opcode on the values read from the register file.
    • Register-Immediate ALU instruction: The ALU performs the operation
    specified by the ALU opcode on the first value read from the register file
    and the sign-extended immediate.
    In a load-store architecture the effective address and execution cycles
    can be combined into a single clock cycle, since no instruction needs to
    simultaneously calculate a data address and perform an operation on the
    data.
    4. Memory access (MEM):
    If the instruction is a load, memory does a read using the effective address
    computed in the previous cycle. If it is a store, then the memory writes the
    data from the second register read from the register file using the effective
    address.
    5. Write-back cycle (WB):
    • Register-Register ALU instruction or Load instruction:
    Write the result into the register file, whether it comes from the memory
    system (for a load) or from the ALU (for an ALU instruction).
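
    To make the stage-by-stage flow concrete, here is a toy sketch (an illustration, not DLX-specific) that prints which instruction occupies each stage on each cycle, assuming no hazards or stalls; the instruction strings are made up:

        # Toy sketch: instructions advance one stage per cycle with no stalls.
        STAGES = ["IF", "ID", "EX", "MEM", "WB"]

        def run(instructions):
            # A k-stage pipeline finishes n instructions in k + n - 1 cycles.
            total_cycles = len(STAGES) + len(instructions) - 1
            for cycle in range(1, total_cycles + 1):
                row = []
                for stage_index, stage in enumerate(STAGES):
                    instr_index = cycle - 1 - stage_index   # who occupies this stage now
                    if 0 <= instr_index < len(instructions):
                        row.append(f"{stage}:{instructions[instr_index]}")
                print(f"cycle {cycle:2d}  " + "   ".join(row))

        run(["lw r1,0(r2)", "add r3,r4,r5", "sw r6,4(r7)"])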

  7. Rahul Singhal
    Posted 10/16/2008 at 6:19 pm | Permalink

    9. Write Through: all writes to the cache also go to main memory.

    Write Back: writes go only to the cache; main memory is updated only when a modified (dirty) block is evicted.

    Advantages of WT: the cache and main memory always agree, which keeps data coherent and is why it is often used in multiprocessor systems.

    Disadvantages of WT: it needs more memory bandwidth, because every write goes all the way to main memory.

    Advantages of WB: writes complete at the speed of the cache, which is faster, and far less memory bandwidth is needed.

    Disadvantages of WB: the cache and main memory can be out of sync, so a coherency protocol is required.
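
    A rough sketch of the two policies for a single cached block; the dict-based cache and the dirty set are simplifying assumptions (real caches track a dirty bit per line):

        # Write-through: every write updates both the cache and main memory.
        def write_through(cache, memory, addr, value):
            cache[addr] = value
            memory[addr] = value

        # Write-back: writes stay in the cache; the block is marked dirty and
        # main memory is updated only when the block is evicted.
        def write_back(cache, memory, dirty, addr, value):
            cache[addr] = value
            dirty.add(addr)

        def evict(cache, memory, dirty, addr):
            if addr in dirty:                  # dirty block: write it back first
                memory[addr] = cache[addr]
                dirty.discard(addr)
            cache.pop(addr, None)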

  8. Rahul Singhal
    Posted 10/16/2008 at 6:26 pm | Permalink

    6. Branch Prediction is a scheme used to predict whether a branch will be taken or not, based on the branch’s past behavior.

    A Branch Target Buffer (BTB) is the hardware structure that, in most designs, stores the targets of branches that were taken in the past. When the branch is encountered again and the prediction is "taken", the address of the next instruction to fetch is read out of the BTB. The BTB is usually implemented as a fully associative cache, and some implementations also store the target instructions themselves.
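
    A small sketch of one common arrangement, a table of 2-bit saturating counters plus a BTB keyed by branch PC; the table size and the dict-based, direct-indexed lookup are simplifying assumptions (as noted above, real BTBs are often fully associative):

        class BranchPredictor:
            def __init__(self, entries=1024):
                self.counters = [1] * entries     # 2-bit counters, 0..3; start weakly not-taken
                self.btb = {}                     # branch PC -> last taken target

            def predict(self, pc):
                taken = self.counters[pc % len(self.counters)] >= 2
                target = self.btb.get(pc)         # BTB hit supplies the next fetch address
                return taken, target

            def update(self, pc, taken, target):
                i = pc % len(self.counters)
                if taken:
                    self.counters[i] = min(3, self.counters[i] + 1)
                    self.btb[pc] = target         # remember the target of the taken branch
                else:
                    self.counters[i] = max(0, self.counters[i] - 1)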

  9. Rahul Singhal
    Posted 10/16/2008 at 6:27 pm | Permalink

    19. Send an interrupt to the busy CPU.

  10. Rahul Singhal
    Posted 10/16/2008 at 6:30 pm | Permalink

    For a pipeline with ‘n’ stages, what’s the ideal throughput? What prevents us from achieving this ideal throughput?

    Ans: The ideal throughput is 1 instruction per clock cycle. Hazards, namely data hazards, control hazards and structural hazards, prevent us from achieving it.
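
    A back-of-the-envelope sketch of why the ideal approaches one instruction per cycle and how stalls push the achieved CPI above 1; the 0.2 stalls-per-instruction figure is just an assumed example:

        def pipeline_cycles(n_instructions, n_stages, stall_cycles=0):
            # k cycles to fill the pipe, then one instruction retires per cycle,
            # plus whatever stall cycles the hazards insert.
            return n_stages + (n_instructions - 1) + stall_cycles

        n = 1_000_000
        ideal = pipeline_cycles(n, 5) / n                     # ~1.0 cycles per instruction
        stalled = pipeline_cycles(n, 5, int(0.2 * n)) / n     # ~1.2 CPI with assumed stalls
        print(ideal, stalled)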

  11. Rahul Singhal
    Posted 10/16/2008 at 6:37 pm | Permalink

    Instead of just 5-8 pipe stages why not have, say, a pipeline with 50 pipe stages?

    Ans: There are several reasons:
    1. Almost all processors use branch prediction, and in such a deep pipeline the branch is resolved quite late (say at stage 30). If the prediction turns out to be wrong, the processor has to flush the 29 instructions in the pipe behind the branch, which is a huge performance hit (a rough calculation follows below).

    2. 50 pipe stages would require a pipeline register after each stage, and each register adds some latency. So the latency of each instruction increases, even though throughput improves.

    3. With 50 pipe stages you would probably run the processor at a much higher clock speed (because you can :)), and you would end up with heat problems. The faster the clock, the hotter it gets... remember the Pentium 4 Prescott :)
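
    A rough calculation for point 1, using assumed numbers (20% branches, 5% mispredict rate) to show how the flush penalty of a very deep pipeline inflates the effective CPI:

        def effective_cpi(base_cpi, branch_fraction, mispredict_rate, flush_penalty):
            # Each mispredicted branch wastes roughly flush_penalty cycles of work.
            return base_cpi + branch_fraction * mispredict_rate * flush_penalty

        print(effective_cpi(1.0, 0.2, 0.05, 2))    # shallow pipe, branch resolved early: 1.02 CPI
        print(effective_cpi(1.0, 0.2, 0.05, 29))   # 50-stage pipe, resolved at stage 30: 1.29 CPI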

  12. Rahul Singhal
    Posted 10/16/2008 at 6:41 pm | Permalink

    What is a cache?

    A cache is a small, fast memory placed between the processor and main memory to bridge the gap between their operating speeds. Caches are usually built from very fast SRAM.

    Caches exploit the properties of temporal and spatial locality (remember to mention this in your interview):

    Temporal locality: if data was accessed recently, it is likely to be accessed again soon.

    Spatial locality: if data was accessed, the processor is likely to access nearby data soon.
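
    A small sketch of spatial locality using a contiguous row-major array; in Python the interpreter overhead hides most of the timing difference, so this mainly illustrates the two access patterns (in C or with NumPy the row-major loop is clearly faster):

        N = 2048
        data = bytearray(N * N)          # one contiguous N x N array, row-major layout

        def row_major():                 # walks consecutive addresses: good spatial locality
            total = 0
            for i in range(N):
                for j in range(N):
                    total += data[i * N + j]
            return total

        def column_major():              # jumps N bytes per access: poor spatial locality
            total = 0
            for j in range(N):
                for i in range(N):
                    total += data[i * N + j]
            return total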
