Most previous research into vector architectures has concentrated on supercomputing applications
and small enhancements to existing vector supercomputer implementations. This thesis expands the body of
vector research by examining designs appropriate for single-chip full-custom vector microprocessor implementations
targeting a much broader range of applications.

I present the design, implementation, and evaluation of T0 (Torrent-0): the first single-chip vector
microprocessor. T0 is a compact but highly parallel processor that can sustain over 24 operations per
cycle while issuing only a single 32-bit instruction per cycle. T0 demonstrates that vector architectures
are well suited to full-custom VLSI implementation and that they perform well on many multimedia and
human-machine interface tasks.

The remainder of the thesis contains proposals for future vector microprocessor designs. I show
that the most area-efficient vector register file designs have several banks with severa