A yellow background color for the slide release date either
means that the version for the current academic year is available
(when the date mentions 2017) or that the slides have not been
updated for the current academic year. A pink background means
that only last year's slides are available for the moment.
|February 6, 2017||Organization||-||Organization||February 6, 2017|
|February 6, 2017||Introduction, Models of computation||[Par09]||Introduction||February 6, 2017|
|February 6, 2017||Software synthesis||Sections I and II of [Bha00]||Software Synthesis||February 1, 2016|
|February 6, 2017||Background in QPSK (on behalf of System Studio exercises)||-||QPSK||February 1, 2016|
|February 13, 2017||Architecture synthesis and scheduling||[Ger99]||Architectural Synthesis||February 13, 2012|
|February 13, 2017||Overlapped scheduling||[Ger98]|
|February 20, 2017||No lecture, holiday week|
|February 27, 2017||Algorithm transformations||[Par95]||Transformations Addendum||February 19, 2012|
|February 27, 2017 and March 6, 2017||Fixed-point design||[Bou08]||Fixed-Point Design||February 27, 2017|
|March 6, 2017||The Arx RTL Language and Toolset||March 6, 2017|
|March 13, 2017||Polyphase implementation of multirate filters||[Lan02] and [Vai90]||Polyphase implementation||March 7, 2016|
|March 13, 2017||Code generation||Sections III and IV of [Bha00]||Code Generation||March 17, 2014|
|March 13, 2017||Case study: simultaneous design of processor and compiler||[Goo05]||-||-|
|March 20, 2017||Multiplierless filter design||[Hew00], [Vor07], [Aks14] and [Kot03]||Multiplierless Filter Design||March 20, 2017|
|March 20, 2017||Modern DSP Architectures||DSP Architectures||March 20, 2017|
|March 27, 2017||The CORDIC Algorithm||[And98] and [Loe00]||CORDIC||March 14, 2016|
|March 27, 2017||FFT basics + FFT Hardware Structures||Sections 9.2+9.3 of [Chi12], [He98]||FFT basics and FFT hardware||March 27, 2017|
|April 3, 2017||No lecture|
Caption of Figure 6: last subscript of y should be n-1 instead of n.
Right column of Page 856: Read Figure 9(b) where 9(a) is mentioned and vice versa.
Contents of Figure 17: In order to be consistent with next figures, rewrite "x = a - b" and "y = a - b + c * d".
Those interested in a detailed analysis of the probability density function of the truncation error after multiplication can consult the following non-compulsory paper:
Ahmadi, A. and M. Zwolinski, Fixed-Point Multiplication: A Probabilistic Bit-Pattern View, Microelectronics Reliability, Vol. 51(4), pp 790-796, (April 2011). Online copy (only in UT domain).
You can skip Section 11.3 (2D FIR filters).
Page 203, halfway bottom paragraph: twice add a minus sign to 2's exponent (so 2**n should become 2**-n).
Page 204, Equation 11.10: the "close" parenthesis with exponent 2 should move to the end of the equation.
You can skip Section 6.5.3 on the efficient computation of the iteration-period bound.
You can skip Section 12.4.3 on force-directed scheduling.
You can skip Sections 6 (folding) and 8 (relaxed look-ahead).
Comments on Figure 6. The issue is that unfolding can improve the processor utilization. The explanation in the paper is not correct.
The schedule shown in Figure 6(b) is rate optimal, i.e. it repeats at the iteration-period bound (T0min) value of 3. In this period, the total computation to be performed amounts to 9 time units (4 operations of 2 time units and 1 operation of 1 time unit). The lower bound on the number of processors is 3 (=9/3). However, this bound cannot be met. The reason is that the schedule needs to repeat every 3 time units. This means that a separate processor is necessary for each of the operations A to D that take two time units (a processor that would execute two of them would require an iteration period of 4). With 4 processors, the average processor utilization is 75% (9/12).
Figure 6(c) shows a schedule of the graph after 2-unfolding. The unfolded graph contains 2 iterations of the original graph. This schedule is also rate optimal, which means that the 2 iterations are executed in 6 time units. The optimal number of processors in this situation would again be 3 (=18/6). There now exists a schedule that reaches 100% processor utilization (the available 6 time units per processor can now be filled optimally with operations of 2 time units).
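The arithmetic behind the processor counts and utilization figures for Figure 6(b) and 6(c) can be checked with a small sketch. The function names are only illustrative (they do not come from the paper), and the duration list assumes, as stated above, that A to D take 2 time units each and E takes 1.

```python
from math import ceil

def processor_bound(durations, iteration_period):
    """Lower bound on the number of processors: total work per period, rounded up."""
    return ceil(sum(durations) / iteration_period)

def utilization(durations, iteration_period, processors):
    """Fraction of the available processor time that is spent on operations."""
    return sum(durations) / (processors * iteration_period)

# Original graph, Figure 6(b): A..D take 2 time units, E takes 1.
original = [2, 2, 2, 2, 1]
print(processor_bound(original, 3))   # 3, but this bound cannot be met
print(utilization(original, 3, 4))    # 0.75 with the 4 processors actually needed

# 2-unfolded graph, Figure 6(c): two iterations, iteration period 6.
unfolded = original * 2
print(processor_bound(unfolded, 6))   # 3
print(utilization(unfolded, 6, 3))    # 1.0
```

Note that the bound only counts time units; it does not check whether the operations can actually be packed into the period, which is exactly why 3 processors do not suffice before unfolding.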
In Figure 6(b), the operations A0, B0, C0, D0 and E0 belong to one iteration. The schedule has an iteration period of 3 (A1 starts 3 time units after A0, etc.), a latency of 7 (output on E0) and a span of 8 (end of D0).
In Figure 6(c), the operations A0/A1, B0/B1, C0/C1, D0/D1 and E0/E1 belong to one iteration. The schedule has an iteration period of 6 (A2 starts 6 time units after A0, etc.) and a latency and span of 12 (output on E1).
Comments on Figure 9(a). In my view, two inequalities are incorrect: r(A2) - r(M1) <= 2 and r(M4) - r(A3) <= -1.
Only study Section 1 (until page 6); the rest is optional.
after lecture of
|PRE||Preparatory Exercise (no marks)||0||February 6, 2017||17 hours||February 27, 2017|
|SCH||Scheduling, Designing in Arx and Verification by Means of Co-simulation||10||17 hours||March 6, 2017|
|GFS||The GFSK Receiver||30||56 hours||March 20, 2017|
Important: Please do not keep System Studio, Davis, etc. running when not necessary due to the limited number of available licenses. Check also for background processes that may continue to run and stop those as explained below.
Session cleanup: Before you log out, after closing System Studio, please check that all your simulations have properly terminated (in rare cases, stray processes continue running in the background). The command clean-up-ccss (to be typed at the shell prompt) will take care of this.
When you have finished all projects, you should provide me (Sabih Gerez) with a hardcopy of the reports and arrange an appointment to discuss them. Bring me all reports at once, in hardcopy (one hardcopy per team); do not send me individual reports unless you are stuck and want advice. Also send me the reports in PDF by e-mail. I use the electronic versions for back-up purposes, to zoom in on details that are not very clear in print, etc.
The mark will be based on the reports and the defense of your work in an oral examination session. The mark is basically the sum of points obtained for the projects divided by 5.
The performance at the oral exam can lead to a correction of at most one point up or down. In principle, the two members of a project team will receive the same mark unless there are strong indications of differences in performance.
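As a rough illustration of the rule above, the mark computation can be sketched as follows. The function name and the example points are hypothetical, and the sketch assumes that no rounding or clipping is applied, which the text does not specify.

```python
def course_mark(project_points, oral_correction=0.0):
    """Sum of project points divided by 5, plus an oral-exam correction of at most one point."""
    if abs(oral_correction) > 1:
        raise ValueError("the oral correction is at most one point up or down")
    return sum(project_points) / 5 + oral_correction

# Hypothetical example: 0, 8 and 27 points for three projects, half a point up at the oral exam.
print(course_mark([0, 8, 27], oral_correction=0.5))  # 7.5
```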
The course needs to be completed within the quarter in which it is taught. As an exception, two extra weeks are available for academic year 2016-2017, which means that the deadline to deliver your work is Monday morning, May 8, 2017. I will be at my office between 9:30 and 13:00. When you come to deliver your work, we will directly make an appointment for the closing oral session.