For instance, you can search the web for a suitable power series. Write a short program to observe the convergence. As programming languages you can use C, C++, Perl, or Python; for any other language you have to reach an agreement with Jacob. Put your result into fast_pi.c, fast_pi.cpp, fast_pi.pl, or fast_pi.py. How many iterations do you need to get 8, 10, 12, and 14 correct digits? (Compare with a constant, for instance.) Give a guess of how the number of correct digits depends on the number of iterations. To be better than the integration, the number of correct digits must grow faster than O(log(N)) with the number of iterations N. Write this as plain text into fast_pi.analysis.
Put your complete gcc-compilable program into vector_addition.cpp. It is up to you whether you use C arrays or STL vectors of doubles. You are free to use inheritance for a more concise implementation. The three calculation versions are
Compute the required time per operation (only the addition, not the assignment) for all 15 combinations. Give a short explanation of why the timing differs for different sizes and for different implementations of mathematically identical operations. Put your explanation as plain text into vector_addition.analysis.
The result will be three floating-point numbers, some of which can be zero. Your directory trunk/hw01 already contains a source file precision.cpp with some tests. Complete the code and check whether the tests are computed correctly. Be aware that we will add more tests for the grading.
CHANGE: Say we call our matrices A, B, and C with C = AB. A is initialized so that $a_{ij} = i + j$. B is 2A, so that $b_{ij} = 2(i + j)$. You can use the initialization in blocked_matrix_product_optimized.cpp.
Test: Obviously, the result C is an NxN matrix with
$c_{ij} = \frac{1}{3} N (1 - 3i - 3j + 6ij - 3N + 3iN + 3jN + 2N^2)$.
Verify this for three elements. The relative error
$|c_{ij,\mathrm{computed}} - c_{ij,\mathrm{expected}}| / |c_{ij,\mathrm{expected}}|$
should be less than 0.0001.
You are free to choose any blocking factor you want. More than 6 levels of nesting are allowed. Run this computation for matrices of doubles and of floats.
Run 10 repetitions of each computation to get sufficiently precise timing (measure the accumulated time, not the time of each repetition). Write the timings of your 5 different implementations (blocking, nesting, ...) into matrix_multiplication.time_log. Start your timing after the matrices are initialized. Name the C or C++ program that contains all 5 variations of the matrix product (e.g., 5 templated functions, or 10 non-templated functions for double and float) matrix_multiplication.cpp. Write your compile command with all options into matrix_multiplication.compilation. With 5 versions, each applied to double and float, you will have ten timings. You can use one of the programs from the class as a starting point. Of course, you will make a new directory hw02 in trunk. The check-in that you consider your submission shall contain an appropriate comment.
Use molerat for your measurements. Log into burrow.cs.indiana.edu and ssh to molerat, or log into molerat.cs.indiana.edu directly.