If you are writing C/C++ code that does a lot of math on vectors, you may want to check out VLFeat (http://vlfeat.org).
It is designed as a computer vision library, but it also gives you SSE2-accelerated wrappers for vector computations.
Say your original code for computing the inner (dot) product of two vectors looks like this:
float productOfVectors(const float *vecA, const float *vecB, const int dimension) {
    float value = 0.0f;
    for (int i = 0; i < dimension; i++)
    {
        value += (vecA[i] * vecB[i]);
    }
    return value;
}
You can speed this up significantly by adding VLFeat to your project and replacing the loop with this:
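#include <vl/mathop.h>

float productOfVectors(const float *vecA, const float *vecB, const int dimension) {
    float value = 0.0f;
    /* One vector on each side (numDataX = numDataY = 1), so the result is a single float.
       VlKernelL2 selects the inner product as the comparison function. */
    vl_eval_vector_comparison_on_all_pairs_f(&value, dimension, vecA, 1, vecB, 1, vl_get_vector_comparison_function_f(VlKernelL2));
    return value;
}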
It's pretty simple, but it really works. It makes use of the SSE2 instructions provided by your CPU, which results in a non-trivial speedup when you are doing large-scale computation.
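To give a rough idea of where the speedup comes from, here is a minimal sketch of a hand-written SSE2-style dot product. This is not VLFeat's actual implementation; the function name dot_product_simd_sketch is made up for illustration. It multiplies and accumulates four floats per iteration, then handles the leftover elements with an ordinary loop. If the compiler does not target SSE2, it simply falls back to the scalar loop.

```c
#include <stddef.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* SSE2 intrinsics; also brings in the SSE float operations */
#endif

/* Hypothetical helper (not part of VLFeat): inner product of two float arrays. */
float dot_product_simd_sketch(const float *a, const float *b, size_t n)
{
    size_t i = 0;
    float sum = 0.0f;

#if defined(__SSE2__)
    /* Four partial sums, one per 32-bit lane of the 128-bit register. */
    __m128 acc = _mm_setzero_ps();
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        acc = _mm_add_ps(acc, _mm_mul_ps(va, vb));
    }
    /* Add the four lanes together to get one scalar. */
    float lanes[4];
    _mm_storeu_ps(lanes, acc);
    sum = lanes[0] + lanes[1] + lanes[2] + lanes[3];
#endif

    /* Scalar tail: the remaining n % 4 elements, or everything without SSE2. */
    for (; i < n; ++i)
        sum += a[i] * b[i];

    return sum;
}
```

Note that the SIMD version adds the products in a different order than the plain loop, so for general floating-point input the result can differ in the last few bits.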
VLFeat supports more kinds of vector comparison than this one; see its documentation for the full list. Thanks to the developers for their good work.