c - Strange performance issue with AMD's ACML BLAS/LAPACK library


I asked this question on the AMD developers forum a few days ago but haven't gotten an answer. Maybe someone here has some insight.
http://devgurus.amd.com/thread/167492

I am running ACML version 5.3.1 (libacml_mp, gfortran_fma4 build) on Opteron 6348 processors under Ubuntu 12.04.

What happens is that a call to dsyev (eigen decomposition) slows down dramatically (by a factor of 10+) if I first make a call to dpotrf (Cholesky decomposition). It makes no sense to me why this would happen. Maybe there is some kind of cache I need to clear, or something like that.

Here is a simple C program that reproduces the problem.

    #include <stdio.h>
    #include <stdlib.h>
    #include <acml.h>
    #include <time.h>

    int main(void) {
      double * x = malloc(1000000 * sizeof(double));
      double * y = malloc(1000000 * sizeof(double));
      double * eig0 = malloc(1000000 * sizeof(double));
      double * eig1 = malloc(1000000 * sizeof(double));
      double * eigw = malloc(1000 * sizeof(double));
      double * chol = malloc(1000000 * sizeof(double));

      clock_t t0,t1;
      int info;
      int i;

      // generate a random 1000 x 1000 matrix
      for(i = 0; i<1000000; ++i){
        x[i] = rand() / (double) RAND_MAX;
      }

      // compute Y = XX^T so that Y is symmetric positive definite
      dgemm('N','T',1000,1000,1000,1,x,1000,x,1000,0,y,1000);

      // make copies of Y for the Cholesky and eigen decompositions
      for(i = 0; i<1000000; ++i){
        chol[i] = y[i];
        eig0[i] = y[i];
        eig1[i] = y[i];
      }

      // first eigenvalue test
      t0 = clock();
      dsyev('V','U',1000,eig0,1000,eigw,&info);
      t1 = clock();
      // clock_t is not an int, so cast before printing
      printf("eigen decomposition time: %ld\n", (long)((t1-t0)/1000));

      // cholesky
      dpotrf('U',1000,chol,1000,&info);

      // second eigenvalue test, after cholesky
      t0 = clock();
      dsyev('V','U',1000,eig1,1000,eigw,&info);
      t1 = clock();
      printf("eigen decomposition time: %ld\n", (long)((t1-t0)/1000));

      return 0;
    }

Here is the output:

    eigen decomposition time: 8120
    eigen decomposition time: 95140

If I comment out the dpotrf line, it works fine:

    eigen decomposition time: 8150
    eigen decomposition time: 8210

