Why is using single-precision slower than using double-precision

zxu@monalisa.usc.edu (Zhiwei Xu)
Wed, 23 Nov 1994 00:38:40 GMT

          From comp.compilers

Newsgroups: comp.parallel,comp.arch,comp.compilers
From: zxu@monalisa.usc.edu (Zhiwei Xu)
Status: R
Originator: rmuise@dragon.acadiau.ca
Organization: University of Southern California, Los Angeles, CA
Date: Wed, 23 Nov 1994 00:38:40 GMT

Can anyone explain why a C program using single precision (float) is slower
than the same code using double precision (double)? Please try the following
code for computing pi. I have tried it on an IBM RS6000/250, an IBM SP2, a Sun4,
and a Sun SS20, and got the same strange timing on all of them.


On the RS6000 (using PowerPC 601) and AIX, I tried different compiler options,
such as
cc -O3 -qarch=ppc


On the Sun workstations, I tried
cc -O4 (with or without -fsingle)


The same thing!


However, on SP2, reasonable timing is seen after using the -qarch=pwr2 option.


Many thanks


Zhiwei Xu, zxu@aloha.usc.edu


------------------- C code for computing pi ---------------------


#include <stdio.h>
#include <sys/types.h>
#include <sys/times.h>
#include <sys/time.h>
#include <time.h>


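#define CLK_TCK 100.0        /* assumed clock ticks per second for times() */
/* #define double float */   /* uncomment to build the single-precision version */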


int N = 2000000;


int main(void)
{
    struct tms begin_time, end_time;
    struct timeval tv1, tv2;
    double kern_time, user_time, local, pi=0.0, w ;
    long i, j, t;


    times(&begin_time);
    gettimeofday(&tv1, (struct timeval*)0); /* before time */


    w = 1.0 / (double) N ;
    for(i=1;i<=N;i=i+1) {
        local = ( ((double) i) - 0.5 ) * w ;
        pi = pi + 4.0 / ( 1.0 + local * local ) ;
    }


    gettimeofday(&tv2, (struct timeval*)0); /* after time */
    times(&end_time);
    t = (tv2.tv_sec - tv1.tv_sec) * 1000000 + tv2.tv_usec - tv1.tv_usec;


    kern_time = (double)(end_time.tms_stime - begin_time.tms_stime) / CLK_TCK ;
    user_time = (double)(end_time.tms_utime - begin_time.tms_utime) / CLK_TCK ;


    printf("pi is %f \n",pi*w) ;
    printf("the kernel time is %f seconds\n",kern_time);
    printf("the user time is %f seconds\n",user_time);
    printf("the user MFLOPS is %f \n", 6 * (N/1000000.0) /user_time);


    printf("the wall clock time is %ld uSecs\n",t);   /* t is a long */
    printf("the wall MFLOPS is %f \n", 6 * N / (double) t);


    return 0;
} /* main() */
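
One thing worth ruling out, offered as a sketch and not as a diagnosis of the
timings above: with the "#define double float" switch, the unsuffixed constants
(1.0, 0.5, 4.0) are still of type double in C, so every expression that mixes
them with a float operand is converted to double by the usual arithmetic
conversions and then narrowed again on assignment. The version below keeps the
two precisions in separate functions and uses f-suffixed constants in the float
one. The names pi_float and pi_double are hypothetical and do not appear in the
original program, and whether this changes the timing on any of the machines
listed is untested.

```c
#include <stdio.h>

/* Hypothetical helper: midpoint-rule estimate of pi using only float
 * arithmetic. The f suffix keeps each constant in single precision. */
static float pi_float(long n)
{
    float w = 1.0f / (float)n;
    float sum = 0.0f;
    long i;

    for (i = 1; i <= n; i++) {
        float x = ((float)i - 0.5f) * w;
        sum += 4.0f / (1.0f + x * x);
    }
    return sum * w;
}

/* Hypothetical helper: the same loop in double precision, matching the
 * arithmetic of the original program. */
static double pi_double(long n)
{
    double w = 1.0 / (double)n;
    double sum = 0.0;
    long i;

    for (i = 1; i <= n; i++) {
        double x = ((double)i - 0.5) * w;
        sum += 4.0 / (1.0 + x * x);
    }
    return sum * w;
}
```

Each function can be called between the times() and gettimeofday() pairs of the
program above, so that the two loops are timed by the same harness.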












