 Requesting Large Pages
 Shared Segment Pointer =  504403158265495552
 Segment Size (DW) =  268435456  (MB =  2048 )
 Vector  Size (DW) =  67108864  (MB =  512 )
 Num_threads =  4
 Num_threads =  4
 Num_threads =  4
 Num_threads =  4
 rebind: num_parthds is  4
 Starting Initialization
 Done With Initialization
 a(1) 1.00000000000000000
 a(N) 0.000000000000000000E+00
 Base Offset =  67108864
 Incremental Offset =  2048
 Number of Threads =  4
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =   67108864
 Offset     =          0
 The total memory requirement is 1536 MB
 You are running each test   5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of  79980  microseconds
    (=  79980  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:      14244.4373       .0757       .0754       .0767
Scale:     14013.4036       .0768       .0766       .0769
Add:       13574.0131       .1189       .1187       .1191
Triad:     13658.5572       .1180       .1179       .1183
 Sum of a is =  101921363943750.000
 Sum of b is =  20384272788750.0000
 Sum of c is =  27179030385000.0000
 Base Offset =  67108864
 Incremental Offset =  2304
 Number of Threads =  4
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =   67108864
 Offset     =          0
 The total memory requirement is 1536 MB
 You are running each test   5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of  80574  microseconds
    (=  80574  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:      15311.3614       .1417       .0701       .0767
Scale:     14515.6600       .1444       .0740       .0769
Add:       15845.7121       .1847       .1016       .1191
Triad:     15888.1415       .1841       .1014       .1183
 Sum of a is =  101921363943750.000
 Sum of b is =  20384272788750.0000
 Sum of c is =  27179030385000.0000
 Base Offset =  67108864
 Incremental Offset =  2560
 Number of Threads =  4
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =   67108864
 Offset     =          0
 The total memory requirement is 1536 MB
 You are running each test   5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of  80257  microseconds
    (=  80257  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:      15311.3614       .1844       .0701       .0767
Scale:     15132.5294       .1842       .0710       .0769
Add:       15845.7121       .2177       .1016       .1191
Triad:     15888.1415       .2174       .1014       .1183
 Sum of a is =  101921363943750.000
 Sum of b is =  20384272788750.0000
 Sum of c is =  27179030385000.0000
 Base Offset =  67108864
 Incremental Offset =  2816
 Number of Threads =  4
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =   67108864
 Offset     =          0
 The total memory requirement is 1536 MB
 You are running each test   5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of  80745  microseconds
    (=  80745  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:      15311.3614       .2045       .0701       .0767
Scale:     15132.5294       .2058       .0710       .0769
Add:       15875.4467       .2322       .1015       .1191
Triad:     15963.2299       .2317       .1009       .1183
 Sum of a is =  101921363943750.000
 Sum of b is =  20384272788750.0000
 Sum of c is =  27179030385000.0000
 Base Offset =  67108864
 Incremental Offset =  3072
 Number of Threads =  4
----------------------------------------------
 Double precision appears to have 16 digits of accuracy
 Assuming 8 bytes per DOUBLE PRECISION word
----------------------------------------------
 Array size =   67108864
 Offset     =          0
 The total memory requirement is 1536 MB
 You are running each test   5 times
 The *best* time for each test is used
 ----------------------------------------------------
 Your clock granularity appears to be less than one microsecond
 Your clock granularity/precision appears to be      1 microseconds
 The tests below will each take a time on the order 
 of  79758  microseconds
    (=  79758  clock ticks)
 Increase the size of the arrays if this shows that
 you are not getting at least 20 clock ticks per test.
 ----------------------------------------------------
 WARNING -- The above is only a rough guideline.
 For best results, please be sure you know the
 precision of your system timer.
 ----------------------------------------------------
Function     Rate (MB/s)  RMS time   Min time  Max time
Copy:      15311.3614       .2160       .0701       .0768
Scale:     15132.5294       .2167       .0710       .0769
Add:       15875.4467       .2516       .1015       .1299
Triad:     15963.2299       .2512       .1009       .1295
 Sum of a is =  101921363943750.000
 Sum of b is =  20384272788750.0000
 Sum of c is =  27179030385000.0000
bindprocessor successful: thread_self() 1482985 cpu_id 2
bindprocessor successful: thread_self() 1572899 cpu_id 1
bindprocessor successful: thread_self() 1622237 cpu_id 3
bindprocessor successful: thread_self() 458945 cpu_id 0
