Page 2 of 2

Re: NXC: large bit array (Gimmick)

Posted: 25 Jun 2012, 19:30
by linusa
Oh, and btw.: Doing this primes-thingy on an NXT is a very good exercise. You learn a lot about the endless possibilities of optimization. It's an advantage, that the NXT is so slow, as you feel a greater impact of slow vs. fast code. With all the peculiarities of NXC...

Also, everyone has the EXACT same hardware at home an can reproduce the results here and join the tuning contest, which is not given when using different computers...

Re: NXC: large bit array (Gimmick)

Posted: 27 Jun 2012, 08:14
by HaWe
linusa wrote:There are possible performance improvements with this setbitarr and readbitarr code:

1.) Shift operations are slow in NXC (if anybody is interested, have a look at my NXC profiler and try for yourself, the results should be around here somewhere (or on nxtasy :/) ). You can pre-calc the 8 possible bitmasks:

Code: Select all

  byte mask[0]  = 1;
  byte mask[1]  = 2;
  byte mask[2]  = 4;
  byte mask[3]  = 8;
  byte mask[4]  = 16;
  //...

Code: Select all

  // this:
  byte mask = 1 << (idx % 8);
  // then becomes:
  mask[idx % 8];
  // use in both functions, of course!
You should also pre-calc ~mask to avoid the not-op.

Since array-lookups are (fairly) slow in NXC, you could try using a big SWITCH and see how fast it gets.

2.) In general, compiler like "const" keywords. Sophisticated compilers (gcc, MS VisualC++) will probably recognize many situations where you don't use "const" but don't modify the value, but it can't help to use it as much as possible in NXC, as it's probably not as optimizing as other compilers (just a guess).
Also, inlining the bittarr functions seems reasonable!

3.) Consider splitting the functions to setbitarrtrue, setbitarrfalse, and togglebitarr (with its own binary op). Then you save the if (?-op) and make the function smaller, if you know what you want to set (in Erasthothenes Sieve, you probably always want to set to true/false inside the loop).

4.) Try reducing calls to the bitarr functions and/or caching values...
I'm curious, I don't understand....
what's the complete code?

Code: Select all

void setbitarr( unsigned long idx, bool value ){
  byte mask = 1 << (idx % 8);
  mask[idx % 8]; ...? <------------------------? value....?
}

bool readbitarr( unsigned long idx ){
  byte mask = 1 << (idx % 8);
  mask[idx % 8]; 
  return ...? <------------------ ? what?
}

Re: NXC: large bit array (Gimmick)

Posted: 27 Jun 2012, 10:26
by gloomyandy
Hi Doc,
looking at your earlier code where you tested the unoptimized version of the bit array. I think you need to change the following line...

Code: Select all

      if(readbitarr(i)!=1) {
To be...

Code: Select all

      if(!readbitarr(i)) {
To make things work correctly. I'm no expert on NXC so I may be wrong but I notice that readbitarray is defined as a bool which in C means that any none zero value is true and a zero value is false. The actual code used in readbitarray will return values ranging from 1-128 as "true" values, so testing against a return value of 1 may be giving you the incorrect results....

Andy

Re: NXC: large bit array (Gimmick)

Posted: 27 Jun 2012, 13:02
by HaWe
thank you andy,
you are right, now I get the correct results for primes <=32000!

also testing of primes <= 240000 now works correctly!

Code: Select all

limit     biggest prime   number of primes   runtime
 32000      31991               3432           33795
240000    239999              21220          266427


Re: NXC: large bit array (Gimmick)

Posted: 27 Jun 2012, 13:54
by gloomyandy
Hi Doc,
that's good news. How do those time compare to your original method and to the best alternatives that can manage the higher numbers?

Andy

Re: NXC: large bit array (Gimmick)

Posted: 27 Jun 2012, 14:26
by HaWe

Code: Select all

 // testing on modulus(i)==0
limit           runtime  
 32000           54.1 s
240000         758.3 s
1000000       1650.0 s  
10000000        40.5 h

 // testing Eratosthenes by char array 
limit           runtime  
 32000           14.2 s
240000           -
1000000          -
10000000         -

 // testing Eratosthenes by  "bit array"
limit           runtime
32000            33.8 s
240000           266.4 s
1000000          -
10000000         -
edit:
BTW, for up to 10,000,000 on my PC,
the Eratosthenes algorithm needs 500 ms,
the Modulus algorithm needs 8000 ms,
(but on the NXT the Modulus algorithm needs 40 hours, O_M_G !!)

Code: Select all

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define limit 10000000 // 10 millions

char arr[limit];
unsigned long  i=0, j=0, pz=0; 

int  main(void) { 
   printf("\nPrimzahlen < %i \nStart... \n",limit);    
   arr[1]=arr[0]=0; 
   if(i<3) pz=limit-1;
   for(i=2; i<limit; ++i) {
      if(!arr[i])      {
         //printf("%i\n",i);
         pz=i;
         for(j=i*i; j<limit; j+=i)   {
            arr[j]=1;            
         }
      }      
   }
   printf("\nfertig!\n");
   printf("\ngroesste Primzahl: %i\n", pz);   
   printf("\nLaufzeit %i ms\n", clock());
   
   getchar();
   return 0;
}