.net - In C#, why is it faster to create a HashSet from a List instead of starting with a HashSet?


I have a method that takes an upper limit and returns a list of the prime numbers below that limit.

    public static List<int> AllPrimesUnder(int upperLimit) 

I later decided I needed to do lookups against that list, essentially asking the question "is this number prime?". Since I'm dealing with primes under values around a million, I realized a HashSet was the structure I should be using. The lookup using the result of the method was certainly faster, but the method itself was slower.
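For illustration, here is a minimal sketch of the two kinds of lookup I mean (the variable names are mine, not from the original code); a List scans its array element by element, while a HashSet hashes once and probes a bucket:

    // Hypothetical sketch; assumes using System.Collections.Generic;
    List<int> primeList = Prime.AllPrimesUnder(1000000);
    HashSet<int> primeSet = new HashSet<int>(primeList);

    bool viaList = primeList.Contains(999983); // O(n): walks the array until a match
    bool viaSet  = primeSet.Contains(999983);  // O(1) on average: hash, then bucket probe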

I believe the reason it's slower is that the HashSet checks for duplicates before adding, while a List just shoves each item onto the end. What surprised me, and spawned this question and its title, is why starting with a List and using it to create a HashSet, like so:

    hashSet = new HashSet<int>(Prime.AllPrimesUnder(1000000)); 

is faster than using a HashSet internally in the method itself, enabling a call like this:

    hashSet = Prime.AllPrimesUnder_Hash(1000000); 

If the slowdown is in the duplicate checking, it should have to do the same amount of checking no matter what, right? This is likely where my understanding is failing me.
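(As an aside, the duplicate check is observable directly: HashSet<T>.Add returns a bool telling you whether the item was actually inserted. A minimal sketch:)

    HashSet<int> set = new HashSet<int>();
    bool first = set.Add(7);  // true: 7 was not present, so it was inserted
    bool again = set.Add(7);  // false: duplicate detected, set left unchanged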

Here are the times I'm getting for primes under one million:

  • 0.1136s pure HashSet
  • 0.0975s pure List (expected this to be faster)
  • 0.0998s pure List converted to HashSet (not expected)

If the reason for this can be explained in simple terms, I'd love to hear it. I suppose at a minimum I'm looking for enough of an understanding to know whether I should start with a List or a HashSet when the end result will be a large HashSet of items.

I've added the body of the prime method below. Note that the interaction with the data structure is identical (code-wise) between the two methods; I don't believe the way each one adds items to its data structure should account for the anomaly.

    public static List<int> AllPrimesUnder(int upperLimit)
    {
        List<int> primeList = new List<int>();
        primeList.Add(2);
        int testNumber = 3;
        bool isPrime;

        while (testNumber <= upperLimit)
        {
            isPrime = true;

            // Trial division by the primes found so far.
            foreach (int prime in primeList)
            {
                if (testNumber % prime == 0)
                {
                    isPrime = false;
                    break;
                }
                // No divisor can exceed the square root of testNumber.
                if (testNumber < prime * prime)
                    break;
            }

            if (isPrime)
                primeList.Add(testNumber);

            testNumber++;
        }

        return primeList;
    }

Edit: By request I'm adding the code for the hash method. If it looks identical to the code above, that's because it is.

    public static HashSet<int> AllPrimesUnder_Hash(int upperLimit)
    {
        HashSet<int> primeHash = new HashSet<int>();
        primeHash.Add(2);
        int testNumber = 3;
        bool isPrime;

        while (testNumber <= upperLimit)
        {
            isPrime = true;

            foreach (int prime in primeHash)
            {
                if (testNumber % prime == 0)
                {
                    isPrime = false;
                    break;
                }
                if (testNumber < prime * prime)
                    break;
            }

            if (isPrime)
                primeHash.Add(testNumber);

            testNumber++;
        }

        return primeHash;
    }

Also by request, here is the (ugly, hackish) code I used to test execution time:

    Stopwatch stopwatch = new Stopwatch();
    int iterations = 1;
    HashSet<int> hashSet = new HashSet<int>();
    List<int> list = new List<int>();

    stopwatch.Restart();
    for (int i = 0; i < iterations; i++)
    {
        hashSet = Prime.AllPrimesUnder_Hash(1000000);
    }
    stopwatch.Stop();

    Console.WriteLine("Hash: " + (stopwatch.Elapsed.TotalSeconds / iterations).ToString("#.###################"));

    //////////////////////////

    stopwatch.Restart();
    for (int i = 0; i < iterations; i++)
    {
        hashSet = new HashSet<int>(Prime.AllPrimesUnder(1000000));
    }
    stopwatch.Stop();

    Console.WriteLine("List converted: " + (stopwatch.Elapsed.TotalSeconds / iterations).ToString("#.###################"));

In AllPrimesUnder you are enumerating the prime list many times (once for each prime candidate). Enumerating a List is faster than enumerating a HashSet because the internal array of a HashSet is more sparse.
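To see this effect in isolation, you could time nothing but enumeration over the same contents in both containers; a hypothetical sketch in the style of the test code above (names and repeat count are mine):

    // Hypothetical sketch; assumes using System, System.Diagnostics, System.Collections.Generic.
    List<int> primes = Prime.AllPrimesUnder(1000000);
    HashSet<int> primeSet = new HashSet<int>(primes);
    int repeats = 1000; // repeat to get a measurable elapsed time

    Stopwatch sw = Stopwatch.StartNew();
    long listSum = 0;
    for (int r = 0; r < repeats; r++)
        foreach (int p in primes)    // walks a dense internal array
            listSum += p;
    sw.Stop();
    Console.WriteLine("List:    " + sw.Elapsed.TotalMilliseconds + " ms");

    sw.Restart();
    long hashSum = 0;
    for (int r = 0; r < repeats; r++)
        foreach (int p in primeSet)  // walks the HashSet's sparser slot array
            hashSum += p;
    sw.Stop();
    Console.WriteLine("HashSet: " + sw.Elapsed.TotalMilliseconds + " ms");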

Not having seen the code for AllPrimesUnder_Hash, I'm guessing this is the main cause.

I'm not convinced that resizing a list of a few thousand items could consume 20ms. Copying memory using memcpy (which is what happens internally) is one of the highest-throughput operations you can do; you can copy tens of gigabytes per second per core.
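If you want to rule resizing out, it's easy to measure directly; a hypothetical sketch (the hard-coded prime count is mine, used so both lists end up the same size):

    // Hypothetical sketch; assumes using System, System.Diagnostics, System.Collections.Generic.
    const int N = 78498; // number of primes below one million

    Stopwatch sw = Stopwatch.StartNew();
    List<int> grown = new List<int>();      // starts small; capacity doubles as needed
    for (int i = 0; i < N; i++)
        grown.Add(i);
    sw.Stop();
    Console.WriteLine("Grown:     " + sw.Elapsed.TotalMilliseconds + " ms");

    sw.Restart();
    List<int> presized = new List<int>(N);  // one allocation, no grow-and-copy cycles
    for (int i = 0; i < N; i++)
        presized.Add(i);
    sw.Stop();
    Console.WriteLine("Pre-sized: " + sw.Elapsed.TotalMilliseconds + " ms");

The gap between the two should come out at a fraction of a millisecond for this element count, nowhere near 20ms.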

