Wikipedia:Reference desk/Archives/Computing/2013 October 14

Computing desk
< October 13 << Sep | October | Nov >> October 15 >
Welcome to the Wikipedia Computing Reference Desk Archives
The page you are currently viewing is an archive page. While you can leave answers for any questions shown below, please ask new questions on one of the current reference desk pages.


October 14

edit

C++: How do I assign a bool array returned from a function to an empty array

edit

Consider the following code fragment

bool afunctionreturningaboolarray() { .... return (someboolarray); }

It represents a function returning a bool array. If I now try to do the following

bool myboolarray[] = afunctionreturningaboolarray();

I get the following compile-time errors

error: initializer fails to determine size of `myboolarray'
error: invalid initializer

How can I assign the array returned by afunctionreturningaboolarray() to something so that I can work with it? -- Toshio Yamaguchi 14:35, 14 October 2013 (UTC)[reply]

And yes, I understand that the problem is that the size of myboolarray isn't known at compile time. Does that mean I would need to use a vector instead, or is there some other trick? -- Toshio Yamaguchi 14:47, 14 October 2013 (UTC)[reply]

Your function is returning a single bool, not a bool array. And you need to consider where the storage for this array is allocated. Some workable options are:
  • have the function allocate the array from the heap and return a pointer to it (function f() in the code below)
  • allocate the array in the calling function and pass it by reference to the function (function g() in the code below)
bool* f() {
  bool * barray = new bool[10];

  for(int x=0; x<10; x++)
    barray[x] = true;

  return barray;
}

void g(bool (&val)[10]){ // val is a reference to an array, not an array of refererences                            
  for (int x=0; x<10; x++){
    val[x] = false;
  }
}

int main(){
  bool*a = f(); // f allocates the array on the heap                                                                
  // do something with b[] here                                                                                     
  delete(a);

  bool b[10]; // allocate on the stack here                                                                         
  g(b); // pass b by reference to g, which changes it                                                               
}
-- Finlay McWalterTalk 15:28, 14 October 2013 (UTC)[reply]
Toshio specified that s/he doesn't know the size of the array in advance. The second method in the example doesn't meet that requirement. Standard library vectors are well suited here:
    #include <vector>
    #include <stdlib.h>
    #include <iostream>
    #include <time.h>

    void f(std::vector<bool>&); // returns vector of random size, element is true iff index is a square number
    
    int main()
    {
       std::vector<bool> a; // empty array
       f(a); // fills array with vector of random size, elements are true for indices that are squares, otherwise false

       // dump the array
       int len = a.size();
       for (int i = 0; i < len; ++i)
       {
          std::cout << i << '\t' << (a[i] ? "square" : "-") << '\n';
       }
       return 0;
    }

    void f(std::vector<bool>& param)
    {
       srand(time(0));    // seed the random number generator
       int siz = rand();  // get a random size for the array
       std::vector<bool> wrk(siz,false); // a working copy of what is to be returned, with all elements initialized to false
       for (int i = 0; i*i < siz; ++i)
       {
          wrk[i*i] = true; // set the elements with square indices to true
       }
       param.swap(wrk); // after swapping, wrk holds the empty array that was created in main,
                        // and the data to be returned is moved to param.
                        // swap is useful, exception safe and efficient.
                        // the destructor of wrk is invoked when the function exits
    }
--NorwegianBlue talk 21:11, 14 October 2013 (UTC)[reply]
Let me clarify this a bit. What I want to do is using a bool array to store some small prime numbers without taking up too much memory. I would create a function containing a local bool array and initially set all array entries to, say, true. Then I could set all entries where the indices are multiples of a prime number to false (essentially creating a Sieve of Eratosthenes, where the values of the primes are represented through the indices of the bool array). The advantage of (compared to using an array of ints) that I hope for is that this will require less memory (according to http://www.cplusplus.com/doc/tutorial/variables/, a boll generally seems to require less memory than the numeric data types and that might matter even if I need to store some hundreds of primes). The function would then need to return that array so that I can use it elsewhere. -- Toshio Yamaguchi 21:40, 14 October 2013 (UTC)[reply]
Suppose you want the primes up to 1,000,000. An standard array of numbers for the primes would take (IIRC) 78,496 entries, at 4 bytes each that is 313,984 bytes. If you use one boolean value for each number (a bit vector) that takes 1 million of them. So it depends on how the boolean value is stored. If it takes one byte each that is too much, but if you pack eight boolean values per byte, that reduces it to 125,000 bytes. (Also, only 20 bits are required to store a number up to 1,000,000, so you can save memory that way.) Bubba73 You talkin' to me? 23:36, 14 October 2013 (UTC)[reply]
You are talking about storing the numbers as a bit field, am I understanding this correctly? -- Toshio Yamaguchi 10:06, 15 October 2013 (UTC)[reply]
NorwegianBlue's suggestion should work well if you're worried about space. vector<bool> is specialized and packs the bits efficiently, rather than using bytes or words to store a single value. However, it isn't quite a standard container and can have unexpected effects due to how references are handled. bitset is considered better but requires you to know the size at compile time. Boost has a dynamic version. See Bit array#Language support. Katie R (talk) 14:25, 15 October 2013 (UTC)[reply]

Searching an array

edit

Hey hey. A bit of a coding question here - trying to keep it as general as possible, but if more specific details would help I can provide them. Say I have an array with 1000 elements, which are all integers, and I have another variable which is an integer. I want to search for this variable in that array, and to return either the number itself if it exists in the array, or the next highest number following it which exists in the array. Currently I'd do something like this: (horrible pseudocode mixture of several programming languages ahead)

 var list = { ... } // List of 1000 integers
 var target = 123   // Integer to search for
 var return = 0     // Integer to return
 var temp = 0
 
 while return = 0
   if list(temp) >= target then
     return = list(temp)
   else
     temp = temp + 1
   end
 end

Obviously this is terribly inefficient. I've considered perhaps starting from the middle of the list, to potentially save time, or perhaps rather than incrementing "temp" 1 by 1, we increment 100 at a time until we pass the target, then reduce 20 at a time, then increment 1 at a time, to perhaps hone in on it faster, but I can't shift the feeling that there must be a more efficient way of doing it. Any thoughts would be much appreciated. OrganicsLRO 15:30, 14 October 2013 (UTC)[reply]

We have to assume the array is sorted, of course, but, if it is, you can do this:
N = 0
LO = 1 (or 0, if your first array element is numbered 0)
HI = TOTAL_COUNT (or TOTAL_COUNT - 1, if your first array element is numbered 0)
LOOP:
 OLD_N = N
 N = (HI+LO)/2 

 if OLD_N = N, then number not found in array, so return element at N+1, unless beyond array bounds.
 if you found number at N, end program.
 if array element you are looking for is higher than the one at N, then: 
    LO = N + 1
    goto LOOP
 if array element you are looking for is lower than the one at N, then: 
    HI = N - 1
    goto LOOP
So, it divides the list of 1000 into a smaller list of either the bottom 500 or top 500, then divides that list into either it's top 250 or bottom 250, etc., until we either find the array element or are down to 0 elements left to search. This would result in only 10 compares to search the array of 1000. However, searching an array of only 1000 elements is likely to be so fast that there's no need to optimize it like this. However, if you had a billion elements, then it would take only 30 compares, versus a billion, and that would make a noticeable difference in elapsed time.
If you assume your data is evenly distributed, then you could do a bit better than just starting in the center. For example, if your numbers range from -1 million to +1 million, and you are looking for 500,000, then that is 75% of the way from the low to the high, so you might start at element 750 out of 1000. You could then repeat this each time through the loop. However, as a practical matter, the additional time to do the calcs often cancels any savings you get from checking fewer elements, and adds significantly to the complexity and thus chance of error in the code. StuRat (talk) 15:42, 14 October 2013 (UTC)[reply]
Which is a binary search. If the array is not sorted, there is no better method than linear search. -- Finlay McWalterTalk 15:58, 14 October 2013 (UTC)[reply]
If you only need to do a single search, yes, that's true. However, if there are to be many searches, sorting the array first can make things faster. StuRat (talk) 16:04, 14 October 2013 (UTC)[reply]
If nothing is known about the nature of the array elements, other that they are integers and they are sorted, then the binary search makes sense. But if something IS known about the distribution, for example, that the values should be spread rather evenly between the minimum and maximum, it might make sense to compute an initial guess based on the nature of the elements. But the extra overhead might not be worth it if you only have 1000 elements. Jc3s5h (talk) 16:15, 14 October 2013 (UTC)[reply]
Yes, the last paragraph in my answer discusses this. However, we do need to know whether they are sorted in ascending or descending order. Of course, if we know they are sorted but don't know which order, a simple compare of the first and last array element will tell us this. StuRat (talk) 16:26, 14 October 2013 (UTC)[reply]
Ah, all very clever. The list is indeed sorted in ascending order, and distribution is an interesting point, it should make a good approximate guess. This was a much simplified example - at the time I wasn't entirely sure of the list size but now it seems to be closer to a million elements. Distribution is a bit iffy but approximately even, so I'll try some combination of these ideas. Many thanks! 82.39.232.82 (talk) 19:48, 14 October 2013 (UTC)[reply]
With a million elements, binary search will get you the result in 20 or less comparisons. It's hard to do better than that if you have any kind of overhead. --Stephan Schulz (talk) 22:24, 14 October 2013 (UTC)[reply]