@witchdoctor said:
The language is CUDA C. It's supposed to be a set of extensions to C for programming on the GPU.
And I was returning 0 when I got a NULL pointer because it made sense for the processing that followed. The pointer pointed to an array of integers.
As for figuring it out: some very confused and increasingly desperate experimenting.
I didn't know it broke the C standard. I just thought it was a very stupid idea to have NULL pointers refer to actual data.
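Roughly like this, if you squint (not the exact code, just the shape of the guard; sum() is a stand-in for the real processing):

#include <cstddef>

// Stand-in for the real processing: sum an int array,
// treating a NULL pointer as an empty array.
__device__ int sum(const int* data, int n)
{
    if (data == NULL)
        return 0; // 0 is the natural identity for the summing that follows
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += data[i];
    return total;
}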
Btw a nested WTF: This pointer is generated by the runtime when launching a GPU function. Unfortunately it is not passed as a parameter or anything so mundane; instead you have to declare it like so:
extern __shared__ int memory[];

__global__
void function(int arg1, float* arg2)
{
    // ...
}
And there can only be one extern __shared__ array per kernel. If you need more than one, you have to request enough bytes at launch and subdivide the block yourself.
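So to fake two arrays you end up with something like this (illustrative names, not my actual code):

// One extern __shared__ array per kernel; to get two logical arrays,
// request enough bytes at launch and carve the block up by hand.
extern __shared__ int shmem[];

__global__ void kernel(int n)
{
    int* ints = shmem;                  // first n ints
    float* floats = (float*)(ints + n); // next n floats, directly after
    if (threadIdx.x < n)
    {
        ints[threadIdx.x] = threadIdx.x;
        floats[threadIdx.x] = 0.5f * threadIdx.x;
    }
}

int main()
{
    int n = 32;
    // Third launch parameter = bytes of dynamic shared memory for both arrays.
    kernel<<<1, n, n * (sizeof(int) + sizeof(float))>>>(n);
    cudaDeviceSynchronize();
    return 0;
}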
Weird. You sure you're not running into some sort of NVCC bug? (There are plenty of those, especially in some of the old CUDA releases.)
I tried to repro this using
#include <cstdio>

__device__ void f( int* p )
{
    if( p == NULL ) printf( "p is NULL\n" );
    else printf( "p is not NULL\n" );
}

__global__ void test()
{
    extern __shared__ int foo[];
    __shared__ int bar;
    printf( "foo = %p; bar = %p\n", (void*)foo, (void*)&bar );
    f( foo );
    f( &bar );
    f( 0 );
}

int main()
{
    // Third <<<>>> parameter is the dynamic shared memory size in bytes.
    test<<<1,1,0>>>();
    cudaDeviceSynchronize();
    test<<<1,1,4>>>();
    cudaDeviceSynchronize();
    return 0;
}
And I get that foo is at 0x1000010 and bar at 0x1000000. (It also prints "not NULL", "not NULL", "NULL" for each kernel launch.) I compiled with -arch compute_20, and it's running on a GTX 480.