cuda - Shared memory address passed to device function is still shared memory?

- April 15, 2015

Let's say I have a __device__ function:

__device__ unsigned char* dev_kernel(unsigned char* array_sh, int params){
    return array_sh + params;
}

and within a __global__ kernel I use it this way:

uarray = dev_kernel (uarray, params);

where uarray is an array located in shared memory.

But when I use cuda-gdb to look at the address of uarray, within the __global__ kernel I get:

(@generic unsigned char * @shared) 0x1000010 "z\377*"

and within the __device__ function I get:

(unsigned char * @generic) 0x1000010 <error reading address 0x1000010: operation not permitted>

Despite the error, the program runs OK (maybe this is a limitation of cuda-gdb).

So, I want to know: within the __device__ function, is uarray still shared? I'm changing the array from global to shared memory, and the time is about the same (with shared memory the time is a little worse).

"So, I want to know: within the __device__ function, is uarray still shared?"

Yes, when you pass a pointer to shared memory to a device function in this way, it still points to the same place in shared memory.
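One way to convince yourself of this without a debugger is to have the device function write through the pointer it receives, and then read the shared array directly in the kernel. The following is only a minimal sketch: store_at, check_kernel and the 32-thread launch are hypothetical names and sizes chosen for illustration, not taken from the question.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: writes through a pointer it received as an argument.
__device__ void store_at(unsigned char *base, int offset, unsigned char value) {
    base[offset] = value;
}

__global__ void check_kernel(int *result) {
    __shared__ unsigned char buf[32];

    buf[threadIdx.x] = 0;
    __syncthreads();

    // Each thread writes its slot via the device function.
    store_at(buf, threadIdx.x, (unsigned char)(threadIdx.x + 1));
    __syncthreads();

    // Thread 0 reads every slot directly from the shared array.
    if (threadIdx.x == 0) {
        int ok = 1;
        for (int i = 0; i < 32; i++)
            if (buf[i] != i + 1) ok = 0;
        *result = ok;
    }
}

int main() {
    int *d_result = NULL;
    int h_result = 0;

    cudaMalloc(&d_result, sizeof(int));
    cudaMemset(d_result, 0, sizeof(int));
    check_kernel<<<1, 32>>>(d_result);
    cudaMemcpy(&h_result, d_result, sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(d_result);

    printf("writes via device function visible in shared array: %s\n",
           h_result ? "yes" : "no");
    return 0;
}
```

If the pointer seen by the device function did not refer to the block's shared array, thread 0 would not observe the values written by the other threads. Error checking is omitted to keep the sketch short, so a "no" can also mean the kernel never ran.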

In response to the questions posted below, which are perplexing to me, I have elected to show a simple example:

$ cat t249.cu
#include <stdio.h>

#define ssize 256

__device__ unsigned char* dev_kernel(unsigned char* array_sh, int params){
    return array_sh + params;
}

__global__ void mykernel(){
  __shared__ unsigned char myshared[ssize];
  __shared__ unsigned char *u_array;
  for (int i = 0; i< ssize; i++)
    myshared[i] = (unsigned char) i;
  unsigned char *loc = dev_kernel(myshared, 5);
  u_array = loc;
  printf("val = %d\n", *loc);
  printf("val = %d\n", *u_array);
}

int main(){

  mykernel<<<1,1>>>();
  cudaDeviceSynchronize();
  return 0;
}
$ nvcc -arch=sm_20 -g -G -o t249 t249.cu
$ cuda-gdb ./t249
NVIDIA (R) CUDA Debugger
5.5 release
....
Reading symbols from /home/user2/misc/t249...done.
(cuda-gdb) break mykernel
Breakpoint 1 at 0x4025dc: file t249.cu, line 9.
(cuda-gdb) run
Starting program: /home/user2/misc/t249
[Thread debugging using libthread_db enabled]

Breakpoint 1, mykernel () at t249.cu:9
9       __global__ void mykernel(){
(cuda-gdb) break 14
Breakpoint 2 at 0x4025e1: file t249.cu, line 14.
(cuda-gdb) continue
Continuing.
[New Thread 0x7ffff725a700 (LWP 26184)]
[Context Create of context 0x67e360 on Device 0]
[Launch of CUDA Kernel 0 (mykernel<<<(1,1,1),(1,1,1)>>>) on Device 0]
[Switching focus to CUDA kernel 0, grid 1, block (0,0,0), thread (0,0,0), device 0, sm 2, warp 0, lane 0]

Breakpoint 1, mykernel<<<(1,1,1),(1,1,1)>>> () at t249.cu:12
12        for (int i = 0; i< ssize; i++)
(cuda-gdb) continue
Continuing.

Breakpoint 2, mykernel<<<(1,1,1),(1,1,1)>>> () at t249.cu:14
14        unsigned char *loc = dev_kernel(myshared, 5);
(cuda-gdb) print &(myshared[0])
$1 = (@shared unsigned char *) 0x8 ""
       ^
       |
      cuda-gdb is telling you that this pointer is defined in a __shared__ statement, and therefore its storage is implicit and it is unmodifiable.

(cuda-gdb) print &(u_array)
$2 = (@generic unsigned char * @shared *) 0x0
       ^                          ^
       |                          u_array is stored in shared memory.
      u_array is a generic pointer, meaning it can point to anything.

(cuda-gdb) step
dev_kernel(unsigned char * @generic, int) (array_sh=0x1000008 "", params=5)
    at t249.cu:6
6           return array_sh + params;
(cuda-gdb) print array_sh
$3 = (@generic unsigned char * @register) 0x1000008 ""
          ^                      ^
          |                    array_sh is stored in a register.
         array_sh is a generic pointer, it can point to anything.

(cuda-gdb) print u_array
No symbol "u_array" in current context.

(Note that I can't access u_array from inside the __device__ function, so I don't understand your comment there.)

(cuda-gdb) step
mykernel<<<(1,1,1),(1,1,1)>>> () at t249.cu:15
15        u_array = loc;
(cuda-gdb) step
16        printf("val = %d\n", *loc);
(cuda-gdb) print u_array
$4 = (
    @generic unsigned char * @shared) 0x100000d ......
       ^                         ^
       |                       u_array is stored in shared memory.
     u_array is a generic pointer, it can point to anything.
(cuda-gdb)

Although you haven't provided it, I am assuming your definition of u_array is similar to mine, based on the cuda-gdb output you are getting.

Note that indicators like @shared are not telling you what kind of memory a pointer is pointing to. They are telling you either what kind of pointer it is (one defined implicitly in a __shared__ statement) or else where the pointer itself is stored (in shared memory).
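If you want to ask at run time what a generic pointer actually points to, rather than reading it off the debugger annotations, the address space predicate functions __isShared() and __isGlobal() can be called on it in device code. The sketch below assumes a CUDA toolkit recent enough to provide both of them (which may not be true of the 5.5 release used in the session above). The names describe and probe_kernel, and the integer ids, are hypothetical and only for illustration.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical helper: receives a generic pointer and reports which
// address space it actually refers to.
__device__ void describe(int id, const void *p) {
    printf("pointer %d: shared=%u global=%u\n",
           id, __isShared(p), __isGlobal(p));
}

__global__ void probe_kernel(unsigned char *gbuf) {
    __shared__ unsigned char sbuf[16];
    sbuf[0] = 1;
    gbuf[0] = 1;

    describe(0, sbuf);  // points into the block's shared memory
    describe(1, gbuf);  // points into global memory from cudaMalloc
}

int main() {
    unsigned char *d_buf = NULL;

    cudaMalloc(&d_buf, 16);
    probe_kernel<<<1, 1>>>(d_buf);
    cudaDeviceSynchronize();  // flushes device-side printf output
    cudaFree(d_buf);
    return 0;
}
```

Inside describe the parameter is an ordinary generic pointer held in a register, just like array_sh in the session above, yet the predicates still distinguish the two cases because the address itself determines which memory is accessed.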

If this doesn't sort out your questions, please provide a complete example, along with the complete cuda-gdb session output, just as I have.
