How does work_group_barrier work?

Hello. As I read the documentation on work_group_barrier, every thread must reach the barrier before any thread can continue past it. It can be used so that one thread writes data to global memory and the other threads can then read that data. (I use the term "thread" for simplicity: each work-item is one thread, and they basically run in parallel.)

kernel void map2d_image(
    global void *ranges,
    int width,
    int height,
    float z,
    global float *coord,
    write_only image2d_t output
) {
    int g0 = get_global_id(0);
    int g1 = get_global_id(1);
    int l0 = get_local_id(0);
    int l1 = get_local_id(1);
    if (g0 == 0 && g1 == 0) {
        struct SMappingRanges mranges = *(struct SMappingRanges*)(ranges);
        map2D(coord, calc_seamless_none, mranges, width, height, z);
    }
    work_group_barrier(CLK_GLOBAL_MEM_FENCE|CLK_LOCAL_MEM_FENCE);
    printf("[map2d_image] g=(%d,%d) l=(%d,%d) coord=%f,%f,%f\n", g0, g1, l0, l1, coord[0], coord[1], coord[2]);
}

In this code, each thread should stop at work_group_barrier until all threads have reached it. What I want is for one thread to call map2D and for the others to use the data it computes in coord. But the threads do not seem to be blocked at the barrier. Here is the output:

[map2d_image] g=(0,1) l=(0,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(1,1) l=(1,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(2,1) l=(2,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(3,1) l=(3,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(0,3) l=(0,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(1,3) l=(1,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(2,3) l=(2,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(3,3) l=(3,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(0,0) l=(0,0) coord=-1,000000,-1,000000,99,000000
[map2d_image] g=(1,0) l=(1,0) coord=-1,000000,-1,000000,99,000000
[map2d_image] g=(2,0) l=(2,0) coord=-1,000000,-1,000000,99,000000
[map2d_image] g=(3,0) l=(3,0) coord=-1,000000,-1,000000,99,000000
[map2d_image] g=(0,2) l=(0,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(1,2) l=(1,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(2,2) l=(2,0) coord=0,000000,0,000000,0,000000
[map2d_image] g=(3,2) l=(3,0) coord=0,000000,0,000000,0,000000

I think I misunderstand the barrier. How do I need to use work_group_barrier so that the coord data is available in all threads?
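
One thing I notice in the output is that only the rows with g1 = 0 print the computed values, and the local IDs restart at 0 in every row, so it looks as if each row is its own work-group. Assuming the barrier only synchronizes the work-items of a single work-group, I could imagine a pattern like the following minimal sketch, where the first work-item of every work-group fills a local buffer and the rest of that group reads it after the barrier. The kernel name share_in_group, the arguments src and dst, and the plain copy that stands in for the map2D call are all made up for illustration; this is not my real code:

kernel void share_in_group(
    global const float *src,
    global float *dst
) {
    // One buffer per work-group, visible to all work-items of that group.
    local float shared[3];
    int g0 = get_global_id(0);
    int g1 = get_global_id(1);
    // Only the first work-item of each work-group fills the buffer
    // (hypothetical stand-in for the map2D call).
    if (get_local_id(0) == 0 && get_local_id(1) == 0) {
        shared[0] = src[0];
        shared[1] = src[1];
        shared[2] = src[2];
    }
    // Every work-item of this work-group waits here; local-memory writes
    // made before the barrier are visible to the group after it.
    work_group_barrier(CLK_LOCAL_MEM_FENCE);
    // All work-items of the group can now read the shared values.
    dst[g1 * get_global_size(0) + g0] = shared[0] + shared[1] + shared[2];
}

Would that be the right way to use it, or does the computation have to move into a separate kernel that runs before this one?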