@Ramjivk – So you'll first need some logical or geometric relation that ties the image's container object to the text frame that holds its description. We do not know your specifics here, so at this stage it seems impossible to make suggestions.
It could be that the contents of the text frame have a special relationship to some EXIF data inside the image, for instance…
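Just to illustrate what a geometric relation could mean, here is a minimal sketch in plain Python, not tied to any particular document. It assumes that each frame's bounds are available as [top, left, bottom, right], which is the order InDesign reports in geometricBounds, and that the description sits directly below the image. The function names, the frame names and the max_gap threshold are all hypothetical; your layout may need a completely different rule.

```python
from typing import Dict, Optional, Sequence

# Bounds are assumed to be [top, left, bottom, right].
Bounds = Sequence[float]


def horizontal_overlap(a: Bounds, b: Bounds) -> float:
    """Width of the horizontal span shared by two rectangles (0 if none)."""
    return max(0.0, min(a[3], b[3]) - max(a[1], b[1]))


def find_caption_frame(
    image: Bounds,
    frames: Dict[str, Bounds],
    max_gap: float = 20.0,  # hypothetical threshold, in document units
) -> Optional[str]:
    """Return the name of the nearest frame that starts below the image
    (within max_gap) and overlaps it horizontally, or None if there is none."""
    best_name = None
    best_gap = None
    for name, frame in frames.items():
        gap = frame[0] - image[2]  # image bottom edge to frame top edge
        if gap < 0 or gap > max_gap:
            continue  # frame is above the image bottom or too far below
        if horizontal_overlap(image, frame) <= 0:
            continue  # frame sits in a different column
        if best_gap is None or gap < best_gap:
            best_name, best_gap = name, gap
    return best_name


# Hypothetical layout: one image and three candidate text frames.
image_bounds = [10, 10, 110, 210]
text_frames = {
    "caption_a": [115, 10, 140, 210],
    "caption_b": [115, 300, 140, 500],
    "far_below": [400, 10, 430, 210],
}
match = find_caption_frame(image_bounds, text_frames)
```

A rule like this only works if the layout is consistent. If descriptions can sit beside or above the images, the relation has to be defined differently.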
Uwe