@Laubender
Thank you for responding.
I was referring to what you said in number 1 above: it is the character count from the beginning of any story. I have tested this on two different stories in the same file, and if a paragraph return falls at character 8001 (or a multiple of it) counted from the beginning of the story, the same thing happens in each of them.
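For anyone who wants to check a story for this condition, here is a minimal sketch rather than a tested tool. It assumes the story text is available as a plain string, that a paragraph return is the `\r` character, and that positions are counted from 1 at the start of the story. The function name `returns_at_multiples` is made up for illustration.

```python
def returns_at_multiples(story_text, step=8001, paragraph_return="\r"):
    """List the 1-based positions that are multiples of `step`
    and hold a paragraph return."""
    hits = []
    # Visit positions step, 2*step, 3*step, ... up to the end of the story.
    for position in range(step, len(story_text) + 1, step):
        # Convert the 1-based position to a 0-based string index.
        if story_text[position - 1] == paragraph_return:
            hits.append(position)
    return hits
```

An empty list would mean that no paragraph return sits at a multiple of 8001 in that story, under the counting assumption above.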
I hadn't tested it with a snippet, but I did after you asked, and I get the same result.
As for your question about the DOMVersion: I am not familiar with uncompressing IDML files. I do, however, have a tool that decompresses them and gives me all the story XML files separately, but the line you are referring to is not there. The tool I am using must be dropping it automatically. The second line of the story XML that I get contains the story preferences.
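Since an IDML file is a ZIP package, the DOMVersion can be read without a separate decompression tool. The following is a rough sketch, assuming the stories sit under a `Stories/` folder inside the package and that the root element of each story file carries a `DOMVersion` attribute. The function name `story_dom_versions` and the file name in the usage comment are placeholders.

```python
import re
import zipfile


def story_dom_versions(idml_path):
    """Map each story file in an IDML package to its DOMVersion (or None)."""
    versions = {}
    with zipfile.ZipFile(idml_path) as package:
        for name in package.namelist():
            # Assumption: story files are stored as Stories/*.xml.
            if name.startswith("Stories/") and name.endswith(".xml"):
                # The attribute is expected near the top of the file,
                # so only the first 2 KB are inspected.
                head = package.read(name)[:2048].decode("utf-8", errors="replace")
                match = re.search(r'DOMVersion="([^"]*)"', head)
                versions[name] = match.group(1) if match else None
    return versions


# Usage (placeholder file name):
# print(story_dom_versions("my_document.idml"))
```

A value of `None` for a story would suggest the attribute is genuinely missing from the file, not just dropped by the extraction tool.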