So your collision buffer is 240 bytes, 1 byte per 16x16-pixel metatile, right? The rest of my post is based on this assumption.
I noticed a few things wrong with your code. The first is that you load new data whenever the camera is at a position that's a multiple of 16. One problem with this is that you can't scroll faster than 1 pixel per frame, or the camera will skip over some multiples of 16 and you'll miss updates. The second problem is that if the camera stops at a position that's a multiple of 16, the same data will be loaded every frame, wasting CPU time. To avoid both problems, the correct way to detect a metatile boundary crossing is to compare the previous and the current values of CameraX, looking for changes in bit 4:
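Something along these lines, as a minimal sketch. OldCameraX is a variable you'd have to add, and LoadNewColumn and NoNewColumn are just placeholder names for your own loading routine and the code that follows it:

```
  LDA CameraX        ;current camera position
  EOR OldCameraX     ;bits that differ from last frame's position become 1
  AND #$10           ;keep only bit 4, which flips at every 16-pixel boundary
  BEQ NoNewColumn    ;bit 4 didn't change, so no metatile boundary was crossed
  JSR LoadNewColumn  ;(placeholder) load the new data exactly once
NoNewColumn:
```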
All you need for this to work is to copy CameraX to OldCameraX before modifying it. This way you can scroll up to 16 pixels per frame without problems, and the CPU will only spend time loading new data once.
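For example, the camera update could look roughly like this, run before the boundary check each frame. ScrollSpeed is a made-up variable holding how many pixels to move this frame (0 to 16), and this only covers scrolling right with an 8-bit camera:

```
  LDA CameraX
  STA OldCameraX     ;remember where the camera was before moving it
  CLC
  ADC ScrollSpeed    ;(assumed variable) pixels to scroll this frame
  STA CameraX        ;new position, compared against OldCameraX afterwards
```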
Here's another possible problem:
LDA CameraX ;get absolute
AND #$10 ;only take the multiple of 16
Shouldn't you be ANDing with $F0, to keep the 4 highest bits of the camera? ANDing with $10 only preserves bit 4, so the index you create from this will only ever be 0 or 1, which I believe is not what you want. To tell the truth, you don't even need an AND here at all, since the lower bits will be discarded anyway by the 4 LSRs. Also note that since CameraX is only 8 bits, you can't fully scroll into the second screen. Since the highest possible value for CameraX is $FF (1 pixel short of entering the second screen), the farthest you can start from is the last column of the first screen, meaning you'll never reach the last column of the second screen. Fixing this would require extending CameraX to 16 bits, so the index is created like this:
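A minimal sketch, assuming CameraX is stored low byte first (CameraX is the low byte, CameraX+1 the high byte) and that you want the column index left in A:

```
  LDA CameraX+1      ;high byte of the camera position
  LSR A              ;move its lowest bit into the carry
  LDA CameraX        ;low byte of the camera position
  ROR A              ;shift right, bringing the carry in as bit 7
  LSR A
  LSR A
  LSR A              ;4 shifts total: A = 16-bit position / 16, range 0 to 31
```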
This uses only the lowest bit of CameraX+1, supporting levels up to 512 pixels wide. For longer levels you'll need to take more bits from CameraX+1.
Now, the biggest problem is probably the actual data transfer. You can't just transfer 240 contiguous bytes, since the level is longer than 16 metatiles. The buffer is 16 metatiles wide, so when the index goes from 15 to 16, that's an automatic wrap to the next row, but since the map in the ROM is longer, this index will just keep reading from the same row, wrapping at the wrong position, and causing your buffer to be filled with misaligned garbage. What you need to do is adjust the pointer (LevelDataPtr) after each run of 16 bytes copied, so the data is correctly copied from the row below.
Since your maps are apparently 32 metatiles long, you can get away with adding 16 to LevelDataPtr after every 16 bytes copied, to force a wrap to the next row, while still using the same unmodified index for the destination. This is a quick hack for when the length of the level is constant, but if you ever switch to levels of arbitrary lengths you will need a more versatile solution. This is one way to implement the hack:
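This is only a sketch under a few assumptions: LevelDataPtr is a 2-byte zero page pointer already set to the first metatile you want to copy, the map is stored row by row and is 32 metatiles wide, and CollisionBuffer is my name for your 240-byte buffer:

```
  LDY #0
CopyLoop:
  LDA (LevelDataPtr),Y   ;read a metatile from the map
  STA CollisionBuffer,Y  ;(assumed name) write it to the 16-wide buffer
  INY
  CPY #240
  BEQ CopyDone           ;all 15 rows of 16 metatiles copied
  TYA
  AND #$0F
  BNE CopyLoop           ;still in the middle of a row
  CLC                    ;a row of 16 just ended: skip the other
  LDA LevelDataPtr       ;16 metatiles of the 32-wide map row
  ADC #16
  STA LevelDataPtr
  BCC CopyLoop
  INC LevelDataPtr+1     ;carry into the high byte of the pointer
  JMP CopyLoop
CopyDone:
```

Keep in mind that this modifies LevelDataPtr, so you have to set it up again before every transfer.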
EDIT: I edited the post a few times. Please be sure you didn't miss anything.