Yes. Your concerns are valid.
Let me add my two cents here.
When we tried a similar clustering scenario, mod_jk was only doing round-robin distribution; it was not true load balancing as such. The version we were experimenting with was Tomcat 3.2.x, though, and I am not sure how much of this was improved in Catalina and later versions.
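To show the distinction I am drawing between round robin and load-aware balancing, here is a toy sketch in Python. It is only an illustration, not how mod_jk is implemented; the worker names and in-flight request counts are made up.

```python
from itertools import cycle

# Hypothetical Tomcat workers and their current in-flight request counts.
workers = {"tomcat1": 12, "tomcat2": 3, "tomcat3": 7}

def round_robin(names):
    """Return an iterator that hands out workers in fixed rotation, ignoring load."""
    return cycle(names)

def least_loaded(load):
    """Return the worker that currently has the fewest in-flight requests."""
    return min(load, key=load.get)

rr = round_robin(list(workers))
rr_picks = [next(rr) for _ in range(4)]  # rotation wraps around after the last worker
ll_pick = least_loaded(workers)          # load-aware choice

print(rr_picks)
print(ll_pick)
```

Round robin keeps sending requests to tomcat1 on its turn even though it is the busiest, while the load-aware choice goes to the least busy worker.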
Regarding the responses: to my knowledge, yes, they will be sent back through Apache only, because the client has no access to the mod_jk communication with the engines. So I am pretty sure that at some point Apache will become a bottleneck. That's where I believe hardware load-balancing devices come into play: they do the real request handling, and you put enough of those cluster setups behind them to handle the requests.
Note1: There may be other alternatives which I am not aware of.
Note2: Take a look at F5 Networks for hardware load-balancing products: http://www.f5.com/f5products/bigip/Apptraffic/