Delay lock acquisition for partitions until we route a tuple to them.
Instead of locking all partitions to which we might route a tuple at
executor startup, just lock them as we use them. In some cases such a
partition might get locked at executor startup anyway because it
appears in the query's range table for some other reason, but in other
cases this is a bit savings.
This changes the order in which partitions are locked in some cases,
which might conceivably create deadlock hazards that don't exist
today, but per discussion, it seems like such cases should be rare
enough that we can neglect them in favor of improving performance.
David Rowley, reviewed and tested by Tomas Vondra, Sho Kato, John
Naylor, Tom Lane, and me.