I just found some interesting performance characteristics of the matching algorithms. The tests are not complete, put it appears that the implementations (possibly the algorithms themselves?) are sensitive to which partition is passed first. I think pretty good performance gains can be had by passing the smallest partition first, and this check could be incorporated into the algorithms (constructors), so you'd get the advantage regardless of which order you pass the partitions.
I can supply my test code, and/or the timing data if you like. I thought I would first ask, to see if this is known already, and expected.
some matching algorithms in bipartite graphs perform computation for every vertex in the left partition. Thus, what you are observing is reasonable. Swapping the two partitions is a common optimization trick. Feel free to submit a pull-request if you observe a significant performance gain.